
Together.ai focuses on advancing AI technology through open-source projects and innovations in large language models, aiming to make AI more accessible and efficient for various applications.
Together.ai stands out in the AI market primarily through its contributions to sequence modeling architectures, notably the StripedHyena models. These models are positioned as a competitive alternative to traditional Transformers, with a focus on computational efficiency. Unlike conventional Transformer models, StripedHyena combines gated convolutions with attention, and it is reported to perform well on both short-context tasks and lengthy prompts.
Target users include AI researchers, developers involved in machine learning projects, and organizations looking for efficient AI models for a range of applications from natural language processing to customer service automation.

Together.ai offers four products:
Together Inference: Run leading open-source models like Llama-2 on what Together.ai describes as the fastest inference stack available, up to 3x faster than TGI, vLLM, or other inference APIs such as Perplexity, Anyscale, or MosaicML.
Together Fine-tuning: Customize leading open-source models with your own private data.
Together Custom Models: Design and train your own state-of-the-art AI models from scratch. Users retain full ownership of the resulting model and can run it wherever they choose.
Together GPU Clusters: High-end compute clusters for training and fine-tuning, ready to go with the Together Training stack. The clusters use Nvidia H100 and A100 GPUs and 3.2 Tbps InfiniBand networking for distributed training. Together.ai reports a renewal rate above 95% for GPU Clusters.
Inference pricing: Over 100 leading open-source Chat, Language, Image, Code, and Embedding models are available through the Together Inference API. For these models you pay only for what you use. Prices are per 1 million tokens. Chat, Language, and Code models are billed on both input and output tokens, Embedding models on input tokens only, and Image models on image size and number of steps. Special promotional pricing applies to Llama-2 and CodeLlama models. Each model has its own pricing table.
Fine-tuning pricing: Pricing for fine-tuning is based on model size, dataset size, and the number of epochs. Together.ai provides an interactive calculator for estimating the cost.
Together GPU Clusters pricing: Together Compute provides private, state-of-the-art clusters with H100 and A100 GPUs, connected over fast 200 Gbps non-blocking Ethernet or up to 3.2 Tbps InfiniBand networks.
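The per-token billing described under Inference pricing can be sketched as simple arithmetic. This is a minimal illustration, not Together.ai's billing code: the function names and the example prices are hypothetical, and real rates differ per model.

```python
# Hypothetical example rates in USD per 1 million tokens (NOT real Together.ai prices).
EXAMPLE_CHAT_PRICE = 0.20
EXAMPLE_EMBEDDING_PRICE = 0.10


def chat_cost(input_tokens: int, output_tokens: int, price_per_million: float) -> float:
    """Chat, Language, and Code models: input and output tokens are both billed."""
    return (input_tokens + output_tokens) / 1_000_000 * price_per_million


def embedding_cost(input_tokens: int, price_per_million: float) -> float:
    """Embedding models: only input tokens are billed."""
    return input_tokens / 1_000_000 * price_per_million


if __name__ == "__main__":
    # 750k prompt tokens plus 250k generated tokens at the example chat rate.
    print(f"chat: ${chat_cost(750_000, 250_000, EXAMPLE_CHAT_PRICE):.2f}")
    # 500k embedded tokens at the example embedding rate.
    print(f"embedding: ${embedding_cost(500_000, EXAMPLE_EMBEDDING_PRICE):.2f}")
```

Image models are left out because the text only says their price depends on image size and steps, without giving a formula.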
Together.ai has introduced several innovative projects, such as:
Medusa: A framework for accelerating LLM generation by predicting several subsequent tokens in parallel, offering a 2x to 3x boost in speed.
RedPajama: An effort to produce a reproducible, fully-open leading language model, starting with a 1.2 trillion token dataset.
StripedHyena Models: These models introduce a novel architecture that is competitive with Transformer models, particularly in efficiency and handling long sequences.
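The idea behind the Medusa entry above, predicting several subsequent tokens in parallel, can be illustrated with a toy acceptance step. This is a simplified sketch under stated assumptions: the function name is hypothetical, tokens are plain integers, and the rule shown (keep guesses up to the first disagreement with the base model) is a basic greedy variant, not Medusa's actual candidate construction or acceptance scheme.

```python
from typing import List


def accept_longest_prefix(proposed: List[int], verified: List[int]) -> List[int]:
    """Keep the proposed tokens up to the first disagreement with the base model.

    `proposed` holds tokens guessed in parallel for the next few positions;
    `verified` holds what the base model would emit at those same positions.
    Both lists are hypothetical inputs for illustration.
    """
    accepted: List[int] = []
    for guess, truth in zip(proposed, verified):
        if guess != truth:
            break
        accepted.append(guess)
    return accepted


if __name__ == "__main__":
    # The first two guesses agree with the base model; the third does not.
    print(accept_longest_prefix([5, 9, 2, 7], [5, 9, 4, 7]))
```

Each accepted token is one the base model did not have to generate in a separate step, which is where a speedup of this kind would come from.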
Together.ai's collaboration with Ontocord.ai, ETH DS3Lab, Stanford CRFM, and Hazy Research highlights its commitment to building a community-driven AI research ecosystem. The focus on open-source projects like RedPajama and Medusa underscores an initiative to democratize AI technology and make advanced AI models more accessible for commercial and research purposes.
Website: https://www.together.ai/
Twitter: https://twitter.com/togethercompute
Discord: https://discord.com/invite/9Rk6sSeWEG