In this image received on Feb. 16, 2026, Prime Minister Narendra Modi is seen during the inauguration of the India AI Impact Expo at Bharat Mandapam in New Delhi. Sarvam AI co-founder Pratyush Kumar is also seen.
| Photo Credit: PMO via PTI Photo
Sarvam AI on Tuesday unveiled two new large language models (LLMs) at the India AI Summit — a 30-billion-parameter model and a 105-billion-parameter model — both of which have outperformed comparable models in their respective size categories across key benchmarks.
In April 2025, the Government of India selected Sarvam under the IndiaAI Mission to develop the country’s first sovereign large language model, alongside 11 other startups chosen to advance India’s AI ecosystem.
Production-ready
The 30B-parameter model is pre-trained on 16 trillion tokens and supports a context length of 32,000 tokens, enabling long-running conversations and agentic workflows. Its relatively small set of activated parameters keeps inference costs low.
Speaking at the summit, Pratyush Kumar, the co-founder of Sarvam, explained, “The larger the model, the more the parameters. While the model is more capable, it is also harder to train and takes longer to run in production. However, a 30-billion-parameter model today is relatively small. Ours is a mixture-of-experts (MoE) model and has just 1 billion activated parameters, meaning that in generating every output token, it only activates 1B parameters.”
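To illustrate the idea behind Kumar’s point, here is a minimal, hypothetical sketch of mixture-of-experts routing in Python — not Sarvam’s actual architecture, and all sizes and names are illustrative — where a layer holds many expert weight matrices but each token passes through only a small top-k subset of them, which is why the activated parameter count stays far below the total.

```python
import numpy as np

# Toy mixture-of-experts (MoE) routing sketch. This is an illustrative
# simplification, not Sarvam's architecture; sizes are made up.
rng = np.random.default_rng(0)

d_model = 64          # hidden size of a token representation
num_experts = 32      # total experts held by the layer
top_k = 2             # experts activated per token

# Each expert is a small feed-forward block (here: one weight matrix).
experts = [rng.standard_normal((d_model, d_model)) * 0.02 for _ in range(num_experts)]
router = rng.standard_normal((d_model, num_experts)) * 0.02  # scores experts per token

def moe_layer(token: np.ndarray) -> np.ndarray:
    """Route one token through its top_k experts and mix their outputs."""
    scores = token @ router                   # (num_experts,) routing scores
    top = np.argsort(scores)[-top_k:]         # indices of the chosen experts
    weights = np.exp(scores[top])
    weights /= weights.sum()                  # softmax over the chosen experts only
    # Only top_k of the num_experts weight matrices are used here, so the
    # parameters "activated" per token are far fewer than the total held.
    return sum(w * (token @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(d_model)
print(moe_layer(token).shape)  # (64,)
```

In a real MoE transformer the experts are full feed-forward blocks and the routing is learned end to end, but the cost argument is the same: compute per token scales with the activated experts, not with the total parameter count.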
Advanced reasoning
Sarvam 30B also outperforms peers in its class on efficient reasoning, he said. The model is designed as a real-time conversational engine for production applications, supporting Indian languages and user-facing dialogue experiences.
Across global benchmarks spanning mathematics, coding, and knowledge tasks, Sarvam 30B delivers strong results, either surpassing or remaining competitive with models such as OpenAI’s GPT-OSS-20B, Alibaba Cloud’s Qwen3-30B, Mistral-3-2-24B and Google’s Gemma 27B. The benchmarks include Math500, which tests advanced mathematics; HumanEval and MBPP, which assess code generation and programming ability; LiveCodeBench v6, which measures real-world coding performance; and MMLU and MMLU Pro, which evaluate broad, multidisciplinary knowledge and advanced reasoning.
Sarvam 105B, the company’s larger MoE model with 9 billion active parameters and a 128,000-token context window, is built for complex reasoning. It delivers strong performance across math, coding, and Indian languages, supports software engineering tasks such as bug fixes and code generation, and performs on par with leading open- and closed-source frontier models in its class.
On benchmarks, Sarvam 105B performs on par with models such as GPT-OSS-120B, Qwen3-Next-80B, and Zhipu AI’s GLM-4.5-Air across GPQA Diamond, which evaluates graduate-level, multi-step reasoning across domains; Beyond AIME, which tests advanced mathematical problem-solving beyond Olympiad-level difficulty; and MMLU Pro.
Kumar noted that Sarvam 105B performs better with Indian languages, compared to bigger and more expensive models like Gemini 2.5 Flash. On most benchmarks, the model also beats DeepSeek-R1, a 600B parameter model released a year ago. “While these models can evolve fast, Sarvam 105B was trained from scratch, is one-sixth the size, and yet is providing intelligence competitive to the earlier version of DeepSeek,” he said.
India-focussed edge
Srinivas Padmanabhuni, the CTO of AiEnsured, which offers a testing suite for AI products, noted that the Sarvam models can potentially expand access for underprivileged users, colloquial-language audiences and local use cases.
“In the context of Indian language–based chatbots, whether for customer service or queries related to government services, these can be delivered to bottom-of-the-pyramid users in local Indian languages. Similarly, AI can democratise legal reasoning and research, making them accessible to the bottom-of-the-pyramid and regional language audience,” he said.
Sarvam’s models have outperformed larger ones like Gemini due to stronger handling of Indian language context, including code-mixed formats like Hinglish, which can be challenging for Gemini Flash models.
In addition to targeted training on local data, Sarvam benefits from optimised architectures such as mixture-of-experts designs. Meanwhile, many global models lack a deep Indian linguistic and cultural context, giving Sarvam a clear USP in localised language and speech understanding.
All three models, including the 3B variant, were trained using compute allocated under the India AI Mission, with infrastructure support from Yotta and Nvidia.
Sarvam will open-source its 30B and 105B models, enabling developers to build applications and conversational experiences on top of them.
Published on February 18, 2026
