Models

Gemini 3.1 Flash-Lite

Most cost-efficient Gemini model, optimized for high-volume agentic tasks, translation, and simple data processing.

Gemini 3.1 Pro Preview

Latest performance and intelligence improvements to the best Gemini model family for multimodal understanding and agentic capabilities.

Gemini 2.5 Pro

State-of-the-art multipurpose model excelling at coding and complex reasoning tasks.

Gemini 2.5 Flash

Hybrid reasoning model with 1M token context and thinking budgets.

Gemini 2.5 Flash-Lite

Smallest and most cost-effective Gemini model, built for at-scale usage.

Allam 2 7B

Arabic and English 7B model from the Saudi Data and AI Authority.

Groq Compound

Groq's own compound agentic model combining multiple inference steps.

Groq Compound Mini

Smaller and faster version of Groq's compound agentic model.

OpenAI GPT OSS 120B (Groq)

OpenAI open-source 120B model hosted on Groq hardware.

OpenAI GPT OSS 20B (Groq)

OpenAI open-source 20B model hosted on Groq hardware.

Mistral 7B

Mistral's original open-weight 7B model.

Mixtral 8x7B

Sparse mixture-of-experts model with 8 experts of 7B each.

Mixtral 8x22B

Largest open Mistral MoE model with 8 experts of 22B each.

Mistral Nemo

12B model built with NVIDIA, with strong multilingual and coding performance.

Devstral Small

Agentic coding model from Mistral, optimized for software engineering tasks.

Baidu: CoBuddy

Code generation model from Baidu, optimized for coding tasks and AI agent workflows.

NVIDIA: Nemotron 3 Nano Omni

Open multimodal model designed for enterprise agent systems. Accepts text, image, video, and audio.

Poolside: Laguna XS.2

Efficient coding agent model with tool calling and reasoning in a compact footprint.

Poolside: Laguna M.1

Flagship coding agent model from Poolside, optimized for complex software engineering tasks.

DeepSeek: V4 Flash

Fast DeepSeek V4 variant with a 1M token context window.

Google: Gemma 4 26B A4B

Google Gemma 4 26B mixture-of-experts model.

Google: Gemma 4 31B

Google Gemma 4 31B instruction-tuned model.

Arcee AI: Trinity Large Thinking

Large thinking model from Arcee AI with extended reasoning capabilities.

NVIDIA: Nemotron 3 Super 120B

Large-scale NVIDIA Nemotron model with 1M token context.

MiniMax: M2.5

MiniMax M2.5 large language model.

LiquidAI: LFM2.5 Thinking

Small 1.2B thinking model from LiquidAI.

LiquidAI: LFM2.5 Instruct

Small 1.2B instruct model from LiquidAI.

NVIDIA: Nemotron 3 Nano 30B

NVIDIA Nemotron 3 Nano 30B A3B model.

NVIDIA: Nemotron Nano 12B VL

NVIDIA Nemotron Nano 12B vision-language model.

Qwen: Qwen3 Next 80B A3B

Qwen3 Next 80B A3B instruct model.

NVIDIA: Nemotron Nano 9B V2

NVIDIA Nemotron Nano 9B V2 model.

OpenAI: GPT OSS 120B

OpenAI open-source 120B model available via OpenRouter.

OpenAI: GPT OSS 20B

OpenAI open-source 20B model available via OpenRouter.

Z.ai: GLM 4.5 Air

GLM 4.5 Air model from Z.ai.

Qwen: Qwen3 Coder 480B

Qwen3 Coder 480B A35B, a large coding-optimized model with 1M token context.

Venice: Uncensored

Dolphin Mistral 24B Venice edition, an uncensored model.

Meta: Llama 3.3 70B

Meta Llama 3.3 70B Instruct via OpenRouter free tier.

Meta: Llama 3.2 3B

Meta Llama 3.2 3B Instruct, compact and fast, via OpenRouter free tier.

Nous: Hermes 3 405B

Nous Research Hermes 3 405B Instruct via OpenRouter free tier.

Qwen 3 32B

Qwen3 32B with hybrid thinking mode.

Llama 3.1 8B

Efficient open-source model for various tasks.

Llama 3.3 70B

Powerful open-source model with strong performance.

Llama 4 Scout

Fast-inference model served on Groq hardware.

Gemini 3 Pro

Advanced multimodal model with extended thinking.

Gemini 3 Flash

Fast and efficient multimodal model.

Mistral Large Latest

Top-tier reasoning model for high-complexity tasks.

Mistral Small Latest

Cost-efficient model for simple tasks.

Inactive Models

Kimi K2

Advanced reasoning model from Moonshot AI.

Llama 4 Maverick

High-performance model with advanced capabilities on Groq.