# free-llm-api-resources

**Repository Path**: xxj_2002/free-llm-api-resources

## Basic Information

- **Project Name**: free-llm-api-resources
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 1
- **Created**: 2026-03-28
- **Last Updated**: 2026-03-28

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Free LLM API resources

This lists various services that provide free access or credits towards API-based LLM usage.

> [!NOTE]
> Please don't abuse these services, else we might lose them.

> [!WARNING]
> This list explicitly excludes any services that are not legitimate (e.g. ones that reverse-engineer an existing chatbot).

- [Free Providers](#free-providers)
  - [OpenRouter](#openrouter)
  - [Google AI Studio](#google-ai-studio)
  - [NVIDIA NIM](#nvidia-nim)
  - [Mistral (La Plateforme)](#mistral-la-plateforme)
  - [Mistral (Codestral)](#mistral-codestral)
  - [HuggingFace Inference Providers](#huggingface-inference-providers)
  - [Vercel AI Gateway](#vercel-ai-gateway)
  - [Cerebras](#cerebras)
  - [Groq](#groq)
  - [Cohere](#cohere)
  - [GitHub Models](#github-models)
  - [Cloudflare Workers AI](#cloudflare-workers-ai)
  - [Google Cloud Vertex AI](#google-cloud-vertex-ai)
- [Providers with trial credits](#providers-with-trial-credits)
  - [Fireworks](#fireworks)
  - [Baseten](#baseten)
  - [Nebius](#nebius)
  - [Novita](#novita)
  - [AI21](#ai21)
  - [Upstage](#upstage)
  - [NLP Cloud](#nlp-cloud)
  - [Alibaba Cloud (International) Model Studio](#alibaba-cloud-international-model-studio)
  - [Modal](#modal)
  - [Inference.net](#inferencenet)
  - [Hyperbolic](#hyperbolic)
  - [SambaNova Cloud](#sambanova-cloud)
  - [Scaleway Generative APIs](#scaleway-generative-apis)

## Free Providers

### [OpenRouter](https://openrouter.ai)

**Limits:**

[20 requests/minute<br>50 requests/day<br>Up to 1000 requests/day with $10 lifetime topup](https://openrouter.ai/docs/api-reference/limits)

Models share a common quota.

- [Gemma 3 12B Instruct](https://openrouter.ai/google/gemma-3-12b-it:free)
- [Gemma 3 27B Instruct](https://openrouter.ai/google/gemma-3-27b-it:free)
- [Gemma 3 4B Instruct](https://openrouter.ai/google/gemma-3-4b-it:free)
- [Hermes 3 Llama 3.1 405B](https://openrouter.ai/nousresearch/hermes-3-llama-3.1-405b:free)
- [Llama 3.1 405B Instruct](https://openrouter.ai/meta-llama/llama-3.1-405b-instruct:free)
- [Llama 3.2 3B Instruct](https://openrouter.ai/meta-llama/llama-3.2-3b-instruct:free)
- [Llama 3.3 70B Instruct](https://openrouter.ai/meta-llama/llama-3.3-70b-instruct:free)
- [Mistral Small 3.1 24B Instruct](https://openrouter.ai/mistralai/mistral-small-3.1-24b-instruct:free)
- [Qwen 2.5 VL 7B Instruct](https://openrouter.ai/qwen/qwen-2.5-vl-7b-instruct:free)
- [allenai/molmo-2-8b:free](https://openrouter.ai/allenai/molmo-2-8b:free)
- [arcee-ai/trinity-large-preview:free](https://openrouter.ai/arcee-ai/trinity-large-preview:free)
- [arcee-ai/trinity-mini:free](https://openrouter.ai/arcee-ai/trinity-mini:free)
- [cognitivecomputations/dolphin-mistral-24b-venice-edition:free](https://openrouter.ai/cognitivecomputations/dolphin-mistral-24b-venice-edition:free)
- [deepseek/deepseek-r1-0528:free](https://openrouter.ai/deepseek/deepseek-r1-0528:free)
- [google/gemma-3n-e2b-it:free](https://openrouter.ai/google/gemma-3n-e2b-it:free)
- [google/gemma-3n-e4b-it:free](https://openrouter.ai/google/gemma-3n-e4b-it:free)
- [liquid/lfm-2.5-1.2b-instruct:free](https://openrouter.ai/liquid/lfm-2.5-1.2b-instruct:free)
- [liquid/lfm-2.5-1.2b-thinking:free](https://openrouter.ai/liquid/lfm-2.5-1.2b-thinking:free)
- [moonshotai/kimi-k2:free](https://openrouter.ai/moonshotai/kimi-k2:free)
- [nvidia/nemotron-3-nano-30b-a3b:free](https://openrouter.ai/nvidia/nemotron-3-nano-30b-a3b:free)
- [nvidia/nemotron-nano-12b-v2-vl:free](https://openrouter.ai/nvidia/nemotron-nano-12b-v2-vl:free)
- [nvidia/nemotron-nano-9b-v2:free](https://openrouter.ai/nvidia/nemotron-nano-9b-v2:free)
- [openai/gpt-oss-120b:free](https://openrouter.ai/openai/gpt-oss-120b:free)
- [openai/gpt-oss-20b:free](https://openrouter.ai/openai/gpt-oss-20b:free)
- [qwen/qwen3-4b:free](https://openrouter.ai/qwen/qwen3-4b:free)
- [qwen/qwen3-coder:free](https://openrouter.ai/qwen/qwen3-coder:free)
- [qwen/qwen3-next-80b-a3b-instruct:free](https://openrouter.ai/qwen/qwen3-next-80b-a3b-instruct:free)
- [tngtech/deepseek-r1t-chimera:free](https://openrouter.ai/tngtech/deepseek-r1t-chimera:free)
- [tngtech/deepseek-r1t2-chimera:free](https://openrouter.ai/tngtech/deepseek-r1t2-chimera:free)
- [tngtech/tng-r1t-chimera:free](https://openrouter.ai/tngtech/tng-r1t-chimera:free)
- [upstage/solar-pro-3:free](https://openrouter.ai/upstage/solar-pro-3:free)
- [z-ai/glm-4.5-air:free](https://openrouter.ai/z-ai/glm-4.5-air:free)

### [Google AI Studio](https://aistudio.google.com)

Data is used for training when used outside of the UK/CH/EEA/EU.

| Model Name | Model Limits |
| --- | --- |
| Gemini 3 Flash | 250,000 tokens/minute<br>20 requests/day<br>5 requests/minute |
| Gemini 2.5 Flash | 250,000 tokens/minute<br>20 requests/day<br>5 requests/minute |
| Gemini 2.5 Flash-Lite | 250,000 tokens/minute<br>20 requests/day<br>10 requests/minute |
| Gemma 3 27B Instruct | 15,000 tokens/minute<br>14,400 requests/day<br>30 requests/minute |
| Gemma 3 12B Instruct | 15,000 tokens/minute<br>14,400 requests/day<br>30 requests/minute |
| Gemma 3 4B Instruct | 15,000 tokens/minute<br>14,400 requests/day<br>30 requests/minute |
| Gemma 3 1B Instruct | 15,000 tokens/minute<br>14,400 requests/day<br>30 requests/minute |
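
A client can stay under request limits like these by tracking its own request timestamps. The sketch below is a minimal illustration, not an official client: `RequestBudget` and its method names are hypothetical, and it assumes rolling 60-second and 24-hour windows, whereas a provider may reset its quota on a fixed schedule. The numbers in the usage lines are the Gemini 2.5 Flash row above.

```python
import time
from collections import deque


class RequestBudget:
    """Client-side guard for a per-minute and a per-day request limit.

    Hypothetical helper: rolling windows are an assumption, chosen because
    they never allow more requests than a fixed-schedule reset would.
    """

    def __init__(self, per_minute, per_day, clock=time.monotonic):
        self.per_minute = per_minute
        self.per_day = per_day
        self.clock = clock            # injectable so the logic can be tested
        self.minute_window = deque()  # timestamps from the last 60 seconds
        self.day_window = deque()     # timestamps from the last 24 hours

    def _prune(self, now):
        # Drop timestamps that have aged out of each window.
        while self.minute_window and now - self.minute_window[0] >= 60:
            self.minute_window.popleft()
        while self.day_window and now - self.day_window[0] >= 86_400:
            self.day_window.popleft()

    def try_acquire(self):
        """Record a request and return True only if both limits allow it."""
        now = self.clock()
        self._prune(now)
        if len(self.minute_window) >= self.per_minute:
            return False
        if len(self.day_window) >= self.per_day:
            return False
        self.minute_window.append(now)
        self.day_window.append(now)
        return True


# Usage: 5 requests/minute and 20 requests/day, as listed for Gemini 2.5 Flash.
budget = RequestBudget(per_minute=5, per_day=20)
if budget.try_acquire():
    pass  # send the API request here
```

A rejected call is not recorded, so the caller can simply sleep and try again later.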

### [NVIDIA NIM](https://build.nvidia.com/explore/discover)

Phone number verification required.
Models tend to have limited context windows.

**Limits:** 40 requests/minute

- [Various open models](https://build.nvidia.com/models)

### [Mistral (La Plateforme)](https://console.mistral.ai/)

* Free tier (Experiment plan) requires opting into data training
* Requires phone number verification

**Limits (per-model):** 1 request/second, 500,000 tokens/minute, 1,000,000,000 tokens/month

- [Open and Proprietary Mistral models](https://docs.mistral.ai/getting-started/models/models_overview/)

### [Mistral (Codestral)](https://codestral.mistral.ai/)

* Currently free to use
* Monthly subscription based
* Requires phone number verification

**Limits:** 30 requests/minute, 2,000 requests/day

- Codestral

### [HuggingFace Inference Providers](https://huggingface.co/docs/inference-providers/en/index)

HuggingFace Serverless Inference is limited to models smaller than 10GB. Some popular models are supported even if they exceed 10GB.

**Limits:** [$0.10/month in credits](https://huggingface.co/docs/inference-providers/en/pricing)

- Various open models across supported providers

### [Vercel AI Gateway](https://vercel.com/docs/ai-gateway)

Routes to various supported providers.

**Limits:** [$5/month](https://vercel.com/docs/ai-gateway/pricing)

### [Cerebras](https://cloud.cerebras.ai/)

| Model Name | Model Limits |
| --- | --- |
| gpt-oss-120b | 30 requests/minute<br>60,000 tokens/minute<br>900 requests/hour<br>1,000,000 tokens/hour<br>14,400 requests/day<br>1,000,000 tokens/day |
| Qwen 3 235B A22B Instruct | 30 requests/minute<br>60,000 tokens/minute<br>900 requests/hour<br>1,000,000 tokens/hour<br>14,400 requests/day<br>1,000,000 tokens/day |
| Llama 3.3 70B | 30 requests/minute<br>64,000 tokens/minute<br>900 requests/hour<br>1,000,000 tokens/hour<br>14,400 requests/day<br>1,000,000 tokens/day |
| Qwen 3 32B | 30 requests/minute<br>64,000 tokens/minute<br>900 requests/hour<br>1,000,000 tokens/hour<br>14,400 requests/day<br>1,000,000 tokens/day |
| Llama 3.1 8B | 30 requests/minute<br>60,000 tokens/minute<br>900 requests/hour<br>1,000,000 tokens/hour<br>14,400 requests/day<br>1,000,000 tokens/day |
| Z.ai GLM-4.6 | 10 requests/minute<br>60,000 tokens/minute<br>100 requests/hour<br>100,000 tokens/hour<br>100 requests/day<br>1,000,000 tokens/day |
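
When a model has six simultaneous limits, as in the table above, it helps to work out which one actually binds. The sketch below converts every limit to a requests-per-day ceiling and reports the smallest. The function name and the `tokens_per_request` parameter are hypothetical, and it assumes each request consumes a fixed number of tokens, counting input and output together; check how the provider counts tokens before relying on it.

```python
MINUTES_PER_DAY = 24 * 60

# The gpt-oss-120b row from the table above.
GPT_OSS_120B = {
    "requests/minute": 30,
    "tokens/minute": 60_000,
    "requests/hour": 900,
    "tokens/hour": 1_000_000,
    "requests/day": 14_400,
    "tokens/day": 1_000_000,
}


def daily_request_ceiling(limits, tokens_per_request):
    """Return (max requests per day, name of the limit that binds).

    Assumes every request uses exactly `tokens_per_request` tokens.
    """
    if tokens_per_request <= 0:
        raise ValueError("tokens_per_request must be positive")
    per_day = {
        "requests/minute": limits["requests/minute"] * MINUTES_PER_DAY,
        "requests/hour": limits["requests/hour"] * 24,
        "requests/day": limits["requests/day"],
        "tokens/minute": limits["tokens/minute"] * MINUTES_PER_DAY // tokens_per_request,
        "tokens/hour": limits["tokens/hour"] * 24 // tokens_per_request,
        "tokens/day": limits["tokens/day"] // tokens_per_request,
    }
    binding = min(per_day, key=per_day.get)
    return per_day[binding], binding


print(daily_request_ceiling(GPT_OSS_120B, tokens_per_request=2_000))
# → (500, 'tokens/day')
print(daily_request_ceiling(GPT_OSS_120B, tokens_per_request=50))
# → (14400, 'requests/day')
```

For this row the daily token limit binds for any request larger than about 69 tokens (1,000,000 / 14,400), so the 14,400 requests/day figure is only reachable with very short requests.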

### [Groq](https://console.groq.com)

| Model Name | Model Limits |
| --- | --- |
| Allam 2 7B | 7,000 requests/day<br>6,000 tokens/minute |
| Llama 3.1 8B | 14,400 requests/day<br>6,000 tokens/minute |
| Llama 3.3 70B | 1,000 requests/day<br>12,000 tokens/minute |
| Llama 4 Maverick 17B 128E Instruct | 1,000 requests/day<br>6,000 tokens/minute |
| Llama 4 Scout Instruct | 1,000 requests/day<br>30,000 tokens/minute |
| Whisper Large v3 | 7,200 audio-seconds/minute<br>2,000 requests/day |
| Whisper Large v3 Turbo | 7,200 audio-seconds/minute<br>2,000 requests/day |
| canopylabs/orpheus-arabic-saudi | |
| canopylabs/orpheus-v1-english | |
| groq/compound | 250 requests/day<br>70,000 tokens/minute |
| groq/compound-mini | 250 requests/day<br>70,000 tokens/minute |
| meta-llama/llama-guard-4-12b | 14,400 requests/day<br>15,000 tokens/minute |
| meta-llama/llama-prompt-guard-2-22m | |
| meta-llama/llama-prompt-guard-2-86m | |
| moonshotai/kimi-k2-instruct | 1,000 requests/day<br>10,000 tokens/minute |
| moonshotai/kimi-k2-instruct-0905 | 1,000 requests/day<br>10,000 tokens/minute |
| openai/gpt-oss-120b | 1,000 requests/day<br>8,000 tokens/minute |
| openai/gpt-oss-20b | 1,000 requests/day<br>8,000 tokens/minute |
| openai/gpt-oss-safeguard-20b | 1,000 requests/day<br>8,000 tokens/minute |
| qwen/qwen3-32b | 1,000 requests/day<br>6,000 tokens/minute |
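
With per-minute token limits this tight, a client will occasionally be rejected and needs a retry policy. HTTP defines status 429 (Too Many Requests) and the `Retry-After` header for this case. The sketch below computes how long to wait before the next attempt; `retry_delay` and its parameters are hypothetical, and whether a given provider actually sends `Retry-After` on a 429 is an assumption to verify against its documentation. Without a usable header it falls back to exponential backoff with full jitter.

```python
import random


def retry_delay(attempt, retry_after=None, base=1.0, cap=60.0, rng=random.random):
    """Seconds to wait before retry number `attempt` (0-based).

    `retry_after` is the raw Retry-After header value, if the response had one.
    """
    if retry_after is not None:
        try:
            # Honour the server's hint when it is a number of seconds.
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # e.g. an HTTP date; this sketch falls back to backoff instead
    # Exponential backoff, capped, with full jitter: uniform in [0, ceiling).
    ceiling = min(cap, base * 2 ** attempt)
    return rng() * ceiling


# Usage: wait times for the first four retries when no header is present.
for attempt in range(4):
    print(f"retry {attempt}: up to {min(60.0, 2 ** attempt):.0f}s, "
          f"chosen {retry_delay(attempt):.2f}s")
```

The jitter spreads out retries from clients that were rejected at the same moment, which matters when several workers share one API key.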

### [Cohere](https://cohere.com)

**Limits:**

[20 requests/minute<br>1,000 requests/month](https://docs.cohere.com/docs/rate-limits)

Models share a common monthly quota.

- c4ai-aya-expanse-32b
- c4ai-aya-expanse-8b
- c4ai-aya-vision-32b
- c4ai-aya-vision-8b
- command-a-03-2025
- command-a-reasoning-08-2025
- command-a-translate-08-2025
- command-a-vision-07-2025
- command-r-08-2024
- command-r-plus-08-2024
- command-r7b-12-2024
- command-r7b-arabic-02-2025

### [GitHub Models](https://github.com/marketplace/models)

Extremely restrictive input/output token limits.

**Limits:** [Dependent on Copilot subscription tier (Free/Pro/Pro+/Business/Enterprise)](https://docs.github.com/en/github-models/prototyping-with-ai-models#rate-limits)

- AI21 Jamba 1.5 Large
- Codestral 25.01
- Cohere Command A
- Cohere Command R 08-2024
- Cohere Command R+ 08-2024
- DeepSeek-R1
- DeepSeek-R1-0528
- DeepSeek-V3-0324
- Grok 3
- Grok 3 Mini
- Llama 4 Maverick 17B 128E Instruct FP8
- Llama 4 Scout 17B 16E Instruct
- Llama-3.2-11B-Vision-Instruct
- Llama-3.2-90B-Vision-Instruct
- Llama-3.3-70B-Instruct
- MAI-DS-R1
- Meta-Llama-3.1-405B-Instruct
- Meta-Llama-3.1-8B-Instruct
- Ministral 3B
- Mistral Medium 3 (25.05)
- Mistral Small 3.1
- OpenAI GPT-4.1
- OpenAI GPT-4.1-mini
- OpenAI GPT-4.1-nano
- OpenAI GPT-4o
- OpenAI GPT-4o mini
- OpenAI Text Embedding 3 (large)
- OpenAI Text Embedding 3 (small)
- OpenAI gpt-5
- OpenAI gpt-5-chat (preview)
- OpenAI gpt-5-mini
- OpenAI gpt-5-nano
- OpenAI o1
- OpenAI o1-mini
- OpenAI o1-preview
- OpenAI o3
- OpenAI o3-mini
- OpenAI o4-mini
- Phi-4
- Phi-4-mini-instruct
- Phi-4-mini-reasoning
- Phi-4-multimodal-instruct
- Phi-4-reasoning

### [Cloudflare Workers AI](https://developers.cloudflare.com/workers-ai)

**Limits:** [10,000 neurons/day](https://developers.cloudflare.com/workers-ai/platform/pricing/#free-allocation)

- @cf/aisingapore/gemma-sea-lion-v4-27b-it
- @cf/ibm-granite/granite-4.0-h-micro
- @cf/openai/gpt-oss-120b
- @cf/openai/gpt-oss-20b
- @cf/qwen/qwen3-30b-a3b-fp8
- DeepSeek R1 Distill Qwen 32B
- Deepseek Coder 6.7B Base (AWQ)
- Deepseek Coder 6.7B Instruct (AWQ)
- Deepseek Math 7B Instruct
- Discolm German 7B v1 (AWQ)
- Falcon 7B Instruct
- Gemma 2B Instruct (LoRA)
- Gemma 3 12B Instruct
- Gemma 7B Instruct
- Gemma 7B Instruct (LoRA)
- Hermes 2 Pro Mistral 7B
- Llama 2 13B Chat (AWQ)
- Llama 2 7B Chat (FP16)
- Llama 2 7B Chat (INT8)
- Llama 2 7B Chat (LoRA)
- Llama 3 8B Instruct
- Llama 3 8B Instruct (AWQ)
- Llama 3.1 8B Instruct (AWQ)
- Llama 3.1 8B Instruct (FP8)
- Llama 3.2 11B Vision Instruct
- Llama 3.2 1B Instruct
- Llama 3.2 3B Instruct
- Llama 3.3 70B Instruct (FP8)
- Llama 4 Scout Instruct
- Llama Guard 3 8B
- Mistral 7B Instruct v0.1
- Mistral 7B Instruct v0.1 (AWQ)
- Mistral 7B Instruct v0.2
- Mistral 7B Instruct v0.2 (LoRA)
- Mistral Small 3.1 24B Instruct
- Neural Chat 7B v3.1 (AWQ)
- OpenChat 3.5 0106
- OpenHermes 2.5 Mistral 7B (AWQ)
- Phi-2
- Qwen 1.5 0.5B Chat
- Qwen 1.5 1.8B Chat
- Qwen 1.5 14B Chat (AWQ)
- Qwen 1.5 7B Chat (AWQ)
- Qwen 2.5 Coder 32B Instruct
- Qwen QwQ 32B
- SQLCoder 7B 2
- Starling LM 7B Beta
- TinyLlama 1.1B Chat v1.0
- Una Cybertron 7B v2 (BF16)
- Zephyr 7B Beta (AWQ)

### [Google Cloud Vertex AI](https://console.cloud.google.com/vertex-ai/model-garden)

Very stringent payment verification for Google Cloud.

| Model Name | Model Limits |
| --- | --- |
| Llama 3.2 90B Vision Instruct | 30 requests/minute<br>Free during preview |
| Llama 3.1 70B Instruct | 60 requests/minute<br>Free during preview |
| Llama 3.1 8B Instruct | 60 requests/minute<br>Free during preview |

## Providers with trial credits

### [Fireworks](https://fireworks.ai/)

**Credits:** $1

**Models:** [Various open models](https://fireworks.ai/models)

### [Baseten](https://app.baseten.co/)

**Credits:** $30

**Models:** [Any supported model - pay by compute time](https://www.baseten.co/library/)

### [Nebius](https://studio.nebius.com/)

**Credits:** $1

**Models:** [Various open models](https://studio.nebius.ai/models)

### [Novita](https://novita.ai/?ref=ytblmjc&utm_source=affiliate)

**Credits:** $0.5 for 1 year

**Models:** [Various open models](https://novita.ai/models)

### [AI21](https://studio.ai21.com/)

**Credits:** $10 for 3 months

**Models:** Jamba family of models

### [Upstage](https://console.upstage.ai/)

**Credits:** $10 for 3 months

**Models:** Solar Pro/Mini

### [NLP Cloud](https://nlpcloud.com/home)

**Credits:** $15

**Requirements:** Phone number verification

**Models:** Various open models

### [Alibaba Cloud (International) Model Studio](https://bailian.console.alibabacloud.com/)

**Credits:** 1 million tokens/model

**Models:** [Various open and proprietary Qwen models](https://www.alibabacloud.com/en/product/modelstudio)

### [Modal](https://modal.com)

**Credits:** $5/month upon sign up, $30/month with payment method added

**Models:** Any supported model - pay by compute time

### [Inference.net](https://inference.net)

**Credits:** $1, $25 on responding to email survey

**Models:** Various open models

### [Hyperbolic](https://app.hyperbolic.xyz/)

**Credits:** $1

**Models:**

- DeepSeek V3
- DeepSeek V3 0324
- Llama 3.1 405B Base
- Llama 3.1 405B Instruct
- Llama 3.1 70B Instruct
- Llama 3.1 8B Instruct
- Llama 3.2 3B Instruct
- Llama 3.3 70B Instruct
- Pixtral 12B (2409)
- Qwen QwQ 32B
- Qwen2.5 72B Instruct
- Qwen2.5 Coder 32B Instruct
- Qwen2.5 VL 72B Instruct
- Qwen2.5 VL 7B Instruct
- deepseek-ai/deepseek-r1-0528
- openai/gpt-oss-120b
- openai/gpt-oss-120b-turbo
- openai/gpt-oss-20b
- qwen/qwen3-235b-a22b
- qwen/qwen3-235b-a22b-instruct-2507
- qwen/qwen3-coder-480b-a35b-instruct
- qwen/qwen3-next-80b-a3b-instruct
- qwen/qwen3-next-80b-a3b-thinking

### [SambaNova Cloud](https://cloud.sambanova.ai/)

**Credits:** $5 for 3 months

**Models:**

- E5-Mistral-7B-Instruct
- Llama 3.1 8B
- Llama 3.3 70B
- Llama 3.3 70B
- Llama-4-Maverick-17B-128E-Instruct
- Qwen/Qwen3-235B
- Qwen/Qwen3-32B
- Whisper-Large-v3
- deepseek-ai/DeepSeek-R1-0528
- deepseek-ai/DeepSeek-R1-Distill-Llama-70B
- deepseek-ai/DeepSeek-V3-0324
- deepseek-ai/DeepSeek-V3.1
- deepseek-ai/DeepSeek-V3.1-Terminus
- deepseek-ai/DeepSeek-V3.2
- openai/gpt-oss-120b
- tbd

### [Scaleway Generative APIs](https://console.scaleway.com/generative-api/models)

**Credits:** 1,000,000 free tokens

**Models:**

- BGE-Multilingual-Gemma2
- DeepSeek R1 Distill Llama 70B
- Gemma 3 27B Instruct
- Llama 3.1 8B Instruct
- Llama 3.3 70B Instruct
- Mistral Nemo 2407
- Pixtral 12B (2409)
- Whisper Large v3
- devstral-2-123b-instruct-2512
- gpt-oss-120b
- holo2-30b-a3b
- mistral-small-3.2-24b-instruct-2506
- qwen3-235b-a22b-instruct-2507
- qwen3-coder-30b-a3b-instruct
- qwen3-embedding-8b
- voxtral-small-24b-2507
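
A dollar-denominated trial credit only becomes comparable to a token-denominated one (such as Scaleway's 1,000,000 free tokens) once it is divided by a model's price. The sketch below does that conversion. The function name is hypothetical, and the $0.50 per million tokens in the usage line is a made-up price for illustration only; look up the real price of the model you intend to call.

```python
from decimal import Decimal


def tokens_for_credit(credit_usd, usd_per_million_tokens):
    """Whole number of tokens a credit buys at a flat per-million-token price.

    Assumes one blended price for input and output tokens, which is a
    simplification: many providers price the two separately.
    """
    credit = Decimal(str(credit_usd))
    price = Decimal(str(usd_per_million_tokens))
    if price <= 0:
        raise ValueError("price must be positive")
    return int(credit / price * 1_000_000)


# Usage: a $1 credit at a hypothetical $0.50 per million tokens.
print(tokens_for_credit(1, "0.50"))
# → 2000000
```

Providers that bill by compute time rather than by token, such as Baseten and Modal above, need a different estimate based on GPU-seconds.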