Awesome Free Models

Running AI shouldn't require a credit card. This list curates genuinely free models — open-weight models you can self-host, free API tiers from major providers, and tools to run everything locally.

🧠 Open-Weight Models

Notable open-weight models you can download and run on your own hardware.

Name	Released	Description
Llama 4 Scout / Maverick	—	Meta's latest MoE generation. Scout: 109B, 10M context. Maverick: 402B, 1M context. Native multimodal. [License]
DeepSeek V4	—	Latest generation with extreme cost-efficiency. MIT license.
DeepSeek-V4-Flash	Apr 2026	Efficiency-focused variant of DeepSeek V4. 1M token context, optimized for fast inference. MIT license.
Gemma 4 31B / 26B MoE / E4B / E2B	—	Fully permissive Apache 2.0. 256K context, native multimodal. New standard for open-weight.
GLM-5.1 (Zhipu AI)	—	744B MoE model, competitive with top proprietary models. MIT license.
MiniMax M3	—	Frontier-tier 1M context, native multimodal + computer use. MSA architecture.
Trinity (Arcee AI)	—	400B parameter enterprise model. Apache 2.0.
Step 3.7 Flash (StepFun)	May 2026	Apache 2.0. Native multimodal (image+video), strong agentic performance. Efficient enough for high-end local hardware.
Kimi K2.6 (Moonshot AI)	Apr 2026	1T-parameter MoE model. Modified MIT license. Exceptional coding (SWE-Bench ~54%) and multi-agent swarm orchestration.
Qwen 3.6-35B-A3B	Apr 2026	MoE variant with only 3B active parameters. Extremely efficient for consumer hardware. Apache 2.0.
InternLM 3 (Shanghai AI Lab)	Early 2026	Strong long-context reasoning and agentic performance. Competitive in open-weight benchmarks.
MiMo-V2.5-Pro (Xiaomi)	Apr 2026	1.02T-parameter MoE (42B active). Optimized for complex agentic tasks, coding, and long-context.
Bonsai 8B (PrismML)	Apr 2026	Groundbreaking 1-bit quantized model. Extremely efficient for edge and consumer hardware (Apple Silicon).
Mistral Small 3.1 (Mistral)	Mar 2025	Versatile 24B multimodal model. Strong text performance with native image understanding and 128K context. Apache 2.0.
Mistral Small 4 (Mistral)	Mar 2026	Hybrid MoE (6.5B active params) unifying instruction, reasoning, and multimodal capabilities. Efficient frontier-class model. Apache 2.0.
Command A+ (Cohere)	May 2026	Enterprise multimodal MoE optimized for sovereignty and multilingual RAG across 48 languages. Apache 2.0.
Hermes 4 (NousResearch)	Feb 2026	Self-improving agentic model with closed-loop learning. Curates own memory and builds skills from experience. Apache 2.0.
Snowflake Arctic	Apr 2024	Enterprise MoE model balancing high-quality performance with efficient training costs. Optimized for complex data operations. Apache 2.0.
Falcon 3 (TII)	Dec 2024	Compact high-performance model with strong reasoning. Designed for efficient deployment on resource-constrained hardware. TII Falcon-LLM License 2.0.
Apple OpenELM	Apr 2024	Family of efficient on-device SLMs using layer-wise attention scaling. Runs locally on Apple Silicon with full privacy. Apple Sample Code License.
Nemotron 3 Ultra (NVIDIA)	Jun 2026	550B MoE (55B active). Hybrid Mamba-Transformer, NVFP4 quantization. Optimized for agentic workflows. Fully open (weights, data, recipes). OpenMDW-1.1 license.

🔌 Free API Providers

Providers offering free tiers to access models via API — no local hardware required.

Name	Description
Google AI Studio	Most generous free tier. Access Gemini 2.5 Flash, Gemini 2.0 Flash, and other models. Generous rate limits for prototyping.
OpenRouter	Aggregates 400+ models from 70+ providers. Filter by "Free" to see models available at no cost. Includes experimental and subsidized open-weight models.
Groq	Extremely fast inference for text, speech, and vision-based OCR. Free tier supports GPT-OSS, Llama 4, and Whisper, but is bottlenecked by low per-minute caps (30 RPM / 8K TPM), requiring a paid upgrade for production scale.
Hugging Face Inference API	Free tier for thousands of community models. Rate-limited but excellent for testing.
NVIDIA NIM	Free API access to accelerated versions of Llama, Mistral, Gemma, Nemotron 3 Ultra, and more on NVIDIA infrastructure.
Together AI	Access 200+ open-source models including MiniMax M3, DeepSeek, Qwen. Free credits may be available for new accounts — check current promotions.
Fireworks AI	$1 free starter credits for new users. Optimized for low latency across 50+ models including GLM 5.2, Kimi K2.7, MiniMax M3. ⚠️ Moved to prepaid billing July 1, 2026.
SiliconFlow	Rising platform with free access to many open-source models.
Cloudflare Workers AI	Free tier for running 50+ open-source models at the edge on Cloudflare's global network. Pay-for-what-you-use pricing.
Replicate	Free tier with limited credits for running open-source models. Pay-per-second for GPU usage.
Poe (Quora)	Free tier with daily credits for GPT-4 mini, Claude instant, and community bots.
Cerebras	1M tokens/day free, no credit card. Ultra-fast inference on WSE chips. Access Llama 3.3 70B, GPT-OSS 120B, Qwen 3, and more via OpenAI-compatible API.
Qwen Chat (Alibaba)	Free access to Qwen 3.7-Plus, Qwen 3.6-Max, and other Qwen models via web chat and API.
Ollama Cloud	Free tier for running open-source models on Ollama's cloud infrastructure. Light usage included, 1 concurrent model. Pro ($20/mo) and Max ($100/mo) tiers available. Zero data retention.
OpenAI API	~$5 trial credits for new API accounts. Access GPT-5, GPT-4o, o4-mini, and more. Rate-limited free tier available after credits expire.
Mistral AI (Vibe)	⚠️ Pivoted to Vibe (consumer AI agent). Free tier for Vibe chat agent with limited messages. API access via Studio — enterprise pricing, contact sales. New models: Mistral OCR 4, Mistral Medium 3.5.
Model Router	Free API with intent-based routing across Groq and Cerebras models (Llama 4 Scout, DeepSeek, Qwen, Nemotron) — no credit card, no trial credits. Set `prefer=cheap
Cohere	Free trial API key for Command A+, North Mini Code, Transcribe, Embed 4, Rerank 4, and Aya models. Rate-limited, not for production.
DeepSeek Platform	Free API credits for new users (5M tokens). Access to DeepSeek V4, DeepSeek-R1, and other models. Generous free allocation.
GitHub Models	Free tier for GitHub users. Access GPT-5, GPT-4o, o4-mini, Phi-4, Llama 4, Mistral, and more with rate-limited playground and API.
Hyperbolic	⚠️ No standalone free tier — pay-per-use GPU cloud. Free credits via referral program only ($5 for referrer, $6 for referee when referee deposits $5+).
Novita AI	Free starter credits for testing 200+ models including DeepSeek V4 Pro, MiniMax M3, GLM-5.1, Kimi K2.6. Also offers Agent Sandbox and GPU Cloud. OpenAI-compatible API.
Anakin.ai	30 daily free credits for accessing multiple AI models. Web chat interface and API access. Supports GPT-4, Claude, DeepSeek, and open-weight models.
Anthropic (Claude API)	~$5 trial credits for new API accounts. Access Claude Opus 4.8, Sonnet, and Haiku models. Phone verification required.
Nebius AI	$100 free credits for new users. AI Studio with access to Llama, Qwen, DeepSeek, Nemotron 3 Ultra, and other open-weight models. Fast inference on NVIDIA H100/B200 infrastructure.
Fal.ai	Free starter credits for generative media inference. 1,000+ models for image, video, audio, and 3D. SOC 2 compliant. Pay-as-you-go beyond free tier.
Vercel AI Gateway	$5/month free credits for the AI Gateway. Proxy and cache requests across multiple LLM providers. SDK is open-source and free.
AI21 Labs	⚠️ Enterprise-focused. Maestro framework and Jamba models. No clear free tier visible — contact sales.
Amazon Bedrock	$200 AWS credits for new customers. Access to Llama, Mistral, Claude, Titan, and other foundation models via API.
Azure AI Foundry	$200 free trial credits (30 days). Access GPT-4o, Llama, Mistral, Phi, and other models via Azure's unified AI platform.
xAI (Grok)	$25 sign-up credits + $150/month with data-sharing program. Access Grok-3, Grok-3 Mini via API. No credit card required. ⚠️ Console may require alternate access.
ZeroLimitAI	Free API with auto model routing to the best available free model (Gemini 2.5 Flash, Llama 4, DeepSeek R1). No credit card, drop-in OpenAI replacement. Paid plans from $49 one-time.
Stability AI	Free API credits for image generation with Stable Diffusion and Stable Video models. Rate-limited access without credit card.
Eden AI	Free tier aggregating 500+ models from multiple providers via a single API key. Unified interface for text, image, and code generation. GDPR-compliant EU endpoint available.
SambaNova	Free tier with fast inference on custom RDU chips. Access DeepSeek, Llama 4, MiniMax M2.7, GPT-OSS 120B. Fastest inference speeds available.
Inference.net	Free tier for LLM observability and monitoring. Deploy, trace, evaluate, and train custom models. SOC 2 Type II compliant.
RunPod	⚠️ No free tier — pay-per-use for pods, serverless, and clusters.
FreeTheAi	Free OpenAI-compatible AI API gateway with 50+ active models. Discord-based key signup with daily check-in to keep access active. Streaming, tool calling, and multiple model support. No credit card.

💻 Local Inference Tools

Run models on your own machine — no API keys needed, full privacy.

Name	Description
Ollama	The easiest way to run local LLMs. One command to download and run any model. macOS, Linux, Windows. GitHub
LM Studio	Polished desktop GUI. Browse, download, and chat with models. Built-in model browser and local API server.
llama.cpp	High-performance C++ inference engine. Runs on CPU and GPU. Supports GGUF quantization. Powers most other local tools.
Jan	Open-source ChatGPT alternative for desktop. Built-in model downloader, local API server. GitHub
GPT4All	Privacy-focused local chatbot. Runs on consumer hardware. Built-in model browser. GitHub
text-generation-webui (Oobabooga)	Feature-rich web UI. Supports multiple backends (Transformers, llama.cpp, ExLlama, AutoGPTQ).
LocalAI	Drop-in OpenAI API replacement. Run models locally with an OpenAI-compatible API. GitHub
KoboldCPP	Single-file executable for running GGUF models. Focused on story generation but general-purpose.
llamafile (Mozilla)	Distributable single-file executables that run LLMs. No installation needed.
vLLM	High-throughput production inference engine. Uses PagedAttention for efficient serving.
SGLang	Fast inference framework with structured generation and RadixAttention.
TensorRT-LLM (NVIDIA)	NVIDIA's optimized inference engine. Best performance on NVIDIA GPUs.
ExLlamaV2	Fast GPTQ/EXL2 inference library. Less active; newer development on ExLlamaV3.
Aphrodite Engine	High-performance LLM serving engine with advanced quantization support.
TabbyAPI	Lightweight, fast OpenAI-compatible API server for ExLlamaV2.
LlamaEdge	Lightweight inference framework for edge devices. OpenAI-compatible API for open-source models. Runs on WasmEdge for portability. GitHub
MLC LLM	Universal deployment engine by UW/SJTU. Runs LLMs on any hardware — laptops, phones, browsers. OpenAI-compatible API.
WebLLM	In-browser LLM inference via WebGPU. Runs models directly in your browser with zero setup. No server needed.
FastChat (LMSYS)	Open platform for training, serving, and evaluating LLMs. Provides OpenAI-compatible API and web UI for local models.
Hugging Face TGI	⚠️ Archived. Use vLLM or SGLang instead.
DeepSpeed (Microsoft)	Deep learning optimization library with inference acceleration. Enables running larger models on limited hardware through ZeRO optimization.
AirLLM	Run large models (70B+) on consumer hardware with limited memory. Loads models layer-by-layer for extreme memory efficiency.
AI Toolkit for VS Code (Microsoft)	VS Code extension to browse, test, fine-tune, and deploy models locally. Integrates ONNX and llama.cpp.
Ollama Grid Search	Desktop utility for systematic model evaluation. Test multiple models, prompts, and inference parameters side-by-side via a Rust/React GUI.

💬 AI Chatbot UIs

Free, open-source web interfaces for chatting with AI models — self-host or use hosted versions.

Name	Description
Open WebUI	Feature-rich ChatGPT-like interface for Ollama and OpenAI-compatible backends. RAG, image generation, multi-user. GitHub
LibreChat	Open-source ChatGPT clone supporting 40+ providers, multi-user, plugins, and RAG. GitHub
AnythingLLM	All-in-one desktop app for chatting with documents and models. Built-in RAG pipeline. GitHub
Big-AGI	Feature-rich AI chat with personas, multi-model support, voice, and code execution. GitHub
LobeChat	Modern, extensible chat framework with plugin system and multi-provider support. GitHub
Chatbot UI	Simple, clean ChatGPT interface. Easy to self-host with any OpenAI-compatible API. ⚠️ Open-source repo unmaintained. GitHub
NextChat (ChatGPT-Next-Web)	Lightweight cross-platform chat app. Self-host on Vercel or download official desktop/mobile clients.

🖥 AI CLI Tools

General-purpose terminal-based AI tools — chat, summarization, file operations, and more.

Name	Released	Description
Gemini CLI	Feb 2025	Google's open-source terminal AI agent. 1,000 requests/day free on personal Google account. General-purpose agent for code, chat, and shell tasks. Gemini 3 models, 1M context. Apache 2.0. 106k stars.
Codex	May 2025	OpenAI's lightweight coding agent. Rust-based with OS-level sandboxing (macOS Seatbelt, Linux Landlock). AGENTS.md support, image input, subagents, MCP. Apache 2.0. 93k stars.
OpenCode	Jan 2025	Go-based terminal AI agent. Model-neutral, supports 75+ LLM providers, LSP integration, and MCP tools. Desktop app in beta. MIT. 178k stars.
Pi	2024	Open-source terminal AI agent with unified multi-provider API. Model-agnostic, extensible plugin architecture. 65k stars. MIT.
Hermes Agent	Feb 2026	Nous Research's self-improving terminal AI agent. Full TUI with slash commands, 40+ tools, persistent memory. Multi-platform gateway (Telegram, Discord, Slack, WhatsApp). Closed learning loop with autonomous skill creation. Apache 2.0. 202k stars.
Vibe CLI	2025	Mistral's open-source CLI coding agent. Free tier with Mistral Experiment tier (no credit card). Conversational iterative workflow. AGENTS.md support, skills system, voice mode, MCP. Apache 2.0. 4.6k stars.
Goose	2024	Open-source CLI agent for complex software engineering tasks. Extensible plugin system. Originally by Block, now under the Agentic AI Foundation (AAIF) at the Linux Foundation. Desktop app + CLI + API. Rust-based. Apache 2.0. 50k stars.
MiMo Code	2026	Xiaomi's terminal AI tool with persistent memory, multi-agent orchestration, and 1M-token context. Free tier available. Supports mimo-v2.5-pro, mimo-v2.5, mimo-v2-omni models. Web UI in alpha.
Tuillem	2025	3-pane terminal AI chat client written in Rust. Switch providers and models mid-conversation. Full markdown rendering, SQLite history with FTS5 search. 10 built-in themes. Plugin system. MIT.
Hai	2025	Lightweight terminal AI agent. Run commands or ask questions. Supports OpenAI, Claude, Gemini, DeepSeek. Agent mode with auto shell execution, pipe support, predefined prompts. GPL-3.0.
Freebuff	—	An AI-powered CLI, supported by ads, with multi-agent orchestration.

🤖 AI Coding Assistants

Free tools that integrate AI into your development workflow.

Name	Description
Continue.dev	Open-source AI code assistant. Chat, autocomplete, and edit with any model. GitHub
Aider	AI pair programming in the terminal. Edits code in your local git repo. Supports GPT, Claude, and local models. GitHub
Devin Desktop (formerly Windsurf/Codeium)	AI code editor with autocomplete, chat, and search. Now by Cognition. Free tier available. Pro $20/mo.
Tabby	Self-hosted AI coding assistant with no dependency on external services. GitHub
Cody (Sourcegraph)	Free tier for individuals. Chat, autocomplete, and commands with codebase context.
Llama Coder (Nutlope)	Free AI code generation tool. Generate entire apps from prompts.
Bolt.new (StackBlitz)	Free tier for AI-powered full-stack web app development in browser.
Claude Code (Anthropic)	Requires Claude subscription or API account. Terminal-based AI coding assistant.
Cursor	AI-native code editor with deep model integration and agentic features. Free tier available. Pro $20/mo.
CodeBuff	⚠️ Paid only. CLI-based AI coding assistant that understands entire codebases.
Cline	Popular autonomous VS Code agent. Creates/edits files, runs terminal commands, browses web. Open-source, BYOK (bring your own API key). GitHub
OpenHands	Autonomous AI software engineer. Navigates file systems, runs shell commands, tests code in browser. Self-hostable. GitHub
Kodu (Claude Coder)	VS Code autonomous coding agent. Builds projects from scratch, handles complex tasks with natural language.
Goose	Open-source CLI agent for complex software engineering tasks. Extensible plugin system. Originally by Block, now under the Agentic AI Foundation (AAIF) at the Linux Foundation. GitHub

📝 Code Models

Specialized for code generation, completion, and analysis.

Name	Description
DeepSeek Coder	State-of-the-art open-weight code generation. DeepSeek's coder series leads SWE-bench. MIT license.
Qwen2.5-Coder (Alibaba)	Highly capable code model series (1.5B–32B). Excellent balance of speed and quality. Apache 2.0.
Codestral (Mistral)	Mistral's dedicated code generation model — fill-in-the-middle, completion, and instruction.
CodeGemma (Google)	Google's Gemma architecture fine-tuned for code completion and instruction. Apache 2.0.
StarCoder2 (BigCode)	Transparently trained code model covering 619 languages. OpenRAIL-M license.
Yi-Coder (01.AI)	Efficient coding model with strong long-context understanding. Yi License (Apache 2.0 compatible).
Phi-4-mini (Microsoft)	Lightweight model optimized for reasoning and code. Punches above its weight class. MIT license.
Qwen3-Coder-Next (Alibaba)	Early 2026. Latest generation of Qwen's code series. Strong reasoning and long-context coding capabilities. Apache 2.0.
CodeLlama (Meta)	Aug 2023. Llama 2-based code generation pioneer. Supports infilling, completion, and instruction. Llama 2 Community License.
WizardCoder (WizardLM)	2023. Evol-Instruct fine-tuned for complex coding tasks. Strong general code generation performance. Apache 2.0.
OpenCodeInterpreter	2024. Integrates execution feedback to iteratively improve generated code. Bridges generation and execution. Apache 2.0.
Stable Code 3B (Stability AI)	Aug 2023. Lightweight 3B code model optimized for fill-in-the-middle. Efficient for local autocompletion. StabilityAI license.
CodeGeeX2 (THUDM)	2023. Multilingual code model supporting 20+ languages. Strong in both Chinese and English code tasks. Apache 2.0.
CodeT5+ (Salesforce)	2023. Encoder-decoder architecture unifying code generation, completion, and understanding. BSD-3 license.
SantaCoder (BigCode)	2023. Light 1.1B model specialized for Python, Java, and JavaScript. Fast and efficient for IDE integration.

🔍 RAG & Vector Databases

Free tools for building retrieval-augmented generation pipelines — vector storage, embedding search, and document retrieval.

Name	Description
Chroma	AI-native open-source embedding database. Runs in-process, no GPU needed. GitHub
Qdrant	High-performance vector search engine. Free tier on Qdrant Cloud or self-host via Docker. GitHub
pgvector	Vector similarity search inside PostgreSQL. Free if you already run Postgres.
LanceDB	Developer-friendly vector database built on Lance columnar format. Runs locally, no server needed. GitHub
Weaviate	Open-source vector database. Free sandbox tier on Weaviate Cloud. GitHub
Milvus (Zilliz)	Cloud-native vector database. Free tier on Zilliz Cloud or self-host. GitHub
txtai	AI-powered semantic search and RAG in a single Python package. GitHub
R2R (SciPhi)	Production-ready RAG engine with API, user management, and observability.
Docling (IBM)	Document understanding and conversion for RAG pipelines. Extracts PDFs, images, and more. GitHub
Unstructured.io	Preprocessing toolkit for documents (PDF, HTML, Word) for RAG pipelines. Free tier available.
RAGFlow	Open-source RAG engine with deep document parsing, OCR, and knowledge base management. Supports多种 document formats.
RAGatouille	Python package bringing ColBERT-style late interaction retrieval to RAG pipelines. Works as retriever and reranker. Free and open-source.
Ragas	Open-source evaluation framework for RAG pipelines. Measures retrieval accuracy, answer relevance, and faithfulness.

🧩 Agentic Frameworks

Free, open-source frameworks for building AI agents and multi-agent systems.

Name	Description
LangGraph (LangChain)	Low-level framework for building stateful, multi-agent applications. GitHub
CrewAI	Multi-agent framework for orchestrating specialized AI agents to work together. GitHub
AutoGen (Microsoft)	Extensible framework for building multi-agent conversations. ⚠️ Maintenance mode — use Microsoft Agent Framework instead. GitHub
Agno (formerly Phidata)	Full-stack AI framework for building multimodal agents with memory, knowledge, and tools. GitHub
PydanticAI	Agent framework by Pydantic with type-safe outputs and dependency injection. GitHub
Mastra	TypeScript framework for building AI applications and agent workflows. GitHub
OpenAI Agents SDK	Lightweight SDK for building single and multi-agent systems. GitHub
Semantic Kernel (Microsoft)	SDK for orchestrating AI agents with planners, memory, and connectors. GitHub
Dify	LLM app development platform with visual workflow builder and agent capabilities. GitHub
Flowise	Low-code visual LLM flow builder with drag-and-drop interface. GitHub
TaskWeaver (Microsoft)	⚠️ Archived. Code-first agent framework for planning and executing complex tasks. GitHub
Fazm	Apr 2026. Open-source local computer-use agent for macOS. Drives apps via accessibility APIs, model-agnostic, faster than screenshot-based agents.
Smolagents (Hugging Face)	Minimalist agent library where agents "think in code." Lightweight, zero boilerplate. Supports code agents and tool-calling agents.
Swarms	Enterprise-grade multi-agent orchestration framework. Scalable infrastructure for autonomous agent swarms. Highly modular.
Letta (MemGPT)	Framework for long-term agent memory. Virtual memory management that pages data in/out of context like an OS. Persistent agents.
Griptape	Enterprise agent framework with strictly typed Pipelines, Workflows, and Agents. Structure-first, production-ready.
OpenAI Swarm	Experimental lightweight multi-agent orchestration. Uses Agents and Handoffs abstractions. Educational and minimalist.
Atomic Agents	Framework inspired by Atomic Design. Compose agents from small, reusable, modular components. Testable and scalable.
PraisonAI	Low-code multi-agent framework. Define agent roles, tasks, and flows via YAML configuration. Wraps underlying agent frameworks.
Cognee	GraphRAG framework for agent knowledge management. Builds interconnected knowledge graphs from unstructured data.
AgentZero	Self-healing autonomous agent with web UI. Manages own workflows, tool use, and environment. Self-evolving capabilities.
MetaGPT	Multi-agent framework simulating a full software team. Assigns Agent, Product Manager, Engineer roles. Implements SOPs for end-to-end code generation.
ChatDev (OpenBMB)	Virtual software company driven by multi-agent collaboration. Follows waterfall model through design, coding, testing, and documentation.
AutoGPT	The original autonomous agent experiment. Sets its own goals, iterates on tasks, and executes without continuous human input. Web browsing and file management.
Bee Agent Framework (IBM)	Production-ready framework for building reliable AI agents in Python and TypeScript. Modular, with built-in observability and IBM research optimizations.
Eliza (elizaOS)	Multi-platform agent framework for creating character-driven AI agents. Handles social media interaction, complex decision-making, and autonomous behavior across platforms.
SuperAGI	⚠️ Unmaintained. Developer-focused autonomous agent platform with GUI. Built-in resource management, file handling, and multi-tasking for running agents at scale.
AgentVerse (OpenBMB)	⚠️ Unmaintained. Framework for building and evaluating multi-agent environments. Easily configure agent teams and measure collaborative performance.
Qwen-Agent (Alibaba)	Agent framework tightly integrated with the Qwen model family. Optimized for function calling, code execution, RAG, and tool use with Qwen models.
AGiXT	Extensible modular AI agent automation platform. Plugin system for swapping LLMs, memory backends, and tools. Highly customizable agent workflows.
Deus	Self-hosted personal AI assistant framework built around long-term memory, a self-improving evolution loop, and multi-agent orchestration. Backend-neutral (Claude Code, OpenAI/Codex, or fully local Ollama models), container-isolated agents, multi-channel (WhatsApp, Telegram, Slack, Discord, Gmail). MIT.

🎛 Fine-tuning Tools

Tools to fine-tune free models on your own data — all free and open-source.

Name	Description
Unsloth	Fast memory-efficient fine-tuning. 2x faster, 50% less memory. Supports QLoRA, LoRA, full fine-tune.
Axolotl	Streamlined fine-tuning framework supporting multiple model architectures and quantization methods.
LLaMA-Factory	Easy-to-use fine-tuning with web UI. Supports 100+ models, multiple training methods.
Hugging Face TRL	Transformer Reinforcement Learning library. SFT, PPO, DPOTrainer, GRPOTrainer for aligning models.
TorchTune (Meta)	Native PyTorch library for fine-tuning LLMs. Simple, extensible, efficient. ⚠️ Development wound down in 2025.
XTuner (InternLM)	Efficient fine-tuning toolkit supporting QLoRA, LoRA, and full fine-tune with multiple model architectures.
Ludwig (Predibase)	Declarative ML framework. Fine-tune models with a simple config file. GitHub
PyTorch Lightning	Free deep learning framework for training and fine-tuning. Simplifies distributed training, checkpointing, and logging. GitHub
Hugging Face Accelerate	Zero-config distributed training for PyTorch. Enables easy multi-GPU and TPU training with minimal code changes.
ColossalAI	Open-source distributed training system with parallelism strategies. Supports large model training on limited hardware.
JAX (Google)	High-performance ML framework with automatic differentiation and JIT compilation. Powers many modern training pipelines.
Ray Train	Distributed training framework built on Ray. Supports PyTorch, TensorFlow, and JAX with automatic scaling.
Determined AI	Open-source ML training platform with hyperparameter search, GPU scheduling, and experiment tracking.

✨ Prompt Engineering Tools

Free tools for testing, managing, and optimizing prompts.

Name	Description
Promptfoo	Open-source tool for prompt testing and evaluation. Systematic A/B testing of prompts. GitHub
Fabric (Daniel Miessler)	Open-source framework for augmenting humans with AI. Library of curated prompts (patterns) for common tasks.
LangFuse	Open-source LLM engineering platform with prompt management, versioning, and evaluation. GitHub
OpenPrompt (THUNLP)	⚠️ Unmaintained. Framework for prompt-learning research. Supports template and verbalizer design.
DSPy (Stanford)	Framework for algorithmically optimizing LM prompts and weights. GitHub
Agenta	Open-source LLM platform for prompt management, evaluation, and deployment. GitHub
ChainForge	Open-source visual programming environment for prompt engineering. Test prompts across multiple LLMs, compare responses, and evaluate robustness. GitHub
Latitude	Open-source prompt engineering platform with versioning, playground, evaluation, and deployment as API endpoints. GitHub
DeepEval	Open-source evaluation framework for LLM outputs. 50+ metrics, pytest integration, and CI/CD support for prompt regression testing.
PromptLayer	Prompt versioning and monitoring platform. Tracks prompt versions, cost, latency, and model behavior. Free tier with 10K calls/month.
OpenPromptHub	Community-driven prompt engineering platform. Discover, share, and contribute prompt patterns. Free and open-source.

📊 Datasets

Free, open datasets for training, fine-tuning, and evaluating models.

Name	Description
Hugging Face Datasets	The standard hub for open datasets. 150,000+ datasets across all tasks.
Common Corpus	Massive open-source dataset for training large language models. Gated dataset — requires Hugging Face login.
The Stack v2 (BigCode)	Large-scale code dataset covering 619 programming languages. Permissive license.
FineWeb (Hugging Face)	High-quality web dataset for LLM pre-training. 15T tokens.
Dolly (Databricks)	15k instruction-response pairs for fine-tuning. CC-BY-SA.
OpenAssistant Conversations	160k human-generated assistant conversations. Apache 2.0.
ShareGPT (RyokoAI)	Real user-ChatGPT conversations for fine-tuning.
UltraChat (Sean C.)	200k multi-turn conversations synthesized by ChatGPT.
No Robots (Hugging Face)	10k high-quality human-written instructions. Apache 2.0.
RLAIF-V (OpenBMB)	AI-generated preference data for RLHF. Apache 2.0.
MMLU / GSM8K	Standard benchmarks for evaluation.

☁ Model Hosting Platforms

Free platforms that host models — run inference without downloading anything.

Name	Description
Hugging Face Spaces	Free hosting for ML apps (Gradio, Streamlit). Thousands of community demos.
Hugging Face Inference Endpoints	⚠️ Paid service — pay-as-you-go starting $0.06/hr.
Google Colab (Free Tier)	Free GPU (T4, sometimes A100). Perfect for running models and fine-tuning.
Kaggle Notebooks	Free GPU (T4 x2). 30 hours/week. Good for heavier workloads.
Lightning AI Studio	Free tier with GPU access for development and prototyping.
Modal	Free monthly credits for serverless GPU compute.
Replicate (Free Tier)	Free credits for running community models.
Deepnote	Free tier with GPU for data science and ML notebooks.
Beam	$30/mo free credits for serverless GPU compute. Fast cold starts (<1s), auto-scaling, Python SDK. Open-source runtime.
Cerebrium	⚠️ Paid service — compute-based billing with sub-second cold starts.
Baseten	⚠️ Paid service — serverless GPU inference. Truss open-source framework, auto-scaling.

🏖️ Core AI Execution Sandboxes

Free, isolated sandbox environments for executing AI agent code, running untrusted scripts, and building agent workflows — no infrastructure to manage.

Name	Description
E2B	The most popular sandbox for AI agents. $100 free credit (one-time), no credit card. Firecracker microVMs, 150ms cold starts, 20 concurrent sandboxes, 1-hour sessions. Python/JS SDKs. Docker MCP Catalog (200+ tools).
Novita AI Sandbox	$100 free credits (90-day validity). 5 concurrent sandboxes, 1-hour max session, 2 vCPU / 4 GB RAM. Sub-200ms startup, per-second billing. Code execution, browser automation, computer use.
Hopx	$200 free credits, no credit card. Firecracker microVMs, ~100ms cold start. Full Linux with file/exec/PTY access. Persistent state, unlimited runtime. Python, JavaScript, Bash, Go. Per-second billing.
InstaVM	$50 free credits, no credit card. MicroVMs for AI agents with persistent state, networking, and secrets injection. Sub-200ms boot, SSH access, full Linux Desktop for browser automation.
OmniRun	25 sandbox-hours/month free, no credit card. MicroVM isolation (own kernel per sandbox). ~250ms boot. 6 languages. Network blocked by default. Claude Managed Agents compatible.
SimpleSandbox	1M credits/month free (~17 hours compute). 3 concurrent sandboxes, Firecracker microVMs, ~1s cold start. Per-second billing. 50% cheaper than E2B, no enterprise minimums.
SandboxAPI	500 executions/month free. 12 languages (Python, Node, Go, Rust, Java, etc.). gVisor isolation, streaming output, persistent sessions. MCP-native — works with Claude Desktop, Cursor, VS Code.
Tensorlake	2 concurrent sandboxes free forever. 1 core / 1 GB RAM / 10 GB disk, up to 2-hour sessions. Firecracker microVMs, SOC 2 Type 2. Unmetered sessions.
SmolVM (CelestoAI)	Free & open-source (Apache 2.0). Self-hostable microVM sandboxes for AI agents. Sub-second boot, hardware isolation, browser sandbox, file sharing, snapshots. Python SDK. Run Claude Code, Codex, or Pi pre-installed.

📚 Learning Resources

Free courses, books, and tutorials for learning AI and LLMs.

Name	Description
Fast.ai	Code-first deep learning education. Practical, free courses from fundamentals to advanced.
Hugging Face LLM Course	Comprehensive free course on transformers, tokenizers, datasets, and deployment.
DeepLearning.AI Short Courses	Free short courses on LLMs, RAG, LangChain, and AI agents.
Full Stack Deep Learning	Free course on ML engineering: training, deploying, and maintaining models.
Andrej Karpathy's Course	From-scratch neural network implementation videos.
Neural Networks: Zero to Hero	YouTube series building neural networks from scratch.
LLM University (Cohere)	Free course on LLMs, embeddings, and RAG.
Prompt Engineering Guide (DAIR.AI)	Comprehensive free guide on prompt engineering techniques.
Anthropic Cookbook	Free recipes and patterns for working with Claude.
OpenAI Cookbook	Free examples and guides for the OpenAI API.

🏆 Resources & Leaderboards

Name	Description
Perplexity	Free AI search and research assistant with real-time answers and source citations.
Hugging Face Open LLM Leaderboard	The primary benchmark for open-weight models. Updated regularly.
LMSYS Chatbot Arena	Human preference rankings of models. Best source for real-world quality comparisons.
Artificial Analysis	Independent benchmarks for speed, pricing, and quality across providers.
Hugging Face Models	Search 1M+ models. Filter by license, task, framework.
OpenRouter Models	Browse models available via API with pricing and free tiers.
Ollama Library	Browse models available for one-command local setup.
cheahjs/free-llm-api-resources	Community-maintained list of free LLM API resources.
SweetTea	⚠️ Site appears down. Community voting on model quality and preference. May be defunct.

👥 Communities

Name	Description
Hugging Face Discord	Model releases, discussions, and community support.
r/LocalLLaMA	The largest Reddit community for running local LLMs.
Ollama Discord	Ollama community for local model enthusiasts.
LM Studio Discord	LM Studio community.
Hugging Face Forums	Discussions on models, datasets, and Spaces.
r/MachineLearning	General ML/AI research and news.
Discord: AI Agents	Community for AI agent development and agentic frameworks.
r/OpenAI	Official Reddit community for OpenAI models, API discussions, and releases.
r/artificial	General AI discussion covering research, news, and ethics.
OpenAI Developer Forum	Official forum for OpenAI API developers. Share prompts, troubleshoot, and discuss best practices.
Nous Research Discord	Community for open-source AI development, Hermes models, and decentralized training (DisTrO).
Learn AI Together Discord	Active learning community with 10K+ members. Ask questions, find teammates, and share projects.

License

To the extent possible under law, the author has waived all copyright and related or neighboring rights to this work.