Guide: A Free Proxy for Claude Code API Calls (NVIDIA NIM, OpenRouter, and more)
Summary
This project is a lightweight proxy that routes the traffic Claude Code sends to the Anthropic API to alternative backends such as NVIDIA NIM, OpenRouter, DeepSeek, and LM Studio. Using NVIDIA NIM's free quota of 40 requests per minute, or fully local models, it lets you use Claude Code's features without API costs. It applies to the existing Claude Code CLI and VSCode extension through environment variable settings alone, with no modifications required.
Key Points
- Use the API for free at up to 40 requests per minute via NVIDIA NIM.
- Supports five or more backend providers, including OpenRouter, DeepSeek, and LM Studio.
- Drop-in replacement: applies to the Claude Code CLI and VSCode extension through environment variables alone, with no modifications.
- Freely mix providers per model: Opus, Sonnet, and Haiku can each route to a different API provider.
This project provides a lightweight proxy that routes Claude Code's Anthropic API calls to various providers, including NVIDIA NIM (40 req/min free), OpenRouter (hundreds of models), DeepSeek (direct API), LM Studio (fully local), or llama.cpp (local with Anthropic endpoints).
✨ Features
- Zero Cost: 40 req/min free on NVIDIA NIM. Free models on OpenRouter. Fully local with LM Studio.
- Drop-in Replacement: Set 2 env vars. No modifications to Claude Code CLI or VSCode extension needed.
- 5 Providers: NVIDIA NIM, OpenRouter, DeepSeek, LM Studio (local), llama.cpp (llama-server).
- Per-Model Mapping: Route Opus / Sonnet / Haiku to different models and providers. Mix providers freely.
- Thinking Token Support: Parses `<think>` tags and `reasoning_content` into native Claude thinking blocks (see the sketch after this list).
- Heuristic Tool Parser: Models outputting tool calls as text are auto-parsed into structured tool use.
- Request Optimization: 5 categories of trivial API calls intercepted locally, saving quota and latency.
- Smart Rate Limiting: Proactive rolling-window throttle + reactive 429 exponential backoff + optional concurrency cap.
- Discord / Telegram Bot: Remote autonomous coding with tree-based threading, session persistence, and live progress.
- Subagent Control: Task tool interception forces `run_in_background=False`. No runaway subagents.
- Extensible: Clean `BaseProvider` and `MessagingPlatform` ABCs. Add new providers or platforms easily.
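As a rough illustration of the thinking-token parsing step (an assumption about the approach, not the project's actual code), the sketch below splits `<think>` spans out of a provider's raw text into Claude-style thinking and text blocks. The function name and block schema are illustrative:

```python
import re

def split_thinking_blocks(text: str) -> list[dict]:
    """Illustrative sketch: split '<think>...</think>' spans out of raw
    provider output into Claude-style content blocks. The exact block
    schema used by the proxy may differ."""
    blocks = []
    pos = 0
    for match in re.finditer(r"<think>(.*?)</think>", text, flags=re.DOTALL):
        # Text before the <think> span becomes a normal text block.
        before = text[pos:match.start()].strip()
        if before:
            blocks.append({"type": "text", "text": before})
        thinking = match.group(1).strip()
        if thinking:
            blocks.append({"type": "thinking", "thinking": thinking})
        pos = match.end()
    tail = text[pos:].strip()
    if tail:
        blocks.append({"type": "text", "text": tail})
    return blocks

# split_thinking_blocks("<think>plan steps</think>Here is the answer.")
# -> [{'type': 'thinking', 'thinking': 'plan steps'},
#     {'type': 'text', 'text': 'Here is the answer.'}]
```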
🔑 Getting Started
API Keys & Local Setup
- Get an API key (or use LM Studio / llama.cpp locally):
- NVIDIA NIM: build.nvidia.com/settings/api-keys
- OpenRouter: openrouter.ai/keys
- DeepSeek: platform.deepseek.com/api_keys
- LM Studio: No API key needed. Run locally with LM Studio.
- llama.cpp: No API key needed. Run `llama-server` locally.
🚀 Installation
1. **Install Claude Code and `uv`:** install Claude Code per Anthropic's docs; the proxy itself is run with `uv`:
```bash
pip install uv
```
*(If `uv` is already installed, run `uv self update` to get the latest version.)*
2. **Clone Repository & Setup:**
```bash
git clone https://github.com/Alishahryar1/free-claude-code.git
cd free-claude-code
cp .env.example .env
```
3. **Choose your provider and edit `.env`:**
**NVIDIA NIM (40 req/min free, recommended)**
```bash
NVIDIA_NIM_API_KEY="nvapi-your-key-here"
MODEL_OPUS="nvidia_nim/z-ai/glm4.7"
MODEL_SONNET="nvidia_nim/moonshotai/kimi-k2-thinking"
MODEL_HAIKU="nvidia_nim/stepfun-ai/step-3.5-flash"
MODEL="nvidia_nim/z-ai/glm4.7" # fallback

# Global switch for provider reasoning requests and Claude thinking blocks.
ENABLE_THINKING=true
```
**OpenRouter (hundreds of models)**
```bash
OPENROUTER_API_KEY="sk-or-your-key-here"
MODEL_OPUS="open_router/deepseek/deepseek-r1-0528:free"
MODEL_SONNET="open_router/openai/gpt-oss-120b:free"
MODEL_HAIKU="open_router/stepfun/step-3.5-flash:free"
MODEL="open_router/stepfun/step-3.5-flash:free" # fallback
```
**DeepSeek (direct API)**
```bash
DEEPSEEK_API_KEY="your-deepseek-key-here"
MODEL_OPUS="deepseek/deepseek-reasoner"
MODEL_SONNET="deepseek/deepseek-chat"
MODEL_HAIKU="deepseek/deepseek-chat"
MODEL="deepseek/deepseek-chat" # fallback
```
**LM Studio (fully local, no API key)**
```bash
MODEL_OPUS="lmstudio/unsloth/MiniMax-M2.5-GGUF"
MODEL_SONNET="lmstudio/unsloth/Qwen3.5-35B-A3B-GGUF"
MODEL_HAIKU="lmstudio/unsloth/GLM-4.7-Flash-GGUF"
MODEL="lmstudio/unsloth/GLM-4.7-Flash-GGUF" # fallback
```
**llama.cpp (fully local, no API key)**
```bash
LLAMACPP_BASE_URL="http://localhost:8080/v1"
MODEL_OPUS="llamacpp/local-model"
MODEL_SONNET="llamacpp/local-model"
MODEL_HAIKU="llamacpp/local-model"
MODEL="llamacpp/local-model" # fallback
```
**Mix providers:**
Each `MODEL_*` variable can use a different provider; `MODEL` is the fallback for unrecognized Claude models (a routing sketch follows this example).
```bash
NVIDIA_NIM_API_KEY="nvapi-your-key-here"
OPENROUTER_API_KEY="sk-or-your-key-here"
MODEL_OPUS="nvidia_nim/moonshotai/kimi-k2.5"
MODEL_SONNET="open_router/deepseek/deepseek-r1-0528:free"
MODEL_HAIKU="lmstudio/unsloth/GLM-4.7-Flash-GGUF"
MODEL="nvidia_nim/z-ai/glm4.7" # fallback
```
🔄 Migration & Authentication
- **Migration:** `NIM_ENABLE_THINKING` was removed in this release. Rename it to `ENABLE_THINKING`.
- **Optional Authentication** (restrict access to your proxy): Set `ANTHROPIC_AUTH_TOKEN` in `.env` to require clients to authenticate:
```bash
ANTHROPIC_AUTH_TOKEN="your-secret-token-here"
```
How it works (a minimal check is sketched below):
- If `ANTHROPIC_AUTH_TOKEN` is empty (default), no authentication is required (backward compatible).
- If set, clients must provide the same token via the `ANTHROPIC_AUTH_TOKEN` header.
- The `claude-pick` script automatically reads the token from `.env` if configured.
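For intuition, here is such a header check written as a FastAPI dependency. This is an illustrative assumption of how the check could work, not the proxy's actual code; the `/v1/messages` route path and handler are placeholders:

```python
import os

from fastapi import Depends, FastAPI, Header, HTTPException

# Illustrative sketch of the optional auth check described above.
EXPECTED_TOKEN = os.environ.get("ANTHROPIC_AUTH_TOKEN", "")

def check_auth(
    token: str | None = Header(default=None, alias="ANTHROPIC_AUTH_TOKEN"),
) -> None:
    # Empty token (the default) disables auth entirely -- backward compatible.
    if not EXPECTED_TOKEN:
        return
    # Otherwise the client must send the identical token in the header.
    if token != EXPECTED_TOKEN:
        raise HTTPException(status_code=401, detail="Invalid or missing token")

app = FastAPI()

@app.post("/v1/messages", dependencies=[Depends(check_auth)])
async def messages() -> dict:
    return {"ok": True}  # placeholder; the real proxy forwards to a provider
```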
**Example usage:**
```bash
# With authentication
ANTHROPIC_AUTH_TOKEN="your-secret-token-here" \
ANTHROPIC_BASE_URL="http://localhost:8082" claude

# claude-pick automatically uses the configured token
claude-pick
```
Use this feature if you:
- Run the proxy on a public network.
- Share the server with others but want to restrict access.
- Want an additional layer of security.
💻 Usage Guide
**Terminal 1: Start the proxy server:**
```bash
uv run uvicorn server:app --host 0.0.0.0 --port 8082
```
**Terminal 2: Run Claude Code:**
Point `ANTHROPIC_BASE_URL` at the proxy root URL, not `http://localhost:8082/v1`.
```bash
# Windows (PowerShell)
$env:ANTHROPIC_AUTH_TOKEN="freecc"; $env:ANTHROPIC_BASE_URL="http://localhost:8082"; claude

# OR (Linux/macOS)
export ANTHROPIC_AUTH_TOKEN="freecc"
export ANTHROPIC_BASE_URL="http://localhost:8082"
claude
```
That's it! Claude Code now uses your configured provider for free.
🖥️ IDE Setup
VSCode Extension Setup:
- Start the proxy server (same as above).
- Open Settings (`Ctrl + ,`) and search for `claude-code.environmentVariables`.
- Click Edit in `settings.json` and add:
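The original snippet is truncated at this point. The shape below is an assumption based on the environment variables used earlier in this guide (`ANTHROPIC_BASE_URL`, `ANTHROPIC_AUTH_TOKEN`); check the extension's setting schema for the exact format:

```json
{
  "claude-code.environmentVariables": [
    { "name": "ANTHROPIC_BASE_URL", "value": "http://localhost:8082" },
    { "name": "ANTHROPIC_AUTH_TOKEN", "value": "freecc" }
  ]
}
```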
AI-Generated Content
This content was automatically summarized, translated, and analyzed by AI from the original post on GitHub Trending Python (daily). Copyright remains with the original author; please consult the original for authoritative details.