Guide: A Free Proxy for Claude Code API Calls (NVIDIA NIM, OpenRouter, and more)
Summary
This project is a lightweight proxy that routes the traffic Claude Code sends to the Anthropic API to alternative backends such as NVIDIA NIM, OpenRouter, DeepSeek, and LM Studio. Using NVIDIA NIM's free quota of 40 requests per minute, or fully local models, it lets you use Claude Code's features without API costs. It applies to the existing Claude Code CLI and VSCode extension through environment variable settings alone, with no modifications required.
Key Points
- Use the API for free at up to 40 requests per minute via NVIDIA NIM.
- Supports five or more backend providers, including OpenRouter, DeepSeek, and LM Studio.
- Drop-in replacement: applies to the Claude Code CLI and VSCode extension through environment variables alone, with no modifications.
- Freely mix providers per model: Opus, Sonnet, and Haiku can each route to a different API provider.
This project provides a lightweight proxy that routes Claude Code's Anthropic API calls to various providers, including NVIDIA NIM (40 req/min free), OpenRouter (hundreds of models), DeepSeek (direct API), LM Studio (fully local), or llama.cpp (local with Anthropic endpoints).
✨ Features
- Zero Cost: 40 req/min free on NVIDIA NIM. Free models on OpenRouter. Fully local with LM Studio.
- Drop-in Replacement: Set 2 env vars. No modifications to Claude Code CLI or VSCode extension needed.
- 5 Providers: NVIDIA NIM, OpenRouter, DeepSeek, LM Studio (local), llama.cpp (llama-server).
- Per-Model Mapping: Route Opus / Sonnet / Haiku to different models and providers. Mix providers freely.
- Thinking Token Support: Parses `<think>` tags and `reasoning_content` into native Claude thinking blocks (see the sketch after this list).
- Heuristic Tool Parser: Models outputting tool calls as text are auto-parsed into structured tool use.
- Request Optimization: 5 categories of trivial API calls intercepted locally, saving quota and latency.
- Smart Rate Limiting: Proactive rolling-window throttle + reactive 429 exponential backoff + optional concurrency cap.
- Discord / Telegram Bot: Remote autonomous coding with tree-based threading, session persistence, and live progress.
- Subagent Control: Task tool interception forces `run_in_background=False`. No runaway subagents.
- Extensible: Clean `BaseProvider` and `MessagingPlatform` ABCs. Add new providers or platforms easily.
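As a rough illustration of the thinking-token parsing step (an assumption about the approach, not the project's actual code), the sketch below splits `<think>` spans out of a provider's raw text into Claude-style thinking and text blocks. The function name and block schema are illustrative:

```python
import re

def split_thinking_blocks(text: str) -> list[dict]:
    """Illustrative sketch: split '<think>...</think>' spans out of raw
    provider output into Claude-style content blocks. The exact block
    schema used by the proxy may differ."""
    blocks = []
    pos = 0
    for match in re.finditer(r"<think>(.*?)</think>", text, flags=re.DOTALL):
        # Text before the <think> span becomes a normal text block.
        before = text[pos:match.start()].strip()
        if before:
            blocks.append({"type": "text", "text": before})
        thinking = match.group(1).strip()
        if thinking:
            blocks.append({"type": "thinking", "thinking": thinking})
        pos = match.end()
    tail = text[pos:].strip()
    if tail:
        blocks.append({"type": "text", "text": tail})
    return blocks

# split_thinking_blocks("<think>plan steps</think>Here is the answer.")
# -> [{'type': 'thinking', 'thinking': 'plan steps'},
#     {'type': 'text', 'text': 'Here is the answer.'}]
```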
🔑 Getting Started
API Keys & Local Setup
- Get an API key (or use LM Studio / llama.cpp locally):
- NVIDIA NIM: build.nvidia.com/settings/api-keys
- OpenRouter: openrouter.ai/keys
- DeepSeek: platform.deepseek.com/api_keys
- LM Studio: No API key needed. Run locally with LM Studio.
- llama.cpp: No API key needed. Run `llama-server` locally.
🚀 Installation
1. **Install Claude Code and `uv`:** install Claude Code per Anthropic's docs; the proxy itself is run with `uv`:
```bash
pip install uv
```
*(If `uv` is already installed, run `uv self update` to get the latest version.)*
2. **Clone Repository & Setup:**
```bash
git clone https://github.com/Alishahryar1/free-claude-code.git
cd free-claude-code
cp .env.example .env
```
3. **Choose your provider and edit `.env`:**
**NVIDIA NIM (40 req/min free, recommended)**
```bash
NVIDIA_NIM_API_KEY="nvapi-your-key-here"
MODEL_OPUS="nvidia_nim/z-ai/glm4.7"
MODEL_SONNET="nvidia_nim/moonshotai/kimi-k2-thinking"
MODEL_HAIKU="nvidia_nim/stepfun-ai/step-3.5-flash"
MODEL="nvidia_nim/z-ai/glm4.7" # fallback

# Global switch for provider reasoning requests and Claude thinking blocks.
ENABLE_THINKING=true
```
**OpenRouter (hundreds of models)**
```bash
OPENROUTER_API_KEY="sk-or-your-key-here"
MODEL_OPUS="open_router/deepseek/deepseek-r1-0528:free"
MODEL_SONNET="open_router/openai/gpt-oss-120b:free"
MODEL_HAIKU="open_router/stepfun/step-3.5-flash:free"
MODEL="open_router/stepfun/step-3.5-flash:free" # fallback
```
**DeepSeek (direct API)**
```bash
DEEPSEEK_API_KEY="your-deepseek-key-here"
MODEL_OPUS="deepseek/deepseek-reasoner"
MODEL_SONNET="deepseek/deepseek-chat"
MODEL_HAIKU="deepseek/deepseek-chat"
MODEL="deepseek/deepseek-chat" # fallback
```
**LM Studio (fully local, no API key)**
```bash
MODEL_OPUS="lmstudio/unsloth/MiniMax-M2.5-GGUF"
MODEL_SONNET="lmstudio/unsloth/Qwen3.5-35B-A3B-GGUF"
MODEL_HAIKU="lmstudio/unsloth/GLM-4.7-Flash-GGUF"
MODEL="lmstudio/unsloth/GLM-4.7-Flash-GGUF" # fallback
```
**llama.cpp (fully local, no API key)**
```bash
LLAMACPP_BASE_URL="http://localhost:8080/v1"
MODEL_OPUS="llamacpp/local-model"
MODEL_SONNET="llamacpp/local-model"
MODEL_HAIKU="llamacpp/local-model"
MODEL="llamacpp/local-model" # fallback
```
**Mix providers:**
Each `MODEL_*` variable can use a different provider; `MODEL` is the fallback for unrecognized Claude models (a routing sketch follows this example).
```bash
NVIDIA_NIM_API_KEY="nvapi-your-key-here"
OPENROUTER_API_KEY="sk-or-your-key-here"
MODEL_OPUS="nvidia_nim/moonshotai/kimi-k2.5"
MODEL_SONNET="open_router/deepseek/deepseek-r1-0528:free"
MODEL_HAIKU="lmstudio/unsloth/GLM-4.7-Flash-GGUF"
MODEL="nvidia_nim/z-ai/glm4.7" # fallback
```
🔄 Migration & Authentication
- **Migration:** `NIM_ENABLE_THINKING` was removed in this release. Rename it to `ENABLE_THINKING`.
- **Optional Authentication** (restrict access to your proxy): Set `ANTHROPIC_AUTH_TOKEN` in `.env` to require clients to authenticate:
```bash
ANTHROPIC_AUTH_TOKEN="your-secret-token-here"
```
How it works (a minimal check is sketched below):
- If `ANTHROPIC_AUTH_TOKEN` is empty (default), no authentication is required (backward compatible).
- If set, clients must provide the same token via the `ANTHROPIC_AUTH_TOKEN` header.
- The `claude-pick` script automatically reads the token from `.env` if configured.
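For intuition, here is such a header check written as a FastAPI dependency. This is an illustrative assumption of how the check could work, not the proxy's actual code; the `/v1/messages` route path and handler are placeholders:

```python
import os

from fastapi import Depends, FastAPI, Header, HTTPException

# Illustrative sketch of the optional auth check described above.
EXPECTED_TOKEN = os.environ.get("ANTHROPIC_AUTH_TOKEN", "")

def check_auth(
    token: str | None = Header(default=None, alias="ANTHROPIC_AUTH_TOKEN"),
) -> None:
    # Empty token (the default) disables auth entirely -- backward compatible.
    if not EXPECTED_TOKEN:
        return
    # Otherwise the client must send the identical token in the header.
    if token != EXPECTED_TOKEN:
        raise HTTPException(status_code=401, detail="Invalid or missing token")

app = FastAPI()

@app.post("/v1/messages", dependencies=[Depends(check_auth)])
async def messages() -> dict:
    return {"ok": True}  # placeholder; the real proxy forwards to a provider
```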
**Example usage:**
```bash
# With authentication
ANTHROPIC_AUTH_TOKEN="your-secret-token-here" \
ANTHROPIC_BASE_URL="http://localhost:8082" claude

# claude-pick automatically uses the configured token
claude-pick
```
Use this feature if you:
- Run the proxy on a public network.
- Share the server with others but want to restrict access.
- Want an additional layer of security.
💻 Usage Guide
**Terminal 1: Start the proxy server:**
```bash
uv run uvicorn server:app --host 0.0.0.0 --port 8082
```
**Terminal 2: Run Claude Code:**
Point `ANTHROPIC_BASE_URL` at the proxy root URL, not `http://localhost:8082/v1`.
```bash
# Windows (PowerShell)
$env:ANTHROPIC_AUTH_TOKEN="freecc"; $env:ANTHROPIC_BASE_URL="http://localhost:8082"; claude

# OR (Linux/macOS)
export ANTHROPIC_AUTH_TOKEN="freecc"
export ANTHROPIC_BASE_URL="http://localhost:8082"
claude
```
That's it! Claude Code now uses your configured provider for free.
🖥️ IDE Setup
VSCode Extension Setup:
- Start the proxy server (same as above).
- Open Settings (`Ctrl + ,`) and search for `claude-code.environmentVariables`.
- Click Edit in `settings.json` and add:
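The original snippet is truncated at this point. The shape below is an assumption based on the environment variables used earlier in this guide (`ANTHROPIC_BASE_URL`, `ANTHROPIC_AUTH_TOKEN`); check the extension's setting schema for the exact format:

```json
{
  "claude-code.environmentVariables": [
    { "name": "ANTHROPIC_BASE_URL", "value": "http://localhost:8082" },
    { "name": "ANTHROPIC_AUTH_TOKEN", "value": "freecc" }
  ]
}
```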
AI-Generated Content
This content was automatically summarized, translated, and analyzed by AI from the original post on GitHub Trending Python (daily). Copyright remains with the original author; please consult the original for authoritative details.