
Guide: Building a Proxy for Free Claude Code API Calls (NVIDIA NIM, OpenRouter, and More)

Summary

This project is a lightweight proxy that routes the traffic Claude Code sends to the Anthropic API to various backend providers such as NVIDIA NIM, OpenRouter, DeepSeek, and LM Studio. Through the 40 req/min free quota (on NVIDIA NIM) and support for local models, the proxy lets you use Claude Code's features without any API cost. It can be applied to the existing Claude Code CLI or VSCode extension without modification, using only environment variable settings.

Key Points

  • With NVIDIA NIM, the API can be used for free at up to 40 requests per minute.
  • Five or more backend providers are supported, including OpenRouter, DeepSeek, and LM Studio.
  • It applies to the Claude Code CLI and VSCode extension with nothing but environment variable settings, with no code changes (a drop-in replacement).
  • Different API providers can be freely mixed per model (Opus/Sonnet/Haiku).

This project provides a lightweight proxy that routes Claude Code's Anthropic API calls to various providers, including NVIDIA NIM (40 req/min free), OpenRouter (hundreds of models), DeepSeek (direct API), LM Studio (fully local), or llama.cpp (local with Anthropic endpoints).

✨ Features

  • Zero Cost: 40 req/min free on NVIDIA NIM. Free models on OpenRouter. Fully local with LM Studio.
  • Drop-in Replacement: Set 2 env vars. No modifications to Claude Code CLI or VSCode extension needed.
  • 5 Providers: NVIDIA NIM, OpenRouter, DeepSeek, LM Studio (local), llama.cpp (llama-server).
  • Per-Model Mapping: Route Opus / Sonnet / Haiku to different models and providers. Mix providers freely.
  • Thinking Token Support: Parses `<think>` tags and `reasoning_content` into native Claude thinking blocks (a minimal sketch follows this list).
  • Heuristic Tool Parser: Models outputting tool calls as text are auto-parsed into structured tool use.
  • Request Optimization: 5 categories of trivial API calls intercepted locally, saving quota and latency.
  • Smart Rate Limiting: Proactive rolling-window throttle + reactive 429 exponential backoff + optional concurrency cap.
  • Discord / Telegram Bot: Remote autonomous coding with tree-based threading, session persistence, and live progress.
  • Subagent Control: Task tool interception forces run_in_background=False. No runaway subagents.
  • Extensible: Clean BaseProvider and MessagingPlatform ABCs. Add new providers or platforms easily.
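
As a rough illustration of the thinking-token feature, here is a minimal sketch of how `<think>` tags in raw model output can be converted into Claude-style thinking and text blocks. The names and logic are illustrative only, not the project's actual parser:

```python
# Illustrative sketch only: split a model's raw output containing <think>
# tags into Claude-style "thinking" and "text" content blocks.
import re

THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_thinking(text: str) -> list[dict]:
    blocks, pos = [], 0
    for m in THINK_RE.finditer(text):
        if m.start() > pos:  # plain text before the <think> tag
            blocks.append({"type": "text", "text": text[pos:m.start()]})
        blocks.append({"type": "thinking", "thinking": m.group(1)})
        pos = m.end()
    if pos < len(text):  # trailing plain text after the last tag
        blocks.append({"type": "text", "text": text[pos:]})
    return blocks

print(split_thinking("<think>plan the diff</think>Here is the patch."))
# [{'type': 'thinking', 'thinking': 'plan the diff'},
#  {'type': 'text', 'text': 'Here is the patch.'}]
```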

🔑 Getting Started

API Keys & Local Setup

  • Get an API key (or use LM Studio / llama.cpp locally):
    • NVIDIA NIM: build.nvidia.com/settings/api-keys
    • OpenRouter: openrouter.ai/keys
    • DeepSeek: platform.deepseek.com/api_keys
    • LM Studio: No API key needed. Run locally with LM Studio.
    • llama.cpp: No API key needed. Run llama-server locally.

🚀 Installation

1. **Install Claude Code** (if you haven't already):

   ```bash
   npm install -g @anthropic-ai/claude-code
   ```

2. **Install uv** (used to run the proxy server):

   ```bash
   pip install uv
   ```

   *(If `uv` is already installed, run `uv self update` to get the latest version.)*

3. **Clone Repository & Setup:**

   ```bash
   git clone https://github.com/Alishahryar1/free-claude-code.git
   cd free-claude-code
   cp .env.example .env
   ```

4. **Choose your provider and edit `.env`:**

NVIDIA NIM (40 req/min free, recommended)

```bash
NVIDIA_NIM_API_KEY="nvapi-your-key-here"
MODEL_OPUS="nvidia_nim/z-ai/glm4.7"
MODEL_SONNET="nvidia_nim/moonshotai/kimi-k2-thinking"
MODEL_HAIKU="nvidia_nim/stepfun-ai/step-3.5-flash"
MODEL="nvidia_nim/z-ai/glm4.7" # fallback
# Global switch for provider reasoning requests and Claude thinking blocks.
ENABLE_THINKING=true
```

OpenRouter (hundreds of models)

```bash
OPENROUTER_API_KEY="sk-or-your-key-here"
MODEL_OPUS="open_router/deepseek/deepseek-r1-0528:free"
MODEL_SONNET="open_router/openai/gpt-oss-120b:free"
MODEL_HAIKU="open_router/stepfun/step-3.5-flash:free"
MODEL="open_router/stepfun/step-3.5-flash:free" # fallback
```

DeepSeek (direct API)

```bash
DEEPSEEK_API_KEY="your-deepseek-key-here"
MODEL_OPUS="deepseek/deepseek-reasoner"
MODEL_SONNET="deepseek/deepseek-chat"
MODEL_HAIKU="deepseek/deepseek-chat"
MODEL="deepseek/deepseek-chat" # fallback
```

LM Studio (fully local, no API key)

MODEL_OPUS="lmstudio/unsloth/MiniMax-M2.5-GGUF"
MODEL_SONNET="lmstudio/unsloth/Qwen3.5-35B-A3B-GGUF"
MODEL_HAIKU="lmstudio/unsloth/GLM-4.7-Flash-GGUF"
MODEL="lmstudio/unsloth/GLM-4.7-Flash-GGUF" # fallback

llama.cpp (fully local, no API key)

```bash
LLAMACPP_BASE_URL="http://localhost:8080/v1"
MODEL_OPUS="llamacpp/local-model"
MODEL_SONNET="llamacpp/local-model"
MODEL_HAIKU="llamacpp/local-model"
MODEL="llamacpp/local-model" # fallback
```

Mix providers:
Each `MODEL_*` variable can use a different provider. `MODEL` is the fallback for unrecognized Claude models.

```bash
NVIDIA_NIM_API_KEY="nvapi-your-key-here"
OPENROUTER_API_KEY="sk-or-your-key-here"
MODEL_OPUS="nvidia_nim/moonshotai/kimi-k2.5"
MODEL_SONNET="open_router/deepseek/deepseek-r1-0528:free"
MODEL_HAIKU="lmstudio/unsloth/GLM-4.7-Flash-GGUF"
MODEL="nvidia_nim/z-ai/glm4.7" # fallback
```

🔄 Migration & Authentication

  • Migration: `NIM_ENABLE_THINKING` was removed in this release. Rename it to `ENABLE_THINKING`.
  • Optional Authentication (restrict access to your proxy): Set `ANTHROPIC_AUTH_TOKEN` in .env to require clients to authenticate:

    ```bash
    ANTHROPIC_AUTH_TOKEN="your-secret-token-here"
    ```

    How it works:
    • If `ANTHROPIC_AUTH_TOKEN` is empty (default), no authentication is required (backward compatible).
    • If set, clients must provide the same token via the `ANTHROPIC_AUTH_TOKEN` header.
    • The claude-pick script automatically reads the token from .env if configured.

Example usage:

```bash
# With authentication
ANTHROPIC_AUTH_TOKEN="your-secret-token-here" \
ANTHROPIC_BASE_URL="http://localhost:8082" claude

# claude-pick automatically uses the configured token
claude-pick
```
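
If you want to confirm the token is actually enforced without involving Claude Code, you can send a raw request to the proxy. The sketch below assumes the proxy exposes Anthropic's `/v1/messages` route (the API surface it emulates) and uses the `ANTHROPIC_AUTH_TOKEN` header described above; adjust if the proxy's routes differ.

```python
# Illustrative auth check (endpoint path assumed to mirror Anthropic's
# /v1/messages, which the proxy emulates). Start the proxy first.
import json
import urllib.request

req = urllib.request.Request(
    "http://localhost:8082/v1/messages",
    data=json.dumps({
        "model": "claude-sonnet",  # any Claude model name; MODEL is the fallback mapping
        "max_tokens": 32,
        "messages": [{"role": "user", "content": "ping"}],
    }).encode(),
    headers={
        "content-type": "application/json",
        "ANTHROPIC_AUTH_TOKEN": "your-secret-token-here",
    },
)
with urllib.request.urlopen(req) as resp:
    print(resp.status)  # omit the token header and the proxy should reject the call
```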

Use this feature if:

  • Running the proxy on a public network.
  • Sharing the server with others but restricting access.
  • Wanting an additional layer of security.

💻 Usage Guide

Terminal 1: Start the proxy server:

```bash
uv run uvicorn server:app --host 0.0.0.0 --port 8082
```

Terminal 2: Run Claude Code:
Point `ANTHROPIC_BASE_URL` at the proxy root URL, not `http://localhost:8082/v1`.

```powershell
# Windows (PowerShell)
$env:ANTHROPIC_AUTH_TOKEN="freecc"; $env:ANTHROPIC_BASE_URL="http://localhost:8082"; claude
```

```bash
# Linux/macOS
export ANTHROPIC_AUTH_TOKEN="freecc"
export ANTHROPIC_BASE_URL="http://localhost:8082"
claude
```

That's it! Claude Code now uses your configured provider for free.

🖥️ IDE Setup

VSCode Extension Setup:

  1. Start the proxy server (same as above).
  2. Open Settings (Ctrl + ,) and search for claude-code.environmentVariables.
  3. Click Edit in settings.json and add:
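
The exact schema of `claude-code.environmentVariables` is not shown in this excerpt; the snippet below is a plausible example assuming name/value pairs, so verify it against the extension's own settings description:

```json
// In settings.json (schema assumed; adjust to the extension's documented format)
"claude-code.environmentVariables": [
  { "name": "ANTHROPIC_BASE_URL", "value": "http://localhost:8082" },
  { "name": "ANTHROPIC_AUTH_TOKEN", "value": "your-secret-token-here" }
]
```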

0