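If you read my previous post on the Skill Vault pattern, you know it cut per-session overhead by 96%. After a few weeks of using it, I went looking for what was still consuming tokens and found three more sinks worth removing. This is the follow-up: each win is small on its own, but together they save roughly another 5,000 tokens on every session with no loss of capability.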

Where the Tokens Were Still Hiding
After the vault, my baseline was about 51K tokens per session. I dug into the system prompt to see what was left:

Skill list (still): ~3K tokens
Project CLAUDE.md: ~2K tokens
claude-mem auto-injected timeline: ~2K tokens
Plugin hook reminders: ~1K tokens (partly repeated on every turn)

Three of these were unnecessary or heavier than they needed to be. Here is how I trimmed each one.

Sink #1: Bloated Root CLAUDE.md (~1.5K tokens)
The CLAUDE.md file is auto-loaded into every session that touches the repository. Mine had grown to 326 lines / 8 KB, with quickstart commands for every component:

Flutter dev commands
Backend npm scripts
React build steps
ML service Python setup
A duplicate gstack skill listing

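The problem: I was loading every component's instructions on every turn while actually working in only one of them.
Fix: Hierarchical CLAUDE.md Files
Claude Code loads CLAUDE.md hierarchically, pulling in only the files in the current working tree. So instead of one fat root file, I split it up: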

khetisahayak/
├── CLAUDE.md # 55 lines — overview, ports, creds only
├── kheti_sahayak_app/CLAUDE.md # Flutter details
├── frontend/CLAUDE.md # React details
├── ml/CLAUDE.md # ML service details
└── kheti_sahayak_backend/CLAUDE.md # already existed

The root now contains only:

Project overview (5 lines)
Service ports table
Test credentials
Cross-cutting auth + DB notes
Troubleshooting
Component-specific details load only when I'm actually working in that subdirectory.
Result: root CLAUDE.md went from 326 → 55 lines (8 KB → 1.6 KB). About 1.5K tokens saved per session.
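
To check numbers like these on your own repository, a rough size audit is usually enough. The sketch below assumes about 4 bytes per token, which is only a rule of thumb for English text and not Claude's actual tokenizer, so treat the output as an order-of-magnitude estimate. The helper name `estimate_tokens` is hypothetical.

```shell
#!/usr/bin/env bash
# Rough token estimate for every CLAUDE.md under a directory.
# ASSUMPTION: ~4 bytes per token (a common heuristic, not the real tokenizer).
estimate_tokens() {
  local root="${1:-.}" f bytes
  find "$root" -name CLAUDE.md -type f | sort | while IFS= read -r f; do
    bytes=$(wc -c < "$f")
    # $((...)) also strips the padding some wc implementations add
    printf '%6d bytes  ~%5d tokens  %s\n' "$((bytes))" "$((bytes / 4))" "$f"
  done
}
```

Run it as `estimate_tokens path/to/repo`. Under this heuristic an 8 KB root file comes out near 2K tokens, which lines up with the figure in the breakdown above.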

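Sink #2: Plugin SessionStart Hooks Injecting "Helpful" Context
I use claude-mem for persistent memory across sessions. It's genuinely useful. But it ships a SessionStart hook that auto-injects a timeline of recent observations at the top of every conversation: about 50 entries, ~2K tokens: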

S499 Indeed Auto-Apply — User asked how to automate job applications
S498 Indeed MCP Integration Query — clarifying JobSpy vs Apify
2337 12:33p ✅ systemd/install.sh — Backend Services Enabled ...
... 47 more lines

I rarely needed it.


When I want past context, I call mem-search explicitly.
Fix: Disable the Auto-Inject, Keep the Memory
The hook config lives at:
~/.claude/plugins/cache/thedotmack/claude-mem/<version>/hooks/hooks.json
The SessionStart array has three hooks. The third is the timeline injection:
{ "type": "command", "command": "... node bun-runner.js worker-service.cjs hook claude-code context ..." }
I removed just that one entry. The other two SessionStart hooks (install + worker-start) and the recording hooks (PostToolUse, Stop, SessionEnd) stay intact, so memory is still being captured. The MCP search server still works; I just have to ask for it.

# Always back up before editing plugin internals
cp hooks.json hooks.json.bak
# Remove the 3rd SessionStart hook (jq or manual edit)
Result: ~2K tokens saved per session. Memory still works on demand.
⚠️ Caveat: edits to files under ~/.claude/plugins/cache/ are overwritten when the plugin upgrades. For durability, mirror the change in your user-level ~/.claude/settings.json hooks block, or add a small post-upgrade re-patch script.
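
The re-patch script mentioned in the caveat could look roughly like the sketch below. It assumes `jq` is installed, and it finds the hook by the `hook claude-code context` substring from the command shown earlier instead of by position, since the position may shift between plugin versions. The function name `strip_context_hook` is hypothetical, and the exact layout of hooks.json may differ in your version, so diff the result against the backup before trusting it.

```shell
#!/usr/bin/env bash
# Remove the timeline-injection hook from a claude-mem hooks.json.
# ASSUMPTIONS: jq is available; the hook is a JSON object with
# "type": "command" whose "command" contains "hook claude-code context".
strip_context_hook() {
  local file="$1" tmp
  [ -f "$file" ] || { echo "not found: $file" >&2; return 1; }
  cp "$file" "$file.bak"   # always back up before editing plugin internals
  tmp=$(mktemp) || return 1
  # Delete every matching hook object wherever it sits in the document.
  if jq 'del(.. | objects
             | select(.type == "command"
                      and ((.command // "") | tostring
                           | contains("hook claude-code context"))))' \
        "$file" > "$tmp"; then
    mv "$tmp" "$file"
  else
    rm -f "$tmp"
    return 1
  fi
}
```

Point it at the hooks.json path above after each plugin upgrade. Only the injection entry is removed, so the install, worker-start, and recording hooks are left alone.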

Sink #3: Round 2 of the Skill Vault (~1.4K tokens)
The original vault was a one-time bulk move. After several weeks of actual usage, I could see which skills I'd installed and never touched. The vault was overdue for a second pass.
Audit script (same as the first post, run again):
for f in ~/.claude/skills/*/SKILL.md; do
  dir=$(dirname "$f")
  name=$(basename "$dir")
  size=$(wc -c < "$f")
  echo "$size $name"
done | sort -rn | head -30
I ended up vaulting 27 more skills out of 73 actively loaded:
8 marketing skills (mkt-content, mkt-seo, mkt-social, etc.): I do real marketing in a separate context
6 research skills (research, research-deep, research-report, etc.): episodic, not daily
5 niche tools (obsidian-vault, make-pdf, pair-agent, setup-browser-cookies, open-gstack-browser)
4 design-heavy (design-consultation, design-html, design-shotgun, devex-review): restored only when designing
3 plan reviews (plan-design-review, plan-devex-review, plan-tune)
1 interview prep (staff-engineer-interview): used a few times last quarter, not weekly
Bulk move:
for s in mkt-content mkt-email mkt-growth mkt-pr mkt-review mkt-seo mkt-social cmo \
  research research-add-fields research-add-items research-deep research-report edit-article \
  staff-engineer-interview \
  obsidian-vault make-pdf pair-agent setup-browser-cookies open-gstack-browser \
  plan-design-review plan-devex-review plan-tune \
  design-consultation design-html design-shotgun devex-review; do
  mv ~/.claude/skills/"$s" ~/.claude/skills-vault/ 2>/dev/null
done

The skill-vault index skill from the original post still bridges everything — Claude knows where each skill lives and restores it on demand.
Result: 73 → 46 active skills. ~1.4K tokens saved.
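
Restoring a skill is just the reverse move. Here is a minimal sketch, assuming the same ~/.claude/skills and ~/.claude/skills-vault layout used in the bulk move. The `restore_skill` name and the `SKILLS_DIR` / `VAULT_DIR` override variables are hypothetical, and the skill-vault index skill from the original post may handle this differently.

```shell
#!/usr/bin/env bash
# Move one skill from the vault back into the active skills directory.
# ASSUMPTION: both directories follow the layout used in the bulk move.
SKILLS_DIR="${SKILLS_DIR:-$HOME/.claude/skills}"
VAULT_DIR="${VAULT_DIR:-$HOME/.claude/skills-vault}"

restore_skill() {
  local name="$1"
  if [ -z "$name" ]; then
    echo "usage: restore_skill <name>" >&2
    return 2
  fi
  if [ -d "$SKILLS_DIR/$name" ]; then
    echo "$name is already active"
    return 0
  fi
  if [ ! -d "$VAULT_DIR/$name" ]; then
    echo "$name is not in the vault" >&2
    return 1
  fi
  mkdir -p "$SKILLS_DIR" && mv "$VAULT_DIR/$name" "$SKILLS_DIR/" \
    && echo "restored $name"
}
```

For example, `restore_skill design-html` before a design session, then vault it again afterwards with the same `mv` used in the bulk move.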

The Combined Result

Fix | Tokens Saved (per session)
Sink #1: CLAUDE.md trim + per-component split | ~1.5K
Sink #2: claude-mem timeline disabled | ~2K
Sink #3: Round 2 skill vault | ~1.4K
Total | ~4.9K

On top of the original 96% reduction from the Skill Vault, this is another solid bite. But honestly, the dollar value isn't the point.

Why This Matters Beyond Cost
Every token in your context is a token Claude has to attend over before generating its response. The bigger your prompt, the more diluted attention becomes on the actual task. Big context ≠ better answers. Frequently it's the opposite. Targeted context wins.

The pattern across all three fixes is the same:
Audit what's auto-loaded vs. what's actually useful. You'll be surprised.
Move episodic content out of the always-on path and into on-demand access.
Trust the model to pull what it needs. When it does need the vaulted skill, the subdir's CLAUDE.md, or the memory search, it will reach for it.

Less ambient noise. Sharper signal.

Tips for Your Own Audit
Inspect, don't guess. wc -c your CLAUDE.md files. ls ~/.claude/skills/. Read your plugin hooks.json files line by line.
Hierarchy is free. Per-directory CLAUDE.md files cost nothing when you're not in that directory.
Plugin hooks are debt. Every SessionStart or UserPromptSubmit hook is a tax. Audit them by hand. Some are essential (auth, telemetry); some inject "context" that's just clutter.
Re-vault every few weeks. Usage patterns shift. Skills that were daily three months ago may be quarterly now. Yesterday's must-have is tomorrow's vault candidate.
Watch for per-turn taxes. A UserPromptSubmit hook costs N tokens every single turn. Even a small reminder block adds up fast in a long session.
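
The per-turn tax is easy to underestimate, so a quick worked example may help. The numbers are made up for illustration (a 200-token reminder over a 60-turn session), and `hook_cost` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Total tokens injected by a hook that fires on every turn.
# Usage: hook_cost <tokens_per_turn> <turns>
hook_cost() {
  echo $(( $1 * $2 ))
}

hook_cost 200 60   # prints 12000
```

In this example a reminder one tenth the size of the claude-mem timeline ends up injecting six times as many tokens over the session, because the timeline is injected once at session start while the reminder is injected on every turn.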

Conclusion
The Skill Vault pattern is still the heaviest hitter.
These three follow-ups are the long tail:
CLAUDE.md → split per component, slim the root.
Plugin hooks → audit auto-injected context. Disable what you don't need.
Skill vault → revisit it. Vault more.
Together: another ~5K tokens per session, zero capability lost.
When your context is tight, your model is sharp.
I'm Prakash Ponali, a Staff Engineer with 16+ years in enterprise eCommerce. I'm currently building Khetisahayak, an agricultural assistance app for Telugu-speaking farmers in Andhra Pradesh and Telangana. Find me on LinkedIn.

After the Skill Vault: 3 More Hidden Token Sinks in Claude Code
