GH Trending · Release · 2026-05-09 19:00

duoan/TorchCode

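Summary

TorchCode is a structured coding-practice environment for implementing the core building blocks of deep learning frameworks such as PyTorch (for example Softmax, LayerNorm, and linear layers) from scratch. It focuses on building the ability to write code from memory, which top ML companies such as Meta and Google DeepMind expect in interviews. It can be used right away, with no cloud account or signup, in accessible environments such as Hugging Face Spaces or Google Colab, and its problem list, categorized by difficulty and frequency, offers a structured learning path. This helps users verify and improve practical coding ability, not just theoretical knowledge.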

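Key Points

Provides an environment for focused training in writing code from memory (whiteboard coding), an essential skill for ML engineers.
Focuses on implementing core PyTorch components such as ReLU, Softmax, LayerNorm, and linear layers by hand, without using `torch.nn`.
Very low barrier to entry: available immediately on Hugging Face Spaces or Google Colab with no installation or signup.
Each problem lists its difficulty, interview frequency (🔥), and key concepts, which makes it easy to plan your study.
Docker and Podman support lets you run it reliably on your own machine as well.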

Crack the PyTorch interview.

Implement operators and architectures from scratch: this is exactly the skill top ML teams test for.

Like LeetCode, but for tensors. Self-hosted. Jupyter-based. Instant feedback.

Top companies such as Meta, Google DeepMind, and OpenAI expect ML engineers to write code from memory on a whiteboard. Reading papers is not enough: you need to be able to write softmax, LayerNorm, MultiHeadAttention, and a full Transformer block.

TorchCode gives you a structured practice environment.

No cloud. No signup. No GPU needed. Just `make run`, or try it instantly on Hugging Face.

Launch on Hugging Face Spaces: this opens a full JupyterLab environment in your browser. Nothing to install.

Or open any notebook directly in Google Colab: every notebook has a badge.

In Google Colab, install the judge from PyPI so you can run check(...) without cloning the repo:

!pip install torch-judge

Then, in a notebook cell:

from torch_judge import check, status, hint, reset_progress
status() # list all problems and your progress
check("relu") # run the tests for the "relu" task
...

docker run -p 8888:8888 -e PORT=8888 ghcr.io/duoan/torchcode:latest

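If the registry image is not available for your platform, use Option 2. This is the usual path on Apple Silicon / arm64.

`make run` tries the prebuilt image first and automatically falls back to a local build when needed.

Open http://localhost:8888 and that's it. Both Docker and Podman are supported (auto-detected).

Frequency: 🔥 = very likely in interviews, ⭐ = commonly asked, 💡 = emerging / differentiator

The bread and butter of ML coding interviews. You will be asked to write these without torch.nn.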

| # | Problem | What You'll Implement | Freq | Key Concepts |
|---|---|---|---|---|
| 1 | ReLU | `relu(x)` | 🔥 | Activation functions, element-wise ops |
| 2 | Softmax | `my_softmax(x, dim)` | 🔥 | Numerical stability, exp/log tricks |
| 16 | Cross-Entropy Loss | `cross_entropy_loss(logits, targets)` | 🔥 | Log-softmax, logsumexp trick |
| 17 | Dropout | `MyDropout` (nn.Module) | 🔥 | Train/eval mode, inverted scaling |
| 18 | Embedding | `MyEmbedding` (nn.Module) | 🔥 | Lookup table, `weight[indices]` |
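
The "numerical stability" note for problem 2 can be illustrated with a short sketch. This is one possible approach, not the repository's reference solution: the name `my_softmax` comes from the table, and subtracting the per-slice maximum before exponentiating is a standard trick assumed here.

```python
import torch

def my_softmax(x: torch.Tensor, dim: int = -1) -> torch.Tensor:
    # Subtract the max along `dim` so exp() never sees large positive values.
    # The shift cancels out in the ratio, so the result is unchanged.
    shifted = x - x.max(dim=dim, keepdim=True).values
    exp = torch.exp(shifted)
    return exp / exp.sum(dim=dim, keepdim=True)

# Large logits would overflow a naive exp(x) / sum(exp(x)).
x = torch.tensor([[1000.0, 1001.0, 1002.0], [0.0, 0.0, 0.0]])
out = my_softmax(x, dim=-1)
```

Comparing `out` against `torch.softmax(x, dim=-1)` is a quick way to sanity-check an implementation like this.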

If you are interviewing for an LLM or Transformer role, expect at least one of these to come up.

Attention & Transformer Components

| # | Problem | What You'll Implement | Freq | Key Concepts |
|---|---|---|---|---|
| 23 | Cross-Attention | `MultiHeadCrossAttention` (nn.Module) | ⭐ | Encoder-decoder, Q from decoder, K/V from encoder |
| 5 | Scaled Dot-Product Attention | `scaled_dot_product_attention(Q, K, V)` | 🔥 | `softmax(QK^T/√d_k)V`, the foundation of everything |
| 6 | Multi-Head Attention | `MultiHeadAttention` (nn.Module) | 🔥 | Parallel heads, split/concat, projection matrices |
| 9 | Causal Self-Attention | `causal_attention(Q, K, V)` | 🔥 | Autoregressive masking with `-inf`, GPT-style |
| 10 | Grouped Query Attention | `GroupQueryAttention` (nn.Module) | ⭐ | GQA (LLaMA 2), KV sharing across heads |
| 11 | Sliding Window Attention | `sliding_window_attention(Q, K, V, w)` | ⭐ | Mistral-style local attention, O(n·w) complexity |
| 12 | Linear Attention | `linear_attention(Q, K, V)` | 💡 | Kernel trick, `φ(Q)(φ(K)^TV)`, O(n·d²) |
| 14 | KV Cache Attention | `KVCacheAttention` (nn.Module) | 🔥 | Incremental decoding, cache K/V, prefill vs decode |
| 24 | RoPE | `apply_rope(q, k)` | 🔥 | Rotary position embedding, relative position via rotation |
| 25 | Flash Attention | `flash_attention(Q, K, V, block_size)` | 💡 | Tiled attention, online softmax, memory-efficient |
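
As a rough illustration of problems 5 and 9, the formula `softmax(QK^T/√d_k)V` and the `-inf` causal mask can be combined in one function. This is a sketch under assumptions, not the project's solution: the table lists `scaled_dot_product_attention(Q, K, V)`, and the `causal` flag is a hypothetical addition used here to show both variants side by side.

```python
import math
import torch

def scaled_dot_product_attention(Q, K, V, causal=False):
    # Q: (..., n_q, d_k), K: (..., n_k, d_k), V: (..., n_k, d_v)
    d_k = Q.size(-1)
    scores = Q @ K.transpose(-2, -1) / math.sqrt(d_k)
    if causal:  # hypothetical flag, not part of the listed signature
        n_q, n_k = scores.shape[-2:]
        # True above the diagonal marks "future" positions to hide.
        mask = torch.triu(
            torch.ones(n_q, n_k, dtype=torch.bool, device=scores.device),
            diagonal=1,
        )
        scores = scores.masked_fill(mask, float("-inf"))
    weights = torch.softmax(scores, dim=-1)
    return weights @ V

torch.manual_seed(0)
Q = torch.randn(2, 4, 8)
K = torch.randn(2, 4, 8)
V = torch.randn(2, 4, 8)
out = scaled_dot_product_attention(Q, K, V, causal=True)
```

With the causal mask, the first query can only attend to the first key, so the first output row equals the first row of `V`. That property makes a convenient check.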

| # | Problem | What You'll Implement | Freq | Key Concepts |
|---|---|---|---|---|
| 26 | LoRA | `LoRALinear` (nn.Module) | ⭐ | Low-rank adaptation, frozen base + BA update |
| 27 | ViT Patch Embedding | `PatchEmbedding` (nn.Module) | 💡 | Image → patches → linear projection |
| 13 | GPT-2 Block | `GPT2Block` (nn.Module) | ⭐ | Pre-norm, causal MHA + MLP (4x, GELU), residual connections |
| 28 | Mixture of Experts | `MixtureOfExperts` (nn.Module) | ⭐ | Mixtral-style, top-k routing, expert MLPs |

| # | Problem | What You'll Implement | Freq | Key Concepts |
|---|---|---|---|---|
| 29 | Adam Optimizer | `MyAdam` | ⭐ | Momentum + RMSProp, bias correction |
| 30 | Cosine LR Scheduler | `cosine_lr_schedule(step, ...)` | ⭐ | Linear warmup + cosine annealing |
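
The "linear warmup + cosine annealing" idea behind problem 30 can be sketched in plain Python. The source elides the arguments after `step`, so `total_steps`, `warmup_steps`, `max_lr`, and `min_lr` below are hypothetical parameter names chosen for illustration, not the task's actual signature.

```python
import math

def cosine_lr_schedule(step, total_steps, warmup_steps, max_lr, min_lr=0.0):
    # All parameters after `step` are assumptions for this sketch.
    if step < warmup_steps:
        # Linear warmup from 0 up to max_lr.
        return max_lr * step / warmup_steps
    # Fraction of the decay phase completed, clamped to [0, 1].
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    # Cosine annealing from max_lr down to min_lr.
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

lrs = [cosine_lr_schedule(s, total_steps=100, warmup_steps=10, max_lr=1.0) for s in range(101)]
```

In this sketch the rate rises linearly for the first 10 steps, peaks at `max_lr` at step 10, and decays to `min_lr` by step 100.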

Problem List

| # | Problem | What You'll Implement | Freq | Key Concepts |
|---|---|---|---|---|
| 32 | Top-k / Top-p Sampling | `sample_top_k_top_p(logits, ...)` | 🔥 | Nucleus sampling, temperature scaling |
| 33 | Beam Search | `beam_search(log_prob_fn, ...)` | 🔥 | Hypothesis expansion, pruning, eos handling |
| 34 | Speculative Decoding | `speculative_decode(target, draft, ...)` | 💡 | Accept/reject, draft model acceleration |
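
To make the concepts in problem 32 concrete, here is a minimal sketch of temperature scaling followed by top-k and top-p (nucleus) filtering on a 1-D logits vector. The arguments after `logits` are elided in the source, so `temperature`, `top_k`, `top_p`, and `generator` are assumed names, and the filtering order is one common choice, not necessarily the one the judge expects.

```python
import torch

def sample_top_k_top_p(logits, temperature=1.0, top_k=0, top_p=1.0, generator=None):
    # logits: 1-D float tensor of vocabulary scores (assumption for this sketch).
    logits = logits / temperature
    if top_k > 0:
        # Keep only the k highest-scoring tokens.
        kth = torch.topk(logits, min(top_k, logits.numel())).values[-1]
        logits = logits.masked_fill(logits < kth, float("-inf"))
    if top_p < 1.0:
        sorted_logits, sorted_idx = torch.sort(logits, descending=True)
        probs = torch.softmax(sorted_logits, dim=-1)
        cum = torch.cumsum(probs, dim=-1)
        # Drop a token once the mass before it already exceeds top_p;
        # the most likely token is always kept.
        remove = (cum - probs) > top_p
        sorted_logits = sorted_logits.masked_fill(remove, float("-inf"))
        logits = torch.full_like(logits, float("-inf")).scatter(0, sorted_idx, sorted_logits)
    probs = torch.softmax(logits, dim=-1)
    return torch.multinomial(probs, num_samples=1, generator=generator).item()

logits = torch.tensor([5.0, 1.0, 0.0, -1.0])
greedy_like = [sample_top_k_top_p(logits, top_k=1) for _ in range(20)]
nucleus = [sample_top_k_top_p(logits, top_p=0.5) for _ in range(20)]
```

With these logits the first token holds most of the probability mass, so both `top_k=1` and `top_p=0.5` leave it as the only candidate.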

| # | Problem | What You'll Implement | Freq | Key Concepts |
|---|---|---|---|---|
| 35 | BPE Tokenizer | `SimpleBPE` | 💡 | Byte-pair encoding, merge rules, subword splits |
| 36 | INT8 Quantization | `Int8Linear` (nn.Module) | 💡 | Per-channel quantize, scale/zero-point, buffer vs param |
| 37 | DPO Loss | `dpo_loss(chosen, rejected, ...)` | 💡 | Direct preference optimization, alignment training |
| 38 | GRPO Loss | `grpo_loss(logps, rewards, group_ids, eps)` | 💡 | Group relative policy optimization, RLAIF, within-group normalized advantages |
| 39 | PPO Loss | `ppo_loss(new_logps, old_logps, advantages, clip_ratio)` | 💡 | PPO clipped surrogate loss, policy gradient, trust region |
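
The "PPO clipped surrogate loss" named in problem 39 is compact enough to sketch directly. The signature follows the table; the default `clip_ratio=0.2` and the mean reduction are assumptions made here, and the task's tests may expect something different.

```python
import torch

def ppo_loss(new_logps, old_logps, advantages, clip_ratio=0.2):
    # Probability ratio between the new and old policy, computed in log space.
    ratio = torch.exp(new_logps - old_logps)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_ratio, 1.0 + clip_ratio) * advantages
    # Take the pessimistic (smaller) objective and negate it to get a loss.
    return -torch.min(unclipped, clipped).mean()

old = torch.zeros(3)
adv = torch.ones(3)
same_policy = ppo_loss(old, old, adv)                       # ratio = 1
big_step = ppo_loss(old + torch.log(torch.tensor(2.0)), old, adv)  # ratio = 2, clipped to 1.2
```

When the ratio is 1 the loss reduces to the negated mean advantage. When the policy moves too far in the direction the advantage favors, the clipped term caps the objective, which is the trust-region behavior the table refers to.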

Each problem comes with two notebooks:

| File | Purpose |
|---|---|
| `01_relu.ipynb` | ✏️ Blank template: write your code here |
| `01_relu_solution.ipynb` | 📖 Reference solution: check when stuck |

1. Open a blank notebook → read the problem description
2. Implement your solution → use only basic PyTorch operations
3. Debug freely → print(x.shape), check gradients, etc.
...

from torch_judge import check, hint, status
check("relu") # grade your implementation
hint("causal_attention") # get a hint without full spoilers
...

Total: ~12–16 hours spread across 3–4 weeks. A good fit for interview prep on a deadline.

| Week | Focus | Problems | Time |
|---|---|---|---|
| 1 | 🧱 Foundations | ReLU → Softmax → CE Loss → Dropout → Embedding → GELU → Linear → LayerNorm → BatchNorm → RMSNorm → SwiGLU MLP → Conv2d | 2–3 hrs |
| 2 | 🧠 Attention Deep Dive | SDPA → MHA → Cross-Attn → Causal → GQA → KV Cache → Sliding Window → RoPE → Linear Attn → Flash Attn | 3–4 hrs |
| 3 | 🏗️ Architecture + Training | GPT-2 Block → LoRA → MoE → ViT Patch → Adam → Cosine LR → Grad Clip → Grad Accumulation → Kaiming Init | 3–4 hrs |
| 4 | 🎯 Inference + Advanced | Top-k/p Sampling → Beam Search → Speculative Decoding → BPE → INT8 Quant → DPO Loss → GRPO Loss → PPO Loss + speed run | 3–4 hrs |

┌──────────────────────────────────────────┐
│ Docker / Podman Container │
│ │
...

Single container. Single port. No database. No frontend framework. No GPU.

make run # Build & start (http://localhost:8888)
make stop # Stop the container
make clean # Stop + remove volumes + reset all progress

TorchCode uses auto-discovery — just drop a new file in torch_judge/tasks/

TASK = {
"id": "my_task",
"title": "My Custom Problem",
...

No registration needed. The judge picks it up automatically.

The judge is published as a separate package so Colab users can `pip install torch-judge` without cloning the repo.

Pushing to master after changing the package version triggers `.github/workflows/pypi-publish.yml`, which builds and uploads to PyPI. No git tag is required.

1. Bump the version in `torch_judge/_version.py` (e.g. `__version__ = "0.1.1"`).
2. Configure PyPI Trusted Publisher (one-time):
   - PyPI → Your project torch-judge → Publishing → Add a new pending publisher
   - Owner: duoan, Repository: TorchCode, Workflow: pypi-publish.yml, Environment: (leave empty)
   - Run the workflow once (push a version bump to master, or Actions → Publish torch-judge to PyPI → Run workflow); PyPI will then link the publisher.
3. Release: commit the version bump and `git push origin master`.

Alternatively, use an API token: add the repository secret PYPI_API_TOKEN (value = pypi-... from PyPI) and set TWINE_USERNAME=__token__ and TWINE_PASSWORD from that secret in the workflow if you prefer not to use Trusted Publishing.

pip install build twine
python -m build
twine upload dist/*

The version lives in `torch_judge/_version.py`; bump it before each release.

Do I need a GPU?
No. Everything runs on CPU. The problems test correctness and understanding, not throughput.

Can I keep my solutions between runs?
Blank templates reset on every `make run` so you practice from scratch. Save your work under a different filename if you want to keep it. You can also click the 🔄 Reset button in the notebook toolbar at any time to restore the blank template without restarting.

Can I use Google Colab instead?
Yes! Every notebook has an Open in Colab badge at the top. Click it to open the problem directly in Google Colab, with no Docker or local setup needed. You can also use the Colab toolbar button inside JupyterLab.

How are solutions graded?

AI-generated content
