OpenAI · Headline · 2026. 04. 24. 22:35

GPT-5.2-Codex Released: Major Gains in Agentic Coding and Cybersecurity Capabilities

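Summary

OpenAI has released GPT-5.2-Codex, its latest agentic coding model for complex, real-world software engineering. This version focuses on context compaction, better handling of large code changes such as refactors and migrations, and improved performance in Windows environments. Its cybersecurity capabilities are substantially stronger than those of earlier models, to the point where it can assist in real vulnerability discovery. Developers can use it to make long-running coding work more reliable and efficient.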

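Key points

  • GPT-5.2-Codex is an agentic model optimized for complex software engineering work, with particular strengths in context compaction and large code changes.
  • It achieves state-of-the-art results on SWE-Bench Pro and Terminal-Bench 2.0, demonstrating its agentic coding ability in realistic terminal environments.
  • Improved Windows support and long-context understanding let it carry out complex tasks such as large refactors and code migrations more reliably.
  • Its cybersecurity capabilities are substantially stronger and can now assist in real vulnerability research. (It does not yet reach the 'High' level under OpenAI's Preparedness Framework.)
  • Its vision performance has improved, for example in quickly turning design mocks into functional prototypes.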

Introducing GPT-5.2-Codex

Today we’re releasing GPT‑5.2‑Codex, the most advanced agentic coding model yet for complex, real-world software engineering. GPT‑5.2‑Codex is a version of GPT‑5.2 further optimized for agentic coding in Codex, including improvements on long-horizon work through context compaction, stronger performance on large code changes like refactors and migrations, improved performance in Windows environments, and significantly stronger cybersecurity capabilities.

As our models continue to advance along the intelligence frontier, we’ve observed that these improvements also translate to capability jumps in specialized domains such as cybersecurity. For example, just last week, a security researcher using GPT‑5.1‑Codex‑Max with Codex CLI found and responsibly disclosed a vulnerability in React that could lead to source code exposure.

GPT‑5.2‑Codex has stronger cybersecurity capabilities than any model we’ve released so far. These advances can help strengthen cybersecurity at scale, but they also raise new dual-use risks that require careful deployment. While GPT‑5.2‑Codex does not reach a ‘High’ level of cyber capability under our Preparedness Framework, we’re designing our deployment approach with future capability growth in mind.

We're releasing GPT‑5.2‑Codex today in all Codex surfaces for paid ChatGPT users, and working towards safely enabling access to GPT‑5.2‑Codex for API users in the coming weeks. In parallel, we’re piloting invite-only trusted access to upcoming capabilities and more permissive models for vetted professionals and organizations focused on defensive cybersecurity work. We believe that this approach to deployment will balance accessibility with safety.

GPT‑5.2‑Codex builds on GPT‑5.2’s strengths in professional knowledge work and GPT‑5.1‑Codex‑Max’s frontier agentic coding and terminal-using capabilities. GPT‑5.2‑Codex is now stronger at long-context understanding, tool-calling reliability, factuality, and native compaction, making it a more dependable partner for long-running coding tasks, while remaining token-efficient in its reasoning.
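To make the idea of context compaction concrete, here is a minimal Python sketch. It is an illustration only and is not how GPT‑5.2‑Codex or Codex implements compaction, which the announcement does not describe. The `Message` type, the word-count token estimate, the `summarize` stub, and the `budget` and `keep_recent` parameters are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str
    text: str


def estimate_tokens(text: str) -> int:
    # Crude assumption: one token per whitespace-separated word.
    return len(text.split())


def summarize(messages: list[Message]) -> Message:
    # Hypothetical stand-in for a model-written summary: keep only the
    # first five words of each older message.
    parts = [f"{m.role}: {' '.join(m.text.split()[:5])}" for m in messages]
    return Message("summary", " | ".join(parts))


def compact(history: list[Message], budget: int, keep_recent: int = 2) -> list[Message]:
    """Replace older messages with one summary when the history exceeds the budget."""
    keep_recent = max(keep_recent, 0)
    total = sum(estimate_tokens(m.text) for m in history)
    if total <= budget or len(history) <= keep_recent:
        return history  # Nothing to compact.
    split = len(history) - keep_recent
    older, recent = history[:split], history[split:]
    return [summarize(older)] + recent
```

Calling `compact(history, budget=20)` on a long session returns a single summary message followed by the most recent turns. The sketch does not guarantee that the result fits the budget; a fuller version would compact repeatedly or shorten the summary.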

GPT‑5.2‑Codex achieves state-of-the-art performance on SWE-Bench Pro and Terminal-Bench 2.0, benchmarks designed to test agentic performance on a wide variety of tasks in realistic terminal environments. It is also much more effective and reliable at agentic coding in native Windows environments, building on capabilities introduced in GPT‑5.1‑Codex‑Max.

With these improvements, Codex is more capable at working in large repositories over extended sessions with full context intact. It can more reliably complete complex tasks like large refactors, code migrations, and feature builds — continuing to iterate without losing track, even when plans change or attempts fail.

Stronger vision performance enables GPT‑5.2‑Codex to more accurately interpret screenshots, technical diagrams, charts, and UI surfaces shared during coding sessions.

Codex can take design mocks and quickly translate them to functional prototypes, and you can pair with Codex to take these prototypes to production.

[Figures: design mock; prototype generated by GPT‑5.2‑Codex]

When charting performance on one of our core cybersecurity evaluations over time, we see a sharp jump in capability starting with GPT‑5‑Codex, another large jump with GPT‑5.1‑Codex‑Max, and now a third jump with GPT‑5.2‑Codex. We expect that upcoming AI models will continue on this trajectory. In preparation, we are planning and evaluating as though each new model could reach ‘High’ levels of cybersecurity capability, as measured by our Preparedness Framework. While GPT‑5.2‑Codex has not yet reached a ‘High’ level of cyber capability, we are preparing for future models that cross that threshold. Due to the increased cyber capabilities, we have added safeguards in the model and in the product, which are outlined in the system card.

Modern society runs on software, and its reliability depends on strong cybersecurity—keeping critical systems in banking, healthcare, communications, and essential services online, protecting sensitive data, and ensuring people can trust the software they rely on every day. Vulnerabilities can exist long before anyone knows about them, and finding, validating, and fixing them often depends on a community of engineers and independent security researchers equipped with the right tools.

On December 11, 2025, the React team published three security vulnerabilities affecting apps built with React Server Components. What made this disclosure notable was not only the vulnerabilities themselves, but how they were uncovered.

Andrew MacPherson, a principal security engineer at Privy (a Stripe company), was using GPT‑5.1‑Codex‑Max with Codex CLI and other coding agents to reproduce and study a different critical React vulnerability disclosed the week prior, known as React2Shell (CVE-2025-55182). His goal was to evaluate how well the model could assist with real-world vulnerability research.

He initially attempted several zero-shot analyses, prompting the model to examine the patch and identify the vulnerability it addressed. When that did not yield results, he shifted to a higher-volume, iterative prompting approach. When those approaches did not succeed, he guided Codex through standard defensive security workflows—setting up a local test environment, reasoning through potential attack surfaces, and using fuzzing to probe the system with malformed inputs. While attempting to reproduce the original React2Shell issue, Codex surfaced unexpected behaviors that warranted deeper investigation. Over the course of a single week, this process led to the discovery of previously unknown vulnerabilities, which were responsibly disclosed to the React team.
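As a rough illustration of what "using fuzzing to probe the system with malformed inputs" means in a local test environment, the Python sketch below mutates a valid seed input and records any failure the target does not handle cleanly. It is not the workflow or tooling used in the React research, which the post does not detail. The toy `parse_length_prefixed` target, the three mutation strategies, and the choice to treat `ValueError` as the expected rejection are all assumptions for the example.

```python
import random


def parse_length_prefixed(data: bytes) -> bytes:
    """Hypothetical target: first byte is the payload length, the rest is payload."""
    if not data:
        raise ValueError("empty input")
    length = data[0]
    payload = data[1:1 + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    return payload


def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Return a malformed variant of the seed: overwrite, truncate, or append."""
    data = bytearray(seed)
    choice = rng.randrange(3)
    if choice == 0 and data:
        data[rng.randrange(len(data))] = rng.randrange(256)  # overwrite one byte
    elif choice == 1 and data:
        del data[rng.randrange(len(data)):]  # truncate at a random offset
    else:
        data.extend(rng.randrange(256) for _ in range(rng.randrange(1, 8)))  # append junk
    return bytes(data)


def fuzz(target, seed: bytes, iterations: int = 1000, rng_seed: int = 0):
    """Run the target on mutated inputs and collect unexpected exceptions."""
    rng = random.Random(rng_seed)
    unexpected = []
    for _ in range(iterations):
        sample = mutate(seed, rng)
        try:
            target(sample)
        except ValueError:
            pass  # Expected: the target rejected malformed input cleanly.
        except Exception as exc:
            unexpected.append((sample, exc))  # Worth a closer look.
    return unexpected
```

Running `fuzz(parse_length_prefixed, b"\x03abc")` against code you own and are authorized to test returns the inputs that triggered anything other than a clean rejection. Those cases are the "unexpected behaviors" that would warrant deeper investigation.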

This demonstrates how advanced AI systems can materially accelerate defensive security work in widely used, real-world software. At the same time, capabilities that help defenders move faster can also be misused by bad actors.

As agentic systems become more capable in cybersecurity-relevant tasks, we are making it a core priority to ensure these advances are deployed responsibly—pairing every gain in capability with stronger safeguards, tighter access controls, and ongoing collaboration with the security community.

Security teams can run into restrictions when attempting to emulate threat actors, analyze malware to support remediation, or stress test critical infrastructure. We are developing a trusted access pilot to remove that friction for qualifying users and organizations and enable trusted defenders to use frontier AI cyber capabilities to accelerate cyberdefense.

Initially the pilot program will be invite-only for vetted security professionals with a track record of responsible vulnerability disclosure and organizations with a clear professional cybersecurity use case. Qualifying participants will get access to our most capable models for defensive use-cases to enable legitimate dual-use work.

If you’re a security professional or part of an organization doing ethical security work like vulnerability research or authorized red-teaming, we invite you to express interest in joining and to share feedback on what you’d like to see from the program.

GPT‑5.2‑Codex represents a step forward in how advanced AI can support real-world software engineering and specialized domains like cybersecurity—helping developers and defenders tackle complex, long-horizon work, and strengthening the tools available for responsible security research.

By rolling GPT‑5.2‑Codex out gradually, pairing deployment with safeguards, and working closely with the security community, we’re aiming to maximize defensive impact while reducing the risk of misuse. What we learn from this release will directly inform how we expand access over time as the software and cyber frontiers continue to advance.

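AI-generated content

This content was automatically summarized, translated, and analyzed by AI from the original post on the OpenAI Blog. Copyright remains with the original author; please consult the original post for the exact content.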
