TL;DR
On 14 May 2026, the Wall Street Journal and Calif (a Palo Alto research firm) disclosed the first public macOS kernel memory corruption exploit on Apple M5 silicon, defeating Memory Integrity Enforcement (MIE). The chain was built using Anthropic's Claude Mythos Preview.
The work took five days end-to-end. Mythos didn't act alone — Calif's engineers paired it with human expertise. The point of the experiment was the pairing.
Mythos has already found thousands of zero-days across every major OS and browser, with a 72.4% working-exploit rate against Firefox JS shell targets (Opus 4.6: 14.4%). Fewer than 1% of what it has found has been patched. Palo Alto Networks estimates a three-to-five-month window before AI-driven exploitation becomes the norm outside Project Glasswing.
Australia's APRA-regulated entities, ASD-aligned operators, and any organisation under ISO 27001 / 42001 should be re-running their patching, compensating-controls and third-party AI-risk assumptions this quarter, not next year.
1. What actually happened
The timeline
- 7 Apr: Anthropic announces Claude Mythos Preview and Project Glasswing (Apple, AWS, Google, Microsoft, CrowdStrike, NVIDIA, Palo Alto Networks, Linux Foundation, JPMorgan, Cisco, Broadcom as founding partners). Anthropic's red team blog shows Mythos finding a 27-year-old OpenBSD bug and a 17-year-old FreeBSD remote root (CVE-2026-4747).
- 25 Apr: Calif's Bruce Dang finds the first macOS kernel bug.
- 27 Apr: Dion Blazakis joins Calif. A second linked vulnerability is identified. Mythos Preview is deployed to mine exploit paths.
- 1 May: Working data-only kernel LPE chain lands. Unprivileged local user → root on macOS 26.4.1 (25E253), bare-metal M5, with kernel MIE enabled.
- 11 May: Calif walks into Apple Park and hands over a 55-page report, laser-printed. macOS 26.5 (Tahoe) ships with a kernel-level fix credited to Calif and Anthropic.
- 14 May: WSJ breaks the story; Calif publishes its write-up; Mashable, Engadget, TechRadar, 9to5Mac, MacRumors, SC Media, Tom's Hardware follow within hours.
What the exploit actually does
Class: data-only kernel local privilege escalation (LPE). Two vulnerabilities chained, plus several techniques.
Target: macOS 26.4.1 (25E253), bare-metal M5, kernel MIE enabled.
Entry point: unprivileged local user account, using only normal system calls — no entitlements, no kernel extensions, no special drivers.
End state: root shell on the host.
Why "data-only" matters: the chain corrupts kernel data (credential structures), not kernel code. MIE's memory-tag checks are designed primarily to stop code corruption and classic use-after-free (UAF) and out-of-bounds (OOB) code paths. Calif found a way to manipulate allocation timing and tagging windows so the credential overwrite never trips MIE's tag check.
The Mythos role: the model recognised the bug class quickly because it generalises across known classes. The MIE bypass needed human craft on top — "this is where human expertise comes in", per Calif.
"Mythos Preview is powerful: once it has learned how to attack a class of problems, it generalises to nearly any problem in that class." (Calif, 14 May 2026)
What Apple actually built — and what just got beaten
Apple's Memory Integrity Enforcement (MIE) is the marquee security feature on the M5 and A19. It uses ARM Memory Tagging Extension (MTE) in hardware to tag memory allocations, then refuses reads and writes whose pointers don't carry the matching tag. Apple's own research says MIE "disrupts every public exploit chain against modern iOS", including the leaked Coruna and Darksword exploit kits.
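To make the tag check concrete, here is a toy Python model of the idea. It is a sketch only, not Apple's implementation: it assumes the 4-bit tags and 16-byte granules of the ARM MTE design, and every name in it (TaggedHeap, malloc, free, load, TagMismatch) is hypothetical. The allocator also avoids reusing a neighbour's tag purely so the demo behaves the same on every run.

```python
import secrets

GRANULE = 16   # ARM MTE tags memory in 16-byte granules
TAG_BITS = 4   # each granule carries a 4-bit tag (values 0-15)


class TagMismatch(Exception):
    """Raised when a pointer's tag differs from the tag on the memory it touches."""


def granules_for(size: int) -> int:
    """Number of whole granules needed to hold `size` bytes (rounded up)."""
    return (size + GRANULE - 1) // GRANULE


class TaggedHeap:
    """Toy bump allocator with per-granule tags. Hypothetical names, not a real API."""

    def __init__(self, size: int) -> None:
        self.data = bytearray(size)
        self.tags = [0] * (size // GRANULE)
        self.next_granule = 0

    @staticmethod
    def _fresh_tag(exclude: set) -> int:
        # Random non-zero tag that is not in the excluded set.
        while True:
            tag = 1 + secrets.randbelow((1 << TAG_BITS) - 1)
            if tag not in exclude:
                return tag

    def malloc(self, size: int) -> tuple:
        """Allocate `size` bytes and return a 'pointer' as (address, tag)."""
        count = granules_for(size)
        first = self.next_granule
        if first + count > len(self.tags):
            raise MemoryError("toy heap exhausted")
        # Assumption for a deterministic demo: never reuse the left neighbour's tag.
        neighbour = {self.tags[first - 1]} if first > 0 else set()
        tag = self._fresh_tag(neighbour)
        for g in range(first, first + count):
            self.tags[g] = tag
        self.next_granule += count
        return first * GRANULE, tag

    def free(self, ptr: tuple, size: int) -> None:
        """Retag the granules so stale pointers no longer match."""
        addr, old_tag = ptr
        first = addr // GRANULE
        new_tag = self._fresh_tag({old_tag})
        for g in range(first, first + granules_for(size)):
            self.tags[g] = new_tag

    def load(self, ptr: tuple, offset: int = 0) -> int:
        """Read one byte, but only if the pointer tag matches the memory tag."""
        addr, tag = ptr
        if self.tags[(addr + offset) // GRANULE] != tag:
            raise TagMismatch(f"pointer tag {tag} does not match memory tag")
        return self.data[addr + offset]


if __name__ == "__main__":
    heap = TaggedHeap(64)
    a = heap.malloc(16)
    b = heap.malloc(16)
    heap.load(a)                      # in bounds: allowed
    try:
        heap.load(a, offset=16)       # runs into b's granule
    except TagMismatch as err:
        print("out-of-bounds read blocked:", err)
    heap.free(a, 16)
    try:
        heap.load(a)                  # stale pointer after free
    except TagMismatch as err:
        print("use-after-free blocked:", err)
```

The model shows why the two classic bug shapes trip the check: the pointer's tag no longer equals the tag on the memory it reaches. It says nothing about how the Calif chain avoided the check; that detail is not in the public summary above.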
Five years. A reported five billion dollars on the Apple Park campus alone. Hundreds of engineers. The most locked-down consumer platform on the planet. Defeated in five working days by a nine-person research firm pair-programming with a model.
This is the part nobody in Australian boardrooms has really sat with yet.
2. Why people are calling this the Manhattan moment of cyber
The phrase isn't a marketing flourish. It's load-bearing.
- Ben Seri (CTO, Zafran) — "The power to find and fix is the power to find and weaponise."
- Naveen Krishnan (Belfer Center / USN Reserve), writing in War on the Rocks, maps Mythos onto the nuclear analogy, noting that Dario Amodei keeps copies of Richard Rhodes' The Making of the Atomic Bomb on Anthropic's coffee tables. He estimates a 6–18 month window before equivalent capability exists in open-source or foreign hands.
- Center for Humane Technology — "Anthropic's Mythos has changed cybersecurity forever."
- World Economic Forum — cyber resilience has shifted from episodic and manageable to persistent and expanding.
- Palo Alto Networks — a three-to-five-month window for organisations to outpace adversaries.
- Rubrik — "Every frontier model release is now also a cyber capability release, whether the lab intended it or not."
The analogy isn't that Mythos kills people. It's that, like the bomb, it is a step change that breaks the deterrence logic that came before it. Three properties make it Manhattan-shaped:
- Asymmetric birth. A single private lab — not a state — built it. The U.S. doctrine of persistent engagement assumed nation-state attackers and nation-state defenders. Mythos collapses that.
- Proliferation curve. Anthropic's own estimate: comparable capability outside Glasswing in 6 to 18 months. Earlier offensive tooling, such as the toolset of the NSA-linked Equation Group, took years to leak. Mythos compresses the arc into a single model-release cycle.
- Defender economics inversion. An attacker needs one exploitable bug. A defender needs to find and fix every bug across every system, continuously, before any adversary finds any of them. Mandiant's M-Trends 2026 already shows exploits available a week before the corresponding patch ships. Mythos-class velocity widens that gap.
3. Ramifications — what actually changes
3.1 The patch cycle as we know it is dead
Anthropic has disclosed thousands of zero-days. Fewer than 1% have been patched. The disclosure architecture (researcher finds bug, vendor patches, customers deploy over weeks to months) was built when humans found vulnerabilities one at a time. It cannot absorb the throughput of a model that turns a known CVE plus its patch into a working exploit in under a day for under USD 2,000.
The time-to-exploit window is now smaller than the time-to-deploy window in almost every regulated industry I work with.
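The comparison is simple arithmetic, and it is worth writing down per environment. A minimal sketch, assuming both clocks start when the patch is published: the one-day exploit time is the upper bound of the "under a day" figure above, while the deploy windows and the function name exposure_window are hypothetical placeholders, not measured data.

```python
from datetime import timedelta


def exposure_window(time_to_exploit: timedelta, time_to_deploy: timedelta) -> timedelta:
    """Time during which a working exploit exists but the patch is not yet deployed."""
    return max(time_to_deploy - time_to_exploit, timedelta(0))


# "Under a day" in the text; one day is used here as the upper bound.
TIME_TO_EXPLOIT = timedelta(days=1)

# Hypothetical deploy windows; substitute your own measured figures.
DEPLOY_WINDOWS = {
    "SaaS": timedelta(days=2),
    "endpoints": timedelta(days=14),
    "on-prem": timedelta(days=30),
    "OT": timedelta(days=90),
}

for environment, deploy in DEPLOY_WINDOWS.items():
    window = exposure_window(TIME_TO_EXPLOIT, deploy)
    print(f"{environment}: exposed for {window.days} days")
```

Any environment where the result is above zero is relying on something other than the patch for that period.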
3.2 Architectural diversity is back on the table
Defence in depth still works, and architectural diversity matters. An exploit against one stack does not auto-port to another. Segmentation, identity controls, egress filtering and phishing-resistant MFA all raise the marginal cost for attackers, even at AI speed. Organisations that consolidated onto a single vendor stack for the sake of one pane of glass just learned a hard lesson.
3.3 Compensating controls are the new patching
When the patch backlog is structurally larger than your deploy capacity, the only honest answer is compensating controls applied at the enforcement points: WAF and network-layer signatures, endpoint policy, segmentation, ingress/egress filtering. Microsoft's Active Protections Program (MAPP) has worked this way since 2008. Anthropic will be pushed to build the same for Mythos disclosures.
3.4 Third-party and AI-vendor risk is now your kernel-bug risk
The Mythos side-door incident in April — a small group accessed Mythos via a third-party vendor environment using leaked internal URL patterns and unrevoked credentials — was the warning shot. The Apple M5 work is the live demo. A model with this capability sitting behind any vendor's credential lifecycle is a sovereign-risk problem, not a procurement problem.
3.5 Insurance, attestations, and disclosure regimes will move
Expect questions inside the next 90 days from:
- Cyber insurers, on whether your stack carries AI-enabled threat modelling.
- Auditors against ISO 27001 Annex A and ISO 42001, on whether AI-vendor risk is in your ISMS scope.
- APRA-regulated boards under CPS 230 and CPS 234, on whether "material service provider" definitions now include frontier model providers.
- ASD, on adoption of the Frontier AI board guidance.
4. What this means for Australia
Australian organisations have two specific exposures the U.S. commentary doesn't dwell on.
First: we are a patch-late market. Australian mid-market enterprises and APRA-regulated entities are well behind Project Glasswing partners on patching velocity. The same Mandiant M-Trends 2026 numbers are worse for the AU mid-market when you account for change-management windows, vendor certification, and the small pool of cleared incident responders.
Second: our regulatory cadence is not designed for model-speed disclosure. CPS 230 took effect on 1 July 2025. CPS 234 took effect on 1 July 2019. The Privacy Act reform is still moving. The ASD Frontier AI guidance is excellent but voluntary. None of this presumes a world where a vendor your supplier uses can be cracked by a model in a week.
What boards should be asking this quarter:
- Which of our material service providers sit inside or adjacent to a Project Glasswing partner? Which sit outside it and have no comparable AI-defensive capability?
- What is our patch-to-deploy SLA for critical CVEs, broken down by environment (SaaS, IaaS, on-prem, OT, endpoints)?
- Do our compensating controls actually map to the asset graph? Or do we just have a list of products with green ticks?
- When a model-found zero-day drops on a library we depend on two hops down the SBOM, who owns the call?
- Where is the human-in-the-loop on agentic remediation? What is pre-authorised, what is not?
If you can't answer those five questions inside an hour with the people you have, that is the audit finding.
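The "two hops down the SBOM" question is answerable mechanically if the dependency graph and an owner map exist. A minimal sketch using breadth-first search: the component names, the owner titles and the function names (hops_to, who_owns_the_call) are all hypothetical, and a real SBOM in SPDX or CycloneDX form would need parsing into this shape first.

```python
from collections import deque
from typing import Optional

# Hypothetical dependency graph: component -> its direct dependencies.
DEPENDS_ON = {
    "payments-api": ["web-framework", "db-driver"],
    "web-framework": ["http-parser"],
    "db-driver": [],
    "reporting-job": ["db-driver"],
    "http-parser": [],
}

# Hypothetical decision owners for each application.
OWNERS = {
    "payments-api": "Head of Payments Engineering",
    "reporting-job": "Data Platform Lead",
}


def hops_to(graph: dict, root: str, target: str) -> Optional[int]:
    """Breadth-first search: dependency hops from root to target, or None if unreachable."""
    seen = {root}
    queue = deque([(root, 0)])
    while queue:
        node, depth = queue.popleft()
        if node == target:
            return depth
        for dep in graph.get(node, []):
            if dep not in seen:
                seen.add(dep)
                queue.append((dep, depth + 1))
    return None


def who_owns_the_call(graph: dict, owners: dict, vulnerable: str) -> list:
    """List (application, hops, owner) for every owned application that reaches the library."""
    hits = []
    for app, owner in owners.items():
        hops = hops_to(graph, app, vulnerable)
        if hops is not None:
            hits.append((app, hops, owner))
    return hits


print(who_owns_the_call(DEPENDS_ON, OWNERS, "http-parser"))
# [('payments-api', 2, 'Head of Payments Engineering')]
```

If the owner map cannot be filled in before the incident, the lookup returns names nobody has agreed to, which is the finding.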
5. The future of cyber protection
The Mythos moment is not the end of defence — it is the end of one defensive model and the beginning of another. Four shifts are already visible.
Shift 1: From periodic patching to continuous exposure management
The vendors moving fastest (Palo Alto Networks, Rubrik, Zafran, the agentic-remediation cohort) have already pivoted to Continuous Threat Exposure Management (CTEM): live mapping of vulnerabilities to compensating controls, ranked by actual exposure path, with automated mitigation pushed through governed pipelines.
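Ranking "by actual exposure path" rather than by raw severity can be illustrated in a few lines. This is a sketch under stated assumptions, not any vendor's CTEM algorithm: the Finding fields, the ordering rule and the finding identifiers are hypothetical, and real products weigh far more signals.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    ident: str        # placeholder identifier, deliberately not a real CVE
    severity: float   # 0-10, for example a CVSS base score
    reachable: bool   # is there a live path from an attacker to the asset?
    mitigated: bool   # does a compensating control already cover that path?


def exposure_rank(findings: list) -> list:
    """Order findings so reachable, unmitigated ones come first, then by severity."""
    def key(f: Finding):
        exposed = f.reachable and not f.mitigated
        return (exposed, f.severity)
    return sorted(findings, key=key, reverse=True)


FINDINGS = [
    Finding("FINDING-A", 9.8, reachable=False, mitigated=False),
    Finding("FINDING-B", 7.5, reachable=True, mitigated=False),
    Finding("FINDING-C", 9.0, reachable=True, mitigated=True),
    Finding("FINDING-D", 8.1, reachable=True, mitigated=False),
]

for f in exposure_rank(FINDINGS):
    print(f.ident, f.severity)
```

In this made-up data the highest-severity finding drops to third place because nothing can reach it, while two lower-severity findings with an open path move to the top.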
Shift 2: From products to architectures
- Memory-safe languages for new code. The economic case is now overwhelming. Rust, Swift, Go, modern C++ with hardening — anywhere you can.
- Hardware-assisted memory safety (MTE / MIE / CHERI) as a baseline expectation on new platforms, with the explicit understanding that MIE-class controls slow, not stop, model-paired attackers.
- Architectural diversity. Don't run one OS, one browser, one identity provider end-to-end through the crown jewels.
Shift 3: From human-speed response to agentic defence with governed autonomy
This is where most Australian boards are uncomfortable, and rightly so. Agentic remediation means a defender agent with the authority to push a WAF rule, revoke a credential, isolate a host, or roll a key without a human ticket. The honest answer is:
- The volume of model-discovered findings will exceed any human SOC's triage capacity inside 12 months.
- The only way to keep up is automation, and the only safe automation is policy-bounded, audited, and reversible.
- The governance work is the work. ISO 42001, the EU AI Act high-risk categories, NIST AI RMF, ASD Frontier AI — these are not compliance overhead, they are the operating manual.
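"Policy-bounded, audited, and reversible" can be expressed as a small gate that every agent action passes through. A minimal sketch, assuming a pre-authorised list of action kinds agreed by the board: the class names (Action, GovernedAgent), the action kinds and the in-memory blocklist are hypothetical stand-ins for real enforcement systems.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable


@dataclass
class Action:
    kind: str                      # e.g. "push_waf_rule"
    target: str
    apply: Callable[[], None]      # how to carry the action out
    rollback: Callable[[], None]   # how to undo it


@dataclass
class GovernedAgent:
    pre_authorised: set
    audit_log: list = field(default_factory=list)
    applied: list = field(default_factory=list)

    def request(self, action: Action) -> str:
        """Apply the action only if policy allows it; otherwise hand it to a human."""
        if action.kind in self.pre_authorised:
            action.apply()
            self.applied.append(action)
            decision = "applied"
        else:
            decision = "escalated_to_human"
        self._log(action, decision)
        return decision

    def rollback_last(self) -> None:
        """Undo the most recent applied action and record that it was undone."""
        action = self.applied.pop()
        action.rollback()
        self._log(action, "rolled_back")

    def _log(self, action: Action, decision: str) -> None:
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "kind": action.kind,
            "target": action.target,
            "decision": decision,
        })


blocked_ips = set()  # stand-in for a real WAF blocklist
agent = GovernedAgent(pre_authorised={"push_waf_rule", "revoke_credential"})

waf = Action(
    "push_waf_rule", "203.0.113.7",
    apply=lambda: blocked_ips.add("203.0.113.7"),
    rollback=lambda: blocked_ips.discard("203.0.113.7"),
)
isolate = Action("isolate_host", "db-01", apply=lambda: None, rollback=lambda: None)

print(agent.request(waf))       # applied
print(agent.request(isolate))   # escalated_to_human
agent.rollback_last()
print([entry["decision"] for entry in agent.audit_log])
```

The design choice that matters is that the pre-authorised set is data, not code: the policy conversation decides what goes in it, and the audit log records every decision, including the ones the agent was not allowed to make.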
Shift 4: From "the supplier list" to the five new risk surfaces
The Apple M5 incident lights up four of the five in a single story:
- Model provider risk — Anthropic itself.
- Third-party identity and credential lifecycle — Calif's Project Glasswing access route, the April side-door incident.
- Agent / inference endpoint risk — the model is the weapon and the workbench at the same time.
- Prompt-channel and telemetry risk — every red-team prompt is exfiltrable IP.
- Hardware-mitigation assumption risk — MIE was the marketed defence; it wasn't enough alone.
If your AI risk register has "vendor due diligence" as a single line item, it is not fit for purpose.
6. The bigger pattern: what this says about the next 18 months
Four things I would bet on, in order of confidence:
- More public Mythos-paired exploits against Linux kernel, Windows, Chrome, Safari, and at least one major hypervisor before end of 2026. Calif's signal — "this work is a glimpse of what is coming" — was not throwaway.
- A non-Glasswing equivalent capability — open-source, foreign, or a smaller frontier lab — public within 6–18 months, in line with Anthropic's own and Krishnan's estimates.
- A regulatory inflection in at least one major jurisdiction (EU AI Act high-risk reclassification, U.S. CISA mandatory sharing, or APRA / ASD guidance update) explicitly naming model-discovered vulnerabilities.
- An insurance market repricing event when the first model-found zero-day causes a multi-jurisdiction outage. The Mandiant data already prefigures this.
None of this is about doom. It is about getting the next decision right.
7. What to do — this week, this quarter, this year
This week
- Pull your patching SLA dashboard. If you do not have one with time-to-deploy by criticality and environment, that is task one.
- Establish a standing AI-vendor risk register. Frontier model providers and any vendor that uses one are now in scope.
- Re-read your CPS 230 material service provider list through the Mythos lens. Anything missing?
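For the first item, the core of a patching SLA dashboard is one grouped calculation. A minimal sketch, assuming you can export patch records with a release date and a deploy date: the records, the SLA targets and the function names (time_to_deploy, sla_breaches) are hypothetical, and the median is just one reasonable summary.

```python
from collections import defaultdict
from datetime import date
from statistics import median

# Hypothetical records: (criticality, environment, patch released, patch deployed).
PATCHES = [
    ("critical", "SaaS", date(2026, 5, 1), date(2026, 5, 2)),
    ("critical", "SaaS", date(2026, 5, 3), date(2026, 5, 6)),
    ("critical", "on-prem", date(2026, 5, 1), date(2026, 5, 22)),
    ("high", "endpoints", date(2026, 5, 1), date(2026, 5, 11)),
    ("critical", "OT", date(2026, 4, 1), date(2026, 5, 31)),
]

# Hypothetical SLA targets in days, by criticality.
SLA_DAYS = {"critical": 2, "high": 14}


def time_to_deploy(records: list) -> dict:
    """Median days from patch release to deployment, per (criticality, environment)."""
    buckets = defaultdict(list)
    for criticality, environment, released, deployed in records:
        buckets[(criticality, environment)].append((deployed - released).days)
    return {key: median(days) for key, days in buckets.items()}


def sla_breaches(medians: dict, sla: dict) -> list:
    """Return the (criticality, environment) pairs whose median exceeds the SLA."""
    return [key for key, days in medians.items() if days > sla[key[0]]]


medians = time_to_deploy(PATCHES)
for (criticality, environment), days in medians.items():
    print(f"{criticality:<9} {environment:<10} median {days} days")
print("breaches:", sla_breaches(medians, SLA_DAYS))
```

Breaking the figure down by environment is the point: an organisation-wide average hides the on-prem and OT rows where the window is longest.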
This quarter
- Run a Mythos-class tabletop: a Mythos-discovered zero-day drops on a library you use two hops down the dependency graph. Patch is a week away. Who decides what gets blocked at the edge in the meantime?
- Inventory compensating controls per asset class. Not products. Controls. Map them to MITRE ATT&CK and OWASP LLM Top 10.
- Get agentic remediation policy drafted, even if you don't deploy it yet. The hardest conversation is the policy one, and it cannot be done in an incident.
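The control inventory in the second item is essentially a gap query: for each asset class and each relevant technique, is there at least one control? A minimal sketch, assuming three MITRE ATT&CK techniques (T1190 Exploit Public-Facing Application, T1068 Exploitation for Privilege Escalation, T1078 Valid Accounts): the asset classes, the mappings and the function name control_gaps are hypothetical examples, not a recommended baseline.

```python
# Hypothetical: ATT&CK techniques judged relevant to each asset class.
THREATS = {
    "internet-facing web": ["T1190", "T1078"],
    "endpoints": ["T1068", "T1078"],
    "OT": ["T1190"],
}

# Hypothetical: controls (not products) enforcing against a technique on an asset class.
CONTROLS = {
    ("internet-facing web", "T1190"): ["WAF rule", "egress filtering"],
    ("internet-facing web", "T1078"): ["phishing-resistant MFA"],
    ("endpoints", "T1078"): ["phishing-resistant MFA"],
}


def control_gaps(threats: dict, controls: dict) -> list:
    """Return every (asset class, technique) pair with no compensating control mapped."""
    return [
        (asset, technique)
        for asset, techniques in threats.items()
        for technique in techniques
        if not controls.get((asset, technique))
    ]


for asset, technique in control_gaps(THREATS, CONTROLS):
    print(f"no control mapped: {asset} / {technique}")
```

With this sample data the gaps are privilege escalation on endpoints and public-facing exploitation in OT, which is the kind of answer a product list with green ticks cannot give.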
This year
- Move new code to memory-safe languages where feasible. Re-platform legacy hotspots on a budgeted runway.
- Adopt or align with ISO 42001 and the ASD Frontier AI guidance. Get the ISMS modernisation done while the regulator is still in education mode.
- Build a defensive AI pilot: model-paired vulnerability scanning on your own codebase or that of your highest-risk supplier, with proper data governance.
8. The line worth repeating
Apple is the best-funded, best-engineered, most security-obsessed consumer platform builder in the world. They spent half a decade and reportedly billions on the specific defence that just got walked through in a working week. The lesson is not that Apple did it wrong. The lesson is that the gap between offensive AI and defensive AI is wider than any single vendor, however good, can close on their own.
The Manhattan Project did not end physics. It started a different kind of physics. The same is true here.
Sources
- Calif — First public macOS kernel memory corruption exploit on Apple M5 (14 May 2026)
- Wall Street Journal — Apple's Security Has Been Tough to Crack. Mythos Helped Find a Way In. (14 May 2026)
- MacRumors — Apple Alerted to macOS Security Vulnerability Uncovered With AI Tool (14 May 2026)
- 9to5Mac — Calif details how Anthropic Mythos helped build a working macOS exploit in five days (14 May 2026)
- TechRadar — Claude Mythos turns years of security research into 20-hour AI exploits
- Zafran (Ben Seri) — After Mythos: Preparing for Cybersecurity's Manhattan Project Moment (8 Apr 2026)
- War on the Rocks (Naveen Krishnan) — Anthropic's Nuclear Bomb (16 Apr 2026)
- Palo Alto Networks — Defender's Guide to the Frontier AI Impact on Cybersecurity: May 2026 Update
- World Economic Forum — Cyber resilience was always the goal. Frontier AI makes it urgent.
- Rubrik — Every AI Frontier Model is Now a Cyber Threat
- Anthropic — Project Glasswing