🇺🇸 Trump’s Manhattan AI Project, 🛠️ Claude Leads on Code, 🛍️ ChatGPT Becomes a Retail Gatekeeper
National labs get a shared AI supergrid, Claude 4.5 tops coding benchmarks, and ChatGPT starts diverting shoppers from Google and Amazon.
🎵 Podcast
Don’t feel like reading? Listen to it instead.
🖼️ This week’s image aesthetic (Flux 2 Pro): Tilt-shift scenes
📰 Latest News
Trump’s ‘Manhattan Project of AI’ Begins
Genesis Mission is being pitched as the Manhattan Project of AI for science: a US Department of Energy platform, created under Trump’s new executive order, that links national labs and partner institutions into a shared AI super-infrastructure. Researchers can tap frontier-scale models and clustered compute through DOE facilities across the country to run large, complex simulations and experiments that were previously out of reach.
Why it matters: Genesis Mission gives universities and labs a direct route into cutting-edge AI and compute without each institution needing its own mega-data centre. That can accelerate progress in climate modelling, drug design, fusion, materials, and other high-cost fields by letting more teams run bigger, riskier experiments. It is also Trump’s executive-branch answer to the AI race: a bet that funnelling public research and industrial partners through a national AI stack will keep US science ahead, while forcing private AI labs to adapt to a world where the state runs some of the biggest scientific models on earth.
Claude 4.5 Out-Codes Gemini 3 Pro
Anthropic has launched Claude Opus 4.5, its new flagship model tuned for coding, tool use and multi-agent workflows. On the SWE-bench Verified benchmark it currently leads frontier systems, scoring about 80.9% versus 76.2% for Google’s Gemini 3 Pro, with Anthropic and early testers saying it is noticeably stronger on ambiguous, multi-step debugging and real-world tickets rather than just neat coding puzzles.

Why it matters: Opus 4.5 moves Claude closer to “extra senior engineer on the team” than “autocomplete with attitude”: it can plan across whole repos, call tools, and coordinate agent-style workflows with fewer stalls and rewrites, which directly reduces review and debugging time rather than just shifting the workload. That gives Anthropic a sharper edge in the one capability enterprises actually pay for – shipping complex software faster and more reliably – and ramps up pressure on rivals to prove their own models can handle messy legacy codebases and long-running agent tasks, not just look good on benchmark charts.
🌐 More from Artificial Analysis (comprehensive)
Google Drops ‘Nano Banana Pro’: 4K, Legible Text, Live-Web Grounding
Google’s Nano Banana Pro is a new image generation model integrated into Gemini 3. It can produce 4K images, handle complex layouts, and render crisp, readable text while pulling context from the live web to ground what it creates.
Why it matters: Nano Banana Pro lets anyone go from prompt to production-ready visuals – posters, UI mocks, social graphics – with legible text and fine detail that used to require manual tweaking. Because it can also draw on web search, images can reflect up-to-date facts, brands, and real-world context, tightening the loop between research and creation. If it performs as advertised, it gives Google’s AI stack a sharper edge against rivals that still struggle with text in images and lack native search-aware generation.
OpenAI Turns ChatGPT Into a Personal Shopper
ChatGPT just turned into a shopping assistant. OpenAI’s new shopping research mode lets you describe what you want and get a personalised buyer’s guide built from live web data, instead of clicking through comparison sites. It runs on a GPT-5 mini model tuned specifically for shopping, which reads retail pages, checks specs and prices, cites sources and adapts to your feedback, with expanded usage limits available over the holidays. Your chats stay with OpenAI, not the merchants, and results are drawn from organic listings rather than paid slots.
Why it matters: This is a direct shot at Google Shopping, Amazon search and every affiliate review blog that sits between people and products. If users start asking ChatGPT “Which TV should I buy?” instead of searching, OpenAI becomes a new gatekeeper for retail traffic and spend. Merchants will increasingly optimise for being surfaced in ChatGPT’s guides, not just in search results, and OpenAI will face pressure over ranking transparency, bias and potential future monetisation. For consumers, product discovery gets easier and more conversational, but a huge amount of influence over what people buy concentrates in one AI assistant.
Sutskever Declares the ‘Age of Research,’ Not More Scaling
In a rare long-form interview, Ilya Sutskever says the AI boom has left the “Age of Scaling” (just make models bigger) and entered an “Age of Research.” Simply piling on GPUs is running into hard limits: high-quality data is finite, pre-training is hitting diminishing returns, and current frontier models are “jagged” – they can beat hard benchmarks yet still fumble basic tasks because they are over-trained to pass tests rather than truly generalise. His new company, Safe Superintelligence (SSI), is betting on a different stack: focus on value functions, generalisation, and sample efficiency to build a system that can learn almost any job faster than a human, rather than one that just “knows everything” from pre-training.
Why it matters: If Sutskever is right, the centre of gravity in AI shifts from whoever owns the biggest data centres to whoever cracks the core science of learning and alignment. That challenges the current orthodoxy of trillion-dollar capex and ever-larger models, and reframes the race as “who can invent the next paradigm” rather than “who can rent the most GPUs.” His timeline – a super-learner in roughly 5–20 years – also raises the stakes: getting generalisation and values right becomes the bottleneck for safe superintelligence, not just scaling up today’s architectures. For investors and labs, it is a warning shot that the next big leap may come from new ideas, not just more infrastructure.
🌐 More from the Dwarkesh podcast
Last week’s newsletter: