AI vendors spent most of May making announcements—and pushing their way into almost every category here. But it’s not the only story worth watching. Doctors have used CRISPR to correct the DNA of a baby with a rare and previously untreatable condition. We won’t know whether the treatment worked for years, but the baby appears to be thriving. And a startup is now selling the ultimate in neural networks. It’s made from living (cultured) neurons and includes a life-support system that will keep the neurons going for a few weeks. I’m not entirely convinced this is real, but I still want to know when it will be able to beat AlphaGo.
Artificial Intelligence
- Anthropic has released the first two models in the Claude 4 series: Sonnet and Opus. These are hybrid reasoning models that give users control over the amount of time spent “thinking.” They can use tools in parallel and (if given local file access) remember information across a series of requests. There’s a sketch of the thinking control after this list.
- The new Claude 4 models have a surprising “agentic” property: They might contact law enforcement if they think you are doing something illegal. Who needs a back door? As far as we know, this behavior has only been seen in Anthropic’s research on alignment. But we can imagine that training a model to eliminate this behavior might have its own legal consequences.
- Stitch is a Google experiment in using LLMs to help design and generate user interfaces. You can describe UI ideas in natural language, generate and iterate on wireframes, and eventually generate code or paste your design into Figma.
- Google DeepMind is experimenting with diffusion models, which are typically used for image generation, as the basis for text generation in Gemini. DeepMind claims that diffusion models can be faster and give users more control. The model isn’t publicly available, but there’s a waitlist.
- Mistral has announced Devstral, a new language model optimized for agentic coding tasks. It’s open source and small enough (24B parameters) to run on a well-equipped laptop. It attempts to bridge the gap between simply generating code and real-world software development.
- Meta has announced its Llama Startup Program, which will give startups up to $6,000/month to pay for using hosted Llama services, in addition to providing technical assistance from the Llama team.
- LangChain has announced Open Agent Platform (OAP), a no-code platform for building AI agents. OAP is open source and available on GitHub. You can also experiment with it online.
- Google has announced Gemma 3n, a new multimodal model in their Gemma series. Gemma 3n has been designed specifically for mobile devices. It uses a technique called per-layer embeddings to reduce its memory requirements to 3 GB for a model with 8B parameters.
- The United Arab Emirates will be using AI to help draft its laws. Bruce Schneier has an excellent discussion. Using AI to write laws is neither new nor necessarily antihuman; AI can be (and has been) designed to empower people rather than to concentrate power.
- DeepMind has built AlphaEvolve, a new general-purpose model that uses an evolutionary approach to creating new algorithms and improving old ones. We’re not the only ones asking, “Is it a model? Or is it an agent?” AlphaEvolve isn’t available to the public.
- For some time, xAI’s Grok LLM was turning almost every conversation into a discussion of “white genocide” in South Africa. This isn’t the first time Grok has delivered strange and unwanted output. Rather than being “unbiased,” it appears to be reflecting Elon Musk’s obsessions.
- Things that are easy for humans but hard for AI: LegoGPT can design a Lego structure based on a text prompt. The structure will be buildable with real Lego pieces and able to stand up when assembled. Now we only need a robot to assemble it.
- Microsoft has announced reasoning versions of its Phi-4 models. There are three versions: reasoning, mini-reasoning, and reasoning plus. All of these models are relatively small; reasoning is 14B parameters, and mini-reasoning is only 3.8B.
- Google has released Gemini 2.5 Pro Preview (I/O Edition). It promises improved performance when generating code, and has a video-to-code capability that can generate applications from YouTube videos.
- If you’re confused by OpenAI’s naming conventions (or lack thereof), the company’s posted a helpful summary of all its models and recommendations about when each model is appropriate.
- A new automated translation system can track multiple speakers and translate multiple languages simultaneously. One model tracks the location and voice characteristics of individual speakers; another does the translation.
- Mistral has announced Le Chat Enterprise, an enterprise solution for chat-based AI. It can run on-premises and connect to a company’s documents, data sources, and other tools.
- Semantic caching is a way of improving performance and reducing cost for AI applications. It’s essentially caching prompts and responses and returning a cached response whenever a new prompt is sufficiently similar to one that has already been answered. A minimal implementation sketch follows this list.
- Anthropic has announced Claude Integrations. Integrations uses MCP to connect Claude to existing apps and services. Supported integrations include consumer applications like PayPal, tools like Confluence, and providers like Cloudflare.
- Google has updated its Music AI Sandbox with new models and new features. Unlike music generators like Suno, the Music AI Sandbox is designed as a creative tool for musicians to work with: editing, extending, and generating musical clips.
- Video deepfakes can now have a heartbeat. One way of detecting deepfakes has been to look for the subtle changes in skin color that are caused by a heartbeat. Now deepfakes can get around that test by simulating a pulse.
- Google has built DolphinGemma, a language model trained on dolphin vocalizations. While the model can predict the next sound in a sequence, we don’t yet know what dolphins are saying; DolphinGemma may help us learn!
- The SHADES dataset has been designed to help model developers find and eliminate harmful stereotypes and other discriminatory behavior. SHADES is multilingual; it was built by observing how models respond to stereotypes. The dataset is available from Hugging Face.
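
A couple of the items above are concrete enough to sketch in code. First, the “thinking” control mentioned in the Claude 4 item. Here’s a minimal TypeScript sketch of Anthropic’s extended thinking parameter; the model ID and token budgets are assumptions, so check Anthropic’s current documentation before relying on them.

```ts
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

const message = await client.messages.create({
  model: "claude-opus-4-20250514", // assumed model ID; check the docs
  max_tokens: 4096,
  // The thinking budget caps how many tokens the model may spend reasoning
  // before it answers; a larger budget buys more deliberation at the cost
  // of latency and tokens.
  thinking: { type: "enabled", budget_tokens: 2048 },
  messages: [{ role: "user", content: "How many primes are less than 100?" }],
});

// The response interleaves "thinking" blocks with the final "text" blocks.
for (const block of message.content) {
  if (block.type === "text") console.log(block.text);
}
```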
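
Second, semantic caching is simple enough to fit in a screenful of code. This is a minimal sketch, assuming `embed` and `callModel` stand in for a real embedding model and LLM call; a production cache would use a vector index rather than a linear scan.

```ts
type Embedder = (text: string) => Promise<number[]>;
type Model = (prompt: string) => Promise<string>;

type CacheEntry = { embedding: number[]; response: string };

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Wraps a model so that prompts semantically close to one already answered
// are served from the cache instead of triggering a new model call.
function withSemanticCache(embed: Embedder, callModel: Model, threshold = 0.95): Model {
  const cache: CacheEntry[] = [];
  return async (prompt) => {
    const embedding = await embed(prompt);
    const hit = cache.find((e) => cosine(e.embedding, embedding) >= threshold);
    if (hit) return hit.response; // cache hit: no model call, near-zero cost
    const response = await callModel(prompt);
    cache.push({ embedding, response });
    return response;
  };
}
```

The threshold is the hard part: set it too low and users get answers to questions they didn’t ask; set it too high and the cache never hits.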
Programming
- Microsoft has open-sourced the Windows Subsystem for Linux (WSL).
- Jules is Google’s entry in the agent-enabled coding space. It uses Gemini and proclaims, “Jules does the coding tasks you don’t want to do.” Of course it integrates with GitHub; it works on your code in a cloud VM, creates and runs tests, and shows its reasoning.
- Hardware description languages are difficult and opaque; they bear little resemblance to the higher-level languages most programmers use. Spade is a new HDL that was designed with modern high-level programming languages in mind; it’s heavily influenced by Rust.
- OpenAI has released Codex, a coding agent based on a new version of o3 that has had specialized training for programming. It can pull a codebase from a Git repo, write new code, generate pull requests, and use a sandbox for testing. It’s only available to Pro subscribers.
- When generating code, LLMs have a problematic tendency to write too much, to favor verbose and overengineered solutions. Fred Benenson discusses the problem and offers some solutions.
- Nix is a package manager that can do a lot to improve supply chain security. Its goal is to prove the integrity of the sources used to build software, track all the sources and toolchains used in the build, and export the sources used in each release to facilitate third-party audits.
- OpenAI has announced a connector that allows ChatGPT’s deep research feature to investigate code on GitHub. How will deep research perform on legacy codebases? We’ll see.
- There’s a proposal for explicit resource management in JavaScript. `using` and `await using` declarations ensure that resources are disposed of when they go out of scope; there’s a sketch after this list.
- DeepWiki is a “free encyclopedia of all GitHub repos.” You get an (apparently) AI-generated summary of the repository, plus a chatbot that answers questions about how to use the repo.
- A “code smells” catalog is a nice and useful piece of work. The website is a bit awkward, but it’s searchable and has detailed explanations of software antipatterns, complete with examples and solutions.
- For those who don’t remember their terminal commands: Zev is a command-line tool that uses AI (OpenAI, Google Gemini, Azure OpenAI, or Ollama) to take a plain-language description of what you want to do and convert it to a command. You can either copy/paste the command or execute it via a menu.
- Docker has introduced Docker Model Runner, another way to run large language models locally. Running a model is as simple as running a container.
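
To make the resource management proposal concrete, here’s a short sketch. TypeScript shipped the feature in version 5.2, and recent Node versions supply `Symbol.dispose`; `TempDir` is a hypothetical resource, not part of any library.

```ts
import { mkdtempSync, rmSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// A hypothetical disposable resource: a temp directory that cleans itself up.
class TempDir implements Disposable {
  readonly path = mkdtempSync(join(tmpdir(), "scratch-"));
  [Symbol.dispose]() {
    rmSync(this.path, { recursive: true, force: true });
  }
}

function buildArtifacts() {
  using work = new TempDir();
  // ...write intermediate files under work.path...
} // work[Symbol.dispose]() runs here, even if an exception was thrown
```

`await using` does the same for asynchronous cleanup via `Symbol.asyncDispose`; think database connections.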
Web
- CSS Minecraft is a Minecraft clone that runs in the browser, implemented entirely in HTML and CSS. No JavaScript is involved. Here’s an explanation of how it works.
- Microsoft has announced NLWeb, a project that allows websites to integrate MCP support easily. The result: Any website can become an AI app.
- 10Web has built a no-code generative AI application for building ecommerce sites. What distinguishes it is that it generates code that can run on WordPress, and that it allows customers to “white-label” new sites, reselling the prompt-to-site capability under their own brands.
- What if your browser had agentic AI completely integrated? What if it were built around AI from the start, not as an add-on? It might be like Strawberry.
- A survey of web developers says that, while most developers are using AI, under 25% of their code is generated by AI. A solid majority (76%) say more than half of AI-generated code needs to be refactored before it can be used.
Security
- The secure messaging application Signal has added a feature that prevents Microsoft’s Recall from taking screenshots of the app. It’s an interesting hack that uses Windows’ built-in DRM to disable screenshots on a per-app basis. There’s a sketch of the mechanism after this list.
- How do you distinguish good bots and agents from malicious ones? Cloudflare suggests using cryptography—specifically, the HTTP Message Signature standard. OpenAI is already doing so. A simplified signing sketch appears after this list.
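
About that Signal hack: Signal Desktop is an Electron app, and Electron exposes the relevant Windows call directly. Content protection maps to `SetWindowDisplayAffinity` with `WDA_EXCLUDEFROMCAPTURE`, which makes the window appear blank to capture tools, Recall included. A minimal sketch of the general Electron mechanism (not Signal’s actual code):

```ts
import { app, BrowserWindow } from "electron";

app.whenReady().then(() => {
  const win = new BrowserWindow({ width: 800, height: 600 });
  // On Windows this calls SetWindowDisplayAffinity(WDA_EXCLUDEFROMCAPTURE),
  // so screen-capture tools see a blank window instead of the app's contents.
  win.setContentProtection(true);
  win.loadFile("index.html");
});
```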
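
And here’s roughly what HTTP Message Signatures (RFC 9421) involve: the client signs selected request components and sends the result in headers that the origin can verify against a published public key. This sketch is heavily simplified; real signature bases cover more components, and the key and `keyid` are hypothetical.

```ts
import { createPrivateKey, sign } from "node:crypto";

// Hypothetical Ed25519 signing key, loaded as a PKCS#8 PEM string.
const privateKey = createPrivateKey(process.env.BOT_SIGNING_KEY!);

function signatureHeaders(authority: string, keyId: string) {
  const created = Math.floor(Date.now() / 1000);
  const params = `("@authority");created=${created};keyid="${keyId}";alg="ed25519"`;
  // The signature base lists each covered component, then the parameters.
  const base = `"@authority": ${authority}\n"@signature-params": ${params}`;
  const sig = sign(null, Buffer.from(base), privateKey); // Ed25519: digest is null
  return {
    "Signature-Input": `sig1=${params}`,
    Signature: `sig1=:${sig.toString("base64")}:`,
  };
}
```

A verifier reconstructs the same signature base from the request and the Signature-Input header, then checks the signature against the bot’s published key.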
Quantum Computing
- Researchers have demonstrated quantum error correction for qudits—like qubits, but with three or more states rather than two.
Biology
- Cortical Cloud claims to be a programmable biological computer: lab-grown neurons with a digital interface and a life-support system in a box. When will it be able to play chess?
Virtual and Augmented Reality
- Google glasses are back? Google announced a partnership with Warby Parker to build AI-enabled AR/VR glasses based on Android XR. The AI will run on your (Android) phone.