This month, we’ll give AI a rest. Alex Russell has finished an excellent series of posts titled “Reckoning.” It’s a must-read for web developers. If you want to understand why our networks and laptops are much faster than they were 15 or 20 years ago, but the web is slower, it comes down to one thing: bloated JavaScript. Alex offers a way out.
The last XOXO festival is over, and I regret having missed it. XOXO has been a gathering place for experimental artists and others who live and work online. Judging from the many comments on Mastodon, Bluesky, and Twitter, it was great. A key theme was building networks that are healthy, and where people are respected. Look for the videos when they’re posted—I will certainly have them in next month’s trends.
And we have a rare legal section with items on AI regulation, Telegram, and open source licenses.
AI
- Anthropic has published the system prompts for its Claude models. It’s interesting that Claude prefers short responses; it avoids words like “Certainly”; and it can break a longer task into parts, executing each part separately.
- Many developers report huge time savings when using generative AI to understand or update legacy code. Andy Jassy, Amazon’s CEO, has claimed that the company saved 4,500 developer-years by using AI to upgrade 30,000 Java applications from Java 8 to Java 17.
- Generative AI can be used to improve data quality by automating the process of tagging data with metadata.
- Andrew Ng has released a model that lets you experiment with geoengineering by simulating the injection of reflective aerosols into the atmosphere.
- Alibaba has released two new models, Qwen2-Math and Qwen2-Audio, which are (not surprisingly) designed for working with math and audio. We’d like to see the Math model in a matchup against DeepMind’s AlphaProof.
- The Open Source Initiative (OSI) has released version 0.0.9 of their definition of Open Source AI. Their definition requires that training data be recognized as part of an open source system. A stable version of the definition will be announced in October 2024.
- Google has released Imagen 3, its state-of-the-art text-to-image model. You can try it on ImageFX or Vertex.
- Ideogram 2 is another impressive text-to-image model. There’s an iOS app and an API. “Magic prompting” lets you upload an image and generate a description, then use the description as the basis for future prompts. That’s very useful, though it could be the basis for copyright trouble.
- Claude now has user-controllable prompt caching: using the API, users can mark portions of long prompts (including data) to be cached. Cached items don’t need to be recomputed again. Prompts that reuse data in the cache cost much less to run.
- The AI Scientist, an AI system designed to do autonomous scientific research, unexpectedly modified its own code to give it more time to run.
- Nick Hobbs argues that we need AI designers—designers who specialize in designing for AI, who are intimately familiar with AI and its capabilities—to create genuinely innovative new products.
- Simon Willison’s discussion of GPT-4o’s system card is fascinating. Among other things, OpenAI tested whether it could accomplish capture-the-flag challenges against a Linux system. It failed (mostly).
- DeepMind has developed a robotic, AI-driven Ping-Pong player. It is good but not yet a threat to advanced human players.
- The GPT API now features Structured Outputs, which causes it to deliver results according to a user-specified JSON schema. In addition to simplifying software development, the constraint that GPT can only generate output tokens that fit the schema improves its accuracy.
- For about a year, OpenAI has had a watermarking system for GPT that can detect whether a text was written by its AI. The company hasn’t deployed it. It is apparently easy to defeat (by rewriting text with another LLM); OpenAI also feels it would make using GPT less attractive.
- Flux.1 is a new family of open-weight text-to-image models that claims state-of-the-art performance.
- Whisper-Medusa is a speech-to-text model that claims to be twice as fast as OpenAI’s Whisper with equivalent accuracy.
- Google has released Gemma 2 2B, the smallest in its Gemma family of small language models. Google has also released ShieldGemma, a specialized model for detecting harmful inputs and outputs, and Gemma Scope, a set of sparse autoencoders designed to help researchers understand what happens inside language models.
- OpenAI has finally released the voice-enabled ChatGPT bot to a limited group of ChatGPT Plus subscribers. General release to all subscribers should take place this fall. The feature was announced in May but held for further work on safety.
- OpenAI has announced SearchGPT, an AI-based search engine. It isn’t yet available, but you can sign up on a waiting list.
- Angels and Cows is a sci-fi/fantasy novel that was written to help people think about applying artificial intelligence to social justice issues. It is an attempt at a new kind of scientific communications. A free PDF is available.
- Spreadsheets are difficult for language models; their structure doesn’t lend itself to encoding for processing by LLMs. Microsoft has developed a SpreadsheetLLM that has been designed for working with spreadsheets.
Programming
- If you like two-dimensional visual programming, you might like Turnstyle, a new esoteric language. It’s challenging, weird, and fun (if you like that kind of fun).
- The new release of Istio, a part of the Kubernetes ecosystem, no longer requires sidecars to manage networking and security for each service. The new alternative is called “ambient mesh” and is based on a zero-trust tunnel. Is this actually simpler and faster? We’ll find out.
- For the command line users, here’s a modern Unix tool list. Most of the items are rethinkings of familiar tools: bat is cat with syntax highlighting, atuin is shell history backed by a database. It’s definitely worth exploring.
- If you don’t understand why things don’t always add up when you use floating point numbers—and in particular, if you don’t understand why you should never use floating point when working with money—you should read the Floating Point Guide.
- AudioFlux is a library of tools for analyzing music and other kinds of audio.
- Postgres.new gives you the PGlite database running in a browser sandbox with generative AI support. PGlite is an implementation of Postgres in WebAssembly. The developers promise that, soon, it will be possible to deploy to AWS S3.
- A new DARPA program, Translating All C to Rust (TRACTOR), intends to develop AI tools for making legacy C code memory-safe by translating it to Rust. While such translation is already possible, it is error-prone. TRACTOR aims to improve the quality and reliability of translations.
- The WhenFS turns your Google Calendar into a mountable filesystem. Nobody really knows why would you want to do this. File this in the same mental space as esoteric programming languages.
- “age is a simple, modern and secure file encryption tool, format, and Go library.” The GitHub repo says it all. Don’t capitalize the a. age can generate keys for SSH and can interoperate with YubiKey.
- Netflix has open-sourced Maestro, a data services engine. Maestro supports large-scale data workflows, like data pipelines.
- AWS is quietly deprecating services that have insufficient usage, of which CodeCommit is the most notable. The company won’t be adding new users, nor will it be updating the services, except for security patches. This is a significant change in AWS’s policies.
- Score is a new tool for infrastructure-centric development. It allows developers to write software while reducing the need to understand deployment. It ensures that new code is compatible with infrastructure—for example, managing the transition from Docker to Kubernetes.
- Chattr is a Linux command utility that lets you manipulate file permissions in ways that go beyond the standard Unix chmod. Among other things, it can prevent accidentally overwriting files.
- Orca is a runtime environment for WebAssembly that doesn’t require the web. It allows you to write Wasm applications that run on a laptop.
Security
- Another step toward the end of passwords? Docker is deprecating the use of passwords. In September, it will require single sign-on (SSO) and personal access tokens (PATs) for command line access.
- Palo Alto Networks’ “Ransomware Review” for the first half of 2024 shows that the increase in ransomware attacks it reported earlier in 2024 wasn’t a fluke; they have continued.
- A talk at Defcon 2024, Living off Microsoft Copilot, demonstrated a number of exploits against Copilot. They included remote code execution, data exfiltration, and phish generation.
- The AI Risk Repository includes a database of known AI risks, a taxonomy of causes of AI risk, and a taxonomy of the risks themselves. It is free to copy and use.
- A talk at Black Hat has demonstrated that Microsoft Copilot bots are insecure because of inadequate defaults and other problems that leave them open to attack. Many of these issues have been fixed since the talk, but changes only affect new installations.
- Google has announced that it will prompt users who have enabled enhanced safe browsing to give it the passwords needed to decode password-protected files so the files can be examined. Password-protected files are often used to deliver malware.
Web
- Who is watching you? The Surveillance Watch project has published a map of companies engaged in data collection and surveillance, their affiliates, their targets, and more. It’s an impressive map of the surveillance industry.
- Google is rolling out an experimental feature in Chrome for Android that will redact credit card data, passwords, and other sensitive information when a screen is shared.
- “Reckoning,” Alex Russell’s four-part series on JavaScript framework culture and the damage it’s done to the web is a must-read. Russell recommends the GOV.UK Service Manual, particularly the page about progressive enhancement, as a guide to building responsive services.
- Project Wallace looks like an excellent set of analysis tools for web developers working with CSS.
- Google is making it easier for victims to request removal of nonconsensual deep fake images. Their search algorithm is also being modified to downrank deep fake images.
Hardware
- Scientists at Peking University have developed a highly efficient tensor processing unit (TPU) based on carbon nanotubes.
- Looking Glass has a 3D holographic display the size of a cell phone at a reasonable ($299) price, in addition to laptop- and monitor-sized models. It can display 3D images from Apple’s Vision Pro. Can we imagine VR without headsets?
- Wireless bicycle shifters? If you have them (and some high-end bicycles do), they can be hacked. That could become an issue if you’re a racer.
- NVIDIA may have competition. Groq, a startup that makes language processing units (LPUs), has hardware and cloud offerings that look very competitive for doing inference with large language models.
- Privacy Lens is a new privacy-enhancing camera with two lenses: a traditional lens and a thermal lens. It uses the thermal lens to turn human images into stick figures with identifying information (e.g., skin and hair color) removed.
Law
- OpenAI and Anthropic have agreed to give advanced access to their models to the US Artificial Intelligence Safety Institute so they can be tested for safety.
- The California state legislature has passed a bill regulating AI development. The bill is intended to defend against threats to public safety and requires implementation of a “kill switch.”
- Pavel Durov, the founder of the Telegram messaging app, has been arrested and indicted in France. Charges include spreading images of child abuse and drug trafficking using the app.
- A court has dismissed DMCA-related portions of the plaintiff’s complaint in a landmark lawsuit involving the use of open source code by generative AI. Claims about breach of contract by open source license violations are still in play.
Biology
- The GROVER project attempts to treat DNA as a language and uses AI to decode that language.