Radar Trends to Watch: July 2024 – O’Reilly


Can anything that is programmable have an embedded LLM? Apparently so. llama.ttf is a plain old font that looks like OpenSans—you can download it and install it. We don’t know whether this is a massive vulnerability or a way to embed AI in almost any text-based application. Probably both.

It may just be a coincidence, but several new programming languages have come on the scene in the past month or two. Why? This probably isn’t backlash against automated programming (an LLM obviously can’t be trained for a language without much public source code).


AI

  • This is crazy. llama.ttf is a font that embeds a (small) large language model. The font itself can do automatic text generation. It relies on the HarfBuzz font shaping engine, which can run Wasm. It works in GIMP; it may also work in Chrome and Firefox. Everything runs locally.
  • Researchers are finding that conversations with an AI can help to dispel beliefs in conspiracy theories. (This paper notes that it is preliminary and not yet peer-reviewed. All of the data for their research will be posted when it is complete.)
  • The Podman AI Lab is a good way for Linux users to experiment with running AI locally. Unsurprisingly, Podman is container-based, which simplifies moving models from a desktop environment into a production environment.
  • Researchers have developed a BERT-based model to detect malicious LLM prompts that instruct the LLM to generate phishing websites or emails.
  • Anthropic has released a new version of their mid-sized language model, Claude 3.5 Sonnet. It claims performance similar to GPT-4o and Gemini 1.5. A new Artifacts window allows interaction with Claude output.
  • What if non-player characters in games were able to converse with you and with each other? What if the structure of a game unfolded dynamically in response to the play, rather than being scripted in advance? That’s something generative AI could bring to games.
  • An AI system has been trained to count flowers. This ability has important applications in farming: if you can count the flowers, you can predict crop yields.
  • Anthropic has provided documentation about building evaluations and test cases for Claude.
  • Facebook/Meta has developed AudioSeal, a system for watermarking AI-generated audio content. It is 90% to 100% reliable and can identify small clips inserted into a larger file. AudioSeal is free and available on GitHub, though Meta itself isn’t using it yet.
  • Cohere’s latest model, Command R, can use multiple external tools to respond to prompts. It is capable of multistep tool use, in which it creates a plan to solve the problem, uses APIs (described in a JSON document) to access external tools, and generates a result.
  • Can a large language model write comedy? Apparently not. Ideas aren’t original, models have trouble with incongruity and surprise, and they don’t have a sense for the longer arc of a dialogue.
  • LumaLabs DreamMachine is an impressive generative AI tool for creating short videos from a text prompt.
  • NVIDIA has announced a set of models for generating synthetic training data. Usage of the models is governed by a new Open Model License Agreement which specifies that models can be used commercially, derivative models can be distributed freely, and NVIDIA claims no ownership to the models’ outputs.
  • AI detectors, most of which have very high error rates despite vendors’ claims, are being used to incorrectly blackball professional writers.
  • A new strategy for chunking content for RAG is using sentence-BERT (SBERT) to compute embeddings for each sentence, then using the embedding to detect shifts in topic.
  • If you’re building an AI application with RAG, consider the use of graph databases (GraphRAG) rather than vector databases. Graphs are harder to work with, but are better at providing appropriate context for answering complex queries.
  • Google’s experimental Illuminate turns academic papers into interview-like discussions that summarize the important points. It’s currently in private beta.
  • Anthropic’s essay about Claude’s Character is a thoughtful piece about how they are giving Claude 3 character traits and how character traits affect the way Claude interacts with the world.
  • Honeycomb’s VP of Engineering suggests that AI will be more helpful in developing frontend code than backend and that chats are often an inefficient user interface.
  • OpenAI and Anthropic have posted research on extracting features from their GPT and Claude models. This represents a significant step towards interpretability or understanding why a model gives a specific response to a prompt.
  • What is your dog trying to say? A research project is using AI to decode dog vocalizations.
  • Here’s a look at the transformer-based language model that Apple is incorporating into macOS. It’s very small (apparently 34M parameters) and appears to be based on GPT-2, so it is clearly designed to run on devices like phones without draining the battery.
  • Codestral is a new language model from Mistral that specializes in code generation. It has been trained on over 80 languages and claims to produce better results than other language models.
  • Language models can create database schemas—but schemas rarely remain unchanged forever. How well can language models modify a schema?
  • Can AI help researchers digitize smells? Digitization has worked well for sight and sound. Smell lags far behind.
  • Could AI be used to improve privacy? Researchers have developed applications for removing sensitive parts of images and replacing them with something similar but different—sort of an automated Photoshop.
  • Llama3-V is a multimodal model built on top of Llama3. It claims performance similar to GPT-4o, at 1/100th the size. What’s more impressive is that it was trained for under $500.
  • Using knowledge graphs in the context of RAG makes intuitive sense. In a comparison between “GraphRAG” and “traditional RAG,” GraphRAG comes out on top because the graph encodes relationships between the data.
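
The sentence-level chunking strategy mentioned above (embed each sentence, then start a new chunk where the embedding shifts) can be sketched roughly as follows. This is a minimal illustration, not the published method: the `embed` function is a toy bag-of-words stand-in for an SBERT embedding, and the names `chunk_by_topic` and `threshold` (and its value of 0.2) are assumptions made for this example.

```python
import math
import re
from collections import Counter


def embed(sentence):
    """Toy bag-of-words vector; a stand-in (assumption) for an SBERT embedding."""
    return Counter(re.findall(r"[a-z']+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def chunk_by_topic(sentences, threshold=0.2):
    """Group consecutive sentences; start a new chunk when the similarity
    between neighboring sentences drops below the (hypothetical) threshold."""
    chunks = []
    current = []
    prev_vec = None
    for sentence in sentences:
        vec = embed(sentence)
        if prev_vec is not None and cosine(prev_vec, vec) < threshold:
            # Similarity dropped: treat this as a topic shift.
            chunks.append(current)
            current = []
        current.append(sentence)
        prev_vec = vec
    if current:
        chunks.append(current)
    return chunks


sentences = [
    "The cat sat on the mat.",
    "The cat chased the mouse.",
    "Stock prices fell sharply today.",
    "Stock prices rallied later.",
]
chunks = chunk_by_topic(sentences)
```

With a real sentence-embedding model in place of `embed`, the comparison step stays the same; the threshold would need tuning against the content being indexed.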

Programming

  • Gleam is a new type-safe functional language. It compiles to Erlang and JavaScript. Its goal is to enable developers to build large concurrent systems.
  • OpenTofu, the open source fork of Terraform, now has State Encryption, a highly requested feature that has been on Terraform’s queue for nearly a decade.
  • Starlark is a small Python-like language for describing how to build code. Starlark’s design principles include determinism and safety; executing a program multiple times gives identical results and it cannot access the network, filesystem, or system clock.
  • One of WebAssembly’s co-creators has developed a new programming language called Virgil. It’s designed as a systems programming language, though it is intended not to compete with Zig (whatever that means).
  • The James Webb Space Telescope is a step forward for digital twin technology. A digital model of the telescope was used to choreograph its self-assembly in outer space. The twin continues to monitor the telescope and predict the effects of software updates.
  • Mesop is a very lightweight Python framework for rapid web application development. It’s good for demos; probably not good for production. Built by Google but not supported.
  • AI Gateways are like API gateways, but specialized for accessing the APIs of language models. They facilitate monitoring, choosing which model to serve a request, caching, security, and many other features.
  • Julia Evans’s (@b0rk) latest zine, How Git Works, is a must-read for anyone who doesn’t understand Git. Which is just about everybody.
  • Language developers have been busy. The Grain language was designed for compilation to WebAssembly. It was inspired by OCaml; pattern matching is a key feature.
  • GoFr is a framework for building microservices in Go. It incorporates tools for observability and supports the use of many different data sources.
  • Infrastructure from Code claims that a system’s infrastructure requirements can be inferred from the code itself, and that a specification for the necessary infrastructure can be generated automatically.
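
To make the AI gateway idea above more concrete, here is a minimal sketch of three of the features mentioned: choosing which model serves a request, caching, and basic monitoring. Everything in it is hypothetical: the `AIGateway` class, the length-based routing rule, and the backends (plain Python callables standing in for real model APIs) are assumptions for illustration, not any product's interface.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple


@dataclass
class AIGateway:
    """Hypothetical gateway in front of several language model backends."""
    backends: Dict[str, Callable[[str], str]]
    default_model: str
    cache: Dict[Tuple[str, str], str] = field(default_factory=dict)
    stats: Dict[str, int] = field(
        default_factory=lambda: {"requests": 0, "cache_hits": 0}
    )

    def choose_model(self, prompt: str) -> str:
        # Assumed routing rule: long prompts go to a "large" backend if one exists.
        if len(prompt) > 200 and "large" in self.backends:
            return "large"
        return self.default_model

    def complete(self, prompt: str) -> str:
        self.stats["requests"] += 1  # monitoring: count every request
        model = self.choose_model(prompt)
        key = (model, prompt)
        if key in self.cache:
            # Caching: an identical prompt to the same model is served locally.
            self.stats["cache_hits"] += 1
            return self.cache[key]
        result = self.backends[model](prompt)
        self.cache[key] = result
        return result


# Stand-in backends; a real gateway would call model APIs here.
gateway = AIGateway(
    backends={
        "small": lambda p: "small:" + p[:5],
        "large": lambda p: "large:" + p[:5],
    },
    default_model="small",
)
first = gateway.complete("hello")
second = gateway.complete("hello")  # served from the cache
```

A production gateway would add authentication, rate limiting, and cache expiry; the point here is only that these concerns sit in one layer between the application and the models.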

Security

  • While it’s not clear what’s going on, there appears to have been a supply chain attack against the CDN service Polyfill.io. Polyfill claims that they have been defamed and has moved to a new domain name. Security experts advise caution.
  • A new critical vulnerability in Progress Software’s MOVEit product is being actively exploited. Attackers can impersonate any legitimate SFTP user, giving them the ability to copy, delete, or create files. Attackers can also obtain cryptographic hashes for user passwords.
  • The US is banning sales of Kaspersky’s anti-virus software starting in July.
  • Linux malware controlled by emojis sent from Discord? That’s creative. The malware, which spreads through phishes, uses emojis to send commands. Emojis apparently evade security software that expects commands to be text strings.
  • A new kind of phish uses the Windows Search protocol to download malware. Another new phish uses progressive web applications (PWAs) to steal credentials.
  • After widespread criticism of its Recall feature for AI-enabled Windows 11, Microsoft has announced that it will be disabled by default. Release of Recall has been delayed; it will be delivered first to the Windows Insider Program, essentially a private beta.
  • As the importance of AI has grown, Hugging Face has become a target for threat actors. Most recently, attackers have stolen authorization tokens from Spaces.

Web

  • Sim City in the browser with WebAssembly? Impressive.
  • UIX is a new full-stack web development framework. It is part of a larger Unyt project, which has the ambitious goal of developing the next-generation decentralized Internet.
  • BBC Ventures is investing in a 3D video streaming platform.
  • Hotwire is an approach to building web applications (specifically, single page applications or SPAs) that sends HTML to encapsulate complex data rather than JSON. Its goal is to minimize the need for JavaScript.
  • Cobalt is a very lightweight platform for running apps built from HTML5, CSS, and JavaScript. Google, its maker, mentions YouTube as a typical application, but I suspect it can be used for more interesting work.
  • Adrian Holovaty’s Soundslice has incorporated music scanning: You can upload a PDF or a photo of a musical score and Soundslice will analyze it. You can then edit, rearrange, and play back the score.

Hardware

  • An open standard for connectors? That’s important if you want very high performance without having to license proprietary designs.
  • Energy Meter is a tool for measuring the power consumed by Intel CPUs. It could be useful in these days of power-hungry AI applications, though it can’t account for GPUs. It runs only on Linux, but that’s most of the world’s servers.
  • Mitsubishi has built a robot that solves Rubik’s Cube in 0.305 seconds. Their best time is 0.204 seconds, but according to Guinness, that wasn’t under proper conditions.
  • Lifesaving robots are in use on Lake Michigan to help drowning people remain afloat until a lifeguard can arrive.

Biology

  • Precision Neuroscience has successfully implanted an array of 4,096 miniaturized electrodes in a human brain. In doing so, they’ve achieved higher resolution for neuron activity than previous implants, including Neuralink’s.


