DeepMind Paper: AlphaDev discovers faster sorting algorithms

Jul 28, 2023

Summary by Adrian Wilkins-Caruana

One of my favorite things about science is that two seemingly unrelated concepts sometimes turn out to be quite similar on a fundamental level. When this happens, it can often lead to new discoveries in one or both of these fields, like with quantum mechanics and information theory, which share similar mathematical structures. In this week’s summary, a game-playing AI is the common thread that links breakthrough performance in games such as chess and Go with the seemingly unrelated area of superoptimizing computer code for basic computing algorithms, like ones used for sorting numbers.

The folks at Google DeepMind – who were the creators of the famous game-playing AI programs AlphaZero (chess) and AlphaGo – have now developed AlphaDev, an AI program that optimizes computer algorithms. Much like how AlphaZero and AlphaGo learn to play their respective games by figuring out combinations of moves that lead to victory, AlphaDev “plays” the algorithm-optimization “game” by learning how to efficiently implement foundational computer algorithms, like sorting an array of numbers.

Where AlphaZero plays chess by analyzing pieces and positions, AlphaDev optimizes algorithms by manipulating machine code. Machine code is the fundamental set of instructions that CPUs know how to execute. When people write code, we use high-level, human-readable languages like C++ that get compiled into “assembly,” like in the figure below. Assembly and machine-code instructions are essentially the same thing, except assembly is more human readable than the binary ones and zeros of machine code.

Before AlphaDev, people progressively figured out more and more efficient ways of sorting numbers in assembly, just like how people figured out better strategies for playing chess. AlphaDev learns to write machine code in a similar way, trying different things to make more efficient algorithms. In each “game,” AlphaDev analyzes the “board” (the algorithm), and its “moves” (adding instructions to the algorithm) get scored based on algorithm-correctness and efficiency. AlphaDev learns which moves to make using reinforcement learning and plans its moves using a Monte Carlo tree search.

But sorting numbers is such a fundamental computer task – how exactly did AlphaDev discover new machine code that could outperform code that exceptionally skilled programmers had painstakingly optimized over the course of several decades?

As Justine Tunney eloquently explained, AlphaDev didn’t discover a new algorithm (like bubble sort or quick sort). Instead, it optimized a sorting kernel, which is a sorting algorithm for a small, fixed amount of numbers. Sorting kernels can be used as building blocks for bigger, more complicated sorting algorithms. For example, the machine code for sorting three numbers might include some moving, comparison, and conditional moving instructions, like this (note, code that follows a “//” indicates a comment, and is just there to help explain what the preceding code does):

The simple code above was the state-of-the-art way to sort three numbers, and is essentially how many of the indispensable and ubiquitous programming language libraries implement the sort-three-numbers kernel. The way AlphaDev improved upon this already near-optimal machine code was by figuring out that the last “mov” instruction is actually redundant and can be removed, as you can see in the figure below.

While removing a single instruction may seem inconsequential, the efficiency gains can stack up quickly, especially for kernels that are executed trillions of times a day. When sorting algorithms used AlphaDev’s more efficient sorting kernels, they resulted in 70% faster sorting when sorting a small amount of numbers, and were about 1.7% faster when sorting >250K numbers. Also, AlphaDev has found optimizations for other fundamental computer functions too, like hashing. The Google DeepMind team hopes that AlphaDev can help optimize all kinds of algorithms, and that it will inspire programmers to think of innovative ways to write more efficient code to create a more powerful and sustainable computing ecosystem.

Learn and Burn

Discussion about this post