Sitemap - 2024 - Learn and Burn

Measuring a model's understanding — starting with path-finding

Making LLMs scalable by replacing weights with learnable tokens

Image generation for infinite games

Do LLMs rely on data contamination to solve math problems?

Running an LLM on a small customizable chip

Better language models with negative attention

A serious look at the future of AI medical advice

LLMs have original, research-worthy ideas

OpenAI's o1 model

The subgoals of attention units in LLMs

A model to parse body language

Stable diffusion can simulate video games

How to build a lensless camera

LLMs are great geo-aware predictors

Spherical-graph based weather prediction

AI is solving international-level math competition problems

How to use LLMs with spreadsheets

Physical neural networks

Predicting depth data from a single image

Vision transformers can see in stereo

How to tell if an LLM is just guessing

A new weakness in LLMs: early vs late token importance

Better LLM problem-solving without repeated prompting

AI can play chess well without using a search tree for future moves

Do AI models exhibit parallel evolution?

Why don't large models overfit?

AlphaFold 3: The Turns of the Amino

Can we learn functions, not weights? Yes we KAN!

LLMs that never forget

You can prune entire layers of LLMs (and they still work)

The path to AI as web developers

A neural network that makes neural networks

How to build a 10M-token context window

Can a fine-tuned LLM create good music?

LLMs only need three weight values (-1, 0, and 1)

Make different images of the same subject without a custom training step

A new standard in openness for LLMs

See yourself in new outfits, no changing needed

Computer, enhance image! 🖼 💫

A breakthrough in detecting LLM-made text

Text-to-video with realistic motion

A 47B LLM needing only 13B weights in memory

Generating optical illusions

Fast LLMs, even when they don't fit in RAM

Modified Gaussian splats for realistic face rendering

Dramatically more efficient LLMs via fast feedforward networks

Realistic, real-time avatars with Gaussian splats