Decoding Language Models

Picks for a reader looking to dive deep into the tech behind LLMs, including "Natural Language Processing with Transformers" and "Build a Large Language Model (From Scratch)". See the full list to explore the architecture!

🎯

Safe Bets

— Right up your alley
1
Natural Language Processing with Transformers

by Lewis Tunstall, Leandro von Werra, and Thomas Wolf

This is the definitive hands-on guide. Since you have technical knowledge, you'll appreciate that this book, written by engineers at Hugging Face, goes straight to the code and concepts behind the Transformer architecture that powers nearly all modern LLMs. It's less a theoretical textbook and more a practical manual for building and understanding these models from the inside out.

Technology, Computer Science
2
Speech and Language Processing

by Dan Jurafsky and James H. Martin

Often called the 'bible' of the field, this is the foundational textbook you need for a deep, academic understanding of the underlying technology. It provides the crucial context for *why* transformers were such a breakthrough by thoroughly explaining the decades of research on n-grams, RNNs, and sequence-to-sequence models that came before. It’s the best way to understand the architecture in its historical and technical context.

Textbook, Computer Science
3
Build a Large Language Model (From Scratch)

by Sebastian Raschka

You asked for the underlying architecture, and there's no better way to learn it than by building it yourself. This book does exactly that, walking you through the pretraining and finetuning of a GPT-like model from scratch in Python. It demystifies the entire stack, from the tokenizer to the attention mechanism, ensuring you understand every line of code and every architectural decision.

Technology, Programming

Curve Balls

— Pleasant surprises, we promise
1
The Alignment Problem

by Brian Christian

You're focused on the architecture's 'how', but this book explores the fascinating 'what now?' It's a technically grounded but accessible investigation into the unintended consequences and emergent behaviors of machine learning systems. You'll learn about inverse reinforcement learning and reward hacking, problems that arise directly from the kind of technology you want to understand.

Science, Technology
2
The Master Algorithm

by Pedro Domingos

This book is a curveball because it zooms out. Published in 2015, before the Transformer architecture existed, it does not cover Transformers themselves; instead it describes the larger quest for a single, universal learning algorithm. Domingos explains the five 'tribes' of machine learning (symbolists, connectionists, evolutionaries, Bayesians, and analogizers) and how their ideas compete and combine. It will give you a framework for seeing where LLMs fit in the grand scheme of AI, not just how they work.

Science, Technology
3
Information Theory, Inference and Learning Algorithms

by David J.C. MacKay

You wanted the 'underlying technology,' and this book goes all the way to the mathematical bedrock. It's not about LLMs directly, but it covers the fundamental principles from information theory and Bayesian inference that make them possible. For a technical reader, this is the ultimate 'first principles' read, explaining the 'why' behind the algorithms and architectures everyone else just implements.

Textbook, Mathematics

The Conversation

Books about LLM
LLMs are rapidly changing, so let's narrow it down a bit. What aspects of LLMs are you most curious about?
The underlying technology and architecture
Underlying technology and architecture – fascinating. What level of technical depth are you looking for?
Some technical knowledge expected
