Rate this book

Information Theory, Inference, and Learning Algorithms

David J.C. MacKay

Rate this book

Information theory and inference, often taught separately, are here united in one entertaining textbook. These topics lie at the heart of many exciting areas of contemporary science and engineering - communication, signal processing, data mining, machine learning, pattern recognition, computational neuroscience, bioinformatics, and cryptography. This textbook introduces theory in tandem with applications. Information theory is taught alongside practical communication systems, such as arithmetic coding for data compression and sparse-graph codes for error-correction. A toolbox of inference techniques, including message-passing algorithms, Monte Carlo methods, and variational approximations, are developed alongside applications of these tools to clustering, convolutional codes, independent component analysis, and neural networks. The final part of the book describes the state of the art in error-correcting codes, including low-density parity-check codes, turbo codes, and digital fountain codes -- the twenty-first century standards for satellite communications, disk drives, and data broadcast. Richly illustrated, filled with worked examples and over 400 exercises, some with detailed solutions, David MacKay's groundbreaking book is ideal for self-learning and for undergraduate or graduate courses. Interludes on crosswords, evolution, and sex provide entertainment along the way. In sum, this is a textbook on information, communication, and coding for a new generation of students, and an unparalleled entry point into these subjects for professionals in areas as diverse as computational biology, financial engineering, and machine learning.

GenresComputer ScienceMathematicsScienceArtificial IntelligenceTextbooksNonfictionTechnical

642 pages, Paperback

First published June 15, 2002

135 people are currently reading

3404 people want to read

About the author

David J.C. MacKay

2 books54 followers

David MacKay was a Professor in the Department of Physics at the University of Cambridge. He studied Natural Sciences at Cambridge and then obtained his PhD in Computation and Neural Systems at the California Institute of Technology. He returned to Cambridge as a Royal Society research fellow at Darwin College. He was internationally known for his research in machine learning, information theory, and communication systems, including the invention of Dasher, a software interface that enables efficient communication in any language with any muscle. He has taught Physics in Cambridge since 1995. Since 2005, he devoted much of his time to public teaching about energy. He was a member of the World Economic Forum Global agenda Council on Climate Change.

What do you think?

Rate this book

Friends & Following

Create a free account to discover what your friends think of this book!

Community Reviews

5 stars

304 (63%)

4 stars

133 (27%)

3 stars

34 (7%)

2 stars

6 (1%)

1 star

3 (<1%)

Displaying 1 - 30 of 31 reviews

Jon Gauthier

129 reviews240 followers

Want to read

March 15, 2015

NB: Both book and lectures are available for free online. (Check YouTube for lectures.)

computer-science machine-learning math

Brian Powell

190 reviews34 followers

December 22, 2018

I've had a long and fruitful relationship with this text. It's been with me through several career shifts and has satisfied various, random fits of curiosity. I was introduced to this book in grad school while trying to use computational methods of Bayesian inference to study the early universe (specifically, MCMC, Bayesian model selection, and other sampling techniques). MacKay's coverage of this material is both conceptually clear and practically-minded, and helped me a great deal. Much of the rest of the book, however -- chapters dealing with information theory, coding theory, and so on -- looked vaguely interesting but seemed quite far afield from my daily cosmological concerns.

Then I developed an interest in cryptology, mostly as a hobbyist but soon found this knowledge useful for a new career doing cybersecurity testing. I poured over the chapters on Shannon's communication theory: rate-distortion theory, source coding theorem, compression, error-correcting codes...A treasure trove of deeply interesting ideas. But the last quarter or so of the book was nary traversed, covering the abstruse subjects of neural networks and learning algorithms. Vaguely interesting, but not relevant to me.

Then I developed an interest in pattern recognition and anomaly detection, and found the final chapters of this book insightful: the pictorial representation of Hopfield network performance in particular, a prototypical example of the many such visual aids throughout the text. Interestingly, each item in the title became separately relevant to me at different times: inference at first, then information theory, and lastly learning algorithms (though actually inference is quite indispensable for machine learning). Though I might have used it this way, MacKay's text is more than a disjointed collection of orthogonal ideas: having covered much of it by now, I can look back and appreciate how each of these subjects are really different facets of the same overarching objective: making sense of data.

MacKay's writing style is engaging, friendly, and precise. It was truly a joy to wander through this text.

computer-science machine-learning statistics

317 reviews287 followers

Read

September 9, 2011

Hokey the Bayesian Bear says: "Only you can prevent the misguided use of p-values."

prob-n-stat

Gavin

Author 1 book539 followers

Want to read

March 10, 2021

Exercise 3.11... Is the lawyer right to imply that the history of wife-beating does not point to Mr S’s being the murderer? Or is the lawyer a slimy trickster? If the latter, what is wrong with his argument?

[Having received an indignant letter from a lawyer about the preceding paragraph, I’d like to add an extra inference exercise at this point: Does my suggestion that Mr. S.’s lawyer may have been a slimy trickster imply that I believe all lawyers are slimy tricksters? [Answer: No]].

Actually readable, actually tractable. We could not afford to lose him.

The exercises are graded 1-5 by difficulty. If it helps, notice that difficulty 3 is the kind of thing which stumped a young David Mackay for "some time" (a week?) (cf. Exercise 3.3).

Free here

maths ml stats

23 reviews4 followers

July 23, 2008

This book is amazing! Its a pretty esoteric approach to teaching machine learning and I don't think its a good introductory book on that subject. But for folks already versed in the topic, this book can shed a lot of new light and does a good job abstracting it with concepts from information theory and stats.

This book was my first in depth exposure to information theory and the proofs, often accompanied by helpful figures, were clear and, hell, even exciting. Its a much easier read than Cover & Thomas.

Deniz Yuret

32 reviews44 followers

November 17, 2012

http://denizyuret.blogspot.com/2006/0...

Nick

8 reviews7 followers

November 22, 2014

One of the best introductions to information theory, coding (lossy and lossless) and Bayesian approaches to decoding and to inference. This firmly grounds machine learning algorithms in a Bayesian paradigm and gives people the intuition for the subject. The problem sections are not just great, they are absolutely worth doing.

owned science-and-philosophy

David Cournapeau

29 reviews3 followers

January 6, 2021

While not directly applicable, this is by far the best general book about ML I have read. Extremely insightful, connecting a lot of separate topics.

It was written before deep learning became popular, but I believe it is still strongly relevant if you want to understand ML at a conceptual level, without necessarily being math heavy.

maths-stats ml

Aaron

16 reviews2 followers

July 27, 2019

Brilliantly exposited. An important read for anyone interested in these topics.

Kurt

5 reviews9 followers

July 2, 2008

I chose this to accompany my reading of Norvig's text on artificial intelligence. I thought the information theoretic concepts deepened my understanding of intelligent agents functioning in an information-deprived environment. The sections on genetic algorithms and neural networks gave a nifty information theoretic perspective on those topics, but I think other texts (such as Koza on genetic algorithms) were better reads.

I shall add this to my "reference" collection, for I find myself returning to it frequently. And as the equations become more familiar, the concepts become clearer, and yet more ideas for cross-disciplinary applications spring into my imagination.

No typographical errors so far. The language was engaging, not dense at all. Notational conventions increased the readability of equations.

Likely any university student taking this course will have sufficient background in probability. I, however, did not. The text provides a crash course on probability, entropy, and inference, as well as more math in the appendices, all of which for me were indispensable.

reference

Michiel

378 reviews89 followers

September 11, 2011

Excellent book about diverse topics in machine learning, statistics, information theory etc. Many exercises and applications.
Free to download on the internet!

phd-starter-kit

Jim Marrone

49 reviews

April 12, 2025

This is a fantastic book, and I would consider it (or excerpts and exercises from it) if I were to teach a course on information theory. I was looking for a book for my own edification, as an academic economist who works with computer scientists. I know plenty of statistics and applied math and machine learning but not formal info theory or computer science. After trying a couple books I read this one cover to cover.

I loved that the book connects formal math to practical coding and even data science, unlike some texts. It is a bit outdated now (neural nets have definitely progressed since it was written) but the foundational theory and intuition hold.

Another highlight was the tangential chapters on fun topics - crossword puzzles! Biology and DNA! There are lots of fun exercises that would be engaging for students. And moreover, there is a lot of philosophical meat to chew on; just thinking about certain topics through and info theory lens opened my mind and brought new insight. I will go back to my underlined pages just to ruminate on those sections.

One downside is that the math and notation can be a bit dense, especially if you don't know physics notation. I skimmed some parts that weren't relevant for my own work. I would have liked to have all chapter be equally clear and intuitive but alas they are not. Still, a dedicated reader can sit with paper and pencil (or computer) and walk themselves through the book.

Mostafa Alkady

61 reviews5 followers

May 9, 2024

This book is for you if:
1- you enjoy banging your head against the wall trying to understand minor details the author didn't provide because it's obvious duh
2- you're already familiar with the material
3- you wanna have a quick idea about a certain topic but you don't care about the inner workings
4- you're a genius

This is not saying the book is particularly hard; but concepts are, a lot of the time, poorly explained. It would have been a great book had the author actually tried simplifying things. It feels more like the author is 'telling' his fellow researchers what he knows rather than 'explaining' concepts to novice readers. I'm done with almost half of it (28 chapters) and I wouldn't recommend it to anyone as a first read at all (or even a second for that matter). I'm sure you'll find better resources somewhere else.

artificial-intelligence

Siddharth

88 reviews4 followers

April 30, 2023

First half of the book is definitely the strongest, has clean proofs of core information theoretic facts. The book tapers our towards the end and becomes a grab bag of facts. I wish they had instead focused on something like PAC learning and weaved it into the narrative.

Soumya Mondal

14 reviews

June 15, 2025

Spent a month of my life grokking this book, only to realise how shallow my knowledge had been so far. Don't fully comprehend everything yet, but will come back to it again sometime soon. Until then onwards and upwards.

Jethro Kuan

20 reviews7 followers

July 5, 2018

Excellently written, would revisit again.

Marek Barak

14 reviews

September 20, 2018

If you are looking for a simple introduction to Bayesian machine learning, this book is a perfect fit.

Kent Sibilev

50 reviews6 followers

December 3, 2019

Amazing treatment of the information theory and the Bayesian inference in general.

Cold

601 reviews13 followers

May 30, 2020

Unbelievably clear thinker. I just wish I had the logical stamina to follow his arguments. Alas the maths undergrad me would be so disappointed.

computer-science non-fiction

John Doe

68 reviews10 followers

Currently reading

May 8, 2021

magnificent unification of information technology, mathematics(probability theory) and AI.

cs mathematics

Satyaki

200 reviews

May 6, 2023

https://www.youtube.com/playlist?list...

https://www.inference.org.uk/is/

ai cs science

Mohammad Asal

12 reviews1 follower

March 14, 2024

Could've used more depth instead of breadth

Proteinbased

7 reviews

September 28, 2016

I really enjoy(ed) working with this book.
The (>400) problems are interesting, the writing clever and motivational.

J C

84 reviews32 followers

Currently reading

November 16, 2015

While deliberating buying the book, I came across many reviews giving the impression that this was an upper-tier book meant only for those already well-versed in bayesian inference, information theory, and machine learning. Fortunately for me (having purchased it for ~50$), I have been gliding along at quite an easy pace. Already I've learnt about hamming codes and the formulas & axioms (interestingly formulated!) of bayesian probability theory. The treatment probably isn't the most sophisticated, I'm sure, but for me at least it's a good enough fit.

I think my apparent ease might have to do with the fact that I've been doing a lot of abstract math recently (axiomatic set theory, differentiable manifolds, lin alg), and to me, this goes to show how powerful math is as a language. (People say it is a universal language, and it is in this sense: once we understand, we rarely have room for mis- or non- understanding). I have to say that it truly pays to be rigorous! (nod to Wittgenstein).

Our brains work by applying well-optimised strategies learnt in the past to new situations, and then custom-fitting this strategy as it accumulates more and more data (although some brains, noticeably adult ones, stop doing this altogether). In fact, this whole process, which can be summed under the term 'learning', is itself a strategy we have to fine-tune constantly. Given these facts, it follows that it will be immensely helpful to identify guaranteed ways of

1. 'picturing' the same thing differently (shortening the time needed to search for alternative strategies, and lending itself to psychological convenience) and of

2. equating very different notions of things by reimagining them under the same picture (allowing the quick identification of applicable strategies without reference to irrelevant details)

And math, which goes above and beyond the precision of natural language, manages this with elegance and grace.

And yet learning math takes a lot out of me, it is physically exhausting, and emotionally-dangerous to get sucked into the non-human, ruthless, high-octane world of abstract math. Yet it is only in these regions of the mind that math is doable... it requires that we rewrite our axioms of thought: there is no space for human guesses or hunches, only previously apprehended notions which must be burnt into memory. But that is its beauty, it is a world perfect in its own right.

applied-math coding-theory machine-learning

Cristián S

16 reviews

October 8, 2023

Great book overall, specially because it focus on the important points, without spending too much time on mathematical details nor boring text. This does not mean the math is not there. It is not a book for someone without a strong mathematical background. What I try to emphasize is that it focuses on the important and intuitive ideas, instead of focusing on the rigorous demonstrations. Indeed, I think it is a dense book in terms on the number of topics covered, but it does not use that many pages to do it.

I liked the way ideas are developed. It is a study book, because it relies in you doing a lot of exercises in order to cover some topics, instead of just giving the answers. Also, it is nice how it shows many different problems that seem unrelated can be seen from a similar perspective. The author really encourages Bayesian thinking throughout the book. I am giving 4 instead of 5 starts because of two things: - I am focused on machine learning and inference in general, so some chapters that go deep on different coding schemes were not that interesting to me, and - I read it on 2021, and some explanations and examples look oldish from today's point of view. I think some examples may be rephrased in order to contact more with today's audience.

Jimmy Longley

75 reviews9 followers

February 19, 2017

Reviewed as part of my 100 books challenge: http://jimmylongley.com/blog/books/

Run-on Sentence Summary

A fresh and entertaining textbook that walks through the fundamentals of information theory and machine learning.

Impressions

Mackay’s prose is fast paced but lucid, and perfect for a self learner. Often when reading CS textbooks, I’ll skim over problems because I can’t be bothered to spin up whatever boilerplate they want me to download off of the website, but this book did a great job of highlighting specific, achievable, and instructive problems and providing detailed solutions.

The book is highly geared towards information theoretic and probability concepts. The section on machine learning, as many others have noted, uses a funky approach and perhaps isn't the best introductory text in retrospect. Still, every chapter builds organically on those coming before and it is better than the sum of its parts.

Final Thoughts

This is one of the most challenging, rewarding and entertaining textbooks I’ve read.

textbooks

Ushan

801 reviews77 followers

December 24, 2010

A review of information theory, coding theory, and several machine learning and statistics topics, all from a Bayesian perspective. Low-density parity-check codes (which are used in HDTV) are very cool!

computer-science

Tarun Thammisetty

2 reviews

August 25, 2016

One of the very rare academic texts which balances intuition and mathematical rigour. The way the author establishes the relationship between Information theory, Inference and Learning is exceptional. An absolute joy to read.

Jon

1 review

February 14, 2020

An exceptional read which gave me so much more confidence in statistics for data science. Fantastic relatable real world questions make this book an absolute classic. Have also read pattern recognition and machine learning, which is also recommend and foundations of data science which isn't as good