Rate this book

The Hundred-Page Machine Learning Book

Andriy Burkov

Rate this book

Concise and to the point — the book can be read during a week. During that week, you will learn almost everything modern machine learning has to offer. The author and other practitioners have spent years learning these concepts.

Companion wiki — the book has a continuously updated wiki that extends some book chapters with additional information: Q&A, code snippets, further reading, tools, and other relevant resources.

Flexible price and formats — choose from a variety of formats and price options: Kindle, hardcover, paperback, EPUB, PDF. If you buy an EPUB or a PDF, you decide the price you pay!

Read first, buy later — download book chapters for free, read them and share with your friends and colleagues. Only if you liked the book or found it useful in your work, study or business, then buy it.

GenresArtificial IntelligenceComputer ScienceProgrammingTechnologyNonfictionScienceTechnical

159 pages, Paperback

Published January 13, 2019

416 people are currently reading

5049 people want to read

About the author

Andriy Burkov

9 books96 followers

What do you think?

Rate this book

Friends & Following

Create a free account to discover what your friends think of this book!

Community Reviews

5 stars

637 (46%)

4 stars

502 (36%)

3 stars

187 (13%)

2 stars

29 (2%)

1 star

15 (1%)

Displaying 1 - 30 of 155 reviews

Manny

Author 42 books15.8k followers

November 23, 2021

If you already have some familiarity with machine learning and are not afraid of equations, this is an excellent brief overview of the field, written by someone who comes across as having huge experience of using ML in practice. You would have to know a great deal not to pick up something new.

One of the bits I liked best was the extremely concise, clear explanation of why Deep Learning has been such a success. As the author says, a deep neural network is really a very simple thing: it's just a function that's been built recursively out of a bunch of smaller functions, so something like

f = f₁(f₂(f₃( ... ), f₄( ... ), ..., f₂₅( ... ), ... ), ... )

The pictures you see,

are an easy-to-read graphical notation that says the same thing. Each of the nodes is a component function which will contain some adjustable parameters, and the lines show how the functions are composed together. We're going to tune the parameters by taking a bunch of training examples and doing some kind of gradient descent: we'll look at the answer we get with the current version of the function, do some calculus to find the direction in which we need to change the parameters to improve our guess, move the parameters a little step in that direction, and continue until things work well enough. This is the famous backprop algorithm.

Now what are these component functions going to be? The simplest choice would be to have linear functions, which are easy to work with. But a composition of linear functions is just another linear function, so that won't help us. The next simplest thing is for each function to be a linear function with some non-linear function applied to its output. Inspired by the way neurons work in the brain, it's natural to think that the non-linear function will be some kind of threshold function. If the output of the linear function is large enough, it'll produce an output, and otherwise it'll produce nothing.

Well, what is the threshold function going to be? Since we're going to use calculus to train our net, our first thought is that we want a smooth function, i.e. one that we can differentiate. The sigmoid function on the left seems like the kind of thing we need:

But in fact it turns out that the sigmoid function, clever as it looks, isn't best. If we have many layers in our neural net, the sigmoid functions will multiply together. Then changing a parameter in a deep layer will go through all those multiplied functions, and it will have almost no effect, so you won't be able to train the deep layers effectively. The ReLU function on the right is much better. It looks stupid and ugly, because it's not smooth at the crossover point, but it turns out that what's really important is that it's linear everywhere else. Using ReLU instead of sigmoid opens the door to deep neural nets that work. It took ages for people to realise this.

I have seen versions of this explanation before, but not so compact and straightforward, and I didn't fully appreciate the force of the argument. He does many things in a similar style: other sections I really liked described one-shot learning, unsupervised learning, learning to rank, and learning to recommend. I feel I now understand all of these much better.

What an excellent book! As many of the other reviewers say, this is a model you'd like to see more people emulating.

science

Lensvol

101 reviews6 followers

May 22, 2019

If I have to describe that book, that would be with the following scene:

Imagine you are in a bar with a good friend, who earns his living doing something that you are interested in. At some point during the evening, after discussing this and that, you ask him to give you a general idea about things in his area of expertise. You know, nothing really specific, just some pointers. And, because that person is a really great friend, they sigh and sketch a comprehensive introduction to the subject _on napkin_. All while patiently explaining thorny subjects like 'derivation' and 'normal distribution', not once acting judgemental or 'smarter-than-thou'.

This is a good book. It will not make you an expert but will act as a handy guide to the turbulent world of ML, giving you pointers and rough outlines of what awaits you in the dark.

Quinn Jackson

Author 20 books4 followers

January 26, 2019

Burkov's book deserves to be on every machine learning professional and enthusiast's desk. His economy of exposition, his inclusion of the necessary mathematical underpinnings, and his succinct and pragmatic tone overall are much appreciated.

I have already had the opportunity since receiving and reading the book to have it in hand in my own professional work, both as a refresher and as a quick sanity check. My own work in ML is at the level of the algorithms and the underlying maths themselves, and not just as a user of someone else's black-box implementation thereof, and in this respect, Burkov's treatise and approach to the field have been extremely valuable to me.

One might say of such a short work that it should be longer: I am definitely not in this camp. I already have books with thousands of pages in highly specialized subsections of the ML milieu; when I need an in depth dive into probabilistic graphical models, I can certainly find plenty of literature for such an exploration. What I have not had until Burkov's book, however, has been a single resource to turn to before I even consider such hair-splitting.

Were it that more specialist books were so well executed. Burkov has done the field a great service.

Antonis Maronikolakis

119 reviews5 followers

July 21, 2019

The idea behind the book is great, and execution is admirable. Unfortunately, I have to admit that in all honesty, I believe this book is over-hyped. It serves as a great introduction to most Machine Learning concepts, but it reads more than a listing page than a proper introduction. The book, even though it covers a lot of concepts, doesn't do enough to entice/lead you to research the topics more. In a lot of places it feels rushed and condensed to a point where no real information is given to the reader.

Trying to create a short guide on ML is admirable and I am glad that Mr. Burkov took the initiative to attempt this task, but I am afraid his attempt falls just short of my expectations. Nevertheless, I believe it will pave the way for greater things. For now, Mr. Burkov remains a pioneer in this niche area.

Brendan

128 reviews22 followers

April 29, 2021

I applaud the gumption, but I don't think the concept is possible. Burkov starts this book with promise, but quickly falls into simply summarizing the concepts. He vacillates between giving too much of an introduction and not enough. The most glaring fault is the succinct introduction of concepts filled with jargon like the following.

"An autoencoder is a feed-forward neural network with an encoder-decoder architecture."

Sure, this is fine if you understand the encoder-decoder architecture, but when you wash over the concepts so quickly, I doubt the true newcomers fully understand these concepts when phrased this way. Because of this, I'm not sure who this book is written for. If you have an understanding of ML and want a reference, this book is far too light on details. If you have no understanding of ML, this book is far too heavy on concepts and implicitly supposes too much knowledge. I certainly couldn't give this to my boss to understand, he wouldn't get past the first 10 pages. For me? I'm not sure I got anything out of this. The concepts I knew were not bolstered conceptually here. There were no great metaphors or mathematical insights. The concepts I was unfamiliar with still seem gray and foggy in my mind with this book offering little to hold on to for support.

Unfortunately, Burkov gave himself an impossible task and proved the task to be, for the moment, still impossible.

Mohammed

3 reviews1 follower

August 11, 2019

I wish I had this book when I started learning Machine Learning. I knew few things but learned many more. It takes two days to complete and becomes a reference book to me now.

I really like those practical tips the author Andriy has provided along with the high quality demonstration of almost all popular Machine Learning algorithm.

Ibrahim Sharaf

7 reviews3 followers

September 19, 2023

The simplest and most concise ML book that I've ever read (so far, haha)! I really love the tech-blogposts writing and explanation style that Burkov has adopted for this book, while the usual ML book is very academic and math-focused by nature, on the other side of the spectrum, this book focuses more on intuition, provides details but not too deep to overwhelm the reader, math is simple and used whenever it's necessary only.

However, I'm not saying this should be your textbook for teaching yourself ML, and I believe this isn't the segment that the author was targeting when he wrote the book, in my opinion, the most suitable audience for this book are:
1. Newbies who are looking for an ML introductory book, think of it as a bunch of carefully curated Medium posts, but it's way better and more streamlined.
2. Experienced ML professionals who are looking for a quick brush up on some concepts, maybe for an interview or a new project.

Also kudos to the author for distributing this book as a "Read first, buy later".

read-tech

Ashok Krishna

411 reviews57 followers

September 27, 2020

If you are an absolute beginner or someone just interested in knowing what all the fuss about data science is, then this book isn't for you. That is, unless you have a good grasp of statistical concepts and have the penchant for looking at all those scientific/statistical notations. Good in parts for a beginner to understand the core concepts. Too complex and dry at times. 3 stars!

Michael Lumpkins

1 review

January 8, 2019

Absolutely great book to dabble in everything machine learning related. An excellent resource for exposure to the field for those with technical backgrounds in math, science, computer science, or engineering, but not much experience with machine learning algorithms while also a solid reference book for guidance for those who aren't the most up to date in the field.

Andriy was also kind enough to include a free edition of his book on his site, but I highly suggest purchasing it if you believe you will make frequent use of it. A solid, hard copy is helpful when you already have your monitors full as almost any data scientist or analyst likely already does. I cannot recommend this enough, especially when you can try the book for free.

Data

3 reviews2 followers

November 5, 2022

This book swallows up the heavyweight mathematics textbooks and spits out a slim product no thicker than the width of my smartphone. From page one all the way to page 136, Andriy Burkov, the author, does not waste a single word in distilling the most practical concepts in machine learning. You read that right. It is MORE than 100 pages! Sounds like the book has some bias. Get it?

At a minimum, prospective readers should be comfortable with Calculus, Statistics, Probability, vectors and matrices, as well as, familiar with data science concepts. As for data scientists, the majority of your work is likely focusing on a handful of ML models. While you might skim parts of the book, I think it is still a useful reference because of the breadth of material. I provide a more in depth summary and commentary on the book at the following webpage:

https://thedatageneralist.com/book-re...

Linh Lan

1 review

May 19, 2019

This book became my single resource to turn to. With the QR codes, it allowed me to investigate the topics further. Very easy to read and understand, The Hundred-Page ML Book is concise and covered most of the topic in machine learning that we need to know about.

Yosif

80 reviews4 followers

April 11, 2020

Very good and comprehensive book. The first 70% were real fun, then it got slightly harder (at least for me). As a whole - definitely looks like a useful and practical reference for ML.

must-read-completed

Till Chen

68 reviews12 followers

June 8, 2020

Fantastic ML book! I don't think there's any other ML book that's as lucid and comprehensive as this one. It's an amazing overview without being shallow. Highly recommended!

tech

John Mantios

19 reviews

January 4, 2021

concise and well-explained

Cindy

178 reviews67 followers

March 18, 2021

If you want to condense machine learning into 100 pages (or more precisely, 136) without losing rigor, you're going to end up with a bunch of math equations because machine learning is essentially math applied to data.
This book is so mathematically dense. It's symbolically complex and verbally concise. I really wouldn't recommend this to someone just starting out with machine learning. It could be a good refresher for someone who has studied machine learning at a graduate level. Maybe if you've been working with machine learning for a while and you've forgotten what exactly certain models do mathematically, this could help. I agree with the author : "Practitioners with experience can use this book as a collection of directions for further self-improvement".
The information within seems pretty up-to-date in terms of mentioning validation sets, and even genetic algorithms (mentioned briefly) and LSTMs. Some of the explanations are not sufficient in themselves, and you're going to have to look the topic up online if you're not familiar with it (especially starting on page 65 with deep learning and backpropagation, although the book does include QR codes that you can scan for further elaboration).
What I really enjoyed was that the author talked about certain statistical assumptions that are made about data in machine learning. They seem obvious when stated, but they're not necessarily common sense. For instance, it's usually assumed that your training data is randomly and independently selected from a particular distribution, and a machine learning model trained on the data is supposed to work on unseen future data because we assume that the future data is coming from the same distribution.
I also liked that the author provided multiple ways to solve particular problems (like an imbalanced dataset).
Solid book. I recommend it as a reference, but not necessarily as a one-stop shop for teaching yourself machine learning.

narrow next

Alireza Aghamohammadi

51 reviews48 followers

July 28, 2021

کتاب برای افرادی که وقت و زمان خواندن مراجع اصلی یادگیری ماشین را که معمولا بیش از هزار صفحه است را ندارند بسیار مناسب است و اکیدا توصیه می‌شود.

الگوریتم‌های یادگیری ماشین (مانند ماشین‌های بردار پشتیبان، جنگل تصادفی، رگرسیون و ...) به همراه شبکه‌های عصبی (به صورت مقدماتی) و مبانی ریاضی پشت هر کدام توضیح داده می‌شود. هر مبحث در سه تا چهار صفحه آموزش داده می‌شود.

در واقع کتاب سعی می‌کند از تمامی حواشی دوری کند و چکیده و ضروریات هر الگوریتم را توضیح دهد و به نظرم کاملا موفق بوده است.

computer-science

Jeevan

197 reviews6 followers

January 1, 2025

I bought this book a while back and contrary to the author's "read first, buy later" philosophy - I wanted to buy it first and then read it and it took a while to get here. However, it was a refreshing quick read to brush up on all concepts I have been learning over the years and some intuitions are described in a sentence or two which is very powerful. I highly recommend anyone starting their journey to use this as a guide to learn more -- though my journey was the opposite.

Vishal Singh

2 reviews

June 25, 2020

Great book, it's like a Machine Learning capsule. I will make sure to go through this book from time to time.

want-to-read-next

Jacob Williams

598 reviews17 followers

September 20, 2022

This is a great concise refresher. It covers many of the same concepts as the machine learning course I took earlier this year - decision trees, linear regression, SVM, neural nets, kNN, expectation maximization, ... - and it's sort of like having a set of high-quality, focused notes.

It also explains some details I've ignored in the past and covers many topics I wasn't familiar with:

- One reason for using logarithms when they're not mathematically necessary is to avoid overflow
- Negative cosine similarity as a distance function for e.g. k-nearest-neighbors
- Why to normalize features: so the derivatives with respect to each feature will be in similar ranges
- Rules of thumb on normalization vs standardization of features: standardize when there are outliers, or the data is close to normally distributed, or when doing unsupervised learning
- Multiple strategies for dealing with missing data
- Regularization
- Convolutional neural nets
- Recurrent neural nets, GRUs in particular
- Kernel regression, a non-parametric way to model nonlinear data
- Bagging; in particular random forest, which tries to produce a bunch of uncorrelated decision trees
- Comparison of boosting and bagging: "boosting reduces the bias ... instead of the variance"
- Sequence-to-Sequence learning: encoder, embedding, decoder
- Active learning: "Once we know the importance score of each unlabeled example, we pick the one with the highest importance score and ask the expert to annotate it."
- Denoising autoencoders: try to produce the same output as your input, after passing through an embedding layer, and despite corruption of the input
- Semi-supervised learning, such as the ladder network (a type of denoising autoencoder) that can perform excellently on MNIST with only a tiny fraction of the examples labeled
- One-shot learning with siamese neural networks and triplet loss
- Zero-shot learning, where "we want the model to be able to predict labels that we didn’t have in the training data"
- Stacking: "building a meta-model that takes the output of base models as input"
- Data augmentation: making additional training data by e.g. (for images) "zooming it slightly, rotating, flipping, darkening"
- Transfer learning
- Density estimation
- HDBSCAN, a clustering approach the author recommends trying before k-means
- Calculating "prediction strength" to choose number of clusters
- t-SNE and UMAP do dimensionality reduction "specifically for visualization purposes"
- The embedding in the bottleneck layer of an autoencoder can be used for dimensionality reduction
- Learning a distance metric from data
- Ranking problems: pointwise (the most obvious approach to me) and pairwise perform worse than listwise approaches such as LambdaMART
- Factorization machines, an approach to recommendation systems
- One reason you might use genetic algorithms is that the objective function doesn't have to be differentiable

I'd like to follow up on the concept of Bayesian hyperparameter learning, which was mentioned but not discussed.

This is an intriguing comment that I'd like to understand better:

...not many supervised learning algorithms can boast that they optimize a metric directly. Optimizing a metric is what we really want, but what we do in a typical supervised learning algorithm is we optimize the cost instead of the metric (because metrics are usually not differentiable). Usually, in supervised learning, as soon as we have found a model that optimizes the cost function, we try to tweak hyperparameters to improve the value of the metric. LambdaMART optimizes the metric directly.

Robert

300 reviews

June 7, 2020

My general philosophy when it comes to learning is that it is much more important to know what you don't know than to think you know something. For that reason, I am often a fan of Very Short Introductions and the like. In my view this is a tough genre for an author – it is incredibly difficult to strike the right balance between breadth and depth. Burkov's book does a great job of it, providing a brilliant executive summary of the entire machine learning landscape (insofar as that is possible). The book is not for pure laymen, though should be doable for the "determined amateur" with a good understanding of high-school maths.

I found it to be a fantastic refresher, as well as a useful survey of some of the novel ideas in the ML world. Definitely would recommend to someone who wants a working knowledge of different machine learning methods and what is/isn't possible with ML. But this is probably not suitable if you want a broader picture of how ML will change the world, or specific business applications etc.

natural-science non-fiction textbooks

An Te

386 reviews26 followers

April 2, 2022

Wow! This book has either of two effects on you. You feel like you're enlightened and understand it. Or that you feel flummoxed and way out of your depth. I would say I leaned towards the former, as these concepts were novel to me. I did find the depth and maths helpful to develop the intuition needed to apply the algorithms. Obviously, one has not mastered them reading this book, but the connections points are clear is one desires to pursue it further.

If ever there was a book which would be worth multiple reads, it is this one. A lot of concepts went over my head but some did bed down. If this is all machine learning is about (in a nutshell), it is worth read this book regularly to top up on what is available. Looking to mine it for more insights (if you don't mind the pun)

There's an accompanying wiki also, which helps keep the text relevant and up to date.

Vicki

531 reviews237 followers

January 20, 2023

This book is hard to evaluate. It definitely has a lot of information and is dense in formulas but it’s hard to get a sense of who it’s meant to be for. It’s too detailed for the machine learning manager and leaves the machine learning practitioner (me) with not enough detail in some parts (for example I would have liked more on decision trees and gradient boosting but the book just rushes by.)

There are a lot of formulas without context and the sections don’t logically make sense.

The one thing I think this book is probably good for is preparing for interviews because there are a lot of concepts and buzzwords in bold. So maybe a summary skim? I think the idea is good but the execution was hard to get right.

Randy

144 reviews47 followers

November 8, 2019

This is a brilliant introductory book on machine learning. The author even covered (even in only 100 pages) things like missing values and categorical variables. My only complaint is the index, it could be better. But it's a very short book and you don't need much help from the index to find things - the table of contents is enough.

Beyond that, you could do much worse than to read this, try a real project (find out that the real world is hard) and then find what the other 10,000 pages left out of this book might contain.

machine-learning programming

Winn Koster

4 reviews

March 3, 2021

This book is an exceptional introduction to machine learning techniques for someone who already has a foundation in linear algebra. It's deliberately limited in scope to get you started, and the author is very thoughtful with this -- if you want to dive into a topic with more detail, links to online examples are provided, as well as notes and recommendations for further reading and more advanced topics.

I have now read this cover to cover once, but I expect that I'll pick it up to review certain topics many times going forward.

LeoQuiroa

50 reviews

July 11, 2021

When I was at school, I used to write a really brief summary about the topic I wanted to learn and read it multiple times. When I start to read this book, I had a throwback of those memories. It feels like a very precise and concise summary of ML. I really recommend this book for an advanced audience as a quick reminder. On the other hand, for a beginner audience, oh boy! you will find a dense book.

Girish B

22 reviews3 followers

May 6, 2020

An awesome place to start the journey. A lot of information squeezed into a tiny book. Comes with a companion wiki as well. You are introduced to a lot of topics in a straightforward way. Loved the journey!

Joe

108 reviews1 follower

September 25, 2020

Excellent book, although I call foul on the “100 page” part :) there are excellent introductions, with follow-on references, for a lot of topics in ML. I will recommend this book to anyone looking for an intro or for brushing up on key areas.

amir husaini

5 reviews

February 3, 2022

I right now only understand roughly 60% of it. The privilege of knowing what we don't know in the Data Science or Machine Learning domain will be good to be less overwhelming and reprioritize what are to focus on next. This book is doing well as guideline along the career journey!

Nicky

34 reviews

February 28, 2022

This book is a great introductory read if you want to explore the vast landscape of ML from a holistic point of view. It covers many topics in a concise and comprehensible way, while at the same time staying away from too much detail. Consider it an elaborate ML cheat-sheet. Great work, Andriy!

Zoe Anderson

7 reviews1 follower

November 9, 2022

Great introduction to machine learning; it's concise and presents the fundamentals of machine learning clearly. There is a lot of maths involved, so to get the most out of the book it's worthwhile pausing to work through the formulas.

uni