
The Hundred-Page Machine Learning Book

Concise and to the point — the book can be read in a week. During that week, you will learn almost everything modern machine learning has to offer. The author and other practitioners have spent years learning these concepts.

Companion wiki — the book has a continuously updated wiki that extends some book chapters with additional information: Q&A, code snippets, further reading, tools, and other relevant resources.

Flexible price and formats — choose from a variety of formats and price options: Kindle, hardcover, paperback, EPUB, PDF. If you buy an EPUB or a PDF, you decide the price you pay!

Read first, buy later — download book chapters for free, read them, and share them with your friends and colleagues. Buy the book only if you liked it or found it useful in your work, study, or business.

159 pages, Paperback

Published January 13, 2019


About the author

Andriy Burkov

6 books, 78 followers

Ratings & Reviews



Community Reviews

5 stars: 562 (46%)
4 stars: 437 (36%)
3 stars: 160 (13%)
2 stars: 28 (2%)
1 star: 11 (<1%)
Manny
Author, 34 books, 15k followers
November 23, 2021
If you already have some familiarity with machine learning and are not afraid of equations, this is an excellent brief overview of the field, written by someone who comes across as having huge experience of using ML in practice. You would have to know a great deal not to pick up something new.

One of the bits I liked best was the extremely concise, clear explanation of why Deep Learning has been such a success. As the author says, a deep neural network is really a very simple thing: it's just a function that's been built recursively out of a bunch of smaller functions, so something like

f = f₁(f₂(f₃( ... ), f₄( ... ), ..., f₂₅( ... ), ... ), ... )

The pictures you see [figure omitted: the standard layered neural-network diagram] are an easy-to-read graphical notation that says the same thing. Each of the nodes is a component function which will contain some adjustable parameters, and the lines show how the functions are composed together. We're going to tune the parameters by taking a bunch of training examples and doing some kind of gradient descent: we'll look at the answer we get with the current version of the function, do some calculus to find the direction in which we need to change the parameters to improve our guess, move the parameters a little step in that direction, and continue until things work well enough. This is the famous backprop algorithm.
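
In code, that loop is tiny. Here is a minimal sketch of my own (not code from the book) that fits a one-parameter model y = w * x by gradient descent:

```python
# Fit y = w * x to noisy data by gradient descent (illustrative sketch).
# Real backprop applies the same idea to every parameter of a deep,
# composed function, using the chain rule to get each derivative.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]  # (x, y) pairs, roughly y = 2x

w = 0.0    # current guess for the parameter
lr = 0.05  # the "little step in that direction"

for step in range(200):
    # derivative of the mean squared error 0.5*(w*x - y)^2 w.r.t. w
    grad = sum((w * x - y) * x for x, y in data) / len(data)
    w -= lr * grad  # move against the gradient

print(w)  # converges to roughly 2.0
```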

Now what are these component functions going to be? The simplest choice would be to have linear functions, which are easy to work with. But a composition of linear functions is just another linear function, so that won't help us. The next simplest thing is for each function to be a linear function with some non-linear function applied to its output. Inspired by the way neurons work in the brain, it's natural to think that the non-linear function will be some kind of threshold function. If the output of the linear function is large enough, it'll produce an output, and otherwise it'll produce nothing.
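
The point that a composition of linear functions is just another linear function is easy to verify numerically; here is a quick numpy check of my own devising:

```python
import numpy as np

rng = np.random.default_rng(42)
W1 = rng.standard_normal((4, 3))  # "layer 1": maps 3 -> 4
W2 = rng.standard_normal((2, 4))  # "layer 2": maps 4 -> 2
x = rng.standard_normal(3)

# Two linear layers collapse into a single pre-multiplied matrix:
assert np.allclose(W2 @ (W1 @ x), (W2 @ W1) @ x)

# Insert a non-linearity (here ReLU) and the collapse no longer happens,
# which is exactly why deep nets need non-linear activations:
deep = W2 @ np.maximum(W1 @ x, 0.0)
```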

Well, what is the threshold function going to be? Since we're going to use calculus to train our net, our first thought is that we want a smooth function, i.e. one that we can differentiate. The sigmoid function on the left seems like the kind of thing we need:

[figure omitted: the sigmoid curve on the left, the ReLU on the right]

But in fact it turns out that the sigmoid function, clever as it looks, isn't best. If we have many layers in our neural net, the sigmoid functions will multiply together. Then changing a parameter in a deep layer will go through all those multiplied functions, and it will have almost no effect, so you won't be able to train the deep layers effectively. The ReLU function on the right is much better. It looks stupid and ugly, because it's not smooth at the crossover point, but it turns out that what's really important is that it's linear everywhere else. Using ReLU instead of sigmoid opens the door to deep neural nets that work. It took ages for people to realise this.
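
You can see the vanishing-gradient problem he alludes to with a few lines of arithmetic; this is my own illustration, not an example from the book:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# The sigmoid's derivative s(z) * (1 - s(z)) peaks at 0.25 (when z = 0).
# Backprop through a stack of sigmoid layers multiplies such factors:
best_case = sigmoid(0.0) * (1 - sigmoid(0.0))  # 0.25
print(best_case ** 20)  # ~9e-13: the signal reaching deep layers vanishes

# ReLU's derivative is exactly 1 for positive inputs, so the product
# through active units stays 1 and deep layers still get trained.
print(1.0 ** 20)  # 1.0
```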

I have seen versions of this explanation before, but not so compact and straightforward, and I didn't fully appreciate the force of the argument. He does many things in a similar style: other sections I really liked described one-shot learning, unsupervised learning, learning to rank, and learning to recommend. I feel I now understand all of these much better.

What an excellent book! As many of the other reviewers say, this is a model you'd like to see more people emulating.
Quinn Jackson
Author, 19 books, 3 followers
January 26, 2019
Burkov's book deserves to be on every machine learning professional's and enthusiast's desk. His economy of exposition, his inclusion of the necessary mathematical underpinnings, and his succinct and pragmatic overall tone are much appreciated.

Since receiving and reading the book, I have already had the opportunity to have it in hand in my own professional work, both as a refresher and as a quick sanity check. My own work in ML is at the level of the algorithms and the underlying maths themselves, not just as a user of someone else's black-box implementation, and in this respect, Burkov's treatise and approach to the field have been extremely valuable to me.

One might say of such a short work that it should be longer: I am definitely not in this camp. I already have books with thousands of pages on highly specialized subsections of the ML milieu; when I need an in-depth dive into probabilistic graphical models, I can certainly find plenty of literature for such an exploration. What I have not had until Burkov's book, however, is a single resource to turn to before I even consider such hair-splitting.

Would that more specialist books were so well executed. Burkov has done the field a great service.
86 reviews, 6 followers
May 22, 2019
If I had to describe this book, it would be with the following scene:

Imagine you are in a bar with a good friend who earns their living doing something that you are interested in. At some point during the evening, after discussing this and that, you ask them to give you a general idea of things in their area of expertise. You know, nothing really specific, just some pointers. And, because that person is a really great friend, they sigh and sketch a comprehensive introduction to the subject _on a napkin_. All while patiently explaining thorny subjects like 'derivation' and 'normal distribution', not once acting judgemental or 'smarter-than-thou'.

This is a good book. It will not make you an expert but will act as a handy guide to the turbulent world of ML, giving you pointers and rough outlines of what awaits you in the dark.
Antonis Maronikolakis
119 reviews, 5 followers
July 21, 2019
The idea behind the book is great, and the execution is admirable. Unfortunately, I have to admit that in all honesty, I believe this book is over-hyped. It serves as a great introduction to most Machine Learning concepts, but it reads more like a listing page than a proper introduction. The book, even though it covers a lot of concepts, doesn't do enough to entice you to research the topics further. In a lot of places it feels rushed and condensed, to the point where no real information is given to the reader.

Trying to create a short guide on ML is admirable and I am glad that Mr. Burkov took the initiative to attempt this task, but I am afraid his attempt falls just short of my expectations. Nevertheless, I believe it will pave the way for greater things. For now, Mr. Burkov remains a pioneer in this niche area.
Mohammed
3 reviews, 1 follower
August 11, 2019
I wish I had had this book when I started learning Machine Learning. I knew a few things but learned many more. It took two days to complete, and it has become a reference book for me now.

I really like the practical tips the author, Andriy, has provided, along with the high-quality demonstrations of almost all popular Machine Learning algorithms.
Brendan
119 reviews, 20 followers
April 29, 2021
I applaud the gumption, but I don't think the concept is possible. Burkov starts this book with promise, but quickly falls into simply summarizing the concepts. He vacillates between giving too much of an introduction and not enough. The most glaring fault is the succinct introduction of concepts filled with jargon like the following.

"An autoencoder is a feed-forward neural network with an encoder-decoder architecture."

Sure, this is fine if you understand the encoder-decoder architecture, but when you wash over the concepts so quickly, I doubt true newcomers fully understand them when phrased this way. Because of this, I'm not sure who this book is written for. If you have an understanding of ML and want a reference, this book is far too light on details. If you have no understanding of ML, this book is far too heavy on concepts and implicitly supposes too much knowledge. I certainly couldn't give this to my boss to understand; he wouldn't get past the first 10 pages. For me? I'm not sure I got anything out of this. The concepts I knew were not bolstered conceptually here. There were no great metaphors or mathematical insights. The concepts I was unfamiliar with still seem gray and foggy in my mind, with this book offering little to hold on to for support.
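
For readers who do hit that sentence cold, this is roughly what it compresses; a hypothetical minimal sketch of my own, not the book's code:

```python
import numpy as np

# An (untrained) autoencoder's forward pass: the encoder squeezes the
# input into a small "bottleneck" vector, the decoder tries to rebuild it.
rng = np.random.default_rng(0)
x = rng.random(8)                    # 8-dimensional input

W_enc = rng.standard_normal((3, 8))  # encoder: 8 -> 3
W_dec = rng.standard_normal((8, 3))  # decoder: 3 -> 8

code = np.tanh(W_enc @ x)            # compressed representation
x_hat = W_dec @ code                 # attempted reconstruction

loss = np.mean((x - x_hat) ** 2)     # training would minimize this
```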

Unfortunately, Burkov gave himself an impossible task and proved the task to be, for the moment, still impossible.
Ashok Krishna
379 reviews, 54 followers
September 27, 2020
If you are an absolute beginner or someone just interested in knowing what all the fuss about data science is, then this book isn't for you. That is, unless you have a good grasp of statistical concepts and a penchant for looking at all those scientific/statistical notations. Good in parts for a beginner trying to understand the core concepts. Too complex and dry at times. 3 stars!
Ibrahim Sharaf
6 reviews, 3 followers
September 19, 2023
The simplest and most concise ML book that I've ever read (so far, haha)! I really love the tech-blogpost writing and explanation style that Burkov has adopted for this book. The usual ML book is very academic and math-focused by nature; this one sits on the other side of the spectrum: it focuses more on intuition, provides details without going so deep that it overwhelms the reader, and keeps the math simple, using it only when necessary.

However, I'm not saying this should be your textbook for teaching yourself ML, and I believe that isn't the segment the author was targeting when he wrote the book. In my opinion, the most suitable audiences for this book are:
1. Newbies who are looking for an ML introductory book, think of it as a bunch of carefully curated Medium posts, but it's way better and more streamlined.
2. Experienced ML professionals who are looking for a quick brush up on some concepts, maybe for an interview or a new project.

Also, kudos to the author for distributing this book as "read first, buy later".
January 8, 2019
Absolutely great book for dabbling in everything machine learning related. It is an excellent first exposure to the field for those with technical backgrounds in math, science, computer science, or engineering but not much experience with machine learning algorithms, and it is also a solid reference for those who aren't the most up to date in the field.

Andriy was also kind enough to include a free edition of his book on his site, but I highly suggest purchasing it if you believe you will make frequent use of it. A solid hard copy is helpful when your monitors are already full, as almost any data scientist's or analyst's likely are. I cannot recommend this enough, especially when you can try the book for free.
3 reviews, 2 followers
November 5, 2022
This book swallows up the heavyweight mathematics textbooks and spits out a slim product no thicker than the width of my smartphone. From page one all the way to page 136, Andriy Burkov, the author, does not waste a single word in distilling the most practical concepts in machine learning. You read that right. It is MORE than 100 pages! Sounds like the book has some bias. Get it?

At a minimum, prospective readers should be comfortable with calculus, statistics, probability, and vectors and matrices, as well as familiar with data science concepts. As for data scientists: the majority of your work likely focuses on a handful of ML models, and while you might skim parts of the book, I think it is still a useful reference because of the breadth of material. I provide a more in-depth summary and commentary on the book at the following webpage:

https://thedatageneralist.com/book-re...
1 review
May 19, 2019
This book became my single resource to turn to. With the QR codes, it allowed me to investigate the topics further. Very easy to read and understand, The Hundred-Page ML Book is concise and covers most of the topics in machine learning that we need to know about.
79 reviews, 4 followers
April 11, 2020
Very good and comprehensive book. The first 70% was real fun; then it got slightly harder (at least for me). As a whole, it definitely looks like a useful and practical reference for ML.
Tianyao (Till) Chen
69 reviews, 12 followers
June 8, 2020
Fantastic ML book! I don't think there's any other ML book that's as lucid and comprehensive as this one. It's an amazing overview without being shallow. Highly recommended!
Cindy
162 reviews, 64 followers
March 18, 2021
If you want to condense machine learning into 100 pages (or more precisely, 136) without losing rigor, you're going to end up with a bunch of math equations because machine learning is essentially math applied to data.
This book is so mathematically dense. It's symbolically complex and verbally concise. I really wouldn't recommend this to someone just starting out with machine learning. It could be a good refresher for someone who has studied machine learning at a graduate level. Maybe if you've been working with machine learning for a while and you've forgotten what exactly certain models do mathematically, this could help. I agree with the author: "Practitioners with experience can use this book as a collection of directions for further self-improvement".
The information within seems pretty up-to-date in terms of mentioning validation sets, and even genetic algorithms (mentioned briefly) and LSTMs. Some of the explanations are not sufficient in themselves, and you're going to have to look the topic up online if you're not familiar with it (especially starting on page 65 with deep learning and backpropagation, although the book does include QR codes that you can scan for further elaboration).
What I really enjoyed was that the author talked about certain statistical assumptions that are made about data in machine learning. They seem obvious when stated, but they're not necessarily common sense. For instance, it's usually assumed that your training data is randomly and independently selected from a particular distribution, and a machine learning model trained on the data is supposed to work on unseen future data because we assume that the future data is coming from the same distribution.
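That assumption is easy to demonstrate by breaking it; here is a small sketch of my own (not from the book) where a threshold classifier learned on one distribution meets shifted future data:

```python
import numpy as np

rng = np.random.default_rng(0)
labels = np.array([0] * 500 + [1] * 500)
threshold = 1.0  # rule learned for class 0 ~ N(0,1), class 1 ~ N(2,1)

# Future data from the SAME distribution: the model keeps working.
same = np.concatenate([rng.normal(0, 1, 500), rng.normal(2, 1, 500)])
print(np.mean((same > threshold) == labels))     # ~0.84

# Future data from a SHIFTED distribution: accuracy quietly degrades.
shifted = np.concatenate([rng.normal(1, 1, 500), rng.normal(3, 1, 500)])
print(np.mean((shifted > threshold) == labels))  # ~0.74
```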
I also liked that the author provided multiple ways to solve particular problems (like an imbalanced dataset).
Solid book. I recommend it as a reference, but not necessarily as a one-stop shop for teaching yourself machine learning.

Alireza Aghamohammadi
49 reviews, 46 followers
July 28, 2021

This book is very well suited to people who don't have the time to read the main machine learning references, which usually run to more than a thousand pages, and it is strongly recommended.

Machine learning algorithms (such as support vector machines, random forests, regression, and so on) are explained, along with neural networks (at an introductory level) and the mathematical foundations behind each one. Each topic is taught in three to four pages.

In fact, the book tries to avoid all digressions and to explain the essence and essentials of each algorithm, and in my opinion it has been completely successful.
Jacob Williams
514 reviews, 11 followers
September 20, 2022
This is a great concise refresher. It covers many of the same concepts as the machine learning course I took earlier this year - decision trees, linear regression, SVM, neural nets, kNN, expectation maximization, ... - and it's sort of like having a set of high-quality, focused notes.

It also explains some details I've ignored in the past and covers many topics I wasn't familiar with:


- One reason for using logarithms when they're not mathematically necessary is to avoid overflow (see the quick demonstration after this list)
- Negative cosine similarity as a distance function for e.g. k-nearest-neighbors
- Why to normalize features: so the derivatives with respect to each feature will be in similar ranges
- Rules of thumb on normalization vs standardization of features: standardize when there are outliers, or the data is close to normally distributed, or when doing unsupervised learning
- Multiple strategies for dealing with missing data
- Regularization
- Convolutional neural nets
- Recurrent neural nets, GRUs in particular
- Kernel regression, a non-parametric way to model nonlinear data
- Bagging; in particular random forest, which tries to produce a bunch of uncorrelated decision trees
- Comparison of boosting and bagging: "boosting reduces the bias ... instead of the variance"
- Sequence-to-Sequence learning: encoder, embedding, decoder
- Active learning: "Once we know the importance score of each unlabeled example, we pick the one with the highest importance score and ask the expert to annotate it."
- Denoising autoencoders: try to produce the same output as your input, after passing through an embedding layer, and despite corruption of the input
- Semi-supervised learning, such as the ladder network (a type of denoising autoencoder) that can perform excellently on MNIST with only a tiny fraction of the examples labeled
- One-shot learning with siamese neural networks and triplet loss
- Zero-shot learning, where "we want the model to be able to predict labels that we didn’t have in the training data"
- Stacking: "building a meta-model that takes the output of base models as input"
- Data augmentation: making additional training data by e.g. (for images) "zooming it slightly, rotating, flipping, darkening"
- Transfer learning
- Density estimation
- HDBSCAN, a clustering approach the author recommends trying before k-means
- Calculating "prediction strength" to choose number of clusters
- t-SNE and UMAP do dimensionality reduction "specifically for visualization purposes"
- The embedding in the bottleneck layer of an autoencoder can be used for dimensionality reduction
- Learning a distance metric from data
- Ranking problems: pointwise (the most obvious approach to me) and pairwise perform worse than listwise approaches such as LambdaMART
- Factorization machines, an approach to recommendation systems
- One reason you might use genetic algorithms is that the objective function doesn't have to be differentiable
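
On the first bullet, the numerical problem that logarithms avoid takes a few lines of Python to reproduce (my own demonstration; the product case here actually underflows to zero, the mirror image of overflow):

```python
import math

# Multiplying 100 small probabilities underflows double precision
# (1e-5 ** 100 = 1e-500, far below the ~1e-308 minimum):
product = 1.0
for p in [1e-5] * 100:
    product *= p
print(product)  # 0.0 -- the value is lost

# Summing logarithms keeps the same information in a safe range:
log_sum = sum(math.log(p) for p in [1e-5] * 100)
print(log_sum)  # about -1151.3, perfectly usable
```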


I'd like to follow up on the concept of Bayesian hyperparameter learning, which was mentioned but not discussed.

This is an intriguing comment that I'd like to understand better:


...not many supervised learning algorithms can boast that they optimize a metric directly. Optimizing a metric is what we really want, but what we do in a typical supervised learning algorithm is we optimize the cost instead of the metric (because metrics are usually not differentiable). Usually, in supervised learning, as soon as we have found a model that optimizes the cost function, we try to tweak hyperparameters to improve the value of the metric. LambdaMART optimizes the metric directly.
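
To make the quoted point concrete, here is a toy sketch of my own (not from the book): a metric like accuracy is a step function of the parameters, so gradient descent gets no signal from it, whereas a smooth cost like log-loss does:

```python
import numpy as np

y, x = 1, 2.0  # one training example with true label 1

for w in (0.49, 0.50, 0.51):
    score = w * x
    accuracy = float((score > 1.0) == y)  # jumps 0 -> 1, zero slope elsewhere
    p = 1 / (1 + np.exp(-score))          # sigmoid probability of class 1
    logloss = -np.log(p)                  # changes smoothly with w
    print(w, accuracy, round(logloss, 4))
```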
Robert
284 reviews
June 7, 2020
My general philosophy when it comes to learning is that it is much more important to know what you don't know than to think you know something. For that reason, I am often a fan of Very Short Introductions and the like. In my view this is a tough genre for an author – it is incredibly difficult to strike the right balance between breadth and depth. Burkov's book does a great job of it, providing a brilliant executive summary of the entire machine learning landscape (insofar as that is possible). The book is not for pure laymen, though it should be doable for the "determined amateur" with a good understanding of high-school maths.

I found it to be a fantastic refresher, as well as a useful survey of some of the novel ideas in the ML world. Definitely would recommend to someone who wants a working knowledge of different machine learning methods and what is/isn't possible with ML. But this is probably not suitable if you want a broader picture of how ML will change the world, or specific business applications etc.
An Te
386 reviews, 25 followers
April 2, 2022
Wow! This book has one of two effects on you: either you feel enlightened and understand it, or you feel flummoxed and way out of your depth. I would say I leaned towards the former, even though these concepts were novel to me. I found the depth and the maths helpful for developing the intuition needed to apply the algorithms. Obviously, one has not mastered them after reading this book, but the connection points are clear if one desires to pursue things further.

If ever there was a book worth multiple reads, it is this one. A lot of concepts went over my head, but some did bed down. If this is all machine learning is about (in a nutshell), it is worth reading this book regularly to top up on what is available. I'm looking to mine it for more insights (if you don't mind the pun).

There's an accompanying wiki also, which helps keep the text relevant and up to date.
Vicki
514 reviews, 225 followers
January 20, 2023
This book is hard to evaluate. It definitely has a lot of information and is dense in formulas, but it's hard to get a sense of who it's meant for. It's too detailed for the machine learning manager, yet it leaves the machine learning practitioner (me) with not enough detail in some parts (for example, I would have liked more on decision trees and gradient boosting, but the book just rushes by).

There are a lot of formulas without context and the sections don’t logically make sense.

The one thing I think this book is probably good for is preparing for interviews because there are a lot of concepts and buzzwords in bold. So maybe a summary skim? I think the idea is good but the execution was hard to get right.
Randy
142 reviews, 44 followers
November 8, 2019
This is a brilliant introductory book on machine learning. The author even covered (in only 100 pages) things like missing values and categorical variables. My only complaint is the index; it could be better. But it's a very short book, and you don't need much help from the index to find things: the table of contents is enough.

Beyond that, you could do much worse than to read this, try a real project (find out that the real world is hard) and then find what the other 10,000 pages left out of this book might contain.
4 reviews
March 3, 2021
This book is an exceptional introduction to machine learning techniques for someone who already has a foundation in linear algebra. It's deliberately limited in scope to get you started, and the author is very thoughtful with this -- if you want to dive into a topic with more detail, links to online examples are provided, as well as notes and recommendations for further reading and more advanced topics.

I have now read this cover to cover once, but I expect that I'll pick it up to review certain topics many times going forward.
LeoQuiroa
50 reviews
July 11, 2021
When I was at school, I used to write a really brief summary of the topic I wanted to learn and read it multiple times. When I started to read this book, those memories came rushing back. It feels like a very precise and concise summary of ML. I really recommend this book to an advanced audience as a quick reminder. On the other hand, for a beginner audience: oh boy, you will find it a dense book.
Girish B.
22 reviews, 3 followers
May 6, 2020
An awesome place to start the journey. A lot of information squeezed into a tiny book. Comes with a companion wiki as well. You are introduced to a lot of topics in a straightforward way. Loved the journey!
Joe
106 reviews, 1 follower
September 25, 2020
Excellent book, although I call foul on the "100 page" part :) There are excellent introductions, with follow-on references, for a lot of topics in ML. I will recommend this book to anyone looking for an intro or for brushing up on key areas.
amir husaini
5 reviews
February 3, 2022
Right now I only understand roughly 60% of it. Still, knowing what we don't know in the Data Science or Machine Learning domain makes the field less overwhelming and helps us reprioritize what to focus on next. This book works well as a guideline along the career journey!
Nicky
25 reviews
February 28, 2022
This book is a great introductory read if you want to explore the vast landscape of ML from a holistic point of view. It covers many topics in a concise and comprehensible way, while at the same time staying away from too much detail. Consider it an elaborate ML cheat-sheet. Great work, Andriy!
Zoe Anderson
7 reviews, 1 follower
November 9, 2022
Great introduction to machine learning; it's concise and presents the fundamentals of machine learning clearly. There is a lot of maths involved, so to get the most out of the book it's worthwhile pausing to work through the formulas.
2 reviews
October 27, 2019
Nice introduction to ML topics; also good as a reference book when you later want to revisit interconnected concepts.
