The following are the notes I took while reading Pedro Domingos's book ‘The Master Algorithm’. Frankly, I did not like the book: it ventures into philosophy and personal musings that are not very informative. His talks at Google and at Microsoft introducing the book and its idea, which are what prompted me to read it, were much more interesting.
Civilization advances by extending the number of important operations we can perform without thinking about them.
‒ Alfred North Whitehead
The Master Algorithm
by Pedro Domingos, 2015
Machine learning is the automation of discovery, and it is responsible for making our smartphones work, helping Netflix suggest movies for us to watch, and getting presidents elected. But there is a push to use machine learning to do even more—to cure cancer and AIDS and possibly solve every problem humanity has. Domingos is at the very forefront of the search for the Master Algorithm, a universal learner capable of deriving all knowledge—past, present and future—from data. In this book, he lifts the veil on the usually secretive machine learning industry and details the quest for the Master Algorithm, along with the revolutionary implications such a discovery will have on our society.
Pedro Domingos is a Professor of Computer Science and Engineering at the University of Washington, and he is the cofounder of the International Machine Learning Society.
Algorithms increasingly run our lives. They find books, movies, jobs, and dates for us, manage our investments, and discover new drugs. More and more, these algorithms work by learning from the trails of data we leave in our newly digital world. Like curious children, they observe us, imitate, and experiment. And in the world’s top research labs and universities, the race is on to invent the ultimate learning algorithm: one capable of discovering any knowledge from data, and doing anything we want, before we even ask.
Machine learning is the automation of discovery—the scientific method on steroids—that enables intelligent robots and computers to program themselves. No field of science today is more important yet more shrouded in mystery. Pedro Domingos, one of the field’s leading lights, lifts the veil for the first time to give us a peek inside the learning machines that power Google, Amazon, and your smartphone. He charts a course through machine learning’s five major schools of thought, showing how they turn ideas from neuroscience, evolution, psychology, physics, and statistics into algorithms ready to serve you. Step by step, he assembles a blueprint for the future universal learner—the Master Algorithm—and discusses what it means for you, and for the future of business, science, and society.
If data-ism is today’s rising philosophy, this book will be its bible. The quest for universal learning is one of the most significant, fascinating, and revolutionary intellectual developments of all time. A groundbreaking book, The Master Algorithm is the essential guide for anyone and everyone wanting to understand not just how the revolution will happen, but how to be at its forefront.
Traditionally, the only way to get a computer to do something, from adding two numbers to flying an airplane, was to write down an algorithm explaining how, in painstaking detail. But machine-learning algorithms, also known as learners, are different: they figure it out on their own, by making inferences from data. And the more data they have, the better they get. Now we don’t have to program computers; they program themselves.
Hundreds of new learning algorithms are invented every year, but they’re all based on the same few basic ideas. Far from esoteric, and quite aside even from their use in computers, they are answers to questions that matter to all of us: How do we learn? Is there a better way? What can we predict? Can we trust what we’ve learned? Rival schools of thought within machine learning have very different answers to these questions. The main ones are five in number:
- Symbolists view learning as the inverse of deduction and take ideas from philosophy, psychology, and logic.
- Connectionists reverse engineer the brain and are inspired by neuroscience and physics.
- Evolutionaries simulate evolution on the computer and draw on genetics and evolutionary biology.
- Bayesians believe learning is a form of probabilistic inference and have their roots in statistics.
- Analogizers learn by extrapolating from similarity judgments and are influenced by psychology and mathematical optimization.
P and NP are the two most important classes of problems in computer science. (The names are not very mnemonic, unfortunately.)
A problem is in P if we can solve it efficiently, and it’s in NP if we can efficiently check its solution.
The famous P = NP question is whether every efficiently checkable problem is also efficiently solvable. Because of NP-completeness, all it takes to answer it is to prove that one NP-complete problem is efficiently solvable (or not). NP is not the hardest class of problems in computer science, but it’s arguably the hardest “realistic” class: if you can’t even check a problem’s solution before the universe ends, what’s the point of trying to solve it?
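To make the gap between checking and solving concrete for myself, here is a minimal Python sketch using subset sum (given a list of numbers, is there a subset that adds up to a target?), which is a standard NP-complete problem. The function names and the choice of encoding a solution as a list of indices are my own assumptions, not something from the book. Checking a proposed answer takes one pass over it, while the naive solver may have to try every one of the 2^n subsets.

```python
from itertools import combinations

def verify(numbers, target, certificate):
    """Check a proposed solution quickly: one pass over the certificate."""
    # certificate: list of distinct indices into numbers (my own encoding)
    return (len(set(certificate)) == len(certificate)
            and all(0 <= i < len(numbers) for i in certificate)
            and sum(numbers[i] for i in certificate) == target)

def solve_brute_force(numbers, target):
    """Find a solution by trying subsets: up to 2**n candidates."""
    for size in range(len(numbers) + 1):
        for indices in combinations(range(len(numbers)), size):
            if verify(numbers, target, indices):
                return list(indices)
    return None  # no subset adds up to target

numbers = [3, 34, 4, 12, 5, 2]
print(verify(numbers, 9, [2, 4]))      # True: 4 + 5 == 9
print(solve_brute_force(numbers, 9))   # [2, 4]
```

The brute-force solver is only one possible approach. The open question is whether some solver that is efficient in the general case must exist whenever an efficient checker does.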
Science goes through three phases, which we can call the Brahe, Kepler, and Newton phases. In the Brahe phase, we gather lots of data, like Tycho Brahe patiently recording the positions of the planets night after night, year after year. In the Kepler phase, we fit empirical laws to the data, like Kepler did to the planets’ motions. In the Newton phase, we discover the deeper truths. Most science consists of Brahe- and Kepler-like work; Newton moments are rare.
Our search for the Master Algorithm is complicated, but also enlivened, by the rival schools of thought that exist within machine learning. The main ones are the symbolists, connectionists, evolutionaries, Bayesians, and analogizers.
Each tribe has a set of core beliefs, and a particular problem that it cares most about. It has found a solution to that problem, based on ideas from its allied fields of science, and it has a master algorithm that embodies it.
For symbolists, all intelligence can be reduced to manipulating symbols, in the same way that a mathematician solves equations by replacing expressions by other expressions. Symbolists understand that you can’t learn from scratch: you need some initial knowledge to go with the data. They’ve figured out how to incorporate preexisting knowledge into learning, and how to combine different pieces of knowledge on the fly in order to solve new problems. Their master algorithm is inverse deduction, which figures out what knowledge is missing in order to make a deduction go through, and then makes it as general as possible.
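As a toy illustration of inverse deduction, the sketch below looks for the missing rule that would let the observed conclusions be deduced from the known facts, and states it for any x rather than for the specific individuals seen. This is my own heavily simplified reading of the idea: the data layout (sets of predicate/entity pairs), the function names, and the restriction to single-premise rules are all assumptions, and real systems of this kind are far more involved.

```python
def induce_rules(facts, observations):
    """Propose rules 'if premise(x) then conclusion(x)' under which every
    observed conclusion could be deduced from the known facts."""
    rules = set()
    for conclusion in {pred for pred, _ in observations}:
        entities = {e for p, e in observations if p == conclusion}
        for premise in {pred for pred, _ in facts}:
            covered = {e for p, e in facts if p == premise}
            # Keep the rule only if it explains every observed case
            if entities <= covered:
                rules.add((premise, conclusion))
    return rules

def deduce(facts, rules):
    """Ordinary forward deduction: apply each rule once to the facts."""
    return {(concl, e) for prem, concl in rules for p, e in facts if p == prem}

facts = {("human", "Socrates"), ("human", "Plato"), ("greek", "Socrates")}
observations = {("mortal", "Socrates"), ("mortal", "Plato")}
rules = induce_rules(facts, observations)
print(rules)  # {('human', 'mortal')}

# The induced rule is general, so it also applies to someone new
print(deduce({("human", "Aristotle")}, rules))  # {('mortal', 'Aristotle')}
```

The rule "greek implies mortal" is rejected here because it cannot explain why Plato is mortal, which is the "make the deduction go through" part in miniature.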
For connectionists, learning is what the brain does, and so what we need to do is reverse engineer it. The brain learns by adjusting the strengths of connections between neurons, and the crucial problem is figuring out which connections are to blame for which errors and changing them accordingly. The connectionists’ master algorithm is backpropagation, which compares a system’s output with the desired one and then successively changes the connections in layer after layer of neurons so as to bring the output closer to what it should be.
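Here is a minimal sketch of that procedure for a tiny network with one hidden layer, written in plain Python. The network shape, the sigmoid units, the squared-error measure, the learning rate, and the starting weights are all my own assumptions for illustration. The point is the order of operations: compare the output with the target, assign blame to the output connections, then pass that blame back to the hidden layer.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(w_hidden, w_out, x):
    """Return the hidden activations and the network output for input x."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)))
    return hidden, output

def backprop_step(w_hidden, w_out, x, target, lr=0.5):
    """One gradient step on the squared error 0.5 * (output - target)**2."""
    hidden, output = forward(w_hidden, w_out, x)
    # Output layer: how much the error changes with the output unit's input
    delta_out = (output - target) * output * (1.0 - output)
    # Hidden layer: pass the blame back through the (old) output weights
    delta_hidden = [delta_out * w * h * (1.0 - h) for w, h in zip(w_out, hidden)]
    new_w_out = [w - lr * delta_out * h for w, h in zip(w_out, hidden)]
    new_w_hidden = [[w - lr * d * xi for w, xi in zip(row, x)]
                    for row, d in zip(w_hidden, delta_hidden)]
    return new_w_hidden, new_w_out

w_hidden = [[0.1, -0.2, 0.3], [0.4, 0.5, -0.6]]  # arbitrary starting weights
w_out = [0.7, -0.8]
x, target = [1.0, 0.0, 1.0], 1.0  # the last input is a constant bias term
for _ in range(100):
    w_hidden, w_out = backprop_step(w_hidden, w_out, x, target)
print(forward(w_hidden, w_out, x)[1])  # moves toward the target of 1.0
```

Training on a single example like this only shows the mechanics. A real learner would loop over many examples and would usually rely on a library rather than hand-written gradients.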
Evolutionaries believe that the mother of all learning is natural selection. If it made us, it can make anything, and all we need to do is simulate it on the computer. The key problem that evolutionaries solve is learning structure: not just adjusting parameters, like backpropagation does, but creating the brain that those adjustments can then fine-tune. The evolutionaries’ master algorithm is genetic programming, which mates and evolves computer programs in the same way that nature mates and evolves organisms.
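To see what "mating programs" can look like, here is a small sketch of genetic programming over arithmetic expression trees. Everything specific in it is my own assumption: the tree encoding as nested tuples, the two operators, the target function x*x + x, the population size, and the crude size limit. It uses only selection and crossover and leaves out mutation to stay short.

```python
import random

OPS = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}

def random_tree(rng, depth):
    """Grow a random expression over the variable x and small constants."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(["x", 1.0, 2.0])
    return (rng.choice(list(OPS)), random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def evaluate(tree, x):
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, x), evaluate(right, x))
    return x if tree == "x" else tree  # variable or numeric constant

def subtrees(tree, path=()):
    """Yield the path to every node, so crossover points can be picked."""
    yield path
    if isinstance(tree, tuple):
        yield from subtrees(tree[1], path + (1,))
        yield from subtrees(tree[2], path + (2,))

def get(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, new):
    if not path:
        return new
    parts = list(tree)
    parts[path[0]] = replace(tree[path[0]], path[1:], new)
    return tuple(parts)

def crossover(rng, mother, father):
    """Mate two programs: graft a random subtree of father into mother."""
    cut = rng.choice(list(subtrees(mother)))
    graft = get(father, rng.choice(list(subtrees(father))))
    return replace(mother, cut, graft)

def fitness(tree, samples):
    """Lower is better: squared error against the target outputs."""
    return sum((evaluate(tree, x) - y) ** 2 for x, y in samples)

def evolve(samples, generations=30, size=60, seed=0):
    rng = random.Random(seed)
    population = [random_tree(rng, 3) for _ in range(size)]
    for _ in range(generations):
        population.sort(key=lambda t: fitness(t, samples))
        survivors = population[: size // 2]  # selection; the best always survives
        children = []
        while len(survivors) + len(children) < size:
            child = crossover(rng, rng.choice(survivors), rng.choice(survivors))
            if len(list(subtrees(child))) <= 63:  # crude guard against bloat
                children.append(child)
        population = survivors + children
    return min(population, key=lambda t: fitness(t, samples))

samples = [(x, x * x + x) for x in (-2.0, -1.0, 0.0, 1.0, 2.0, 3.0)]
best = evolve(samples)
print(best, fitness(best, samples))
```

Because the fitter half of each generation is carried over unchanged, the best error never gets worse from one generation to the next. Whether the search actually reaches the exact target expression depends on the seed and settings.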
Bayesians are concerned above all with uncertainty. All learned knowledge is uncertain, and learning itself is a form of uncertain inference. The problem then becomes how to deal with noisy, incomplete, and even contradictory information without falling apart. The solution is probabilistic inference, and the master algorithm is Bayes’ theorem and its derivates. Bayes’ theorem tells us how to incorporate new evidence into our beliefs, and probabilistic inference algorithms do that as efficiently as possible.
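A short worked example helped me see how Bayes' theorem folds new evidence into a belief. The sketch below applies P(H | E) = P(E | H) P(H) / P(E) to a medical test; the function name and all the numbers (a 1% prior, a 90% detection rate, a 5% false-positive rate) are made up for illustration.

```python
def bayes_update(prior, likelihood_if_true, likelihood_if_false):
    """Posterior P(H | E) from the prior P(H), P(E | H), and P(E | not H)."""
    evidence = likelihood_if_true * prior + likelihood_if_false * (1.0 - prior)
    return likelihood_if_true * prior / evidence

# Hypothetical numbers: 1% of patients have the disease, the test catches
# 90% of true cases and wrongly flags 5% of healthy patients.
belief = 0.01
belief = bayes_update(belief, 0.9, 0.05)  # first positive test
print(round(belief, 3))  # 0.154
belief = bayes_update(belief, 0.9, 0.05)  # second, independent positive test
print(round(belief, 3))  # 0.766
```

Each posterior becomes the prior for the next piece of evidence, which is the "incorporate new evidence into our beliefs" step. Treating the two tests as independent is itself an assumption.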
For analogizers, the key to learning is recognizing similarities between situations and thereby inferring other similarities. If two patients have similar symptoms, perhaps they have the same disease. The key problem is judging how similar two things are. The analogizers’ master algorithm is the support vector machine, which figures out which experiences to remember and how to combine them to make new predictions.
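The patient example maps neatly onto nearest-neighbor classification, which I use in the sketch below as a much simpler stand-in for similarity-based learning. Note that this is not the support vector machine named above. The features (temperature and a cough score), the labels, and the use of plain Euclidean distance as the similarity measure are all my own assumptions.

```python
import math

def nearest_neighbor(examples, query):
    """Predict the label of the stored example most similar to query."""
    # Similarity here is plain Euclidean distance (an assumption; judging
    # similarity well is the hard part the text points to)
    _, label = min(examples, key=lambda ex: math.dist(ex[0], query))
    return label

# Hypothetical patients: (temperature in Celsius, cough severity 0-10)
patients = [
    ((39.5, 8.0), "flu"),
    ((36.8, 1.0), "healthy"),
    ((37.2, 6.0), "cold"),
]
print(nearest_neighbor(patients, (39.0, 7.0)))  # flu
print(nearest_neighbor(patients, (37.0, 5.0)))  # cold
```

This version remembers every example and treats all features equally. As I understand the passage, a support vector machine differs on exactly those two points: it chooses which examples to keep and how much weight each one gets.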