Introduction: AI, Machine Learning, and Deep Learning
Originally published 16 February, 2021
By Michio Suginoo
Last Edited 22 February, 2021
AI, Machine Learning, and Deep Learning: these three terms are often used ambiguously, and sometimes interchangeably.
The Venn diagram here illustrates one popular way to categorize these three terms. It is not the only way, so please take it as an example.
At the top of the hierarchy, AI has a catch-all definition that captures anything that reasons and learns like humans. This rough definition serves its purpose, since new innovations keep emerging and need to fit into this category.
Nevertheless, strictly speaking, we do not yet have a program that satisfies this description.
Next, Machine Learning is a subset of AI. It is a new “algorithm paradigm” in which a program learns from data without being explicitly programmed.
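To make "learning without being explicitly programmed" a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the toy data (hours studied versus passing an exam), the function names, and the very simple "pick the best threshold" learner. It is not a real Machine Learning library, only a contrast between a rule written by a human and a rule derived from examples.

```python
# Hypothetical toy data: (hours_studied, passed_exam). Invented for illustration.
EXAMPLES = [(1.0, 0), (2.0, 0), (3.0, 0), (4.0, 1), (5.0, 1), (6.0, 1)]


def rule_based(hours):
    """Explicit programming: a human hard-codes the decision threshold."""
    return 1 if hours >= 3.5 else 0


def count_mistakes(threshold, examples):
    """Count how many examples a given threshold classifies wrongly."""
    return sum((1 if x >= threshold else 0) != y for x, y in examples)


def learn_threshold(examples):
    """Learning: choose the threshold that makes the fewest mistakes on the data."""
    values = sorted(x for x, _ in examples)
    # Candidate thresholds are the midpoints between neighbouring data points.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    return min(candidates, key=lambda t: count_mistakes(t, examples))


def learned_rule(hours, threshold):
    """Apply a threshold that was learned from data rather than typed in."""
    return 1 if hours >= threshold else 0


threshold = learn_threshold(EXAMPLES)
print("learned threshold:", threshold)
print("prediction for 5.5 hours:", learned_rule(5.5, threshold))
```

In the first function the programmer supplies the number 3.5. In the second, the program finds its own threshold from the examples, so changing the data changes the rule without anyone rewriting the code.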
Then, as a subset of Machine Learning, we have Deep Learning. It is also called Deep Neural Networks, and it is inspired by Neuroscience and Cognitive Science. Since nobody knows exactly how our brain works, the system can reflect only inspirations derived from our speculation. In addition, the philosophy of Connectionism has a strong influence on the architecture of Deep Learning.
Overall, reality today is far from this Venn diagram. In the past, the success stories of Machine Learning were overemphasized. More recently, the shortcomings of AI have gradually been acknowledged in the public domain. So, let’s take this Venn diagram as a “long-term goal setting” rather than a reflection of the status quo.
Here is another factor that distinguishes Deep Learning from Conventional Machine Learning models: its scalability. The chart above illustrates this notion.
Traditional ML models reach a plateau once they have been fed a certain amount of data.
Deep Learning, on the other hand, keeps improving its accuracy as we feed more data into the system.
Scalability is one of the primary advantages of Deep Learning. The flip side is that Deep Learning is data hungry: it requires a massive amount of data to achieve high performance. Scalability and data hunger are two sides of the same coin.
FYI, I borrowed the chart above from Professor Andrew Ng’s presentation. Professor Ng is a leading educator in Machine Learning. He is very good at offering useful inspiration that helps students understand complex subjects in the Deep Learning field. Here is the link to his slides: https://www.slideshare.net/ExtractConf
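The plateau-versus-scaling idea in this section can be mimicked with two stylized curves. Please read the sketch below as a cartoon only: the curve shapes, the ceilings (0.80 and 0.98), and the data scales are invented assumptions, not measurements and not values taken from Professor Ng's chart. The function names are hypothetical as well.

```python
import math


def traditional_ml_score(n_samples):
    """Stylized curve (assumed shape): improves quickly, then flattens early."""
    return 0.80 * (1 - math.exp(-n_samples / 1_000))


def deep_learning_score(n_samples):
    """Stylized curve (assumed shape): starts slowly, keeps improving far longer."""
    return 0.98 * (1 - math.exp(-n_samples / 50_000))


# Print both stylized scores over several orders of magnitude of data.
for n in (1_000, 10_000, 100_000, 1_000_000):
    print(f"{n:>9,} samples  "
          f"traditional={traditional_ml_score(n):.3f}  "
          f"deep={deep_learning_score(n):.3f}")
```

With these made-up curves, the traditional model is ahead when data is scarce and stops improving early, while the deep model needs far more data before it overtakes. That is the "data hungry" flip side of scalability in miniature.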
Obsession of AI: programs that learn and reason like humans
Artificial Intelligence engineers share an obsession: to reach human-level performance and even go beyond it.
The gentleman in the picture is responsible for this obsession. He is Alan Turing, a prominent British mathematician of the 20th century. Turing is considered to be one of the founding fathers of modern Artificial Intelligence. During WWII, he played a critical role in breaking the encryption of the Enigma machine used by Nazi Germany. In the 1950s, he envisioned that machines would sooner or later exhibit behaviours indistinguishable from human behaviours. His Turing Test is still influential in setting the goals of artificial intelligence today. In a way, Turing inaugurated the convention of comparing a machine’s ability with human performance.
Deep Learning History
Deep Learning is not a new idea. It has existed since the 1940s and has gone through three waves of waxing and waning. Here is a Google Ngram chart that tracks the frequency of five terms associated with Deep Learning since the 1940s.
Deep Learning has gone by three brand names over time:
In the first wave, it was called Cybernetics.
In the second wave, it was called Neural Networks.
And today, we are in the hype of its 3rd wave, under the new brand name of Deep Learning.
One of the most influential early works was a model called the Perceptron, developed by an American psychologist, Frank Rosenblatt, in the late 50s and the early 60s. He managed to incorporate the basic architecture of Deep Learning into the model. Nevertheless, due to the computational power constraints of that time, the Perceptron could not advance the sophistication of its architecture (specifically, by stacking enough layers) to achieve a satisfactory level of performance. As an unfortunate consequence, neural networks were dismissed by critics and their popularity declined. This episode illustrates a historical lesson: hardware limitations can materially constrain both the imagination and the implementation of Deep Learning. This lesson still resonates today and will continue to do so in the future.
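For readers who would like to see what such a model looks like, here is a minimal sketch of the classic single-layer perceptron learning rule in plain Python. It is not Rosenblatt's original implementation. The AND-gate training data, the function names, the learning rate, and the epoch limit are all assumptions chosen to keep the example small.

```python
# Toy training data (an assumption): the logical AND of two binary inputs.
AND_GATE = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]


def predict(weights, bias, inputs):
    """Weighted sum of the inputs followed by a step function."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


def train_perceptron(samples, epochs=20, learning_rate=1.0):
    """Nudge weights and bias after every misclassified sample."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            if error != 0:
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, inputs)]
                bias += learning_rate * error
                mistakes += 1
        if mistakes == 0:  # a full pass with no mistakes: stop early
            break
    return weights, bias


weights, bias = train_perceptron(AND_GATE)
for inputs, target in AND_GATE:
    print(inputs, "->", predict(weights, bias, inputs), "(target:", target, ")")
```

This single layer can only draw one straight dividing line through its inputs. That is enough for a simple task like AND, and it hints at why deeper stacks of layers were needed for harder problems.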
As a matter of fact, thanks to the hardware revolution, Deep Learning in this 3rd wave has gained momentum and is rising together with three other terms: Artificial Intelligence, Neural Networks, and Machine Learning. Combined, they are shaping the biggest wave ever.
Some sceptics argue that Deep Learning has already passed the peak of this 3rd wave. The Gartner Hype Cycle in the next chart plotted new innovations on a stylized hype-cycle curve in 2018. You can see Deep Neural Nets (Deep Learning) at the peak of the wave.
As a matter of fact, in recent years, the shortcomings of Deep Learning have become increasingly acknowledged in the public domain. We will discuss blind spots of Deep Learning in Chapter 7 with a special focus on ‘Underspecification’. Are we already seeing negative signals for the coming 3rd dark age of Deep Learning?
Regardless of what the future holds, both Machine Learning and Deep Learning have produced remarkable, transformative successes that conventional science could not achieve. We should learn both their strengths and their weaknesses.
My intent for this series
Throughout this series, I would like to outline the basic intuitions and inspirations that I gained from my own Machine Learning journey. At the beginning, it was not easy for me to grasp the high-level conceptual framework of Machine Learning and Deep Learning. Whatever the real reason might have been, getting the right intuitions and inspirations helped me a great deal in digesting the complexity of the subject.
Now, reflecting on my own experience, my intention for this series is to share those intuitions with beginners. In this context, as a general policy throughout this series, I would like to draw a rough sketch rather than go into details.
Fortunately, there are oceans of open resources where you can dive deeper into these topics. Instead of creating redundant work, my intention in this series is to present guiding intuitions that help readers navigate their own ML & DL journeys.
Please continue with the rest of this series from the links here: