Mark Wickham 2018
Mark Wickham Practical Java Machine Learning https://doi.org/10.1007/978-1-4842-3951-3_1
1. Introduction
This chapter establishes the foundation for the book.
It describes what the book will achieve, who the book is intended for, why machine learning (ML) is important, why Java makes sense, and how you can deploy Java ML solutions.
The chapter includes the following:
A review of all the terminology of AI and its subfields, including machine learning
Why ML is important and why Java is a good choice for implementation
Setup instructions for the most popular development environments
An introduction to ML-Gates, a development methodology for ML
The business case for ML and monetization strategies
Why this book does not cover deep learning, and why that is a good thing
When and why you may need deep learning
How to think creatively when exploring ML solutions
An overview of key ML findings
1.1 Terminology
As artificial intelligence and machine learning have surged in popularity, a lot of confusion has arisen around the associated terminology. It seems that everyone uses the terms differently and inconsistently.
Here are quick definitions for some of the abbreviations used in the book:
Artificial intelligence (AI) : Anything that pretends to be smart.
Machine learning (ML) : A generic term that includes the subfields of deep learning (DL) and classic machine learning (CML).
Deep learning (DL) : A class of machine learning algorithms that utilize neural networks.
Reinforcement learning (RL) : A learning style in which the system receives feedback, but not necessarily for each input.
Neural networks (NN) : A computer system modeled on the human brain and nervous system.
Classic machine learning (CML) : A term that more narrowly defines the set of ML algorithms that excludes the deep learning algorithms.
Data mining (DM) : Finding hidden patterns in data, a task typically performed by people.
Machine learning gate (MLG): The book will present a development methodology called ML-Gates. The gate numbers start at ML-Gate 5 and conclude at ML-Gate 0. MLG3, for example, is the abbreviation for ML-Gate 3 of the methodology.
Random Forest (RF) algorithm : A learning method for classification, regression and other tasks, that operates by constructing decision trees at training time.
Naive Bayes (NB) algorithm : A family of probabilistic classifiers based on applying Bayes' theorem with strong (naive) independence assumptions between the features.
K-nearest neighbor (KNN) algorithm : A non-parametric method used for classification and regression where the input consists of the k closest training examples in the feature space.
Support vector machine (SVM) algorithm : A supervised learning model, with associated learning algorithms, that analyzes data for classification and regression.
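To make one of these definitions concrete, the following is a minimal sketch of KNN classification in plain Java. It is an illustration only, not code from the book or from any library: the class name KnnSketch, the choice of Euclidean distance, the simple majority vote, and the toy data points are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration class; the name and data are assumptions.
class KnnSketch {

    // Euclidean distance between two feature vectors of equal length.
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Returns the majority label among the k training examples closest to the query.
    // On a tied vote, the label that reached the winning count first (nearest first) wins.
    static String classify(double[][] features, String[] labels, double[] query, int k) {
        Integer[] order = new Integer[features.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Sort training indices by distance to the query, nearest first.
        Arrays.sort(order, Comparator.comparingDouble(i -> distance(features[i], query)));

        Map<String, Integer> votes = new HashMap<>();
        String best = null;
        int bestCount = 0;
        for (int n = 0; n < Math.min(k, order.length); n++) {
            String label = labels[order[n]];
            int count = votes.merge(label, 1, Integer::sum);
            if (count > bestCount) {
                bestCount = count;
                best = label;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy training set: two well-separated clusters labeled A and B.
        double[][] features = {
            {1.0, 1.0}, {1.2, 0.8}, {0.9, 1.1},
            {5.0, 5.0}, {5.2, 4.9}, {4.8, 5.1}
        };
        String[] labels = {"A", "A", "A", "B", "B", "B"};

        System.out.println(classify(features, labels, new double[] {1.1, 1.0}, 3)); // prints A
        System.out.println(classify(features, labels, new double[] {5.1, 5.0}, 3)); // prints B
    }
}
```

With k = 3, the first query is labeled A because all three of its nearest training examples carry that label. The sketch sorts every training example for each query, which is fine for a toy data set but would be slow on a large one.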
Much of the confusion stems from the various factions or domains that use these terms. In many cases, they created the terms and have been using them for decades within their domain.
Table 1-1 shows the domains that have historically claimed ownership of each of the terms. The terms are not new. Artificial intelligence is a general term that was already in use back in the 1970s.
Table 1-1
AI Definitions and Domains
Term | Definition | Domain |
---|---|---|
Statistics | Quantifies the data. DM, ML, DL all use statistics to make decisions. | Math departments |
Artificial intelligence (AI) | The study of how to create intelligent agents. Anything that pretends to be smart. We program a computer to behave as an intelligent agent. It does not have to involve learning or induction. | Historical, Marketing, Trending. |
Data mining (DM) | Explains and recognizes meaningful patterns. Unsupervised methods. Discovers the hidden patterns in your data that can be used by people to make decisions. A complete commercial process flow, often on large data sets (Big Data). | Business world, business intelligence |
Machine learning (ML) | A large branch within AI in which we build models to predict outcomes. Uses algorithms and has a well-defined objective. We generalize existing knowledge to new data. It's about learning a model to classify objects. | Academic departments |
Deep learning (DL) | Applies neural networks for ML. Pattern recognition is an important task. | Trending |
The definitions in Table 1-1 represent my consolidated understanding after reading a vast amount of research and speaking with industry experts. You can find huge philosophical debates online supporting or refuting these definitions.
Do not get hung up on the terminology. Usage of the terms often comes down to the domain perspective of the entity involved. A mathematics major who is doing research on DL algorithms will describe things differently from a developer who is trying to solve a problem by writing application software. The following is a key distinction from the definitions:
Data mining is all about humans discovering the hidden patterns in data, while machine learning automates the process and allows the computer to perform the work through the use of algorithms.
It is helpful to think about each of these terms in the context of infrastructure and algorithms. Figure 1-1 shows a graphical representation of these relationships. Notice that statistics are the underlying foundation, while artificial intelligence on the right-hand side includes everything within each of the additional subfields of DM, ML, and DL.
Machine learning is all about the practice of selecting and applying algorithms to our data.
I will discuss algorithms in detail in Chapter . The algorithms are the secret sauce that enables the machine to find the hidden patterns in our data.
Figure 1-1
Artificial intelligence subfield relationships
1.2 Historical
The term artificial intelligence is hardly new. It has been in use since at least the 1970s. A quick scan of reference books will turn up a variety of definitions that have changed over the decades. Figure 1-2 shows a representation of 1970s AI, a robot named Shakey, alongside a representation of what it might look like today.
Figure 1-2
AI, past and present
Most historians agree that there have been a couple of AI winters. They represent periods of time when AI fell out of favor for various reasons, something akin to a technological ice age. They are characterized by a trend that begins with pessimism in the research community, followed by pessimism in the media, and finally by severe cutbacks in funding. These periods, along with some historical context, are summarized in Table .