Preface
In 1983, the movie WarGames came out. I was a pre-teen and I was absolutely engrossed: by the possibility of a nuclear apocalypse, by the almost magical way the lead character interacted with computer systems, but mostly by the potential of machines that could learn. I spent years studying the strategic nuclear arsenals of the East and the West (fortunately, with the naivete of a tween), but it took almost 10 years before I took my first serious steps in computer programming. Teaching a computer to do a set process was amazing. Learning the intricacies of complex systems and bending them around my curiosity was a great experience. Still, I had a large step forward to take. A few short years later, I worked with my first program that was explicitly designed to learn. I was blown away and I knew I had found my intellectual home. I want to share the world of computer programs that learn with you.
Who do I think you are? I've written Machine Learning (with Python) for Everyone for the absolute beginner to machine learning. Even more so, you may well have very little college-level mathematics in your toolbox, and I'm not going to try to change that. While many machine learning books are very heavy on mathematical concepts and equations, I've done my best to minimize the amount of mathematical luggage you'll have to carry. I do expect, given the book's title, that you'll have some basic proficiency in Python. If you can read Python, you'll be able to get a lot more out of our discussions. While many books on machine learning rely on mathematics, I'm relying on stories, pictures, and Python code to communicate with you. There will be the occasional equation. Largely, these can be skipped if you are so inclined. But, if I've done my job well, I'll have given you enough context around the equation to maybe, just maybe, understand what it is trying to say.
Why might you have this book in your hand? The least common denominator is that all of my readers want to learn about machine learning. Now, you might be coming from very different backgrounds: a student in an introductory computing class focused on machine learning, a mid-career business analyst who all of a sudden has been thrust beyond the limits of spreadsheet analysis, a tech hobbyist looking to expand her interests, a scientist needing to analyze data in a new way. Machine learning is permeating society. Depending on your background, Machine Learning (with Python) for Everyone has different things to offer you. Even a mathematically sophisticated reader who is looking to break into machine learning using Python can get a lot out of this book.
So, my goal is to take someone with an interest or need to do some machine learning and teach them the process and most important concepts of machine learning in a concrete way using the Python scikit-learn library and some of its friends. You'll come away with overall patterns and strategies, pitfalls and gotchas, that will be applied in every learning system you ever study, build, or use.
Many books that try to convey mathematical topics, like machine learning, do so by presenting equations as if they tell a story to the uninitiated. I think that leaves many of us (even those of us who like mathematics!) stuck. For myself, I build a far better mental picture of the process of machine learning by combining visual and verbal descriptions with running code. I'm a computer scientist at heart and by training. I love building things. Building things is how I know that I've reached a level where I really understand them. You might be familiar with the phrase, "If you really want to know something, teach it to someone." Well, there's a follow-on: "If you really want to know something, teach a computer to do it!" That's my take on how I'm going to teach you machine learning. With minimal mathematics, I want to give you the concepts behind the most important and frequently used machine learning tools and techniques. Then, I want you to immediately see how to make a computer do it. One note: we won't be programming these methods from scratch. We'll be standing on the shoulders of other giants and using some very powerful and time-saving, pre-built software libraries (more on that shortly).
We won't be covering all of these libraries in great detail: there is simply too much material to do that. Instead, we are going to be practical. We are going to use the best tool for the job. I'll explain enough to orient you to the concepts we're using and then we'll get to using it. For our mathematically inclined colleagues, I'll give pointers to more in-depth references they can pursue. I'll save most of this for end-of-the-chapter notes so the rest of us can skip it easily.
If you are flipping through this introduction, deciding if you want to invest time in this book, I want to give you some insight into things that are out of scope for us. We aren't going to dive into mathematical proofs or rely on mathematics to explain things. There are many books out there that follow that path, and I'll give pointers to my favorites at the ends of the chapters. Likewise, I'm going to assume that you are fluent in basic- to intermediate-level Python programming. However, for more advanced Python topics, and for things that show up from a third-party package like NumPy or Pandas, I'll explain enough of what's going on so that you can understand it and its context.
Our main focus is on the techniques of machine learning. We will investigate a number of learning algorithms and other processing methods along the way. However, our goal is not completeness. We'll discuss the most common techniques. We will only glance briefly at two large sub-areas of machine learning: graphical models and neural or deep networks. But, we will see how the techniques we focus on relate to these more advanced methods.
Another topic we won't cover is implementing specific learning algorithms. We'll build on top of the algorithms that are already available in scikit-learn and friends: we'll create larger solutions using them as components. But, someone has to implement the gears and cogs inside the black box we funnel data into. If you are really interested in implementation aspects, you are in good company: I love them! Have all your friends buy a copy of this book, so I can argue I need to write a follow-up that dives into these lower-level details.
I must take a few moments to thank several people who have contributed greatly to this book. My editor at Pearson, Debra Williams, has been instrumental in every phase of this book's development. From our initial meetings, to her probing for a topic that might meet both our needs, to gently shepherding me through many (many!) early drafts, to constantly giving me just enough of a push to keep going, and finally climbing the steepest parts of the mountain at its peak ... through all of these phases, Debra has shown the highest degrees of professionalism. I can only respond with a heartfelt thank you.
My wife, Dr. Barbara Fenner, also deserves more praise and thanks than I can give her in this short space. In addition to the normal burdens that any partner of an author must bear, she also served as my primary draft reader and our intrepid illustrator. All of the non-computer-generated diagrams in this book are thanks to her hard work. While this is not our first joint academic project, it has turned into the longest. Her patience is, by all appearances, never ending. Barbara,