Building Recommender Systems with Machine Learning and AI
Frank Kane
Sundog Education
http://www.sundog-education.com/
Copyright 2018 Sundog Software LLC, DBA Sundog Education
Getting Started
Introduction
Hello! If you're reading this, you've purchased this course on recommender systems either in PDF or book form. Thank you! If you're also interested in the video version of it, you'll find it at www.sundog-education.com.
I'm Frank Kane, CEO of Sundog Education. I spent nine years at Amazon.com, where I dedicated most of my career to building various parts of their recommendation systems and managing teams responsible for them. You know "people who bought this also bought"? Yeah, I ran that for a while, along with their personalization platform team. I have a lot of real-world experience to share with you, combined with the latest research I've done on new developments in the field since I left Amazon. Recommendations are one of the most fascinating applications of machine learning, and also one of the most lucrative: recommending products is central to Amazon's success, recommending movies is central to YouTube and Netflix, and you can even think of Google as just recommending web pages and ads to people.
This isn't going to read like a typical book. What you have here are the slides I've used for presenting this information in video or live form, along with the script I prepared to accompany these slides. So for each topic, you'll see an image of the slide associated with it, along with the text I wrote to explain each slide. But it works surprisingly well: it has all the textual information you'd get in a typical book, but with many more visual aids, and a more casual, conversational tone than you'd find in a textbook. It's ideal for visual learners who just find reading material a lot more efficient than listening to someone read it aloud in a video. I've written this script with this written version of the course in mind, and have attempted to make sure everything works just as well in print as it does in a video presentation.
You'll find that code walk-throughs work a little bit differently in this format than what you may be used to from other technical books. Early in this course, you'll be directed to download all of the code that accompanies it. When we get to slides that review this code, you'll want to pull up that code on your computer as directed in those lectures, and refer to it alongside these written notes that explain what each section of the code does. Really, there's no other practical way to do it: recommender systems involve a fair amount of code, and in most cases it simply won't fit on one written page. I promise to be very specific about what parts of the code I'm talking about as we go through it, so you won't get lost.
Oh, and if Amazon's lawyers are reading this, don't worry. I've been careful to only cover algorithms and techniques that have appeared publicly, in print. I'm not revealing any inside, confidential information here (although most of what we did at Amazon has been published at this point, anyhow).
I know you're itching to go hands-on and produce some recommendations on your own, so let's dive right in and get all the software and data you need installed!
Getting Set Up
In the next few minutes, you're going to install a Python development environment on your PC if you don't have one already. Then, you'll install a package for Python called Surprise that makes developing recommender systems easy. Finally, we'll download the course materials including some real movie rating data, and we'll make movie recommendations for a real person right here in lecture 1.
So, let's do this! The first thing you need is some sort of scientific Python environment that supports Python 3. That means a Python environment that's made for data scientists, like Anaconda or Enthought Canopy. If you already have one, then great, you can skip that step. But if not, let's get Anaconda installed on your system, and we'll also get the course materials you need while we're at it.
Now, if you're the sort of person who prefers to just follow written instructions for things like this, you can head over to my website at the URL shown here, sundog-education.com/RecSys. Pull it up anyhow, as we're going to refer to this page as we set things up in this video. Remember to pay attention to capitalization: the R and S in RecSys need to be capitalized.
You'll also have a chance to join the Facebook group for this course, where you can collaborate with fellow students, and you'll be offered a chance to stay in touch with me as well.
In this course, we're going to use the Python programming language, as it's pretty easy to pick up. So if you don't already have a development environment for Python 3 installed, you'll need to get one. I recommend Anaconda; it's free and widely used. Let's head to www.anaconda.com/download, and select the installer for whatever operating system you're using. For me, that's Windows 64-bit, and be sure to select the Python 3 version, not Python 2. Once it downloads, go through the installer, making sure to install it on a drive that has plenty of space available (at least 3 GB).
Now that Anaconda is installed, you can launch it and select the Environments tab. To keep things clean, let's set up an environment just for this course. Click on Create and let's call it RecSys (that's shorthand for recommender systems, by the way). We want a Python environment, for whatever current version of Python 3 is offered to you. It will take a few moments for that environment to be created.
Next we need to install a Python package that makes developing recommender systems easier, called Surprise. To do that, click on the arrow next to the RecSys environment you just made, and open up a terminal from it. Now in the terminal, run:
conda install -c conda-forge scikit-surprise
If prompted, hit y to continue, and let it do its thing.
When it's done, we can close this terminal window.
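If you'd like to confirm that the setup took, a quick sanity check along these lines can help. This is just a sketch, not part of the course materials: the function name check_environment is made up for illustration, and it assumes the scikit-surprise package is imported under the module name surprise. Run it with the Python from your RecSys environment.

```python
import importlib.util
import os
import sys


def check_environment(package="surprise"):
    """Return a small report about the running Python environment.

    check_environment is a hypothetical helper, not part of Surprise
    or Anaconda.
    """
    return {
        # The course needs Python 3, not Python 2.
        "python_version": ".".join(str(n) for n in sys.version_info[:3]),
        "is_python3": sys.version_info.major == 3,
        # conda sets CONDA_DEFAULT_ENV while an environment is active;
        # this is None when no conda environment is active.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
        # find_spec returns None when the package cannot be imported.
        "package_installed": importlib.util.find_spec(package) is not None,
    }


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If package_installed comes back False, the most likely cause is that the script ran outside the RecSys environment, or that the conda install step did not finish.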
Next, we need to download the scripts and data used in this course. From our course setup page at sundog-education.com/RecSys, you'll find a link to the materials. Let's go ahead and download that. When it's done, we'll unzip it, and put it somewhere appropriate, like your documents folder.
Throughout this course, we're going to build up a large project that recommends movies in many different ways. So we're going to need some data to work with. Back at our course setup page, you'll find a link to the MovieLens data set. It's a subset of 100,000 real movie ratings from real people, along with some information about the movies themselves. Download that, and unzip it.
When it's unzipped, move the resulting ml-latest-small folder inside the course materials folder you made earlier.
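To check that the data landed where you expect, you could peek at it with a few lines of plain Python. Treat this as a rough sketch under some assumptions: that the ml-latest-small folder contains a file named ratings.csv with a header row of userId, movieId, rating, timestamp, and that you run it from inside your course materials folder. The function name summarize_ratings is made up for illustration.

```python
import csv
from pathlib import Path


def summarize_ratings(ratings_path):
    """Count ratings, distinct users, and distinct movies in a ratings CSV.

    Assumes a header row that includes userId and movieId columns.
    """
    users, movies, count = set(), set(), 0
    with open(ratings_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            users.add(row["userId"])
            movies.add(row["movieId"])
            count += 1
    return {"ratings": count, "users": len(users), "movies": len(movies)}


if __name__ == "__main__":
    # Assumed location: adjust to wherever you put the unzipped folder.
    path = Path("ml-latest-small") / "ratings.csv"
    if path.exists():
        print(summarize_ratings(path))
    else:
        print(f"Could not find {path}; check where the folder was moved.")
```

The ratings count it prints should be in the neighborhood of the 100,000 mentioned above; a "could not find" message usually means the folder ended up somewhere other than the course materials folder.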
Now, we have everything we need! Let's make some movie recommendations. Back at Anaconda Navigator, make sure the RecSys environment we made is still selected, and now click on the Home icon.