SQL for Marketers
Dominate data analytics, data science, and big data
By: The LazyProgrammer ( http://lazyprogrammer.me )
Introduction
More and more companies these days are learning that they need to make DATA-DRIVEN decisions.
With big data and data science on the rise, we have more data than we know what to do with.
One of the basic languages of data analytics is SQL, which is used for many popular databases including MySQL, Postgres, Microsoft SQL Server, Oracle, and even big data solutions like Hive and Cassandra.
Im going to let you in on a little secret. Most high-level marketers and product managers at big tech companies know how to manipulate data to gain important insights. No longer do you have to wait around the entire day for some software engineer to answer your questions - now you can find the answers directly, by yourself, using SQL!
Do you want to know how to optimize your sales funnel using SQL, look at the seasonal trends in your industry, and run a SQL query on Hadoop? Then join me now in my new book, SQL for marketers: Dominate data analytics, data science, and big data!
Why marketers and product managers (PMs) need to know SQL
Are you tired of depending on crufty analytics software?
Do you have to ask an engineer to help you whenever you have a question about the data?
This is not ideal and wont help you do your job efficiently.
SQL, short for structured query language, is a language that can be used for all kinds of databases - from the tiny databases stored in your iPhone, to large big data databases that span multiple continents.
Engineers have done a great job of creating these different types of complex data stores, while still allowing you to use the same language, more or less, for all of them.
What does that mean for you?
It means as long as you know SQL, you can take advantage of ALL of this software, and gain insights into this data, no matter what kind of database it is stored in, as long as it supports SQL.
You can ask questions like:
How many people are falling into each stage of the sales funnel?
What is my year over year profit?
Are there any differences in the demographics between the people who are buying product X and product Y?
What is our most profitable month?
What are the seasonal trends in our industry?
Im an engineer, so I probably havent even thought of all the questions youve already had for years! But I guarantee you, knowing SQL will help you answer these questions.
On various teams Ive worked on in the tech world - Ive noticed that marketing people and product managers have SQL skills and sometimes even coding skills! So if you are looking to not only make your day more productive, but make yourself more marketable to employers and catch up to the other go-getters in your field - then you should most definitely learn SQL.
Chapter 1: Overview of SQL databases and Installing SQL
Overview of SQL Databases
In this section Im going to go over a number of different technologies that use SQL.
You have have already heard of MySQL. Its very popular on the web, and if you get hosting from sites like Namecheap, it usually comes with PHP and MySQL pre-installed.
Another similar database is Postgres. Postgres has made a lot of great improvements recently, one being JSON support which allows it to have a lot of functionality similar to MongoDB.
MySQL and Postgres can be run on your local machine, so you can have a PHP application or a Ruby on Rails application running alongside MySQL on the same server. But you can imagine this eats up resources.
So what engineers usually do at scale is to put these databases on their own servers. Now you can have multiple machines running the application code, all talking to the same database.
What happens when data gets really really big? Well then we cant even store it all on one machine. You may have heard of a popular big data technology called Hadoop. The Hadoop file system allows us to do 2 interesting things. 1) is that it splits up our data into chunks. So we could potentially store a 1TB file. It just means we would have more chunks. We can split these chunks across different machines. 2) As you know, machines can fail, and the more machines you have, the higher the chance of failure. So another thing we do is replication. We make multiple copies of these chunks in different places, so that if one machine fails, we still have a copy somewhere else.
Hive is one framework that allows you to use Hadoop as a database, and supports a similar language to SQL called Hive QL. Cassandra is another big database technology that supports a similar language, called Cassandra Query Language or CQL.
At the end of this book well look at just plain SQL on a system called Spark which runs on Hadoop.
Throughout the entirety of this book, were going to use yet another type of database, called SQLite.
Why SQLite?
SQLite is a file-based database, so it isnt as large scale as some of the others, but its more than enough so support a wide range of scenarios.
SQLite is even on your iPhone! Every app can have its own SQLite database. Many of the apps on your computer do the same thing.
Another great thing about SQLite is its super easy to install and use no matter what system youre on - so if you use Windows, Linux, or Mac, you can do all the exercises in this course.
Installing SQLite on Mac, Windows, or Linux
If you have a Mac or you use Linux, then youre in luck. SQLite already comes with the Mac, so theres nothing to do here.
If you are using Windows and have a machine powerful enough to run VirtualBox, I would recommend grabbing a lightweight version of Linux like Lubuntu or Xubuntu.
In the Linux console, you simply have to enter the command:
sudo apt-get install sqlite
If you really want to stick with Windows, then head over to SQLite.org and go to the download page. Download the DLL zip file and the tools zip file, since well be using the command line interface throughout this course.
You can unzip these files to C:\sqlite, and add this directory to your PATH environment variable.
You can find instructions for how to do that here: https://stackoverflow.com/questions/23400030/windows-7-add-path
To check that youve done it correctly, open up cmd.exe, and type in sqlite3 - it should open the SQLite command line shell.
Chapter 2: Relational databases and getting data
What is a relational database?
In this section Im going to answer the question - what is a relational database? - and give you some concrete examples.
A relational database is a collection of tables. As you know a table has rows and columns.
The columns are referred to as fields or attributes. Each row is a new record.
field1 | field2 |
record1 |
record2 |
So for example I could have a table called users, with the fields name and email:
Users:
name | email |
Bob | bob@gmail.com |
Jane | jane@gmail.com |
Each table, or relation, usually refers to a specific entity. So users was one example. Products might be another example. Orders would be another example.
The relational part of a relational database comes from the fact that different tables can be related to each other.
For example, if your users table had an id field like this:
Users:
id | name | email |
1 | Bob | bob@gmail.com |
| Jane | jane@gmail.com |
And an orders table like this:
Orders:
Next page