I began my business life as an entrepreneur in Hong Kong. I ran trade exhibitions, imported coffee from Kenya, and started and operated two restaurants, one of which (unusually for Hong Kong) served vegetarian food. These businesses were profitable, but I would have saved myself a great deal of stress, and done better, if I had used some business intelligence informed by statistics to improve my decision-making. Looking back, I wish I had been able to think through and write up analyses such as these, among many others:
a. calculating optimal restaurant staffing levels
b. analyzing sales over time, including trends and seasonality
c. comparing receipts per customer by restaurant type, and analyzing the strength and type of any differences
d. predicting sales for potential exhibitors, with visualizations of various what-if scenarios
e. gaining deeper insights by visualizing my data
f. winning over more partners and investors with better visual and written presentations
I'm sure that there are many more people like me, aware that they ought to be doing more with the data they collect as part of normal business operations, but uncertain of how to go about it. There is no shortage of textbooks and manuals, but these don't seem to get to the hands-on applications quickly enough. This book is for people like me, and it has two components: the analysis, which uncovers the underlying story in the data, and the presentation of that story.
1.1 This book is a little different
Most books on statistics introduce statistical techniques chapter by chapter. Instead, I've written this book to reflect the way many people learn: the chapters are structured by the type of question you might want to ask. Examples of these questions are summarized below and at the beginning of each chapter. The chapters themselves don't include much math or technical detail; that material is placed in a glossary towards the end of the book.
Many of the Excel procedures are linked to screen casts that I prepared to illustrate each procedure. The data-sets used in the book are available from my Dropbox public folder: just click on the hyperlink that appears next to each worked example. With the data-sets open in Excel, you can follow along at your own pace (and then do the same with your own data). I have used Tableau for some visualizations, and you will find links to my workbooks and screen casts.
1.2 Chapter descriptions
If you're reading this book, then you are very likely already engaged in business and would like to know how to take that business to new heights. Take a look through the chapter descriptions that follow and then go straight to the chapter you need. If you think you might need a bit of a statistics refresher, look through the Glossary in Chapter 16 first. A very good free statistics textbook is available from OpenIntro (https://www.openintro.org/stat/).
Chapter 2 introduces Tableau Public (http://www.tableausoftware.com/public/), which is a free data visualization tool. While Excel does have graphing tools which are easy to use, the results can look a little clunky. Tableau helps us to merge data from different sources and create remarkable visualizations which can then be easily shared. I'll be illustrating results throughout this book with either Tableau, Excel or both. One caution: if you publish your results to the web using Tableau Public, your data is also published. If this is a problem, there is an option to pay for an enhanced version of Tableau. The screenshot below shows work done on changes in land use in the Delta region of British Columbia. By clicking on the link, you can open the workbook and alter the settings. You can filter by year (see top right) and also by land use type. Thanks to Malcolm Little for his work on this project.
Tableau showing changes in landcover
https://public.tableau.com/views/DeltaCropCategories1996-2011/Sheet1?:embed=y&:showTabs=y&:display_count=yes
Tableau has training and demonstration videos available on its website, and there are plenty of examples out there. The screen casts which Tableau provides (available at its homepage) are probably enough. Where I have found some technique (such as boxplots) particularly tricky, I have created screen casts for this book.
The image below shows data we will use in the regression chapter. You run a trucking company and from your logbooks extract the distance, duration and number of deliveries for some delivery jobs. The image shows a trend line, with distance on the horizontal axis and time taken on the vertical axis. The points on the scatterplot are sized by number of deliveries. You can see that more deliveries increase the time, as one would expect. Using regression (Chapter 6), we will work out a model which can show how much extra you should charge for each delivery.
Tableau showing distances, times, and deliveries
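For readers who like to see the arithmetic behind a model of this kind, here is a minimal sketch in Python, which is not one of the tools used in this book. It fits time taken as a straight-line function of distance and number of deliveries by ordinary least squares. The logbook figures, the function name `fit_least_squares` and the variable names are all invented for illustration; they are not the data shown in the image.

```python
# Hypothetical logbook rows, invented for illustration only:
# (distance in km, number of deliveries, time taken in hours)
jobs = [
    (100, 1, 2.9),
    (150, 3, 4.7),
    (200, 2, 5.3),
    (250, 4, 7.1),
    (300, 2, 7.3),
    (350, 5, 9.5),
]


def fit_least_squares(predictors, outcomes):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    Returns [intercept, coefficient_1, coefficient_2, ...].
    """
    # Prepend 1.0 to each row so the first coefficient is the intercept.
    X = [[1.0] + [float(v) for v in row] for row in predictors]
    k = len(X[0])
    # Build the augmented matrix [X'X | X'y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * y for x, y in zip(X, outcomes))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


predictors = [(distance, deliveries) for distance, deliveries, _ in jobs]
hours = [time_taken for _, _, time_taken in jobs]
intercept, per_km, per_delivery = fit_least_squares(predictors, hours)

print(f"Fixed time per job:      {intercept:.2f} hours")
print(f"Extra time per km:       {per_km:.3f} hours")
print(f"Extra time per delivery: {per_delivery:.2f} hours")
```

The coefficient on deliveries estimates the extra time each delivery adds once distance is accounted for, which is the figure you would price when deciding how much extra to charge per delivery.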
Chapter 3: Writing up your findings. In most cases statistical analysis is done in order to help a decision-maker decide what to do. I am assuming that it is your job to think through the problem and assist the decision-maker by assembling and analyzing the data on his/her behalf. This chapter suggests ways in which you might write up your report, accompanied by visualizations to get across your message. There are also links to some helpful websites which discuss the preparation of slides and how to make presentations.
https://youtu.be/oFACLZLJWZI
Chapter 4: Data and how to get it. If you are considering collecting data yourself, through a survey for example, then you'll find this chapter useful. So-called big data is a hot topic, and so I include some discussion of it. The chapter also includes links to publicly available data sets which might be helpful.
Chapter 5 deals with tests for whether two or more quantities are the same or different. For example, you are a franchisee with three coffee-shops. You want to know whether daily sales are the same or different, and perhaps what factors cause any difference that you find. You want to know whether any difference is statistically significant, so that you can rule out the possibility that it occurred purely by chance. Perhaps the average age of the customers makes a difference in the sales? ANOVA (analysis of variance) uses theory very similar to that of regression, the subject of the next chapter and perhaps the most important in the book.
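As a rough illustration of the coffee-shop comparison, the sketch below computes the F statistic of a one-way ANOVA in Python, which is not one of the tools used in this book. The daily sales figures and the function name `one_way_anova_f` are invented for illustration. The statistic compares the variation between the shops' average sales with the variation within each shop from day to day.

```python
# Hypothetical daily sales (in hundreds of dollars) for three coffee-shops,
# invented for illustration only.
shop_a = [10, 12, 14]
shop_b = [20, 22, 24]
shop_c = [30, 32, 34]


def one_way_anova_f(groups):
    """Return (F statistic, between-group df, within-group df)."""
    all_values = [value for group in groups for value in group]
    grand_mean = sum(all_values) / len(all_values)

    # Variation of each group's mean around the grand mean.
    ss_between = sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2
        for group in groups
    )
    # Variation of each observation around its own group's mean.
    ss_within = sum(
        (value - sum(group) / len(group)) ** 2
        for group in groups
        for value in group
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


f_stat, df_between, df_within = one_way_anova_f([shop_a, shop_b, shop_c])
print(f"F = {f_stat:.1f} on ({df_between}, {df_within}) degrees of freedom")
```

A large F means the shops differ by more than day-to-day variation alone would explain. With (2, 6) degrees of freedom the 5% critical value is roughly 5.14, so an F well above that suggests the difference is unlikely to have occurred purely by chance.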