Machine Learning for OpenCV
Intelligent image processing with Python
Michael Beyeler
BIRMINGHAM - MUMBAI
Copyright © 2017 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: July 2017
Production reference: 1130717
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.
ISBN 978-1-78398-028-4
www.packtpub.com
Credits
Author: Michael Beyeler | Copy Editor: Manisha Sinha
Reviewers: Vipul Sharma, Rahul Kavi | Project Coordinator: Manthan Patel
Commissioning Editor: Veena Pagare | Proofreader: Safis Editing
Acquisition Editor: Varsha Shetty | Indexer: Tejal Daruwale Soni
Content Development Editor: Jagruti Babaria | Graphics: Tania Dutta
Technical Editor: Sagar Sawant | Production Coordinator: Deepika Naik
Foreword
Over the last few years, our machines have slowly but surely learned how to see for themselves. We now take it for granted that our cameras detect our faces in pictures that we take, and that social media apps can even recognize us and our friends in the photos that we upload from these cameras. Over the next few years, we will experience an even more radical transformation. Before long, cars will be driving themselves, our cellphones will be able to read and translate a sign in any language for us, and our X-rays and other medical images will be read and analyzed by powerful algorithms that will be able to accurately suggest a medical diagnosis, and even recommend effective treatments.
These transformations are driven by an explosive combination of increased computing power, masses of image data, and a set of clever ideas taken from math, statistics, and computer science. This rapidly growing intersection that is machine learning has taken off, affecting many of our day-to-day interactions with the world, and with each other. One of the most remarkable features of the current machine learning paradigm shift in computer vision is that it relies to a large extent on software tools that are freely available and developed by large groups of volunteers, hobbyists, scientists, and engineers in open source communities. This means that, in principle, the barriers to entry are also lower than ever: anyone who is interested in putting their mind to it can harness machine learning for image processing.
However, just like in a garden with many forking paths, the wealth of tools and ideas, and the rapid development of these ideas, underscores the need for a guide who can show you the way, and orient you in the right direction. I have some good news for you: having picked up this book, you are in the good hands of my colleague and collaborator Dr. Michael Beyeler as your guide. With his broad range of expertise, Michael is a hard-nosed engineer, computer scientist, and neuroscientist, as well as a prolific open source software developer. He has not only taught robots how to see and navigate through complex environments, and computers how to model brain activity, but he also regularly teaches humans how to use programming to solve a variety of different machine learning and image processing problems. This means that you will not only benefit from the sure-handed rigor of his expertise and experience, but also enjoy his thoughtfulness in teaching the ideas in his book, as well as a good dose of his sense of humor.
The second piece of good news is that this is going to be an exhilarating trip. There's nothing that matches the thrill of understanding that comes from putting together the pieces of the puzzle that go into solving a problem in computer vision and machine learning with code and data. As Richard Feynman put it: "What I cannot create, I do not understand". So, get ready to get your hands dirty (so to speak) with the code and data in the (open source!) code examples that accompany this book, and to get creative. Understanding will surely follow.
Ariel Rokem
Data Scientist, The University of Washington eScience Institute
About the Author
Michael Beyeler is a Postdoctoral Fellow in Neuroengineering and Data Science at the University of Washington, where he is working on computational models of bionic vision in order to improve the perceptual experience of blind patients implanted with a retinal prosthesis (bionic eye). His work lies at the intersection of neuroscience, computer engineering, computer vision, and machine learning. Michael is the author of OpenCV with Python Blueprints by Packt Publishing, 2015, a practical guide for building advanced computer vision projects. He is also an active contributor to several open source software projects, and has professional programming experience in Python, C/C++, CUDA, MATLAB, and Android.
Michael received a PhD in computer science from the University of California, Irvine, as well as an MSc in biomedical engineering and a BSc in electrical engineering from ETH Zurich, Switzerland. When he is not "nerding out" on brains, he can be found on top of a snowy mountain, in front of a live band, or behind the piano.
About the Reviewers
Vipul Sharma is a Software Engineer at a startup in Bangalore, India. He studied engineering in Information Technology at Jabalpur Engineering College (2016). He is an ardent Python fan and loves building projects on computer vision in his spare time. He is an open source enthusiast and hunts for interesting projects to contribute to. He is passionate about learning and strives to better himself as a developer. He blogs about his side projects at http://vipul.xyz. He also publishes his code at http://github.com/vipul-sharma20.