ANALYTICS AND
TECH MINING FOR
ENGINEERING
MANAGERS
SCOTT W. CUNNINGHAM
AND JAN H. KWAKKEL
Analytics and Tech Mining for Engineering Managers
Copyright Momentum Press, LLC, 2016.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopy, recording, or any other), except for brief quotations, not to exceed 400 words, without the prior permission of the publisher.
First published by Momentum Press, LLC
222 East 46th Street, New York, NY 10017
www.momentumpress.net
ISBN-13: 978-1-60650-510-6 (print)
ISBN-13: 978-1-60650-511-3 (e-book)
Momentum Press Engineering Management Collection
Collection ISSN: 2376-4899 (print)
Collection ISSN: 2376-4902 (electronic)
Cover and interior design by Exeter Premedia Services Private Ltd.,
Chennai, India
10 9 8 7 6 5 4 3 2 1
Printed in the United States of America
SWC: This book is dedicated to my mother, Joan Cunningham.
ABSTRACT
This book offers practical tools in Python to students of innovation as well as competitive intelligence professionals to track new developments in science, technology, and innovation. The book will appeal to both tech-mining and data science audiences. For tech-mining audiences, Python presents an appealing, all-in-one language for managing the tech-mining process. The book is a complement to other introductory books on the Python language, providing recipes with which a practitioner can grow a practice of mining text. For data science audiences, this book gives a succinct overview of the most useful techniques of text mining. The book also provides relevant domain knowledge from engineering management, so that an appropriate context for analysis can be created.
This is the first book of a two-book series. This first book discusses the mining of text, while the second one describes the analysis of text. This book describes how to extract actionable intelligence from a variety of sources including scientific articles, patents, PDFs, and web pages. There are a variety of tools available within Python for mining text. In particular, we discuss the use of pandas, BeautifulSoup, and pdfminer.
KEYWORDS
data science, natural language processing, patent analysis, Python, science, technology and innovation, tech mining
The authors of this book asked me to share perspectives on tech mining. I co-authored the 2004 book on the topic (Porter and Cunningham 2004). With an eye toward Scott and Jan's materials, here are some thoughts. These are meant to stimulate your thinking about tech mining and you.
Who does tech mining? Experience suggests two contrasting types of people: technology folks (Type A) and data folks (Type B). Technology folks know the subject; they are experienced professionals, trained professionals, or both, working in that industry or research field to expand their intelligence via tech mining. They seek to learn a bit about data manipulation and analytics to accomplish those aims. For instance, imagine a chemist seeking a perspective on scientific opportunities or an electrical engineer analyzing emerging technologies to facilitate open innovation by his or her company. The data science folks are those whose primary skills include some variation of data science and analytics. I, personally, have usually been in this group, needing to learn enough about the subject under study to not be totally unacquainted. Moreover, in collaborating on a major intelligence agency project to identify emerging technologies from full-text analyses, we were taken by the brilliance of the data folks: really impressive capabilities to mine science, technology, and innovation text resources. Unfortunately, we were also taken by their difficulties in relating those analyses to real applications. They were unable to practically identify emergence in order to provide usable intelligence.
So, challenges arise on both sides. But, a special warning to readers of this book: we suspect you are likely Type B, and we fear that the challenges are tougher for us. Years ago, we would have said the opposite: analysts can analyze anything. Now, we think the other way; that you really need to concentrate on relating your skills to answering real questions in real time. My advice would be to push yourself to perform hands-on analyses on actual tech-mining challenges. Seek out internships or capstone projects or whatever, to orient your tech-mining skills to generate answers to real questions, and to get feedback to check their utility.
Having said that, an obvious course of action is to team up Types A and B to collaborate on tech-mining work. This is very attractive, but you must work to communicate well. Don't invest 90 percent of your energy in that brilliant analysis and 10 percent in telling about it. Think more toward a 50-50 process where you iteratively present preliminary results, and get feedback on the same. Adjust your presentation content and mode to meet your users' needs, not just your notions of what's cool.
What's happening in tech mining? The field is advancing. It's hard for a biased insider like me to gauge this well, but check out the website www.VPInstitute.org. Collect some hundreds of tech-mining-oriented papers and overview their content. You can quickly get a picture of the diverse array of science, technology, and innovation topics addressed in the open-source literature. Less visible, but the major use of tech-mining tools, are the competitive technical intelligence applications by companies and agencies.
Tech mining is advancing. In the 2000s, studies largely addressed who, what, where, and when questions about an emerging technology. While research profiling is still useful, we now look to go further along the following directions.
Assessing R&D in a domain of interest, to inform portfolio management or funding agency program review.
Generating competitive technological intelligence, to track known competitors and to identify potential friends and foes. Tech mining is a key tool to facilitate open innovation by identifying potential sources of complementary capabilities and collaborators.
Technology road mapping by processing text resources (e.g., sets of publication or patent records on a topic under scrutiny) to extract topical content and track its evolution over time periods.
Contributing to future-oriented technology analyses: tech mining provides vital empirical grounding to inform future prospects. The transition from identifying past trends and patterns of engagement to laying out future possibilities is not automatic, and offers a field for productive study.
I'd point to some resources to track what's happening in tech mining as time progresses.
Note the globalization of tech-mining interest. For instance, this book has been translated into Chinese (Porter and Cunningham 2012). I don't expect many of you to rush off to read it, but it is an indicator of considerable interest in Asian economies pursuing science, technology, and innovation opportunities. And that reinforces the potential of text processing of languages other than English.
Track the scholarly literature. Tech-mining analytics and applications splatter across various scholarly fields. Here I note a few pieces from our colleagues. Bibliometrics journals cover analytical advances; cf. Ma and Porter (2014) and Zhang et al. (2014). Management of technology-oriented journals cover analytics and applications; cf. Guo et al. (2014), Newman et al. (2014), and Porter, Cunningham, and Sanz (2015).