Thomas Zimmermann - Perspectives on Data Science for Software Engineering

Year: 2016. Publisher: Morgan Kaufmann. Genre: Computer.

Perspectives on Data Science for Software Engineering: summary, description and annotation

Perspectives on Data Science for Software Engineering presents the best practices of seasoned data miners in software engineering. The idea for this book was created during the 2014 conference at Dagstuhl, an invitation-only gathering of leading computer scientists who meet to identify and discuss cutting-edge informatics topics.

At the 2014 conference, many discussions centered on how to transfer the knowledge of seasoned software engineers and data scientists to newcomers in the field. While there are many books covering data mining and software engineering basics, they present only the fundamentals and lack the perspective that comes from real-world experience. This book offers unique insights into the wisdom of the community's leaders, gathered to share hard-won lessons from the trenches.

Ideas are presented in digestible chapters designed to be applicable across many domains. Topics covered include data collection, data sharing, data mining, and how to utilize these techniques in successful software projects. Newcomers to software engineering data science will learn the tips and tricks of the trade, while more experienced data scientists will benefit from war stories that show what traps to avoid.

  • Presents the wisdom of community experts, derived from a summit on software analytics
  • Provides contributed chapters that share discrete ideas and techniques from the trenches
  • Covers top areas of concern, including mining security and social data, data visualization, and cloud-based data
  • Presented in clear chapters designed to be applicable across many domains

Perspectives on data science for software engineering

* North Carolina State University, Raleigh, NC, United States
Microsoft Research, Redmond, WA, United States

Abstract

Given recent increases in how much data we can collect, and given a shortage of skilled analysts who can assess that data, there now exists more data than people to study it. Consequently, the analysis of real-world data is an exploding field, to say the least. For software projects, a great deal of information is recorded in software repositories. Never before have we had so much information about the details of how people collaborate to build software.

Keywords

Data Science; Software Analytics; Mining Software Repositories; Software repositories; Data mining; Data analytics

Why This Book?

This book began as a week-long workshop in Dagstuhl, Germany [1]. The goal of that meeting was to document the wide range of work on software analytics.

That meeting had the following premise: So little time, so much data.

That is, given recent increases in how much data we can collect, and given a shortage of skilled analysts who can assess that data [2], there now exists more data than people to study it. Consequently, the analysis of real-world data (using semi-automatic or fully automatic methods) is an exploding field, to say the least.

This issue is made more pressing by two factors:

Many useful methods: Decades of research in artificial intelligence, social science methods, visualization, statistics, etc. have generated a large number of powerful methods for learning from data.

Much support for those methods: Many of those methods are explored in standard textbooks and education programs. Those methods are also supported in toolkits that are widely available (sometimes, even via free downloads). Further, given the Big Data revolution, it is now possible to acquire the necessary hardware, even for the longest runs of these tools. So now the issue becomes not how to get these tools but, instead, how to use them.

If general analytics is an active field, software analytics is doubly so. Consider what we know about software projects:

  • source code;
  • emails about that code;
  • check-ins;
  • work items;
  • bug reports;
  • test suites;
  • test executions;
  • and even some background information on the developers.

All that information is recorded in software repositories, such as CVS, Subversion, Git, GitHub, and Bugzilla. Also found in these repositories are telemetry data, run-time traces, and log files reflecting how customers experience software, application and feature usage, records of performance and reliability, and more.
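As a small illustration of the kind of question such repository data can answer, the sketch below counts check-ins per author and estimates what fraction of check-ins look like fixes. It is only a sketch under stated assumptions: the sample records are invented, the `author|date|subject` layout is assumed to come from something like `git log --pretty=format:'%an|%ad|%s' --date=short`, and the function names are hypothetical rather than taken from the book.

```python
from collections import Counter

# Invented sample data. In practice this text might be produced by:
#   git log --pretty=format:'%an|%ad|%s' --date=short
# (assumes author names contain no '|' character).
SAMPLE_LOG = """\
alice|2014-06-22|Fix null check in parser
bob|2014-06-23|Add regression test for parser
alice|2014-06-24|Refactor tokenizer
alice|2014-06-27|Fix off-by-one bug in tokenizer
"""


def parse_checkins(log_text):
    """Turn each non-empty line into an (author, date, subject) tuple."""
    checkins = []
    for line in log_text.splitlines():
        if not line.strip():
            continue
        # Split on the first two separators only, so a '|' inside the
        # subject does not break the record.
        author, date, subject = line.split("|", 2)
        checkins.append((author, date, subject))
    return checkins


def checkins_per_author(checkins):
    """Count how many check-ins each author made."""
    return Counter(author for author, _, _ in checkins)


def fix_ratio(checkins):
    """Fraction of check-ins whose subject mentions 'fix' (a crude proxy)."""
    if not checkins:
        return 0.0
    fixes = sum(1 for _, _, subject in checkins if "fix" in subject.lower())
    return fixes / len(checkins)


checkins = parse_checkins(SAMPLE_LOG)
print(checkins_per_author(checkins).most_common())
print(fix_ratio(checkins))
```

Keyword matching on commit subjects is a deliberately naive heuristic; it is meant only to show how quickly raw check-in records can be turned into a first, rough measurement.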

Never before have we had so much information about the details of how people collaborate to:

  • use someone else's insights and software tools;
  • generate and distribute new insights and software tools;
  • maintain and update existing insights and software tools.

Here, by "tools" we mean everything from the four lines of SQL that are triggered when someone surfs to a web page, to scripts that might be only dozens to hundreds of lines of code, to much larger open source and proprietary systems. Our use of "tools" also covers building new tools, ongoing maintenance work, and combinations of hardware and software systems.

Accordingly, for your consideration, this book explores the process for analyzing data from software development applications to generate insights. The chapters here were written by participants at the Dagstuhl workshop (Fig. 1), plus numerous other experts in the field of industrial and academic data mining. Our goal is to summarize and distribute their experience and combined wisdom and understanding about the data analysis process.

Fig. 1 The participants of the Dagstuhl Seminar 14261 on "Software Development Analytics" (June 22-27, 2014)

About This Book

Each chapter is aimed at a generalized audience with some technical interest in software engineering (SE). Hence, the chapters are very short and to the point. Also, the chapter authors have taken care to avoid excessive and confusing techno-speak.

As to insights themselves, they are in two categories:

Lessons specific to software engineering: Some chapters offer valuable comments on issues that are specific to data science for software engineering. For example, see Guenther Ruhe's excellent chapter on decision support for software engineering.

General lessons about data analytics: Other chapters are more general. These comment on issues relating to drawing conclusions from real-world data. The case study material for these chapters comes from the domain of software engineering problems. That said, this material has much to offer data scientists working in many other domains.

Our insights take many forms:

  • Some introductory material to set the scene;
  • Success stories and application case studies;
  • Techniques;
  • Words of wisdom;
  • Tips for success, traps for the unwary, as well as the steps required to avoid those traps.

That said, all our insights have one thing in common: we wish we had known them years ago! If we had, that would have saved us and our clients so much time and money.

The Future

While these chapters were written by experts, they are hardly complete. Data science methods for SE are continually changing, so we view this book as a first edition that will need significant and regular updates. To that end, we have created a news group for posting new insights. Feel free to make any comment at all there.

To browse the messages in that group, go to https://groups.google.com/forum/#!forum/perspectivesds4se

To post to that group, send an email to

To unsubscribe from that group, send an email to

Note that if you want to be considered for any future update of this book:

Make the subject line an eye-catching mantra; i.e., a slogan reflecting a best practice for data science for SE.

The post should read something like the chapters of this book. That is, it should:

  • Be short and to the point.
  • Make little or no use of jargon, formulas, diagrams, or references.
  • Be approachable by a broad audience and have a clear take-away message.

Share and enjoy!

References

[1] Software development analytics (Dagstuhl Seminar 14261). Gall H., Menzies T., Williams L., Zimmermann T. Dagstuhl Rep. 2014;4(6):64-83. http://drops.dagstuhl.de/opus/volltexte/2014/4763/.

[2] Big data: The next frontier for competition. McKinsey & Company. http://www.mckinsey.com/features/big_data.

Software analytics and its application in practice

* Microsoft Research, Beijing, China
University of Illinois at Urbana-Champaign, Urbana, IL, United States

Abstract

A wealth of data exists in the software life cycle, and hidden in that data is information about the quality of software and services as well as the dynamics of software development. Using various analytical and computing technologies, software analytics aims to obtain insightful and actionable information for data-driven tasks in engineering software and services. In this chapter, we discuss the different aspects of software analytics, and we share the lessons we learned when putting software analytics into practice.

Keywords

Software analytics; research topics; target audience; technology pillars; connection to practice
