Max Bramer - Principles of Data Mining

Book:
Principles of Data Mining
Author:
Max Bramer
Publisher:
Springer London, London

Springer-Verlag London Ltd. 2016

Max Bramer, Principles of Data Mining, Undergraduate Topics in Computer Science, DOI 10.1007/978-1-4471-7307-6_1

1. Introduction to Data Mining

Max Bramer, School of Computing, University of Portsmouth, Portsmouth, Hampshire, UK

1.1 The Data Explosion

Modern computer systems are accumulating data at an almost unimaginable rate and from a very wide variety of sources: from point-of-sale machines in the high street to machines logging every cheque clearance, bank cash withdrawal and credit card transaction, to Earth observation satellites in space, and with an ever-growing volume of information available from the Internet.

Some examples will serve to give an indication of the volumes of data involved (by the time you read this, some of the numbers will have increased considerably):

The current NASA Earth observation satellites generate a terabyte (i.e. 10^12 bytes) of data every day. This is more than the total amount of data ever transmitted by all previous observation satellites.
The Human Genome project is storing thousands of bytes for each of several billion genetic bases.
Many companies maintain large Data Warehouses of customer transactions. A fairly small data warehouse might contain more than a hundred million transactions.
There are vast amounts of data recorded every day on automatic recording devices, such as credit card transaction files and web logs, as well as non-symbolic data such as CCTV recordings.
There are estimated to be over 650 million websites, some extremely large.
There are over 900 million users of Facebook (rapidly increasing), with an estimated 3 billion postings a day.
It is estimated that there are around 150 million users of Twitter, sending 350 million Tweets each day.

Alongside advances in storage technology, which increasingly make it possible to store such vast amounts of data at relatively low cost whether in commercial data warehouses, scientific research laboratories or elsewhere, has come a growing realisation that such data contains buried within it knowledge that can be critical to a company's growth or decline, knowledge that could lead to important discoveries in science, knowledge that could enable us accurately to predict the weather and natural disasters, knowledge that could enable us to identify the causes of and possible cures for lethal illnesses, knowledge that could literally mean the difference between life and death. Yet the huge volumes involved mean that most of this data is merely stored, never to be examined in more than the most superficial way, if at all. It has rightly been said that the world is becoming 'data rich but knowledge poor'.

Machine learning technology, some of it very long established, has the potential to solve the problem of the tidal wave of data that is flooding around organisations, governments and individuals.

1.2 Knowledge Discovery

Knowledge Discovery has been defined as the non-trivial extraction of implicit, previously unknown and potentially useful information from data. It is a process of which data mining forms just one part, albeit a central one.

Figure 1.1 shows a slightly idealised version of the complete knowledge discovery process.

Figure 1.1: The Knowledge Discovery Process

Data comes in, possibly from many sources. It is integrated and placed in some common data store. Part of it is then taken and pre-processed into a standard format. This 'prepared data' is then passed to a data mining algorithm which produces an output in the form of rules or some other kind of 'patterns'. These are then interpreted to give (and this is the Holy Grail for knowledge discovery) new and potentially useful knowledge.

This brief description makes it clear that although the data mining algorithms, which are the principal subject of this book, are central to knowledge discovery, they are not the whole story. The pre-processing of the data and the interpretation (as opposed to the blind use) of the results are both of great importance. They are skilled tasks that are far more of an art (or a skill learnt from experience) than an exact science. Although they will both be touched on in this book, the algorithms of the data mining stage of knowledge discovery will be its prime concern.
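The stages described above can be sketched as a chain of small functions. This is only an illustrative sketch, not a method taken from the book: the function names, the record fields ("customer", "item") and the toy "mining" step (a simple frequency count standing in for a real data mining algorithm) are all assumptions made for the example.

```python
from collections import Counter


def integrate(sources):
    """Merge records from several sources into one common data store (a list)."""
    return [record for source in sources for record in source]


def preprocess(store):
    """Keep complete records only and put their values into a standard format."""
    prepared = []
    for record in store:
        if record.get("customer") and record.get("item"):
            prepared.append({
                "customer": record["customer"].strip().lower(),
                "item": record["item"].strip().lower(),
            })
    return prepared


def mine(prepared):
    """Toy stand-in for a data mining algorithm: count occurrences of each item."""
    return Counter(record["item"] for record in prepared)


def interpret(patterns, min_count=2):
    """Keep only the patterns frequent enough to deserve a human's attention."""
    return sorted(item for item, n in patterns.items() if n >= min_count)


# Hypothetical data from two sources, with inconsistent formatting and a gap.
tills = [{"customer": "A1", "item": "Bread "}, {"customer": "B2", "item": "milk"}]
web = [{"customer": "a1", "item": "bread"}, {"customer": "C3", "item": None}]

knowledge = interpret(mine(preprocess(integrate([tills, web]))))
print(knowledge)
```

Even in this toy version, most of the code deals with integration, pre-processing and interpretation rather than with the mining step itself, which mirrors the point that the algorithm is central but not the whole story.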

1.3 Applications of Data Mining

There is a rapidly growing body of successful applications in a wide range of areas as diverse as:

analysing satellite imagery
analysis of organic compounds
automatic abstracting
credit card fraud detection
electric load prediction
financial forecasting
medical diagnosis
predicting share of television audiences
product design
real estate valuation
targeted marketing
text summarisation
thermal power plant optimisation
toxic hazard analysis
weather forecasting

and many more. Some examples of applications (potential or actual) are:

a supermarket chain mines its customer transactions data to optimise targeting of high value customers
a credit card company can use its data warehouse of customer transactions for fraud detection
a major hotel chain can use survey databases to identify attributes of a high-value prospect
predicting the probability of default for consumer loan applications by improving the ability to predict bad loans
reducing fabrication flaws in VLSI chips
data mining systems can sift through vast quantities of data collected during the semiconductor fabrication process to identify conditions that are causing yield problems
predicting audience share for television programmes, allowing television executives to arrange show schedules to maximise market share and increase advertising revenues
predicting the probability that a cancer patient will respond to chemotherapy, thus reducing health-care costs without affecting quality of care
analysing motion-capture data for elderly people
trend mining and visualisation in social networks.

Applications can be divided into four main types: classification, numerical prediction, association and clustering. Each of these is explained briefly below. However, first we need to distinguish between two types of data.

1.4 Labelled and Unlabelled Data

In general we have a dataset of examples (called instances), each of which comprises the values of a number of variables, which in data mining are often called attributes. There are two types of data, which are treated in radically different ways.

For the first type there is a specially designated attribute and the aim is to use the data given to predict the value of that attribute for instances that have not yet been seen. Data of this kind is called labelled. Data mining using labelled data is known as supervised learning. If the designated attribute is categorical, i.e. it must take one of a number of distinct values such as 'very good', 'good' or 'poor', or (in an object recognition application) 'car', 'bicycle', 'person', 'bus' or 'taxi', the task is called classification. If the designated attribute is numerical, e.g. the expected sale price of a house or the opening price of a share on tomorrow's stock market, the task is called regression.
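As a minimal sketch of this terminology, a labelled dataset can be written as a list of instances, each mapping attribute names to values, with one attribute designated for prediction. The dataset, the attribute names and the helper task_type below are hypothetical, and the rule used (numerical values mean regression, anything else means classification) is a simplification: in practice a categorical attribute may be coded as numbers, so the decision really depends on what the values mean.

```python
from numbers import Number

# Hypothetical labelled dataset: each instance is a dict of attribute values.
houses = [
    {"bedrooms": 3, "condition": "good", "sale_price": 250000.0},
    {"bedrooms": 2, "condition": "poor", "sale_price": 140000.0},
    {"bedrooms": 4, "condition": "very good", "sale_price": 390000.0},
]


def task_type(instances, designated):
    """Name the supervised learning task implied by the designated attribute."""
    values = [instance[designated] for instance in instances]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if all(isinstance(v, Number) and not isinstance(v, bool) for v in values):
        return "regression"
    return "classification"


print(task_type(houses, "sale_price"))
print(task_type(houses, "condition"))
```

The same instances support either task; what changes is which attribute is designated as the one to predict.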
