
Max Bramer - Principles of Data Mining



Springer-Verlag London Ltd. 2016
Max Bramer, Principles of Data Mining, Undergraduate Topics in Computer Science, DOI 10.1007/978-1-4471-7307-6_1
1. Introduction to Data Mining
Max Bramer
School of Computing, University of Portsmouth, Portsmouth, Hampshire, UK
1.1 The Data Explosion
Modern computer systems are accumulating data at an almost unimaginable rate and from a very wide variety of sources: from point-of-sale machines in the high street to machines logging every cheque clearance, bank cash withdrawal and credit card transaction, to Earth observation satellites in space, and with an ever-growing volume of information available from the Internet.
Some examples will serve to give an indication of the volumes of data involved (by the time you read this, some of the numbers will have increased considerably):
  • The current NASA Earth observation satellites generate a terabyte (i.e. 10^12 bytes) of data every day. This is more than the total amount of data ever transmitted by all previous observation satellites.
  • The Human Genome project is storing thousands of bytes for each of several billion genetic bases.
  • Many companies maintain large Data Warehouses of customer transactions. A fairly small data warehouse might contain more than a hundred million transactions.
  • There are vast amounts of data recorded every day on automatic recording devices, such as credit card transaction files and web logs, as well as non-symbolic data such as CCTV recordings.
  • There are estimated to be over 650 million websites, some extremely large.
  • There are over 900 million users of Facebook (rapidly increasing), with an estimated 3 billion postings a day.
  • It is estimated that there are around 150 million users of Twitter, sending 350 million Tweets each day.
Alongside advances in storage technology, which increasingly make it possible to store such vast amounts of data at relatively low cost whether in commercial data warehouses, scientific research laboratories or elsewhere, has come a growing realisation that such data contains buried within it knowledge that can be critical to a company's growth or decline, knowledge that could lead to important discoveries in science, knowledge that could enable us accurately to predict the weather and natural disasters, knowledge that could enable us to identify the causes of and possible cures for lethal illnesses, knowledge that could literally mean the difference between life and death. Yet the huge volumes involved mean that most of this data is merely stored, never to be examined in more than the most superficial way, if at all. It has rightly been said that the world is becoming "data rich but knowledge poor".
Machine learning technology, some of it very long established, has the potential to solve the problem of the tidal wave of data that is flooding around organisations, governments and individuals.
1.2 Knowledge Discovery
Knowledge Discovery has been defined as the non-trivial extraction of implicit, previously unknown and potentially useful information from data. It is a process of which data mining forms just one part, albeit a central one.
Figure 1.1 shows a slightly idealised version of the complete knowledge discovery process.
[Figure 1.1: The Knowledge Discovery Process]
Data comes in, possibly from many sources. It is integrated and placed in some common data store. Part of it is then taken and pre-processed into a standard format. This prepared data is then passed to a data mining algorithm which produces an output in the form of rules or some other kind of patterns. These are then interpreted to give (and this is the "Holy Grail" for knowledge discovery) new and potentially useful knowledge.
This brief description makes it clear that although the data mining algorithms, which are the principal subject of this book, are central to knowledge discovery they are not the whole story. The pre-processing of the data and the interpretation (as opposed to the blind use) of the results are both of great importance. They are skilled tasks that are far more of an art (or a skill learnt from experience) than an exact science. Although they will both be touched on in this book, the algorithms of the data mining stage of knowledge discovery will be its prime concern.
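The stages described above can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not the book's method: the two transaction sources, the function names (integrate, preprocess, mine, interpret) and the min_count threshold are all hypothetical, and a simple frequency count stands in for a real data mining algorithm.

```python
from collections import Counter

# Hypothetical raw sources: lists of (customer, item) transaction records.
source_a = [("alice", "Bread"), ("bob", "milk "), ("alice", "milk")]
source_b = [("carol", "bread"), ("bob", "BREAD"), ("carol", None)]

def integrate(*sources):
    """Combine records from several sources into one common data store."""
    store = []
    for source in sources:
        store.extend(source)
    return store

def preprocess(store):
    """Put records into a standard format: drop missing items, normalise text."""
    return [(customer, item.strip().lower())
            for customer, item in store if item is not None]

def mine(prepared):
    """Stand-in for a data mining algorithm: count how often each item occurs."""
    return Counter(item for _, item in prepared)

def interpret(patterns, min_count=2):
    """Keep only patterns frequent enough to merit human attention."""
    return sorted(item for item, count in patterns.items() if count >= min_count)

store = integrate(source_a, source_b)   # integration
prepared = preprocess(store)            # pre-processing
patterns = mine(prepared)               # data mining
knowledge = interpret(patterns)         # interpretation
print(knowledge)
```

Even in this toy version most of the code is integration, pre-processing and interpretation rather than mining, which mirrors the point that the algorithm is central but not the whole story.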
1.3 Applications of Data Mining
There is a rapidly growing body of successful applications in a wide range of areas as diverse as:
  • analysing satellite imagery
  • analysis of organic compounds
  • automatic abstracting
  • credit card fraud detection
  • electric load prediction
  • financial forecasting
  • medical diagnosis
  • predicting share of television audiences
  • product design
  • real estate valuation
  • targeted marketing
  • text summarisation
  • thermal power plant optimisation
  • toxic hazard analysis
  • weather forecasting
and many more. Some examples of applications (potential or actual) are:
  • a supermarket chain mines its customer transactions data to optimise targeting of high value customers
  • a credit card company can use its data warehouse of customer transactions for fraud detection
  • a major hotel chain can use survey databases to identify attributes of a high-value prospect
  • predicting the probability of default for consumer loan applications by improving the ability to predict bad loans
  • reducing fabrication flaws in VLSI chips
  • data mining systems can sift through vast quantities of data collected during the semiconductor fabrication process to identify conditions that are causing yield problems
  • predicting audience share for television programmes, allowing television executives to arrange show schedules to maximise market share and increase advertising revenues
  • predicting the probability that a cancer patient will respond to chemotherapy, thus reducing health-care costs without affecting quality of care
  • analysing motion-capture data for elderly people
  • trend mining and visualisation in social networks.
Applications can be divided into four main types: classification, numerical prediction, association and clustering. Each of these is explained briefly below. However, first we need to distinguish between two types of data.
1.4 Labelled and Unlabelled Data
In general we have a dataset of examples (called instances), each of which comprises the values of a number of variables, which in data mining are often called attributes. There are two types of data, which are treated in radically different ways.
For the first type there is a specially designated attribute and the aim is to use the data given to predict the value of that attribute for instances that have not yet been seen. Data of this kind is called labelled. Data mining using labelled data is known as supervised learning. If the designated attribute is categorical, i.e. it must take one of a number of distinct values such as "very good", "good" or "poor", or (in an object recognition application) "car", "bicycle", "person", "bus" or "taxi", the task is called classification. If the designated attribute is numerical, e.g. the expected sale price of a house or the opening price of a share on tomorrow's stock market, the task is called regression.
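To make the terminology concrete, here is a minimal Python sketch of a labelled dataset. The house data, the attribute names and the helper functions split_label and task_type are invented for illustration. The test for "numerical" is deliberately naive: a categorical attribute coded as integers would be misread as numerical.

```python
# Hypothetical labelled dataset: each instance maps attribute names to values.
instances = [
    {"floor_area": 70, "bedrooms": 2, "condition": "good", "price": 180000},
    {"floor_area": 120, "bedrooms": 4, "condition": "very good", "price": 350000},
    {"floor_area": 55, "bedrooms": 1, "condition": "poor", "price": 95000},
]

def split_label(instance, target):
    """Separate the designated attribute (the label) from the other attributes."""
    attributes = {name: value for name, value in instance.items() if name != target}
    return attributes, instance[target]

def task_type(instances, target):
    """Name the supervised learning task implied by the designated attribute.

    Naive assumption: an attribute is numerical if every value is an int or float.
    """
    values = [instance[target] for instance in instances]
    numerical = all(isinstance(v, (int, float)) and not isinstance(v, bool)
                    for v in values)
    return "regression" if numerical else "classification"

attributes, label = split_label(instances[0], "condition")
print(attributes, label)
print(task_type(instances, "condition"))
print(task_type(instances, "price"))
```

The same three instances support either task; which one applies depends only on which attribute is designated as the one to predict.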