
Witold Pedrycz - Data Science and Big Data: An Environment of Computational Intelligence



Book:
Data Science and Big Data: An Environment of Computational Intelligence
Author:
Witold Pedrycz / Shyi-Ming Chen
Publisher:
Springer
Genre:
Books / Home and family
Year:
2017

Data Science and Big Data: An Environment of Computational Intelligence: summary, description and annotation


This book presents a comprehensive and up-to-date treatise of a range of methodological and algorithmic issues. It also discusses implementations and case studies, identifies the best design practices, and assesses data analytics business models and practices in industry, health care, administration and business.

Data science and big data go hand in hand. Together they constitute a rapidly growing area of research that has attracted the attention of industry and business alike. The area has opened up promising new directions of fundamental and applied research and has led to interesting applications, especially those addressing the immediate need to deal with large repositories of data and to build tangible, user-centric models of relationships in data. Data is the lifeblood of today's knowledge-driven economy.

Numerous data science models are oriented towards end users. Along with the regular requirements for accuracy (which are present in any modeling) come requirements for the ability to process huge and varying data sets, as well as for robustness, interpretability, and simplicity (transparency). Computational intelligence, with its underlying methodologies and tools, helps address data analytics needs.

The book is of interest to researchers and practitioners involved in data science, Internet engineering, computational intelligence, management, operations research, and knowledge-based systems.



Part I
Fundamentals

Springer International Publishing AG 2017

Witold Pedrycz and Shyi-Ming Chen (eds.), Data Science and Big Data: An Environment of Computational Intelligence, Studies in Big Data, DOI 10.1007/978-3-319-53474-9_1

Large-Scale Clustering Algorithms

Rocco Langone (1) (corresponding author), Vilen Jumutc, Johan A. K. Suykens

(1) KU Leuven ESAT-STADIUS, Kasteelpark Arenberg 10, B-3001 Leuven, Belgium

Abstract

Computational tools in modern data analysis must be scalable to satisfy business and research time constraints. In this regard, two alternatives are possible: (i) adapt available algorithms or design new approaches such that they can run on a distributed computing environment; (ii) develop model-based learning techniques that can be trained efficiently on a small subset of the data and make reliable predictions. In this chapter two recent algorithms following these different directions are reviewed. In the first part, a scalable in-memory spectral clustering algorithm is described. This technique relies on a kernel-based formulation of the spectral clustering problem, also known as kernel spectral clustering. More precisely, a finite-dimensional approximation of the feature map via the Nyström method is used to solve the primal optimization problem, which decreases the computational time from cubic to linear. In the second part, a distributed clustering approach with a fixed computational budget is illustrated. This method extends the k-means algorithm by applying regularization at the level of prototype vectors. An optimal stochastic gradient descent scheme for learning with $\ell_1$ and $\ell_2$ norms is utilized, which makes the approach less sensitive to the influence of outliers while computing the prototype vectors.
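To make the idea of regularizing prototype vectors concrete, here is a minimal sketch in plain Python. It is not the chapter's algorithm: the per-sample objective 0.5*||w - x||^2 + (lam/2)*||w||^2 (a squared Euclidean penalty), the step size 1/((1 + lam) * t), and all function and parameter names are assumptions chosen for illustration. Sparsity-inducing penalties and the distributed (Map-Reduce) aspect are left out.

```python
import random


def nearest(prototypes, x):
    """Index of the prototype closest to x in squared Euclidean distance."""
    return min(
        range(len(prototypes)),
        key=lambda k: sum((w - a) ** 2 for w, a in zip(prototypes[k], x)),
    )


def regularized_stochastic_kmeans(data, init, lam=0.1, epochs=5, seed=0):
    """Stochastic k-means with an assumed squared-norm penalty on each prototype.

    Each sample x is assigned to its nearest prototype w, which then takes one
    SGD step on 0.5*||w - x||^2 + (lam/2)*||w||^2 (an assumed objective).
    """
    rng = random.Random(seed)
    prototypes = [list(w) for w in init]
    counts = [0] * len(prototypes)  # number of updates seen by each prototype
    order = list(range(len(data)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x = data[i]
            k = nearest(prototypes, x)
            counts[k] += 1
            # Step size 1/(mu * t) for a mu-strongly convex loss, mu = 1 + lam.
            eta = 1.0 / ((1.0 + lam) * counts[k])
            prototypes[k] = [
                wj - eta * ((wj - xj) + lam * wj)
                for wj, xj in zip(prototypes[k], x)
            ]
    return prototypes


if __name__ == "__main__":
    cluster_a = [(-5.0, 0.5), (-6.0, -0.5), (-4.0, 0.0), (-5.0, -1.0)]
    cluster_b = [(5.0, 0.5), (6.0, -0.5), (4.0, 0.0), (5.0, 1.0)]
    data = cluster_a + cluster_b
    protos = regularized_stochastic_kmeans(data, init=[data[0], data[4]])
    print(protos)
    print([nearest(protos, x) for x in data])
```

With this particular step size each prototype ends up as the running mean of its assigned samples scaled by 1/(1 + lam), so the penalty shrinks prototypes towards the origin. This only makes sense for roughly centered data, which is another assumption of the sketch.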

Keywords

Data clustering, Big data, Kernel methods, Nyström approximation, Stochastic optimization, K-means, Map-Reduce, Regularization, In-memory algorithms, Scalability

1 Introduction

Data clustering partitions a set of points into groups, called clusters, whose members are as similar as possible. It plays a key role in computational intelligence because of its diverse applications in various domains. Examples include collaborative filtering and market segmentation, where clustering is used to provide personalized recommendations to users; trend detection, which makes it possible to discover key trend events in streaming data; community detection in social networks; and many others.

With the advent of the big data era, a key challenge for data clustering lies in its scalability, that is, how to speed up a clustering algorithm without affecting its performance. To this purpose, two main directions have been explored [] for some recent surveys on clustering algorithms for big data.

In this chapter two algorithms for large-scale data clustering are reviewed. The first one, named fixed-size kernel spectral clustering (FSKSC), is a sampling-based spectral clustering method. Spectral clustering (SC) [

The remainder of the chapter is organized as follows. Section . Finally, some conclusions are given.

2 Notation

Transpose of the vector

Transpose of the matrix

Identity matrix

Vector of ones

Training sample of data points

Feature map

Feature space of dimension

Partitioning composed of k clusters

Cardinality of a set

p-norm of a vector

Gradient of function f

3 Standard Clustering Approaches

3.1 Spectral Clustering

Spectral clustering represents a solution to the graph partitioning problem. More precisely, it divides a graph into weakly connected sub-graphs by making use of the spectral properties of the graph Laplacian matrix.

A graph (or network) $G = (V, E)$ is a mathematical structure used to model pairwise relations between certain objects. It refers to a set of $N$ vertices or nodes $V$ and a collection of edges $E$ that connect pairs of vertices. If the edges are provided with weights, the corresponding graph is weighted; otherwise it is referred to as an unweighted graph. The topology of a graph is described by the similarity or affinity matrix, which is an $N \times N$ matrix $S$, where $S_{ij}$ indicates the link between the vertices $i$ and $j$. Associated with the similarity matrix there is the degree matrix $D$.
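The graph quantities introduced in this subsection can be sketched in a few lines of plain Python. The sketch rests on several assumptions that are not taken from the text: a Gaussian similarity between points, the unnormalized Laplacian L = D - S (with D the diagonal matrix of row sums of S), and a simple power iteration to obtain the eigenvector of the second-smallest eigenvalue of L, whose signs give a two-way partition. All function names are hypothetical, and the dense O(N^2) matrices are for illustration only.

```python
import math


def similarity_matrix(points, sigma=1.0):
    """Gaussian similarity S_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)), zero diagonal."""
    n = len(points)
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                S[i][j] = math.exp(-d2 / (2.0 * sigma ** 2))
    return S


def laplacian(S):
    """Unnormalized Laplacian L = D - S, where D = diag(row sums of S)."""
    n = len(S)
    degrees = [sum(row) for row in S]
    return [
        [(degrees[i] if i == j else 0.0) - S[i][j] for j in range(n)]
        for i in range(n)
    ]


def fiedler_vector(L, iters=500):
    """Eigenvector of the second-smallest eigenvalue of L.

    Uses power iteration on (c*I - L) restricted to vectors orthogonal to
    the constant vector; c bounds the largest eigenvalue of L (Gershgorin).
    """
    n = len(L)
    c = 2.0 * max(L[i][i] for i in range(n))
    v = [math.sin(i + 1.0) for i in range(n)]  # deterministic starting vector
    for _ in range(iters):
        mean = sum(v) / n
        v = [x - mean for x in v]  # remove the constant component
        w = [c * v[i] - sum(L[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def two_way_partition(points, sigma=1.0):
    """Split the points into two groups by the sign of the Fiedler vector."""
    v = fiedler_vector(laplacian(similarity_matrix(points, sigma)))
    return [1 if x > 0 else 0 for x in v]


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
    print(two_way_partition(pts))
```

For larger data sets a sparse eigensolver from a numerical library would normally replace the hand-written power iteration. It is used here only to keep the sketch self-contained.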


Similar books «Data Science and Big Data: An Environment of Computational Intelligence»

Look at similar books to Data Science and Big Data: An Environment of Computational Intelligence. We have selected literature similar in name and meaning in the hope of providing readers with more options to find new, interesting, not yet read works.

Probyto Data Science and Consulting Pvt. Ltd.

Data Science for Business Professionals: A Practical Guide for Beginners

Data Science & Business Analytics

Pallavi Vijay Chavan

Data Science: Techniques and Intelligent Applications

Thomas Mailund

Beginning Data Science in R 4: Data Analysis, Visualization, and Modelling for the Data Scientist

Big Data Analytics and Intelligence: A Perspective for Health Care

Amar Sahay

Essentials of Data Science and Analytics: Statistical Tools, Machine Learning, and R-Statistical Software Overview

Parikshit Narendra Mahalle

Foundations of Data Science for Engineering Problem Solving (Studies in Big Data, 94)

Schmarzo

Big data MBA: driving business strategies with data science

Fawcett Tom

Data Science for Business

Ulrika Jägare

Data Science Strategy For Dummies

EMC Education Services [EMC Education Services]

Data Science and Big Data Analytics: Discovering, Analyzing, Visualizing and Presenting Data

Foster Provost

Data Science for Business: What you need to know about data mining and data-analytic thinking
