1. The Big Data Phenomenon
We are inundated with data. Data from Twitter microblogs, YouTube and surveillance videos, Instagram pictures, SoundCloud audio, enterprise applications, and many other sources are part of our daily life. Computing has come of age, making machine-readable data pervasive and enabling us to leverage it for the advancement of humanity. This Big Data phenomenon is the new information revolution, one that no IT professional can afford to miss being part of. Big Data Analytics has proven to be a game changer in the way businesses provide their services. Business models are getting better, operations are becoming intelligent, and revenue streams are growing.
Uncertainty is probably the biggest impeding factor in the economic evolution of mankind. Thankfully, Big Data helps to deal with uncertainty. The more we know about an entity, the more we can learn about it and thereby reduce the uncertainty. For instance, analyzing continuous data about customers' buying patterns enables stores to predict changes in demand and stock accordingly. Big Data is helping businesses understand customers better so that they can be served better. By analyzing consumer data from various sources, such as Online Social Networks (OSN), mobile app usage, and purchase transaction records, businesses are able to personalize their offerings. Computational statistics and Machine Learning algorithms can hypothesize patterns from Big Data to help achieve this personalization.
Web 2.0, which includes the OSN, is one significant source of Big Data. Another major contributor is the Internet of Things. The billions of devices connecting to the Internet generate Petabytes of data. It is a well-known fact that businesses collect as much data as they can about consumers: their preferences, purchase transactions, opinions, individual characteristics, browsing habits, and so on. Consumers themselves generate substantial chunks of data in the form of reviews, ratings, direct feedback, video recordings, pictures, and detailed documents such as demos, troubleshooting guides, and tutorials on using the products, exploiting the expressiveness of Web 2.0 and thus contributing to the Big Data.
From this list of data sources, it is easy to see that collecting data is relatively inexpensive. There are a number of other technology trends too that are fueling the Big Data phenomenon. High Availability systems and storage, drastically declining hardware costs, massive parallelism in task execution, high-speed networks, new computing paradigms such as cloud computing, high-performance computing, innovations in Analytics and Machine Learning algorithms, new ways of storing unstructured data, and ubiquitous access to computing devices such as smartphones and laptops are all contributing to the Big Data revolution.
Human beings are intelligent because their brains are able to collect inputs from various sources, connect them, and analyze them to look for patterns. Big Data and the algorithms associated with it help achieve the same using compute power. Fusing data from disparate sources can yield surprising insights into the entities involved. For instance, if plenty of flu symptoms are being reported on OSN from a particular geographical location and credit card transactions show a surge in purchases of flu medication in that area, it is quite likely that a flu outbreak is setting in. Given that Big Data makes no sense without the tools to collect, combine, and analyze data, some proponents even argue that Big Data is not really data, but a technology comprising tools and techniques to extract value from huge sets of data.
Note
Generating value from Big Data can be thought of as comprising two major functions: fusion, the coming together of data from various sources; and fission, analyzing that data.
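The flu example above can be turned into a minimal sketch of this fusion and fission idea. The region names, counts, and thresholds below are purely hypothetical, hand-made values; a real system would pull such signals from OSN feeds and transaction systems.

# Fusion: bring together two independent signals per region (hypothetical counts).
osn_flu_mentions = {"region_a": 540, "region_b": 35, "region_c": 410}
flu_med_purchases = {"region_a": 880, "region_b": 120, "region_c": 790}

# Fission: analyze the fused data, flagging regions where both signals
# exceed simple, purely illustrative thresholds.
MENTION_THRESHOLD = 300
PURCHASE_THRESHOLD = 500

for region, mentions in osn_flu_mentions.items():
    purchases = flu_med_purchases.get(region, 0)
    if mentions > MENTION_THRESHOLD and purchases > PURCHASE_THRESHOLD:
        print(f"{region}: possible flu outbreak "
              f"(mentions={mentions}, purchases={purchases})")

Neither signal alone is conclusive; it is the combination of the two sources that makes the inference plausible, which is precisely the point of fusion.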
There is a huge amount of data pertaining to the human body and its health. Genomic data science is an academic specialization that is gaining increasing popularity. It helps in studying disease mechanisms for better diagnosis and drug response. The algorithms used for analyzing Big Data are a game changer in genome research as well. This category of data is so huge that the famed international science journal Nature carried a news item about how genome researchers are worried that the computing infrastructure may not cope with the increasing amount of data that their research generates.
Science has a lot to gain from the developments in Big Data. Social scientists can leverage data from the OSN to identify both micro- and macrolevel details, such as psychiatric conditions at an individual level or group dynamics at a macrolevel. The same data from OSN can also be used to detect medical emergencies and pandemics. In the financial sector too, data from the stock markets, business news, and OSN can reveal valuable insights that help improve lending practices, set macroeconomic strategies, and avert recessions.
Big Data applications are expected to benefit a wide variety of other areas as well. Housing and real estate; actuarial work; and government functions such as national security, defense, education, disease control, law enforcement, and energy are all characterized by huge amounts of data and stand to gain from the Big Data phenomenon.
Note
Where there is humongous data and appropriate algorithms are applied to it, there is wealth, value, and prosperity.
Why Big Data
A common question that arises is this: Why Big Data, why not just data? For data to be useful, we need to be able to identify patterns in it and predict those patterns in future data that is yet to be seen. A typical analogy is predicting the brand of rice in a bag based on a given sample. The rice in the bag is unknown to us. We are given only a sample from it, along with samples of known brands of rice. The known samples are called training data in the language of Machine Learning. The sample of unknown rice is the test data.
It is common sense that the larger the sample, the better the prediction of the brand. If we are given just two grains of each brand as training data, we may base our conclusion solely on the characteristics of those two grains, missing out on other characteristics. In Machine Learning parlance, this is called overfitting. If we have a bigger sample, we can recognize a number of features and the possible range of values for each feature: in other words, the probability distributions of the values, and then look for similar distributions in the data that is yet to be known. Hence the need for humongous data, Big Data, and not just data.
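The rice analogy can be made concrete with a minimal sketch, assuming scikit-learn is available and using entirely synthetic, made-up grain measurements (length and width per grain); the brand characteristics are illustrative assumptions, not real data. Training on two grains per brand typically fits those particular grains and generalizes poorly, while a larger sample captures the underlying distributions.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(42)

def sample_grains(n, mean_length, mean_width):
    # Draw n grains as (length, width) pairs around a brand's typical size.
    return rng.normal([mean_length, mean_width], [0.4, 0.1], size=(n, 2))

# Assumed brands: brand 0 has longer, slimmer grains; brand 1 shorter, rounder ones.
test_X = np.vstack([sample_grains(500, 7.0, 1.8), sample_grains(500, 5.5, 2.2)])
test_y = np.array([0] * 500 + [1] * 500)

for n_train in (2, 200):  # two grains per brand vs. a much larger sample
    train_X = np.vstack([sample_grains(n_train, 7.0, 1.8),
                         sample_grains(n_train, 5.5, 2.2)])
    train_y = np.array([0] * n_train + [1] * n_train)
    model = DecisionTreeClassifier().fit(train_X, train_y)
    print(f"{n_train} grains per brand -> test accuracy {model.score(test_X, test_y):.2f}")

The tiny training set tends to score noticeably lower on the held-out test grains, which is the overfitting the paragraph describes.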
In fact, a number of algorithms that are popular with Big Data have been in existence for a long time. The Naïve Bayes technique, for instance, dates back to the 18th century, and the Support Vector Machine model was invented in the early 1960s. They gained prominence with the advent of the Big Data revolution for the reasons explained earlier.
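As a brief illustration of the Naïve Bayes technique in this setting, the following sketch uses scikit-learn's GaussianNB on a handful of made-up rice-grain measurements; all values are illustrative assumptions.

import numpy as np
from sklearn.naive_bayes import GaussianNB

# Training data: (length, width) of grains from two known brands (assumed values).
train_X = np.array([[7.1, 1.8], [6.9, 1.7], [7.2, 1.9],   # brand 0
                    [5.4, 2.2], [5.6, 2.3], [5.5, 2.1]])  # brand 1
train_y = np.array([0, 0, 0, 1, 1, 1])

# Test data: grains drawn from the unknown bag.
test_X = np.array([[7.0, 1.8], [5.5, 2.2]])

model = GaussianNB().fit(train_X, train_y)
print(model.predict(test_X))        # predicted brand for each test grain
print(model.predict_proba(test_X))  # posterior probability per brand

The classifier estimates a probability distribution of feature values per brand from the training grains and assigns each unknown grain to the brand whose distribution fits it best, exactly the distribution-matching idea described above.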
An often-cited heuristic to differentiate Big Data from conventional data is that Big Data is too big to fit into traditional Relational Database Management Systems (RDBMS). With the ambitious plan of the Internet of Things to connect every entity in the world to everything else, conventional RDBMS will not be able to handle the data upsurge. In fact, Seagate predicts that the world will not be able to cope with its storage needs within a couple of years; according to them, it is harder to manufacture capacity than to generate data. It will be interesting to see if and how the storage industry meets the capacity demands of the Volume of the Big Data phenomenon, which brings us to the Vs of Big Data.