Angelov Plamen - Advances in Big Data: Proceedings of the 2nd INNS Conference on Big Data, October 23-25, 2016, Thessaloniki, Greece

City: Cham, year: 2017, publisher: Springer International Publishing.

Springer International Publishing AG 2017
Plamen Angelov, Yannis Manolopoulos, Lazaros Iliadis, Asim Roy and Marley Vellasco (eds.), Advances in Big Data, Advances in Intelligent Systems and Computing, DOI 10.1007/978-3-319-47898-2_1
Predicting Human Behavior Based on Web Search Activity: Greek Referendum of 2015
Spyros E. Polykalas (1) (corresponding author) and George N. Prezerakos (2)
(1) Department of Digital Media and Communication, TEI of Ionian Islands, Argostoli, Greece
(2) Department of Electronic Computing Systems, TEI of Piraeus, Aigaleo, Greece
Abstract
The enormous volumes of data generated by web users are the basis of several research activities in a new, innovative field of research: online forecasting. Online forecasting is associated with the proper computation of web users' data, with the aim of arriving at accurate predictions of the future in several areas of human socio-economic activity. In this paper an algorithm is applied in order to predict the results of the Greek referendum held in 2015, using as input the data generated by users of the Google search engine. The proposed algorithm allows us to predict the results of the referendum with great accuracy. We strongly believe that, due to high internet penetration as well as the high usage of web search engines, the proper analysis of data generated by web search users reveals useful information about people's preferences and/or future actions in several areas of human activity.
Keywords
Google Trends, Online forecasting, Predictions, Forecasting, Search engines, Human behavior
Introduction
Almost a decade ago, Google made web users' preferences, as expressed through their search behavior, available to the public. Several researchers realized that proper processing of web users' search behavior may reveal useful information about users' needs, wants and concerns, and in general about their feelings and preferences (Ettredge et al. ).
Web users generate data in almost all web activities, such as visiting a website, buying online, sending and receiving emails, and participating in social networks. Where such activities are highly popular, there is plenty of room for researchers and companies to use these data to reach valuable conclusions not only about web users but about the general population. The most indicative case of user-generated data is web search, since it is characterized by high popularity among web users and by an almost monopolized market structure: the Google search engine holds more than 85 % of the market (source: www.statista.com).
A recent study published by Eurostat indicates that 59 % of Europeans use web search services to find information about goods and services. As internet penetration and the use of web search increase, the data generated about web search behavior become statistically significant. Thus, forecasting based on web search data is becoming increasingly accurate.
Within this context, the aim of this paper is to explore whether there is a correlation between users' web search preferences during a time period before the Greek referendum, held in July 2015, and the actual results of the referendum. In particular, an algorithm is applied to analyze the data generated by users of the Google search engine, aiming to predict the actual results of the referendum.
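The paper is structured as follows: the relevant literature is reviewed first, the proposed algorithm is then presented and applied to the case study, and finally the main findings of this paper are discussed.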
Literature Review
Online forecasting based on users' web search data is becoming one of the most promising fields in the research area of forecasting. Several efforts have been carried out by Google's own researchers, who have attempted predictions using search term popularity in a number of areas ranging from home, automobile and retail sales to travel behavior (Bangwayo-Skeete and Skeete ).
With respect to elections, an initial approach in (Pion and Hamel ) provided predictions for the 2010 UK elections by applying twice the concept behind Galton's predictive wisdom of the crowds.
The Proposed Algorithm
The proposed algorithm is applied to the data generated by users of the Google search engine. Each time a user searches the Web with the Google search engine, the relevant data, such as the typed word or phrase, the date, the time, the location and data related to his/her profile, are stored by Google. Google analyzes these data and makes some of them publicly available through the Google Trends service. In particular, Google Trends returns a normalized, averaged number that corresponds to the volume of daily searches for a specific term compared to the rest of the search terms.
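As an illustration only, a normalization of this kind can be sketched in Python as follows. The function name `web_interest`, the scaling of the peak day to 100 and the sample counts are assumptions made for this sketch. Google Trends exposes only normalized values, not raw search counts, and the exact computation behind the service is not given in this paper.

```python
def web_interest(term_counts, total_counts):
    """Scale the daily search share of one term so the busiest day equals 100.

    Both arguments are hypothetical raw daily counts, used only to show
    the idea of a relative, normalized popularity value.
    """
    if len(term_counts) != len(total_counts):
        raise ValueError("both series must cover the same days")
    # Share of all searches that went to the term on each day.
    shares = [t / n if n else 0.0 for t, n in zip(term_counts, total_counts)]
    peak = max(shares, default=0.0)
    if peak == 0:
        return [0 for _ in shares]
    # Express every day relative to the peak day.
    return [round(100 * s / peak) for s in shares]


daily_term = [120, 300, 150]             # hypothetical searches for one term
daily_total = [10_000, 12_000, 10_000]   # hypothetical searches overall
print(web_interest(daily_term, daily_total))  # prints [48, 100, 60]
```

Because each value is a share relative to the peak, two terms can only be compared directly when they are normalized together over the same period and region.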
The proposed algorithm uses the search popularity of selected words/phrases, as provided by Google Trends, in order to analyze the feelings, intentions and thoughts of web users in relation to these words/phrases, aiming to predict their future behavior. Early versions of the proposed algorithm have been applied in several election races (Polykalas et al. ). The algorithm consists of four main phases: initial, words set, noise elimination and runs.
In the initial phase, the examined time period before the event under study is determined and the geographic restrictions for the web search users are set.
During the next phase, the popularity of selected words/phrases relevant to the case study is examined, in order to determine the set of words/phrases that will be used as input data. As stated earlier, Google Trends returns a normalized value of the popularity of each word/phrase typed by web search users. We call this popularity the Web Interest (WI) of each typed word/phrase. The WI of each examined word/phrase should fulfill two criteria in order for the word/phrase to be selected. The first concerns the variance of the WI during the determined time period, while the second is related to the absolute values of the WI during that period. If the WI varies significantly during the determined period and its value is comparable with the WI of previously selected words/phrases, then the examined word/phrase is selected as algorithm input. Several words/phrases should be examined during this phase, in order to include in the final set all potential words/phrases that meet these two conditions.
Having determined the geographical restrictions, the time period and the final set of words/phrases, the next phase is the elimination of potential noise. The noise elimination phase consists of three sub-phases.
The first sub-phase deals with the elimination of noise generated by indecisive or confused web users. An indecisive/confused web user is defined as a web user who searches, at the same time, for words/phrases that show contradictory feelings or unpredictable future behavior. Further explanation of this sub-phase is given in the next section, where the proposed algorithm is applied to our case study.
The second sub-phase is the examination of previous events (if any) similar to the one under study. If there are similar historical events, then the data generated by web search engine users at the time, as well as the actual historical results, are used as feedback for the current data of the case under study.
The third sub-phase concerns the exclusion of the influence generated by non-representative facts during the determined time period. In order to determine the non-representative facts, a day-to-day examination of the WI of the selected words is required. If a selected word presents very high variation during a short period (high increase followed by high decrease within 12 days) that is not accompanied by a respective variation in the WI of the other selected words/phrases, then these WI values should not be considered valid input values. In practical terms, this means that another event related to the main one and drawing high media attention, such as a TV interview or a scandal, has occurred and has skewed the respective WIs.
The last phase contains the final runs of the proposed algorithm, which in turn generate the final results. A normalization of the final results is required only if the number of different sets of words/phrases used in the proposed algorithm is less than the actual set of tendencies under study.
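A minimal Python sketch of two of the steps described above, the two selection criteria of the words set phase and the exclusion of non-representative facts, is given below. It is not the authors' implementation. The function names, the thresholds `min_variance`, `level_ratio` and `factor`, and the use of the median as a baseline are all assumptions introduced for illustration, since the text gives no numeric criteria. The default `max_days=12` takes the "within 12 days" window from the text as printed. All WI series are assumed to cover the same days.

```python
from statistics import mean, median, pvariance


def select_terms(wi_by_term, min_variance=25.0, level_ratio=0.2):
    """Keep terms whose WI varies enough and whose average level is
    comparable with the terms selected so far (hypothetical thresholds)."""
    selected = {}
    for term, series in wi_by_term.items():
        varies = pvariance(series) >= min_variance
        comparable = True
        if selected:
            reference = mean(mean(s) for s in selected.values())
            level = mean(series)
            comparable = level_ratio * reference <= level <= reference / level_ratio
        if varies and comparable:
            selected[term] = series
    return selected


def mask_isolated_spikes(wi_by_term, factor=2.0, max_days=12):
    """Replace with None the WI values of short spikes that no other
    selected term shares on the same days."""
    # A day is "elevated" when the WI exceeds factor x the term's median.
    flags = {
        term: [v > factor * median(series) for v in series]
        for term, series in wi_by_term.items()
    }
    cleaned = {term: list(series) for term, series in wi_by_term.items()}
    for term, term_flags in flags.items():
        day = 0
        while day < len(term_flags):
            if not term_flags[day]:
                day += 1
                continue
            end = day
            while end < len(term_flags) and term_flags[end]:
                end += 1
            # The run must rise and fall inside the examined period.
            bounded = day > 0 and end < len(term_flags)
            shared = any(
                flags[other][d]
                for other in flags if other != term
                for d in range(day, end)
            )
            if bounded and end - day <= max_days and not shared:
                for d in range(day, end):
                    cleaned[term][d] = None
            day = end
    return cleaned


wi = {"term_a": [40, 42, 95, 41, 43], "term_b": [50, 52, 51, 49, 50]}
print(mask_isolated_spikes(wi)["term_a"])  # prints [40, 42, None, 41, 43]
```

Masked days are set to `None` rather than deleted so that the remaining days of every term stay aligned for the final runs.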