
Roberto De Virgilio Francesco Guerra - Semantic Search over the Web

City: Berlin; Heidelberg. Year: 2014. Publisher: Springer. Genre: Computer.


Part 1
Introduction to Web of Data
Roberto De Virgilio, Francesco Guerra and Yannis Velegrakis (eds.), Semantic Search over the Web, Data-Centric Systems and Applications, 2012. DOI: 10.1007/978-3-642-25008-8_1. Springer-Verlag Berlin Heidelberg 2012
1. Topology of the Web of Data
Christian Bizer (corresponding author), Pablo N. Mendes, and Anja Jentzsch
Web-based Systems Group, Freie Universität Berlin, Garystr. 21, 14195 Berlin, Germany
Abstract
Over the last few years, an increasing number of Web sites have started to embed structured data into HTML documents as well as to publish structured data directly on the Web in addition to HTML documents. This trend has led to the extension of the Web with a global data space: the Web of Data. Like the classic document Web, the Web of Data covers a wide variety of topics, ranging from data describing people, organizations, and events, through products and reviews, to statistical data provided by governments and research data from various scientific disciplines. This chapter gives an overview of the topology of the Web of Data. We discuss the different techniques that are used to publish structured data on the Web and provide statistics about the amount and topics of the data currently published using each technique.
1.1 Introduction
The degree of structure of Web content is the determining factor for the types of functionality that search engines can provide. The better structured the Web content is, the easier it is for search engines to understand it and, based on this understanding, to provide advanced functionality such as faceted filtering or the aggregation of content from multiple Web sites.
Today, most Web sites are generated from structured data that is stored in relational databases. It therefore takes little extra effort for Web sites to publish this structured data directly on the Web in addition to HTML pages, which helps search engines understand Web content and provide improved functionality.
An early approach to realizing this idea and helping search engines understand Web content is Microformats. RDFa followed as an alternative, more generic language for embedding any type of data into HTML pages.
Today, major search engines such as Google, Yahoo!, and Bing extract Microformat and RDFa data describing products, reviews, persons, events, and recipes from Web pages and use the extracted data to improve the users' search experience. The search engines have started to aggregate structured data from different Web sites and augment their search results with these aggregated information units in the form of rich snippets, which combine, for instance, data describing a product with reviews of the product from different sites.
The support of Microformats and RDFa by major data consumers, such as Google, Yahoo!, Microsoft, and Facebook, has led to a sharp increase in the number of Web sites that embed structured data into HTML pages. According to statistics presented by Yahoo!, the number of Web pages containing RDFa data increased by 510% between 2009 and 2010. As of October 2010, 430 million Web pages contained RDFa markup, while over 300 million pages contained Microformat data [...] will be equally understood by all three search engines. This move toward standardization is likely to further increase the amount of structured data being published on the Web.
In parallel to the different techniques for embedding structured data into HTML pages, a set of best practices for publishing structured data directly on the Web has gained considerable traction: Linked Data.
This chapter gives an overview of the topology of the Web of Data that has been created by publishing data on the Web using the Microformats, RDFa, Microdata, and Linked Data publishing techniques. A later section discusses Linked Data and gives an overview of the Linked Data deployment on the Web. For each of the four techniques, we:
Summarize the main features and give an overview of the history of the technique
Provide a syntax example which shows how data describing a person is published on the Web using the technique
Present deployment statistics showing the amounts and types of data currently published using the specific technique
The syntax examples highlight how the different techniques handle (1) the identification of entities; (2) the representation of type information, e.g., that an entity is a person; (3) the representation of literal property values, such as the name of the person; and (4) the representation of relationships between entities, such as that Peter Smith knows Paula Jones.
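The four aspects listed above can be pictured as a small, technique-neutral data model. The following is only an illustrative sketch: the class name `Entity`, the field names, and the `example.org` identifiers are assumptions introduced here and are not part of Microformats, RDFa, Microdata, or Linked Data.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Hypothetical, technique-neutral description of one entity."""
    identifier: str                                # (1) identification of the entity
    types: set = field(default_factory=set)        # (2) type information
    literals: dict = field(default_factory=dict)   # (3) literal property values
    relations: list = field(default_factory=list)  # (4) relationships to other entities


# Placeholder identifiers; each publishing technique has its own way of naming entities.
peter = Entity(identifier="http://example.org/peter")
peter.types.add("Person")
peter.literals["name"] = "Peter Smith"
peter.relations.append(("knows", "http://example.org/paula"))
```

Each publishing technique discussed in this chapter can be read as a different concrete syntax for filling in these four slots.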
In order to provide an entry point for experimentation as well as for the evaluation of search engines that facilitate Web data, a later section gives an overview of large-scale datasets that have been crawled from the Web of Data and are publicly available for download.
1.2 Microformats
Microformats (also referred to as μF) are community-driven vocabulary agreements for providing semantic markup on Web pages. The motto of the Microformats community is "designed for humans first, machines second." Each Microformat defines a vocabulary and a syntax for applying the vocabulary to describe the content on Web pages. A Microformat's syntax commonly specifies which properties are required or optional and which classes should be nested under one another.
Microformats emerged as a community effort, in contrast to other semantic markup technologies, which sought the route of a standardization body. Early contributors to Microformats include Kevin Marks, Tantek Çelik, and Mark Pilgrim, among others. The first implementations of Microformats date back to 2003.
It is argued that the simplistic approach offered by Microformats eases the learning curve and therefore lowers the entry barrier for newcomers. On the other hand, due to the lack of a unified syntax for all microformats, consuming structured data from Microformats requires the development of specialized parsers for each format. This is a reflection of the Microformats approach to address specific use cases, in contrast to RDFa and Microdata (presented in the following sections) which support the representation of any kind of data.
1.2.1 Microformats Syntax
Microformats consist of a definition of a vocabulary (names for classes and properties), as well as a set of rules (e.g., required properties, correct nesting of elements). These rules largely rely on existing HTML/XHTML attributes for inserting markup. One example is the HTML attribute class, commonly used as a style sheet selector, which is reused in Microformats for describing properties and types of entities.
Figure 1.1 shows the properties url (Peter's home page) and fn (Peter's full name). The markup also states that Peter knows Paula through the use of the properties met and acquaintance defined in the XFN microformat. An hCard parser should be aware that the url property refers to the value of the href attribute, while fn refers to the value of the child text of the HTML element. Such parsing instructions are described within each microformat specification.
Fig. 1.1 Microformats representation of the example entity description
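The parsing rules just described can be sketched in a few lines of Python using only the standard library. This is a minimal illustration under stated assumptions, not a conforming hCard or XFN parser: the `SAMPLE` markup is a hypothetical stand-in for the figure (whose original listing is not reproduced here), the `example.org` URLs are placeholders, and the class `HCardSketchParser` and function `parse_sketch` are names invented for this example.

```python
from html.parser import HTMLParser

# Hypothetical markup in the spirit of the figure; the URLs are placeholders.
SAMPLE = """
<div class="vcard">
  <a class="url fn" href="http://example.org/peter">Peter Smith</a>
  <a rel="met acquaintance" href="http://example.org/paula">Paula Jones</a>
</div>
"""


class HCardSketchParser(HTMLParser):
    """Collects url/fn (hCard) and rel values (XFN) from flat markup."""

    def __init__(self):
        super().__init__()
        self.properties = {}   # hCard properties found so far
        self.relations = []    # (list of rel values, target URL) pairs
        self._collect_fn = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        rels = (attrs.get("rel") or "").split()
        # url is taken from the href attribute of the element.
        if "url" in classes and attrs.get("href"):
            self.properties["url"] = attrs["href"]
        # fn is taken from the child text of the element.
        if "fn" in classes:
            self._collect_fn = True
            self.properties["fn"] = ""
        # XFN expresses relationships through rel values on links.
        if tag == "a" and rels and attrs.get("href"):
            self.relations.append((rels, attrs["href"]))

    def handle_data(self, data):
        if self._collect_fn:
            self.properties["fn"] += data

    def handle_endtag(self, tag):
        # Simplification: any end tag stops fn collection (no nested elements).
        self._collect_fn = False


def parse_sketch(html):
    parser = HCardSketchParser()
    parser.feed(html)
    parser.close()
    return parser.properties, parser.relations


properties, relations = parse_sketch(SAMPLE)
```

Note that the rules for url and fn are hard-coded per property. This mirrors the point made earlier that, without a unified syntax, each microformat needs its own specialized parsing logic.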