Semantic Computing
eBook - ePub

Semantic Computing

Phillip C-Y Sheu

  1. 252 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS & Android

About This Book

As the first volume of the World Scientific Encyclopedia with Semantic Computing and Robotic Intelligence, this volume is designed to lay the foundation for understanding Semantic Computing (SC) as a core concept for studying Robotic Intelligence in the subsequent volumes.

This volume aims to provide a reference on the development of Semantic Computing in terms of "meaning", "context", and "intention". It brings together a series of technical notes, on average no longer than 10 pages, each focusing on one topic in Semantic Computing. Each note, whether a review article or a research paper, explains the fundamental concepts, models or algorithms, and possible applications of the technology concerned.

This volume will address three core areas in Semantic Computing:

  • Understanding the (possibly naturally-expressed) intentions (semantics) of users and expressing them in a machine-processable format: semantics description languages, ontology integration, interoperability
  • Understanding the meanings (semantics) of computational content (of various sorts, including, but not limited to, text, video, audio, process, network, software and hardware) and expressing them in a machine-processable format in multimedia, IoT, SDN, wearable computing, interfaceable with mobile computing, search engines, question answering and web services, to support applications in biomedicine, healthcare, manufacturing, engineering, education, finance, entertainment, business, science and the humanities
  • Mapping the semantics of the user in context for content retrieval, management and creation in the form of structured data, image and video, audio and speech, big data, natural language and deep learning

Contents:

  • Part 1: Understanding Semantics:
    • Open Information Extraction (D-T Vo and E Bagheri)
    • Methods and Resources for Computing Semantic Relatedness (Y Feng and E Bagheri)
    • Semantic Summarization of Web News (F Amato, V Moscato, A Picariello, G Sperlí, A D'Acierno and A Penta)
    • Event Identification in Social Networks (F Zarrinkalam and E Bagheri)
    • Community Detection in Social Networks (H Fani and E Bagheri)
    • High-Level Surveillance Event Detection (F Persia and D D'Auria)
  • Part 2: Data Science:
    • Selected Topics in Statistical Computing (S B Chatla, C-H Chen and G Shmueli)
    • Bayesian Networks: Theory, Applications and Sensitivity Issues (R S Kenett)
    • GLiM: Generalized Linear Models (J R Barr and S Zacks)
    • OLAP and Machine Learning (J Jin)
    • Survival Analysis via Cox Proportional Hazards Additive Models (L Bai and D Gillen)
    • Deep Learning (X Hao and G Zhang)
    • Two-Stage and Sequential Sampling for Estimation and Testing with Prescribed Precision (S Zacks)
    • Business Process Mining (A Pourmasoumi and E Bagheri)
    • The Information Quality Framework for Evaluating Data Science Programs (S Y Coleman and R S Kenett)
  • Part 3: Data Integration:
    • Enriching Semantic Search with Preference and Quality Scores (M Missikoff, A Formica, E Pourabbas and F Taglino)
    • Multilingual Semantic Dictionaries for Natural Language Processing: The Case of BabelNet (C D Bovi and R Navigli)
    • Model-Based Documentation (F Farazi, C Chapman, P Raju and W Byrne)
    • Entity Linking for Tweets (P Basile and A Caputo)
    • Enabling Semantic Technologies Using Multimedia Ontology (A M Rinaldi)
  • Part 4: Applications:
    • Semantic Software Engineering (T Wang, A Kitazawa and P Sheu)
    • A Multimedia Semantic Framework for Image Understanding and Retrieval (A Penta)
    • Use of Semantics in Robotics — Improving Doctors' Performance Using a Cricothyrotomy Simulator (D D'Auria and F Persia)
    • Semantic Localization (S Ma and Q Liu)
    • Use of Semantics in Bio-Informatics (C C N Wang and J J P Tsai)
    • Actionable Intelligence and Online Learning for Semantic Computing (C Tekin and M van der Schaar)

Readership: For students and researchers of Computer Science.
Keywords: Artificial Intelligence; Semantic Computing; Thinking Machine; Robotics

Key Features:

  • The World Scientific Encyclopedia with Semantic Computing and Robotic Intelligence is the most up-to-date reference on the subject available on the market today
  • Supported by its sister e-journal, available at http://www.worldscientific.com/worldscinet/escri, where new articles are released online on a monthly basis
  • Once every 6 months, the articles published in the e-journal will be compiled into a volume of the World Scientific Encyclopedia with Semantic Computing and Robotic Intelligence, published as the ESCRI collection

Information

Publisher
WSPC
Year
2017
ISBN
9789813227934

Part 1

Understanding Semantics

Open information extraction

Duc-Thuan Vo* and Ebrahim Bagheri†
Laboratory for Systems, Software and Semantics (LS3)
Ryerson University, Toronto, ON, Canada
*[email protected]
†[email protected]
Open information extraction (Open IE) systems aim to obtain relation tuples through extraction that is highly scalable and portable across domains, by identifying a variety of relation phrases and their arguments in arbitrary sentences. The first generation of Open IE learns linear-chain models based on unlexicalized features such as Part-of-Speech (POS) or shallow tags to label the intermediate words between a pair of potential arguments in order to identify extractable relations. Open IE is currently in its second generation, which is able to extract instances of the most frequently observed relation types, such as Verb, Noun and Prep, Verb and Prep, and Infinitive, using deep linguistic analysis. Second-generation systems expose simple yet principled ways in which verbs express relationships in language, such as verb phrase-based extraction or clause-based extraction, and they obtain significantly higher performance than first-generation systems. In this paper, we give an overview of the two Open IE generations, including their strengths, weaknesses and application areas.
Keywords: Open information extraction; natural language processing; verb phrase-based extraction; clause-based extraction.

1. Information Extraction and Open Information Extraction

Information Extraction (IE) is growing into one of the active research areas in artificial intelligence, enabling computers to read and comprehend unstructured textual content.1 IE systems aim to distill semantic relations, which present relevant segments of information on entities and the relationships between them, from large numbers of textual documents. The main objective of IE is to extract and represent information as a tuple of two entities and a relationship between them. For instance, given the sentence “Barack Obama is the President of the United States”, an IE system attempts to extract the relation tuple President of (Barack Obama, the United States) automatically. The identified relations can be used to enhance machine reading by building knowledge bases in Resource Description Framework (RDF) or ontology forms. Most IE systems2–5 focus on extracting tuples from domain-specific corpora and rely on some form of pattern-matching technique. The performance of these systems therefore depends heavily on considerable domain-specific knowledge. Several methods employ advanced pattern-matching techniques to extract relation tuples from knowledge bases by learning patterns from labeled training examples that serve as initial seeds.
Many current IE systems are limited in terms of scalability and portability across domains, whereas in most corpora, such as news, blogs, emails and encyclopedias, extractors need to be able to extract relation tuples across different domains. There has therefore been a move towards next-generation IE systems that can be highly scalable on large Web corpora. Etzioni et al.1 introduced one of the pioneering Open IE systems, called TextRunner.6 This system tackles an unbounded number of relations, eschews domain-specific training data, and scales linearly. It does not presuppose a predefined set of relations and targets all relations that can be extracted. Open IE is currently being developed in its second generation in systems such as ReVerb,7 OLLIE,7 and ClausIE,8 which extend previous Open IE systems such as TextRunner,6 StatSnowBall,9 and WOE.10 Figure 1 summarizes the differences between traditional IE systems and the new IE systems, which are called Open IE.1,11

2. First Open IE Generation

In the first generation, Open IE systems aimed at constructing a general model that could express a relation based on unlexicalized features such as POS or shallow tags, e.g., a description of a verb in its surrounding context or the presence of capitalization and punctuation. While traditional IE requires relations to be specified in the input, Open IE systems use their relation-independent model for self-training to learn relations and entities in the corpora. TextRunner is one of the first Open IE systems. It applied a Naive Bayes model with POS and chunking features, trained on tuples from examples heuristically generated from the Penn Treebank. Subsequent work showed that a linear-chain Conditional Random Field (CRF)1,6 or a Markov Logic Network9 can be used for identifying extractable relations. Several Open IE systems were proposed in the first generation, including TextRunner, WOE, and StatSnowBall. They typically consist of the following three stages: (1) intermediate levels of analysis, (2) learning models, and (3) presentation, which we elaborate on in the following:
Fig. 1. IE versus Open IE.
Intermediate levels of analysis
In this stage, Natural Language Processing (NLP) techniques such as named entity recognition (NER), POS tagging and phrase chunking are used. A sequence of words is taken as input, and a POS tagger labels each word in the sequence with its part of speech, e.g., noun, verb or adjective. A phrase chunker then divides the sentence into a set of nonoverlapping phrases based on the POS tags. Named entities in the sentence are located and categorized by NER. Some systems such as TextRunner and WOE used KNnext8 to work directly with the output of the syntactic and dependency parsers, as shown in Fig. 2. They define a method to identify useful proposition components of the parse trees. As a result, a parser returns a parse tree that includes the POS of each word, the presence of phrases, grammatical structures and semantic roles for the input sentence. This structure and annotation are essential for determining the relationship between entities when learning models in the next stage.
Fig. 2. POS, NER and DP analysis in the sentence “Albert Einstein was awarded the Nobel Prize for Physics in 1921”.
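The chunking step described above can be sketched in a few lines. This is only a toy illustration, not the method of any system named in this chapter: the POS tags are assigned by hand (a real pipeline would obtain them from a POS tagger), the tag names follow Penn Treebank conventions, and the names `TAGGED`, `chunk_label` and `chunk` are hypothetical.

```python
from typing import List, Tuple

# Hand-assigned POS tags for part of the Fig. 2 sentence (assumption: a real
# system would get these from a POS tagger, not from a hard-coded list).
TAGGED: List[Tuple[str, str]] = [
    ("Albert", "NNP"), ("Einstein", "NNP"), ("was", "VBD"), ("awarded", "VBN"),
    ("the", "DT"), ("Nobel", "NNP"), ("Prize", "NNP"), ("for", "IN"),
    ("Physics", "NNP"), ("in", "IN"), ("1921", "CD"),
]

# Simplified tag groups chosen for this illustration only.
NOUN_TAGS = {"DT", "JJ", "NN", "NNS", "NNP", "NNPS", "CD"}
VERB_TAGS = {"VB", "VBD", "VBG", "VBN", "VBP", "VBZ", "MD"}


def chunk_label(tag: str) -> str:
    """Map a POS tag to a coarse chunk label: NP, VP, or O (outside)."""
    if tag in NOUN_TAGS:
        return "NP"
    if tag in VERB_TAGS:
        return "VP"
    return "O"


def chunk(tagged: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Group adjacent words with the same NP/VP label into nonoverlapping phrases."""
    groups: List[Tuple[str, List[str]]] = []
    for word, tag in tagged:
        label = chunk_label(tag)
        # Extend the previous phrase only for NP/VP runs; "O" words stay separate.
        if groups and label != "O" and groups[-1][0] == label:
            groups[-1][1].append(word)
        else:
            groups.append((label, [word]))
    return [(label, " ".join(words)) for label, words in groups]


if __name__ == "__main__":
    for label, text in chunk(TAGGED):
        print(label, text)
```

The first three phrases this produces, a noun phrase, a verb phrase and a second noun phrase, are the kind of structure the later stages use as candidate arguments and relation phrases.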
Learning models
An Open IE system learns a general model that depicts how a relation can be expressed in a particular language. A linear-chain model such as a CRF can then be applied to a sequence labeled with POS tags, word segments, semantic roles, named entities, and traditional forms of relation extraction from the first stage. Given a set of input observations, the system trains a learning model to maximize the conditional probability of a finite set of labels. TextRunner and WOEpos use CRFs to learn whether sequences of tokens are part of a relation. When identifying entities, the system determines a maximum number of words, together with their surrounding pair of entities, that could be considered as possible evidence of a relation. Figure 3 shows the entity pair “Albert Einstein” and “the Nobel Prize”, with the relationship “was awarded” serving to anchor the entities. WOEparse, on the other hand, learns relations generated from corePath, a form of shortest path where a relation could exist, by computing the normalized logarithmic frequency as the probability that a relation could be found. For instance, the shortest path “Albert Einstein” [image] “was awarded” [image] “the Nobel Prize”, which presents the relationship between “Albert Einstein” and “the Nobel Prize”, could be learned from the pattern “E1” [image] “V” [image] “E2” in the training data.
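To make the shortest-path idea concrete, the following sketch finds the shortest path between two entity head words in a small dependency graph and scores path patterns by a normalized logarithmic frequency. It is a sketch under stated assumptions, not WOEparse itself: the dependency edges are written by hand, the min-max normalization of log counts is one plausible reading of "normalized logarithmic frequency", and all names (`EDGES`, `shortest_path`, `normalized_log_frequency`) and the pattern counts are hypothetical.

```python
import math
from collections import deque
from typing import Dict, List, Optional, Tuple

# Hand-written (head, dependent) edges for the example sentence (assumption:
# a real system would take these from a dependency parser).
EDGES: List[Tuple[str, str]] = [
    ("awarded", "Einstein"), ("Einstein", "Albert"), ("awarded", "was"),
    ("awarded", "Prize"), ("Prize", "the"), ("Prize", "Nobel"),
    ("awarded", "for"), ("for", "Physics"),
]


def shortest_path(edges: List[Tuple[str, str]], start: str, goal: str) -> Optional[List[str]]:
    """Breadth-first search over the dependency graph, treated as undirected."""
    graph: Dict[str, List[str]] = {}
    for head, dep in edges:
        graph.setdefault(head, []).append(dep)
        graph.setdefault(dep, []).append(head)
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def normalized_log_frequency(counts: Dict[str, int]) -> Dict[str, float]:
    """Min-max normalize log counts into [0, 1] (an assumed scoring form)."""
    logs = {pattern: math.log(count) for pattern, count in counts.items()}
    lo, hi = min(logs.values()), max(logs.values())
    if hi == lo:
        return {pattern: 1.0 for pattern in logs}
    return {pattern: (value - lo) / (hi - lo) for pattern, value in logs.items()}


if __name__ == "__main__":
    print(shortest_path(EDGES, "Einstein", "Prize"))
    # Made-up pattern counts, used only to exercise the scoring function.
    print(normalized_log_frequency({"E1-V-E2": 100, "E1-V-Prep-E2": 10, "rare": 1}))
```

Under this scoring, frequent path patterns such as the entity-verb-entity shape get scores near 1, while patterns seen only once get 0.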
Fig. 3. A CRF is used to identify the relationship “was awarded” between “Albert Einstein” and “the Nobel Prize”.
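For reference, the conditional probability that a linear-chain CRF maximizes during training is usually written as follows. This is the standard textbook form, not a formula taken from this chapter; the symbols are assumptions introduced here: x is the observed token sequence with its features, y is the label sequence of length T, f_k are feature functions, and λ_k are their learned weights.

```latex
p(\mathbf{y} \mid \mathbf{x})
  = \frac{1}{Z(\mathbf{x})}
    \exp\left( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y_{t-1}, y_t, \mathbf{x}, t) \right),
\qquad
Z(\mathbf{x})
  = \sum_{\mathbf{y}'}
    \exp\left( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y'_{t-1}, y'_t, \mathbf{x}, t) \right)
```

In the Fig. 3 example, the labels would mark whether each token between “Albert Einstein” and “the Nobel Prize” belongs to the relation phrase.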
Presentation
In this stage, Open IE systems provide a presentation of the extracted relation triples. After being labeled by the learning models, the input sentences are presented as instances of a set of relations. TextRunner and WOE take sentences in a corpus and quickly extract the textual triples present in each sentence. A relation triple contains three textual components, written (Arg1, Rel, Arg2), where the first and third denote the pair of entity arguments and the second denotes the relationship between them. Figure 4 shows the differences in presentation between traditional IE and Open IE.
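A minimal sketch of the (Arg1, Rel, Arg2) form is shown below. It assumes the sentence has already been split into labeled phrases and reads off a triple wherever a noun phrase, a verb phrase and a noun phrase occur in sequence. The `Triple` type, the `extract_triples` function and the hand-written `CHUNKS` input are hypothetical and stand in for the output of a trained model.

```python
from typing import List, NamedTuple, Tuple


class Triple(NamedTuple):
    """A relation triple in the (Arg1, Rel, Arg2) form."""
    arg1: str
    rel: str
    arg2: str


# Hand-written chunker output (assumption: a real system would produce this
# from the analysis and learning stages).
CHUNKS: List[Tuple[str, str]] = [
    ("NP", "Albert Einstein"), ("VP", "was awarded"), ("NP", "the Nobel Prize"),
    ("O", "for"), ("NP", "Physics"),
]


def extract_triples(chunks: List[Tuple[str, str]]) -> List[Triple]:
    """Emit a triple for every adjacent NP, VP, NP window."""
    triples: List[Triple] = []
    for i in range(len(chunks) - 2):
        (l1, t1), (l2, t2), (l3, t3) = chunks[i:i + 3]
        if (l1, l2, l3) == ("NP", "VP", "NP"):
            triples.append(Triple(t1, t2, t3))
    return triples


if __name__ == "__main__":
    for triple in extract_triples(CHUNKS):
        print(triple)
```

Because no relation name is fixed in advance, the relation slot simply holds the phrase found in the text, which is what distinguishes this output from a traditional IE record with a predefined schema.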
Additionally, with large-scale and heterogeneous corpora such as the Web, Open IE systems also need to address the disambiguation of entities: the same entity may be referred to by a variety of names (Obama, Barack Obama or B. H. Obama), and the same string (Michael) may refer to different entities. Open IE systems try to compute the probability that two strings denote synonymous entities based on a highly scalable and unsupervised analysis of tuples. TextRunner applies the Resolver system,12 while WOE uses the infoboxes from Wikipedia to classify entities in the relation triples.
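One simple way to picture such an unsupervised analysis of tuples is to compare the contexts in which two strings appear: strings that share many (relation, argument) contexts are more likely to name the same entity. The sketch below uses Jaccard overlap as a stand-in score. It is not the Resolver model, the score is not a calibrated probability, and the toy triples and the names `contexts` and `synonym_score` are assumptions made for illustration.

```python
from typing import List, Set, Tuple

TripleT = Tuple[str, str, str]

# Toy extractions invented for this illustration.
TRIPLES: List[TripleT] = [
    ("Obama", "was born in", "Hawaii"),
    ("Barack Obama", "was born in", "Hawaii"),
    ("Obama", "is the President of", "the United States"),
    ("Barack Obama", "is the President of", "the United States"),
    ("Barack Obama", "graduated from", "Harvard Law School"),
    ("Michael", "lives in", "Toronto"),
]


def contexts(triples: List[TripleT], name: str) -> Set[Tuple[str, str, str]]:
    """Collect the contexts of a string: the rest of each triple it occurs in."""
    found: Set[Tuple[str, str, str]] = set()
    for arg1, rel, arg2 in triples:
        if arg1 == name:
            found.add(("as-arg1", rel, arg2))
        if arg2 == name:
            found.add(("as-arg2", rel, arg1))
    return found


def synonym_score(triples: List[TripleT], a: str, b: str) -> float:
    """Jaccard overlap of the two strings' contexts, in [0, 1]."""
    ctx_a, ctx_b = contexts(triples, a), contexts(triples, b)
    if not ctx_a or not ctx_b:
        return 0.0
    return len(ctx_a & ctx_b) / len(ctx_a | ctx_b)


if __name__ == "__main__":
    print(synonym_score(TRIPLES, "Obama", "Barack Obama"))
    print(synonym_score(TRIPLES, "Obama", "Michael"))
```

A context-overlap score like this only needs the extracted tuples themselves, which is why this style of analysis can scale to Web-sized corpora without labeled data.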

2.1. Advantages and disadvantages

Open IE systems need to be highly scalable and to perform extractions on huge Web corpora such as news, blogs, emails, and encyclopedias. TextRunner was tested on a collection of over 120 million Web pages and extracted over 500 million triples. In a collaboration with Google, the system was also run on over one billion public Web pages, with noticeable precision and recall on this large-scale corpus.
First generation Open IE systems can suffer from problems such as extracting incoherent and uninformative relations. Incoherent extractions are circumstances when the system extracts relation phra...
