Knowledge Discovery from Multi-Sourced Data

This book addresses several knowledge discovery problems on multi-sourced data, drawing together theories, techniques, and methods from data cleaning, data mining, and natural language processing. This book mainly focuses on three data models: the multi-sourced isomorphic data, the mult...

Bibliographic Details
Main Authors: Ye, Chen; Wang, Hongzhi (Author); Dai, Guojun (Author)
Format: eBook
Language: English
Published: Singapore: Springer Nature Singapore, 2022
Edition: 1st ed. 2022
Series: SpringerBriefs in Computer Science
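Subjects: Artificial intelligence / Data processing; Database management; Data mining; Data Mining and Knowledge Discovery; Data Science
Online Access: https://doi.org/10.1007/978-981-19-1879-7?nosfx=y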
Collection: Springer eBooks 2005- - Collection details see MPG.ReNa
LEADER 03197nmm a2200349 u 4500
001 EB002017469
003 EBX01000000000000001180367
005 00000000000000.0
007 cr|||||||||||||||||||||
008 220701 ||| eng
020 |a 9789811918797 
100 1 |a Ye, Chen 
245 0 0 |a Knowledge Discovery from Multi-Sourced Data  |h Electronic resource  |c by Chen Ye, Hongzhi Wang, Guojun Dai 
250 |a 1st ed. 2022 
260 |a Singapore  |b Springer Nature Singapore  |c 2022, 2022 
300 |a XII, 83 p. 14 illus., 9 illus. in color  |b online resource 
505 0 |a 1. Introduction -- 2. Functional-dependency-based truth discovery for isomorphic data -- 3. Denial-constraint-based truth discovery for isomorphic data -- 4. Pattern discovery for heterogeneous data -- 5. Deep fact discovery for text data 
653 |a Artificial intelligence / Data processing 
653 |a Database Management 
653 |a Data mining 
653 |a Data Mining and Knowledge Discovery 
653 |a Database management 
653 |a Data Science 
700 1 |a Wang, Hongzhi  |e [author] 
700 1 |a Dai, Guojun  |e [author] 
041 0 7 |a eng  |2 ISO 639-2 
989 |b Springer  |a Springer eBooks 2005- 
490 0 |a SpringerBriefs in Computer Science 
028 5 0 |a 10.1007/978-981-19-1879-7 
856 4 0 |u https://doi.org/10.1007/978-981-19-1879-7?nosfx=y  |x Publisher  |3 Full text 
082 0 |a 006.312 
520 |a This book addresses several knowledge discovery problems on multi-sourced data, drawing together theories, techniques, and methods from data cleaning, data mining, and natural language processing. It mainly focuses on three data models: multi-sourced isomorphic data, multi-sourced heterogeneous data, and text data. On the basis of these three data models, the book studies knowledge discovery problems, including truth discovery and fact discovery, on multi-sourced data in terms of four important properties: relevance, inconsistency, sparseness, and heterogeneity. It is useful for specialists as well as graduate students. Data, even when describing the same object or event, can come from a variety of sources such as crowd workers and social media users. However, noisy pieces of data or information are unavoidable. Given the daunting scale of the data, it is unrealistic to expect humans to “label” the data or tell which data source is more reliable. Hence, it is crucial to identify trustworthy information from multiple noisy information sources, a task referred to as knowledge discovery. At present, knowledge discovery research for multi-sourced data mainly faces two challenges. On the structural level, it is essential to consider the different characteristics of data composition and application scenarios and to define the knowledge discovery problem for each setting. On the algorithm level, the knowledge discovery task needs to consider different levels of information conflict and design efficient algorithms that mine more valuable information using multiple clues. Existing knowledge discovery methods have defects on both the structural level and the algorithm level, leaving the knowledge discovery problem far from fully solved.