Principles and practice of big data: preparing, sharing, and analyzing complex information

Principles and Practice of Big Data: Preparing, Sharing, and Analyzing Complex Information, Second Edition updates and expands on the first edition, bringing a set of techniques and algorithms that are tailored to Big Data projects. The book stresses the point that most data analyses conducted on la...

Bibliographic Details
Main Author: Berman, Jules J.
Format: eBook
Language: English
Published: London: Academic Press, 2018
Edition: Second edition
Collection: O'Reilly - Collection details see MPG.ReNa
LEADER 06194nmm a2200433 u 4500
001 EB001940279
003 EBX01000000000000001103181
005 00000000000000.0
007 cr|||||||||||||||||||||
008 210123 ||| eng
020 |a 9780128156100 
050 4 |a QA76.9.B45 
100 1 |a Berman, Jules J. 
245 0 0 |a Principles and practice of big data  |b preparing, sharing, and analyzing complex information  |c Jules J. Berman 
250 |a Second edition 
260 |a London  |b Academic Press  |c 2018 
300 |a 1 online resource 
505 0 |a Chapter 2: Providing Structure to Unstructured Data; Section 2.1. Nearly All Data Is Unstructured and Unusable in Its Raw Form; Section 2.2. Concordances; Section 2.3. Term Extraction; Section 2.4. Indexing; Section 2.5. Autocoding; Section 2.6. Case Study: Instantly Finding the Precise Location of Any Atom in the Universe (Some Assembly Required); Section 2.7. Case Study (Advanced): A Complete Autocoder (in 12 Lines of Python Code); Section 2.8. Case Study: Concordances as Transformations of Text; Section 2.9. Case Study (Advanced): Burrows Wheeler Transform (BWT); Glossary; References 
505 0 |a Front Cover; Principles and Practice of Big Data: Preparing, sharing, and analyzing complex information; Copyright; Other Books by Jules J. Berman; Dedication; Contents; About the Author; Author's Preface to Second Edition; Author's Preface to First Edition; References; Chapter 1: Introduction; Section 1.1. Definition of Big Data; Section 1.2. Big Data Versus Small Data; Section 1.3. Whence Comest Big Data?; Section 1.4. The Most Common Purpose of Big Data Is to Produce Small Data; Section 1.5. Big Data Sits at the Center of the Research Universe; Glossary; References 
505 0 |a Includes bibliographical references and index 
505 0 |a Introduction -- Providing structure to unstructured data -- Identification, deidentification, and reidentification -- Metadata, semantics, and triples -- Classifications and ontologies -- Introspection -- Standards and data integration -- Immutability and immortality -- Assessing the adequacy of a big data resource -- Measurement -- Indispensable tips for fast and simple big data analysis -- Finding the clues in large collections of data -- Using random numbers to knock your big data analytic problems down to size -- Special considerations in big data analysis -- Big data failures and how to avoid (some of) them -- Data reanalysis : much more important than analysis -- Repurposing big data -- Data sharing and data security -- Legalities -- Societal issues 
505 0 |a Section 4.1. Metadata; Section 4.2. eXtensible Markup Language; Section 4.3. Semantics and Triples; Section 4.4. Namespaces; Section 4.5. Case Study: A Syntax for Triples; Section 4.6. Case Study: Dublin Core; Glossary; References; Chapter 5: Classifications and Ontologies; Section 5.1. It's All About Object Relationships; Section 5.2. Classifications, the Simplest of Ontologies; Section 5.3. Ontologies, Classes With Multiple Parents; Section 5.4. Choosing a Class Model; Section 5.5. Class Blending; Section 5.6. Common Pitfalls in Ontology Development 
505 0 |a Section 5.7. Case Study: An Upper Level Ontology; Section 5.8. Case Study (Advanced): Paradoxes; Section 5.9. Case Study (Advanced): RDF Schemas and Class Properties; Section 5.10. Case Study (Advanced): Visualizing Class Relationships; Glossary; References; Chapter 6: Introspection; Section 6.1. Knowledge of Self; Section 6.2. Data Objects: The Essential Ingredient of Every Big Data Collection; Section 6.3. How Big Data Uses Introspection; Section 6.4. Case Study: Time Stamping Data; Section 6.5. Case Study: A Visit to the TripleStore 
505 0 |a Chapter 3: Identification, Deidentification, and Reidentification; Section 3.1. What Are Identifiers?; Section 3.2. Difference Between an Identifier and an Identifier System; Section 3.3. Generating Unique Identifiers; Section 3.4. Really Bad Identifier Methods; Section 3.5. Registering Unique Object Identifiers; Section 3.6. Deidentification and Reidentification; Section 3.7. Case Study: Data Scrubbing; Section 3.8. Case Study (Advanced): Identifiers in Image Headers; Section 3.9. Case Study: One-Way Hashes; Glossary; References; Chapter 4: Metadata, Semantics, and Triples 
653 |a Big data / fast 
653 |a Big data / http://id.loc.gov/authorities/subjects/sh2012003227 
653 |a COMPUTERS / Databases / Data Mining / bisacsh 
653 |a Données volumineuses 
041 0 7 |a eng  |2 ISO 639-2 
989 |b OREILLY  |a O'Reilly 
015 |a GBB8F7064 
776 |z 0128156104 
776 |z 9780128156094 
776 |z 9780128156100 
776 |z 0128156090 
856 4 0 |u https://learning.oreilly.com/library/view/~/9780128156100/?ar  |x Verlag  |3 Volltext 
082 0 |a 005.7 
520 |a Principles and Practice of Big Data: Preparing, Sharing, and Analyzing Complex Information, Second Edition updates and expands on the first edition, bringing a set of techniques and algorithms that are tailored to Big Data projects. The book stresses the point that most data analyses conducted on large, complex data sets can be achieved without the use of specialized suites of software (e.g., Hadoop), and without expensive hardware (e.g., supercomputers). The core of every algorithm described in the book can be implemented in a few lines of code using just about any popular programming language (Python snippets are provided). Through the use of multiple new examples, this edition demonstrates that if we understand our data, and if we know how to ask the right questions, we can learn a great deal from large and complex data collections. The book will assist students and professionals from all scientific backgrounds who are interested in stepping outside the traditional boundaries of their chosen academic disciplines.
520 |a Bringing a set of techniques and algorithms that are tailored to Big Data projects, this book offers case studies across a range of scientific and engineering disciplines and provides insights into semantics, identification, de-identification, vulnerabilities and regulatory/legal issues.