MPG.eBooks - Table of Contents: Natural language processing with Java


Natural language processing with Java: explore various approaches to organize and extract useful text from unstructured data using Java


Bibliographic Details
Main Author:	Reese, Richard M.
Format:	eBook
Language:	English
Published:	Birmingham, UK: Packt Publishing, 2015
Series:	Community experience distilled
Subjects:	Computers / Programming Languages / Java / bisacsh; Java (Computer program language) / fast; Java (Langage de programmation); Traitement automatique des langues naturelles; Java (Computer program language) / http://id.loc.gov/authorities/subjects/sh95008574; Natural language processing (Computer science) / fast; Natural language processing (Computer science) / http://id.loc.gov/authorities/subjects/sh88002425
Online Access:	https://learning.oreilly.com/library/view/~/978178...
Collection:	O'Reilly - Collection details see MPG.ReNa

Table of Contents:

Cover; Copyright; Credits; About the Author; About the Reviewers; www.PacktPub.com; Table of Contents; Preface; Chapter 1: Introduction to NLP; What is NLP?; Why use NLP?; Why is NLP so hard?; Survey of NLP tools; Apache OpenNLP; Stanford NLP; LingPipe; GATE; UIMA; Overview of text processing tasks; Finding parts of text; Finding sentences; Finding people and things; Detecting parts of speech; Classifying text and documents; Extracting relationships; Using combined approaches; Understanding NLP models; Identifying the task; Selecting a model; Building and training the model
Verifying the model; Using the model; Preparing data; Summary; Chapter 2: Finding Parts of Text; Understanding the parts of text; What is tokenization?; Uses for tokenizers; Simple Java tokenizers; Using the Scanner class; Specifying the delimiter; Using the split method; Using the BreakIterator class; Using the StreamTokenizer class; Using the StringTokenizer class; Java core tokenization performance considerations; NLP tokenizer APIs; Using the OpenNLPTokenizer; Using the SimpleTokenizer class; Using the WhitespaceTokenizer class; Using the TokenizerME class; Using the Stanford tokenizer
Using the PTBTokenizer class; Using the DocumentPreprocessor class; Using a pipeline; Using LingPipe tokenizers; Training a tokenizer to find parts of text; Comparing tokenizers; Understanding normalization; Converting to lowercase; Removing stopwords; Creating a StopWords class; Using LingPipe to remove stopwords; Using stemming; Using the Porter Stemmer; Stemming with LingPipe; Using lemmatization; Using the StanfordLemmatizer class; Using lemmatization in OpenNLP; Normalizing using a pipeline; Summary; Chapter 3: Finding Sentences; The SBD process; What makes SBD difficult?
Understanding SBD rules of LingPipe's HeuristicSentenceModel class; Simple Java SBDs; Using regular expressions; Using the BreakIterator class; Using NLP APIs; Using OpenNLP; Using the SentenceDetectorME class; Using the sentPosDetect method; Using the Stanford API; Using the PTBTokenizer class; Using the DocumentPreprocessor class; Using the StanfordCoreNLP class; Using LingPipe; Using the IndoEuropeanSentenceModel class; Using the SentenceChunker class; Using the MedlineSentenceModel class; Training a Sentence Detector model; Using the Trained model
Evaluating the model using the SentenceDetectorEvaluator class; Summary; Chapter 4: Finding People and Things; Why NER is difficult?; Techniques for name recognition; Lists and regular expressions; Statistical classifiers; Using regular expressions for NER; Using Java's regular expressions to find entities; Using LingPipe's RegExChunker class; Using NLP APIs; Using OpenNLP for NER; Determining the accuracy of the entity; Using other entity types; Processing multiple entity types; Using the Stanford API for NER; Using LingPipe for NER; Using LingPipe's name entity models


Similar Items