Python 3 Text Processing with NLTK 3 Cookbook
eBook - ePub

Python 3 Text Processing with NLTK 3 Cookbook

Jacob Perkins

Partager le livre
  1. 304 pages
  2. English
  3. ePUB (adapté aux mobiles)
  4. Disponible sur iOS et Android
eBook - ePub

Python 3 Text Processing with NLTK 3 Cookbook

Jacob Perkins

DĂ©tails du livre
Aperçu du livre
Table des matiĂšres
Citations

À propos de ce livre

This book will show you the essential techniques of text and language processing. Starting with tokenization, stemming, and the WordNet dictionary, you'll progress to part-of-speech tagging, phrase chunking, and named entity recognition. You'll learn how various text corpora are organized, as well as how to create your own custom corpus. Then, you'll move onto text classification with a focus on sentiment analysis. And because NLP can be computationally expensive on large bodies of text, you'll try a few methods for distributed text processing. Finally, you'll be introduced to a number of other small but complementary Python libraries for text analysis, cleaning, and parsing.

This cookbook provides simple, straightforward examples so you can quickly learn text processing with Python and NLTK.

Foire aux questions

Comment puis-je résilier mon abonnement ?
Il vous suffit de vous rendre dans la section compte dans paramĂštres et de cliquer sur « RĂ©silier l’abonnement ». C’est aussi simple que cela ! Une fois que vous aurez rĂ©siliĂ© votre abonnement, il restera actif pour le reste de la pĂ©riode pour laquelle vous avez payĂ©. DĂ©couvrez-en plus ici.
Puis-je / comment puis-je télécharger des livres ?
Pour le moment, tous nos livres en format ePub adaptĂ©s aux mobiles peuvent ĂȘtre tĂ©lĂ©chargĂ©s via l’application. La plupart de nos PDF sont Ă©galement disponibles en tĂ©lĂ©chargement et les autres seront tĂ©lĂ©chargeables trĂšs prochainement. DĂ©couvrez-en plus ici.
Quelle est la différence entre les formules tarifaires ?
Les deux abonnements vous donnent un accĂšs complet Ă  la bibliothĂšque et Ă  toutes les fonctionnalitĂ©s de Perlego. Les seules diffĂ©rences sont les tarifs ainsi que la pĂ©riode d’abonnement : avec l’abonnement annuel, vous Ă©conomiserez environ 30 % par rapport Ă  12 mois d’abonnement mensuel.
Qu’est-ce que Perlego ?
Nous sommes un service d’abonnement Ă  des ouvrages universitaires en ligne, oĂč vous pouvez accĂ©der Ă  toute une bibliothĂšque pour un prix infĂ©rieur Ă  celui d’un seul livre par mois. Avec plus d’un million de livres sur plus de 1 000 sujets, nous avons ce qu’il vous faut ! DĂ©couvrez-en plus ici.
Prenez-vous en charge la synthÚse vocale ?
Recherchez le symbole Écouter sur votre prochain livre pour voir si vous pouvez l’écouter. L’outil Écouter lit le texte Ă  haute voix pour vous, en surlignant le passage qui est en cours de lecture. Vous pouvez le mettre sur pause, l’accĂ©lĂ©rer ou le ralentir. DĂ©couvrez-en plus ici.
Est-ce que Python 3 Text Processing with NLTK 3 Cookbook est un PDF/ePUB en ligne ?
Oui, vous pouvez accĂ©der Ă  Python 3 Text Processing with NLTK 3 Cookbook par Jacob Perkins en format PDF et/ou ePUB ainsi qu’à d’autres livres populaires dans Ciencia de la computaciĂłn et Procesamiento del lenguaje natural. Nous disposons de plus d’un million d’ouvrages Ă  dĂ©couvrir dans notre catalogue.

Informations

Année
2014
ISBN
9781782167853

Python 3 Text Processing with NLTK 3 Cookbook


Table of Contents

Python 3 Text Processing with NLTK 3 Cookbook
Credits
About the Author
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers, and more
Why Subscribe?
Free Access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Errata
Piracy
Questions
1. Tokenizing Text and WordNet Basics
Introduction
Tokenizing text into sentences
Getting ready
How to do it...
How it works...
There's more...
Tokenizing sentences in other languages
See also
Tokenizing sentences into words
How to do it...
How it works...
There's more...
Separating contractions
PunktWordTokenizer
WordPunctTokenizer
See also
Tokenizing sentences using regular expressions
Getting ready
How to do it...
How it works...
There's more...
Simple whitespace tokenizer
See also
Training a sentence tokenizer
Getting ready
How to do it...
How it works...
There's more...
See also
Filtering stopwords in a tokenized sentence
Getting ready
How to do it...
How it works...
There's more...
See also
Looking up Synsets for a word in WordNet
Getting ready
How to do it...
How it works...
There's more...
Working with hypernyms
Part of speech (POS)
See also
Looking up lemmas and synonyms in WordNet
How to do it...
How it works...
There's more...
All possible synonyms
Antonyms
See also
Calculating WordNet Synset similarity
How to do it...
How it works...
There's more...
Comparing verbs
Path and Leacock Chordorow (LCH) similarity
See also
Discovering word collocations
Getting ready
How to do it...
How it works...
There's more...
Scoring functions
Scoring ngrams
See also
2. Replacing and Correcting Words
Introduction
Stemming words
How to do it...
How it works...
There's more...
The LancasterStemmer class
The RegexpStemmer class
The SnowballStemmer class
See also
Lemmatizing words with WordNet
Getting ready
How to do it...
How it works...
There's more...
Combining stemming with lemmatization
See also
Replacing words matching regular expressions
Getting ready
How to do it...
How it works...
There's more...
Replacement before tokenization
See also
Removing repeating characters
Getting ready
How to do it...
How it works...
There's more...
See also
Spelling correction with Enchant
Getting ready
How to do it...
How it works...
There's more...
The en_GB dictionary
Personal word lists
See also
Replacing synonyms
Getting ready
How to do it...
How it works...
There's more...
CSV synonym replacement
YAML synonym replacement
See also
Replacing negations with antonyms
How to do it...
How it works...
There's more...
See also
3. Creating Custom Corpora
Introduction
Setting up a custom corpus
Getting ready
How to do it...
How it works...
There's more...
Loading a YAML file
See also
Creating a wordlist corpus
Getting ready
How to do it...
How it works...
There's more...
Names wordlist corpus
English words corpus
See also
Creating a part-of-speech tagged word corpus
Getting ready
How to do it...
How it works...
There's more...
Customizing the word tokenizer
Customizing the sentence tokenizer
Customizing the paragraph block reader
Customizing the tag separator
Converting tags to a universal tagset
See also
Creating a chunked phrase corpus
Getting ready
How to do it...
How it works...
There's more...
Tree leaves
Treebank chunk corpus
CoNLL2000 corpus
See also
Creating a categorized text corpus
Getting ready
How to do it...
How it works...
There's more...
Category file
Categorized tagged corpus reader
Categorized corpora
See also
Creating a categorized chunk corpus reader
Getting ready
How to do it...
How it works...
There's more...
Categorized CoNLL chunk corpus reader
See also
Lazy corpus loading
How to do it...
How it works...
There's more...
Creating a custom corpus view
How to do it...
How it works...
There's more...
Block reader functions
Pickle corpus view
Concatenated corpus view
See also
Creating a MongoDB-backed corpus reader
Getting ready
How to do it...
How it works...
There's more...
See also
Corpus editing with file locking
Getting ready
How to do it...
How it works...
4. Part-of-speech Tagging
Introduction
Default tagging
Getting ready
How to do it...
How it works...
There's more...
Evaluating accuracy
Tagging sentences
Untagging a tagged sentence
See also
Training a unigram part-of-speech tagger
How to do it...
How it works...
There's more...
Overriding the context model
Minimum frequency cutoff
See also
Combining taggers with backoff tagging
How to do it...
How it works...
There's more...
Saving and loading a trained tagger with pickle
See also
Training and combining ngram taggers
Getting ready
How to do it...
How it works...
There's more.....

Table des matiĂšres