Natural Language Processing Fundamentals
eBook - ePub

Build intelligent applications that can interpret the human language to deliver impactful results

Sohom Ghosh, Dwight Gunning

  1. 374 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS and Android

Use Python and NLTK (Natural Language Toolkit) to build out your own text classifiers and solve common NLP problems.

Key Features

  • Assimilate key NLP concepts and terminologies
  • Explore popular NLP tools and techniques
  • Gain practical experience using NLP in application code

Book Description

If NLP hasn't been your forte, Natural Language Processing Fundamentals will make sure you get off to a steady start. This comprehensive guide will show you how to effectively use Python libraries and NLP concepts to solve various problems.

You'll be introduced to natural language processing and its applications through examples and exercises. This will be followed by an introduction to the initial stages of solving a problem, which include problem definition, getting text data, and preparing it for modeling. With exposure to concepts like advanced natural language processing algorithms and visualization techniques, you'll learn how to create applications that can extract information from unstructured data and present it as impactful visuals. Although you will continue to learn NLP-based techniques, the focus will gradually shift to developing useful applications. In these sections, you'll learn how to apply NLP techniques to answer questions, as is done in chatbots.

By the end of this book, you'll be able to handle a range of tasks, from identifying the most suitable type of NLP task for a problem to using a tool such as spaCy or Gensim to perform sentiment analysis. The book will equip you with the knowledge you need to build applications that interpret human language.

What you will learn

  • Obtain, verify, and clean data before transforming it into a correct format for use
  • Perform data analysis and machine learning tasks using Python
  • Understand the basics of computational linguistics
  • Build models for general natural language processing tasks
  • Evaluate the performance of a model with the right metrics
  • Visualize, quantify, and perform exploratory analysis from any text data

Who this book is for

Natural Language Processing Fundamentals is designed for novice and mid-level data scientists and machine learning developers who want to gather and analyze text data to build an NLP-powered product. Prior experience of coding in Python (using data types, writing functions, and importing libraries) will help. Some experience with linguistics and probability is useful but not necessary.

Chapter 1

Introduction to Natural Language Processing

Learning Objectives

By the end of this chapter, you will be able to:
  • Describe what natural language processing (NLP) is all about
  • Describe the history of NLP
  • Differentiate between NLP and Text Analytics
  • Implement various preprocessing tasks
  • Describe the various phases of an NLP project
In this chapter, you will learn about the basics of natural language processing and various preprocessing steps that are required to clean and analyze the data.


To begin our look at NLP, let's first understand what natural language is. In simple terms, it's the language we use to express ourselves. It's a basic means of communication. More specifically, language is a mutually agreed set of protocols involving words and sounds that we use to communicate with each other.
In this era of digitization and computation, we tend to comprehend language scientifically. This is because we are constantly trying to make inanimate objects understand us. Thus, it has become essential to develop mechanisms by which language can be fed to inanimate objects such as computers. NLP helps us do this.
Let's look at an example. You must have some emails in your mailbox that have been automatically labeled as spam. This is done with the help of NLP. Here, an inanimate object – the email service – analyzes the content of the emails, comprehends it, and then further decides whether these emails need to be marked as spam or not.

History of NLP

NLP is an area that overlaps with others. It has emerged from fields such as artificial intelligence, linguistics, formal languages, and compilers. With the advancement of computing technologies and the increased availability of data, the way natural language is processed has changed. Previously, traditional rule-based systems were used. Today, computations on natural language are done using machine learning and deep learning techniques.
Major work on machine learning-based NLP began in the 1980s, when developments across disciplines such as artificial intelligence, linguistics, formal languages, and computation led to the emergence of NLP as an interdisciplinary subject. In the next section, we'll look at text analytics and how it differs from NLP.

Text Analytics and NLP

Text analytics is the method of extracting meaningful insights and answering questions from text data. This text data need not be a human language. Let's understand this with an example. Suppose you have a text file that contains your outgoing phone calls and SMS log data in the following format:
Figure 1.1: Format of call data
In the preceding figure, the first two fields represent the date and time at which the call was made or the SMS was sent. The third field represents the type of data. If the data is of the call type, then the value for this field will be set as voice_call. If the type of data is sms, the value of this field will be set to sms. The fourth field is for the phone number and name of the contact. If the number of the person is not in the contact list, then the name value will be left blank. The last field is for the duration of the call or text message. If the type of the data is voice_call, then the value in this field will be the duration of that call. If the type of data is sms, then the value in this field will be the text message.
The following figure shows records of call data stored in a text file:
Figure 1.2: Call records in a text file
Now, the data shown in the preceding figure is not exactly a human language, but it contains information that can be extracted by analyzing it. A couple of questions that can be answered by looking at this data are as follows:
  • How many New Year greetings were sent by SMS on 1st January?
  • How many people were contacted whose name is not in the contact list?
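As a minimal sketch of how such questions might be answered in code, the following assumes a made-up record layout, since the actual format in Figure 1.1 is not reproduced here. The `|` and `:` separators, the sample records, and the names `parse_record`, `new_year_sms`, and `unknown_contacts` are all hypothetical.

```python
# Hypothetical log lines. The real layout from Figure 1.1 is not shown in
# this excerpt, so the "|" and ":" separators are assumptions.
LOG = """\
2019-01-01|00:05:12|sms|5550101:Alice|Happy New Year!
2019-01-01|09:30:00|voice_call|5550102:|00:03:41
2019-01-01|10:15:45|sms|5550103:|happy new year to you and the family
2019-01-02|18:02:10|sms|5550101:Alice|See you at 8
2019-01-03|11:00:00|voice_call|5550102:|00:00:58
"""


def parse_record(line):
    """Split one log line into the five fields described in the text."""
    date, time, kind, contact, payload = line.split("|", 4)
    # The contact field holds "number:name"; the name is empty when the
    # number is not in the contact list.
    number, _, name = contact.partition(":")
    return {
        "date": date,
        "time": time,
        "type": kind,
        "number": number,
        "name": name,
        "payload": payload,
    }


records = [parse_record(line) for line in LOG.splitlines() if line]

# Question 1: SMS messages sent on 1st January that contain a New Year greeting.
new_year_sms = sum(
    1
    for r in records
    if r["type"] == "sms"
    and r["date"].endswith("-01-01")
    and "happy new year" in r["payload"].lower()
)

# Question 2: distinct numbers contacted that have no name in the contact list.
unknown_contacts = {r["number"] for r in records if not r["name"]}

print(new_year_sms)
print(len(unknown_contacts))
```

Matching on the literal phrase "happy new year" is a deliberately crude rule; real greetings vary in wording, so this check would miss many of them.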
The art of extracting useful insights from any given text data can be referred to as text analytics. NLP, on the other hand, is not restricted to text data. Voice (speech) recognition and analysis also come under the domain of NLP. NLP can be broadly categorized into two types: Natural Language Understanding (NLU) and Natural Language Generation (NLG). These terms are explained as follows:
  • NLU: NLU refers to a process by which an inanimate object with computing power is able to comprehend spoken language.
  • NLG: NLG refers to a process by which an inanimate object with computing power is able to manifest its thoughts in a language that humans are able to understand.
For example, when a human speaks to a machine, the machine interprets the human language with the help of the NLU process. Also, by using the NLG process, the machine generates an appropriate response and shares that with the human, thus making it easier for humans to understand. These tasks, which are part of NLP, are not part of text analytics. Now we will look at an exercise that will give us a better understanding of text analytics.

Exercise 1: Basic Text Analytics

In this exercise, we will perform some basic text analytics on the given text data. Follow these steps to implement this exercise:
  1. Open a Jupyter notebook.
  2. Insert a new cell and assign the string 'The quick brown fox jumps over the lazy dog' to a variable named sentence, using the following code:
    sentence = 'The quick brown fox jumps over the lazy dog'
  3. Check whether the word 'quick' belongs to that text using the following code:
    'quick' in sentence
    The preceding code will return the output True.
  4. Find out the index value of the word 'fox' using the following code:
    The code will return the output 16.
  5. To find out the rank of the word 'lazy', use the following code:
    The code generates the output 7.
  6. For printing the third word of the given text, use the following code:
    This will return the output 'brown'.
  7. To print the third word of the given sentence in reverse order, use the following code:
    This will return the output 'nworb'.
  8. To concatenate the first and last words of the given sentence, use the following cod...
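
The code for steps 4 to 7 is missing from this excerpt. The following is a minimal sketch, assuming only standard Python string methods, that reproduces the outputs stated in those steps. It is not necessarily the authors' original code, and the variable names are hypothetical. Step 8 is cut off in the source, so it is not reconstructed here.

```python
sentence = 'The quick brown fox jumps over the lazy dog'

# Step 4: character index at which the word 'fox' starts.
fox_index = sentence.index('fox')

# Step 5: position ("rank") of 'lazy' in the list of words, counting from 0.
words = sentence.split()
lazy_rank = words.index('lazy')

# Step 6: the third word of the sentence.
third_word = words[2]

# Step 7: the third word with its characters in reverse order.
third_word_reversed = third_word[::-1]

print(fox_index, lazy_rank, third_word, third_word_reversed)
# 16 7 brown nworb
```

Note that `index` counts from zero in both cases, which is why 'lazy', the eighth word, has rank 7.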


  1. Preface
  2. Chapter 1: Introduction to Natural Language Processing
  3. Chapter 2: Basic Feature Extraction Methods
  4. Chapter 3: Developing a Text Classifier
  5. Chapter 4: Collecting Text Data from the Web
  6. Chapter 5: Topic Modeling
  7. Chapter 6: Text Summarization and Text Generation
  8. Chapter 7: Vector Representation
  9. Chapter 8: Sentiment Analysis
  10. Appendix
Citation styles for Natural Language Processing Fundamentals

APA 6 Citation

Ghosh, S., & Gunning, D. (2019). Natural Language Processing Fundamentals (1st ed.). Packt Publishing.

Chicago Citation

Ghosh, Sohom, and Dwight Gunning. (2019) 2019. Natural Language Processing Fundamentals. 1st ed. Packt Publishing.

Harvard Citation

Ghosh, S. and Gunning, D. (2019) Natural Language Processing Fundamentals. 1st edn. Packt Publishing.

MLA 7 Citation

Ghosh, Sohom, and Dwight Gunning. Natural Language Processing Fundamentals. 1st ed. Packt Publishing, 2019. Web. 14 Oct. 2022.