Hands-On Python Natural Language Processing
eBook - ePub

Explore tools and techniques to analyze and process text with a view to building real-world NLP applications

Aman Kedia, Mayank Rasu

  1. 316 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS & Android

About This Book

Get well-versed with traditional as well as modern natural language processing concepts and techniques

Key Features

  • Perform various NLP tasks to build linguistic applications using Python libraries
  • Understand, analyze, and generate text to provide accurate results
  • Interpret human language using various NLP concepts, methodologies, and tools

Book Description

Natural Language Processing (NLP) is the subfield in computational linguistics that enables computers to understand, process, and analyze text. This book caters to the unmet demand for hands-on training of NLP concepts and provides exposure to real-world applications along with a solid theoretical grounding.

This book starts by introducing you to the field of NLP and its applications, along with the modern Python libraries that you'll use to build your NLP-powered apps. With the help of practical examples, you'll learn how to build reasonably sophisticated NLP applications, and cover various methodologies and challenges in deploying NLP applications in the real world. You'll cover key NLP tasks such as text classification, semantic embedding, sentiment analysis, machine translation, and developing a chatbot using machine learning and deep learning techniques. The book will also help you discover how machine learning techniques play a vital role in making your linguistic apps smart. Every chapter is accompanied by examples of real-world applications to help you build impressive NLP applications of your own.

By the end of this NLP book, you'll be able to work with language data, use machine learning to identify patterns in text, and get acquainted with the advancements in NLP.

What you will learn

  • Understand how NLP powers modern applications
  • Explore key NLP techniques to build your natural language vocabulary
  • Transform text data into mathematical data structures and learn how to improve text mining models
  • Discover how various neural network architectures work with natural language data
  • Get the hang of building sophisticated text processing models using machine learning and deep learning
  • Check out state-of-the-art architectures that have revolutionized research in the NLP domain

Who this book is for

This NLP Python book is for anyone looking to learn NLP's theoretical and practical aspects alike. It starts with the basics and gradually covers advanced concepts to make it easy to follow for readers with varying levels of NLP proficiency. This comprehensive guide will help you develop a thorough understanding of the NLP methodologies for building linguistic applications; however, working knowledge of Python programming language and high school level mathematics is expected.

Information

  • Year: 2020
  • ISBN: 9781838982584
  • Edition: 1

Section 1: Introduction
This section introduces the field of Natural Language Processing (NLP) and its applications. It also provides you with an overview of the ongoing research in this area and what future applications could be expected.
This section comprises the following chapters:
  • Chapter 1, Understanding the Basics of NLP
  • Chapter 2, NLP Using Python
Understanding the Basics of NLP
Natural Language Processing (NLP) is an interdisciplinary area of research aimed at making machines understand and process human languages. It is an evolving field, with a rapid increase in its acceptability and adoption in industry, and its growth is projected to continue. NLP-based applications are everywhere, and chances are that you already interact with an NLP-enabled application regularly (Alexa, Google Translate, chatbots, and so on). The objective of this book is to provide a hands-on learning experience and help you build NLP applications by understanding key NLP concepts. The book lays particular emphasis on Machine Learning (ML)- and Deep Learning (DL)-based applications and also delves into recent advances such as Bidirectional Encoder Representations from Transformers (BERT). We start this journey by providing a brief context of NLP and introduce you to some existing and evolving applications of NLP.
In this chapter, we'll cover the following topics:
  • Programming languages versus natural languages
  • Why should I learn NLP?
  • Current applications of NLP

Programming languages versus natural languages

Language has played a critical role in the evolution of our species and was arguably the key competitive advantage for our hunter-gatherer ancestors over other species. Naturally evolved languages, also called natural languages, allowed our ancestors to communicate more efficiently with their flock. The development of language scripts further accelerated their growth, as important information could now be documented and reproduced, obviating the need for memorizing. Needless to say, we humans have a deep affinity toward our languages, and we cherish the ability to communicate with fellow humans.
A new class of languages called programming languages surfaced around the mid-20th century, with the objective of communicating with machines to get the desired output. With the explosive growth of computers, gaining familiarity with programming languages assumed great significance in order to harness the computational power of these machines. You will come across various profiles on LinkedIn in which people refer to themselves as polyglots, implying that they are proficient in multiple programming languages. While there are similarities between natural languages and programming languages, in that they are used to communicate and have rules and syntax, there are some major differences. The most important difference is that natural languages are ambiguous, and therefore cannot be comprehended by machines. For example, refer to the following statement: Pick an integer and divide it by two; if the remainder is zero, then it is an even number.
For those who are proficient in math and English, the preceding statement may make complete sense. However, for someone who is new to deciphering human languages, the final "it" may refer to the integer, to two, or to the remainder. Likewise, natural languages encompass many other elements, such as sarcasm, double negation, and rhetorical expressions, which increase complexity; coding every inherent rule of a language for a machine to understand would require a monumental effort. These factors make natural languages unfit to be used as programming languages.
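As a minimal sketch of the contrast, the same even-number statement can be written in Python, where every reference is explicit. The function name `is_even` and the variable `remainder` are illustrative choices, not taken from the book:

```python
def is_even(n: int) -> bool:
    """Return True when dividing n by two leaves no remainder."""
    # 'remainder' names exactly one value, so there is no ambiguous "it".
    remainder = n % 2
    return remainder == 0


print(is_even(10))
print(is_even(7))
```

Unlike the English sentence, the code leaves no doubt that the value being tested is the integer `n` and that the value compared with zero is the remainder.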
How, then, do we communicate with computers humanly?

Understanding NLP

Scientists have been working on this precise question since the turn of the last century and, as of today, we have attained reasonable success in this area. The research on how to make computers understand and manipulate natural languages draws from several fields, including computer science, math, linguistics, and neuroscience, and the resulting interdisciplinary area of research is called NLP. Take a look at the following diagram, which illustrates this:
NLP is categorized as a subfield of the broader Artificial Intelligence (AI) discipline, which delves into simulating human intelligence in machines. English scientist Alan Turing, who is considered one of the pioneers of AI, developed a set of criteria (called the Turing test), which tested whether a machine could display intelligent behavior indistinguishable from that of a human. The machine's ability to understand and process natural languages is a prominent criterion of the Turing test.
Most early research in the field of NLP relied on fixed complex rules and mapping-based systems. These systems, although moderately successful, were difficult to scale. Another issue with the rule-based approach is that it does not mimic human learning of language very well. For example, if you are from Asia and are traveling to the USA, you will come across people who greet you by saying, "How's it going?" or "How are you doing?" A fixed rule-based language processing system would signal that the person cares about you and is genuinely interested in knowing about your wellbeing. However, before you prepare to give your long-winded response about how you are actually doing, you will see that the person has already walked by. When you see this pattern recurring and observe how other people respond to the same question, your brain overwrites the pre-existing rule and replaces it with a new contextual understanding, which was derived by some form of data analysis.
This data-driven approach is the cornerstone of most modern-day NLP research. With the advent of ML algorithms, the data deluge propelled by the internet, and significantly increased computational capacity, NLP solutions have become far more scalable and reliable. The most exciting thing about this NLP revolution is that most of it is driven by open source technology, meaning these solutions are freely available to anyone who wants to use or contribute to these projects.
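The greeting example can be sketched in a few lines of Python to contrast the two approaches. This is a toy illustration under stated assumptions, not a method from the book: the rule table, the labels `detailed_answer` and `short_greeting`, and the observed data are all hypothetical.

```python
from collections import Counter
from typing import List, Tuple

# Hypothetical fixed rule: a question about wellbeing expects a detailed answer.
FIXED_RULES = {"how's it going?": "detailed_answer"}


def rule_based_response(utterance: str) -> str:
    """Look the utterance up in a hand-written rule table."""
    return FIXED_RULES.get(utterance.lower(), "unknown")


def learned_response(observations: List[Tuple[str, str]], utterance: str) -> str:
    """Return the reply type most often observed for this utterance."""
    counts = Counter(
        reply for heard, reply in observations
        if heard.lower() == utterance.lower()
    )
    if not counts:
        return "unknown"
    return counts.most_common(1)[0][0]


# Hypothetical observations of how people actually respond to the greeting.
observed = [
    ("How's it going?", "short_greeting"),
    ("How's it going?", "short_greeting"),
    ("How's it going?", "detailed_answer"),
    ("How are you doing?", "short_greeting"),
]

print(rule_based_response("How's it going?"))
print(learned_response(observed, "How's it going?"))
```

The fixed rule keeps returning the literal interpretation, while the counting function shifts to whatever reply the observations favor. Real data-driven NLP systems use far richer models, but the principle of deriving behavior from observed data is the same.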
We have covered many of these algorithms and tools in this book, including the following:
  • ML algorithms (Naive Bayes; Support Vector Machine (SVM))
  • DL algorithms (Convolutional Neural Network (CNN); Recurrent Neural Network (RNN))
  • Similarity/dissimilarity measures
  • Long Short-Term Memory (LSTM) network; Gated Recurrent Unit (GRU)
  • BERT
  • Building chatbots; sentiment analyzer
  • Predictive analytics on text data
  • Machine translation system
We hope that by the end of this book, you will be able to build reasonably sophisticated NLP applications on your desktop PC.

Why should I learn NLP?

AI is rapidly penetrating various facets of our lives, from being our home assistant to fielding our queries as automated tech support. Various industry outlook reports project that AI will create millions of jobs (projections range between 200 and 500 million) worldwide by the year 2030. The majority of these jobs will require ML and NLP skills, and therefore it is imperative for engineers and technologists to upskill and prepare for the impending AI revolution and the rapidly evolving tech landscape.
NLP consistently features as the fastest-growing in-demand skill on Upwork (the largest freelancing platform), and job listings with an NLP tag continue to feature prominently on various job boards. Since NLP is a subfield of ML, organizations typically hire candidates as ML engineers to work on NLP projects. You could be working on the most cutting-edge ideas in large technology firms or implementing NLP technology-based applications in banks, e-commerce organizations, and so on. The exact work performed by NLP engineers can vary from project to project. However, some tasks are commonly performed: working with large volumes of unstructured data, preprocessing data, reading research papers on new developments in the field, tuning model parameters, and continuously improving models. The authors, having worked on several NLP projects and having followed the latest industry trends closely, can safely state that it's a very exciting time to work in the field of NLP.
You can benefit from learning about NLP even if you are simply a tech enthusiast and not particularly looking for a job as an NLP engineer. You can expect to build reasonably sophisticated NLP applications and tools on your MacBook or PC, on a shoestring budget. It is not surprising, therefore, that there has been a surge of start-ups providing NLP-based solutions to enterprises and retail clients.
A few of the exciting start-ups in this area are listed as follows:
  • Luminance: Legal tech start-up aimed at analyzing legal documents
  • NetBase: Real-time social media feed analytics
  • Agolo: Summarizes large bodies of text at scale
  • Idibon: Converts unstructured data to structured data
This area is also witnessing brisk acquisition activity, with larger tech companies acquiring start-ups (Samsung acquired Kngine; Reliance Communications acquired chatbot start-up Haptik; and so on). Given the low barriers to entry and easily accessible open source technologies, this trend is expected to continue.
Now that we have familiarized ourselves with NLP and the benefits of gaining proficiency in this area, we will discuss the current and evolving applications of NLP.

Current applications of NLP

NLP applications are everywhere, and it is highly unlikely that you have not interacted with any such application over the past few days. The current applications include virtual assistants (Alexa, Siri, Cortana, and so on), customer support tools (chatbots, email routers/classifiers, and so on), sentiment analyzers, translators, and document ranking systems. The adoption of these tools is growing quickly, since the speed and accuracy of these applications have increased manifold over the years. It should be noted that many popular NLP applications, such as Alexa and conversational bots, need to process audio data, which can be quantified by capturing the frequency of the underlyin...
