Natural Language Processing with Flair

Tadej Magajna

200 pages
English
ePUB


About the book

Learn how to solve practical NLP problems with the Flair Python framework, train sequence labeling models, work with text classifiers and word embeddings, and much more through hands-on practical exercises.

Key Features
  • Backed by the community and written by an NLP expert
  • Get an understanding of basic NLP problems and terminology
  • Solve real-world NLP problems with Flair with the help of practical hands-on exercises

Book Description
Flair is an easy-to-understand natural language processing (NLP) framework designed to facilitate training and distribution of state-of-the-art NLP models for named entity recognition, part-of-speech tagging, and text classification. Flair is also a text embedding library for combining different types of embeddings, such as document embeddings, Transformer embeddings, and the proposed Flair embeddings.
Natural Language Processing with Flair takes a hands-on approach to explaining and solving real-world NLP problems. You'll begin by installing Flair and learning about the basic NLP concepts and terminology. You will explore Flair's extensive features, such as sequence tagging, text classification, and word embeddings, through practical exercises. As you advance, you will train your own sequence labeling and text classification models and learn how to use hyperparameter tuning in order to choose the right training parameters. You will learn about the idea behind one-shot and few-shot learning through a novel text classification technique called TARS. Finally, you will solve several real-world NLP problems through hands-on exercises, as well as learn how to deploy Flair models to production.
By the end of this Flair book, you'll have developed a thorough understanding of typical NLP problems and you'll be able to solve them with Flair.

What you will learn
  • Gain an understanding of core NLP terminology and concepts
  • Get to grips with the capabilities of the Flair NLP framework
  • Find out how to use Flair's state-of-the-art pre-built models
  • Build custom sequence labeling models, embeddings, and classifiers
  • Learn about a novel text classification technique called TARS
  • Discover how to build applications with Flair and how to deploy them to production

Who this book is for
This Flair NLP book is for anyone who wants to learn about NLP through one of the most beginner-friendly, yet powerful Python NLP libraries out there. Software engineering students, developers, data scientists, and anyone who is transitioning into NLP and is interested in learning about practical approaches to solving problems with Flair will find this book useful. The book, however, is not recommended for readers aiming to get an in-depth theoretical understanding of the mathematics behind NLP. Beginner-level knowledge of Python programming is required to get the most out of this book.


Information

Year: 2022
ISBN: 9781801072236
Edition: 1
Topic: Computer Science

Part 1: Understanding and Solving NLP with Flair

In this part, you will learn the basics of NLP and get an overview of the Flair framework. You will set up your environment, install Flair, and explore its basic features. You will learn how to extract knowledge from embeddings and use pre-trained sequence labeling models in Flair. 
This part comprises the following chapters:
  • Chapter 1, Introduction to Flair
  • Chapter 2, Flair Base Types
  • Chapter 3, Embeddings in Flair
  • Chapter 4, Sequence Tagging

Chapter 1: Introduction to Flair

There are few Natural Language Processing (NLP) frameworks out there as easy to learn and as easy to work with as Flair. Packed with pre-trained models, excellent documentation, and readable syntax, it provides a gentle learning curve for NLP researchers who are not necessarily skilled in coding; software engineers with poor theoretical foundations; students and graduates; as well as individuals with no prior knowledge simply interested in the topic. But before diving straight into coding, some background about the motivation behind Flair, the basic NLP concepts, and the different approaches to how you can set up your local environment may help you on your journey toward becoming a Flair NLP expert.
In Flair's official GitHub README, the framework is described as:
"A very simple framework for state-of-the-art Natural Language Processing"
This description will raise a few eyebrows. NLP researchers will immediately be interested in knowing what specific tasks the framework achieves its state-of-the-art results in. Engineers will be intrigued by the very simple label, but will wonder what steps are required to get up and running and what environments it can be used in. And those who are not knowledgeable in NLP will wonder whether they will be able to grasp the knowledge required to understand the problems Flair is trying to solve.
In this chapter, we will be answering all of these questions by covering the basic NLP concepts and terminology, providing an overview of Flair, and setting up our development environment with the help of the following sections:
  • A brief introduction to NLP
  • What is Flair?
  • Getting ready

Technical requirements

To get started, you will need a development environment with Python 3.6+. Platform-specific instructions for installing Python can be found at https://docs.python-guide.org/starting/installation/.
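If you are unsure whether your interpreter satisfies the Python 3.6+ requirement, a quick check along the following lines may help. This is only a minimal sketch: the helper name `meets_minimum` and the constant `MINIMUM_VERSION` are made up for illustration and are not part of Flair.

```python
import sys
from typing import Tuple

# Minimum version taken from the technical requirements (Python 3.6+).
MINIMUM_VERSION = (3, 6)


def meets_minimum(version: Tuple[int, ...],
                  minimum: Tuple[int, ...] = MINIMUM_VERSION) -> bool:
    """Compare the (major, minor) part of a version against the minimum."""
    return tuple(version[:2]) >= minimum


current = sys.version_info
if meets_minimum(current):
    print(f"Python {current.major}.{current.minor} satisfies the requirement.")
else:
    print(f"Python {current.major}.{current.minor} is too old; 3.6+ is needed.")
```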
You will not require a GPU-equipped development machine, though having one will significantly speed up some of the training-related exercises described later in the book.
You will require access to a command line. On Linux and macOS, simply start the Terminal application. On Windows, press Windows + R to open the Run box, type cmd and then click OK.
Flair's official GitHub repository is available via the following link: https://github.com/flairNLP/flair. In this chapter we will install Flair version 0.11.
The code examples covered in this chapter are found in this book's official GitHub repository in the following Jupyter notebook: https://github.com/PacktPublishing/Natural-Language-Processing-with-Flair/tree/main/Chapter01.

A brief introduction to NLP

Before diving straight into what Flair is capable of and how to leverage its features, we will be going through a brief introduction to NLP to provide some context for readers who are not familiar with all the NLP techniques and tasks solved by Flair. NLP is a branch of artificial intelligence, linguistics, and software engineering that helps machines understand human language. When we humans read a sentence, our brains immediately make sense of many seemingly trivial problems such as the following:
  • Is the sentence written in a language I understand?
  • How can the sentence be split into words?
  • What is the relationship between the words?
  • What are the meanings of the individual words?
  • Is this a question or an answer?
  • Which part-of-speech categories are the words assigned to?
  • What is the abstract meaning of the sentence?
The human brain is excellent at solving these problems conjointly and often seamlessly, leaving us unaware that we made sense of all of these things simply by reading a sentence.
Even now, machines are still not as good as humans at solving all these problems at once. Therefore, to teach machines to understand human language, we have to split understanding of natural language into a set of smaller, machine-intelligible tasks that allow us to get answers to these questions one by one.
In this section, you will find a list of some important NLP tasks with emphasis on the tasks supported by Flair.

Tokenization

Tokenization is the process of breaking down a sentence or a document into meaningful units called tokens. A token can be a paragraph, a sentence, a collocation, or just a word.
For example, a word tokenizer would split the sentence "Learning to use Flair" into the list of tokens ["Learning", "to", "use", "Flair"].
Tokenization has to adhere to language-specific rules and is rarely a trivial task. For example, in unspaced languages, where word boundaries aren't marked with spaces, it is very difficult to determine where one word ends and the next one starts. Well-defined token boundaries are a prerequisite for most NLP tasks that process words, collocations, or sentences, including the tasks explained in the rest of this chapter.
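To make the idea concrete, here is a deliberately naive word tokenizer written with Python's standard `re` module. It is a sketch for illustration only, not Flair's own tokenizer, and the function name `simple_word_tokenize` is hypothetical. It assumes words are separated by spaces or punctuation, so it would not cope with the unspaced languages mentioned above.

```python
import re
from typing import List


def simple_word_tokenize(text: str) -> List[str]:
    """Split text into word tokens and single punctuation marks.

    Assumption: words are delimited by whitespace or punctuation.
    """
    # \w+ matches a run of word characters; [^\w\s] matches one
    # character that is neither a word character nor whitespace.
    return re.findall(r"\w+|[^\w\s]", text)


tokens = simple_word_tokenize("Learning to use Flair")
print(tokens)  # ['Learning', 'to', 'use', 'Flair']
```

Even this small example hints at why tokenization is language-specific: the regular expression encodes an assumption about how words are delimited, and that assumption does not hold for every language.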

Text vectorization

Text vectorization is the process of transforming words, sentences, or documents from their written form into a numerical representation that machines can work with.
One of the simplest forms of text vectorization is one-hot encoding. It maps words to binary vectors of length equal to the number of words in the dictionary. All elements of the vector are 0 apart from the element that represents the word, which is set to 1 – hence the name one-hot.
For example, take the following dictionary:
  • Cat
  • Dog
  • Goat
The word cat would be the first word in our dictionary and its one-hot encoding would be [1, 0...
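As a rough illustration of one-hot encoding with the three-word dictionary above, the following sketch builds the binary vector for a given word in plain Python. The function name `one_hot` and the lowercase vocabulary list are assumptions made for this example; this is not how Flair represents text internally.

```python
from typing import List


def one_hot(word: str, vocabulary: List[str]) -> List[int]:
    """Return a binary vector with a single 1 at the word's vocabulary index."""
    if word not in vocabulary:
        raise ValueError(f"{word!r} is not in the vocabulary")
    index = vocabulary.index(word)
    # The vector has one element per dictionary word; only the element
    # at the word's own position is set to 1.
    return [1 if i == index else 0 for i in range(len(vocabulary))]


vocabulary = ["cat", "dog", "goat"]
print(one_hot("dog", vocabulary))   # [0, 1, 0]
print(one_hot("goat", vocabulary))  # [0, 0, 1]
```

Note that the vector length grows with the size of the dictionary, so with a realistic vocabulary each word becomes a very long vector that is almost entirely zeros.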
