Mastering Transformers
eBook - ePub

Savaş Yıldırım, Meysam Asgari-Chenaghlu

  1. 374 pages
  2. English

About the book

Take a problem-solving approach to learning all about transformers and get up and running in no time by implementing methodologies that will build the future of NLP

Key Features

  • Explore quick prototyping with up-to-date Python libraries to create effective solutions to industrial problems
  • Solve advanced NLP problems such as named-entity recognition, information extraction, language generation, and conversational AI
  • Monitor your model's performance with the help of BertViz, exBERT, and TensorBoard

Book Description

Transformer-based language models have dominated natural language processing (NLP) studies and have now become a new paradigm. With this book, you'll learn how to build various transformer-based NLP applications using the Python Transformers library.
The book gives you an introduction to Transformers by showing you how to write your first hello-world program. You'll then learn how a tokenizer works and how to train your own tokenizer. As you advance, you'll explore the architecture of autoencoding models, such as BERT, and autoregressive models, such as GPT. You'll see how to train and fine-tune models for a variety of natural language understanding (NLU) and natural language generation (NLG) problems, including text classification, token classification, and text representation. This book also helps you to learn efficient models for challenging problems, such as long-context NLP tasks with limited computational capacity. You'll also work with multilingual and cross-lingual problems, optimize models by monitoring their performance, and discover how to deconstruct these models for interpretability and explainability. Finally, you'll be able to deploy your transformer models in a production environment.
By the end of this NLP book, you'll have learned how to use Transformers to solve advanced NLP problems using advanced models.

What you will learn

  • Explore state-of-the-art NLP solutions with the Transformers library
  • Train a language model in any language with any transformer architecture
  • Fine-tune a pre-trained language model to perform several downstream tasks
  • Select the right framework for the training, evaluation, and production of an end-to-end solution
  • Get hands-on experience in using TensorBoard and Weights & Biases
  • Visualize the internal representation of transformer models for interpretability

Who this book is for

This book is for deep learning researchers, hands-on NLP practitioners, as well as ML/NLP educators and students who want to start their journey with Transformers. Beginner-level machine learning knowledge and a good command of Python will help you get the best out of this book.

Information

Year
2021
ISBN
9781801078894

Section 1: Introduction – Recent Developments in the Field, Installations, and Hello World Applications

In this section, you will learn about Transformers at an introductory level. You will write your first hello-world program with Transformers by loading community-provided pre-trained language models and running the related code with or without a GPU. This section also explains in detail how to install and use the tensorflow, pytorch, transformers, and sentenceTransformers libraries, along with conda.
This section comprises the following chapters:
  • Chapter 1, From Bag-of-Words to the Transformer
  • Chapter 2, A Hands-On Introduction to the Subject

Chapter 1: From Bag-of-Words to the Transformer

In this chapter, we will discuss what has changed in Natural Language Processing (NLP) over the last two decades. We have experienced different paradigms and have finally entered the era of Transformer architectures. Each paradigm has helped us gain a better representation of words and documents for problem-solving. Distributional semantics describes the meaning of a word or a document with a vector representation, based on distributional evidence in a collection of articles. Vectors are used to solve many problems in both supervised and unsupervised pipelines. For language-generation problems, n-gram language models have been the traditional approach for years. However, these traditional approaches have many weaknesses that we will discuss throughout the chapter.
We will further discuss classical Deep Learning (DL) architectures such as Recurrent Neural Networks (RNNs), Feed-Forward Neural Networks (FFNNs), and Convolutional Neural Networks (CNNs). These have improved performance on problems in the field and have overcome the limitations of traditional approaches. However, these models have had their own problems too. Recently, Transformer models have gained immense interest because of their effectiveness across NLP tasks, from text classification to text generation. Their main success, however, has been that they effectively improve performance on multilingual and multi-task NLP problems, as well as on monolingual and single-task problems. These contributions have made Transfer Learning (TL), which aims to make models reusable for different tasks or different languages, more feasible in NLP.
Starting with the attention mechanism, we will briefly discuss the Transformer architecture and how it differs from previous NLP models. In parallel with the theoretical discussion, we will show practical examples with popular NLP frameworks. For the sake of simplicity, we will choose introductory code examples that are as short as possible.
In this chapter, we will cover the following topics:
  • Evolution of NLP toward Transformers
  • Understanding distributional semantics
  • Leveraging DL
  • Overview of the Transformer architecture
  • Using TL with Transformers

Technical requirements

We will be using Jupyter Notebook to run our coding exercises, which require Python >= 3.6.0 along with the following packages, installed with the pip install command:
  • scikit-learn (imported as sklearn)
  • nltk==3.5.0
  • gensim==3.8.3
  • fasttext
  • keras>=2.3.0
  • transformers>=4.0.0
All notebooks with coding exercises are available at the following GitHub link: https://github.com/PacktPublishing/Advanced-Natural-Language-Processing-with-Transformers/tree/main/CH01.
Check out the following link to see the Code in Action video: https://bit.ly/2UFPuVd

Evolution of NLP toward Transformers

We have seen profound changes in NLP over the last 20 years. During this period, we experienced different paradigms and finally entered a new era dominated mostly by the Transformer architecture. This architecture did not come out of nowhere. Building on various neural NLP approaches, it gradually evolved into an attention-based encoder-decoder architecture, and it is still evolving. The architecture and its variants have been successful thanks to the following developments in the last decade:
  • Contextual word embeddings
  • Better subword tokenization algorithms for handling unseen words or rare words
  • Injecting additional memory tokens into sentences, such as Paragraph ID in Doc2vec or a Classification (CLS) token in Bidirectional Encoder Representations from Transformers (BERT)
  • Attention mechanisms, which overcome the problem of forcing input sentences to encode all information into one context vector
  • Multi-head self-attention
  • Positional encoding to capture word order
  • Parallelizable architectures that make for faster training and fine-tuning
  • Model compression (distillation, quantization, and so on)
  • TL (cross-lingual, multitask learning)
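To make the attention item in the list above more concrete, here is a minimal sketch of single-head scaled dot-product self-attention in plain Python. It is an illustration under simplifying assumptions, not the chapter's own code: the three tokens and their 2-dimensional embeddings are hypothetical, and the learned query, key, and value projections of a real Transformer are omitted, so the same vectors serve in all three roles. The point it shows is that every token receives its own context vector, computed as a weighted sum over all tokens, rather than the whole sentence being squeezed into one vector.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Return one context vector and one weight row per query."""
    d_k = len(keys[0])
    contexts, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Context vector: attention-weighted sum of the value vectors.
        context = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        contexts.append(context)
        all_weights.append(weights)
    return contexts, all_weights


# Hypothetical toy sentence with made-up 2-dimensional embeddings.
tokens = ["the", "cat", "sat"]
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Self-attention: queries, keys, and values all come from the same tokens.
contexts, weights = scaled_dot_product_attention(embeddings, embeddings, embeddings)

for token, row in zip(tokens, weights):
    print(token, [round(w, 3) for w in row])
```

Multi-head self-attention, also listed above, would run several such computations in parallel on differently projected copies of the inputs and concatenate the results.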
For many years, we used traditional NLP approaches such as n-gram language models, TF-IDF-based information retrieval models, and one-hot encoded document-term matrices. All these approaches have contributed a lot to solving many NLP problems such as sequence classification, language generation, and language understanding. On the other hand, these traditional NLP methods have their own weaknesses: for instance, they fall short in handling sparsity, representing unseen words, and tracking long-term dependencies. To cope with these weaknesses, we developed DL-based approaches such as the following:
  • RNNs
  • CNNs
  • FFNNs
  • Several variants of RNNs, CNNs, and FFNNs
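The sparsity and unseen-word weaknesses of the traditional approaches mentioned above can be seen in a small sketch. The code below builds a document-term count matrix and a basic TF-IDF weighting in plain Python. The three documents are made up for illustration, and the formula used (term count multiplied by log(N / document frequency)) is the plain textbook variant, which is an assumption here; libraries typically apply smoothed or normalized versions.

```python
import math

# Hypothetical toy corpus, chosen only for illustration.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "transformers changed natural language processing",
]
tokenized = [d.split() for d in docs]
vocab = sorted({word for doc in tokenized for word in doc})


def count_vector(tokens, vocab):
    """Bag-of-words counts; words outside the vocabulary are dropped."""
    return [tokens.count(term) for term in vocab]


# Document-term matrix: one row per document, one column per vocabulary term.
doc_term = [count_vector(doc, vocab) for doc in tokenized]

# Sparsity: the share of cells that are zero.
zeros = sum(row.count(0) for row in doc_term)
sparsity = zeros / (len(doc_term) * len(vocab))


def idf(term):
    """Inverse document frequency: rarer terms get larger weights."""
    df = sum(1 for doc in tokenized if term in doc)
    return math.log(len(tokenized) / df)


tfidf = [[tf * idf(term) for tf, term in zip(row, vocab)] for row in doc_term]

# An unseen word ("kitten") has no column, so it leaves no trace in the vector.
unseen = count_vector("the kitten sat".split(), vocab)

print("vocabulary size:", len(vocab))
print("sparsity:", round(sparsity, 3))
print("vector for 'the kitten sat':", unseen)
```

Even on this tiny corpus, more than half of the matrix cells are zero, and the share grows quickly as the vocabulary grows. The dense word embeddings discussed next were one response to this problem.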
In 2013, Word2vec, a two-layer FFNN word-encoder model, addressed the dimensionality problem by producing short, dense representations of words, called word embeddings. This early model managed to produce static word embeddings quickly and efficiently. It transformed unsupervised textual data into supervised data (self-supervised learning) by either predicting the target word from its context or predicting neighboring words within a sliding window. GloVe, another widely used model, argued that count-based models can be better than neural models. It leverages both global and local statistics of a corpus to learn embeddings based on word-word co-occurrence statistics. It performed well on some syntactic and semantic tasks, as shown in the following screensho...
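The self-supervised setup described above for Word2vec can be sketched without any neural network: a sliding window over raw text is enough to produce labeled training examples. The sketch below generates both kinds of examples, predicting neighbors from a center word (commonly called skip-gram) and predicting a target word from its context (commonly called CBOW). The function names, the example sentence, and the window size are hypothetical choices for illustration, and training the actual embedding model is out of scope here.

```python
def skipgram_pairs(tokens, window=2):
    """(center, neighbor) pairs: predict each neighbor from the center word."""
    pairs = []
    for i, center in enumerate(tokens):
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


def cbow_examples(tokens, window=2):
    """(context, target) examples: predict the target from surrounding words."""
    examples = []
    for i, target in enumerate(tokens):
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        context = [tokens[j] for j in range(start, end) if j != i]
        examples.append((context, target))
    return examples


# Hypothetical example sentence; no manual labels are needed.
sentence = "the quick brown fox jumps".split()

sg = skipgram_pairs(sentence, window=1)
cbow = cbow_examples(sentence, window=1)

for center, neighbor in sg:
    print("skip-gram:", center, "->", neighbor)
for context, target in cbow:
    print("CBOW:", context, "->", target)
```

The labels come from the text itself, which is what makes the approach self-supervised: any unlabeled corpus can be turned into training data this way.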
