eBook - ePub

Mastering Transformers

Name: Mastering Transformers
Author: Savaş Yıldırım, Meysam Asgari-Chenaghlu

374 pages
English

About this book

Take a problem-solving approach to learning all about transformers and get up and running in no time by implementing methodologies that will build the future of NLP

Key Features

Explore quick prototyping with up-to-date Python libraries to create effective solutions to industrial problems
Solve advanced NLP problems such as named-entity recognition, information extraction, language generation, and conversational AI
Monitor your model's performance with the help of BertViz, exBERT, and TensorBoard

Book Description

Transformer-based language models have dominated natural language processing (NLP) studies and have now become a new paradigm. With this book, you'll learn how to build various transformer-based NLP applications using the Python Transformers library.

The book gives you an introduction to Transformers by showing you how to write your first hello-world program. You'll then learn how a tokenizer works and how to train your own tokenizer. As you advance, you'll explore the architecture of autoencoding models, such as BERT, and autoregressive models, such as GPT. You'll see how to train and fine-tune models for a variety of natural language understanding (NLU) and natural language generation (NLG) problems, including text classification, token classification, and text representation. This book also helps you to learn efficient models for challenging problems, such as long-context NLP tasks with limited computational capacity. You'll also work with multilingual and cross-lingual problems, optimize models by monitoring their performance, and discover how to deconstruct these models for interpretability and explainability. Finally, you'll be able to deploy your transformer models in a production environment.

By the end of this NLP book, you'll have learned how to use Transformers to solve advanced NLP problems using advanced models.

What you will learn

Explore state-of-the-art NLP solutions with the Transformers library
Train a language model in any language with any transformer architecture
Fine-tune a pre-trained language model to perform several downstream tasks
Select the right framework for the training, evaluation, and production of an end-to-end solution
Get hands-on experience in using TensorBoard and Weights & Biases
Visualize the internal representation of transformer models for interpretability

Who this book is for

This book is for deep learning researchers, hands-on NLP practitioners, as well as ML/NLP educators and students who want to start their journey with Transformers. Beginner-level machine learning knowledge and a good command of Python will help you get the best out of this book.

Information

Publisher: Packt Publishing
Year: 2021
ISBN: 9781801078894
Subject: Computer Science
Sub-subject: Natural Language Processing

Section 1: Introduction – Recent Developments in the Field, Installations, and Hello World Applications

In this section, you will learn about Transformers at an introductory level. You will write your first hello-world program with Transformers by loading community-provided pre-trained language models and running the related code with or without a GPU. Installing and using conda and the tensorflow, pytorch, transformers, and sentenceTransformers libraries will also be explained in detail in this section.

This section comprises the following chapters:

Chapter 1, From Bag-of-Words to the Transformer
Chapter 2, A Hands-On Introduction to the Subject

Chapter 1: From Bag-of-Words to the Transformer

In this chapter, we will discuss what has changed in Natural Language Processing (NLP) over the last two decades. We have experienced different paradigms and have finally entered the era of Transformer architectures. Each paradigm has helped us gain better representations of words and documents for problem-solving. Distributional semantics describes the meaning of a word or a document with a vector representation, based on distributional evidence in a collection of articles. Vectors are used to solve many problems in both supervised and unsupervised pipelines. For language-generation problems, n-gram language models have been the traditional approach for years. However, these traditional approaches have many weaknesses that we will discuss throughout the chapter.

We will further discuss classical Deep Learning (DL) architectures such as Recurrent Neural Networks (RNNs), Feed-Forward Neural Networks (FFNNs), and Convolutional Neural Networks (CNNs). These have improved performance on problems in the field and have overcome limitations of traditional approaches. However, these models have their own problems too. Recently, Transformer models have gained immense interest because of their effectiveness across NLP tasks, from text classification to text generation. Their main success, however, has been in effectively improving performance on multilingual and multi-task NLP problems, as well as on monolingual and single-task ones. These contributions have made Transfer Learning (TL), which aims to make models reusable for different tasks or different languages, more feasible in NLP.

Starting with the attention mechanism, we will briefly discuss the Transformer architecture and how it differs from previous NLP models. Alongside the theoretical discussion, we will show practical examples using popular NLP frameworks. For the sake of simplicity, we will keep the introductory code examples as short as possible.

In this chapter, we will cover the following topics:

Evolution of NLP toward Transformers
Understanding distributional semantics
Leveraging DL
Overview of the Transformer architecture
Using TL with Transformers

Technical requirements

We will use Jupyter Notebook to run the coding exercises, which require Python >= 3.6.0 and the following packages, each installable with the pip install command:

scikit-learn (imported as sklearn)
nltk==3.5.0
gensim==3.8.3
fasttext
keras>=2.3.0
transformers>=4.0.0
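Before opening the notebooks, it can help to confirm which of these packages are already present in the active environment. The following is a minimal sketch rather than part of the book's code: the helper name installed_versions is hypothetical, it assumes Python 3.8 or later (where importlib.metadata is in the standard library), and it assumes the PyPI distribution names shown in the list, with scikit-learn being the distribution that provides the sklearn import.

```python
from importlib import metadata

# PyPI distribution names (an assumption of this sketch);
# scikit-learn is the distribution that provides the `sklearn` import.
REQUIRED = ["scikit-learn", "nltk", "gensim", "fasttext", "keras", "transformers"]


def installed_versions(packages):
    """Return {package: version string, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Print one line per package so missing ones are easy to spot.
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name:15s} {version or 'NOT INSTALLED'}")
```

Any package reported as missing can then be installed with pip install before running the exercises.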

All notebooks with coding exercises are available at the following GitHub link: https://github.com/PacktPublishing/Advanced-Natural-Language-Processing-with-Transformers/tree/main/CH01.

Check out the following link to see the Code in Action video: https://bit.ly/2UFPuVd

Evolution of NLP toward Transformers

We have seen profound changes in NLP over the last 20 years. During this period, we experienced different paradigms and finally entered a new era dominated mostly by the Transformer architecture. This architecture did not come out of nowhere. Building on various neural NLP approaches, it gradually evolved into an attention-based encoder-decoder architecture and is still evolving. The architecture and its variants have been successful thanks to the following developments in the last decade:

Contextual word embeddings
Better subword tokenization algorithms for handling unseen words or rare words
Injecting additional memory tokens into sentences, such as Paragraph ID in Doc2vec or a Classification (CLS) token in Bidirectional Encoder Representations from Transformers (BERT)
Attention mechanisms, which overcome the problem of forcing input sentences to encode all information into one context vector
Multi-head self-attention
Positional encoding to capture word order
Parallelizable architectures that make for faster training and fine-tuning
Model compression (distillation, quantization, and so on)
TL (cross-lingual, multitask learning)
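To make the attention item in the list above more concrete, here is a minimal sketch of attention for a single query, written in plain Python. It assumes the scaled dot-product formulation (the text so far does not commit to a specific scoring function), and the names softmax, attend, and encoder_states, as well as all the numbers, are made up for illustration. The point it shows is that each query gets its own weighted mix of all input positions, rather than every decoding step sharing one fixed context vector.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector.

    query:  list of d floats
    keys:   one list of d floats per input position
    values: one list of floats per input position
    Returns (context_vector, attention_weights).
    """
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return context, weights


# Toy encoder states for a three-token input (made-up numbers).
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Two different queries produce two different context vectors.
context_a, weights_a = attend([4.0, 0.0], encoder_states, encoder_states)
context_b, weights_b = attend([0.0, 4.0], encoder_states, encoder_states)
print(weights_a)
print(weights_b)
```

The first query puts most of its weight on the first and third positions, while the second favors the second and third, so the two context vectors differ even though the encoder states are the same.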

For many years, we used traditional NLP approaches such as n-gram language models, TF-IDF-based information retrieval models, and one-hot encoded document-term matrices. All these approaches have contributed a lot to solving many NLP problems, such as sequence classification, language generation, and language understanding. On the other hand, these traditional NLP methods have their own weaknesses: they fall short in handling sparsity, representing unseen words, tracking long-term dependencies, and more. To cope with these weaknesses, we developed DL-based approaches such as the following:

RNNs
CNNs
FFNNs
Several variants of RNNs, CNNs, and FFNNs
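The sparsity and unseen-word weaknesses of the traditional representations mentioned above are easy to reproduce on a toy corpus. The following sketch builds a plain count-based document-term matrix by hand; the three documents, the helper count_vector, and the whitespace tokenization are all illustrative assumptions, not the book's code.

```python
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "transformers changed language modeling",
]

# Vocabulary: every distinct word in the toy corpus, in sorted order.
vocab = sorted({word for doc in docs for word in doc.split()})
index = {word: i for i, word in enumerate(vocab)}


def count_vector(text):
    """Bag-of-words count vector; words outside the vocabulary are dropped."""
    vector = [0] * len(vocab)
    for word in text.split():
        if word in index:
            vector[index[word]] += 1
    return vector


matrix = [count_vector(doc) for doc in docs]

# Sparsity: the share of cells in the matrix that are zero.
zeros = sum(cell == 0 for row in matrix for cell in row)
sparsity = zeros / (len(matrix) * len(vocab))
print(f"vocabulary size: {len(vocab)}, sparsity: {sparsity:.2f}")

# Unseen words: a text made only of new words maps to an all-zero vector.
print(count_vector("kittens purr"))
```

Even with three short documents, most cells are zero, and a text whose words never appeared in the corpus carries no information at all. Both effects grow with vocabulary size, which is part of what dense, learned representations set out to fix.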

In 2013, Word2vec, a two-layer FFNN word-encoder model, addressed the dimensionality problem by producing short, dense representations of words, called word embeddings. This early model produced static word embeddings quickly and efficiently. It turned unlabeled text into supervised training data (self-supervised learning) by sliding a window over the text and either predicting the target word from its context or predicting the neighboring words from the target word. GloVe, another widely used model, argued that count-based models can be better than neural models. It leverages both global and local statistics of a corpus to learn embeddings based on word-word co-occurrence statistics. It performed well on some syntactic and semantic tasks, as shown in the following screensho...
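The self-supervised step described for Word2vec, turning raw text into labeled examples with a sliding window, can be sketched in a few lines. This is an illustration under stated assumptions, not Word2vec's actual implementation: the function training_pairs is hypothetical, tokenization is a plain whitespace split, and the two example sets correspond to what the Word2vec literature calls CBOW (context predicts target) and skip-gram (target predicts each neighbor).

```python
def training_pairs(tokens, window=2):
    """Yield (context_words, target_word) for every position in the sentence."""
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        yield left + right, target


tokens = "the quick brown fox jumps".split()

# CBOW-style examples: predict the target word from its context words.
cbow = list(training_pairs(tokens, window=1))

# Skip-gram-style examples: predict each neighbor from the target word.
skipgram = [(target, neighbor) for context, target in cbow for neighbor in context]

print(cbow[:2])
print(skipgram[:3])
```

No human labeling is involved: the labels come from the text itself, which is why the approach scales to very large corpora.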