Mastering Transformers
Savaş Yıldırım, Meysam Asgari-Chenaghlu
- 374 pages
- English
About This Book
Take a problem-solving approach to learning all about transformers and get up and running in no time by implementing methodologies that will build the future of NLP
Key Features
- Explore quick prototyping with up-to-date Python libraries to create effective solutions to industrial problems
- Solve advanced NLP problems such as named-entity recognition, information extraction, language generation, and conversational AI
- Monitor your model's performance with the help of BertViz, exBERT, and TensorBoard
Book Description
Transformer-based language models have dominated natural language processing (NLP) studies and have now become a new paradigm. With this book, you'll learn how to build various transformer-based NLP applications using the Python Transformers library.

The book gives you an introduction to Transformers by showing you how to write your first hello-world program. You'll then learn how a tokenizer works and how to train your own tokenizer. As you advance, you'll explore the architecture of autoencoding models, such as BERT, and autoregressive models, such as GPT. You'll see how to train and fine-tune models for a variety of natural language understanding (NLU) and natural language generation (NLG) problems, including text classification, token classification, and text representation. This book also helps you to learn efficient models for challenging problems, such as long-context NLP tasks with limited computational capacity. You'll also work with multilingual and cross-lingual problems, optimize models by monitoring their performance, and discover how to deconstruct these models for interpretability and explainability. Finally, you'll be able to deploy your transformer models in a production environment.

By the end of this NLP book, you'll have learned how to use Transformers to solve advanced NLP problems with advanced models.
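For a flavor of the hello-world style the book opens with, here is a minimal sketch using the Transformers pipeline API; the task and input text are illustrative assumptions, not the book's exact code:

```python
# Minimal "hello world" with the Hugging Face Transformers library.
# The pipeline API downloads a default pre-trained model for the task.
from transformers import pipeline

# Sentiment analysis is one of the simplest ready-made pipelines.
classifier = pipeline("sentiment-analysis")

result = classifier("Transformers have become a new paradigm in NLP.")
print(result)  # e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```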
What you will learn
- Explore state-of-the-art NLP solutions with the Transformers library
- Train a language model in any language with any transformer architecture
- Fine-tune a pre-trained language model to perform several downstream tasks
- Select the right framework for the training, evaluation, and production of an end-to-end solution
- Get hands-on experience in using TensorBoard and Weights & Biases
- Visualize the internal representation of transformer models for interpretability
Who this book is for
This book is for deep learning researchers, hands-on NLP practitioners, as well as ML/NLP educators and students who want to start their journey with Transformers. Beginner-level machine learning knowledge and a good command of Python will help you get the best out of this book.
Section 1: Introduction – Recent Developments in the Field, Installations, and Hello World Applications
- Chapter 1, From Bag-of-Words to the Transformer
- Chapter 2, A Hands-On Introduction to the Subject
Chapter 1: From Bag-of-Words to the Transformer
- Evolution of NLP toward Transformers
- Understanding distributional semantics
- Leveraging DL
- Overview of the Transformer architecture
- Using TL with Transformers
Technical requirements
- sklearn
- nltk==3.5.0
- gensim==3.8.3
- fasttext
- keras>=2.3.0
- transformers>=4.0.0
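One way to set these up, as a minimal sketch assuming a pip-based Python environment (note that sklearn is distributed on PyPI as scikit-learn):

```bash
pip install scikit-learn nltk==3.5.0 gensim==3.8.3 fasttext "keras>=2.3.0" "transformers>=4.0.0"
```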
Evolution of NLP toward Transformers
- Contextual word embeddings
- Better subword tokenization algorithms for handling unseen words or rare words
- Injecting additional memory tokens into sentences, such as Paragraph ID in Doc2vec or a Classification (CLS) token in Bidirectional Encoder Representations from Transformers (BERT)
- Attention mechanisms, which overcome the problem of forcing input sentences to encode all information into one context vector
- Multi-head self-attention
- Positional encoding to capture word order (attention and positional encoding are sketched in code after these lists)
- Parallelizable architectures that make for faster training and fine-tuning
- Model compression (distillation, quantization, and so on)
- Transfer learning (TL), including cross-lingual and multitask learning
- RNNs
- CNNs
- FFNNs
- Several variants of RNNs, CNNs, and FFNNs
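To make the attention and positional-encoding items above concrete, here is a minimal NumPy sketch of sinusoidal positional encoding and single-head scaled dot-product self-attention; the dimensions and random weights are illustrative assumptions rather than code from the book:

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    # Sinusoidal encoding from "Attention Is All You Need":
    # even dimensions use sine, odd dimensions use cosine.
    positions = np.arange(seq_len)[:, None]              # (seq_len, 1)
    dims = np.arange(d_model)[None, :]                   # (1, d_model)
    angles = positions / np.power(10000, (2 * (dims // 2)) / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles[:, 0::2])
    pe[:, 1::2] = np.cos(angles[:, 1::2])
    return pe

def scaled_dot_product_attention(Q, K, V):
    # Each output position is a weighted mix of all value vectors,
    # so no single context vector has to summarize the whole input.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)       # softmax over keys
    return weights @ V

seq_len, d_model = 5, 16                                 # illustrative sizes
x = np.random.randn(seq_len, d_model)                    # stand-in embeddings
x = x + positional_encoding(seq_len, d_model)            # inject word order

# Self-attention: queries, keys, and values all come from the same input.
Wq, Wk, Wv = (np.random.randn(d_model, d_model) for _ in range(3))
out = scaled_dot_product_attention(x @ Wq, x @ Wk, x @ Wv)
print(out.shape)  # (5, 16)
```

Running several such heads in parallel with different learned projections, then concatenating their outputs, gives the multi-head self-attention listed above.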