eBook - ePub

Generative AI with Python and TensorFlow 2

Name: Generative AI with Python and TensorFlow 2
Author: Joseph Babcock, Raghav Bali

Harness the power of generative models to create images, text, and music

Joseph Babcock, Raghav Bali

488 pages
English
ePUB (mobile friendly)

About This Book

Fun and exciting projects to learn what artificial minds can create

Key Features

Code examples are in TensorFlow 2, which makes it easy for PyTorch users to follow along
Look inside the most famous deep generative models, from GPT to MuseGAN
Learn to build and adapt your own models in TensorFlow 2.x
Explore exciting, cutting-edge use cases for deep generative AI

Book Description

Machines are excelling at creative human skills such as painting, writing, and composing music. Could you be more creative than generative AI?

In this book, you'll explore the evolution of generative models, from restricted Boltzmann machines and deep belief networks to VAEs and GANs. You'll learn how to implement models yourself in TensorFlow and get to grips with the latest research on deep neural networks.

There's been an explosion in potential use cases for generative models. You'll look at OpenAI's news generator, deepfakes, and training deep learning agents to navigate a simulated environment.

Recreate the code that's under the hood and uncover surprising links between text, image, and music generation.

What you will learn

Export the code from GitHub into Google Colab to see how everything works for yourself
Compose music using LSTM models, simple GANs, and MuseGAN
Create deepfakes using facial landmarks, autoencoders, and pix2pix GAN
Learn how attention and transformers have changed NLP
Build several text generation pipelines based on LSTMs, BERT, and GPT-2
Implement paired and unpaired style transfer with networks like StyleGAN
Discover emerging applications of generative AI like folding proteins and creating videos from images

Who this book is for

This is a book for Python programmers who are keen to create and have some fun using generative models. To make the most out of this book, you should have a basic familiarity with math and statistics for machine learning.

Information

Publisher

Packt Publishing

Year

2021

ISBN

9781800208506

Topic

Computer Science

Subtopic

Artificial Intelligence (AI) & Semantics

Index

Computer Science

8 Deepfakes with GANs

Manipulating videos and photographs has been common practice for a long time. If you have seen movies like Forrest Gump or Fast and Furious 7, chances are you did not even notice that the scenes with John F. Kennedy or Paul Walker, respectively, were fabricated and edited into the films as required.

You may recall one particular scene from the movie Forrest Gump, where Gump meets John F. Kennedy. The scene was created using complex visual effects and archival footage to ensure high-quality results. Hollywood studios, spy agencies from across the world, and media outlets have been making use of editing tools such as Photoshop, After Effects, and complex custom visual effects/CGI (computer-generated imagery) pipelines to come up with such compelling results. While the results have been more or less believable in most instances, it takes a huge amount of manual effort and time to edit every detail, such as scene lighting, face, eye, and lip movements, and shadows, for every frame of the scene.

Along the same lines, there is a high chance you have come across a BuzzFeed video¹ where former US president Barack Obama says "Killmonger was right" (Killmonger is one of the Marvel Cinematic Universe's villains). While obviously fake, the video does seem real in terms of its visual and audio aspects. There are a number of other examples where prominent personalities can be seen making comments they would not usually make.

Setting ethics aside, there is one major difference between Gump meeting John F. Kennedy and Barack Obama talking about Killmonger. As mentioned earlier, the former is the result of painstaking manual work done using complex visual effects/CGI. The latter, on the other hand, is the result of a technology called deepfakes. A portmanteau of the words deep learning and fake, deepfake is a broad term used to describe the AI-enabled technology that generates examples like the ones we just discussed.

In this chapter, we will cover different concepts, architectures, and components associated with deepfakes. We will focus on the following topics:

Overview of the deepfakes technological landscape
The different forms of deepfaking: replacement, re-enactment, and editing
Key features leveraged by different architectures
A high-level deepfakes workflow
Swapping faces using autoencoders
Re-enacting Obama's face movements using pix2pix
Challenges and ethical issues
A brief discussion of off-the-shelf implementations

We will cover the internal workings of different GAN architectures and the key contributions that have enabled deepfakes. We will also build and train these architectures from scratch to get a better understanding of them. Deepfakes are not limited to videos or photographs, but are also used to generate fake text (news articles, books) and even audio (voice clips, phone calls). In this chapter, we will focus only on videos and images, and the term deepfakes refers to these use cases unless stated otherwise.

All code snippets presented in this chapter can be run directly in Google Colab. For reasons of space, import statements for dependencies have not been included, but readers can refer to the GitHub repository for the full code: https://github.com/PacktPublishing/Hands-On-Generative-AI-with-Python-and-TensorFlow-2.

Let's begin with an overview of deepfakes.

Deepfakes overview

Deepfakes is an all-encompassing term representing content generated using artificial intelligence (in particular, deep learning) that seems realistic and authentic to a human being. The generation of fake content or manipulation of existing content to suit the needs and agenda of the entities involved is not new. In the introduction, we discussed a few movies where CGI and painstaking manual effort helped in generating realistic results. With advancements in deep learning and, more specifically, generative models, it is becoming increasingly difficult to differentiate between what is real and what is fake.

Generative Adversarial Networks (GANs) have played a very important role in this space by enabling the generation of sharp, high-quality images and videos. Works such as https://thispersondoesnotexist.com, based on StyleGAN, have really pushed the boundaries in terms of the generation of high-quality realistic content. A number of other key architectures (some of which we discussed in Chapter 6, Image Generation with GANs, and Chapter 7, Style Transfer with GANs) have become key building blocks for different deepfake setups.
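
As a reminder of the adversarial idea that these architectures build on, the following is a minimal sketch of the standard GAN losses. It is written in plain Python rather than TensorFlow so that it runs without any dependencies; the function names and the sample discriminator scores are assumptions made purely for illustration and are not taken from the book's repository.

```python
import math


def binary_cross_entropy(target, prob, eps=1e-7):
    """Binary cross-entropy for one prediction; `prob` is the discriminator's P(real)."""
    prob = min(max(prob, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))


def discriminator_loss(real_probs, fake_probs):
    """The discriminator is rewarded for scoring real samples as 1 and fakes as 0."""
    real_loss = sum(binary_cross_entropy(1.0, p) for p in real_probs) / len(real_probs)
    fake_loss = sum(binary_cross_entropy(0.0, p) for p in fake_probs) / len(fake_probs)
    return real_loss + fake_loss


def generator_loss(fake_probs):
    """Non-saturating generator loss: the generator wants its fakes scored as real (1)."""
    return sum(binary_cross_entropy(1.0, p) for p in fake_probs) / len(fake_probs)


# Hypothetical discriminator scores (probability that a sample is real).
real_scores = [0.9, 0.8]
weak_fakes = [0.1, 0.2]      # fakes the discriminator easily rejects
strong_fakes = [0.7, 0.9]    # fakes that fool the discriminator

confident_d = discriminator_loss(real_scores, weak_fakes)
fooled_d = discriminator_loss(real_scores, strong_fakes)
```

With these illustrative scores, the discriminator's loss is lower when it rejects the fakes, while the generator's loss is lower when its fakes are scored as real. This opposition is what pushes a trained generator toward the sharp, realistic outputs described above.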

Deepfakes have a number of applications, which can be categorized into creative, productive, and unethical or malicious use cases. The following are a few examples that highlight the different use cases of deepfakes.

Creative and productive use cases:

Recreating history and famous personalities: There are a number of historical figures we would love to interact with and learn from. With the ability to manipulate and generate realistic content, deepfakes are just the right technology for such use cases. A large-scale experiment of this type was developed to bring famous surrealist painter Salvador Dali back to life. The Dali Museum, in collaboration with the ad agency GS&P, developed an exhibition entitled Dali Lives.² The exhibition used archival footage and interviews to train a deepfake setup on thousands of hours of videos. The final outcome was a re-enactment of Dali's voice and facial expressions. Visitors to the museum were greeted by Dali, who then shared his life's stories with them. Toward the end, Dali even proposed a selfie with the visitors, and the output photographs were realistic selfies indeed.
Movie translation: With the likes of Netflix becoming the norm these days, viewers are watching far more cross-lingual content than ever before. While subtitles and manual dubbing are viable options, they leave a lot to be desired. With deepfakes, using AI to autogenerate dubbed translations of any video is easier than ever. The social initiative known as Malaria Must Die created a powerful campaign that leveraged a similar technique to have David Beckham, a famous footballer, speak in nine different languages to spread awareness.³ Similarly, deepfakes have been used by a political party in India, where a candidate is seen speaking in different languages as part of his election campaign.⁴
Fashion: Making use of GANs and other generative models to create new styles and fashion content is not new. With deepfakes, researchers, bloggers, and fashion houses are taking the fashion industry to new levels. We now have AI-generated digital models that adorn new fashion line-ups and help reduce costs. This technology is even being used to create renderings of models personalized to mimic a buyer's body type, to improve the chances of a purchase.⁵
Video game characters: Video games have improved a lot over the years, with many modern games presenting cinema-class graphics. Traditionally, human actors have been leveraged to create characters within such games. However, there is now a growing trend of using deepfakes and related technologies to develop characters and storylines. The developers of the game Call of Duty released a trailer showing former US president Ronald Reagan playing one of the characters in the game.⁶
Stock images: Marketing flyers, advertisements, and official documents sometimes require certain individuals to be placed alongside the rest of t...