eBook - ePub

Generative AI with Python and TensorFlow 2

Name: Generative AI with Python and TensorFlow 2
Author: Joseph Babcock, Raghav Bali

Harness the power of generative models to create images, text, and music

Joseph Babcock, Raghav Bali

488 pages
English
ePUB (mobile friendly)
Available on iOS and Android

About this book

Fun and exciting projects to learn what artificial minds can create

Key Features

Code examples are in TensorFlow 2, which makes it easy for PyTorch users to follow along
Look inside the most famous deep generative models, from GPT to MuseGAN
Learn to build and adapt your own models in TensorFlow 2.x
Explore exciting, cutting-edge use cases for deep generative AI

Book Description

Machines are excelling at creative human skills such as painting, writing, and composing music. Could you be more creative than generative AI?

In this book, you'll explore the evolution of generative models, from restricted Boltzmann machines and deep belief networks to VAEs and GANs. You'll learn how to implement models yourself in TensorFlow and get to grips with the latest research on deep neural networks.

There's been an explosion in potential use cases for generative models. You'll look at OpenAI's news generator, deepfakes, and training deep learning agents to navigate a simulated environment.

Recreate the code that's under the hood and uncover surprising links between text, image, and music generation.

What you will learn

Export the code from GitHub into Google Colab to see how everything works for yourself
Compose music using LSTM models, simple GANs, and MuseGAN
Create deepfakes using facial landmarks, autoencoders, and pix2pix GAN
Learn how attention and transformers have changed NLP
Build several text generation pipelines based on LSTMs, BERT, and GPT-2
Implement paired and unpaired style transfer with networks like StyleGAN
Discover emerging applications of generative AI like folding proteins and creating videos from images

Who this book is for

This is a book for Python programmers who are keen to create and have some fun using generative models. To make the most out of this book, you should have a basic familiarity with math and statistics for machine learning.

Information

Publisher: Packt Publishing
Year: 2021
ISBN: 9781800208506
Subject: Computer Science; Artificial Intelligence (AI) & Semantics

8 Deepfakes with GANs

Manipulating videos and photographs has been common practice for a long time. If you have seen movies like Forrest Gump or Fast and Furious 7, chances are you did not even notice that the scenes with John F. Kennedy or Paul Walker, respectively, were fabricated and edited into the movies.

You may recall one particular scene from the movie Forrest Gump, where Gump meets John F. Kennedy. The scene was created using complex visual effects and archival footage to ensure high-quality results. Hollywood studios, spy agencies from across the world, and media outlets have been using editing tools such as Photoshop, After Effects, and complex custom visual effects/CGI (computer-generated imagery) pipelines to produce such compelling results. While the results have been more or less believable in most instances, it takes a huge amount of manual effort and time to edit every detail, such as scene lighting, face, eyes, and lip movements, as well as shadows, for every frame of the scene.

Along the same lines, there is a high chance you have come across a BuzzFeed video¹ in which former US president Barack Obama says "Killmonger was right" (Killmonger is one of the Marvel Cinematic Universe's villains). While obviously fake, the video seems real in both its visual and audio aspects. There are a number of other examples in which prominent personalities can be seen making comments they would not usually make.

Setting ethics aside, there is one major difference between Gump meeting John F. Kennedy and Barack Obama talking about Killmonger. As mentioned earlier, the former is the result of painstaking manual work done using complex visual effects/CGI. The latter, on the other hand, is the result of a technology called deepfakes. A portmanteau of the words deep learning and fake, deepfake is a broad term for the AI-enabled technology used to generate examples such as the Obama video.

In this chapter, we will cover different concepts, architectures, and components associated with deepfakes. We will focus on the following topics:

Overview of the deepfakes technological landscape
The different forms of deepfaking: replacement, re-enactment, and editing
Key features leveraged by different architectures
A high-level deepfakes workflow
Swapping faces using autoencoders
Re-enacting Obama's face movements using pix2pix
Challenges and ethical issues
A brief discussion of off-the-shelf implementations

We will cover the internal workings of different GAN architectures and key contributions that have enabled deepfakes. We will also build and train these architectures from scratch to get a better understanding of them. Deepfakes are not limited to videos or photographs, but are also used to generate fake text (news articles, books) and even audio (voice clips, phone calls). In this chapter, we will focus on videos/images only and the term deepfakes refers to related use cases, unless stated otherwise.

All code snippets presented in this chapter can be run directly in Google Colab. For reasons of space, import statements for dependencies have not been included, but readers can refer to the GitHub repository for the full code: https://github.com/PacktPublishing/Hands-On-Generative-AI-with-Python-and-TensorFlow-2.
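
Because the snippets omit their imports, it can help to check up front which dependencies a runtime provides. The following is a minimal sketch using only the Python standard library. The package list and the helper name `missing_packages` are assumptions made for illustration; the GitHub repository remains the authoritative source for what each notebook actually imports.

```python
import importlib.util

# Hypothetical dependency list, assumed for illustration only;
# the GitHub repository is the authoritative source.
ASSUMED_PACKAGES = ["tensorflow", "numpy", "cv2", "matplotlib"]


def missing_packages(names):
    """Return the top-level module names that cannot be found in this runtime."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(ASSUMED_PACKAGES)
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All assumed dependencies are available.")
```

Running this in the first cell of a notebook shows which packages need to be installed before any of the later snippets are executed.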

Let's begin with an overview of deepfakes.

Deepfakes overview

Deepfakes is an all-encompassing term representing content generated using artificial intelligence (in particular, deep learning) that seems realistic and authentic to a human being. The generation of fake content or manipulation of existing content to suit the needs and agenda of the entities involved is not new. In the introduction, we discussed a few movies where CGI and painstaking manual effort helped in generating realistic results. With advancements in deep learning and, more specifically, generative models, it is becoming increasingly difficult to differentiate between what is real and what is fake.

Generative Adversarial Networks (GANs) have played a very important role in this space by enabling the generation of sharp, high-quality images and videos. Works such as https://thispersondoesnotexist.com, based on StyleGAN, have really pushed the boundaries in terms of the generation of high-quality realistic content. A number of other key architectures (some of which we discussed in Chapter 6, Image Generation with GANs, and Chapter 7, Style Transfer with GANs) have become key building blocks for different deepfake setups.
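
As a reminder of the adversarial setup behind these models, the sketch below computes the two standard GAN losses in plain Python. It is an illustration under stated assumptions, not the chapter's implementation: the function names (`bce`, `discriminator_loss`, `generator_loss`) and the toy discriminator outputs are hypothetical, and the generator uses the common non-saturating variant of the loss.

```python
import math


def bce(prediction, target, eps=1e-7):
    """Binary cross-entropy for a single probability and a 0/1 target."""
    p = min(max(prediction, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1 - target) * math.log(1.0 - p))


def discriminator_loss(d_real, d_fake):
    """The discriminator wants D(x) near 1 on real samples and D(G(z)) near 0 on fakes."""
    real_term = sum(bce(p, 1) for p in d_real) / len(d_real)
    fake_term = sum(bce(p, 0) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake):
    """Non-saturating generator loss: the generator wants D(G(z)) near 1."""
    return sum(bce(p, 1) for p in d_fake) / len(d_fake)


# Hypothetical discriminator outputs (probability that a sample is real).
d_real = [0.9, 0.8]
d_fake = [0.2, 0.1]
print("discriminator loss:", discriminator_loss(d_real, d_fake))
print("generator loss:", generator_loss(d_fake))
```

The generator's loss falls as the discriminator assigns higher "real" probabilities to generated samples, which is the pressure that drives generated content toward realism.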

Deepfakes have a number of applications, which can be categorized into creative, productive, and unethical or malicious use cases. The following are a few examples that highlight the different use cases of deepfakes.

Creative and productive use cases:

Recreating history and famous personalities: There are a number of historical figures we would love to interact with and learn from. With the ability to manipulate and generate realistic content, deepfakes are just the right technology for such use cases. A large-scale experiment of this type was developed to bring famous surrealist painter Salvador Dali back to life. The Dali Museum, in collaboration with the ad agency GS&P, developed an exhibition entitled Dali Lives.² The exhibition used archival footage and interviews to train a deepfake setup on thousands of hours of videos. The final outcome was a re-enactment of Dali's voice and facial expressions. Visitors to the museum were greeted by Dali, who then shared his life's stories with them. Toward the end, Dali even proposed a selfie with the visitors, and the output photographs were realistic selfies indeed.
Movie translation: With the likes of Netflix becoming the norm these days, viewers are watching far more cross-lingual content than ever before. While subtitles and manual dubbing are viable options, they leave a lot to be desired. With deepfakes, using AI to autogenerate dubbed translations of any video is easier than ever. The social initiative known as Malaria Must Die created a powerful campaign leveraging a similar technique to help David Beckham, a famous footballer, speak in nine different languages to help spread awareness.³ Similarly, deepfakes have been used by a political party in India, where a candidate is seen speaking in different languages as part of his election campaign.⁴
Fashion: Making use of GANs and other generative models to create new styles and fashion content is not new. With deepfakes, researchers, bloggers, and fashion houses are taking the fashion industry to new levels. We now have AI-generated digital models that adorn new fashion line-ups and help reduce costs. This technology is even being used to create renderings of models personalized to mimic a buyer's body type, to improve the chances of a purchase.⁵
Video game characters: Video games have improved a lot over the years, with many modern games presenting cinema-class graphics. Traditionally, human actors have been used to create characters within such games. However, there is now a growing trend of using deepfakes and related technologies to develop characters and storylines. The developers of the game Call of Duty released a trailer showing former US president Ronald Reagan playing one of the characters in the game.⁶
Stock images: Marketing flyers, advertisements, and official documents sometimes require certain individuals to be placed alongside the rest of t...