Section 1 – The Basics of Compiler Construction with LLVM
In this section, you will learn how to compile LLVM by yourself, and how you can tailor the build to your needs. You will understand how LLVM projects are organized, and you will create your first project utilizing LLVM. You will also learn how to compile LLVM and applications using LLVM for a different CPU architecture. Finally, you will explore the overall structure of a compiler, while creating a small compiler yourself.
This section comprises the following chapters:
- Chapter 1, Installing LLVM
- Chapter 2, Touring the LLVM Source
- Chapter 3, The Structure of a Compiler
Chapter 1: Installing LLVM
To learn how to work with LLVM, it is best to begin by compiling LLVM from source. LLVM is an umbrella project, and its GitHub repository contains the sources for all the projects that belong to LLVM. Each LLVM project is in a top-level directory of the repository. Besides cloning the repository, you must also install all the tools that the build system requires.
In this chapter, you will learn about the following topics:
- Getting the prerequisites ready, which will show you how to set up your build system.
- Building with CMake, which will cover how to compile and install the LLVM core libraries and Clang with CMake and Ninja.
- Customizing the build process, which will talk about the various ways you can influence the build process.
Getting the prerequisites ready
To work with LLVM, your development system must run a common operating system such as Linux, FreeBSD, macOS, or Windows. Building LLVM and Clang with debug symbols enabled easily needs tens of gigabytes of disk space, so be sure that your system has plenty of disk space available. For this scenario, you should have 30 GB of free space.
The required disk space depends heavily on the chosen build options. For example, building only the LLVM core libraries in release mode, while targeting only one platform, requires about 2 GB of free disk space, which is the bare minimum needed. To reduce compile times, a fast CPU (such as a quad-core CPU with a 2.5 GHz clock speed) and a fast SSD are also helpful.
It is even possible to build LLVM on a small device such as a Raspberry Pi, although doing so takes a lot of time. I developed the examples in this book on a laptop with an Intel quad-core CPU running at 2.7 GHz clock speed, with 40 GB RAM and 2.5 TB SSD disk space. This system is well-suited for the development task at hand.
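Before starting a build, it can be useful to check how much space is actually free. The following is a minimal sketch, not a required step: the function names free_gb and has_space and the variables BUILD_DIR and REQUIRED_GB are made up for this illustration, and the 30 GB default simply mirrors the debug-build figure given above.

```shell
#!/bin/sh
# Sketch: report the free disk space of the planned build directory.
# BUILD_DIR and REQUIRED_GB are hypothetical variables used only in this example.

# free_gb DIR: print the free space of the file system holding DIR, in whole GB.
free_gb() {
    # -P selects the portable one-line-per-file-system output format,
    # -k reports sizes in 1024-byte blocks; column 4 is the available space.
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# has_space AVAILABLE_GB REQUIRED_GB: succeed if enough space is available.
has_space() {
    [ "$1" -ge "$2" ]
}

build_dir="${BUILD_DIR:-.}"      # directory to check, defaults to the current one
required="${REQUIRED_GB:-30}"    # required space in GB, defaults to 30
available="$(free_gb "$build_dir")"

if has_space "$available" "$required"; then
    echo "OK: ${available} GB free in ${build_dir} (need ${required} GB)"
else
    echo "WARNING: only ${available} GB free in ${build_dir} (need ${required} GB)"
fi
```

For a release build of only the core libraries, setting REQUIRED_GB=2 would match the minimum mentioned above.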
Your development system must have some prerequisite software installed. Let's review the minimum required versions of these software packages.
Note
The version numbers given here are suitable for LLVM 12. Linux distributions often ship more recent versions, which can be used as well. Later versions of LLVM may require more recent versions of the packages mentioned here.
To check out the source from GitHub, you need git (https://git-scm.com/). There is no requirement for a specific version, but the GitHub help pages recommend using at least version 1.7.10.
The LLVM project uses CMake (https://cmake.org/) as the build file generator. At least version 3.13.4 is required. CMake can generate build files for various build systems. In this book, Ninja (https://ninja-build.org/) is used because it is fast and available on all platforms. The latest version, 1.9.0, is recommended.
Obviously, you also need a C/C++ compiler. The LLVM projects are written in modern C++, based on the C++14 standard. A conforming compiler and standard library are required. The following compilers are known to work with LLVM 12:
- gcc 5.1.0 or later
- Clang 3.5 or later
- Apple Clang 6.0 or later
- Visual Studio 2017 or later
Please be aware that as the LLVM project develops further, the compiler requirements are likely to change. At the time of writing, there are discussions about moving to C++17 and dropping Visual Studio 2017 support. In general, you should use the latest compiler version available for your system.
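A quick way to see whether a compiler accepts C++14 at all is to feed it a construct that C++11 does not allow. The sketch below makes several assumptions: the compiler is reachable as c++ or through the CXX environment variable, it understands the gcc/Clang-style options -std=c++14, -x c++, and -fsyntax-only (so it does not apply to Visual Studio), and the function name supports_cxx14 is made up for this illustration. It probes only the language mode, not the conformance of the standard library.

```shell
#!/bin/sh
# Sketch: probe whether a gcc- or Clang-style compiler accepts C++14 code.
# Assumption: the compiler is named by CXX, or is available as "c++".
cxx="${CXX:-c++}"

# supports_cxx14 COMPILER: succeed if COMPILER accepts a C++14-only construct.
supports_cxx14() {
    # A generic lambda (a lambda with an "auto" parameter) is valid in C++14
    # but not in C++11. The source is read from standard input ("-") and
    # only checked for validity; no output file is produced.
    printf 'int main() { auto id = [](auto x) { return x; }; return id(0); }\n' |
        "$1" -std=c++14 -x c++ -fsyntax-only - >/dev/null 2>&1
}

if supports_cxx14 "$cxx"; then
    echo "$cxx accepts C++14"
else
    echo "$cxx is missing or does not accept C++14"
fi
```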
Python (https://python.org/) is used to generate the build files and to run the test suite. It should be at least version 3.6.
Although not covered in this book, there may be reasons why you need to use Make instead of Ninja. In this case, you need to use GNU Make (https://www.gnu.org/software/make/) version 3.79 or later. The usage of both build tools is very similar. It is sufficient to replace ninja in each command with make for the scenarios described here.
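To see which versions are currently installed, a small shell script along the following lines may help. It is only a sketch: the helper names version_ge, first_version, and check_tool are made up for this illustration, and the command name python3 is an assumption about how Python is installed on your system. The CMake and Python minimums are the LLVM 12 figures quoted above, the Ninja figure is a recommendation rather than a hard requirement, and git is only reported because no specific version is required.

```shell
#!/bin/sh
# Sketch: report installed tool versions against the LLVM 12 figures above.

# version_ge ACTUAL MINIMUM: succeed if ACTUAL >= MINIMUM (dotted numbers).
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        n = split(a, x, "."); m = split(b, y, ".")
        max = (n > m) ? n : m
        for (i = 1; i <= max; i++) {
            # Missing components count as 0; "+ 0" forces a numeric comparison.
            if ((x[i] + 0) > (y[i] + 0)) exit 0
            if ((x[i] + 0) < (y[i] + 0)) exit 1
        }
        exit 0
    }'
}

# first_version: print the first dotted version number found on standard input.
first_version() {
    grep -Eo '[0-9]+(\.[0-9]+)+' | head -n 1
}

# check_tool NAME MINIMUM COMMAND...: run COMMAND and compare its version.
check_tool() {
    name="$1"; minimum="$2"; shift 2
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "$name: not found"
        return 0
    fi
    actual="$("$@" 2>&1 | first_version)"
    if version_ge "$actual" "$minimum"; then
        echo "$name: $actual (ok, want >= $minimum)"
    else
        echo "$name: $actual (older than $minimum)"
    fi
}

check_tool git    0      git --version       # no specific version required
check_tool cmake  3.13.4 cmake --version
check_tool ninja  1.9.0  ninja --version     # recommended, not a hard minimum
check_tool python 3.6    python3 --version
```

The comparison is done numerically per component, so that, for example, 1.10.0 is correctly treated as newer than 1.9.0, which a plain string comparison would get wrong.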
To install the prerequisite software, the easiest approach is to use the package manager of your operating system. The following sections show the commands you must enter to install the software on the most popular operating systems.
Ubuntu
Ubuntu 20.04 uses the APT package manager. Most of the basic utilities are already installed; only the development tools are missing. To install all the packages at once, type the following:
$ sudo apt install -y gcc g++ git cmake ninja-build
Fedora and Red Hat
The package manager for Fedora 33 and Red Hat Enterprise Linux 8.3 is called DNF. As with Ubuntu, most of the basic utilities are already installed. To install all the packages at once, type the following:
$ sudo dnf install -y gcc gcc-c++ git cmake ninja-build
FreeBSD
On FreeBSD 12 or later, you must use the PKG package manager. FreeBSD differs from Linux-based systems in that Clang is the preferred compiler. To install all the packages at once, type the following:
$ sudo pkg install -y clang git cmake ninja
macOS
For development on macOS, it is best to install Xcode from the App Store. While the Xcode IDE is not used in this book, it comes with the required C/C++ compilers and supporting utilities. To install the other tools, you can use the Homebrew package manager (https://brew.sh/). To install all the packag...