Chapter 1. Introducing Akka
In this chapter
- Why scaling is hard
- Write once, scale anywhere
- Introduction to the actor programming model
- Akka actors
- What is Akka?
Up until the middle of the '90s, just before the internet revolution, it was completely normal to build applications that would only ever run on a single computer, a single CPU. If an application wasn't fast enough, the standard response would be to wait a while for CPUs to get faster; no need to change any code. Problem solved. Programmers around the world were having a free lunch, and life was good.
In 2005 Herb Sutter wrote in Dr. Dobb's Journal about the need for a fundamental change (http://www.gotw.ca/publications/concurrency-ddj.htm). In short: a limit to increasing CPU clock speeds has been reached, and the free lunch is over.
If applications need to perform faster, or if they need to support more users, they will have to be concurrent. (We'll get to a strict definition later; for now let's simply define this as not single-threaded. That's not really correct, but it's good enough for the moment.)
Scalability is the measure of how well a system can adapt to a change in demand for resources, without negatively impacting performance. Concurrency is a means to achieve scalability: the premise is that, if needed, more CPUs can be added to servers, which the application then automatically starts making use of. It's the next best thing to a free lunch.
Around the year 2005, when Herb Sutter wrote his excellent article, you'd find companies running applications on clustered multiprocessor servers (often no more than two to three, just in case one of them crashed). Support for concurrency in programming languages was available but limited and considered black magic by many mere mortal programmers. Herb Sutter predicted in his article that "programming languages ... will increasingly be forced to deal well with concurrency."
Let's see what has changed in the decade since! Fast-forward to today, and you find applications running on large numbers of servers in the cloud, integrating many systems across many data centers. The ever-increasing demands of end users push the requirements of performance and stability of the systems that you build.
So where are those new concurrency features? Support for concurrency in most programming languages, especially on the JVM, has hardly changed. Although the implementation details of concurrency APIs have definitely improved, you still have to work with low-level constructs like threads and locks, which are notoriously difficult to work with.
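To make "low-level constructs like threads and locks" concrete, here is a minimal sketch of that style of programming: explicit threads updating shared mutable state behind a lock. The counter example is illustrative only; it is not from Akka or from the rest of this book.

```scala
// A sketch of low-level, lock-based concurrency on the JVM.
// The counter example is illustrative, not part of Akka.
object LockingSketch extends App {
  private var count = 0
  private val lock = new Object

  def increment(): Unit = lock.synchronized {
    // Forget this lock (or acquire two locks in the wrong order elsewhere)
    // and you get races or deadlocks that only show up under load.
    count += 1
  }

  private val threads = (1 to 4).map { _ =>
    new Thread(() => (1 to 100000).foreach(_ => increment()))
  }
  threads.foreach(_.start())
  threads.foreach(_.join())
  println(count) // 400000, but only because every access goes through the lock
}
```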
Next to scaling up (increasing resources, for example CPUs, on existing servers), scaling out refers to dynamically adding more servers to a cluster. Since the '90s, nothing much has changed in how programming languages support networking, either. Many technologies still essentially use RPC (remote procedure calls) to communicate over the network.
In the meantime, advances in cloud computing services and multicore CPU architecture have made computing resources ever more abundant.
PaaS (Platform as a Service) offerings have simplified provisioning and deployment of very large distributed applications, once the domain of only the largest players in the IT industry. Cloud services like AWS EC2 (Amazon Web Services Elastic Compute Cloud) and Google Compute Engine give you the ability to literally spin up thousands of servers in minutes, while tools like Docker, Puppet, Ansible, and many others make it easier to manage and package applications on virtual servers.
The number of CPU cores in devices is also ever-increasing: even mobile phones and tablets have multiple CPU cores today.
But that doesn't mean that you can afford to throw any number of resources at any problem. In the end, everything is about cost and efficiency. So it's all about effectively scaling applications, or in other words, getting bang for your buck. Just as you'd never use a sorting algorithm with exponential time complexity, it makes sense to think about the cost of scaling.
You should have two expectations when scaling your application:
- The ability to handle any increase of demand with finite resources is unrealistic, so ideally you'd want the required increase in resources to grow slowly as demand grows: linearly or better. Figure 1.1 shows the relationship between demand and number of required resources.
- If resources have to be increased, ideally you'd like the complexity of the application to stay the same or increase slowly. (Remember the good ol' free lunch, when no added complexity was required for a faster application!) Figure 1.2 shows the relationship between number of resources and complexity.
Figure 1.1. Demand against resources
Figure 1.2. Complexity against resources
Both the number of required resources and the complexity of the application contribute to the total cost of scaling.
We're leaving a lot of factors out of this back-of-the-envelope calculation, but it's easy to see that both of these rates have a big impact on the total cost of scaling.
One doomsday scenario is where you'd need to pay increasingly more for ever more underutilized resources. Another nightmare scenario is where the complexity of the application shoots through the roof when more resources are added.
This leads to two goals: complexity has to stay as low as possible, and resources must be used efficiently while you scale the application.
Can you use the common tools of today (threads and RPC) to satisfy these two goals? Scaling out with RPC and scaling up with low-level threading aren't good ideas. RPC pretends that a call over the network is no different from a local method call. Every RPC call needs to block the current thread and wait for a response from the network, just to keep up the abstraction of a local method call, and that can be costly. This impedes the goal of using resources efficiently.
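To illustrate that blocking behavior, here is a small sketch that uses a plain HTTP request as a stand-in for an RPC mechanism; the inventory service, its URL, and checkStock are made up, but any synchronous RPC stack looks the same from the caller's point of view.

```scala
// A sketch of the blocking call style described above.
// The remote service and checkStock are hypothetical.
import java.net.{HttpURLConnection, URL}
import scala.io.Source

object BlockingCallSketch extends App {
  def checkStock(productId: String): String = {
    val conn = new URL(s"http://inventory.example.com/stock/$productId")
      .openConnection().asInstanceOf[HttpURLConnection]
    conn.setConnectTimeout(2000)
    conn.setReadTimeout(2000)
    try Source.fromInputStream(conn.getInputStream).mkString
    finally conn.disconnect()
  }

  // The thread that calls checkStock is parked for the whole network round trip.
  // One blocked thread per outstanding call means threads, not CPUs,
  // quickly become the scarce resource.
  println(checkStock("42"))
}
```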
Another problem with this approach is that you need to know exactly where you scale up or scale out. Multithreaded programming and RPC-based network programming are like apples and pears: they run in different contexts, using different semantics and running on different levels of abstraction. You end up hardcoding which parts of your application are using threads for scaling up and which parts are using RPC for scaling out.
Complexity increases significantly the moment you hardcode methods that work on different levels of abstraction. Quick: what's simpler, coding with two entangled programming constructs (RPC and threads), or using just one programming construct? This multipronged approach to scaling applications is more complicated than necessary to flexibly adapt to changes in demand.
Spinning up thousands of servers is simple today, but as you'll see in this first chapter, the same can't be said for programming them.
1.1. What is Akka?
In this book we'll show how the Akka toolkit, an open source project built by Lightbend, provides a simpler, single programming model for coding concurrent and distributed applications: the actor programming model. Actors are (fitting for our industry) nothing new at all, in and of themselves. It's the way that actors are provided in Akka to scale applications both up and out on the JVM that's unique. As you'll see, Akka uses resources efficiently and makes it possible to keep the complexity relatively low while an application scales.
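As a first taste of what that looks like, here is a minimal sketch of an actor using Akka's classic actor API. The Greeter actor, the actor system name, and the message it handles are illustrative only.

```scala
// A minimal sketch of the actor programming model in Akka (classic actor API).
// The Greeter actor and the message it handles are illustrative only.
import akka.actor.{Actor, ActorSystem, Props}

class Greeter extends Actor {
  // An actor handles one message at a time; there are no locks or explicit
  // threads in this code, and messaging works the same locally or remotely.
  def receive: Receive = {
    case name: String => println(s"Hello, $name!")
  }
}

object GreeterApp extends App {
  val system  = ActorSystem("greetings")
  val greeter = system.actorOf(Props(new Greeter), "greeter")
  greeter ! "Akka" // sending a message is asynchronous; the sender doesn't block
}
```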
Akkaâs primary goal is to make it simpler to build applications that are deployed in the cloud or run on devices w...