Software Performance and Scalability

A Quantitative Approach

About this book

Praise from the Reviewers:

"The practicality of the subject in a real-world situation distinguishes this book from others available on the market."
—Professor Behrouz Far, University of Calgary

"This book could replace the computer organization texts now in use that every CS and CpE student must take. . . . It is much needed, well written, and thoughtful."
—Professor Larry Bernstein, Stevens Institute of Technology

A distinctive, educational text on software performance and scalability

This is the first book to take a quantitative approach to the subject of software performance and scalability. It brings together three unique perspectives to demonstrate how your products can be optimized and tuned for the best possible performance and scalability:

  • The Basics—introduces the computer hardware and software architectures that predetermine the performance and scalability of a software product as well as the principles of measuring the performance and scalability of a software product
  • Queuing Theory—helps you learn the performance laws and queuing models for interpreting the underlying physics behind software performance and scalability, supplemented with ready-to-apply techniques for improving the performance and scalability of a software system
  • API Profiling—shows you how to design more efficient algorithms and achieve optimized performance and scalability, aided by adopting an API profiling framework (perfBasic) built on the concept of a performance map for drilling down performance root causes at the API level

Software Performance and Scalability gives you a specialized skill set that will enable you to design and build performance into your products with immediate, measurable improvements. Complemented with real-world case studies, it is an indispensable resource for software developers, quality and performance assurance engineers, architects, and managers. It is an ideal text for university courses related to computer and software performance evaluation and can also be used to supplement a course in computer organization or in queuing theory for upper-division and graduate computer science students.

Part 1
The Basics
I went behind the scenes to look at the mechanism.
—Charles Babbage, 1791–1871, the father of computing
Many factors can critically impact the performance and scalability of a software system. The three with the most impact are the raw capabilities of the underlying hardware platform; the maturity of the underlying software platform (mainly the operating system, various device interface drivers, the supporting virtual machine stack, the run-time environment, etc.); and the system’s own design and implementation. If the software system is an application built on middleware such as database servers, application servers, Web servers, and other third-party components, then the performance and scalability of those middleware systems directly affect the performance and scalability of the application.
Understanding the performance and scalability of a software system qualitatively should begin with a solid understanding of the performance features built into modern computer systems, as well as the performance and scalability implications of the various modern software platforms and architectures. Understanding them quantitatively calls for a test framework that can be depended upon to provide reliable information about the true performance and scalability of the software system in question. These ideas motivated me to select the following three chapters for this part:
  • Chapter 1—Hardware Platform
  • Chapter 2—Software Platform
  • Chapter 3—Testing Software Performance and Scalability
The material presented in these three chapters is by no means the cliché you have heard again and again. I have filled each chapter with real-world case studies so that you can actually feel the performance and scalability pitfalls associated with each case quantitatively.
1
Hardware Platform
What mathematical problems should a computing machine solve?
—Konrad Zuse, 1934
To build new specifications from given specifications by a prescription.
—His answer in 1936
Computing is the derivation of result specifications from any given specifications by a prescription.
—His extended definition in 1946
The performance a software system exhibits often depends solely on the raw speed of the underlying hardware platform, which is largely determined by the central processing unit (CPU) horsepower of a computer. The scalability a software system exhibits depends on the scalability of the architecture of the underlying hardware platform as well. I have had many experiences with customers who reported that the slow performance of a software system was simply caused by the use of undersized hardware. It’s fair to say that the hardware platform is the single most critical factor in determining the performance and scalability of a software system. We’ll see in this chapter two supporting case studies associated with Intel® hyperthreading technology and the newer Intel multicore processor architecture.
As is well known, the astonishing advances of computers can be characterized quantitatively by Moore’s law. Intel co-founder Gordon E. Moore observed in his seminal 1965 paper that the density of transistors on a computer chip increases exponentially, doubling approximately every two years. The trend has continued for decades and is not expected to stop for at least another decade.
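As a back-of-the-envelope illustration of that exponential form, the short sketch below projects transistor counts forward from the roughly 2300-transistor Intel 4004 of 1971; the starting point is chosen only for illustration, and actual product counts vary:

```python
# Moore's law as a formula: N(t) = N0 * 2**((t - t0) / 2),
# i.e., transistor density doubles roughly every two years.
n0, t0 = 2_300, 1971  # Intel 4004: roughly 2300 transistors

for year in (1971, 1981, 1991, 2001):
    n = n0 * 2 ** ((year - t0) / 2)
    print(year, f"~{n:,.0f} transistors")  # e.g., 2001 -> ~75 million
```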
The quantitative approach pioneered by Moore has been very effective in quantifying the advances of computers. It has been extended into other areas of computer and software engineering as well, to help refine the methodologies of developing better software and computer architectures [Bernstein and Yuhas, 2005; Laird and Brennan, 2006; Gabarro, 2006; Hennessy and Patterson, 2007]. This book is an attempt to introduce quantitativeness into dealing with the challenges of software performance and scalability facing the software industry today.
To see how modern computers have become so powerful, let’s begin with the Turing machine.
1.1 TURING MACHINE
Although Charles Babbage (1791–1871) is known as the father of computing, the most original idea of a computing machine was described by Alan Turing more than seven decades ago in 1936. Turing was a mathematician and is often considered the father of modern computer science.
As shown in Figure 1.1, a Turing machine consists of the following four basic elements:
Figure 1.1 Concept of a Turing machine.
image
  • A tape, which is divided into cells, one next to the other. Each cell contains a symbol from some finite alphabet. This tape is assumed to be infinitely long on both ends. It can be read or written.
  • A head that can read and write symbols on the tape.
  • A table of instructions that tell the machine what to do next, based on the current state of the machine and the symbols it is reading on the tape.
  • A state register that stores the states of the machine.
A Turing machine rests on two assumptions: unlimited storage space, and the freedom to take however long a task requires. As a theoretical model, it exhibits the great power of abstraction to the highest degree. To some extent, modern computers are as close to Turing machines as modern men are to cavemen. Remarkably, today’s computers still operate on the same principles Turing proposed seven decades ago. To convince you that this is true, here is a comparison between a Turing machine’s basic elements and a modern computer’s constituent parts:
  • Tape—memory and disks
  • Head—I/O controllers (memory bus, disk controllers, and network port)
  • Table + state register—CPUs
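The table-driven operation described above is simple enough to make concrete in a few lines of code. Here is a minimal sketch of a Turing machine simulator; the instruction table (a one-tape binary inverter) is a made-up example for illustration:

```python
# Minimal Turing machine: a tape, a read/write head, a state register,
# and a table of instructions keyed by (state, symbol).
def run_turing_machine(table, tape, state="start", halt="halt", max_steps=10_000):
    cells = dict(enumerate(tape))  # sparse tape, "infinite" in both directions
    head = 0
    for _ in range(max_steps):
        if state == halt:
            break
        symbol = cells.get(head, "_")             # "_" is the blank symbol
        write, move, state = table[(state, symbol)]
        cells[head] = write                       # the head writes a symbol
        head += 1 if move == "R" else -1          # and moves one cell
    return "".join(cells[i] for i in sorted(cells))

# Example instruction table: invert a binary string, halting on blank.
invert = {
    ("start", "0"): ("1", "R", "start"),
    ("start", "1"): ("0", "R", "start"),
    ("start", "_"): ("_", "R", "halt"),
}

print(run_turing_machine(invert, "10110"))  # prints 01001_
```

In the next section, I’ll briefly introduce the next milestone in computing history, the von Neumann architecture.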
1.2 VON NEUMANN MACHINE
John von Neumann was another mathematician who pioneered making computers a reality. He proposed, and participated in building, a machine named EDVAC (Electronic Discrete Variable Automatic Computer) in 1946. His model is very close to the computers we use today. As shown in Figure 1.2, the von Neumann model consists of four parts: memory, a control unit, an arithmetic logic unit, and input/output.
Figure 1.2 von Neumann architecture.
image
Similar to the modern computer architecture, in the von Neumann architecture, memory is where instructions and data are stored, the control unit interprets instructions while coordinating other units, the arithmetic logic unit performs arithmetic and logical operations, and the input/output provides the interface with users.
The most prominent feature of the von Neumann architecture is the concept of the stored program. Prior to the von Neumann architecture, all computers were built with fixed programs, much like today’s desktop calculators, which can perform only simple calculations and cannot run Microsoft Office or play video games. The stored program was a giant leap: it made machine hardware independent of the software programs that run on it. This separation of hardware from software had profound effects on how computers evolved.
The latency associated with data transfer between the CPU and memory was noticed as early as the von Neumann architecture itself. It became known as the von Neumann bottleneck, a term coined by John Backus in his 1977 ACM Turing Award lecture. To overcome the von Neumann bottleneck and improve computing efficiency, today’s computers add more and more cache between the CPU and main memory. On-chip caching is one of the most crucial performance optimization strategies at the hardware level and is indispensable for modern computers.
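To feel the cost of the bottleneck, here is a small experiment (a rough sketch using NumPy, which this book does not otherwise assume; the array size and timings will vary by machine): summing the same array sequentially versus in a random order. The random order defeats the caches and the hardware prefetcher, so it runs many times slower even though the arithmetic is identical.

```python
# Rough demonstration of the von Neumann bottleneck and caching:
# identical work done cache-friendly (sequential) vs. cache-hostile (random).
import time
import numpy as np

a = np.arange(20_000_000, dtype=np.int64)
idx = np.random.permutation(a.size)  # a random visiting order

t = time.perf_counter()
s1 = int(a.sum())                    # sequential: streams whole cache lines
seq = time.perf_counter() - t

t = time.perf_counter()
s2 = int(a[idx].sum())               # random gather: mostly cache misses
rnd = time.perf_counter() - t

assert s1 == s2                      # same result, very different memory behavior
print(f"sequential: {seq:.3f}s  random: {rnd:.3f}s")
```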
In the next section, I’ll give a brief overview of the Zuse machine, one of the earliest generations of commercialized computers. Zuse built his machines independently of the Turing machine and the von Neumann machine.
1.3 ZUSE MACHINE
When talking about computing machines, we must mention Konrad Zuse, who was another great pioneer in the history of computing.
In 1934, driven by his dislike of the time-consuming calculations he had to perform as a civil engineer, Konrad Zuse began to formulate his first ideas on computing. He defined the logical architecture of his Z1, Z2, Z3, and Z4 computers. Completely unaware of computer-related developments in Germany or other countries until a very late stage, he conceived and implemented the principles of modern digital computers in isolation.
From the beginning it was clear to Zuse that his computers should be freely programmable, which means that they should be able to read an arbitrary meaningful sequence of instructions from a punch tape. It was also clear to him that the machines should work in the binary number system, because he wanted to construct his computers using binary switching elements. Not only should the numbers be represented in a binary form, but the whole logic of the machine should work using a binary switching mechanism (0–1 principle).
Zuse took performance into account in his designs from the very beginning. He designed a high-performance binary floating-point unit in the semilogarithmic representation, which allowed him to calculate very small and very big numbers with sufficient precision. He also implemented a high-performance adder with one-step carry-ahead and precise arithmetic exception handling.
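The "one-step carry-ahead" idea survives today as the carry-lookahead adder. Below is a sketch of the logic for a 4-bit slice, with plain Python standing in for gates; it illustrates the general technique, not Zuse's actual circuit:

```python
# Carry-lookahead addition for 4 bits: carries come from generate/propagate
# signals rather than rippling from bit to bit.
def add4_lookahead(a, b, c0=0):
    g = [(a >> i) & (b >> i) & 1 for i in range(4)]    # generate:  a_i AND b_i
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(4)]  # propagate: a_i XOR b_i
    c = [c0]
    for i in range(4):                    # in hardware, parallel gate trees
        c.append(g[i] | (p[i] & c[i]))    # c_{i+1} = g_i OR (p_i AND c_i)
    s = [p[i] ^ c[i] for i in range(4)]   # sum bit:   p_i XOR c_i
    return sum(s[i] << i for i in range(4)) | (c[4] << 4)

print(add4_lookahead(0b1011, 0b0110))  # 11 + 6 = 17
```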
Zuse even founded his own very innovative company, Zuse KG, which produced more than 250 computers with a value of 100 million DM between 1949 and 1969. During his life, Konrad Zuse also painted several hundred oil paintings; he held about three dozen exhibitions and sold his paintings. What an interesting life he had!
In the next section, I’ll introduce the Intel architecture, which prevails over the other architectures for modern computers. Most likely, you use an Intel architecture based system for your software development work, and you may also deploy your software on Intel architecture based systems for performance and scalability tests. As a matter of fact, I’ll mainly use the Intel platform throughout this book for demonstrating software performance optimization and tuning techniques that apply to other platforms as well.
1.4 INTEL MACHINE
Intel architecture based systems are most popular not only for development but also for production. Let’s dedicate this section to understanding the Intel architecture based machines.
1.4.1 History of Intel’s Chips
Intel started its chip business with a 108 kHz processor in 1971. Since then, its processor family has evolved from year to year through the chain of 4004–8008–8080–8086–80286–80386–80486–Pentium–Pentium Pro–Pentium II–Pentium III/Xeon–Itanium–Pentium 4/Xeon to today’s multicore processors. Table 1.1 shows the history of the Intel processor evolution up to 2005 when the multicore microarchitecture was introduced to increase energy efficiency while delivering higher performance.
TABLE 1.1 Evolution of the Intel Processor Family Prior to the Multicore Microarchitecture Introduced in 2005
image
1.4.2 Hyperthreading
Intel introduced its hyperthreading (HT) technology with the Pentium 4 in 2002. People outside Intel are often confused about what exactly HT is. It is a very relevant subject when you conduct performance and scalability testing, because you need to know whether HT is enabled on the systems under test. A quick way to check is sketched below; after that, let’s clarify what HT actually is.
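Comparing logical and physical CPU counts is usually enough to tell. Here is a minimal sketch using the third-party psutil package (an assumption of this sketch, not something the book relies on):

```python
# With HT enabled, the OS sees more logical processors than physical cores.
import psutil  # third-party: pip install psutil

logical = psutil.cpu_count(logical=True)
physical = psutil.cpu_count(logical=False)
status = "likely enabled" if logical and physical and logical > physical else "off"
print(f"logical={logical} physical={physical} -> HT/SMT {status}")
```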
First, let’s see how a system with two physical processors works. In a dual-processor system, the two processors are physically separate, each in its own socket. Each processor has its own hardware resources, such as an arithmetic logic unit (ALU) and cache. The two processors share the main memory only through the system bus, as shown in Figure 1.3.
Figure 1.3 Two physical processors in an Intel system.
image
As shown in Figure 1.4, with hyperthreading, only a small set of microarchitecture states is duplicated, while the arithmetic logic units and cache(s) are shared. Compared with a single processor without HT support, the die size of a single processor with HT is increased by less than 5%. As you can imagine, HT may slow down single-threaded applications because of the overhead for synchronizations between the two logical processors. However, it is beneficial for multithreaded applications. Of course, a single processor with HT will not be the same as ...

Table of contents

  1. Cover
  2. Frontmatter
  3. Title
  4. Copyright page
  5. Dedication
  6. Preface
  7. Acknowledgments
  8. Introduction
  9. Part 1: The Basics
  10. Part 2: Applying Queuing Theory
  11. Part 3: Applying API Profiling
  12. Appendix A: Stochastic Equilibrium and Ergodicity
  13. Appendix B: Memoryless Property of the Exponential Distribution
  14. Appendix C: M/M/1 Queues at Steady State
  15. Index