Computing Networks

From Cluster to Cloud Computing

Pascale Vicat-Blanc, Brice Goglin, Romaric Guillier, Sebastien Soudan

About This Book

"Computing Networks" explores the core of the new distributed computing infrastructures we are using today: the networking systems of clusters, grids and clouds. It helps network designers and distributed-application developers and users to better understand the technologies, specificities, constraints and benefits of these different infrastructures' communication systems.

Cloud computing promises to let millions of users process data anytime, anywhere, and in an energy-efficient way. In order to deliver this emerging traffic in a timely, cost-efficient, energy-efficient and reliable manner over long-distance networks, issues such as quality of service, security, metrology, network-resource scheduling and virtualization have been investigated for the past 15 years. "Computing Networks" explores the core of cluster, grid and cloud networks, giving designers, application developers and users the keys to better build and use these powerful infrastructures.


Information

Publisher: Wiley-ISTE
Year: 2013
ISBN: 9781118601914
Edition: 1

Chapter 1

From Multiprocessor Computers to the Clouds

1.1. The explosion of demand for computing power

The demand for computing power continues to grow because of the technological advances in methods of digital acquisition and processing, the subsequent explosion of volumes of data, and the expansion of connectivity and information exchange. This ever-increasing demand varies depending on the scientific, industrial and domestic sectors considered.
Scientific applications have always needed ever-greater computing resources. Nevertheless, a new development has emerged in the past few years: today's science relies on a very complex interdependence between disciplines, technologies and equipment.
In many disciplines, scientists can no longer work alone at their desks with a blank sheet of paper. They must rely on other specialists to provide the complementary technical and methodological tools that are indispensable to their own research. This is what is known as the rise of multidisciplinarity.
For example, life-science researchers today have to analyze enormous quantities of experimental data that can only be processed by multidisciplinary teams of experts carrying out complex studies and experiments requiring extensive computation. The organization of research communities and the intensification of exchanges between researchers over the past few years have increased the need to pool data and collaborate directly.
Thus these teams, gathering diverse and complementary expertise, demand cooperative work environments that enable them to analyze and visualize large sets of biological data, discuss the results and explore biological questions interactively.
These environments must combine advanced visualization resources, broadband connectivity and access to large reserves of computing power. With such environments, biologists hope, for example, to be able to analyze cell images at very high resolution. Current devices only enable portions of cells to be visualized, and only at low resolution. Nor can they provide contextual information, such as the location within the cell, the cell type or the metabolic state.
Another example is research on climate change. One of the main objectives is to compute adequate estimates of the statistics of climate variability, and thus anticipate the effects of the increase in greenhouse gas concentrations. The areas of study are very varied, ranging from the stability of ocean circulation to changes in atmospheric circulation over a continent, and include statistics on extreme events. This fundamental domain requires combining large amounts of data originating from sources that are highly heterogeneous and, by nature, geographically remote. It involves coupling diverse mathematical models and crossing the varied and complementary points of view of experts.
As for industrial applications, the spread of digital simulation increases the need for computing power. Digital simulation enables real and complex physical phenomena (the resistance of a material, the wear of a mechanism under different operating conditions, etc.) to be simulated by a computer program. The engineer can then study the behavior and properties of the modeled system and predict its evolution. Scientific digital simulations rely on the implementation of mathematical models, often based on the finite element technique, and on the visualization of computing results through computer-generated images. All of these calculations require great processing power.
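To give a feel for the computational pattern behind such simulations, here is a minimal sketch in C (an illustration, not code from this book): an explicit finite-difference solver for heat diffusion in a 1D rod. Finite element codes are far more elaborate, but they share the same structure of a large discretized domain updated over many time steps; the grid size and step count used here are arbitrary illustrative values.

```c
/* Minimal illustrative sketch: explicit finite-difference time stepping
 * for the 1D heat equation u_t = alpha * u_xx on a rod of unit length.
 * Real simulations use far finer grids (and often finite elements),
 * which is precisely why they need so much processing power. */
#include <stdio.h>

#define N 100        /* number of grid points (illustrative) */
#define STEPS 1000   /* number of time steps (illustrative) */

int main(void)
{
    double u[N] = {0.0}, next[N] = {0.0};
    const double alpha = 0.1;                  /* diffusivity */
    const double dx = 1.0 / (N - 1);           /* grid spacing */
    const double dt = 0.4 * dx * dx / alpha;   /* stable time step */

    u[N / 2] = 1.0;  /* initial condition: a heat spike in the middle */

    for (int s = 0; s < STEPS; s++) {
        /* update interior points; boundaries stay fixed at zero */
        for (int i = 1; i < N - 1; i++)
            next[i] = u[i] + alpha * dt / (dx * dx)
                           * (u[i - 1] - 2.0 * u[i] + u[i + 1]);
        for (int i = 1; i < N - 1; i++)
            u[i] = next[i];
    }
    printf("temperature at rod center after %d steps: %f\n", STEPS, u[N / 2]);
    return 0;
}
```

Scaling such a computation to three dimensions and millions of grid points is what turns an engineering question into a demand for cluster-scale computing power.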
In addition to this, the efficiency of computing infrastructure is a crucial factor in business. The cost of maintenance and, increasingly, the cost of energy can become prohibitive. Moreover, the need for immense computing power can be sporadic: a business does not need massive resources to be available continuously, as a few hours or a few nights per week can suffice. Externalization and virtualization of computing resources have therefore become increasingly attractive in this sector.
The domestic sector also progressively requires increased computing, storage and communication power. The Internet is now found in most homes in industrialized countries, and the asymmetric digital subscriber line (ADSL) is commonplace. In the near future, Fiber To The Home (FTTH) will enable the diffusion of new domestic, social and recreational applications based, for example, on virtual-reality or augmented-reality technologies, requiring tremendous computing capacity.
Computing resource needs are growing exponentially. At the same time, the globalization of trade has amplified the geographical distribution of communicating entities. To face these new challenges, three technologies have been developed in the past few years:
– computer clusters;
– computing grids; and
– computing and storage clouds.
In the following sections, we analyze the specific features of these different network computing technologies, which are based on the most advanced communication methods and software.

1.2. Computer clusters

1.2.1. The emergence of computer clusters

The NOW [AND 95] and Beowulf [STE 95] projects in the 1990s launched the idea of aggregating hundreds of standard machines to form a high-power computing cluster. The initial interest lay in the very favorable performance/price ratio: aggregating commodity hardware was much cheaper than purchasing the specialized supercomputers of the time. Beyond the concept itself, however, achieving high computing power requires masking the distributed structure of a cluster, in particular the time- and bandwidth-consuming communications between its nodes. Much work has therefore been devoted to improving these communications in the particular context of the parallel applications executed on such clusters.

1.2.2. Anatomy of a computer cluster

The terms server cluster and computer farm designate a local collection of several independent computers (called nodes) that is managed globally and intended to surpass the limitations of a single computer, in order to:
– increase computing power and availability;
– make it easier to cope with load increases;
– enable load balancing;
– simplify the management of resources (central processing unit or CPU, memory, disks and network bandwidth).
Figure 1.1 highlights the hierarchical structure of a cluster organized around a network of interconnected equipment (switches).

[Figure 1.1. Typical architecture of a computer cluster]

The machines making up a server cluster are generally of the same type. They are stacked up in racks and connected to switches. Therefore, systems can evolve based on need: nodes are added and connected on demand. This type of aggregate, much cheaper than a multiprocessor server, is frequently used for parallel computations. Optimized use of resources enables the distribution of data processing on the different nodes. Clients communicate with a cluster as if it were a single machine. Clusters are normally made up of three or four types of nodes:
– computing nodes (the most numerous: there are generally 16, 32, 64, 128 or 256 of them);
– storage nodes (fewer than about 10);
– front-end nodes (one or more);
– there may also be additional nodes dedicated to system monitoring and measurement.
Nodes can be linked to each other by several networks:
– the computing network, for exchanges between processes; and
– the administration and control network (loading of system images onto nodes, monitoring, load measurement, etc.).
To ensure sufficient bandwidth during computing phases, computing-network switches generally have a large number of ports. Each machine, in theory, has the same bandwidth for communicating with any other machine linked to the same equipment; this is called full bisection bandwidth. The computing network is characterized by a very broad bandwidth and, above all, a very low latency. It is a high-performance network, often based on a specific communication topology and technology (see Chapter 2). Computing-network speeds can reach 10 Gbit/s between machines, and latency can be as low as a few microseconds. The control network is a classic Ethernet local area network running at 100 Mbit/s or 1 Gbit/s. The parallel programs executed on clusters often use the Message Passing Interface (MPI) communication library, which enables messages to be exchanged between the processes distributed over the nodes. Computing clusters are used for high-performance computing in digital imagery, especially for computer-generated images produced in render farms.
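To make the message-passing model concrete, the following minimal MPI program in C (an illustrative sketch, not an example from this book) has every process contribute a partial value that is then combined across the computing network:

```c
/* Illustrative MPI sketch: each process (rank) contributes a partial
 * value; MPI_Reduce combines them on rank 0 over the computing network. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* this process's identifier */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total number of processes */

    long local = (long)rank + 1;  /* stand-in for a real partial result */
    long total = 0;

    /* Partial results from all processes are summed on rank 0. */
    MPI_Reduce(&local, &total, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("sum over %d processes: %ld\n", size, total);

    MPI_Finalize();
    return 0;
}
```

Launched with, for example, `mpirun -np 64 ./sum`, each of the 64 processes would typically run on a separate core or node; the performance of the MPI_Reduce step depends directly on the latency and bisection bandwidth of the computing network described above.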
Should a server fail, the cluster's administration software is capable of transferring the tasks being executed on the faulty server to the other servers in the cluster. This technique is used in information-system management to increase the availability of systems. Disk farms shared and linked by a storage area network are an example of this technology.
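The failover logic itself can be pictured as a simple heartbeat loop. The sketch below is a deliberate simplification under stated assumptions: node_alive() and migrate_tasks() are hypothetical stand-ins for the probing (over the administration network) and job-migration machinery of real cluster administration software.

```c
/* Simplified failover sketch: poll each node; if one stops responding,
 * move its tasks to a surviving node. node_alive() and migrate_tasks()
 * are hypothetical placeholders for real heartbeat and migration code. */
#include <stdbool.h>
#include <stdio.h>

#define NODES 4

static bool node_alive(int node)
{
    return node != 2;  /* pretend node 2 has failed, for illustration */
}

static void migrate_tasks(int from, int to)
{
    printf("migrating tasks of node %d to node %d\n", from, to);
}

int main(void)
{
    for (int n = 0; n < NODES; n++) {
        if (node_alive(n))
            continue;
        /* pick the first surviving node as the failover target */
        for (int t = 0; t < NODES; t++) {
            if (t != n && node_alive(t)) {
                migrate_tasks(n, t);
                break;
            }
        }
    }
    return 0;
}
```

In production systems this loop runs continuously, failures are detected by several missed heartbeats rather than a single probe, and migrated jobs are typically restarted from checkpoints.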

1.3. Computing grids

The term ā€œgridā€ was introduced at the end of the 1990s by Ian Foster and Carl Kesselman [FOS 04] and goes back to the idea of aggregating and sharing the distributed computing power inherent in the concept of metacomputing which has been studied since the 1980s. The principal specificity of grids is to enable the simple and transparent use of computing resources as well data spread out across the world without worrying about their location.
Computing grids are distributed systems that combine heterogeneous, high-performance resources connected by a wide-area network (WAN). The vision underlying the grid concept is to offer access to a quasi-unlimited capacity of information-processing facilities – computing power – in a way that is as simple and ubiquitous as access to electric power. A simple connection would thus give access to a global, virtual computer. According to this vision, computing power would be delivered by many computing resources, such as computing servers and data servers, available to all through a universal network.
In a more formal and realistic way, grid computing is an evolution of distributed computing based on dynamic resource sharing between participants, organizations and businesses. It aims to pool resources in order to execute compute-intensive applications or to process large volumes of data.
Indeed, while the need for computing power is becoming ever greater, it is also becoming ever more sporadic: computing power is only needed during certain hours of the day, certain periods of the year, or in response to certain exceptional events. Since no single organization or business can justify acquiring oversized computing equipment for occasional use, they choose to pool their computing resources with those of other organizations. Pooling on an international scale offers the added advantage of exploiting time differences: the resources of others can be re-used during one's daytime, when it is nighttime where those resources are located. The grid therefore appeared as a new approach promising to provide a large number of scientific domains, and more recently industrial communities, with the computing power they need.
Time-sharing of resources offers an economical and flexible solution for accessing the required power. From the user's point of view, the origin of the resources used is, in theory, totally abstract and transparent. The user, in the end, should not worry about any...
