Large Scale Network-Centric Distributed Systems

About this book

A highly accessible reference offering a broad range of topics and insights on large scale network-centric distributed systems

Having evolved from the fields of high-performance computing and networking, large scale network-centric distributed systems remain one of the most important topics in computing, communication, and many interdisciplinary areas. Covering both wired and wireless networks, this book focuses on the design and performance issues of such systems.

Large Scale Network-Centric Distributed Systems provides in-depth coverage ranging from ground-level hardware issues (such as buffer organization, router delay, and flow control) to the high-level issues immediately concerning application or system users (including parallel programming, middleware, and OS support for such computing systems). Arranged in five parts, it explains and analyzes complex topics to an unprecedented degree:

  • Part 1: Multicore and Many-Core (MC) Systems-on-Chip
  • Part 2: Pervasive/Ubiquitous Computing and Peer-to-Peer Systems
  • Part 3: Wireless/Mobile Networks
  • Part 4: Grid and Cloud Computing
  • Part 5: Other Topics Related to Network-Centric Computing and Its Applications

Large Scale Network-Centric Distributed Systems is an incredibly useful resource for practitioners, postgraduate students, postdocs, and researchers.

Large Scale Network-Centric Distributed Systems by Hamid Sarbazi-Azad and Albert Y. Zomaya is available in PDF and ePUB format, and is catalogued under Computer Science & Computer Engineering.
Part 1
Multicore and Many-Core (MC) Systems-On-Chip
1
A Reconfigurable On-Chip Interconnection Network for Large Multicore Systems
Mehdi Modarressi and Hamid Sarbazi-Azad

Contents

1.1 Introduction
1.1.1 Multicore and Many-Core Era
1.1.2 On-Chip Communication
1.1.3 Conventional Communication Mechanisms
1.1.4 Network-on-Chip
1.1.5 NoC Topology Customization
1.1.6 NoCs and Topology Reconfigurations
1.1.7 Reconfiguration Policy
1.2 Topology and Reconfiguration
1.3 The Proposed NoC Architecture
1.3.1 Baseline Reconfigurable NoC
1.3.2 Generalized Reconfigurable NoC
1.4 Energy and Performance-Aware Mapping
1.4.1 The Design Procedure for the Baseline Reconfigurable NoC
1.4.1.1 Core-to-Network Mapping
1.4.1.2 Topology and Route Generation
1.4.2 Mapping and Topology Generation for Cluster-Based NoC
1.5 Experimental Results
1.5.1 Baseline Reconfigurable NoC
1.5.2 Performance Evaluation with Cost Constraints
1.5.3 Comparison with Cluster-Based NoC
1.6 Conclusion
References

1.1 Introduction

1.1.1 Multicore and Many-Core Era

With the continuing scaling of semiconductor technology, coupled with the ever-increasing demand for high-performance computing in embedded, desktop, and server systems, general-purpose microprocessors have moved from single-core to multicore and, eventually, to many-core architectures containing tens to hundreds of identical cores [1]. Major manufacturers already ship 10-core [2], 16-core [3, 4], and 48-core [5] chip multiprocessors, while some special-purpose processors have pushed the limit further to 188 [6], 200 [7], and 336 [8] cores.
Following the same trend, current multicore systems-on-chip (SoCs) have grown in size and complexity and now consist of tens to hundreds of logic blocks of different types communicating with each other at very high speeds.

1.1.2 On-Chip Communication

As the core count scales up, the rate and complexity of intercore communication increase dramatically. Consequently, the efficiency of on-chip communication mechanisms has emerged as a critical determinant of overall performance in complex multicore systems-on-chip (SoCs) and chip multiprocessors (CMPs). Beyond performance, the on-chip interconnect of a conventional SoC or CMP accounts for a considerable fraction of the consumed power, and this fraction is expected to grow with every new technology node. The advent of deep submicron and nanometer technologies and supply voltage scaling also brings about several signal integrity and reliability issues [9]. As a result, interconnect design poses a whole new set of challenges for SoC and CMP designers.

1.1.3 Conventional Communication Mechanisms

Conventional small-scale SoCs and CMPs use the legacy bus and ad hoc dedicated links to manage on-chip traffic. With dedicated point-to-point links, intercore data travel on dedicated wires directly connecting the two end-point cores. Such links can potentially yield ideal performance and power results when connecting a few cores. However, as the number of on-chip components increases, this scheme requires a huge amount of wiring to connect every component directly, with average wire utilization below 10% [10]. Consequently, poor scalability due to considerable area overhead is a prohibitive drawback of dedicated links. In addition, dedicated wires in submicron and nanometer technologies need special attention to manage hard-to-predict power, signal integrity, and performance issues. Furthermore, due to their ad hoc nature, dedicated links are not reusable. Together, these issues make design effort the second major drawback of dedicated wires.
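The scalability gap described above can be made concrete with a quick count of link segments. The sketch below (function names are illustrative, not from the text) compares full point-to-point connectivity against a square 2D-mesh NoC of the same size:

```python
# Back-of-the-envelope comparison of wiring growth: dedicated point-to-point
# links versus a 2D-mesh NoC. Illustrative only; real wire cost also depends
# on link width and length.

def dedicated_links(n: int) -> int:
    """Every pair of n cores gets its own link: n*(n-1)/2 links."""
    return n * (n - 1) // 2

def mesh_links(rows: int, cols: int) -> int:
    """A rows x cols 2D mesh uses one link per adjacent router pair."""
    return rows * (cols - 1) + cols * (rows - 1)

for n in (4, 16, 64):
    side = int(n ** 0.5)
    print(f"{n} cores: {dedicated_links(n)} dedicated links "
          f"vs {mesh_links(side, side)} mesh links")
# 16 cores: 120 dedicated links vs 24 mesh links
# 64 cores: 2016 dedicated links vs 112 mesh links
```

The quadratic growth of dedicated wiring against the linear growth of mesh links is exactly the area-overhead argument made above.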
Bus architectures are the most common and cost-effective on-chip communication solution for traditional multicore SoCs and CMPs with a modest number of processors. However, bus-based communication schemes, even those using hierarchies of buses, can support only a few concurrent communications. Connecting more components to a shared bus also leads to long bus wires, which in turn result in considerable energy overhead and unmanageable clock skew. Therefore, when the number of communicating devices is high, bus-based systems show poor power and performance scalability [9]. These scalability problems continue to grow as technology advances allow more cores to be integrated on a single chip. The scalability and bandwidth challenges of the bus have already forced a shift in the board-level interchip communication paradigm, where the widely used PCI bus has been replaced by the switch-based PCI Express network-on-board.
On-chip communication has traveled the same path over the past decades: the problems of buses and dedicated links, together with the proven efficiency of packet-based interconnection networks in parallel machines, motivated researchers to propose switch-based networks-on-chip (NoCs) that connect the cores in a high-performance, flexible, scalable, and reusable manner [10–12].

1.1.4 Network-on-Chip

Networks on chip have now expanded from an interesting area of research to a viable industrial solution for multicore processors ranging from high-end server processors [5] to embedded SoCs [13]. The building blocks of on-chip networks are the routers at every node that are interconnected by short local on-chip wires. Routers multiplex multiple communication flows (in the form of data packets) over the links and manage the traffic in a distributed fashion. Relying on a modular and scalable infrastructure, NoCs can potentially deliver high-bandwidth, low-latency, and low-power communication. From the communication perspective, this allows integration of many components on a single chip.
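The distributed, per-node routing that the paragraph above describes can be sketched for the common 2D-mesh case. The snippet below is a minimal illustration (not from the chapter) of dimension-ordered XY routing, a standard NoC routing scheme in which a packet first corrects its x coordinate, then its y coordinate:

```python
# Minimal sketch of dimension-ordered (XY) routing on a 2D-mesh NoC.
# Each router forwards the packet one hop toward the destination,
# fully determined by local (current, destination) coordinates.

def xy_route(src, dst):
    """Return the list of (x, y) router coordinates a packet visits."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:                  # travel along x first
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:                  # then along y
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path

hops = len(xy_route((0, 0), (3, 2))) - 1   # Manhattan distance: 5 hops
```

Because every router makes the same local decision, no global coordination is needed, which is what lets an NoC manage traffic "in a distributed fashion".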
The benefits of NoCs in providing scalable and high-bandwidth communication are substantial. However, the need for complex, multistage pipelined routers presents several challenges in reaching the potential latency and throughput of NoCs, given their tight area and power budgets. The authors in [1] show that the bandwidth demands of future server and embedded applications are expected to grow greatly, and project that in future CMPs and multicore SoCs the power consumption of NoCs implemented with current methodologies will be about 10 times greater than the power budget that can be devoted to them. Therefore, much research has focused on improving NoC efficiency to bridge the gap between current and ideal NoC power/performance metrics.
Application-specific optimization is one of the most effective methods to increase the efficiency of the NoC [1]. This class of optimization methods tries to customize the architecture and characteristics of an NoC for a target application. These methods can work at either design time, if the application and its traffic characteristics are known in advance (which is the case for most embedded applications running on multicore SoCs), or at run time for the NoCs used in general-purpose CMPs.
There has been substantial research on application-specific optimization of NoCs, varying from simple methods that update routing tables for each application to sophisticated methods of router microarchitecture and topology reconfiguration [14].

1.1.5 NoC Topology Customization

The performance of an NoC is extremely sensitive to its topology, which determines the placement and connectivity of the network nodes. The topology, consequently, is an important target for many NoC customization methods. An equally important problem in specialized multicore SoCs is core (or processing node) to NoC node mapping, which determines on which NoC node each processing core is physically placed. Mapping algorithms generally try to place the cores that communicate most frequently near each other; when the number of intermediate routers between two communicating cores is reduced, the power consumption and latency of the communication between them decrease proportionally.
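The mapping objective just described is typically formalized as minimizing the sum, over all flows, of traffic volume times hop count. A hedged sketch, assuming a 2D mesh with Manhattan-distance hop counts (the core names and traffic table are invented for illustration):

```python
# Sketch of the core-to-node mapping objective: total communication cost
# = sum over flows of (traffic volume x hop count), with hop count taken
# as Manhattan distance on a 2D mesh.

def hops(a, b):
    """Manhattan distance between two (x, y) mesh nodes."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def mapping_cost(traffic, placement):
    """traffic: {(src_core, dst_core): volume}; placement: core -> (x, y)."""
    return sum(vol * hops(placement[s], placement[d])
               for (s, d), vol in traffic.items())

traffic = {("cpu", "mem"): 100, ("cpu", "dsp"): 10}
near = {"cpu": (0, 0), "mem": (0, 1), "dsp": (1, 1)}   # heavy pair adjacent
far  = {"cpu": (0, 0), "mem": (1, 1), "dsp": (0, 1)}   # heavy pair 2 hops apart
assert mapping_cost(traffic, near) < mapping_cost(traffic, far)
```

Placing the heavily communicating pair on adjacent nodes drops the dominant term of the sum, which is precisely why mapping algorithms cluster frequent communicators.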
Topology and mapping deal with the physical placement of network nodes and links. As a result, the mapping and topology cannot be modified once the chip is fabricated and will remain unchanged during the system lifetime. Due to this physical constraint, most current design flows for application-specific multicore SoCs are only effective in providing design-time mapping and topology optimization for a single application [15–18]. In other words, they generate and synthesize an optimized topology and mapping based on the traffic pattern of a single application.
This poses a problem for today's multicore SoCs, which run several different applications (often unknown at design time). Since intercore communication characteristics can differ greatly across applications, a topology designed for the traffic pattern of one application does not necessarily meet the design constraints of another. Even the traffic generated by a single application may vary significantly across different phases of its operation. For example, the IEEE 802.11n standard (WiFi) supports 144 communication modes, each with different communication demands among cores [19]. In [20], more than 1500 different NoC configurations (topology, buffer size, and so on) are investigated, and it is shown that no single NoC configuration provides optimal performance across a range of applications.

1.1.6 NoCs and Topology Reconfigurations

In this chapter, we introduce a NoC with reconfigurable topology, which ...

Table of contents

  1. Cover
  2. Wiley Series on Parallel and Distributed Computing
  3. Title Page
  4. Copyright
  5. Dedication
  6. Preface
  7. Acknowledgments
  8. List of Figures
  9. List of Tables
  10. List of Contributors
  11. Part 1: Multicore and Many-Core (MC) Systems-On-Chip
  12. Part 2: Pervasive/Ubiquitous Computing and Peer-To-Peer Systems
  13. Part 3: Wireless/Mobile Networks
  14. Part 4: Grid and Cloud Computing
  15. Part 5: Other Topics Related to Network-Centric Computing and Its Applications
  16. Index