
About this book
The new edition of a bestseller, now revised and updated throughout!
This new edition of the unparalleled bestseller serves as a full training course all in one. As the world's largest data storage company, EMC is the ideal author for such a critical resource. The book covers the components of a storage system and the different storage system models, and it offers essential new material on advances in existing technologies, the emergence of the "Cloud," and other new technologies.
- Features a separate section on the emerging area of cloud computing
- Covers new technologies such as: data de-duplication, unified storage, continuous data protection technology, virtual provisioning, FCoE, flash drives, storage tiering, big data, and more
- Details storage models such as Network Attached Storage (NAS), Storage Area Network (SAN), and Object-Based Storage, along with virtualization of various infrastructure components
- Explores Business Continuity and Security in physical and virtualized environments
- Includes an enhanced Appendix for additional information
This authoritative guide is essential for getting up to speed on the newest advances in information storage and management.
Information Storage and Management is available in PDF and/or ePUB format, alongside other books in Computer Science & Computer Networking.
Section III
Backup, Archive, and Replication
In This Section
Chapter 9: Introduction to Business Continuity
Chapter 10: Backup and Archive
Chapter 11: Local Replication
Chapter 12: Remote Replication
Chapter 9
Introduction to Business Continuity
Key Concepts
Business Continuity
Information Availability
Disaster Recovery
BC Planning
Business Impact Analysis
Multipathing Software
In today's world, continuous access to information is a must for the smooth functioning of business operations. The cost of unavailability of information is greater than ever, and outages in key industries cost millions of dollars per hour. There are many threats to information availability, such as natural disasters, unplanned occurrences, and planned occurrences, that could result in the inaccessibility of information. Therefore it is critical for businesses to define an appropriate strategy that can help them overcome these crises. Business continuity is an important process to define and implement these strategies.
Business continuity (BC) is an integrated and enterprise-wide process that includes all activities (internal and external to IT) that a business must perform to mitigate the impact of planned and unplanned downtime. BC entails preparing for, responding to, and recovering from a system outage that adversely affects business operations. It involves proactive measures, such as business impact analysis, risk assessments, BC technology solutions deployment (backup and replication), and reactive measures, such as disaster recovery and restart, to be invoked in the event of a failure. The goal of a BC solution is to ensure the “information availability” required to conduct vital business operations.
In a virtualized environment, BC technology solutions need to protect both physical and virtualized resources. Virtualization considerably simplifies the implementation of BC strategy and solutions.
This chapter describes the factors that affect information availability and the consequences of information unavailability. It also explains the key parameters that govern any BC strategy and the roadmap to develop an effective BC plan.
9.1 Information Availability
Information availability (IA) refers to the ability of an IT infrastructure to function according to business expectations during its specified time of operation. IA ensures that people (employees, customers, suppliers, and partners) can access information whenever they need it. IA can be defined in terms of accessibility, reliability, and timeliness of information.
- Accessibility: Information should be accessible at the right place, to the right user.
- Reliability: Information should be reliable and correct in all aspects. It is “the same” as what was stored, and there is no alteration or corruption to the information.
- Timeliness: Defines the exact moment or the time window (a particular time of the day, week, month, and year as specified) during which information must be accessible. For example, if online access to an application is required between 8:00 a.m. and 10:00 p.m. each day, any disruptions to data availability outside of this time slot are not considered to affect timeliness.
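The timeliness example above can be sketched in code. This is a minimal illustration, not a method from the book: the function name `downtime_in_window` and the default 8:00 a.m. to 10:00 p.m. window are assumptions taken from the example, and the sketch only counts the part of an outage that falls inside the daily service window.

```python
from datetime import datetime, time, timedelta


def downtime_in_window(outage_start, outage_end,
                       window_open=time(8, 0), window_close=time(22, 0)):
    """Return the portion of an outage that overlaps the daily service window.

    Hypothetical helper: the window defaults mirror the 8:00 a.m. to
    10:00 p.m. example in the text.
    """
    total = timedelta(0)
    day = outage_start.date()
    # Walk through every calendar day the outage touches.
    while day <= outage_end.date():
        open_dt = datetime.combine(day, window_open)
        close_dt = datetime.combine(day, window_close)
        overlap_start = max(outage_start, open_dt)
        overlap_end = min(outage_end, close_dt)
        if overlap_end > overlap_start:
            total += overlap_end - overlap_start
        day += timedelta(days=1)
    return total


# An overnight outage entirely outside the window does not affect timeliness.
overnight = downtime_in_window(datetime(2024, 1, 1, 23, 0),
                               datetime(2024, 1, 2, 7, 0))

# An outage from 9:00 p.m. to 9:00 a.m. overlaps the window at both ends.
spanning = downtime_in_window(datetime(2024, 1, 1, 21, 0),
                              datetime(2024, 1, 2, 9, 0))
```

Under these assumptions, only the overlapping time would count against the availability of the service.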
9.1.1 Causes of Information Unavailability
Various planned and unplanned incidents result in information unavailability. Planned outages include installation/integration/maintenance of new hardware, software upgrades or patches, taking backups, application and data restores, facility operations (renovation and construction), and refresh/migration of the testing to the production environment. Unplanned outages include failure caused by human errors, database corruption, and failure of physical and virtual components.
Another type of incident that may cause data unavailability is natural or man-made disasters, such as flood, fire, earthquake, and contamination. As illustrated in Figure 9.1, the majority of outages are planned. Planned outages are expected and scheduled but still cause data to be unavailable. Statistically, unforeseen disasters account for less than 1 percent of information unavailability.
Figure 9.1 Disruptors of information availability

9.1.2 Consequences of Downtime
Information unavailability or downtime results in loss of productivity, loss of revenue, poor financial performance, and damage to reputation. Loss of productivity includes reduced output per unit of labor, equipment, and capital. Loss of revenue includes direct loss, compensatory payments, future revenue loss, billing loss, and investment loss. Poor financial performance affects revenue recognition, cash flow, discounts, payment guarantees, credit rating, and stock price. Damage to reputation may result in a loss of confidence or credibility with customers, suppliers, financial markets, banks, and business partners. Other possible consequences of downtime include the cost of additional equipment rental, overtime, and extra shipping.
The business impact of downtime is the sum of all losses sustained as a result of a given disruption. An important metric, average cost of downtime per hour, provides a key estimate in determining the appropriate BC solutions. It is calculated as follows:
Average cost of downtime per hour = average productivity loss per hour + average revenue loss per hour
Where:
Average productivity loss per hour = (total salaries and benefits of all employees per week)/(average number of working hours per week)
Average revenue loss per hour = (total revenue of an organization per week)/(average number of hours per week that an organization is open for business)
The average downtime cost per hour may also include estimates of projected revenue loss due to other consequences, such as damaged reputations, and the additional cost of repairing the system.
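As a quick illustration of the formula above, the following sketch computes the average cost of downtime per hour. The function name, parameter names, and all input figures are hypothetical and chosen only to make the arithmetic easy to follow. It leaves out the optional additions, such as reputation damage or repair cost.

```python
def average_cost_of_downtime_per_hour(weekly_salaries_and_benefits,
                                      working_hours_per_week,
                                      weekly_revenue,
                                      business_hours_per_week):
    """Sum of average productivity loss and average revenue loss per hour."""
    productivity_loss_per_hour = (
        weekly_salaries_and_benefits / working_hours_per_week
    )
    revenue_loss_per_hour = weekly_revenue / business_hours_per_week
    return productivity_loss_per_hour + revenue_loss_per_hour


# Hypothetical organization: 200,000 in weekly salaries and benefits over a
# 40-hour working week, and 1,000,000 in weekly revenue over 50 open hours.
cost = average_cost_of_downtime_per_hour(200_000, 40, 1_000_000, 50)
print(cost)  # 5,000 productivity loss + 20,000 revenue loss = 25000.0
```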
9.1.3 Measuring Information Availability
IA relies on the availability of both physical and virtual components of a data center. Failure of these components might disrupt IA. A failure is the termination of a component's capability to perform a required function. The component's capability can be restored by performing an external corrective action, such as a manual reboot, repair, or replacement of the failed component(s). Repair involves restoring a component to a condition that enables it to perform a required function. Proactive risk analysis, performed as part of the BC planning process, considers the component failure rate and average repair time, which are measured by mean time between failure (MTBF) and mean time to repair (MTTR):
- Mean Time Between Failure (MTBF): It is the average time available for a system or component to perform its normal operations between failures. It is the measure of system or component reliability and is usually expressed in hours.
- Mean Time To Repair (MTTR): It is the average time required to repair a failed component. While calculating MTTR, it is assumed that the fault responsible for the failure is correctly identified and the required spares and personnel are available. A fault is a physical defect at the component level, which may result in information unavailability. MTTR includes the total time required to do the following activities: detect the fault, mobilize the maintenance team, diagnose the fault, obtain the spare parts, repair, test, and restore the data.
Figure 9.2 illustrates the various information availability metrics that represent system uptime and downtime.
Figure 9.2 Information availability metrics

IA is the time period during which a system is in a condition to perform its intended function upon demand. It can be expressed in terms of system uptime and downtime and measured as the amount or percentage of system uptime:
IA = system uptime/(system uptime + system downtime)
Where system uptime is the period of time during which the system is in an accessible state; when it is not accessible, it is termed system downtime. In terms of MTBF and MTTR, IA could also be expressed as
IA = MTBF/(MTBF + MTTR)
Uptime per year is based on the exact timeliness requirements of the service. This calculation leads to the number of “9s” representation for availability metrics. Table 9.1 lists the approximate amount of downtime allowed for a service to achieve certain levels of 9s availability.
Table 9.1 Availability Percentage and Allowable Downtime

For example, a service that is said to be “five 9s available” is available for 99.999 percent of the scheduled time in a year (24 × 365).
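The availability arithmetic in this section can be checked with a short sketch. It assumes the standard relation IA = MTBF/(MTBF + MTTR) and a service scheduled around the clock (24 × 365 hours per year). The function names and the MTBF/MTTR figures are hypothetical.

```python
HOURS_PER_YEAR = 24 * 365  # scheduled time for an always-on service


def availability_from_mtbf(mtbf_hours, mttr_hours):
    """Availability as a fraction, assuming IA = MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def allowed_downtime_hours(availability_percent,
                           scheduled_hours=HOURS_PER_YEAR):
    """Downtime per year permitted at a given availability percentage."""
    return scheduled_hours * (1 - availability_percent / 100)


# Hypothetical component: fails on average every 999 hours of operation
# and takes 1 hour to repair, giving an availability of about 0.999.
ia = availability_from_mtbf(999, 1)

# "Five 9s" over a full year allows roughly 5.26 minutes of downtime.
five_nines_minutes = allowed_downtime_hours(99.999) * 60
```

Passing a smaller `scheduled_hours` value models a service whose timeliness requirement covers only part of the day or week.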
9.2 BC Terminology
This section introduces and defines common terms related to BC operations, which are used in the next few chapters to explain advanced concepts:
- Disaster recovery: This is the coordinated process of restoring systems, data, and the infrastructure required to support ongoing business operations after a disaster occurs. It is the process of restoring a previous copy of the data and applying logs or other necessary processes to that copy to bring it to a known point of consistency. After all recovery efforts are completed, the data is validated to ensure that it is correct.
- Disaster restart: This is the process of restarting business operations with mirrored consistent copies of data and applications.
- Recovery-Point Objective (RPO): This is the point in time to which systems and data must be recovered after an outage. It defines the amount of data loss that a business can endure. A large RPO signifies high tolerance to information loss in a business. Based on the RPO, organizations plan for the frequency with which a bac...
Table of contents
- Cover
- Section I: Storage System
- Section II: Storage Networking Technologies
- Section III: Backup, Archive, and Replication
- Section V: Securing and Managing Storage Infrastructure
- Appendix A: Application I/O Characteristics
- Appendix B: Parallel SCSI
- Appendix C: SAN Design Exercises
- Appendix D: Information Availability Exercises
- Appendix E: Network Technologies for Remote Replication
- Appendix F: Acronyms and Abbreviations
- Glossary
- Foreword
- Introduction