Enterprise servers play a mission-critical role in modern computing environments, especially from a business continuity perspective. Several models of IT capability have been introduced over the last two decades. Enhancing Business Continuity and IT Capability: System Administration and Server Operating Platforms proposes a new model of IT capability. It presents a framework that establishes the relationship between downtime on one side and business continuity and IT capability on the other side, as well as how system administration and modern server operating platforms can help in improving business continuity and IT capability.

This book begins by defining business continuity and IT capability and their importance in modern business, as well as by giving an overview of business continuity, disaster recovery planning, contingency planning, and business continuity maturity models. It then explores modern server environments and the role of system administration in ensuring higher levels of system availability, system scalability, and business continuity. Techniques for enhancing availability and business continuity also include:

Business impact analysis

Assessing the downtime impact

Designing an optimal business continuity solution

IT auditing as a process of gathering data and evidence to evaluate whether the company's information systems infrastructure is efficient and effective and whether it meets business goals

The book concludes with frameworks and guidelines on how to measure and assess IT capability and how IT capability affects a firm's performance. Cases and white papers describe real-world scenarios illustrating the concepts and techniques presented in the book.

Information

Publisher

Auerbach Publications


Chapter 1 Introduction: Downtime and Modern Business

Background

Advances in and the proliferation of modern portable computing devices, together with the wide availability of mobile and LAN/WAN/Internet connections, have created an environment in which people connect to their favorite applications and/or websites on a continuous basis. They want to be “always on” in accessing business data, getting up-to-date information, responding to business and private e-mails, posting files on social networks, and so on. However, in spite of powerful servers, the wide availability of the Internet, Wi-Fi and mobile connections, and high data transfer rates, end users still experience messages such as “server is down”, “site is temporary unavailable”, “your request can’t be completed now, try later”, “DNS failure”, “service unavailable”, “down for maintenance”, “sorry, something went wrong”, “network is unreachable”, “we’ll be back soon, thank you for your patience”, and so on. If users get these messages when they connect to Facebook, Instagram, and other social network sites just to share messages/files, such messages do not result in negative financial effects for them. However, when customers are trying to connect to an e-business/e-commerce site, these and similar messages will in most cases lead them to switch to another site/vendor/provider, with negative financial effects for the company/vendor/provider. Not only “out of use” messages but also delays in application servers’ response time may cause customers to switch immediately to competitors, resulting in the loss of customers and money.

From a technology point of view, such messages can have several hardware and software origins, such as server hardware glitches, server operating system crashes, application bugs, failures in data communication devices and lines, network disconnections, and bad IT operations. However, application servers may also go down and/or become unreachable for several hours/days due to electricity cuts, power outages, natural disasters, and pandemic diseases.

Technical issues related to IT infrastructure devices may occur within all types of information architecture models that are in use today such as the client-server on-premises model and client-cloud model. In many cases, these problems may cause the unavailability of application servers or of whole networks, which simply means the unavailability of information. If an application server goes down or a network is unreachable for some time, this situation is known as “system downtime”, and can be caused by a server hardware glitch, a server operating system crash, network component failure, and similar issues. These so-called downtime points in both on-premises client-server architecture and client-cloud architecture are considered mission critical for continuous computing and business continuity in the modern e-business world. According to the IT Disaster Recovery Preparedness Council Report (2015), one hour of downtime can cost small companies as much as $8,000, midsize companies up to $74,000, and large enterprises up to $700,000. Ponemon Institute (2016) reported that the average cost of a data center outage has steadily increased from $505,502 in 2010 to $740,357 today (or a 38% net change). An Information Technology Intelligence Consulting (ITIC) report (ITIC Report, 2016) found that 98% of organizations say a single hour of downtime costs over $100,000; 81% of respondents indicated an amount of over $300,000. And a record one-third, or 33%, of enterprises report that one hour of downtime costs their firms $1 million to over $5 million.
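The hourly figures above can be turned into a rough yearly estimate. The following is a minimal sketch, not a method taken from the book: it assumes that downtime is expressed as an availability percentage over a 8,760-hour year, and it uses the upper-bound hourly costs from the IT Disaster Recovery Preparedness Council Report (2015) cited above. The function names and the availability values in the usage lines are illustrative assumptions.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years are ignored (assumption)

# Upper-bound hourly downtime costs in USD, as cited in the text
# (IT Disaster Recovery Preparedness Council Report, 2015).
HOURLY_COST_USD = {"small": 8_000, "midsize": 74_000, "large": 700_000}


def annual_downtime_hours(availability_pct: float) -> float:
    """Hours per year a system is down for a given availability percentage."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    return HOURS_PER_YEAR * (100 - availability_pct) / 100


def annual_downtime_cost(availability_pct: float, company_size: str) -> float:
    """Rough upper-bound yearly downtime cost for a company size category."""
    return annual_downtime_hours(availability_pct) * HOURLY_COST_USD[company_size]


if __name__ == "__main__":
    # Hypothetical availability levels, chosen only for illustration.
    for pct in (99.0, 99.9, 99.99):
        hours = annual_downtime_hours(pct)
        cost = annual_downtime_cost(pct, "midsize")
        print(f"{pct}% available: {hours:.2f} h down, up to ${cost:,.0f} per year")
```

Because the cited hourly costs are upper bounds ("as much as" and "up to"), the result is best read as a ceiling for planning purposes rather than an expected loss.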

Emerson Network Power and the Ponemon Institute (2016) revealed that the average cost of data center downtime across industries was approximately $7,900 per minute. Raphael (2013) reported that a 49-minute failure of Amazon’s services on January 31, 2013, resulted in close to $5 million in missed revenue. Similar outages happened in January, February, and March 2013 to Dropbox, Facebook, Microsoft, Google Drive, and Twitter. According to Aberdeen Report (2014), the average cost of an hour of downtime is $686,250 for large companies, $215,638 for medium companies, and $8,581 for small companies. Gartner (2014) noted that “Based on industry surveys, the number we typically cite is $5,600 p/minute, which extrapolates to well over $300K p/hour”. With regard to network downtime, “the cost of improving availability remains high and downtime is less acceptable, making rightsizing network availability the key goal for enterprise network designers” (Gartner, 2014). Emerson Report (2014) found that the most frequently cited causes of unplanned outages include IT equipment failure; cybercrime; UPS system failure; water, heat, or CRAC failure; generator failure; weather incursion; and accidental/human error. International Data Corporation (IDC) (2014) noted that IT applications and services have become a critical element in how companies interact with their customers, deliver new products and services, and improve the productivity of their own workforce. The biggest cloud outages in 2014 include those of Amazon Web Services, Verizon Wireless, Dropbox, Adobe, Samsung, Microsoft Lync, and Microsoft Exchange Online (Raphael, 2013). An Avaya report (2014) revealed that 80% of companies lose revenues when the network goes down, with the average company losing $140,003 per incident. Quorum Report (2013) found that hardware failures are the most common type of disaster within small and mid-sized businesses, accounting for 55%, while in 22% of disasters the reason was human error (system and network administrators’ mistakes).
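The per-minute and per-hour figures quoted above are easier to compare once they are converted to the same unit. The sketch below simply redoes that arithmetic for the cited numbers; the helper names are illustrative assumptions, and the Amazon figure is approximate because the source says only "close to $5 million".

```python
MINUTES_PER_HOUR = 60


def per_hour(cost_per_minute: float) -> float:
    """Extrapolate a per-minute downtime cost to a per-hour cost."""
    return cost_per_minute * MINUTES_PER_HOUR


def per_minute(total_cost: float, outage_minutes: float) -> float:
    """Average cost per minute of a single outage."""
    return total_cost / outage_minutes


if __name__ == "__main__":
    # Gartner (2014): $5,600 per minute.
    print(f"Gartner: ${per_hour(5_600):,.0f} per hour")
    # Emerson Network Power and the Ponemon Institute (2016): $7,900 per minute.
    print(f"Emerson/Ponemon: ${per_hour(7_900):,.0f} per hour")
    # Raphael (2013): roughly $5 million lost in a 49-minute outage.
    print(f"Amazon outage: about ${per_minute(5_000_000, 49):,.0f} per minute")
```

The Gartner figure works out to $336,000 per hour, which matches the quoted "well over $300K p/hour".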

According to the 2017 Veeam Availability Report (2017), 82% of enterprises face a gap between what users expect and what IT can deliver. This “availability gap” is significant – unplanned downtime costs enterprises an average of $21.8 million each per year. The Uptime Institute’s report (2018) revealed that the number of respondents that experienced an IT downtime incident or severe service degradation in the past year (31%) increased over last year’s survey (about 25%). And in the past three years, almost half of 2018 survey respondents had an outage. This is a higher-than-expected number. The Veeam Data Availability Report (2017) revealed that, on average, each downtime incident lasts about 90 minutes, costs on average $150k per outage, and represents $21.8M per year in losses. Gartner reported that the average cost of an IT outage is $5,600 per minute, and because there are so many differences in how businesses operate, downtime, at the low end, can be as much as $140,000 per hour, $300,000 per hour on average, and as much as $540,000 per hour at the higher end (Opiah, 2019). Veeam (2016) reported that

company executives, including CIOs and CFOs, have zero tolerance for downtime and data loss. These companies have established high availability requirements for the applications and critical data the organization uses on a daily basis. Sadly, most companies have not found a way to match these expectations with the harsh realities of maintaining the demands of the Always-On Enterprise™. In fact, 82% of CIOs admit to not being able to meet the demand for 24.7.365 Availability of IT services.

Recent Uptime Institute research (Uptime Institute report, 2019) found that major failures are not only still common, but that the consequences are high, and possibly higher than in the past – a result of our high reliance on IT systems in all aspects of life. In 2018, there were major outages of financial systems, daylong outages of 911 emergency service call numbers, aircraft losing services from ground-based IT landing systems, and healthcare systems lost during critical hours. The Business Continuity Institute (BCI Report, 2018) reported that the uptake of business continuity arrangements has experienced an upward trend. An increasing number of organizations embed business continuity to protect their supply chains, which also has a positive impact on other areas such as insurance and top management commitment.

Application defects, hardware failures, and operating system crashes may take different forms, such as bugs in programs, badly integrated applications, and process/file corruptions. Network problems, in addition to hardware glitches on data communication devices, include problems such as those related to Domain Name System (DNS) servers, network configuration files, and network protocols. Human error may also cause data unavailability; this includes accidental or intentional removal of files, faulty operations, and hazardous activities including sabotage, strikes, and vandalism. Accidental or intentional removal of system files by a system administrator can shut down the whole server and make applications/data unreachable. Another example is the loss of key IT personnel or the departure of expert staff for various reasons, for example, bad managerial decisions on IT staffing policy.

Adeshiyan et al. (2010) stated that traditional high-availability and disaster recovery solutions require proprietary hardware, complex configurations, application specific logic, highly skilled personnel, and a rigorous and lengthy testing process. Jarvelainen (2013) proposed a framework for business continuity management in the context of business information systems. Zambon et al. (2011) stated that having a reliable information system is crucial to safeguard enterprise revenues. Martin (2011) cited the results of a study by Emerson Network Power and the Ponemon Institute that revealed that the average data center downtime event costs $505,500, with the average incident lasting 90 minutes. ITIC Report (2009) revealed that “server hardware and server operating system reliability has improved vastly since the 1980s, 1990s and even in just the last two to three years”. This report underscores that common human error poses a bigger threat to server hardware and server operating system reliability than technical glitches. Venkatraman (2013) noted that more than a third of respondents viewed human error as the most likely cause of downtime. Clancy (2013) stated that it takes an average of 30 hours to recover from failures, which can be devastating for a business of any size. Sun et al. (2014) proposed a Markov-based model for evaluating system availability and estimating the availability index. Bhatt et al. (2010) considered IT infrastructure as the “enabler of organizational responsiveness and competitive advantage”. Versteeg and Bouwman (2006) defined the main elements of a business architecture as business domains within the new paradigm of relations between business strategy and information technologies. Yoo (2011) stated that the shift to cloud computing also means that applications providers will place less emphasis on the operating system running on individual desktops and greater focus on the operating system running on the relevant servers. Duffy et al. (2010) stated that “although the operating system is an integral component of a computer-based information system, for many MIS majors the study of operating systems falls into this ‘dry’ category”. Lawler et al. (2008) explored the risks of IT application downtime and the increasing dependence on critical IT infrastructures and discussed several disaster tolerance techniques. Brende and Markov (2013) considered the most important risks inherent to cloud computing and focused on the risks that are relevant to the IT function being migrated to the cloud. Sandvig (2007) noted that four server-side technologies are needed to support e-business: a web server, server side programming technology, a database application, and a server operating system. In summary, it is more than evident that continuous computing features of modern server operating systems in terms of their availability, scalability, and reliability affect a business in such a way that more or less downtime simply means more or fewer financial losses. CIO (2013) reported that “Web-based services can crash and burn just like any other type of technology”. Marshall (2013) related a story about the cloud service provider Nirvana that “has told its customers they have two weeks to find another home for their terabytes of data because the company was closing its doors and shutting down its services”. Clancy (2013) stated that “Hardware failure is the biggest culprit, representing about 55 percent of all downtime events at SMBs, while human error accounts for about 22 percent of them”. According to Information Today Report (2012), network outages (50%) were the leading cause of unplanned downtime within the last year. Human error (45%), server failures (45%) and storage failures (42%) followed closely behind. An example of human error is an accidental or intentional operation of removing fi...

Cover
Half Title
Title Page
Copyright Page
Contents
Foreword
List of Acronyms and Abbreviations
Preface
Chapter 1 Introduction: Downtime and Modern Business
Chapter 2 Economics of Downtime
Chapter 3 Business Continuity and IT Capability
Chapter 4 Server Operating Environments for Business Continuity
Chapter 5 System Administration and Business Continuity
Chapter 6 Enhancing Availability and Business Continuity: Methods, Techniques, and Technologies
Chapter 7 IT Auditing – System Administration – Business Continuity – IT Capability
Chapter 8 IT Capability and Organizational Business Performance
Chapter 9 Development of IT Capability
Chapter 10 Real-World Stories
Index

Enhancing Business Continuity and IT Capability by Lejla Turulja, Nijaz Bajgorić, Semir Ibrahimović, and Amra Alagić is available in PDF and/or ePUB format.
