ITIC 2020 Global Server Reliability Report

Transcription

INFORMATION TECHNOLOGY INTELLIGENCE CONSULTING

ITIC 2020 Global Server Hardware, Server OS Reliability Report

February/March 2020

Copyright 2020, Information Technology Intelligence Consulting Corp. (ITIC) All rights reserved. Other products and companies referred to herein are trademarks or registered trademarks of their respective companies or mark holders.

Table of Contents

Executive Summary
Introduction
Reliability and Uptime by the Numbers
Reliability versus Availability
Reliability Dollars and Sense: The Actual Cost of Downtime
Analysis
IBM and Lenovo Reliability Success: Innovation, High Performance, Security and Top Technical Support Deliver High Reliability
Other Notable Survey Findings
Server Hardware Platform Overview
IBM Power Systems and IBM Z
Lenovo ThinkSystem
HPE Integrity
Huawei KunLun and Fusion Servers
2020 Reliability Trends: Top Notch Security is Crucial
Hourly Cost of Downtime Continues to Rise
Minimum Reliability Requirements Rise
Conclusions
Recommendations
Survey Methodology
Survey Demographics
Appendices

Executive Summary

IBM, Lenovo maintain top server Reliability ranking for 12th straight year
Cisco, Hewlett-Packard Enterprise (HPE) and Huawei close gap, challenge leaders with strong reliability showings
IBM Z, Power Systems, Lenovo x86 and Huawei KunLun hardware deliver highest Availability; outages of shortest durations; service interruptions
IBM Power Systems, Lenovo ThinkSystem, HPE Integrity and Huawei KunLun register as much as 26x better reliability than least efficient rival “White box” platforms and up to 34x better economies of scale
Security/Data Breaches; Human Error; Software bugs are top external threats

High reliability, uptime and availability are imperative in today’s “always on” Digital networks. For the 12th straight year, IBM’s Z mainframe and Power Systems achieved the highest server reliability rankings in ITIC’s 2020 Global Server Hardware and Server OS Reliability survey. They were joined by Lenovo’s ThinkSystem servers, which have delivered the best uptime among all Intel x86 servers for the last seven consecutive years.

ITIC’s latest survey data finds that the most reliable mainstream server platforms (the IBM Power Systems, Lenovo ThinkSystem, Hewlett-Packard Enterprise (HPE) and Huawei KunLun) deliver up to 26x more uptime and availability than the least dependable unbranded “White box” servers. Additionally, the superior uptime of these top ranked mission critical platforms makes them up to 34x more economical and cost effective than the least stable White box servers.

High end mission critical servers from IBM and Lenovo both registered under two (2) minutes of per server, per annum unplanned downtime due to inherent flaws in the underlying hardware or component parts. Cisco, Hewlett-Packard Enterprise (HPE) and Huawei server platforms were close behind: each recorded approximately two minutes or a few seconds more downtime attributable to inherent issues with the hardware.
Among mainstream servers, IBM POWER8 and POWER9, along with the Lenovo x86 ThinkSystem servers, the HPE Integrity Superdome X and Huawei’s mission critical KunLun servers, continue to deliver the highest levels of reliability/uptime among 18 server platforms (see Exhibit 1).
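The per server, per annum downtime figures above can also be restated as an availability percentage. The following is a minimal Python sketch of that conversion, assuming a 365-day year (525,600 minutes); the function name availability_percent is illustrative and does not come from the report.

```python
# Minutes in a year, assuming a 365-day (non-leap) year.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600


def availability_percent(downtime_minutes_per_year):
    """Convert annual unplanned downtime (in minutes) to availability in percent."""
    if not 0 <= downtime_minutes_per_year <= MINUTES_PER_YEAR:
        raise ValueError("downtime must be between 0 and one year of minutes")
    return 100.0 * (1.0 - downtime_minutes_per_year / MINUTES_PER_YEAR)


if __name__ == "__main__":
    # Two minutes per year is the approximate figure cited for the top platforms.
    print(f"{availability_percent(2):.4f}%")
```

Under these assumptions, two minutes of annual downtime corresponds to roughly 99.9996% availability.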

The least consistent hardware, unbranded White box servers, averaged 53 minutes of unplanned per server, per annum downtime due to problems or failures with the server or its components (e.g. hard drive, memory, cooling systems etc.). This represents an increase of four (4) minutes of downtime compared with ITIC’s 2019 Global Server Hardware, Server OS Mid Year Update survey.

ITIC’s independent Web-based survey polled over 1,200 businesses worldwide from November 2019 through February 2020. The study compares and analyzes the reliability and availability of over one dozen mainstream server platforms and one dozen operating system (OS) distributions. To obtain the most accurate and unbiased results, ITIC accepts no vendor sponsorship.

IBM’s System Z server is in a class of its own. It maintained its best in class rating among all server platforms. An 83% majority of IBM respondent organizations said their firms achieved five and six nines (99.999% and 99.9999%) or greater uptime. Nine-in-10 IBM Z customers reported that the mainframe recorded just 0.62 seconds of unplanned per server downtime each month and 7.44 seconds annually due to inherent flaws in the server hardware or its component parts.
And less than one-half of one percent of IBM Z respondents said the mainframe experienced unplanned outages exceeding four (4) hours of annual downtime.

The difference in annual downtime costs between the top performing and the least reliable server hardware platforms is staggering.

A single hour of downtime estimated at $300,000 equates to $4,998 per server, per minute. According to that metric, organizations using the most reliable IBM POWER8 and POWER9, Lenovo x86-based ThinkSystem, HPE Integrity or Huawei KunLun servers that experienced just under or just over two (2) minutes of downtime would spend $9,996 in annual per server downtime costs due to inherent flaws in server hardware or component parts (See Table 2).

By contrast, corporations using Dell PowerEdge servers, which experienced 26 minutes of per server, per annum downtime, would rack up yearly outage costs of $130,026 for a single server at the same $300,000 hourly downtime rate.

Corporations deploying the least reliable unbranded White box servers, which registered 53 minutes of per server, per annum downtime in the latest ITIC 2020 Global Reliability survey, can expect to incur downtime losses of $264,894 specifically related to server hardware flaws and bugs in the OS and applications. The four additional minutes of downtime, from 49 minutes per server in 2019 to 53 minutes in 2020, represent a cost increase of $19,992 compared with the 2019 White box per server, per annum downtime price tag of $244,902. Time is money.

The higher monetary costs associated with unbranded White box servers are not surprising. The unbranded White box servers frequently incorporate inexpensive components. And some businesses recklessly run unsupported or pirated versions of operating systems and applications. The aforementioned hourly downtime examples are for just one server. Downtime costs can mount quickly and reach into the millions for corporations with dozens or hundreds of highly unreliable servers.

Among the other top survey findings:

Reliability: IBM Power Systems and Lenovo ThinkSystem hardware and the Linux operating system distributions were once again either first or second in every reliability category, including server, virtualization and security.

Availability: IBM Z mainframe, Power Systems, Lenovo ThinkSystem, HPE Integrity and Huawei KunLun all provided the highest levels of server, applications and service availability. That is, when the servers did experience an outage due to an inherent system flaw, the outages were of the shortest duration, typically one-to-five minutes.

Technical Support: Businesses gave high marks to IBM, Lenovo, HPE, Huawei and Dell tech support. Only 1% of IBM and Lenovo customers and 2% of HPE and Huawei users gave those vendors “Poor” or “Unsatisfactory” customer support ratings.

Hard Drive Failures Most Common Technical Server Flaw: Faulty hard drives are the chief culprits in inherent server reliability/quality issues (58%), followed by Motherboard issues (43%) and processor problems (38%).

IBM, Lenovo and Huawei KunLun Servers Had Fewest Hard Drive Failures: IBM, Lenovo and Huawei’s KunLun platforms experienced the fewest hard drive quality or failure issues among all of the server distributions within the first one, two and three years of service.
Less than one percent (0.4%) of IBM Z mainframes, for example, experienced technical problems with their hard drives in the first year of usage, followed by the IBM Power Systems and Lenovo ThinkSystem with one percent (1%) each during the first 12 months of deployment.

Security is Top External Issue Negatively Impacting Reliability: Security and data breaches now have the dubious distinction of being the top cause of downtime.

Minimum Reliability Requirements Increase: An 88% majority of corporations now require a minimum of “four nines” of uptime (99.99%) for mission critical hardware, operating systems and main line of business (LOB) applications. This is an increase of five (5) percentage points from ITIC’s 2018 Reliability survey.

Patch Time Increases: Seven-in-10 businesses now devote from one hour to over four hours applying patches. This is primarily due to a spike in wide ranging security issues such as Email Phishing scams, Ransomware, CEO fraud as well as malware and viruses.

Increased Server Workloads Cause Reliability Declines: The survey data found that reliability declined in 67% of servers over four (4) years old, when corporations failed to retrofit or upgrade the hardware to accommodate increased workloads and larger, more compute intensive applications. This is up 23% from the 45% of businesses that said uptime declined due to higher workloads in the ITIC 2018 Reliability poll.

Hourly Downtime Costs Rise: A 98% majority of firms say hourly downtime costs exceed $150,000 and 88% of respondents estimate hourly downtime expenses exceed $300,000. Just over one-third of ITIC survey respondents (34%) estimate the cost of a single hour of downtime now tops one million ($1,000,000).

Server hardware and server operating system reliability, uptime and availability, and by extension those of virtualization, are the core foundational elements of the overarching health of an organization’s entire Digital Age ecosystem and the life blood of daily business operations.

Exhibit 1. Server Reliability by Hardware Platform Running Linux OS

Exhibit 1. IBM Power Systems, Lenovo ThinkSystem Most Reliable Servers
Unplanned Downtime by Server Hardware Platform, Per Minute/Per Server in 2020
(Unplanned downtime due to inherent system or component flaws, per minute/per server)

Platform                                   Minutes
White Box servers w/Linux                  53
HPE ProLiant w/Linux                       43
Oracle OpenSolaris UltraSPARC              43
Oracle x86 w/Linux                         37
Dell PowerEdge w/Linux                     26
Inspur                                     10
Fujitsu Primergy                           3.5
Cisco UCS w/Linux                          2.3
HPE Superdome w/Linux                      2
Huawei KunLun & FusionServer w/Linux       1.82
Lenovo ThinkSystem w/Linux                 1.64
IBM Power w/Linux                          1.54
IBM Z w/Linux or z/OS                      0.62

Source: ITIC 2020 Global Server Hardware Server OS Reliability Survey

Introduction

For the past 12 years, the ITIC Global Server Hardware, Server OS Reliability Report has compared the reliability of up to 18 mainstream server platforms, over one dozen server operating system distributions (Linux, UNIX, Ubuntu, Debian, Z/OS and Microsoft Windows) and one dozen server hardware virtualization layers. It also delves into the internal issues that improve or undermine core server hardware and OS reliability.

The report quantifies and qualifies the overarching reliability of mainstream server hardware, based on key metrics and corporate policies including:

Automated and manual patch management
Percentage of

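The per-server cost figures quoted in the Executive Summary follow from multiplying annual downtime minutes by a per-minute cost. The following is a minimal Python sketch of that arithmetic. It takes the report's quoted rate of $4,998 per minute as an input (dividing $300,000 by 60 gives exactly $5,000, so the quoted rate appears to be rounded down); the names REPORT_COST_PER_MINUTE and annual_downtime_cost are illustrative and do not come from the report.

```python
# Per-minute rate quoted in the report for a $300,000 hour of downtime.
# (300,000 / 60 is exactly 5,000; the report's 4,998 is used here so the
# results line up with its published figures.)
REPORT_COST_PER_MINUTE = 4_998


def annual_downtime_cost(minutes_per_year, cost_per_minute=REPORT_COST_PER_MINUTE):
    """Annual per-server downtime cost: downtime minutes times cost per minute."""
    if minutes_per_year < 0 or cost_per_minute < 0:
        raise ValueError("inputs must be non-negative")
    return minutes_per_year * cost_per_minute


if __name__ == "__main__":
    scenarios = [
        ("Top-tier platforms", 2),
        ("White box, 2019", 49),
        ("White box, 2020", 53),
    ]
    for label, minutes in scenarios:
        print(f"{label}: {minutes} min/yr -> ${annual_downtime_cost(minutes):,}")
```

At this rate, 2 minutes gives $9,996, 49 minutes gives $244,902 and 53 minutes gives $264,894, so the four extra White box minutes add $19,992, matching the report. The same rate applied to 26 minutes gives $129,948, slightly below the $130,026 the report quotes for Dell PowerEdge.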