How Nutanix Works on FUJITSU PRIMERGY
The Definitive Guide to Hyper-converged Infrastructure

Table of Contents

IT at a Crossroads
Time for a Different Approach?
What is Hyper-converged Infrastructure?
The Nutanix Solution
Nutanix Community Edition and Community Edition On-Demand
Prism and Acropolis
How Nutanix Software Is Deployed
Nutanix Leads the Pack
Acropolis
Distributed Storage Fabric (DSF)
Acropolis Hypervisor (AHV)
App Mobility Fabric (AMF)
Distributed Storage Fabric (DSF)
Infrastructure Resilience
Tunable Redundancy
Replication Factor versus RAID
Data Path Redundancy
Nutanix Software Upgrades and Data Path Redundancy
Integrity Checks
Availability Domains
Performance Acceleration
Intelligent Tiering
Data Locality
Automatic Disk Balancing
VM Flash Mode
Shadow Clones
Capacity Optimization
Deduplication
Compression
Pro Tip: Compression
EC-X
Data Protection
What are RTO and RPO?
Converged Local Backups With Snapshots And Time Stream
Integrated Remote Backup and DR Using Async Replication
Self-Service File Restore
Cloud Connect
Metro Availability and Sync Replication
Security
Data-at-Rest Encryption
Two-Factor Authentication & Cluster Lockdown
Security Automation with SaltStack
Hypervisors & Application Mobility
It’s a Multi-Hypervisor World
Application Mobility Fabric
Hypervisor scalability limits
AHV Live Migration
AHV Data Protection
VM High Availability (VM-HA)
AHV Networking
High Availability Out of the Box
Data Center Management with Nutanix Prism
Prism is Highly Available by Design
The Prism Approach
Prism Keyboard Shortcuts
Performing Software Upgrades
Pro Tip: Prism Central
Whom Would You Rather Call?
About Fujitsu


1. IT at a Crossroads

IT is increasingly being asked to spend less time on infrastructure and allocate more time and budget to application services that add business value. Despite a continuous stream of IT hardware and software enhancements, the infrastructure challenges faced by IT teams continue to rise. The IT infrastructure and virtualization software required to meet the needs of business is complex and expensive, and data center management has become painful. Far too much time and effort are focused on just keeping the lights on.

Legacy infrastructure, with separate storage, storage networks, and servers, is not well suited to meet the growing demands of enterprise applications or the fast pace of modern business. The silos created by traditional infrastructure have become a barrier to change and progress, adding complexity to every step from ordering to deployment to management. New business initiatives require buy-in from multiple teams, and IT needs have to be predicted 3 to 5 years in advance. As most IT teams know, this is almost impossible to get right. In addition, vendor lock-in and increasing licensing costs are stretching budgets to the breaking point.

FIGURE 1: Challenges of legacy three-tier infrastructure
1. Inherent complexity
2. Inefficient silos
3. Forklift scaling
4. Painful management

Time for a Different Approach?

Enterprise IT teams today are looking for ways to deliver on-premises IT services with the speed and operational efficiency of public cloud services such as Amazon Web Services (AWS), Microsoft Azure and Google Compute Engine.

Taking cues from web giants, hyper-converged infrastructure combines compute and storage resources with intelligent software to eliminate common pain points associated with legacy infrastructure.

Nutanix software and Fujitsu PRIMERGY server hardware deliver a comprehensive enterprise cloud platform that bridges the wide gap that exists between traditional infrastructure and public cloud services. The solution delivers turnkey infrastructure that integrates servers, storage and virtualization along with end-to-end systems management and operations management capabilities. This allows enterprises to deploy infrastructure in minutes, and shift their focus to applications that power the business.

WHAT IS HYPER-CONVERGED INFRASTRUCTURE?

Hyper-converged infrastructure combines compute and storage resources with intelligent software to create flexible building blocks that replace legacy infrastructure consisting of separate servers, storage networks, and storage arrays.

While hyper-convergence is not an end point in itself, it is the fundamental building block for enterprise cloud. This book gives an overview of the Nutanix Enterprise Cloud on PRIMERGY solution and walks through how different features and functionality work to provide a fast, highly scalable and efficient data center solution for enterprises of all sizes.

2. The Nutanix Enterprise Cloud on PRIMERGY Solution

Nutanix and Fujitsu converge the entire data center stack including compute, storage, storage networking, and virtualization. Complex and expensive legacy infrastructure is replaced by simple 1U and 2U PRIMERGY server-based appliances that enable enterprises to start small and scale one node at a time. Each server, also known as a node, includes Intel-powered x86 hardware with flash SSDs and HDDs. Nutanix software running on each node distributes all operating functions across the cluster for superior performance [...]

FIGURE 2: Nutanix and Fujitsu converge compute, storage, and virtualization in simple, scalable building blocks

A single Nutanix Enterprise Cloud on PRIMERGY cluster can have an unlimited number of nodes. Different hardware platforms are available to address varying workload needs for compute and storage.

Hardware platforms include “compute heavy” and “storage heavy” options. All nodes include flash to optimize storage performance, and all-flash nodes are available to deliver maximum I/O throughput with minimum latency for all enterprise applications.

NUTANIX COMMUNITY EDITION AND COMMUNITY EDITION ON-DEMAND

Community Edition is a free, 100% software solution that lets enterprises easily evaluate the latest Nutanix software technology at zero cost on existing hardware.

Prism and Acropolis

Nutanix software has two required components: Acropolis and Prism. Acropolis is a distributed data plane with enterprise storage and virtualization services, and the ability for applications to move seamlessly across hypervisors and, in the long run, cloud providers.

Prism is a distributed management plane that uses advanced data analytics and heuristics to simplify and streamline common workflows, eliminating the need for separate management solutions for servers, storage networks, storage and [...]

FIGURE 3: Key functions of Acropolis and Prism (labels: App Mobility Fabric, AHV (Acropolis Hypervisor), Distributed Storage Fabric, Operational Insights, Planning)

“My key requirements were to have something that was simple, easy to manage, and ideally a single pane of glass. I wanted a solution that was very powerful and also very versatile. For me, Nutanix ticked all of those boxes.”
PURDIP BAHRAIT Manager, Joseph Chamberlain College

3. Acropolis

Nutanix Acropolis has three major components:

DISTRIBUTED STORAGE FABRIC (DSF)
- Enterprise storage services for applications, eliminating the need for separate solutions from vendors such as NetApp, EMC, and HPE
- Includes a comprehensive set of capabilities for performance acceleration, data reduction, data protection, and much more
- Full support for VMware vSphere, Microsoft Hyper-V and Nutanix AHV Hypervisor

AHV HYPERVISOR
- Natively built virtualization solution
- Based on the proven Linux KVM hypervisor, AHV is hardened to meet the most stringent enterprise security requirements
- Integrated management through Prism

APP MOBILITY FABRIC (AMF)
- Intelligent virtual machine (VM) placement, migration, hypervisor conversion, and cross-hypervisor high availability for maximum flexibility

Distributed Storage Fabric (DSF)

The Acropolis Distributed Storage Fabric is designed to simplify storage and data management for virtual environments. By pooling flash and hard disk drive storage across a Nutanix Enterprise Cloud on PRIMERGY cluster and exporting it out to the virtualization layer as iSCSI, NFS and SMB shares, DSF eliminates the need for SAN and NAS solutions.

FIGURE 4: Acropolis Distributed Storage Fabric joins HDD and SSD resources from across a cluster into a storage pool.

How DSF Organizes Data

There are a few key concepts that are important with regard to how DSF organizes data:

Storage Pool. A group of physical storage devices including SSD and HDD devices across the entire cluster. The storage pool spans multiple Nutanix nodes and is expanded as the cluster scales.

Storage Container. A logical segment of a Storage Pool. Containers typically have a 1-to-1 mapping with a VM datastore.

vDisk. A vDisk is any file over 512KB on DSF including .vmdk files and VM hard disks. vDisks are composed of extents which are grouped and stored on disk as an extent group.

Infrastructure Resilience

The Nutanix Enterprise Cloud on PRIMERGY platform is fault resistant, with no single points of failure and no bottlenecks.

TUNABLE REDUNDANCY

With Tunable Redundancy, each Nutanix container is configured with a replication factor (RF) of two or three. RF 2 ensures that two copies of data are maintained at all times, allowing the cluster to survive the failure of a single node or drive. When RF is set to 3 (RF 3), three copies of the data are maintained in a cluster, providing resilience against two simultaneous failures.

REPLICATION FACTOR VERSUS RAID

RAID has been a popular way of protecting against drive failures while limiting the extra storage capacity required. Rebuilding a multi-TB drive can take days to complete, creating a risk of data loss should further failures occur. RAID has gone from single to double and even triple-parity to try to reduce this risk.

Nutanix Replication Factor (RF) eliminates reliance on RAID, the need for expensive spare drives that sit idle, and the performance penalty that comes with multiple parity calculations.

DATA PATH REDUNDANCY

Data Path Redundancy ensures high availability in the event a Nutanix Controller VM (CVM) becomes unavailable or needs to be brought down for upgrade. If a CVM becomes unavailable for any reason, Nutanix CVM autopathing automatically re-routes requests to a “healthy” CVM on another node. This failover is fully transparent to the hypervisor and applications.

Data Path Redundancy is possible because every node in a cluster has access to all copies of data, so I/O requests can be serviced immediately by any node in the system.
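The replication-factor trade-off described under Tunable Redundancy can be made concrete with a little arithmetic. The following is a minimal sketch, not Nutanix code: the function name `replication_summary` and the 120 TB figure are hypothetical, and the calculation deliberately ignores metadata, reserved capacity, and any data reduction.

```python
def replication_summary(raw_tb: float, rf: int) -> dict:
    """Summarize what a replication factor means for capacity and resilience.

    Assumption: usable capacity is simply raw capacity divided by the
    number of copies; real clusters also reserve space for other purposes.
    """
    if rf not in (2, 3):
        raise ValueError("the text describes RF 2 and RF 3 only")
    return {
        "copies": rf,                  # copies of each piece of data
        "failures_tolerated": rf - 1,  # simultaneous node or drive failures
        "usable_tb": raw_tb / rf,      # capacity left after keeping all copies
    }


if __name__ == "__main__":
    for rf in (2, 3):
        print(rf, replication_summary(120.0, rf))
```

The sketch shows why RF is described as tunable: moving from RF 2 to RF 3 buys tolerance of a second simultaneous failure at the cost of storing one more full copy.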

NUTANIX SOFTWARE UPGRADES AND DATA PATH REDUNDANCY

Nutanix software upgrades take advantage of reliable data path redundancy. While the local CVM is unavailable because of a software upgrade or a failure, VMs running on the node use data path redundancy to satisfy I/O through a CVM on another node, transparently to users and applications.

INTEGRITY CHECKS

Acropolis has a variety of features to proactively identify and fix issues related to data consistency and integrity, bit rot failures, and hard disk corruption.
- Detection of silent data corruption and repair of data consistency errors
- Automatic data integrity checks during every read
- Automatic isolation and recovery during drive failures

AVAILABILITY DOMAINS

Availability Domains offer greater protection from hardware failures by allowing Nutanix clusters to survive the failure of a node or block (multi-node enclosure). Availability domains are created based on the granularity at which failures are likely to occur.

With DSF, data replicas are written to other blocks in the cluster to ensure that in the case of a block failure or planned downtime, the data remains available. This is true for both RF 2 and RF 3 scenarios. An easy comparison is “node awareness”, where a replica must be written to another node so that the data is protected if a node fails. Block awareness extends this by providing data availability assurances in the case of block outages.
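The block-awareness idea described under Availability Domains can be sketched as a placement rule: no two replicas of the same data may land in the same block. This is an illustrative sketch only. The function `place_replicas`, the node and block names, and the simple "first eligible node in sorted order" policy are all assumptions, not the DSF placement algorithm.

```python
def place_replicas(nodes: dict, primary: str, rf: int) -> list:
    """Choose rf nodes for one piece of data, each in a different block.

    nodes maps a node name to the block (multi-node enclosure) it sits in.
    The primary node holds the first copy; the remaining copies go to the
    first nodes (in sorted order) whose block is not already used.
    """
    chosen = [primary]
    used_blocks = {nodes[primary]}
    for name in sorted(nodes):
        if len(chosen) == rf:
            break
        if nodes[name] not in used_blocks:
            chosen.append(name)
            used_blocks.add(nodes[name])
    if len(chosen) < rf:
        raise ValueError("not enough blocks for block-aware placement")
    return chosen


if __name__ == "__main__":
    cluster = {"n1": "A", "n2": "A", "n3": "B", "n4": "B", "n5": "C"}
    print(place_replicas(cluster, "n1", 2))
    print(place_replicas(cluster, "n1", 3))
```

Because every copy sits in a different block, losing any single block leaves at least one copy reachable, which is the guarantee the text attributes to block awareness.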

4. Performance Acceleration

DSF includes a number of capabilities that enhance performance:

INTELLIGENT TIERING

DSF continually monitors data access patterns and optimizes data placement on either the SSD or HDD tier, achieving the best performance without administrator intervention.

The SSD tier provides maximum performance for hot data and random I/O, while the HDD tier provides maximum capacity and economy for cold data and sequential I/O.

DATA LOCALITY

DSF ensures that as much of a VM’s data as possible is stored on the node where the VM is running. This negates the need for read I/O to go through the network. Keeping data local optimizes performance and minimizes network congestion.

Every VM’s data is served locally from the CVM and stored preferentially on local storage. When a VM is moved from one node to another using vMotion or Live Migration (or during an HA event), the migrated VM’s data automatically follows the VM in the background based on read patterns.

AUTOMATIC DISK BALANCING

Automatic disk balancing ensures that data is distributed uniformly across the entire cluster. Any node in a Nutanix Enterprise Cloud on PRIMERGY cluster can utilize storage resources across the cluster, without requiring time-consuming and error-prone manual rebalancing.

Automatic Disk Balancing reacts to changing workloads and allows heterogeneous nodes to be mixed in a single cluster. Once utilization reaches a set threshold, disk balancing keeps it uniform among nodes.

VM FLASH MODE

VM Flash Mode pins specific VMs or vDisks to the cluster-wide SSD tier, so that IOPS and latency-sensitive workloads can be mixed with other workloads in a single cluster, without compromising on performance. VM Flash Mode gives fine-grained control over I/O performance. For instance, all database transaction logs can be pinned in flash. Or all financial data can be pinned in flash during quarter-end reporting.

FIGURE 5: VM Flash Mode allows individual vDisks to be “pinned” in the cluster-wide SSD tier for maximum IOPS and low latency.

SHADOW CLONES

Shadow Clones significantly improve performance by caching virtual machine data across a Nutanix cluster. Unique to Nutanix, Shadow Clones benefit scenarios where there are multiple VMs reading a single source of data, such as deployment servers and repositories. VDI deployments, where many linked clones forward read requests to a central master (e.g., Citrix MCS Master VM or VMware View replica disks), are an ideal example.

With Shadow Clones, Nutanix actively monitors vDisk access trends. If there are requests originating from more than two remote Controller VMs (CVMs), as well as the local CVM, and all of the requests are read I/O, the vDisk will be marked as immutable. Once the disk has been marked immutable, the vDisk is then cached locally by each CVM so read operations are now satisfied locally by direct-attached storage resources.
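The rule for marking a vDisk immutable, as stated above, can be written as a small predicate. This is a sketch of the stated condition only. The `IoRequest` type, the CVM names, and the function `should_mark_immutable` are hypothetical, and the real monitoring window and bookkeeping are not described in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IoRequest:
    cvm: str        # name of the Controller VM that issued the request
    is_read: bool   # True for read I/O, False for write I/O


def should_mark_immutable(requests: list, local_cvm: str) -> bool:
    """Apply the Shadow Clone condition to a set of observed requests.

    True only when every request is a read, the local CVM is among the
    requesters, and more than two distinct remote CVMs are as well.
    """
    if not requests or not all(r.is_read for r in requests):
        return False
    sources = {r.cvm for r in requests}
    remote = sources - {local_cvm}
    return local_cvm in sources and len(remote) > 2


if __name__ == "__main__":
    reads = [IoRequest(c, True) for c in ("cvm-local", "cvm-2", "cvm-3", "cvm-4")]
    print(should_mark_immutable(reads, "cvm-local"))
```

A single write in the observed set keeps the vDisk mutable, which matches the requirement that all of the requests be read I/O.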


5. Capacity Optimization

DSF incorporates a wide range of storage optimization technologies that work together to make efficient use of the available capacity in a cluster.

DEDUPLICATION

Nutanix delivers two types of data deduplication to accelerate application performance and optimize storage capacity. Performance-tier deduplication removes duplicate data in the content cache (SSD and memory) to reduce the footprint of an application’s working set. In addition, global post-process MapReduce deduplication reduces repetitive data in the capacity tier to increase the effective storage capacity of a cluster. Both forms of deduplication can be easily configured and managed at vDisk granularity.

When deduplication is enabled, data is fingerprinted on ingest using a SHA-1 hash. Deduplication operations are software-driven and leverage the hardware-assist capabilities of the Intel chipset for the SHA-1 fingerprint generation. Because SHA-1 is a strong hash, deduplication is performed based on a fingerprint match.

COMPRESSION

Data can be compressed inline as it is written to the system, or post-process after the data has been written. Inline and post-process compression is intelligently determined based on sequential or random access patterns to enable optimal performance. Post-process compression is executed as a series of MapReduce jobs.
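The fingerprint-on-ingest step described under Deduplication above can be illustrated with Python's standard `hashlib`. This is a minimal sketch under stated assumptions: the 4096-byte chunks, the in-memory dictionary, and the names `dedupe` and `rebuild` are hypothetical, and nothing here reflects how DSF actually chunks or stores data.

```python
import hashlib


def dedupe(chunks: list) -> tuple:
    """Fingerprint each chunk with SHA-1 and store each unique chunk once.

    Returns (store, refs): store maps fingerprint -> chunk bytes, and refs
    lists the fingerprint of every incoming chunk in write order.
    """
    store = {}
    refs = []
    for chunk in chunks:
        fingerprint = hashlib.sha1(chunk).hexdigest()
        store.setdefault(fingerprint, chunk)  # keep only the first copy
        refs.append(fingerprint)
    return store, refs


def rebuild(store: dict, refs: list) -> bytes:
    """Reassemble the original data from the fingerprints."""
    return b"".join(store[fp] for fp in refs)


if __name__ == "__main__":
    data = [b"A" * 4096, b"B" * 4096, b"A" * 4096]
    store, refs = dedupe(data)
    print(len(data), "chunks written,", len(store), "chunks stored")
```

Matching on the fingerprint alone, rather than comparing bytes, is the design choice the text justifies by calling SHA-1 a strong hash.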

DSF uses the Google Snappy compression algorithm, providing good compression ratios with minimal computational overhead and extremely fast compression and decompression rates.

PRO TIP: COMPRESSION

Use inline compression most of the time; it will not impact random write performance. Inline compression pairs perfectly with erasure coding.

EC-X

Nutanix Enterprise Cloud on PRIMERGY systems include an innovative erasure coding technology, Nutanix EC-X, that provides resilience and can increase usable capacity by up to 75%. EC-X reduces the capacity cost of replication factor (RF) without taking away any of the resilience benefits and with no impact on write performance.

EC-X encodes a strip of data blocks on different nodes and calculates parity. In the event of a disk or node failure, parity is used to calculate any missing data blocks. DSF uses an extent group as the data block, and each data block in a strip must be on a different node and belong to a different vDisk. The number of data and parity blocks in a strip is configured based on the desired number of failures to withstand.
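The idea of computing parity over a strip and using it to rebuild a missing block can be shown with the simplest erasure code, single XOR parity. The text does not say which code EC-X uses, so this is a stand-in technique chosen for illustration, not the EC-X algorithm. The helper names and the 4-data, 1-parity strip size are assumptions.

```python
from functools import reduce


def xor_blocks(blocks: list) -> bytes:
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def recover(surviving: list, parity: bytes) -> bytes:
    """Rebuild the single missing data block from the survivors and parity."""
    return xor_blocks(list(surviving) + [parity])


def overhead(data_blocks: int, parity_blocks: int) -> float:
    """Raw capacity consumed per unit of user data for a given strip shape."""
    return (data_blocks + parity_blocks) / data_blocks


if __name__ == "__main__":
    strip = [b"node", b"data", b"strp"]      # one block per node (hypothetical)
    parity = xor_blocks(strip)
    rebuilt = recover([strip[0], strip[2]], parity)
    print(rebuilt == strip[1])
    print(overhead(4, 1), "x raw capacity versus 2.0 x for two full copies")
```

Single XOR parity survives one lost block per strip. Withstanding more failures needs additional parity blocks and a stronger code, which is consistent with the text's note that strip shape depends on the number of failures to withstand.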

6. Data Protection

Nutanix offers natively integrated data protection and continuous availability at the VM level. A range of options is available to meet the recovery point objective (RPO) and recovery time objective (RTO) of different [...]

FIGURE 6: Nutanix data protection options

WHAT ARE RTO AND RPO?

Recovery Time Objective (RTO) defines how much time you have to recover if an IT failure occurs.

Recovery Point Objective (RPO) defines the maximum amount of data you are willing to lose.

CONVERGED LOCAL BACKUPS WITH SNAPSHOTS AND TIME STREAM

Nutanix Time Stream can create unlimited local snapshots, with VM and application-level consistency, and recover data instantly to meet a wide range of backup and data protection requirements. Time Stream uses VM-centric snapshots to provide production-level data protection without sacrificing performance. Nutanix utilizes a redirect-on-write algorithm that dramatically improves system efficiency for snapshots.

Commvault IntelliSnap integration combines Commvault backup capabilities with enterprise storage features from Nutanix.

INTEGRATED REMOTE BACKUP AND DR USING ASYNC REPLICATION

Nutanix DR and replication capabilities are built on snapshot technology. VM snapshots can be asynchronously replicated or backed up to another data center based on a user-defined schedule. Replication topologies are flexible and bi-directional, enabling one-to-one, one-to-many, and many-to-many deployments. During replication, data is compressed and replicated at the sub-block level for maximum efficiency and lower WAN bandwidth consumption.

The Nutanix Prism interface offers a simplified view of all local and remote snapshots, allowing administrators to restore a VM from a snapshot with a single click. In case of disaster, failover to the secondary data center can be done with a single click.

SELF-SERVICE FILE RESTORE

Acropolis data protection includes self-service file restore, which allows users to recover individual files from VM snapshots without getting an administrator involved.
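The redirect-on-write approach mentioned for Time Stream snapshots can be sketched in a few lines: a snapshot freezes the current block map, and later writes go to fresh locations instead of overwriting data in place. This is a generic illustration of the technique, not the Nutanix implementation. The `Volume` class and its methods are hypothetical, and reclaiming unreferenced locations is left out.

```python
class Volume:
    """Toy volume that never overwrites data in place."""

    def __init__(self):
        self._store = {}      # physical location id -> data
        self._map = {}        # logical block number -> location id
        self._next = 0        # next free location id
        self.snapshots = {}   # snapshot name -> frozen block map

    def write(self, block: int, data: bytes) -> None:
        # Redirect: place the new data at a fresh location, then repoint
        # the live map. Old locations stay intact for any snapshot.
        location = self._next
        self._next += 1
        self._store[location] = data
        self._map[block] = location

    def snapshot(self, name: str) -> None:
        # Only the map (metadata) is copied; no data is moved or copied.
        self.snapshots[name] = dict(self._map)

    def read(self, block: int, snapshot: str = None) -> bytes:
        block_map = self._map if snapshot is None else self.snapshots[snapshot]
        return self._store[block_map[block]]


if __name__ == "__main__":
    vol = Volume()
    vol.write(0, b"v1")
    vol.snapshot("before-upgrade")
    vol.write(0, b"v2")
    print(vol.read(0), vol.read(0, "before-upgrade"))
```

Because taking a snapshot copies only metadata and later writes never touch snapshotted data, snapshot creation stays cheap regardless of volume size, which is one plausible reading of the efficiency claim in the text.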

CLOUD CONNECT

Nutanix Cloud Connect lets enterprises use public cloud services, such as Amazon Web Services (AWS) and Microsoft Azure, as a long-term backup destination for all types of workloads, making them a logical extension of their own data centers.

Cloud Connect for AWS provides a live Nutanix cluster in the cloud running on EC2 instances and using the Elastic Block Store for metadata and S3 storage for backups. For Microsoft Azure, the Nutanix software runs on Azure Compute and storage is from Azure Page Blob. Data transfer is WAN optimized, reducing the storage footprint and networking bandwidth by over 75%. Support for Amazon Virtual Private Cloud (VPC) and Azure Virtual Network (VNET) allows secure data transfer over an IP connection.

METRO AVAILABILITY AND SYNC REPLICATION

For critical workloads requiring zero RPO and near-zero RTO, Nutanix offers Metro Availability, ensuring continuous data availability across separate sites within a metro. Metro Availability is simple to set up and manage using Prism.

Metro Availability can be set up bi-directionally between two sites connected over a metro area network. The only network requirement is a round-trip latency of less than five milliseconds. Data is written synchronously to both sites, so it is always available to applications in the event a site fails or needs to undergo maintenance. Virtual machines can be non-disruptively migrated between sites for planned maintenance events or other needs.

“We were focused on flexibility and innovation. We were looking for a partner who would be able to understand our business needs. With Nutanix, there was a willingness to listen and propose an innovative solution.”
LAURENT PERRIAULT, Director of Operations, Claranet

7. Security

Nutanix Acropolis is hardened by default. It utilizes the principle of least privilege, and delivers a true defense-in-depth model. Its custom security baseline exceeds the requirements of the U.S. Department of Defense.

Nutanix combines features such as two-factor authentication and data-at-rest encryption with a security development lifecycle. This is integrated into product development to help meet the most stringent security requirements. Nutanix Enterprise Cloud on PRIMERGY systems are certified across a broad set of evaluation programs to ensure compliance with the strictest standards.

DATA-AT-REST ENCRYPTION

Data-at-rest encryption is delivered through self-encrypting drives (SED) that are factory-installed in PRIMERGY hardware. This provides strong data protection by encrypting user and application data for FIPS 140-2 Level 2 compliance. Acropolis interfaces with third-party key management servers using the industry-standard Key Management Interoperability Protocol (KMIP) instead of storing the keys in the cluster.

TWO-FACTOR AUTHENTICATION & CLUSTER LOCKDOWN

Nutanix solutions enforce two-factor authentication for system administrators in environments requiring additional layers of security. When implemented, administrator logins require a combination of a client certificate and username/password.

Nutanix also offers Cluster Shield, which restricts access to a Nutanix cluster in security-conscious environments such as government and healthcare data centers. Cluster Shield not only disables interactive shell logins automatically but can also enable more restrictive access based on those keys.

Nutanix uses a unique, well-defined Security Development Lifecycle (SecDL) to incorporate security into every step of the software development process, from design and development to testing and hardening. Threat modeling is used to assess and mitigate customer risk from code changes. SecDL testing is fully automated during development, and all security-related code modifications are timed during minor releases to minimize risk.

The Nutanix Security Technical Implementation Guide (STIG) is written in the eXtensible Configuration Checklist Description Format (XCCDF), allowing it to be read by various automated assessment tools, such as Host Based Security System (HBSS). This provides detailed information on how to assess a Nutanix Enterprise Cloud on PRIMERGY system to determine compliance with the STIG requirement, cutting down the accreditation time from 9-12 months to a matter of minutes.

SECURITY AUTOMATION WITH SALTSTACK

SaltStack is a robust, open-source automation and management framework that provides a simple way to check and fix a system baseline. Acropolis uses SaltStack to self-heal any deviation from the security baseline configuration of the operating system.

8. Hypervisors & Application Mobility

Acropolis provides an open platform for virtualization and application mobility by taking advantage of the same underlying web-scale architecture.

IT’S A MULTI-HYPERVISOR W
