Why Ceph Is the Best Choice of OpenStack Backend Storage

Transcription

Why Ceph is the Best Choice of OpenStack Backend Storage
周振倫 Aaron JOUE, Founder & CEO

Agenda
- About Ambedded
- Why Ceph is the Best Choice of OpenStack Storage?
- The Use of OpenStack Block and Shared File System
- Unified Storage
- High Availability
- Scalable
- How Does Ceph Support OpenStack?
- Build a No Single Point of Failure Ceph Cluster

About Ambedded Technology
- Y2017: Won the Computex 2017 Best Choice Golden Award. The Mars 200 product is successfully deployed at tier 1 telecom companies in France and Taiwan.
- Y2016: Launched the first-ever Ceph storage appliance powered by Gen 2 ARM microservers. Awarded 2016 Best of INTEROP Las Vegas storage product, beating VMware Virtual SAN.
- Y2014-2015: Delivered 2,000 Gen 1 microservers to partner Cynny for its cloud storage service; 9 petabytes of capacity in service so far. Demo at the ARM Global Partner Meeting in Cambridge, UK.
- Y2013: Founded in Taipei, Taiwan; office in the National Taiwan University Innovation Center.

Why Ceph is the Best Choice?
- Stable for production, great contributors
- Ceph dominates the OpenStack block storage (Cinder) and shared file system (Manila) drivers in use
- Ceph is unified storage which supports object, block and file system
- Open source, scalable, no single point of failure
- Self-management: auto balance, self-healing, CRUSH map, etc.
- OpenStack built-in integration with Ceph
- Geo-replication

Which OpenStack block storage (Cinder) drivers are in use?
[Chart: 2016 and 2017 OpenStack User Survey results. Ceph is by far the most-used Cinder driver; VMware VMDK, SolidFire and EMC each account for only a few percent. Source: OpenStack User Survey]

Which OpenStack Shared File Systems (Manila) driver(s) are you using?
[Chart: 2016 and 2017 OpenStack User Survey results. The Ceph driver leads Manila adoption; Huawei, IBM GPFS, Windows SMB, Quobyte and EMC each hold only a few percent. Source: OpenStack User Survey]

Ceph is Unified Storage
- LIBRADOS: a library allowing apps to directly access RADOS, with support for C, C++, Java, Python, Ruby, and PHP
- Object Storage (RADOS Gateway): a bucket-based REST gateway, compatible with S3 and Swift
- Block Storage (RBD): a reliable and fully distributed block device, with a Linux kernel client and a QEMU/KVM driver
- File System (CephFS): a POSIX-compliant distributed file system, with a Linux kernel client and support for FUSE
- RADOS: a reliable, autonomous, distributed object store comprised of self-healing, self-managing, intelligent storage nodes (OSDs, MONs, MDSs)
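The layering above can be made concrete with a few lines against the Python librados binding. This is a minimal sketch, not from the slides: the config path is the standard default and 'mypool' is an assumed pool name; RBD, CephFS and the RADOS Gateway all sit on the same RADOS layer this code talks to.

```python
# Minimal LIBRADOS sketch using the python3-rados binding (assumed pool name).
import rados

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
print(cluster.list_pools())            # the pools backing block/file/object use

ioctx = cluster.open_ioctx('mypool')   # 'mypool' is an example pool
try:
    # Write and read a raw object directly in RADOS.
    ioctx.write_full('hello-object', b'stored as a replicated RADOS object')
    print(ioctx.read('hello-object'))
finally:
    ioctx.close()
    cluster.shutdown()
```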

Ceph is Scalable & Has No Single Point of Failure
- The CRUSH algorithm distributes objects across OSDs according to a pre-defined failure domain
- No controller and no bottleneck limit scalability; no single point of failure
- Clients use cluster maps and hash calculation to write/read objects to/from OSDs
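As a rough illustration of why no central controller is needed, the toy sketch below hashes an object name into a placement group and maps it onto OSDs. It is deliberately simplified and is not the real CRUSH algorithm (which walks the cluster hierarchy and honors failure domains); it only shows that any client holding the same map computes the same placement.

```python
# Simplified placement sketch (NOT real CRUSH): location is computed from the
# object name and a shared map, with no central lookup service.
import hashlib

OSDS = ['osd.0', 'osd.1', 'osd.2', 'osd.3', 'osd.4', 'osd.5']  # from the cluster map
PG_COUNT = 128          # placement groups in the pool
REPLICAS = 3            # copies to keep

def place(object_name):
    # Hash the object name into a placement group...
    pg = int(hashlib.md5(object_name.encode()).hexdigest(), 16) % PG_COUNT
    # ...then map the PG to an ordered set of OSDs. Real CRUSH walks the
    # cluster hierarchy and honors failure domains; here we just stride.
    return [OSDS[(pg + i) % len(OSDS)] for i in range(REPLICAS)]

print(place('vm-disk-0001'))   # every client computes the same answer
```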

CRUSH Algorithm & Replication
[Diagram: the client computes the object location from the cluster map received from the MONs and writes to the primary OSD of the placement group; the CRUSH rule ensures the replicas are stored in different nodes/chassis/racks.]
Failure domain can be defined as: node, chassis, rack, data center
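On a real cluster you can ask where an object would be placed with the standard `ceph osd map <pool> <object>` command; the small helper below simply wraps it from Python (assuming admin credentials on the host; pool and object names are examples, and the exact JSON fields returned may vary by release).

```python
import json
import subprocess

def object_location(pool, obj):
    # Wraps the standard CLI: `ceph osd map <pool> <object> --format json`.
    out = subprocess.run(
        ['ceph', 'osd', 'map', pool, obj, '--format', 'json'],
        check=True, capture_output=True, text=True,
    )
    return json.loads(out.stdout)

# Prints the placement group and the up/acting OSD set chosen by CRUSH.
print(object_location('rbd', 'vm-disk-0001'))
```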

Auto Balance vs. Auto Scale-Out
[Diagram: data is kept balanced across OSDs; when new nodes join the cluster, the total capacity scales out and Ceph autonomically rebalances the performance and capacity load.]

Self Healing
- Auto-detection: Ceph re-generates the missing copies of data
- The re-generated data copies are saved to the existing cluster following the CRUSH rule configured via UVS
- Autonomic! Self-healing is activated when the cluster detects data at risk; no human intervention is needed
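Because recovery is automatic, the only thing an operator script really needs to do is watch cluster health. The sketch below polls health through the python3-rados binding's mon_command interface; it issues no repair commands, since re-generating lost replicas is Ceph's own job. (Config path and polling interval are assumptions.)

```python
# Poll cluster health to watch Ceph detect a failed OSD and re-heal on its own.
import json
import time
import rados

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
try:
    for _ in range(10):                      # poll a few times as a demo
        ret, out, err = cluster.mon_command(
            json.dumps({'prefix': 'health', 'format': 'json'}), b'')
        health = json.loads(out)
        print(health.get('status'))          # e.g. HEALTH_OK / HEALTH_WARN
        time.sleep(30)
finally:
    cluster.shutdown()
```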

OSD Self-Heal vs. RAID Re-build
Test condition                         | Microserver Ceph Cluster         | Disk Array
Disk number/capacity                   | 16x 10TB HDD OSD                 | 16x 3TB HDD
Data protection                        | Replica 2                        | RAID 5
Data stored on the disk                | 3TB                              | Not related
Time to re-heal/re-build               | 5 hours 10 min                   | 41 hours
Administrator involvement              | Re-heal activates automatically  | Re-build after replacing the disk
Re-heal rate                           | 169 MB/s (10 MB/s per OSD)       | 21 MB/s
Re-heal time vs. total number of disks | More disks, less recovery time   | More disks, longer recovery time
*OSD backfilling configuration is the default value
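As a rough consistency check of the table, the measured aggregate re-heal rate lines up with 16 OSDs backfilling at the default of roughly 10 MB/s each, and with moving the 3 TB that sat on the failed disk in about 5 hours 10 minutes:

```python
# Back-of-the-envelope check of the re-heal figures above (decimal TB assumed).
data_mb = 3 * 1000 * 1000            # ~3 TB stored on the failed disk, in MB
seconds = 5 * 3600 + 10 * 60         # 5 hours 10 minutes
osds, per_osd = 16, 10               # 16 OSDs at ~10 MB/s default backfill each

print(round(data_mb / seconds))      # ~161 MB/s aggregate, close to the measured 169 MB/s
print(osds * per_osd)                # 160 MB/s expected from the per-OSD backfill rate
```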

Ceph and OpenStack
[Diagram: OpenStack services (Glance, Cinder, Manila, and Nova via libvirt and KVM/QEMU) store their images, volumes and shared file systems in the RADOS cluster of OSDs.]
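Under the hood, the OpenStack integration comes down to librbd/librados calls. The sketch below is not the Cinder driver itself; it only shows the kind of librbd operation the driver performs, using the python3-rbd binding, an assumed 'volumes' pool and a made-up volume name.

```python
# Sketch of the librbd call behind an OpenStack volume (assumed pool/name).
import rados
import rbd

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
ioctx = cluster.open_ioctx('volumes')          # pool commonly used for Cinder volumes
try:
    size = 10 * 1024**3                        # 10 GiB volume
    rbd.RBD().create(ioctx, 'volume-demo', size)
    with rbd.Image(ioctx, 'volume-demo') as image:
        print(image.size())                    # the image Nova/KVM would attach via librbd
finally:
    ioctx.close()
    cluster.shutdown()
```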

Build a No Single Point of Failure Ceph Cluster
- Hardware will always fail
- Protect data by software intelligence instead of using hardware redundancy
- Smallest and configurable failure domain

Issues of Using a Single Server Node with Multiple Ceph OSDs
- One server failure causes many OSDs to go down
- CPU utilization is only 30%-40% when the network is saturated; the bottleneck is the network, not computing
- The power consumption and thermal/heat are eating your budget

1x OSD vs. 1x Micro Server
- Traditional server: one server to many OSDs causes a higher risk from a single server failure; CPU utilization is low due to the network bottleneck
- ARM microserver cluster: one server to one OSD reduces the failure risk; aggregated network bandwidth without a bottleneck
[Diagram: clients connecting to three traditional servers over 20 Gb links each, compared with microserver clusters behind 4x 10 Gb aggregated uplinks.]

Mars 200: 8-Node ARM Microserver Cluster
- 8x 1.6 GHz ARM v7 dual-core microservers, each with 2 GB DRAM, 8 GB flash system disk, 5 Gbps LAN, and 5 watts power consumption
- Storage devices: 8x SATA3 HDD/SSD OSDs, 8x SATA3 journal SSDs
- Every node can be an OSD, MON, MDS, or gateway
- OOB BMC port
- Dual uplink switches, total 4x 10 Gbps
- Hot-swappable microservers, HDD/SSD, Ethernet switches, and power supplies

The Benefit of Using 1-Node-to-1-OSD Architecture on Ceph
- Minimizes the failure domain to a single OSD
- The MTBF of a microserver is much higher than that of an all-in-one motherboard: 122,770 hours
- Dedicated hardware resources stabilize the OSD service: CPU, memory, network, SATA interface, SSD journal disk
- Aggregated network bandwidth with failover
- Low power consumption and cooling cost savings
- OSD, MON, and gateways are all in the same box
- 3x 1U chassis form a high availability cluster; 15x 9 availability with 3 replications (see the sketch after this list)
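One common way to reason about the availability claim is a simple independent-failure model: a node is effectively unavailable for the time it takes to restore redundancy, and data is unreadable only if every replica is down at once. The sketch below uses the quoted 122,770-hour MTBF and, as an assumption, the roughly five-hour re-heal time from the earlier table as the repair window; the number of nines is very sensitive to that assumed recovery time, so treat this as an illustration rather than the vendor's calculation.

```python
# Simplified independent-failure availability model (illustrative only).
MTBF_H = 122_770                 # microserver MTBF from the slide, in hours
RECOVERY_H = 5 + 10 / 60         # assumed repair window: the ~5 h 10 min re-heal time
REPLICAS = 3

node_availability = MTBF_H / (MTBF_H + RECOVERY_H)
# Data stays available unless all replicas happen to be down at the same time.
data_availability = 1 - (1 - node_availability) ** REPLICAS

print(f"per-node availability: {node_availability:.7f}")
print(f"data availability with {REPLICAS} replicas: {data_availability:.16f}")
```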

The Basic High Availability Cluster
Scale it out

Ceph Storage Appliance
- 1U 8 nodes: high density
- 2U 8 nodes (2017): front-panel disk access

Ceph Management GUI Demo

Build an OpenStack A-Team in Taiwan
晨宇創新 (Ambedded) and 數位無限: the power of partnership

Aaron Joue
aaron@ambedded.com.tw
