Data Integrity Means And Practices - Digitalpreservation.gov

Transcription

Data Integrity
Means and Practices
Raymond A. Clarke
Sr. Enterprise Storage Solutions Specialist, Sun Microsystems - Archive & Backup Solutions
SNIA Data Management Forum, Board of Directors

Backup vs. Archiving – there's a difference
Both are required in today's environments

BACKUP
  Single/multiple copies, multiple points in time
  Recover data/information due to corruption or loss
  Meet RPO and RTO objectives
  Maintain copy for disaster recovery
  Offline volume remounted and manually searched

ARCHIVING
  Multiple copies, infinite time periods
  Maximize efficiency and optimization
  Regulatory compliance, provenance, fixity
  Enable eDiscovery
  Meet best practice
  Search criteria: online files recalled based on key word/date criteria

[Diagram labels: Primary Data; SSD; High Performance Disk; Capacity Disk; Disk Archive; Tape Archive; Deep Archive; VTL - ATL; Replication; Encryption; De-duplication]

LoC - Data Integrity - Sun Microsystems

Why is Backup & Archive So Important? ...because the History of Data Growth is Exponential!
  24 words - Pythagorean Theorem
  67 words - Archimedes' Principle
  179 words - 10 Commandments
  286 words - Lincoln's Gettysburg Address
  1,300 words - US Declaration of Independence
  26,911 words - EU Regulation on the Sale of Cabbages

Building a Terminology Bridge

Archive: The report advocates that IT practices adopt a usage of the term 'archive' that is more consistent with other departments within the organization. To the archival, preservation, and records management communities, an "archive" is a specialized repository with preservation services and attributes.

Preservation: Managing information in today's datacenter, with requirements to safeguard information assets for eDiscovery, litigation evidence, security, and regulatory compliance, requires that many classes of information be preserved from time of creation. Preservation is a set of services that protect each preservation object; provide availability, integrity, and authenticity controls; include security and confidentiality safeguards; and include an audit log, control of metadata, and other practices. The old IT practice of placing information into an archive when it becomes inactive or expired no longer works for compliance or litigation support, and only adds cost.

Authenticity: Defined in a digital retention and preservation context as the practice of verifying that a digital object has not changed. Authenticity attempts to establish that an object is currently the same genuine object that it was "originally" and to verify that it has not changed over time unless that change is known and authorized. Authenticity verification requires the use of metadata. The critical change for IT practices is that metadata is now very important and must be safeguarded with the same priority as the data.

IT term bridge/Source:
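One common way to verify that an object "has not changed" is a fixity check: record a cryptographic digest as metadata when the object is ingested, then recompute and compare it later. The sketch below is only an illustration under stated assumptions. The slides do not name a digest algorithm or a metadata layout, so SHA-256 and the function names `record_fixity` and `verify_fixity` are hypothetical choices.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file through SHA-256 so large objects need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_fixity(path: Path) -> dict:
    """Build the metadata record to store at ingest (hypothetical layout)."""
    return {
        "name": path.name,
        "size": path.stat().st_size,
        "sha256": sha256_of(path),
    }


def verify_fixity(path: Path, metadata: dict) -> bool:
    """Return True only if the object still matches its recorded size and digest."""
    if path.stat().st_size != metadata["size"]:
        return False
    return sha256_of(path) == metadata["sha256"]
```

The check is only as trustworthy as the stored record, which is why the metadata has to be protected with the same priority as the data itself.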

What is an Archive?
A Searchable Repository That Provides Business Benefits
  Security
  Accessibility
  Integrity
  Scale
  Long Life
  Open Standards (access and data format)
  Cost and "Data" Effective
  Eco Responsible

Demands of a New Archive Reality
Is the ratio for archiving solutions changing? 10 / 90 versus 2 / 18 / 80
  Next Generation Archives need to address a new dimension of the massive resting data – how do you search petabytes of data from the edge?
  The new ratio has evolved into a Write / Read / Search relationship (2 / 18 / 80) – different demands on the infrastructure
  Business semantics need to drive data management, not systematic schemas
  Virtualization and Search become critical to the presentation of the data; something new is needed
  Compute and Store need to converge

Most Data Remains Untouched

Tier 0: Ultra high performance / ultra high value information
  Average distribution of data: 1-3%
  Value index: 99.999%
  Type of technology: DRAM SSD, Flash Memory HDD, Hi-Perf Disk
Tier 1: High-value, high ingest, OLTP, revenue generating, high performance data
  Average distribution of data: 15-20%
  Value index: 99.999%
  Type of technology: Enterprise-class HDD, RAID, mirrors, replication
Tier 2: Backup/recovery apps, reference data, vital and sensitive data, lower value active data
  Average distribution of data: 20-25%
  Value index: 99.99%
  Type of technology: Midrange HDD, SATA, virtual tape, MAID, integrated virtual tape libraries
Tier 3: Fixed content, compliance, archive, long-term retention, green storage apps
  Average distribution of data: 50-60%
  Value index: 99.9%
  Type of technology: High-capacity tape, MAID, manual tape, shelf storage

Probability of re-reference by age in days
  1 day: 70-80%
  3 days: 40-60%
  7 days: 20-25%
  30 days: 1-5%
  90 days: near 0%

Source: Horison Information Strategies, Fred Moore, www.horison.com

Why Tape Continues to Make Good Sense

Long span of media
  Tape: 15-30 years on all new media.
  Disk: 3-5 years for most HDDs.
Portability
  Tape: Media is completely removable and easily transported.
  Disk: Disks are difficult to remove and safely transport.
Move data to remote location for DR with or without electricity
  Tape: Data/media can be moved remotely with or without electricity.
  Disk: Difficult to move disk data to remote location for DR without electricity.
Inactive data does not consume energy
  Tape: Green storage.
  Disk: Very rarely, except with MAID (questionable ROI).
Encryption for highest security level
  Tape: Encryption available on essentially all tape drive types.
  Disk: Available on selected disk products.

Make a Fool-Proof System and Nature comes up with a more creative Fool!
  Human error is the most likely and unpredictable source of problems
  The smartest people are sometimes the most likely to make an error
  How a well-designed system provides mitigation:
    Consider and mitigate all possible failure scenarios
    Provide a user-friendly, simple management interface
    Eliminate human interaction as far as possible by policy-driven automated processes
    Use Quorum to validate critical actions
"the smarter the person, the dumber the mistake"
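The quorum idea can be sketched as a gate that lets a critical action proceed only after a minimum number of distinct, authorized operators have approved it. This is a minimal illustration, not the product's mechanism: the class name `QuorumGate`, its fields, and the operator names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class QuorumGate:
    """Require `threshold` distinct authorized approvals before an action may run."""
    authorized: frozenset
    threshold: int
    approvals: set = field(default_factory=set)

    def __post_init__(self) -> None:
        if not 1 <= self.threshold <= len(self.authorized):
            raise ValueError("threshold must be between 1 and the number of operators")

    def approve(self, operator: str) -> None:
        """Record one approval; unknown operators are rejected outright."""
        if operator not in self.authorized:
            raise PermissionError(f"{operator!r} is not an authorized operator")
        # A set ignores repeats, so one person cannot satisfy the quorum alone.
        self.approvals.add(operator)

    def is_satisfied(self) -> bool:
        return len(self.approvals) >= self.threshold
```

For example, a gate built with three authorized operators and a threshold of 2 stays closed if the same operator approves twice, and opens once a second operator approves.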

Store Data Forever!
Future-proof data storage for data preservation
  Archive files are self-describing, standard
  No lock-in, open TAR format
  Move data to newer, more reliable media over time, transparently
  WORM enforcement throughout the archive
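Because TAR is an open, self-describing format, an archive written by one tool can be read back by any other conforming tool. The sketch below shows this with Python's standard `tarfile` module. It is an illustration only: the helper names `pack` and `unpack` are hypothetical, and it says nothing about how the archive product lays out its own files.

```python
import io
import tarfile


def pack(files: dict[str, bytes]) -> bytes:
    """Write named payloads into an uncompressed POSIX (pax) tar archive in memory."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w", format=tarfile.PAX_FORMAT) as archive:
        for name, payload in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)  # each header carries the member's own size
            archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


def unpack(blob: bytes) -> dict[str, bytes]:
    """Read every regular file back out, using only the headers in the archive."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r") as archive:
        return {
            member.name: archive.extractfile(member).read()
            for member in archive.getmembers()
            if member.isfile()
        }
```

Each member's name and size travel inside the archive, so no external catalog is needed to recover the contents.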

System Basics
  User/Application
  Storage Layer Abstraction
  New Data
  Aged Data
  Policies
  Multi-Tiered, Multi-copy Archival
  Local, Remote, Distributed, Cascaded
  Continuous Data Protection, On-disk WORM and Encryption

Tape Encryption Technology
  Encryption engine located between the compression and formatting functions
  Encrypted data is highly randomized, so encryption must be done post-compression to retain the benefits of compression
  All tape-based encryption products use AES-256 – the most powerful commercially available encryption algorithm
  All firmware and hardware encryption processes are validated by Known Answer Test at power-on
  Drive is designed to ensure that data cannot be encrypted with a corrupted key
  Secure and authenticated key transfer
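The reason for compressing before encrypting can be seen with a small experiment. The sketch below is a rough illustration under two assumptions: `zlib` stands in for the drive's compression engine, and `os.urandom` stands in for ciphertext, since good ciphertext is statistically close to random bytes. No real encryption is performed here.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data, 9)) / len(data)


# Repetitive plaintext, typical of logs and backup streams.
plaintext = b"backup record 0001: status=OK\n" * 4096

# Assumption: random bytes stand in for encrypted data of the same length.
ciphertext_like = os.urandom(len(plaintext))

if __name__ == "__main__":
    print(f"plaintext ratio:       {compression_ratio(plaintext):.4f}")
    print(f"ciphertext-like ratio: {compression_ratio(ciphertext_like):.4f}")
```

The plaintext shrinks to a small fraction of its size, while the random stand-in does not shrink at all. Encrypting first would therefore throw away the compression benefit.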

Typical Small Configuration
[Diagram labels: Single Site; DR Site; Key Management GUI; Customer Key Management Network; KMA Cluster; KMA; KMA 1; KMA 2; Data Center; Library; L700 Library; Off-site Storage Server; Off-site Tape Repository; Cloud, In-house or 3rd party DR Facility]

Key Life Cycle

What do we need to protect against?

Threat: Key Management Appliance failure
  Mitigation: KMS design replicates database to all KMAs in cluster. Database backup protects against universal multiple failures.
Threat: Network failure
  Mitigation: KMS design can ride through temporary interruptions; managed switches can provide redundant network connection.
Threat: Data center fire, flood, etc.
  Mitigation: KMS replication to off-site KMAs in cluster. Backup database to off-site server. Off-site tape vaulting. 3rd party DR services.

Mitigation for Small Configuration
[Diagram labels: Single Site; DR Site; Key Management GUI; Database transfer using Restore function (Quorum); Customer Key Management Network; KMA Cluster; KMA; KMA 1; KMA 2; Library; L700 Library; Off-site Storage Server; Off-site Tape Repository; Cloud, In-house or 3rd party DR Facility]

AES-256: The most powerful commercially available algorithm
  AES-256 uses a 256-bit key
  A 256-bit number has 1.16 x 10^77 permutations
  In July 2007, the population of the world was 6,602,224,175
  If you gave everyone in the world a super-computer that tries a key value every nanosecond, it would take 5.56 x 10^50 years to try all combinations
    Assumes that key values are adequately random
    "At 20 to 30 x 10^9 years, the sun will expand into a red ball and die, overwhelming Earth with the heat. Oceans will boil and evaporate, and other planets near the sun also will burn" January 15, 1997
  With AES-256, it is imperative that your system protects itself against malicious or inadvertent loss of keys
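The figures on this slide can be reproduced with a few lines of arithmetic. The sketch below assumes a 365-day year, which the slide does not state, and the constant and function names are illustrative only.

```python
KEYSPACE = 2 ** 256                   # number of distinct 256-bit keys
POPULATION = 6_602_224_175            # world population, July 2007 (from the slide)
KEYS_PER_SECOND = 10 ** 9             # one key tried every nanosecond
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: 365-day year


def years_to_exhaust(keyspace: int = KEYSPACE,
                     machines: int = POPULATION,
                     rate: int = KEYS_PER_SECOND) -> float:
    """Years needed for `machines` computers to try every key once."""
    keys_per_year = machines * rate * SECONDS_PER_YEAR
    return keyspace / keys_per_year


if __name__ == "__main__":
    print(f"keyspace: {float(KEYSPACE):.2e}")
    print(f"years:    {years_to_exhaust():.2e}")
```

Using a 365.25-day year changes the result only in the third significant digit, so it still rounds to 5.56 x 10^50 years.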

FIPS 140-2 Security Levels
Modules are evaluated against 12 sets of criteria and assigned a Security Level
The Security Level of the complete module is determined by the lowest Security Level per criterion
  Security Level 1 is "Basic"
  Security Level 2 adds "Tamper Evidence", often by using approved labels
  Security Level 3 is "Tamper Resistant", often by encapsulating the device in thick epoxy
  Security Level 4 is "Tamper Respondent"; for example, active circuitry will erase keys if anyone tampers with the device
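The "lowest level wins" rule amounts to taking a minimum over the per-criterion ratings. The sketch below is illustrative only: the function name and the criterion labels used in the example are hypothetical and are not taken from a real certificate.

```python
def overall_security_level(levels_by_criterion: dict[str, int]) -> int:
    """Return the module's overall level: the lowest level earned on any criterion."""
    if not levels_by_criterion:
        raise ValueError("at least one criterion rating is required")
    for criterion, level in levels_by_criterion.items():
        if level not in (1, 2, 3, 4):
            raise ValueError(f"{criterion!r} has invalid level {level}")
    return min(levels_by_criterion.values())
```

For example, ratings of 3, 2, and 4 on three criteria give an overall Security Level of 2, however strong the other criteria are.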

Sun T10000B FIPS Certificate
Prior to certification of the module, the implementation of each cryptographic algorithm used in the module must be tested and FIPS-certified

Thank You for Your Time and Attention
Raymond.Clarke@sun.com
(212) 558-9321
