Data Security In Hadoop

Data Security in Hadoop
Eric Mizell – Director, Solution Engineering
Hortonworks Inc. 2011 – 2014. All Rights Reserved

What is Data Security?
Data Security for Hadoop allows you to administer a single policy to authenticate users, authorize data access, protect data at rest and in motion, and collect a record of interactions with that data.
It allows you to coordinate enforcement of this policy across the entire Hadoop stack.

What is Security?
5 areas of security focus:
- Administration: Central management and consistent security
- Authentication: Authenticate users and systems
- Authorization: Provision access to data
- Audit: Maintain a record of data access
- Data Protection: Protect data at rest and in motion

Why Security?
Security needs are changing:
- YARN unlocks the data lake
- Multi-tenant: multiple applications for data access
- Changing and complex compliance environment
- ETL of non-sensitive data can yield sensitive data
Fall 2013: Largely siloed deployments with single-workload clusters
Summer 2014: 65% of clusters host multiple workloads
5 areas of security focus: Administration, Authentication, Authorization, Audit, Data Protection

HDP delivers comprehensive security for Hadoop
[Diagram: HDP 2.1, Hortonworks Data Platform]
- Governance & Integration: Data Workflow, Lifecycle
- Data Access (batch, interactive & real-time): SQL, NoSQL, Stream, Search, In-Memory, Others (ISV Engines); Hive, HBase, HCatalog, Tez
- Data Management: YARN (Data Operating System), HDFS (Hadoop Distributed File System)
- Security: Authentication, Authorization, Accounting, Data Protection. Storage: HDFS; Resources: YARN; Access: Hive; Pipeline: Falcon; Cluster: Knox
- Operations: Provision, Manage & Monitor (Ambari, Zookeeper); Scheduling (Oozie)
COMPREHENSIVE SECURITY: Meet all security requirements across authentication, authorization, audit and data protection.
CENTRAL ADMINISTRATION: Provide one location for administering security policies and for viewing and managing audit across the platform.
CONSISTENT INTEGRATION: Integrate with other security and identity management systems, for compliance with IT policies.

Start of 2014: Hadoop Data Security Capabilities
Requirements: Authenticate, authorize, provide auditability of data access and protect data at rest and in motion.
- Administration (central management and consistent security): Did not exist
- Authentication (authenticate users and systems): Kerberos & Apache Knox
- Authorization (provision access to data): Fragmented across components. Hive: ATZ-NG; HDFS: ACLs; Sentry (interactive SQL & Search)
- Audit (maintain a record of data access): Fragmented across components
- Data Protection (protect data at rest and in motion): 3rd party add-ons, integration with Apache Falcon & OS capabilities
Start of 2014: State of Hadoop Security
- Largely disjoint patchwork
- A few projects address spot needs
- No coverage across all workloads
- No central administration and enforcement

May 2014: Hortonworks Acquires XA Secure
Requirements: Authenticate, authorize, provide auditability of data access and protect data at rest and in motion.
Areas: Administration, Authentication, Authorization, Audit, Data Protection
Hortonworks acquired XA Secure. The acquisition accelerates delivery against the enterprise requirements for central security administration and enforcement across all Hadoop workloads, from batch and interactive SQL to real-time.
Founded in 2013, XA Secure provides an enterprise-ready, cross-platform security layer built from the ground up for Hadoop, providing centralized capabilities around data security, authorization, auditing and overall governance.

Current State of Security in HDP
HDP provides central administration and coordinated enforcement of security policy across the entire Hadoop ecosystem of projects.
- Administration (central management and consistent security): Central administration
- Authentication (authenticate users and systems): Kerberos & Apache Knox
- Authorization (provision access to data): Authorization for HDFS, Hive, HBase, etc.
- Audit (maintain a record of data access): Compliance controls
- Data Protection (protect data at rest and in motion): 3rd party add-ons, integration with Apache Falcon & OS capabilities
With additional XA Secure features, HDP is a leader in Hadoop Security.
NEW FEATURES
- Centralized security policy enforcement
- Granular access control across HDFS, Hive, and HBase
- Universal audit tracking
- Compliance conformance controls

Central Security Administration
HDP Advanced Security (Apache Argus):
- Delivers a 'single pane of glass' for the security administrator
- Centralizes administration of security policy
- Ensures consistent coverage across the entire Hadoop stack
All delivered in Open Source under the governance of the ASF in Fall 2014.

Authentication
For more than 20 years, Kerberos has been the de facto standard for strong authentication; no other option exists.
What does Kerberos do?
- Establishes identity for clients, hosts and services
- Prevents impersonation; passwords are never sent over the wire
- Integrates with enterprise identity management tools such as LDAP & Active Directory
- Enables more granular auditing of data access and job execution
The design and implementation of Kerberos security in native Apache Hadoop was delivered by Hortonworker Owen O'Malley in 2010.

Perimeter Security with Apache Knox
Incubated and led by Hortonworks, Apache Knox provides a simple and open framework for Hadoop perimeter security.
- Single, simple point of access for a cluster
- Central controls ensure consistency across one or more clusters
- Integrated with existing systems to simplify identity maintenance
Capabilities:
- Single Hadoop access point
- Eliminates SSH "edge node"
- REST API hierarchy
- Central API management
- SSO integration: Siteminder and OAM*
- Consolidated API calls
- Central audit control
- LDAP & AD integration
- Multi-cluster support
- Simple service-level authorization
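To make the "single Hadoop access point" and "REST API hierarchy" ideas concrete, here is a minimal sketch of how a client might build a WebHDFS URL that goes through a Knox gateway instead of addressing cluster nodes directly. The host name `knox.example.com` and the helper `knox_webhdfs_url` are hypothetical, and the port `8443`, gateway path `gateway` and topology name `default` are assumptions about a typical deployment; all three are configurable, so check your own gateway settings. The code only builds a URL and makes no network call.

```python
from urllib.parse import quote, urlencode


def knox_webhdfs_url(gateway_host, hdfs_path, operation, *,
                     port=8443, gateway_path="gateway", topology="default",
                     **params):
    """Build a WebHDFS URL routed through a Knox gateway.

    The default port, gateway path and topology are assumptions about a
    typical deployment, not values taken from the slides.
    """
    # WebHDFS selects the operation with the "op" query parameter.
    query = urlencode({"op": operation, **params})
    return (
        f"https://{gateway_host}:{port}/{gateway_path}/{topology}"
        f"/webhdfs/v1{quote(hdfs_path)}?{query}"
    )


# Hypothetical gateway host: every client talks to this one entry point.
print(knox_webhdfs_url("knox.example.com", "/tmp/data", "LISTSTATUS"))
```

Because every service sits under the same gateway prefix, authentication, service-level authorization and audit can be applied at that one address rather than on each cluster node.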

Authorization and Audit
Authorization: fine-grained access control (control access into the system)
- HDFS: Folder, File
- Hive: Database, Table, Column
- HBase: Table, Column Family, Column
- Flexibility in defining policies
Audit: extensive user access auditing in HDFS, Hive and HBase
- IP address
- Resource type / resource
- Timestamp
- Access granted or denied
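The relationship between a fine-grained policy check and the audit record it produces can be sketched in a few lines. This is an illustrative toy model only, not the XA Secure or Apache Argus implementation: the classes `Policy`, `AuditRecord` and `PolicyEngine`, the glob-style resource patterns, and the sample users and paths are all hypothetical. The audit fields mirror the ones listed on the slide.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy: which users may do what on which resource."""
    resource_type: str      # "hdfs", "hive" or "hbase"
    resource_pattern: str   # glob, e.g. "/data/finance/*" or "sales.orders.amount"
    users: frozenset
    actions: frozenset


@dataclass
class AuditRecord:
    """Audit fields from the slide, plus the user and action involved."""
    user: str
    ip_address: str
    resource_type: str
    resource: str
    action: str
    granted: bool
    timestamp: datetime


@dataclass
class PolicyEngine:
    policies: list
    audit_log: list = field(default_factory=list)

    def check(self, user, ip_address, resource_type, resource, action):
        """Return True if any policy allows the access; always write an audit record."""
        granted = any(
            p.resource_type == resource_type
            and fnmatchcase(resource, p.resource_pattern)
            and user in p.users
            and action in p.actions
            for p in self.policies
        )
        # Denied requests are audited too, not only granted ones.
        self.audit_log.append(AuditRecord(
            user, ip_address, resource_type, resource, action,
            granted, datetime.now(timezone.utc),
        ))
        return granted


engine = PolicyEngine([
    Policy("hdfs", "/data/finance/*", frozenset({"alice"}), frozenset({"read"})),
    Policy("hive", "sales.orders.amount", frozenset({"alice"}), frozenset({"select"})),
])
engine.check("alice", "10.0.0.5", "hdfs", "/data/finance/q1.csv", "read")
engine.check("bob", "10.0.0.9", "hive", "sales.orders.amount", "select")
for record in engine.audit_log:
    print(record.user, record.resource, "granted" if record.granted else "denied")
```

Keeping the decision and the audit write in one code path is the design point: a central engine can then report every access attempt across HDFS, Hive and HBase in one place.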

Data Protection
HDP allows you to apply data protection policy at three different layers across the Hadoop stack.
Layer: Storage
  What: Encrypt data while it is at rest
  How: Partners, OS-level encryption, custom code
Layer: Transmission
  What: Encrypt data as it moves
  How: Supported in HDP 2.1
Layer: Upon Access
  What: Apply restrictions when accessed
  How: Partner, custom code

Thank You!
Eric Mizell – Director, Solution Engineering
