Data Protection And Information Governance Across Data Silos

Transcription

11/16/2017
Data Protection and Information Governance Across Data Silos
Patrick McGrath - Director Solutions Marketing - Archive, Search, Analytics
2017 COMMVAULT SYSTEMS, INC. ALL RIGHTS RESERVED.

What are these?

Creating new value from old content
UC Berkeley Prosopography Services

Goldmine or Minefield?
Common info themes with many organizations?
- Information expensive to manage, conserve/preserve and use
- Most information is dark or inaccessible, constrained by medium, volume or steward
- Understanding of information constrained by issues such as language, context and timeliness
- Information managed, governed and stored centrally within the organization
- Decisions influenced by information from within the organization
However, times have changed

Everyone approaches information from a different angle
- Content Creators, Researchers: “Get out of my way – I have work to do!”
- Legal: “What do you mean you can’t find it?”
- Records, Archivists, Librarians: “I won’t accept this without my 193 fields of metadata!”
- Information Technology: “I don’t care. Stop using so much storage!”

Data Sprawl

Digital Transformation and the Internet of Things
Forces fueling the move to the cloud
- Explosion of Digital Data: 90% of all digital data ever created was created in the last 2 years. Source: Sintef ITC
- Cloud Usage on the Rise: 70% of organizations are managing some of their data in cloud infrastructure. Source: IDC
- Connected Devices Skyrocketing: 20.8 BILLION things could be connected to the Internet by 2020, about 5.5 million devices added every day. Source: Gartner

Data sprawl
Containing sensitive information
Data subject areas: Customers, Vendor/Partners, Employees, Products, Intellectual Property, Contracts
Spread across:
- Locations/Jurisdictions
- Providers, SLA’s
- Devices (e.g. Laptops, IoT)
- Applications: ERP, CRM, ECM, EDW, Archive, Backups
In different parts of the Organization
How to manage visibility and consistency of data handling?

Heart of the problem
“while the average large UK business now uses 24 systems to manage and store personal data, 1 in 5 use over 40 systems to do so.”
- Nick Ismail, Information Age (citing 2017 OnePoll Survey)
“There is one application for every 5-10 employees generating copies of the same files leading to massive amounts of duplicate, idle data”
- Michael Vizard, ITBusinessEdge.com

Data copies and silos
[Diagram: copies of the same data spread across personal cloud storage, PSTs, mail servers, remote and datacentre file servers, replicas, compliance and file archives, analytics servers, and multiple backups]

Complexity hinders compliance and increases risk
[Diagram: silos across CLOUD DATA, SaaS, DATA CENTERS and LEGACY SYSTEMS]

PAIN: LACK OF CONTROL AND ANALYSIS
- Archive and Search systems create silos
- Lack common search and collate
- Multiple access controls to manage
- Gaps in coverage present risk
- Drives demand for more ‘data lakes’ projects

PAIN: BACKUP AND RECOVERY RISKS
- Too many siloed solutions & repositories
- Impossible to set common policies
- Reporting is a challenge
- Variable controls for access & audit
- Complexity leads to gaps in coverage

PAIN: VISIBILITY OF EXTERNAL DATA
- Data held externally is difficult to track
- Protection managed by 3rd party
- Limited ability to archive or manage retention
- Risk of data on unsanctioned Clouds
- Mobile and Shadow IT

Change drivers

Customer demands
- External competition
- Workforce competition
- Compliance and security

Ransomware
- Hack leads to data encryption, loss or copying
- Unless price paid, could lead to
  - Halt of business operations for critical data
  - Publication of sensitive data
- Could also lead to notifiable loss incident

Data Privacy / GDPR
- EU personal data privacy
- Serious consequences ( )
- Focus on EU resident personal data
- Global companies also liable
- Process and technology change for many
- Consent, requests, breach notification, etc.

Key Takeaways:
Know your critical and sensitive data.
Get rid of it if you don’t need it!

Where is this highly controlled data?
AIIM Report - Understanding GDPR in 2017

Control over the controlled data impacted by
AIIM Report - Understanding GDPR in 2017

UC Berkeley data breach incidents
[Timeline, 2003 to 2016]
- Jul 2003: California SB1386 Enacted
- Mar 2005: UCB Grad Division Incident
- May 2006: UCB RSSP Incident
- May 2009: UCB UHS Incident
- Aug 2009: UCB J-School Incident
- Aug 2014: UCB Cap Proj / Real Estate Incident
- May 2015: UCB E&I Incident
- Feb 2016: UCB BFS Incident

Data Breach Response Phases
[Diagram: response phases from “Breach Detected” and “Declare” onward; later labels truncated]
GDPR
- 72 Hours to notify authorities
- “Without undue delay” to notify victims
- YOU are responsible for the data handling of your providers
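The 72-hour window above lends itself to a simple deadline calculation. The following is a minimal sketch, assuming the clock starts at the moment the breach is detected; the function names and the sample timestamp are illustrative and do not come from the slides.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the 72-hour clock starts when the breach is detected.
AUTHORITY_NOTIFICATION_WINDOW = timedelta(hours=72)


def authority_notification_deadline(detected_at: datetime) -> datetime:
    """Return the latest time to notify the supervisory authority."""
    if detected_at.tzinfo is None:
        # Naive timestamps are ambiguous across jurisdictions, so reject them.
        raise ValueError("detected_at must be timezone-aware")
    return detected_at + AUTHORITY_NOTIFICATION_WINDOW


def hours_remaining(detected_at: datetime, now: datetime) -> float:
    """Hours left before the deadline; negative once it has passed."""
    remaining = authority_notification_deadline(detected_at) - now
    return remaining.total_seconds() / 3600


if __name__ == "__main__":
    detected = datetime(2017, 11, 16, 9, 0, tzinfo=timezone.utc)  # hypothetical
    print(authority_notification_deadline(detected).isoformat())
```

Requiring timezone-aware timestamps is a design choice for this sketch: data spread across jurisdictions makes local-time deadlines easy to misread.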

So, how to deal with landmines and goldmines?
[Diagram: Data Platform]
Primary Data Source: Mailboxes, Data Center, Apps & Databases, Cloud, Endpoints, IoT & External
1. Ingest: Unstructured (Files, Social), Structured Data (DB), Metadata & Usage Info, Dedupe, IoT, Big Data; Backup/Archive (Store) - OR - In Place Indexing (No-Store)
2. Understand: Contents, Usage, Meaning and Context, Data Profiling/Entity Extraction, Recommendations
3. Govern: Access, Retention/Disposition
4. Recover: Operational Recovery, DR/Hotsite, Dev/Test
5. Use: Applications, Investigations, Case; Big Data, Analytics, 360 Dashboards & Reporting, Business Intelligence
6. Extend: Monitor, Alerts/Notifications, Process Initiation/Automation

File system data source example - Storage optimization
- Duplicates
- Orphaned files
- Sensitive data

Sensitive Data Detection
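Two of the findings listed above, duplicates and sensitive data, can be illustrated with a short script. This is a minimal sketch and not the product's method: it assumes duplicates are files with identical SHA-256 content hashes and that sensitive data can be flagged with two illustrative regular expressions. The pattern set and all function names are hypothetical, and orphaned-file detection is not covered.

```python
from __future__ import annotations

import hashlib
import re
import tempfile
from collections import defaultdict
from pathlib import Path

# Illustrative patterns only (assumption); real detection needs validation
# and context to keep false positives down.
SENSITIVE_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files are never loaded whole."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root: Path) -> list[list[Path]]:
    """Group files under root whose contents are byte-for-byte identical."""
    by_digest: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_digest[file_digest(path)].append(path)
    return [paths for paths in by_digest.values() if len(paths) > 1]


def detect_sensitive(text: str) -> dict[str, int]:
    """Count matches per pattern, omitting patterns with no hits."""
    counts = {name: len(p.findall(text)) for name, p in SENSITIVE_PATTERNS.items()}
    return {name: n for name, n in counts.items() if n}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "a.txt").write_text("quarterly report")
        (root / "copy_of_a.txt").write_text("quarterly report")
        (root / "b.txt").write_text("contact: jane@example.com")
        for group in find_duplicates(root):
            print("duplicate set:", [p.name for p in group])
        print(detect_sensitive((root / "b.txt").read_text()))
```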

Data Analytics Applications
Data collection, indexing, analytics visualization and action!

Profile-Based Applications
- Architecture to enable content-aware applications
- Fine tuned for the specific knowledge and use case for a desired outcome
- User profile based applications
Core capabilities
- Data indexing
- Data detection
- Visualizations / Reports
- Data policy automation
- API access
- Audit trails
[Diagram: Files, AD, Live Data, Apps, Email, SaaS, Edge and Stored Data are ingested into a Data Index (Content Index, Federation, Virtualize, Enrich) and Data Services, forming a Virtual Repository. Infrastructure: SAN, Traditional Workflow, Mixed & Converged, Software Defined, Cloud]

Search & Analytics Unlocks Sensitive Data Management
DISCOVER: Discover risk data, across file, endpoint, email and structured data, and present for risk evaluation and action taking, removal or retention by defined policy.
MANAGE: Ensure risk data is always managed to standards with ongoing risk assessments.
PROTECT: Minimize use of risk data and protect from loss, breach or damage.
[Diagram: DISCOVER, MANAGE, PROTECT cycle with the following use cases]
- Simplify Response to Access, Rectification and Erasure Requests
- Map Enterprise Information
- Data Center Cleanup
- Monitor for Personal Data in Unauthorized Locations
- Automate Classification & Retention
- Identify Anomalous Access
- Optimize Accessibility
- Detect Ransomware
- Accelerate Breach Notification Planning
- Automate Storage Tiering & Disposition
- Demonstrate Compliance
- Optimize Business Continuity
- Encrypt & Protect End User Computer Data
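“Removal or retention by defined policy” can be pictured as a rule applied to each indexed item. The sketch below is one possible shape under stated assumptions: the classifications, the retention periods and the `IndexedItem` record are all hypothetical, and nothing here reflects how any particular product stores or evaluates policy.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical retention periods per classification (assumption, in days).
RETENTION_DAYS = {
    "personal_data": 365,
    "financial": 7 * 365,
    "general": 2 * 365,
}


@dataclass(frozen=True)
class IndexedItem:
    """A single entry in a content index (illustrative fields only)."""
    path: str
    classification: str
    last_modified: date


def disposition(item: IndexedItem, today: date) -> str:
    """Return 'retain', 'dispose', or 'review' for one indexed item."""
    days = RETENTION_DAYS.get(item.classification)
    if days is None:
        # Unknown classification: route to a person instead of guessing.
        return "review"
    if today - item.last_modified > timedelta(days=days):
        return "dispose"
    return "retain"


if __name__ == "__main__":
    today = date(2017, 11, 16)
    items = [
        IndexedItem("/hr/old_applicants.csv", "personal_data", date(2016, 1, 1)),
        IndexedItem("/finance/ledger_2016.xlsx", "financial", date(2016, 1, 1)),
        IndexedItem("/misc/unknown.bin", "unlabelled", date(2016, 1, 1)),
    ]
    for item in items:
        print(item.path, "->", disposition(item, today))
```

Sending unclassified items to review, instead of defaulting to retain or dispose, keeps the automated path limited to data whose sensitivity is already known.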

Getting to the right information quickly
Search and machine learning
Beta Coming Soon!

Compliance search - Commvault and Lucidworks
AI intelligence with ease of use

Prioritize and reduce costs with Machine Learning
- Do it at scale: 5M documents/hour
- Lower costs: 80-90% reduction
- Integrated AI: Powerful but easy to use
- Increase relevancy: Find and review what matters

Leveraging the AI ecosystem
Initiate Investigation
- Define Review Set to Analyze: Investigation starts by defining the search parameters of a Review Set
- Brainspace Pulls Information from the Review Set: The plugin streams data from the Review Set into Brainspace using the templated field map
Process, Analyze
- Build: Brainspace receives streamed text and metadata to create a Cluster Wheel, perform Comm Analysis, and display a dashboard
- Execute Visual Analytics
Analytics and Sync
- Create a Collection in Brainspace: User creates a collection in Brainspace using Visual Analytics
- Overlay Full Report: Full report data is synced into Commvault when build is complete
- Collection Sync to Review Set: User can perform actions on the Review Set based on the synchronized Brainspace tags

Integration Steps
First Step: Define Review Set to Analyze
Second Step: Transfer Review Set to Brainspace
Third Step: Perform Brainspace Analytics
Fourth Step: Create a Collection from your Analytics Result and Sync back to Commvault
Fifth Step: Take Action in Review Set from Brainspace Tag

Second Step: Transfer Review Set to Brainspace
- Pick Review Set
- Ingestion Fields Preconfigured
- Build Process

Third Step: Perform Brainspace Analytics
- Analytics Dashboard
- Transparent Concept Search
- Cluster Wheel
- Conversation Analysis
- Communication Analysis
- Advanced Document Classification - Predictive Coding
- Advanced Document Classification - Continuous Multi-Modal Learning (CMML)

Analytics Dashboard
The overview dashboard is completely interactive and provides insight at a glance for the entire dataset, including:
- Duplicates & Near-Duplicates
- Timeline
- Faceted Lists
- Concept Search
- Document Results

Transparent Concept Search
Brainspace’s next-generation Transparent Concept Search provides the advantages of concept searching without the traditional drawbacks.
Transparent Concept Search significantly reduces the time and expense resulting from over-inclusive document retrieval by allowing users to interact with the concept expansion to boost or eliminate concepts.
No black box.

Cluster Wheel
The Brainspace Cluster Wheel showcases our dynamic learning by organizing all documents into conceptually similar clusters.
The wheel is animated and interactive, showing neighborly populations of documents, making early assessment intuitive even for extremely large datasets.

Conversation Analysis
Using Conversations allows you to visualize email activities within a dataset.
Users can track the flow of information throughout an organization by exploring which emails have been sent to whom and determining which email domains have been most accessed.

Communication Analysis
Who said what to whom?
Brainspace’s communication analysis view adapts to any active query and provides interactive exploration of email conversations, including:
- Interactive Social Graph
- To, CC and BCC filtering
- Sender/Recipient Volume
- Top relationships
- Top Terms
- Alias Consolidation
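Two items from the list above, sender/recipient volume and top relationships, reduce to counting over message headers. The following is a minimal sketch of that counting, not Brainspace's implementation; it assumes each message is a plain dictionary with `from`, `to`, `cc` and `bcc` keys, and the sample addresses are made up.

```python
from collections import Counter

RECIPIENT_FIELDS = ("to", "cc", "bcc")


def recipients(message, fields=RECIPIENT_FIELDS):
    """Yield every recipient address in the selected header fields."""
    for field in fields:
        yield from message.get(field, [])


def sender_recipient_volume(messages):
    """Return (sent, received) counters keyed by address."""
    sent, received = Counter(), Counter()
    for message in messages:
        sent[message["from"]] += 1
        for address in recipients(message):
            received[address] += 1
    return sent, received


def top_relationships(messages, n=3, fields=RECIPIENT_FIELDS):
    """Return the n most frequent (sender, recipient) pairs."""
    pairs = Counter()
    for message in messages:
        for address in recipients(message, fields):
            pairs[(message["from"], address)] += 1
    return pairs.most_common(n)


if __name__ == "__main__":
    sample = [  # hypothetical messages
        {"from": "alice@example.com", "to": ["bob@example.com"], "cc": ["carol@example.com"]},
        {"from": "alice@example.com", "to": ["bob@example.com"]},
        {"from": "bob@example.com", "to": ["alice@example.com"]},
    ]
    print(sender_recipient_volume(sample))
    print(top_relationships(sample, n=1))
```

Passing `fields=("to",)` restricts the counts to direct recipients, which mirrors the To, CC and BCC filtering mentioned in the list.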

Advanced Document Classification
Predictive Coding
Brainspace’s Predictive Coding uses our patented machine learning technology together with Logistic Regression and Active Learning to help you review less and decrease your associated costs.
Brainspace gives you more control by allowing you to set your target recall at the beginning and to adjust it by providing feedback on depth for recall performance throughout the process.

Advanced Document Classification
Continuous Multi-Modal Learning (CMML)
The Continuous Multi-Modal Learning, or CMML, workflow can be carried out entirely in Brainspace, and integrates supervised learning with Brainspace’s tagging system.
CMML focuses on finding target documents during training, rather than on producing a predictive model to identify documents for later review.
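The ideas of target recall, review depth and active learning in the Predictive Coding slide above can be made concrete with two small helpers. This is a sketch under stated assumptions and not Brainspace's patented method: it assumes a labelled control set already ranked by descending model score, and it uses uncertainty sampling (scores nearest 0.5) as one common active learning strategy. No classifier is trained here, and the function names are hypothetical.

```python
import math


def depth_for_target_recall(ranked_labels, target_recall):
    """How many top-ranked documents must be reviewed to hit the target.

    ranked_labels: relevance flags ordered by descending model score
    (assumed to come from a labelled control set).
    """
    if not 0 < target_recall <= 1:
        raise ValueError("target_recall must be in (0, 1]")
    total_relevant = sum(bool(flag) for flag in ranked_labels)
    if total_relevant == 0:
        return 0
    # Small epsilon guards against float error pushing ceil up by one.
    needed = math.ceil(target_recall * total_relevant - 1e-9)
    found = 0
    for depth, is_relevant in enumerate(ranked_labels, start=1):
        found += bool(is_relevant)
        if found >= needed:
            return depth
    return len(ranked_labels)


def most_uncertain(scores, k):
    """Uncertainty sampling: the k document ids scored closest to 0.5."""
    return sorted(scores, key=lambda doc_id: abs(scores[doc_id] - 0.5))[:k]


if __name__ == "__main__":
    control_set = [True, True, False, True, False, False, True, False]
    print(depth_for_target_recall(control_set, 0.75))
    print(most_uncertain({"d1": 0.95, "d2": 0.52, "d3": 0.10, "d4": 0.45}, k=2))
```

Raising the target recall pushes the review depth further down the ranking, which is the trade-off between completeness and review cost that the slide describes.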

Fourth Step: Create a Notebook from the Analytics Result and Sync Back to Commvault
- Create Notebook
- Select tag
- Sync

Fifth Step: Take Action in Review Set from Brainspace Tag

Conclusion
Given explosive growth of
- Data volumes
- Number of silos
- Security threats
- Compliance requirements
Considerations
- Develop Information Governance as a core capability
- Align data protection to the needs of the data and the business
- Increase data intelligence
- Drive data visibility across silos
- Automate data policy

Questions? Discussion

Thank you.
PROTECT. ACCESS. COMPLY. SHARE.
COMMVAULT.COM | 888.746.3849 | GET-INFO@COMMVAULT.COM

Patrick McGrath
Director, Solutions Marketing, Content
pmcgrath@commvault.com
@patrickiest
