R For The Analysis Of Clinical Data - Lexjansen


R for the Analysis of Clinical Data
Greg Jones
Oracle Health Sciences
Jun 2018
Copyright 2017, 2018, Oracle and/or its affiliates. All rights reserved. Confidential – Oracle Internal

Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.

R - Agenda
1. R: Open, Available & Extensible, Scalable
2. Clean, Transform, Aggregate Data for R Analysis
3. Use Case 1: CERN (the European Organization for Nuclear Research)
4. Use Case 2: NHS Business Services Authority (the UK National Health Service)
5. Use Case 3: Cross Study Analysis
6. Use Case 4: Healthcare Analysis
7. Use Case 5: R for Machine Learning Analysis
8. R: Regulatory Considerations
9. R In Commercial Applications
10. R: The Future

R - Open, Available
R - Extensible, Scalable
- The growth in the range of inter-connected devices across healthcare represents an exponential growth in the volume of data collected in ever more elaborate Clinical Trials
- This growth in the volume of data presents new challenges for Clinical Data Scientists and requires new solutions and new tools for cross-study analysis
- R is used by a growing number of data analysts inside corporations and academia, whether being used to set ad prices, find new drugs more quickly or fine-tune financial models
- It is also free. R is free for anyone to use and modify, so statisticians, engineers and data scientists can improve the software’s code or write variations for specific tasks
- To meet these demands, Clinical Data Scientists are increasingly choosing open source solutions to leverage the active open source communities of experienced developers and statisticians
- The R scripting language is increasingly popular, supports big data and predictive analytics, and offers the potential to leverage machine learning and artificial intelligence
- Packages written for R add advanced algorithms, richly coloured and textured graphs, and mining techniques to dig deeper into databases, object stores, data lakes and big data sources
- Pharma companies have created customized packages for R to let scientists manipulate their own data during nonclinical drug studies rather than send the information off to a statistician

R is Gaining Popularity
[Chart: Data Science job trends for R (blue) and SAS (orange)]

Life Sciences Warehouse Cloud
[Diagram: Structured Data and Unstructured Data sources, including EHR Data, Labs & LIMS Data, and Scientific Publication Data]
Let’s start with the traditional Data Sources for a Clinical Trial.
There are traditional Data Sources that are typically Structured in nature. These Data Sources represent data in table and column format.
There are also relatively newer, non-traditional Data Sources that can be Unstructured in nature. These Data Sources typically provide data in a format other than tables and columns, for example, large text files that come from Scientific Publications.
Some of these Data Sources, like EHR Data, are Real World Data (RWD) Sources added to the more traditional Data Sources that come from running Clinical Trials. By combining these Data Sources, we have the opportunity to accelerate Clinical Research and conduct it more efficiently.
Also, as you can see, EHR and Device/Wearable Data Sources can be both Structured and Unstructured in nature. Therefore, flexibility is critical.

Life Sciences Warehouse Cloud: Data Factory
[Diagram: Data Factory component of the Life Sciences Warehouse (LSW) Cloud Platform. Labels: Structured Data, Unstructured Data, ECG Data, IxRS Data, EHR Data, Devices/Wearables, Labs & LIMS Data, Data Management Workbench, CSDW/Life Sciences Data, Data Lake, Big Data Cloud Service, Cloud Enabling Technologies]
Data Factory component of LSW Cloud Platform
Enables automation of critical processes for highly productive loading, cleaning, transforming and aggregating Data to prepare for multiple types of downstream analysis by the Reviewer community.
Several Oracle products in the Data Factory component
First is Life Sciences Hub (LSH). This product is the Structured Data Source Data Warehouse. It maintains key aspects for a regulated Clinical Trial Data Warehouse including Lifecycle Management, Versioning, Auditing, Traceability & Pooling capabilities.
Data Management Workbench (DMW)
- Highly optimized to drive low cost, efficient Submission Preparation Processes
- Data Managers are primary Users to prepare Data for multiple types of downstream analysis
- Works closely together with LSH to support the full lifecycle for an acquisition and transformation platform
Big Data Cloud Service is the last major piece. This component is optional and can be added to the Data Factory at any time when there’s a need to support non-traditional Unstructured Data Sources such as Unstructured EHR Data.
Typical Unstructured Data Source Use Cases result in “processing” the Unstructured Data Source and driving a reduced or summarized result into DMW for inclusion in the Submission Preparation Processes.
Combining the Big Data Cloud Service with DMW and LSH gives you a federated environment where we “use the right tools for the right job” to control and manage your Structured and Unstructured Data Sources together for rapidly cleaning, transforming, and aggregating all the Data Sources in preparing them for Analysis.

Life Sciences Warehouse Cloud: Analytics and Reporting Platform
[Diagram: LSW Cloud Platform with the Data Factory and the Analytics and Reporting Platform. Labels: Structured Data, ECG Data, IxRS Data, EHR Data, Devices/Wearables, Labs & LIMS Data, Data Management Workbench, Life Sciences Hub, Data Lake, Big Data Cloud Service, EDC & Lab Recon, Line Listings, SDTM Datasets, Connected Device Data Maps, REVIEW Models, AI & ML, AE Recon, Visualization connector, Alerts]
Analytics and Reporting Platform
Perform the Data Review Activities across the Clinical Development Organization.
Support the various roles across the Organization including: Data Managers, Safety Analysts, Medical Reviewers, etc.
Use Oracle Analytics Cloud Service and LSW Cloud Platform to build and deliver the various Reports, Visualizations, and Dashboards.

Life Sciences Warehouse Cloud: Discovery Lab
[Diagram: LSW Cloud Platform with the Data Factory, the Analytics Platform and the Discovery Lab. Labels: Structured Data, Unstructured Data, ECG Data, IxRS Data, EHR Data, Devices/Wearables, Labs & LIMS Data, Data Management Workbench, Life Sciences Hub, Data Lake, Big Data Cloud Service, KRIs, SDTM Datasets, REVIEW Models, Visualization connector, AE Recon, Alerts, Discovery Lab, R, Data Visualizer]
The Discovery Lab - generally a new capability in a modern Analytics Platform
Enables exploration of massive volumes of historical Clinical Trial & Real World Data Sources by Data Scientists using various tools for Visualization, Machine Learning, etc. to derive insights for future trial design.

Use Case 1 - CERN
- Established in 1954, CERN (the European Organization for Nuclear Research) is the largest particle-physics laboratory in the world
- CERN uses big data, cloud computing, and analytics to help researchers unravel the mysteries of the universe
- The CERN team is building Machine Learning models to predict potential failures
- These models use R and run in the Oracle Database

Use Case 2 - NHS Business Services Authority
- The NHS (UK National Health Service) is the largest and oldest single-payer healthcare system in the world
- The NHS Business Services Authority learned to make the most of its data thanks to analytics tools, and identified huge potential savings

Use Case 2 - NHS Business Services Authority
Data analysis using R:
- The NHS was able to identify potential savings of over GBP 1 billion
- Providing accurate, reliable data back to clinicians and policy makers has enabled antibiotic prescribing to be reduced by 7%
“The overall solution is very fast, and our investment very quickly provided value. We can now do so much more with our data, resulting in significant savings for the NHS as a whole.”
Nina Monckton, NHS Business Services Authority

Use Case 3 - Cross Study Analysis
The team used the Oracle R Distribution 3.1.1 and RStudio to prepare the analysis:
1. Connect to standardized DMW data
2. Combine data across multiple studies using rbind
3. Train predictive analytics algorithms
4. Create complex visualizations in R
5. Apply term analysis
6. Export to SAS V5 xpt
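The pooling step in the list above can be illustrated with a minimal base R sketch. The data frames, study identifiers and column names below are invented for illustration and are not taken from the presentation; the only assumption that matters is that both extracts already share the same standardized columns, which is what rbind requires. The frequency table at the end is a very simple stand-in for term analysis, not the team's actual method.

```r
# Hypothetical, already-standardized adverse event extracts from two studies.
# Both data frames must have identical column names for rbind() to work.
study_a <- data.frame(
  STUDYID = "STUDY-A",
  USUBJID = c("A-001", "A-002", "A-003"),
  AETERM  = c("HEADACHE", "NAUSEA", "HEADACHE"),
  stringsAsFactors = FALSE
)
study_b <- data.frame(
  STUDYID = "STUDY-B",
  USUBJID = c("B-001", "B-002"),
  AETERM  = c("NAUSEA", "DIZZINESS"),
  stringsAsFactors = FALSE
)

# Stack the rows of both studies into one pooled data set.
pooled <- rbind(study_a, study_b)

# Count how often each reported term occurs in each study.
counts <- table(pooled$AETERM, pooled$STUDYID)
print(counts)
```

Keeping STUDYID as a column in the pooled data is what makes per-study comparisons possible after the rows have been combined.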

Use Case 4 - Transaction Analysis
- Historical observation of study performance
- Time series algorithms to uncover trends
- Predictive analysis of system load
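The slide does not say which time series algorithms were used. As a hedged illustration of the general idea, the sketch below builds a synthetic monthly "system load" series (all numbers are invented), separates trend from seasonality with decompose, and then fits a simple linear model with a trend term and monthly effects to predict the next six months. This is one basic approach under stated assumptions, not a description of the original analysis.

```r
# Synthetic monthly transaction counts: upward trend + yearly cycle + noise.
set.seed(42)
months <- 48
idx <- seq_len(months)
load <- 1000 + 15 * idx + 120 * sin(2 * pi * idx / 12) + rnorm(months, sd = 30)
load_ts <- ts(load, start = c(2014, 1), frequency = 12)

# Classical additive decomposition into trend, seasonal and random components.
parts <- decompose(load_ts)

# Regression with a linear trend and one effect per calendar month.
history <- data.frame(
  t     = idx,
  month = factor(as.integer(cycle(load_ts)), levels = 1:12),
  load  = as.numeric(load_ts)
)
fit <- lm(load ~ t + month, data = history)

# Predict the six months that follow the observed history.
future_t <- months + 1:6
future <- data.frame(
  t     = future_t,
  month = factor((future_t - 1) %% 12 + 1, levels = 1:12)
)
forecast_next <- predict(fit, newdata = future)
print(round(forecast_next))
```

The `parts$trend` component shows the long-run direction of the load, while `forecast_next` gives a point estimate of near-term load that could feed capacity planning.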

Use Case 5 - Healthcare Predictions: Re-admission Rates, Heart Disease Likelihood
- Machine Learning can predict hospital readmission rates
- Machine Learning can predict likelihood of heart disease
https://www.youtube.com/watch?v IichF5pBt
cs/data-visualization/library.html
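The slide does not name a specific model. One common baseline for this kind of yes/no prediction is logistic regression, sketched below in base R on fully synthetic patient data. The variable names, coefficients and the 0.5 threshold are assumptions made for illustration only and have no clinical meaning.

```r
# Synthetic patient records; every value here is simulated, not real data.
set.seed(1)
n <- 500
patients <- data.frame(
  age              = round(runif(n, 30, 90)),
  prior_admissions = rpois(n, 1.5),
  length_of_stay   = round(rexp(n, rate = 1 / 5)) + 1
)

# Simulate the outcome so that risk rises with each predictor.
true_logit <- -4 + 0.03 * patients$age +
  0.5 * patients$prior_admissions + 0.05 * patients$length_of_stay
patients$readmitted <- rbinom(n, size = 1, prob = plogis(true_logit))

# Hold out the last 100 patients to evaluate the model.
train <- patients[1:400, ]
test  <- patients[401:500, ]

# Fit a logistic regression on the training set.
model <- glm(readmitted ~ age + prior_admissions + length_of_stay,
             data = train, family = binomial())

# Predicted readmission probability for each held-out patient.
risk <- predict(model, newdata = test, type = "response")

# Classify with an illustrative 0.5 cut-off and measure accuracy.
predicted <- as.integer(risk > 0.5)
accuracy <- mean(predicted == test$readmitted)
print(accuracy)
```

The same pattern applies to a heart disease outcome by swapping the response variable and predictors; more flexible learners can replace glm once a baseline like this is in place.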

Regulatory Considerations
- The FDA's Statistical Software Clarifying Statement declares that any suitable software can be used in a regulatory submission
  - XPT file format is an open standard, not restricted to SAS
  - XPT files can be read into R with the read.xport function, and data can be exported with the write.xport function in the SASxport package
  - RStudio, a popular editor for R, uses the haven package to import SAS datasets
- The R Foundation also provides guidance on how R complies with other FDA regulations
  - Regulatory Compliance and Validation Issues - A Guidance Document for the Use of R in Regulated Clinical Trial Environments.
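As a minimal sketch of an XPT round trip, the example below uses the haven package mentioned above (its write_xpt and read_xpt functions) rather than SASxport, and assumes haven is installed. The demographics data and the member name "DM" are hypothetical. Version 5 transport files limit variable and member names to 8 characters, so the names are kept short.

```r
# Assumes the 'haven' package is installed: install.packages("haven")
library(haven)

# Hypothetical demographics data with short, XPT v5 compatible names.
dm <- data.frame(
  USUBJID = c("A-001", "A-002", "B-001"),
  AGE     = c(54, 61, 47),
  SEX     = c("F", "M", "F"),
  stringsAsFactors = FALSE
)

# Write a SAS transport file in version 5 format to a temporary location.
xpt_path <- file.path(tempdir(), "dm.xpt")
write_xpt(dm, xpt_path, version = 5, name = "DM")

# Read the transport file back into R and inspect it.
dm_back <- read_xpt(xpt_path)
print(dm_back)
```

A round trip like this is a quick check that no columns or values are lost before a file is handed to a downstream consumer.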

R in Commercial Applications
- Oracle Analytics Cloud and Data Visualization Desktop use R for their Advanced Analytics and Machine Learning functions, allowing users to leverage existing R packages and upload their own to power their analyses
- Oracle R Distribution - Oracle's supported redistribution of open source R, provided as a free download from Oracle, enhanced with high performance linear algebra libraries
- ROracle - An open source R package, maintained by Oracle and enhanced to use the Oracle Call Interface (OCI) libraries to handle database connections, providing a high-performance interface to Oracle Database
- Oracle R Enterprise - ORE makes the open source R statistical programming language and environment ready for the enterprise with scalability, performance, and ease of deployment
- Oracle R Advanced Analytics for Hadoop - High performance native access to the Hadoop Distributed File System (HDFS) and MapReduce programming framework for R users

The Future
- R use is clearly growing across many industries and it is seen as one of the key tools for today’s Clinical Data Scientist
- R is embedded in many leading industry solutions
- R can power Machine Learning and Artificial Intelligence
- The availability of a commercial distribution of R can reassure users even in highly regulated industries
- Confirmation from the FDA that it can be used to analyse clinical studies leaves no barriers to R adoption across the clinical trial lifecycle and beyond
