Helium: A Data Driven User Tool For SAR Analysis - ChemAxon


Helium: A Data Driven User Tool for SAR Analysis
May 17th 2011
Karen Worsfold

Today’s presentation – Questions to be answered
Why did GSK undertake development of Helium?
How was the Helium idea conceived?
How did GSK develop Helium to ensure user satisfaction and acceptance?
What is the architecture of Helium?
What functionality does Helium provide?
What synergies does GSK see with combining Excel, JChem for Excel and Helium?
What is Helium’s current status and future plans?

Where we came from
IT Resources Supporting SAR in Discovery
[Timeline figure: 90’s, 00’s, Today]

Where we are headed
IT Resources Supporting SAR in Discovery
[Timeline figure: 90’s, 00’s, Today, Future? Labels: Tibco Spotfire, Excel 2007, JChem for Excel]

The Birth of ‘Helium’
One-stop shopping for all of your SAR data needs
Giving the user control over what data they wanted rather than needing to know where the data was stored
First prototype was a blank screen: users had too much flexibility and didn’t know where to start!
Next incarnation was integrated with Tibco Spotfire
  – Helium to gather the data and Spotfire to view the data in multiple ways (tables, scatterplots, bar charts, etc.)
  – Spotfire had the ability for plug-in panels to be created, e.g. a Structure Viewer
  – Spotfire supported large data sets
  – At the time, cost was not a significant factor

The ‘Cold Turkey’
Using Agile development techniques, we ran a Proof of Concept to see if Helium could also include a forms view, allowing us to replace ISIS Base
  – Gave it to Discovery scientists and asked them to stop using their legacy tools and just use Helium with Spotfire
As they identified a task they needed to do, we developed the Helium feature to enable it
  – Significant improvement was made in usability
  – Data retrieval functionality and usability of Helium were received ‘Very Positively’
  – Less positive feedback on the attempt to integrate a “forms” view
  – Still received resistance from the average bench chemist due to Spotfire

What next for ‘Helium’?
Decision to provide a simple grid view tool, a more complex visualisation tool and a forms view tool
  – Helium in Excel
  – Helium in Tibco Spotfire
  – Instant JChem
Decision on what tool to use was based on what users wanted to do with the data and not what data they needed

‘Helium’, the Sequel
In 2009, GSK purchased the ChemAxon suite of tools
  – JChem for Excel
  – Instant JChem
  – JChem Cartridge
And so Helium was to be integrated within Excel and use the JChem for Excel API to manage structures
Benefits
  – JChem for Excel enhances a familiar and comfortable tool
  – Instant JChem provides a flexible forms tool
  – JChem Cartridge underpins our chemistry web services

The Development Approach – Keys to Success
Internally
  – User interaction and ownership
  – Identified Lead End User (Senior Researcher from Discovery)
  – Interactive End User Group covering all disciplines and sites within R&D
  – Agile development approach with regular deliverables
  – Weekly End User Group meetings
  – Extended End Users engaged at key points
  – Accessed “Live” data, making the tool immediately useful
  – ‘Viral’ release of the software to R&D
Externally
  – Superb support and response from ChemAxon
  – Collaborative relationship in solving problems
  – Excellent communication between the two companies

Helium Architecture
Web Services are used to supply any non-biological data
Biological Data is accessed directly through Oracle
[Architecture diagram labels: Chemistry Lookup, Structure Search, Biological Data, Translation Service, Property Information Service, Derived Property Service]
Data is accessed through the company WAN
Clients can be in any R&D location, and can be from many different disciplines (Biology, Chemistry, Computational Chemistry, Compound Management)
Dataset sizes can range from tens of compounds or structures to tens of thousands

Keeping Helium up to date
In Excel
  – Microsoft ClickOnce for managing the customization
  – Client code regularly checks for updates and installs them
  – Ensures that update installation is simple
  – Keeps installations of the software from “falling behind”
In Spotfire
  – Each log-on to the server checks for updates and prompts the user to install

The components of Helium
Microsoft Excel 2007: General functionality for formatting, sorting, presentation and creating equations
JChem for Excel: Chemical structure presentation, file import and export, etc.
Helium: Access to GSK web services, GSK biological data and consistency with legacy data

Helium User Interface
[Screenshot callouts:]
Helium Ribbon: for non-data-specific tasks
JChem Ribbon: to access JC4XL functionality
Datatype-sensitive Task Panel: this display appears when a “structure” data type is selected
Worksheet with datatyped Helium data

Helium Functionality
Datatyping assigned by regular expressions
Generic data types (e.g. Project Id) can be manually set
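As a rough illustration of datatyping by regular expressions, the sketch below tries a list of patterns in order and returns the first data type that matches. The patterns, the identifier formats (such as `GSK` followed by digits) and the function name `infer_datatype` are hypothetical placeholders; the slides do not give the actual expressions Helium uses.

```python
import re

# Hypothetical patterns for illustration only; the real Helium
# expressions and identifier formats are not given in the slides.
DATATYPE_PATTERNS = [
    ("Compound Number", re.compile(r"^GSK\d+[A-Z]?$")),
    ("Lnb Ref", re.compile(r"^N\d+-\d+-\d+$")),
    # Very loose SMILES check: only tests for characters that SMILES allows.
    ("SMILES", re.compile(r"^[A-Za-z0-9@+\-\[\]()=#$/\\%.]+$")),
]


def infer_datatype(value, manual_override=None):
    """Return the first matching data type name, or None if nothing matches.

    A manual override wins, mirroring the slide's note that generic
    data types (e.g. Project Id) can be set by hand.
    """
    if manual_override is not None:
        return manual_override
    text = value.strip()
    for name, pattern in DATATYPE_PATTERNS:
        if pattern.match(text):
            return name
    return None
```

Pattern order matters here: the loose SMILES pattern would also accept most identifiers, so the more specific patterns are tried first. That ambiguity is one plausible reason a generic type would need to be set manually.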

Helium Functionality List
Tasks available for each data type:
Project ID
  – Get Compound Numbers
  – Validate Project ID
Compound Number
  – Get Biological Data
  – Get Critical Results
  – Get LNB Refs
  – Get Parent
  – Get SMILES
  – Get Structure
  – Get Structure for Parent Compound
  – Get Versions
  – Identify duplicate structures
  – Validate Compounds
SMILES
  – Calculate Derived Properties
  – Calculate Simple Properties
  – Canonicalize
  – Get Exact Structures
  – Get Similar Compounds
  – Remove Isotopes
  – Remove Salts / Solvates
  – Remove Stereochemistry
  – Search Sub-structure in GSK databases
  – Standardize Valences
  – Sub-structure search within table
External ID
  – Get SMILES
  – Get Structure
Lnb Ref
  – Get Compound Numbers
  – Get Registration Information
  – Get SMILES
  – Get Structure
  – Get Versions
  – Validate LnbRefs
User … (data type name truncated in the source)
  – Get Registered Compounds LNB Refs
Structure
  – Calculate Derived Properties
  – Calculate Simple Properties
  – Get Exact Structures
  – Get Similar Compounds
  – Remove Isotopes
  – Remove Salts / Solvates
  – Remove Stereochemistry
  – Sub-structure Search in GSK databases
  – Standardize Valences
  – Sub-structure Search within table
Parent Compound Number
  – Get Parent
  – Get SMILES
  – Get Structures
  – Get Versions
  – Validate Compounds
Null
  – Get Biological Data
  – Get Similar Compounds
  – Search Sub-structure in GSK databases
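One way to picture a datatype-sensitive task list is as a lookup from data type to the tasks offered for it. The sketch below uses a few entries from the list above; the dictionary layout, the use of `None` for the "Null" data type and the function name `tasks_for` are assumptions for illustration, not a description of how Helium is implemented.

```python
# Task names are taken from the slide; the structure is a hypothetical sketch.
TASKS_BY_DATATYPE = {
    "Project ID": ["Get Compound Numbers", "Validate Project ID"],
    "External ID": ["Get SMILES", "Get Structure"],
    "Parent Compound Number": [
        "Get Parent",
        "Get SMILES",
        "Get Structures",
        "Get Versions",
        "Validate Compounds",
    ],
    # "Null" on the slide: tasks offered when no data type is assigned.
    None: [
        "Get Biological Data",
        "Get Similar Compounds",
        "Search Sub-structure in GSK databases",
    ],
}


def tasks_for(datatype):
    """Return the tasks to show for a column's data type (empty if unknown)."""
    return TASKS_BY_DATATYPE.get(datatype, [])
```

Selecting a column typed as "Project ID" would then show only the two Project ID tasks, while an untyped selection falls back to the "Null" tasks.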

Helium Performance (worst case conditions)
Activity                                                     Time
Retrieve 33,885 Compound Numbers from 110 Project Ids        28"
Retrieve SMILES from 33,885 Compound Numbers                 5' 12"
Retrieve Structures from 33,885 Compound Numbers             26' 28"
Canonicalize 33,885 Structures                               8' 05"
Remove Salts and Solvates from 33,885 Structures             9' 35"
Remove Stereochemistry from 33,885 Structures                10' 05"
Get Lot Ids from 33,885 Compound Numbers                     11' 40"
Validate 33,885 Compound Numbers                             11' 38"
Check for Duplicate Structures for 33,885 Compound Numbers   9' 12"
Retrieve Parent Compound Number for 33,885 Compound Numbers  8' 30"
Retrieve MF, MW, BEW for 33,885 Compound Numbers             37' 47"
This performance level represents a significant improvement over previous tools!
Typical SAR datasets are much smaller
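To compare these timings on a common scale, each one can be converted to seconds and divided into the 33,885 items processed. The helper names below (`to_seconds`, `items_per_second`) are illustrative only, and the calculation assumes each activity processes exactly 33,885 items.

```python
ITEMS = 33885  # compounds or structures per activity, from the slide


def to_seconds(stamp):
    """Convert a time written as 5' 12" (minutes, seconds) or 28" to seconds."""
    minutes, _, seconds = stamp.strip().rstrip('"').rpartition("'")
    return int(minutes or 0) * 60 + int(seconds)


def items_per_second(stamp, items=ITEMS):
    """Average throughput for one activity."""
    return items / to_seconds(stamp)


# Fastest and slowest rows of the table.
fastest = items_per_second('28"')      # compound numbers from project ids
slowest = items_per_second("37' 47\"")  # MF, MW, BEW retrieval
```

On this measure the retrieval of compound numbers runs at over a thousand items per second, while the MF, MW, BEW retrieval averages roughly fifteen per second.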

Helium, JChem and Excel – the Possibilities
With the various tools available within Excel 2007, Helium and JChem for Excel, the researcher has an extensive, flexible tool for interrogating data:
Excel conditional formatting gives great highlighting options for identifying patterns in your data.

Helium, JChem and Excel – the Possibilities
With the various tools available within Excel 2007, Helium and JChem for Excel, the researcher has an extensive, flexible tool for interrogating data:
Easy column filtering
For example, after R-Group decomposition, you can convert structures to SMILES and then use the SMILES as filters in Excel to segregate your data
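The same filtering idea can be sketched outside Excel: once each R-group is a SMILES string, rows can be segregated by plain string comparison. The compound identifiers, R-group SMILES and activity values below are made-up placeholders, and the function names are hypothetical.

```python
from collections import defaultdict

# Placeholder rows; none of these values come from the presentation.
rows = [
    {"compound": "CMPD-1", "R1": "C", "activity": 6.1},
    {"compound": "CMPD-2", "R1": "CC", "activity": 5.4},
    {"compound": "CMPD-3", "R1": "C", "activity": 7.0},
]


def filter_rows(rows, column, smiles):
    """Keep only rows whose R-group column equals the given SMILES string."""
    return [row for row in rows if row[column] == smiles]


def group_by_rgroup(rows, column):
    """Segregate rows into one list per distinct R-group SMILES."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[column]].append(row)
    return dict(groups)


methyl_rows = filter_rows(rows, "R1", "C")
groups = group_by_rgroup(rows, "R1")
```

String comparison only groups identical R-groups reliably if the SMILES are written consistently, for example after canonicalization.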

Helium, JChem and Excel - the Benefits
Bringing new capabilities to be included in SAR
Easily allows comparison of assay data vs. off-target activity vs. liability
A single, non-threatening interface to Discovery data
Most researchers are familiar with Excel and its functionality
Excel files are universally portable and support data interchange
One application to support and maintain vs. the legacy list of multiple tools
We benefit from the advancement of Excel in the future (Excel 2010)

Helium in Spotfire

Current Deployment Status
Helium for Excel Version 2.1 is currently available, with functionality to replace 6 legacy applications, and is being used by 1200 researchers.
Helium for Spotfire was released in Feb 2011 and has been installed by 500 users.

Bio-IT World Best Practices Awards competition
Winner of Knowledge Management Best Practices: GlaxoSmithKline
Helium in Excel: A New Paradigm for Data Insight
(nominated by Ceiba web5236694.htm

Future Plans
Some of the plans/ideas we are working on
  – Implementing Instant JChem to replace ISIS H-Views for forms-based data delivery
  – Working with ChemAxon to improve copy/paste functionality of OLE objects (especially ChemDraw) to other Microsoft applications
  – Making Helium available outside of GSK
  – Enhancing Helium to access more data sources such as inventories keyed on barcodes
  – Helium in a Browser

Thank you for your attention!
Questions?
