High Throughput Sequencing Facility

High Throughput Sequencing Facility
Piotr Mieczkowski, PhD
Department of Genetics, School of Medicine at UNC

Outline of the talk
- Organization of the Genomic Core at the Polish Academy of Science (IBB) in 1998 (Poland).
- User perspective: Microarray Facility at the University of North Carolina (USA).
- High Throughput Sequencing Facility at the University of North Carolina (USA).
- Discussion

Beginning of Genomics (IBB, PL)
Genomic Core Lab in IBB (1997)
We had machines and people dedicated to making the facility successful, but we had no funding.

Initial start-up for facility development (IBB, PL)
[Diagram labels: Users; Ideas for first research project]

Initial project for Genomics Lab (IBB, Poland) in 1997
- We are too small to compete with European or US institutions.
- We do not have the capacity or funding to sequence genomes.
- We need to find a niche that we can develop in the future.
- We need to increase our visibility and collaboration with other Polish research institutes.
- Sequencing plasmids will be the best option for a small sequencing genomics lab.

Structure of Facility
[Diagram labels: personnel; service contracts; amortization; equipment]

Conclusions
- Institutional support is necessary for the successful start-up of a new facility.
- Institutional investment in one relevant project was a good start for the facility and brought it to the attention of all investigators.
- A federal institution supporting many aspects of the facility's operation (salaries, amortization of equipment) gives an opportunity for better pricing, even if the cost of reagents is high (higher than in the US).
- The same pricing for services was offered to all customers, both domestic and foreign.

UNC

Microarray printing
- Microarrays were an emerging technology in 1995-2004.
- Availability of commercial microarray slides was limited by price; only very rich labs could afford to buy them.
- UNC established a Microarray facility equipped with a medium-size printer and scanner.
- Users had to provide DNA or oligonucleotides for printing and pay 50 per slide for service and glass slide processing (commercial glass slide for printing: 15).
- We built our own microarray printer to reduce the cost of our experiments.

Array Platform
The array platform we used was based on PCR products (containing the Open Reading Frames, or ORFs, and intergenic regions) printed on poly-lysine slides.
Below: image of the home-printed microarray slide covering the entire Saccharomyces cerevisiae genome.
About 12,100 spots cover the genome at 1000 bp resolution.

Conclusions
- The facility did not provide the expected expertise in production of custom microarrays.
- It was more cost-effective to prepare amplicons and build our own printer than to use the facility service.

High Throughput Sequencing Facility
Mission
- Be one of the best high-throughput genomic centers at a public university.
- Support a cutting-edge research environment by providing broad, affordable access to the latest genomic technology.
- Help researchers to design and interpret genomic experiments; promote innovation by developing new techniques.
- Integrate the facility with other technology centers at the University of North Carolina; form partnerships within and beyond the UNC system.

Initial start-up for facility development
[Diagram labels: Small grants, 15 per year; Is it right t; Users]

Structure of Facility Funding in equipment

Structure of Facility Funding in ment

An increased number of instruments requires more investment in infrastructure (2008 to 2013)
- Electrical power
- Efficient air-conditioning system
- Very fast Internet connection

Upgrades of equipment are important.
Do not be too attached to old equipment.

UNC HTSF 2014
- 6 HiSeq 2500
- 4 HiSeq 2000
- 1 PacBio
- 1 PGM Ion Torrent
- 2 Ion Proton
- 3 MiSeq
Also on campus:
- 454 (Microbiome)
- 454 jr. (Viral genomics)
- 2 MiSeq
- 2 NextSeq500

The number of sequencers is always associated with an increasing number of large studies (2008 to 2013)
- TCGA (8000 RNAseq)
- NEXUS
- UNCgenes
- UNCseq (1000 Exome Capture)
- NCGENES (1000 Exome Capture)
- LIB, NEC (3500 low-pass human genomes)
Total around 6000 samples in 2013

Projects
- TCGA (The Cancer Genome Atlas project): Gene Expression Patterns in Human Tumors Identified Using Transcript Sequencing. The goal of this project is the study of genome-wide transcript regulation with chromatin organization to provide a critical portrait of the cancer genome that can be integrated with other data, including mutations and copy number events.
- LIB, NEC: Deep Sequencing Studies for Cannabis and Stimulant Dependence. The goal of this project is to identify sequence variants that affect cannabis and stimulant dependence.
- NC GENES: North Carolina Clinical Genomic Evaluation by NextGen Exome Sequencing. The goal of this project is to establish a set of best practices to guide future implementation of robust genomic technologies.
- NC NEXUS: North Carolina Newborn Exome Sequencing for Universal Screening. The goal of this project is to identify, confront and overcome challenges to implementing deep sequencing technology to enhance current newborn screening.
40% of sequencing capacity for medium and small projects

World Rank 2011: 14th position
World Rank 2013: 15th position
Second largest NGS facility on East .org/

Cores by Recharge
[Pie chart: share of recharge by UNC core facility. HIGH THROUGHPUT GENO: 15% in 2011, 30% in 2013. GENE THERAPY VECTOR: 11%. TELSA IMAG SCANNER: 7%. The remaining cores are each labeled between 1% and 7%.]

Capacity of HTSF
3,500,000,000,000 bp/week
UNC HTSF
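To put the weekly capacity figure in perspective, the sketch below converts it into an approximate number of human genomes per week at a chosen mean coverage. The genome size (about 3.1 Gbp) and the coverage values are assumptions for illustration only; they do not come from the slides, and the calculation ignores duplicates, failed runs and quality filtering.

```python
# Rough throughput arithmetic. Only WEEKLY_CAPACITY_BP comes from the slide;
# the genome size and coverage values are illustrative assumptions.
WEEKLY_CAPACITY_BP = 3_500_000_000_000  # 3.5 Tbp per week (from the slide)
HUMAN_GENOME_BP = 3_100_000_000         # assumed approximate human genome size


def genomes_per_week(coverage: float,
                     capacity_bp: int = WEEKLY_CAPACITY_BP,
                     genome_bp: int = HUMAN_GENOME_BP) -> float:
    """Return how many genomes fit into one week of raw output at a mean coverage."""
    if coverage <= 0:
        raise ValueError("coverage must be positive")
    return capacity_bp / (genome_bp * coverage)


if __name__ == "__main__":
    # 4x stands in for a low-pass design, 30x for a deep whole-genome design.
    for cov in (4, 30):
        print(f"{cov}x coverage: roughly {genomes_per_week(cov):.0f} genomes per week")
```

The function is deliberately a single division so that the assumptions stay visible and can be swapped for a facility's own numbers.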

Bioinformatics pipeline
[Diagram labels: Data Server; Data Analysis Cluster (three instances); Tapes, 3 years; Data Storage, 6-12 months]

Unified bioinformatics
[Diagram labels: Center; HTSF; Campus Infrastructure; RENCI; Primary analysis; Infrastructure; Specific projects]

Error rate
The more samples are processed, the greater the chance of a sample swap.
Large projects use SNP genotyping (microarray, Sanger sequencing, Sequenom) to check sample identity.
1. A Laboratory Information Management System (LIMS) is becoming necessary to control the flow of samples and reagents in the facility, together with organized storage for incoming and processed samples.
2. Automation is essential for maintaining growth and reproducibility without an increase in sample swap cases.
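The SNP-based identity check mentioned above can be sketched as a genotype concordance test: calls derived from the sequencing data are compared with calls from an independent assay for the same sample ID, and low agreement flags a possible swap. This is a minimal illustration, not the facility's actual procedure; the function names, the data layout and the 0.9 threshold are all hypothetical.

```python
from typing import Dict, List, Optional

# SNP id -> unphased genotype call, e.g. "AG". Layout is an assumption.
Genotypes = Dict[str, str]


def concordance(a: Genotypes, b: Genotypes) -> Optional[float]:
    """Fraction of shared SNPs with the same unphased genotype, or None if none are shared."""
    shared = [snp for snp in a if snp in b]
    if not shared:
        return None
    # Sorting the alleles makes "AG" and "GA" compare as equal.
    matches = sum(1 for snp in shared if sorted(a[snp]) == sorted(b[snp]))
    return matches / len(shared)


def flag_possible_swaps(seq_calls: Dict[str, Genotypes],
                        ref_calls: Dict[str, Genotypes],
                        threshold: float = 0.9) -> List[str]:
    """Return sample IDs whose sequencing calls disagree with the reference assay.

    Samples with no reference genotypes or no shared SNPs are flagged too,
    because their identity cannot be confirmed.
    """
    flagged = []
    for sample_id, calls in seq_calls.items():
        ref = ref_calls.get(sample_id)
        score = concordance(calls, ref) if ref is not None else None
        if score is None or score < threshold:
            flagged.append(sample_id)
    return flagged
```

In practice the threshold would need tuning to the genotyping error rate of both assays, and a flagged sample is a prompt for manual review rather than proof of a swap.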

1. LIMS
[Table labels: Facilities; LIMS System; Bioprocessing; HTSF; BSP LIMS; LIB LIMS; NEC LIMS; Clarity LIMS (GenoLogics)]

BSP LIMS Library Preparation site

2. Automation
- Transfer from tubes to plate
- Fragmentation
- Quality Control
- Library prep processing: 96 samples standard protocol; 1-48 samples standard protocol; custom protocol or manual prep
- Quality Control
- Pooling libraries
- Sequencing
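The first step of the workflow above, transferring samples from tubes to a plate, depends on a fixed and recorded mapping from tube order to well position. The sketch below assigns wells on a standard 96-well plate (rows A-H, columns 1-12). The column-wise default fill order and the function names are assumptions for illustration; the slides do not say which order the facility uses.

```python
import string
from typing import Dict, Sequence

ROWS = string.ascii_uppercase[:8]  # "ABCDEFGH"
N_COLS = 12                        # standard 96-well plate: 8 rows x 12 columns


def well_for_tube(index: int, column_major: bool = True) -> str:
    """Map a zero-based tube position to a well name such as "A1" or "H12"."""
    if not 0 <= index < len(ROWS) * N_COLS:
        raise ValueError("index must be between 0 and 95")
    if column_major:
        # Fill A1, B1, ..., H1, then A2, B2, ...
        col, row = divmod(index, len(ROWS))
    else:
        # Fill A1, A2, ..., A12, then B1, B2, ...
        row, col = divmod(index, N_COLS)
    return f"{ROWS[row]}{col + 1}"


def plate_map(tube_ids: Sequence[str], column_major: bool = True) -> Dict[str, str]:
    """Return tube ID -> well name for up to 96 tubes, in the order given."""
    if len(set(tube_ids)) != len(tube_ids):
        raise ValueError("tube IDs must be unique")
    return {tube: well_for_tube(i, column_major) for i, tube in enumerate(tube_ids)}
```

Generating the map in software and storing it alongside the sample records is one way to keep the tube-to-well assignment traceable when a swap is suspected later.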

Automation in HTSF
Sonication
Medium and large scale library prep and DNA capture system: Caliper Sciclone system
- 96 tip pipetting head
- Agilent Capture protocols
- KAPA DNA library protocol
- Illumina DNA/RNA protocols

Automation in HTSF
Tecan Freedom Evo system (8 tips)
- 2x48 (96) samples per week: DNA library prep
- Automated sample normalization steps
- PCR and qPCR preparation
- Integrated fluorescent plate reader Infinite F200
- Can be adapted to small and medium scale protocols for Illumina and Ion Torrent
- We have all necessary components for DNA/RNA extraction using Qiagen kits
Tecan Freedom Evo system (8 tips and 96 tips)
- For Illumina and Ion Torrent protocols
- Experimental platform for new protocols
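The automated sample normalization step listed for the Tecan system comes down to a dilution calculation (C1 x V1 = C2 x V2) for each well. The sketch below shows that arithmetic under simple assumptions: concentrations in ng/uL, volumes in uL, and a hypothetical function name. It is not the instrument's own script and does not account for minimum pipetting volumes or dead volume.

```python
from typing import Tuple


def normalization_volumes(stock_ng_per_ul: float,
                          target_ng_per_ul: float,
                          final_volume_ul: float) -> Tuple[float, float]:
    """Return (sample_ul, diluent_ul) that dilute a stock to the target concentration.

    Uses C1 * V1 = C2 * V2, where V2 is the final volume in the destination well.
    """
    if min(stock_ng_per_ul, target_ng_per_ul, final_volume_ul) <= 0:
        raise ValueError("concentrations and volume must be positive")
    if stock_ng_per_ul < target_ng_per_ul:
        # Dilution can only lower the concentration, never raise it.
        raise ValueError("stock is more dilute than the target concentration")
    sample_ul = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    diluent_ul = final_volume_ul - sample_ul
    return sample_ul, diluent_ul


if __name__ == "__main__":
    # Hypothetical well: 50 ng/uL stock diluted to 10 ng/uL in a 20 uL final volume.
    sample, diluent = normalization_volumes(50.0, 10.0, 20.0)
    print(f"sample: {sample:.2f} uL, diluent: {diluent:.2f} uL")
```

Applied across a plate, the same function would yield one pair of transfer volumes per well, which is the kind of worklist a liquid handler consumes.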

Automation in HTSF
Sage Science
- 2 x Pippin: automated size selection system
- 1 x Blue Pippin: large fragment selection for PacBio protocol
Fluidigm
- Access Array System: for large scale amplicon generation

Automation necessary for reproducibility
Automated Illumina Protocols on the Tecan NGS Workstation
Validated Protocols:
- Nextera DNA Sample Preparation
- Nextera Rapid Capture (Exome, Extended Exome, Custom)
During Validation:
- TruSeq Stranded mRNA Preparation
- TruSeq Stranded Total RNA Preparation
- TruSeq Nano DNA Sample Preparation
- Nextera XT DNA Sample Preparation
During Script Development:
- TruSeq DNA PCR-Free Sample Preparation
- TruSeq SmallRNA Sample Preparation
- TruSeq ChIP Sample Preparation
- TruSeq Custom Amplicon Library Preparation
Additional Protocols generated on Tecan NGS Workstation
- KAPA Hyper Kit
- KAPA NGS quantitation kit
- AmpliSeq (Life Technologies)
Tecan - Illumina - UNC

Research and Development are important for accessing the newest technologies
- Testing of new chemistry: sequencing and library prep (Illumina)
- Testing of new hardware
- Research on chemistry for Molecular Tagging
- Custom library preps according to user request
- Automation protocols (Tecan, Illumina, KAPA, Rubicon Genomics)

People
- Trained staff are critical for high quality of service.
- We always keep staffing at a minimum level (but at competitive salaries).
- Training and attending conferences are an integral part of staff education.

Reorganization of Genomics Cores at UNC - 2014

Conclusions
- The initial start-up should be designed well.
- Development of the facility should be correlated with the presence of stable, large projects.
- Infrastructure (space, electrical power, air-conditioning and such) needs to satisfy current equipment requirements and provide room for future additions.
- Flexibility in protocols and collaboration with users is essential.
- LIMS and automation provide control over errors.
- People are the key to success.

Acknowledgements
Mieczkowski Lab, HTSF
Ewa Malc, Donghui Tan, Liz Sheffield, Maryam Clausen, Alicia Brandt, Nick Schuch, Uma Veluvolu, Scot Waring, Tara Skelly, Hemant Kelkar, Tristan De Buysscher, Corbin Jones, Christopher Baker
