Bioinformatics Session - University Of Nebraska Omaha

Bioinformatics Session
Kathryn Dempsey, Ph.D.
Research Associate
School of Interdisciplinary Informatics
University of Nebraska at Omaha
Email: kdempsey@unomaha.edu
Phone: 402-554-2562

OVERVIEW
Introductions
Part I: What is bioinformatics?
– Activity 1: Extraction of DNA
Part II: How is bioinformatics useful?
– Activity 2: Finding Disease Genes
Conclusions

WHAT IS BIOINFORMATICS?
The term was originally coined in the mid-1980s to refer to the analysis of biological sequences*
Later, it was used to describe all computer applications in the biological sciences.
– Definition varies
Bioinformatics is a new scientific discipline with foundations in computer science and molecular biology (and chemistry and mathematics and statistics and ...)
Very few bioinformaticians are formally trained; most have migrated from other fields (myself included)

WHAT IS BIOINFORMATICS?
[Figure: contributing fields, including calculus, statistics, discrete [...], databases, text mining, parallel computing]

Brief History of Bioinformatics
Charles Darwin publishes “The Origin of Species”
Alan Turing develops the “Turing Machine”
Gregor Mendel publishes allelic work
First use of “bioinformatics”
23andMe founded
Human genome sequenced
Watson & Crick’s discovery of the double helix structure
Microsoft is born!
Internet is born!
Linus Torvalds announces Linux
Perl is released
World’s [...]ics” first used

WHY DO WE NEED BIOINFORMATICS?
Biomedical “Big data”!
High-throughput experiments
Genomes, personalized sequencing
Complexity of disease
Health records, public health
Disaster [...]atistics

“Big Data”
Image Source: b0Hs*VFr8IePv5QFBJdhDH/BigData.001.jpg

Bioinformatics
TRADITIONAL BIOLOGY VS. BIOINFORMATICS?
Gather lots of [...]
Find patterns, needles in data
Make a computational model!
Test the model
Source: [...]sis-to-data-driven-research.html

TYPICAL PROJECTS

THE CENTRAL DOGMA OF BIOINFORMATICS

DNA (Deoxyribonucleic Acid)
Genetic material
Polymer of nucleotides, each carrying a nitrogenous base (A, T, G & C)
Contains hereditary information (genes) in the chromosome
A chromosome is a thread-like linear strand of DNA and associated proteins (histones)
Chromosomes constitute the genome of an organism

A Quick Note on Chromosomes
Human cells are diploid
Chromosomes
– Autosomes (22 types, 2 copies each)
– X, Y (2 total: XX or XY)
– 23 pairs (2 x 23), or 46 chromosomes total
Strawberries are normally diploid, but growers have modified them
The ones we will use today are octoploid!
– Normal: 7 types, 2 copies = 14 chr
– Octoploid: 7 types, 8 copies = 56 chr
Bananas
– 11 types, 2 copies = 22 chr
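The chromosome totals on this slide are simply the number of chromosome types multiplied by the number of copies of each type. A minimal sketch of that arithmetic follows; the dictionary layout and the function name are illustrative choices, not part of the original session.

```python
# Chromosome counts from the slide: (number of chromosome types, copies of each).
# The names ORGANISMS and total_chromosomes are illustrative assumptions.
ORGANISMS = {
    "human": (23, 2),
    "diploid strawberry": (7, 2),
    "octoploid strawberry": (7, 8),
    "banana": (11, 2),
}


def total_chromosomes(types: int, copies: int) -> int:
    """Return the total chromosome count for a given number of types and copies."""
    return types * copies


for name, (types, copies) in ORGANISMS.items():
    total = total_chromosomes(types, copies)
    print(f"{name}: {types} types x {copies} copies = {total} chromosomes")
```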

Rosalind Franklin!

THE CENTRAL DOGMA OF BIOINFORMATICS

RNA (Ribonucleic Acid)
Polymer of nucleotides (A, U, G & C)
The temporary copy of a gene
Copied in the nucleus, transported to the cytoplasm to be translated into protein
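To make the "temporary copy" idea concrete, here is a small sketch of transcription as a text operation. It assumes the input is the coding strand of the gene, so the RNA copy is the same sequence with T replaced by U; in the cell the polymerase actually reads the complementary template strand. The function name is an illustrative choice.

```python
def transcribe(dna: str) -> str:
    """Return the RNA copy of a coding-strand DNA sequence (T becomes U)."""
    dna = dna.upper()
    invalid = set(dna) - set("ACGT")
    if invalid:
        raise ValueError(f"unexpected characters in DNA: {sorted(invalid)}")
    return dna.replace("T", "U")


print(transcribe("ATTCAGATCA"))  # AUUCAGAUCA
```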

THE CENTRAL DOGMA OF BIOINFORMATICS

Proteins
– Functional unit of life
– Polymer of 20 naturally occurring amino acids
– Made from RNA molecules during translation by the ribosome
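Translation can be sketched as a table lookup: the RNA is read three bases (one codon) at a time, and each codon maps to one amino acid until a stop codon is reached. The sketch below builds the standard genetic code table and uses one-letter amino acid codes. The function name and the example RNA are illustrative assumptions, and the sketch starts reading at the first base rather than searching for a start codon.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon, with codons enumerated in
# UCAG order at each of the three positions. "*" marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(rna: str) -> str:
    """Translate RNA codon by codon from the first base, stopping at a stop codon."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("AUGGCCUAA"))  # MA
```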

Onward to Activity 1
Central Dogma
– If you remember one thing, remember this!
Bioinformatics has roots in biology
To learn what the human genome is, we must first get the genome out of the cells!

ACTIVITY 1: Extracting DNA from a Strawberry
http://www.youtube.com/watch?v hOpu4iN5Bh4&noredirect
erry-with-basic-kitchen-items.w654.jpg

Why don’t we have personalized medicine?
Where is the cure for cancer?
Why is AIDS still misunderstood?

Personalized Medicine Study
54-year-old male volunteer
Plasma and serum used for testing
14-month time course
Complete medical exams and labs at each meeting (20 time points total)
Extensive sampling at 2 periods of viral infection:
– HRV (human rhinovirus) – common cold
– RSV (respiratory syncytial virus) – bronchitis

Personalized Medicine

Techniques Used
Summary of techniques
Sample collection
HRV and RSV detection
Whole-genome sequencing
Whole-exome sequencing
Sanger DNA sequencing
Whole-transcriptome sequencing: mRNA-Seq
Small RNA sequencing: microRNA-Seq
Serum Shotgun Proteome Profiling
Serum Metabolome Profiling
Serum Cytokine Profiling
Autoantibodyome Profiling
Telomere Length Assay
Genome Phasing
Omics Data Analysis

Why don’t we have personalized medicine?
Where is the cure for cancer?
Why is AIDS still misunderstood?
We don’t know everything.
There’s lots and lots of data.
Life is complex.
Everyone is unique.

Databases
PubMed – Journal articles on biomedical research
OMIM – Disease genes in humans
GenBank – All known data on genes, their proteins, and their DNA sequences
PDB – 3-D protein structures
MGD – Organism-specific (mouse)

If I give you a gene sequence, tell me which of the billions of known sequences is most similar to it.
Tools: BLAST
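The question on this slide can be sketched as a ranking problem: score every database sequence against the query and report the best one. The toy version below counts shared 3-letter words (k-mers). This is not BLAST's actual algorithm, which also extends word matches and scores them statistically; it is only loosely inspired by the short-word matching idea. The database reuses the three short example sequences from this session, and the function names and the query are illustrative assumptions.

```python
def kmers(seq, k=3):
    """Return the set of all overlapping words of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def most_similar(query, database, k=3):
    """Return the name of the database sequence sharing the most k-mers with query."""
    query_kmers = kmers(query, k)
    return max(
        database,
        key=lambda name: len(query_kmers & kmers(database[name], k)),
    )


# Toy database: the three example sequences used elsewhere in this session.
DATABASE = {
    "Mouse": "ATTCAGATCA",
    "Rat": "TTTCAGATCG",
    "Horse": "TACCAATCGC",
}

print(most_similar("ATTCAGAT", DATABASE))
```

A real search over billions of sequences needs an index so that not every sequence is compared; the sketch only shows what "most similar" can mean.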

If I give you a bunch of sequences, tell me where they are the same and where they are different.
Tools: Alignment (Clustal, MUSCLE, T-Coffee)
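To show what an alignment tool produces, here is a minimal sketch of pairwise global alignment using the classic Needleman-Wunsch dynamic programming method. Clustal, MUSCLE, and T-Coffee align many sequences at once with more elaborate scoring; this sketch handles only two sequences, and the scoring values (match +1, mismatch -1, gap -2) are assumptions chosen for illustration.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Return (aligned_a, aligned_b, score) for a global alignment of a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def pair_score(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the table: each cell holds the best score for aligning a[:i] with b[:j].
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + pair_score(i, j),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Trace back from the bottom-right corner to recover one best alignment.
    top, bottom = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair_score(i, j):
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom)), score[len(a)][len(b)]


aligned_a, aligned_b, total = global_align("ATTCAGATCA", "ATTCGATCA")
print(aligned_a)
print(aligned_b)
print("score:", total)
```

The second sequence in the usage example is the first one with a single base removed, so the alignment places one gap where the base is missing.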

If I give you a bunch of sequences from different animals, tell me how they are related.
Tools: Phylogenetics
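One simple way to approach "how are they related" is to count differences between every pair of sequences and group the closest pair first, which is the first step of distance-based tree building. The sketch below does only that step for three equal-length sequences and prints the grouping in Newick-style notation. Real phylogenetics tools use evolutionary models and many more sequences; the function names here are illustrative assumptions.

```python
from itertools import combinations


def hamming(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def closest_pair(sequences):
    """Return the pair of names whose sequences differ at the fewest positions."""
    return min(
        combinations(sequences, 2),
        key=lambda pair: hamming(sequences[pair[0]], sequences[pair[1]]),
    )


SEQUENCES = {
    "Mouse": "ATTCAGATCA",
    "Rat": "TTTCAGATCG",
    "Horse": "TACCAATCGC",
}

first, second = closest_pair(SEQUENCES)
outgroup = [name for name in SEQUENCES if name not in (first, second)][0]
print(f"(({first},{second}),{outgroup});")
```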

Mouse  ATTCAGATCA
Rat    TTTCAGATCG
Horse  TACCAATCGC

Mouse  ATTCAGATCA
Rat    TTTCAGATCG  (80% similar to mouse)
Horse  TACCAATCGC  (20% similar to mouse)
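The percentages on this slide can be reproduced by counting matching positions between the mouse sequence and each of the other two. The sketch below assumes "similar" means the share of identical positions in two equal-length, already aligned sequences; the function name is an illustrative choice.

```python
def percent_identity(a, b):
    """Return the percentage of positions at which two aligned sequences match."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


MOUSE = "ATTCAGATCA"
RAT = "TTTCAGATCG"
HORSE = "TACCAATCGC"

print(percent_identity(MOUSE, RAT))    # 80.0
print(percent_identity(MOUSE, HORSE))  # 20.0
```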

[Figure: similarity percentages 99.9%, 96-99.4%, 94%, 50-60%]

We already have some databases and tools ... but we need more to solve those questions.
Example: A disease where only one set of 3 DNA bases is missing.
– Do you know what this disease is?
Activity 2:
– The knowledge in bioinformatics databases
– How to use some tools: BLAST, Alignment, Translation!
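Finding a missing set of 3 bases comes down to comparing a reference sequence with a shorter variant and locating the bases that were removed. The sketch below does this for a single contiguous deletion. The two example sequences are made up for illustration and are not taken from the activity materials, and the function name is an assumption. When the same bases repeat around the deletion, the position is reported at the rightmost equivalent location.

```python
def find_deletion(reference, mutant):
    """Return (start, deleted_bases) when mutant is reference minus one contiguous run."""
    missing = len(reference) - len(mutant)
    if missing <= 0:
        raise ValueError("mutant must be shorter than reference")
    # Walk along the shared prefix until the two sequences first disagree.
    start = 0
    while start < len(mutant) and reference[start] == mutant[start]:
        start += 1
    if reference[:start] + reference[start + missing:] != mutant:
        raise ValueError("mutant is not a single contiguous deletion of reference")
    return start, reference[start:start + missing]


# Hypothetical example sequences, for illustration only.
REFERENCE = "ATCATCTTTGGTGTT"
MUTANT = "ATCATTGGTGTT"

position, deleted = find_deletion(REFERENCE, MUTANT)
print(f"deleted {deleted!r} starting at index {position}")
# A deletion whose length is a multiple of 3 removes whole codons' worth of
# bases, so the reading frame after it is unchanged.
print("keeps reading frame:", len(deleted) % 3 == 0)
```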

Motivation
o Massive amounts of data
o Many generation methods
o FEW ANALYSIS methods
o “Signal corruption”
o How to model data?
o How to extract knowledge?
Jeong et al., 2001. Nature 411: 41-42.

A network: Elements and their interactions.
Nodes = elements
Edges = interactions
Any relationship can be modeled using the network model
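In code, a network is often stored as a mapping from each node to the set of nodes it interacts with. The sketch below builds such a structure from a list of edges. The edge list pairs up the example node names from this session (genes, proteins, people) as an assumption about how they were connected in the figure, and the function name is illustrative.

```python
from collections import defaultdict


def build_network(edges):
    """Build an undirected network as a dict mapping each node to its neighbours."""
    network = defaultdict(set)
    for a, b in edges:
        network[a].add(b)
        network[b].add(a)
    return dict(network)


# Assumed pairings: the same model covers genes, proteins, and people.
EDGES = [("Gene1", "Gene2"), ("ProtA", "ProtB"), ("John", "Bob")]

network = build_network(EDGES)
for node, neighbours in network.items():
    print(node, "interacts with", sorted(neighbours))
```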

[Figure: example node pairs: Gene1 and Gene2, ProtA and ProtB, John and Bob]

[Figure legend: Young edges, Mid edges, Aged edges]

EXAMPLE: SYNTHETIC LETHALITY
[Figure labels: (mutant), (healthy), Uncontrolled cell growth, Cell death, Cell death by apoptosis]

Activity 3: Networks

The nodes with the most interactions are almost always going to get the signal ... whether it be the flu or the winning lotto numbers.
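The nodes with the most interactions (often called hubs) can be found by counting how many edges touch each node, known as its degree. The sketch below ranks nodes by degree and also reports each node's share of all edge endpoints, a rough stand-in for how likely a signal travelling along a randomly chosen edge is to touch that node. The edge list and node names are hypothetical.

```python
from collections import Counter


def degree_ranking(edges):
    """Return (node, degree) pairs sorted from most to fewest interactions."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree.most_common()


def endpoint_share(edges):
    """Return each node's share of all edge endpoints (degree / (2 * edge count))."""
    total_endpoints = 2 * len(edges)
    return {node: deg / total_endpoints for node, deg in degree_ranking(edges)}


# Hypothetical network with one well-connected node.
EDGES = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("a", "b"), ("c", "d")]

for node, deg in degree_ranking(EDGES):
    print(f"{node}: {deg} interactions")
```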

Networks!
Human social networks work this way
Cell networks work this way
Many other networks act this way
Flu pandemic planning
Vaccination planning
Drug targets for the cell
National security planning

Conclusions
DNA → RNA → Protein
We need bioinformatics
– Understanding cellular systems
– Personalized medicine
– Prevention vs. treatment
Many skills gained
– Biomedical research
– Computer science
– Mathematics
– Team science
– ... & many more!

A Career in Bioinformatics
Skills needed
– Programming (e.g., Perl, Python, Java, C, PHP)
– Database administration (e.g., MySQL, Oracle)
– UNIX/Linux operating system
– Information management
Types of Jobs
– Scientific curators, Software Developer, Network Engineering, Administrator/analyst, Bio-Statistics, or any jobs where biologists are currently hired.
Types of Employers
– Pharmaceutical, biotech, and software development companies
– Academic institutes and hospitals
– Research institutes (JCVI, JGI, TGen, Broad Institute)
Summer Workshop, 2013

Questions?
Contact us!
Kdempsey@unomaha.edu
@Science KateD
kbastola@unomaha.edu
