Bioinformatics Session
Kathryn Dempsey, Ph.D.
Research Associate
School of Interdisciplinary Informatics
University of Nebraska at Omaha
Email: kdempsey@unomaha.edu
Phone: 402-554-2562
OVERVIEW
Introductions
Part I: What is bioinformatics?
– Activity 1: Extraction of DNA
Part II: How is bioinformatics useful?
– Activity 2: Finding Disease Genes
Conclusions
WHAT IS BIOINFORMATICS?
The term was originally coined in the mid-1980s to refer to the analysis of biological sequences
Later, used to describe all computer applications in biological sciences
– Definition varies
Bioinformatics is a new scientific discipline with foundations in computer science and molecular biology (and chemistry and mathematics and statistics and . . .)
Very few formally trained bioinformaticians; most have migrated from other fields (myself included)
WHAT IS BIOINFORMATICS? (diagram of contributing fields; surviving labels: calculus, statistics, discrete ..., databases, text mining, parallel computing)
Brief History of Bioinformatics
Charles Darwin publishes “The Origin of Species”
Alan Turing develops the “Turing Machine”
Gregor Mendel publishes allelic work
First use of “bioinformatics”
23andMe founded
Human genome sequenced
Watson & Crick’s discovery of the double helix structure
Microsoft is born!
Internet is born!
Linus Torvalds announces Linux
Perl is released
World’s ...ics” first used
WHY DO WE NEED BIOINFORMATICS?
Biomedical “Big data”!
High-throughput experiments
Genomes, personalized sequencing
Complexity of disease
Health records, public health
Disaster ...atistics
“Big Data”
Image Source: b0Hs*VFr8IePv5QFBJdhDH/BigData.001.jpg
TRADITIONAL BIOLOGY VS. BIOINFORMATICS?
Bioinformatics: gather lots of data; find patterns, needles in data; make a computational model; test the model
Source: sis-to-data-driven-research.html
TYPICAL PROJECTS
THE CENTRAL DOGMA OF BIOINFORMATICS
DNA (Deoxyribonucleic Acid)
Genetic material
Polymer of nucleotides, each carrying one of four nitrogenous bases (A, T, G & C)
Contains hereditary information (genes) in the chromosome
A chromosome is a thread-like linear strand of DNA and associated proteins (histones)
Chromosomes constitute the genome of an organism
A Quick Note on Chromosomes
Human cells are diploid
Chromosomes
– Autosomes (22 types, 2 copies each)
– X and Y (2 total: XX or XY)
– 23 pairs (2 x 23), or 46 chromosomes total
Strawberries are normally diploid, but growers have modified them
The ones we will use today are octoploid!
– Normal: 7 types, 2 copies = 14 chromosomes
– Octoploid: 7 types, 8 copies = 56 chromosomes
Bananas
– 11 types, 2 copies = 22 chromosomes
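The chromosome totals on this slide are just the number of chromosome types multiplied by the number of copies of each type. A minimal Python sketch of that arithmetic follows; the function name `total_chromosomes` and the dictionary labels are made up for illustration, and the numbers are the ones quoted on the slide.

```python
def total_chromosomes(types: int, copies: int) -> int:
    """Total chromosomes = distinct chromosome types x copies of each type."""
    return types * copies

# (types, copies) pairs as quoted on the slide above.
organisms = {
    "human": (23, 2),
    "strawberry, normal": (7, 2),
    "strawberry, used today": (7, 8),
    "banana": (11, 2),
}

for name, (types, copies) in organisms.items():
    total = total_chromosomes(types, copies)
    print(f"{name}: {types} types x {copies} copies = {total} chromosomes")
```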
Rosalind Franklin!
RNA (Ribonucleic Acid)
Polymer of nucleotides (A, U, G & C)
The temporary copy of a gene
Copied in the nucleus, then transported to the cytoplasm, where it is translated into protein
Proteins
– Functional unit of life
– Polymer of 20 naturally occurring amino acids
– Built by the ribosome from an RNA template during translation
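The DNA to RNA to protein flow described on these slides can be sketched in a few lines of Python. This is a simplified illustration, not a real tool: the DNA string is made up, the codon table lists only the five codons this example needs (the full genetic code has 64), and transcription is modelled as copying the coding strand with T replaced by U.

```python
# Partial codon table: only the codons used in this example.
CODON_TABLE = {
    "AUG": "M",  # methionine, the usual start codon
    "UUU": "F",  # phenylalanine
    "GGC": "G",  # glycine
    "AAA": "K",  # lysine
    "UAA": "*",  # stop
}

def transcribe(dna: str) -> str:
    """Copy the coding strand of DNA into mRNA by replacing T with U."""
    return dna.upper().replace("T", "U")

def translate(mrna: str) -> str:
    """Read the mRNA three bases at a time until a stop codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

dna = "ATGTTTGGCAAATAA"  # made-up coding-strand sequence
mrna = transcribe(dna)
protein = translate(mrna)
print(dna, "->", mrna, "->", protein)
```

Each amino acid is written as its standard one-letter code, so the protein comes out as a short string of letters.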
Onward to Activity 1
Central Dogma
– If you remember one thing, remember this!
Bioinformatics has roots in biology
To learn what the human genome is, we must first get the genome out of the cells!
ACTIVITY 1: Extracting DNA from a Strawberry
http://www.youtube.com/watch?v=hOpu4iN5Bh4&noredirect
Image source: erry-with-basic-kitchen-items.w654.jpg
Why don’t we have personalized medicine?
Where is the cure for cancer?
Why is AIDS still misunderstood?
Personalized Medicine Study
54-year-old male volunteer
Plasma and serum used for testing
14-month time course
Complete medical exams and labs at each meeting (20 time points total)
Extensive sampling at 2 periods of viral infection:
– HRV (human rhinovirus): common cold
– RSV (respiratory syncytial virus): bronchitis
Personalized Medicine
Techniques Used
Summary of techniques
– Sample collection
– HRV and RSV detection
– Whole-genome sequencing
– Whole-exome sequencing
– Sanger DNA sequencing
– Whole-transcriptome sequencing: mRNA-Seq
– Small RNA sequencing: microRNA-Seq
– Serum shotgun proteome profiling
– Serum metabolome profiling
– Serum cytokine profiling
– Autoantibodyome profiling
– Telomere length assay
– Genome phasing
– Omics data analysis
Why don’t we have personalized medicine?
Where is the cure for cancer?
Why is AIDS still misunderstood?
We don’t know everything.
There’s lots and lots of data.
Life is complex.
Everyone is unique.
Databases
PubMed – Journal articles on biomedical research
OMIM – Disease genes in humans
GenBank – Known DNA sequences of genes, along with data on the genes and their proteins
PDB – 3-D protein structures
MGD – Organism-specific (mouse)
Tools
If I give you a gene sequence, tell me which of the billions of known sequences is most similar to it.
BLAST
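To make the question on this slide concrete, here is a toy similarity search in Python. It is not how BLAST is implemented; it only borrows the idea of comparing short "words" and counts how many 3-letter words a query shares with each entry of a tiny, made-up database. The names `kmers` and `best_match` and the query string are illustrative assumptions.

```python
def kmers(seq: str, k: int = 3) -> set:
    """Return the set of all length-k substrings ("words") in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def best_match(query: str, database: dict, k: int = 3) -> tuple:
    """Return (name, shared_words) for the entry sharing the most k-mers with query."""
    query_words = kmers(query, k)
    scores = {name: len(query_words & kmers(seq, k)) for name, seq in database.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

# Toy "database" of three short sequences (the real databases hold billions).
database = {
    "mouse": "ATTCAGATCA",
    "rat": "TTTCAGATCG",
    "horse": "TACCAATCGC",
}

query = "ATTCAGAT"  # made-up query sequence
name, shared = best_match(query, database)
print(f"Most similar entry: {name} ({shared} shared 3-letter words)")
```

Checking every database entry one by one like this does not scale to billions of sequences, which is why purpose-built tools such as BLAST exist.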
Tools
If I give you a bunch of sequences, tell me where they are the same and where they are different.
Alignment (Clustal, MUSCLE, T-Coffee)
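The tools named on this slide align many sequences at once with more sophisticated methods. As a simplified illustration of what "alignment" means, the sketch below aligns just two sequences with the classic Needleman-Wunsch dynamic-programming method. The scoring values (match +1, mismatch -1, gap -1), the function name `global_align`, and the example sequences are assumptions chosen for illustration.

```python
def global_align(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Needleman-Wunsch global alignment; returns the two sequences with '-' for gaps."""
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))

top, bottom = global_align("ATTCAGATCA", "ATTCATCA")  # made-up example sequences
print(top)
print(bottom)
```

Positions where the two printed rows carry the same letter are "the same"; mismatched letters and `-` gaps mark where the sequences differ.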
Tools
If I give you a bunch of sequences from different animals, tell me how they are related.
Phylogenetics
Mouse  ATTCAGATCA
Rat    TTTCAGATCG
Horse  TACCAATCGC
ATTCAGATCA
80% Similar
TTTCAGATCG
20% Similar
TACCAATCGC
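The similarity percentages can be checked by counting matching positions. The Python sketch below assumes a simple position-by-position comparison of equal-length sequences (no gaps); under that assumption mouse vs. rat comes out at 80% and mouse vs. horse at 20%. The function name `percent_identity` is made up for illustration.

```python
from itertools import combinations

def percent_identity(a: str, b: str) -> float:
    """Percentage of positions at which two equal-length sequences carry the same base."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length (align them first)")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)

sequences = {
    "Mouse": "ATTCAGATCA",
    "Rat": "TTTCAGATCG",
    "Horse": "TACCAATCGC",
}

# Compare every pair of animals.
for (name_a, seq_a), (name_b, seq_b) in combinations(sequences.items(), 2):
    print(f"{name_a} vs {name_b}: {percent_identity(seq_a, seq_b):.0f}% similar")
```

Phylogenetics tools build trees from pairwise comparisons like these: the most similar pair (here mouse and rat) is grouped first, and more distant sequences (horse) join the tree later.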
99.9%, 96-99.4%, 94%, 50-60%
We already have some databases and tools . . . but we need more to solve those questions.
Example: A disease where only one set of 3 DNA bases is missing.
– Do you know what this disease is?
Activity 2:
– The knowledge in bioinformatics databases
– How to use some tools: BLAST, Alignment, Translation!
Motivation
o Massive amounts of data
o Many generation methods
o FEW ANALYSIS methods
o “Signal corruption”
o How to model data?
o How to extract knowledge?
Jeong et al., 2001. Nature 411: 41-42.
A network: Elements and their interactions.
Nodes = elements
Edges = interactions
Any relationship can be modeled using the network model
Gene1, Gene2; ProtA, ProtB; John, Bob
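A network like this is easy to hold in a program. The Python sketch below stores a network as a mapping from each node to the set of nodes it interacts with. It assumes the names on this slide form three connected pairs (Gene1 with Gene2, ProtA with ProtB, John with Bob); the function name `build_network` is made up for illustration.

```python
from collections import defaultdict

def build_network(edges):
    """Build an undirected network as an adjacency map: node -> set of neighbouring nodes."""
    network = defaultdict(set)
    for a, b in edges:
        network[a].add(b)
        network[b].add(a)
    return dict(network)

# Each pair is one edge (interaction) between two nodes (elements).
edges = [("Gene1", "Gene2"), ("ProtA", "ProtB"), ("John", "Bob")]
network = build_network(edges)

for node, neighbours in network.items():
    print(node, "interacts with", sorted(neighbours))
```

The same structure works whether the nodes are genes, proteins, or people, which is the point of the network model.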
Young edges, Mid edges, Aged edges
EXAMPLE: SYNTHETIC LETHALITY
(Figure labels: mutant; healthy; uncontrolled cell growth; cell death; cell death by apoptosis)
Activity 3: Networks
The nodes with the most interactions are almost always going to get the signal . . . whether it be the flu or the winning lotto numbers.
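Finding "the nodes with the most interactions" amounts to counting edges per node. The Python sketch below does this for a made-up toy network in which one node, named `hub` here, is connected to many others; the node names and the function name `degrees` are illustrative assumptions.

```python
from collections import Counter

def degrees(edges):
    """Count how many interactions (edges) each node takes part in."""
    counts = Counter()
    for a, b in edges:
        counts[a] += 1
        counts[b] += 1
    return counts

# Made-up toy network: "hub" is connected to many nodes, the others to few.
edges = [
    ("hub", "a"), ("hub", "b"), ("hub", "c"), ("hub", "d"),
    ("a", "b"), ("c", "d"), ("d", "e"),
]

# List nodes from most to least connected.
for node, degree in degrees(edges).most_common():
    print(node, degree)
```

Nodes at the top of this ranking have the most neighbours, so anything that spreads along edges has the most routes by which to reach them.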
Networks!
Human social networks work this way
Cell networks work this way
Many other networks act this way
– Flu pandemic planning
– Vaccination planning
– Drug targets for the cell
– National security planning
Conclusions
DNA → RNA → Protein
We need bioinformatics
– Understanding cellular systems
– Personalized medicine
– Prevention vs. treatment
Many skills gained
– Biomedical research
– Computer science
– Mathematics
– Team science
– . . . & many more!
A Career in Bioinformatics
Skills needed
– Programming (e.g., Perl, Python, Java, C, PHP)
– Database administration (e.g., MySQL, Oracle)
– UNIX/Linux operating system
– Information management
Types of jobs
– Scientific curator, software developer, network engineer, administrator/analyst, biostatistics, or any job where biologists are currently hired
Types of employers
– Pharmaceutical, biotech, and software development companies
– Academic institutes and hospitals
– Research institutes (JCVI, JGI, TGen, Broad Institute)
Summer Workshop, 2013
Questions? Contact us!
Kdempsey@unomaha.edu
@Science Kate
Dkbastola@unomaha.edu