The Design Of Animal Experiments - CCAC

Transcription

The design of animal experiments
Michael FW Festing
c/o Understanding Animal Research, 25 Shaftesbury Av., London, UK.
michaelfesting@aol.com

Principles of Humane Experimental Technique (Russell and Burch 1959)
Replacement: e.g. in-vitro methods, less sentient animals
Refinement: e.g. anaesthesia and analgesia, environmental enrichment
Reduction:
  Research strategy
  Controlling variability
  Experimental design and statistics

A well designed experiment
Absence of bias
  Experimental unit, randomisation, blinding
High power
  Low noise (uniform material, blocking, covariance)
  High signal (sensitive subjects, high dose)
  Large sample size
Wide range of applicability
  Replicate over other factors (e.g. sex, strain): factorial designs
Simplicity
Amenable to a statistical analysis

The animal as the experimental unit
N = 8, n = 4
Animals individually treated. May be individually housed or grouped.

A cage as the experimental unit
[Figure: four cages labelled Treated, Control, Treated, Control]
N = 4, n = 2
Treatment in water or diet.

An animal for a period of time: repeated measures or crossover design
[Figure: animals by period, labelled Treatment 1 and Treatment 2]
N = 12, n = 6

Teratology: mother treated, young measured
N = 2, n = 1
Mother is the experimental unit.

Failure to identify the experimental unit correctly in a 2 (strains) x 3 (treatments) x 6 (times) factorial design
[Figure: two panels, each labelled ELD group]
Single cage of 8 mice killed at each time point (288 mice in total)

Experimental units must be randomised to treatments
Physical: numbers on cards. Shuffle and take one
Tables of random numbers in most textbooks
Use computer, e.g. EXCEL or a statistical package such as MINITAB

Randomisation

Original  Randomised
1         2
1         3
1         3
1         1
2         2
2         1
2         2
2         1
3         3
3         2
3         3
3         1

NB Randomisation should include housing and order in which observations are made
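The computer-based option mentioned on the previous slide can be sketched in a few lines of Python. This is a minimal illustration, not the method used in the talk: the function name `randomise`, the unit labels and the seed are all assumptions made for the example. It shuffles an equal number of treatment labels across the units and, following the NB above, also draws a random order in which to make the observations.

```python
import random


def randomise(units, treatments, seed=None):
    """Assign treatments to experimental units in equal numbers, at random.

    Returns (assignment, observation_order). All names are illustrative.
    """
    units = list(units)
    treatments = list(treatments)
    if len(units) % len(treatments) != 0:
        raise ValueError("number of units must be a multiple of the number of treatments")

    rng = random.Random(seed)  # a seed makes the allocation reproducible

    # Build the 'original' ordered list (1,1,1,1,2,2,2,2,...) and shuffle it.
    labels = [t for t in treatments for _ in range(len(units) // len(treatments))]
    rng.shuffle(labels)
    assignment = dict(zip(units, labels))

    # Randomise the order in which the units are measured as well.
    observation_order = rng.sample(units, len(units))
    return assignment, observation_order


if __name__ == "__main__":
    cages = [f"unit{i}" for i in range(1, 13)]  # hypothetical unit labels
    assignment, order = randomise(cages, [1, 2, 3], seed=2009)
    for unit in cages:
        print(unit, assignment[unit])
    print("measure in this order:", order)
```

Recording the seed alongside the experimental protocol lets the allocation be regenerated later if it needs to be checked.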

Failure to randomise and/or blind leads to more “positive” results
Blind / not blind: odds ratio 3.4 (95% CI 1.7-6.9)
Random / not random: odds ratio 3.2 (95% CI 1.3-7.7)
Blind and random / not blind and random: odds ratio 5.2 (95% CI 2.0-13.5)
290 animal studies scored for blinding, randomisation and positive/negative outcome, as defined by authors
Bebarta et al 2003 Acad. Emerg. Med. 10:684-687

Some factors (e.g. strain, sex) cannot be randomised, so special care is needed to ensure comparability
[Figure: body weights of two strains. Six cages of 7-9 mice of each strain: error bars are SEMs]
"CBA mice showed greater variability in body weights than TO mice."
Outbred TO (8-12 weeks, commercial)
Inbred CBA (12-16 weeks, home bred)

A well designed experiment
Absence of bias
  Experimental unit, randomisation, blinding
High power
  Low noise (uniform material, blocking, covariance)
  High signal (sensitive subjects, high dose)
  Large sample size
Wide range of applicability
  Replicate over other factors (e.g. sex, strain): factorial designs
Simplicity
Amenable to a statistical analysis

High power (good chance of detecting the effect of a treatment, if there is one)
High signal/noise ratio
High standardized effect size
High $d = |m_1 - m_2| / s$
High (difference between means)/SD
Student's $t = (\bar{X}_1 - \bar{X}_2) / \sqrt{2S^2/n}$

Power analysis for sample size and effects of variation
A mathematical relationship between six variables
Needs subjective estimate of effect size to be detected (signal)
Has to be done separately for each character
Not easy to apply to complex designs
Essential for expensive, simple, large experiments (clinical trials)
Useful for exploring effect of variability
A second method, “The Resource Equation”, is described later

Power analysis: the variables
Signal: a) effect size of scientific interest, or b) actual response
Noise: variability of the experimental material
Chance of a false positive result: significance level (0.05)
Sidedness of statistical test (usually 2-sided)
Power of the experiment (80-90%?)
Sample size

Group size and signal/noise ratio
[Figure: group size plotted against signal/noise ratio (effect size in standard deviations), with curves for 90% and 80% power]
Assuming 2-sample, 2-sided t-test and 5% significance level

Comparison of two anaesthetics for dogs under clinical conditions (Vet. Anaesthes. Analges.)
Unsexed healthy clinic dogs, weight 3.8 to 42.6 kg. Systolic BP 141 (SD 36) mm Hg
Assume:
  a 20 mmHg difference between anaesthetics is of clinical importance
  a significance level of α = 0.05
  a power = 90%
  a 2-sided t-test
Signal/noise ratio = 20/36 = 0.56
Required sample size = 68/group
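The arithmetic behind this slide can be approximated without specialist software. The sketch below uses the textbook normal approximation n = 2((z_α/2 + z_β)/d)² per group; it is not the exact t-based calculation a dedicated package performs, so it lands close to, rather than exactly on, the 68 per group quoted above. The function name `n_per_group` is an assumption made for the example.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect, sd, alpha=0.05, power=0.90):
    """Approximate group size for a two-sided, two-sample comparison of means.

    Normal approximation only: exact t-based software will differ slightly.
    """
    d = effect / sd                              # signal/noise ratio
    z = NormalDist()                             # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)           # two-sided significance level
    z_beta = z.inv_cdf(power)                    # desired power
    return ceil(2 * ((z_alpha + z_beta) / d) ** 2)


if __name__ == "__main__":
    # Clinic dogs: 20 mmHg difference of interest, SD 36 mmHg.
    print("signal/noise:", round(20 / 36, 2))
    print("approximate n per group:", n_per_group(20, 36))
```

Because n scales with 1/d², halving the standard deviation cuts the required group size to roughly a quarter, which is why controlling variability matters so much.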

Power and sample size calculations using nQuery Advisor

A second paper described:
Male Beagles, weight 17-23 kg, mean BP 108 (SD 9) mm Hg.
Want to detect 20 mmHg difference between groups (as before)
With the same assumptions as previous slide:
Signal/noise ratio = 20/9 = 2.22
Required sample size = 6/group

Summary for two sources of dogs: aim is to be able to detect a 20 mmHg change in blood pressure

Type of dog    SD   Signal/noise   Sample size/gp (1)   % Power (n = 8) (2)
Random dogs    36   0.56           68                   18
Male beagles   9    2.22           6                    98

(1) Sample size: 90% power
(2) Power, sample size 8/group
Assumes a 5%, 2-sided t-test and effect size 20 mmHg

The scientific dilemma:
With small sample sizes we cannot detect an important effect in genetically heterogeneous animals.
We can detect the effect in genetically homogeneous animals, but are they representative?
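The power column can be explored with the same normal approximation, turned around to give power for a fixed group size. This is a rough sketch under that assumption (the function name `approx_power` is illustrative); with only 8 animals per group the approximation is somewhat optimistic compared with the exact t-based figures of 18% and 98% quoted in the table, but it shows the same contrast between the two sources of dogs.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect, sd, n, alpha=0.05):
    """Approximate power of a two-sided, two-sample test with n animals per group.

    Normal approximation; the negligible contribution of the opposite tail is ignored.
    """
    d = effect / sd                               # signal/noise ratio
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    return z.cdf(d * sqrt(n / 2) - z_alpha)


if __name__ == "__main__":
    for label, sd in [("Random dogs", 36), ("Male beagles", 9)]:
        power = approx_power(effect=20, sd=sd, n=8)
        print(f"{label}: SD {sd}, approximate power {power:.0%}")
```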

Variation in kidney weight in 58 groups
[Figure: variation in kidney weight plotted against sample number (1 to 57)]
Gartner, K. (1990), Laboratory Animals, 24:71-77.

Required sample sizes

Factor     Type        Std. Dev   Signal/noise*   Sample size*   Power**
Genetics   F1 hybrid   13.5       0.74            30             80
           F2          ...

* signal is 10 units, two-sided t-test, α = 0.05, power 80%
** Assuming fixed sample size of 30/group

The randomised block design: another method of controlling noise
Treatments A, B & C

Block   Treatment order
B1      B C A
B2      A C B
B3      B A C
B4      A C B
B5      B C A

Randomisation is within-block
Can be multiple differences between blocks:
  Heterogeneous age/weight
  Different shelves/rooms
  Natural structure (litters)
  Split experiment in time
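Within-block randomisation differs from simple randomisation in that every block receives each treatment once, and only the order inside the block is shuffled. A minimal sketch follows; the function name, block labels and seed are assumptions for illustration.

```python
import random


def randomise_within_blocks(blocks, treatments, seed=None):
    """Return {block: shuffled list of treatments}, one of each treatment per block."""
    rng = random.Random(seed)       # seed only for reproducibility
    layout = {}
    for block in blocks:
        order = list(treatments)    # every block gets the full set of treatments
        rng.shuffle(order)          # randomise independently inside each block
        layout[block] = order
    return layout


if __name__ == "__main__":
    layout = randomise_within_blocks(["B1", "B2", "B3", "B4", "B5"], "ABC", seed=24)
    for block, order in layout.items():
        print(block, " ".join(order))
```

A block here could be a litter, a shelf, or a week in which part of the experiment is run, matching the sources of between-block difference listed above.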

A randomised block design: apoptosis score

Week   Control   CGP   STAU
1      365       398   421
2      423       432   459
3      308       320   329

Treatment effect p = 0.023 (2-way ANOVA)
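The quoted p-value can be checked by hand. The sketch below assumes the slide's numbers read as three weeks (blocks) by three treatments, with week 1 = 365, 398, 421; week 2 = 423, 432, 459; week 3 = 308, 320, 329, and that the first column is the control. It computes the two-way ANOVA without replication and uses the closed-form upper tail of the F distribution that holds when the numerator has exactly 2 degrees of freedom. Function names are illustrative.

```python
def randomised_block_anova(table):
    """Two-way ANOVA without replication.

    table maps block -> list of responses, one per treatment, in the same
    treatment order for every block. Returns (F for treatments, df1, df2).
    """
    blocks = list(table)
    b = len(blocks)
    t = len(table[blocks[0]])
    values = [y for blk in blocks for y in table[blk]]
    grand = sum(values) / (b * t)

    block_means = [sum(table[blk]) / t for blk in blocks]
    treat_means = [sum(table[blk][j] for blk in blocks) / b for j in range(t)]

    ss_total = sum((y - grand) ** 2 for y in values)
    ss_block = t * sum((m - grand) ** 2 for m in block_means)
    ss_treat = b * sum((m - grand) ** 2 for m in treat_means)
    ss_error = ss_total - ss_block - ss_treat   # what is left after blocks and treatments

    df_treat = t - 1
    df_error = (b - 1) * (t - 1)
    f_treat = (ss_treat / df_treat) / (ss_error / df_error)
    return f_treat, df_treat, df_error


def p_value_numerator_df_2(f, df_error):
    """Exact upper-tail F probability, valid ONLY when the numerator df is 2."""
    return (1 + 2 * f / df_error) ** (-df_error / 2)


if __name__ == "__main__":
    # Assumed reading of the slide: columns are control, CGP, STAU.
    apoptosis = {
        "week 1": [365, 398, 421],
        "week 2": [423, 432, 459],
        "week 3": [308, 320, 329],
    }
    f, df1, df2 = randomised_block_anova(apoptosis)
    print(f"F({df1},{df2}) = {f:.2f}, p = {p_value_numerator_df_2(f, df2):.3f}")
```

Most of the total variation sits between weeks, so removing it as a block effect is what makes the modest treatment differences detectable with only nine observations.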

Analysis of apoptosis data
[Analysis of variance table; surviving p-values: 0.000 and 0.023]

Residual model diagnostics
[Figure, four panels: Normal Plot of Residuals; I Chart of Residuals (mean 3.16E-14, LCL -20.17, UCL 20.17); Histogram of Residuals; Residuals vs. Fits]

Another method of determining sample size: The Resource Equation
Depends on the law of diminishing returns
Simple. No subjective parameters
Useful for complex designs and/or multiple outcomes (characters)
Does not require estimate of standard deviation
Crude compared with power analysis
E = (total number of animals) - (number of groups)
E should be between 10 and 20 (but give some tolerance)
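The resource equation is simple enough to express directly. The following sketch uses illustrative function names and treats the 10 to 20 range as inclusive, which is an assumption; the slide itself asks for some tolerance at either end. The example designs are the ones that appear in the factorial slides of this talk.

```python
def resource_equation(total_animals, groups):
    """E = (total number of animals) - (number of groups)."""
    return total_animals - groups


def within_guideline(e, low=10, high=20):
    """True if E falls in the suggested range (bounds treated as inclusive here)."""
    return low <= e <= high


if __name__ == "__main__":
    designs = [
        ("16 animals, 2 groups", 16, 2),
        ("16 animals, 4 groups (factorial)", 16, 4),
        ("8 groups of 8 mice", 64, 8),
        ("8 groups of 3 mice", 24, 8),
    ]
    for label, total, groups in designs:
        e = resource_equation(total, groups)
        verdict = "within 10-20" if within_guideline(e) else "outside 10-20"
        print(f"{label}: E = {e} ({verdict})")
```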

The Resource Equation & sample size
[Figure: Student's t, 5% critical value, plotted against degrees of freedom (0 to 35)]
E = (total numbers) - (number of groups)
E between 10 and 20
But if experimental subjects are cheap (e.g. multi-well plates), E can be much higher

A well designed experiment
Absence of bias
  Experimental unit, randomisation, blinding
High power
  Low noise (uniform material, blocking, covariance)
  High signal (sensitive subjects, high dose)
  Large sample size
Wide range of applicability
  Replicate over other factors (e.g. sex, strain) to increase generality: factorial designs
Simplicity
Amenable to a statistical analysis

Factorial designs
Factorial design: Treated, Control; E = 16 - 4 = 12
Single factor design: Treated, Control; E = 16 - 2 = 14
One variable at a time (OVAT): Treated, Control and Treated, Control; E = 16 - 2 = 14 for each experiment

Factorial designs
(By using a factorial design) “... an experimental investigation, at the same time as it is made more comprehensive, may also be made more efficient if by more efficient we mean that more knowledge and a higher degree of precision are obtainable by the same number of observations.”
R.A. Fisher, 1960

A 4x2 factorial design
Analysed with Student's t-test. This is not appropriate because:
1. Each test is based on too few animals (n = 3-4), so lacks power
2. It does not indicate whether there are strain differences in protein thiol status
3. It does not indicate whether dose/response differs between strains
4. A two-way design should be analysed using a 2-way ANOVA

Incorrect statistical analysis leading to excessive numbers of animals
One experiment or 4 separate experiments?
8 mice per group, 8 groups = 64 mice. E = 64 - 8 = 56
Alternative: 3 mice per group, 8 groups. E = 24 - 8 = 16
Saving: 40 mice
Formal test of interaction

2 (strains) x 4 (animal units) factorial

Effect of chloramphenicol (2000 mg/kg) on RBC count
[Table by strain, partly lost. Treated: 7.81, 7.21, 6.96, 7.10, 9.18, 8.31, 8.47, 8.67]
Should not be analysed using two t-tests:
1. Each test lacks power due to small sample size
2. Will not give a test of whether strains differ in response
Use a two-way ANOVA with interaction:
1. Do the treatment means averaged across strains differ?
2. Do the strains differ, averaged across treatments?
3. Do the two strains respond to the same extent?
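The three questions above correspond to the treatment, strain and interaction lines of a two-way ANOVA. The sketch below shows how those sums of squares and F ratios are formed for a balanced 2 x 2 design with four animals per cell. The numbers are made up purely for illustration and are NOT the RBC counts from the slide (the control values did not survive transcription); function and factor names are also assumptions. P-values are left out; in practice a statistical package would report them.

```python
from statistics import mean


def two_way_anova(cells):
    """Balanced two-way ANOVA with interaction.

    cells maps (strain, treatment) -> list of observations, with the same
    number of observations in every cell. Returns ({source: (df, SS, F)}, df_error).
    """
    strains = sorted({s for s, _ in cells})
    treatments = sorted({t for _, t in cells})
    n = len(next(iter(cells.values())))
    if any(len(obs) != n for obs in cells.values()):
        raise ValueError("design must be balanced")

    grand = mean([y for obs in cells.values() for y in obs])
    strain_mean = {s: mean([y for (si, _), obs in cells.items() if si == s for y in obs])
                   for s in strains}
    treat_mean = {t: mean([y for (_, ti), obs in cells.items() if ti == t for y in obs])
                  for t in treatments}

    ss = {
        "strain": n * len(treatments) * sum((strain_mean[s] - grand) ** 2 for s in strains),
        "treatment": n * len(strains) * sum((treat_mean[t] - grand) ** 2 for t in treatments),
    }
    ss_cells = n * sum((mean(obs) - grand) ** 2 for obs in cells.values())
    ss["interaction"] = ss_cells - ss["strain"] - ss["treatment"]
    ss_error = sum((y - mean(obs)) ** 2 for obs in cells.values() for y in obs)

    df = {"strain": len(strains) - 1, "treatment": len(treatments) - 1}
    df["interaction"] = df["strain"] * df["treatment"]
    df_error = len(cells) * (n - 1)
    ms_error = ss_error / df_error

    table = {src: (df[src], ss[src], (ss[src] / df[src]) / ms_error) for src in ss}
    return table, df_error


if __name__ == "__main__":
    # Hypothetical data, chosen only to make the arithmetic easy to follow.
    example = {
        ("strain A", "control"): [10.0, 11.0, 9.0, 10.0],
        ("strain A", "treated"): [8.0, 9.0, 7.0, 8.0],
        ("strain B", "control"): [10.0, 9.0, 11.0, 10.0],
        ("strain B", "treated"): [5.0, 6.0, 4.0, 5.0],
    }
    table, df_error = two_way_anova(example)
    for source, (df, ss, f) in table.items():
        print(f"{source:12s} df={df}  SS={ss:.2f}  F={f:.2f}")
    print("error df =", df_error)
```

With 16 animals in four groups the error term has 12 degrees of freedom, all of which contribute to every one of the three tests; two separate t-tests would each have only 6.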

A 2x2 factorial design
[Analysis of variance table, largely lost: DF 1, 1, 1, 12 (error), 15 (total)]
[Figure: means by strain and treatment]

Use of several inbred strains to reduce noise, increase signal and explore generality
Effect of chloramphenicol on mouse haematology
[Table: number of mice at each dose of chloramphenicol, by inbred strain; dose headings and first row partly lost]
  C3H:     2 2 2 2 2 2
  BALB/c:  2 2 2 2 2 2
  C57BL:   2 2 2 2 2 2
Festing et al (2001) Fd. Chem. Tox. 39:375

Example of a factorial compared with a single factor
[Figure: four inbred strains compared with one outbred stock; axis values lost]

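WBC counts following chloramphenicol at 2500 mg/kg
White blood cell counts

Outbred stock:
Strain   N    Dose 0   Dose 2500   Signal (difference)   Noise (SD)   Signal/noise   p
CD-1     16   2.23     1.83        0.40                  0.86         0.47           0.38

Inbred strains (dose 0 column):
Strain   N    Dose 0
CBA      4    2.25
C3H      4    2.15
BALB/c   4    1.05
C57BL    4    2.25
Mean     16   1.93

[Remaining inbred-strain columns (dose 2500, signal, noise, signal/noise, and p-values for dose and dose * strain) are garbled: 0.30 1.95 0.34 0.40 1.85 0.34 1.35 735.44 (-0.88) 3.82 2.15 0.001 0.001]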

Genetics is important: twenty-two Nobel Prizes since 1960 for work depending on inbred strains
[Diagram centred on: C.C. Little, DBA, 1909; inbred strains and derivatives; Jackson Laboratory]
Cell mediated immunity, immunological tolerance, H2 restriction, immune responses: Medawar, Burnet, Doherty, Zinkernagel, Benacerraf (G. pigs)
Genetics: Snell
ES cells & “knockouts”: Evans, Capecchi
Prions: Prusiner
Humoral immunity/antibodies, T-cell receptor: Tonegawa, Jerne
Monoclonal antibodies (BALB/c mice): Kohler and Milstein
Smell: Axel & Buck
Retroviruses, oncogenes & growth factors: Cohen, Levi-Montalcini, Varmus, Bishop, Baltimore, Temin

18th Annual Short Course on Experimental Models of Human Cancer
August 21-30, 2009
Bar Harbor, ME
courses.jax.org

Conclusions
Five requirements for a good design:
  Unbiased (randomisation, blinding)
  Powerful (signal/noise ratio: control variability)
  Wide range of applicability (factorial designs, common but frequently analysed incorrectly)
  Simple
  Amenable to statistical analysis
Mistakes in design and analysis are common
Better training in experimental design would improve the quality of research, save money, time and animals
