Programming MATLAB Symbolic Math Toolbox For Speed .

Transcription

Programming MATLAB Symbolic MathToolbox for Speed:Experience from the GenSSI Version 2ProjectThomas LigonApril 26, 2018

Start Ligon Thomas Ligon: MATLAB Symbolic Math Speed28.04.2018#2

BackgroundThomas Ligon: MATLAB Symbolic Math Speed28.04.2018#3

GenSSI 2.0Thomas Ligon: MATLAB Symbolic Math Speed28.04.2018#4

BackgroundGenSSI Version 1 GenSSI written in MATLAB with the Maple toolbox In 2008, Mathworks acquired MuPAD MuPAD developed by University of Paderborn and sold to SciFace MuPAD became MATLAB Symbolic Math Toolbox, replacing MapleGenSSI Version 2 Convert Code to run in newer version of MATLAB Calculation of Jacobian matrix required change Many performance problems required change New functions, such as multi-experiment model created Performance (R2016a) better than Version 1 (R2008a)Thomas Ligon: MATLAB Symbolic Math Speed28.04.2018#5

GenSSI 2.0 PerformanceThomas Ligon: MATLAB Symbolic Math Speed28.04.2018#6

Jacobian of a MatrixMaple can calculate Jacobian of a matrixMATLAB only calculates Jacobian of a vectorPossible solution:df sym(zeros([size(f,1),size(f,2),length(v)]))for iRow 1:size(f,1)df(iRow,:,:) jacobian(f(iRow,:),v)endThis does not work for GenSSI, becauseMaple and MATLAB order the result differentlyThomas Ligon: MATLAB Symbolic Math Speed28.04.2018#7

Jacobian of a Matrix% Original code:rank(LDer); % LDer is matrix of Lie derivativesJacParamC jacobian(LDer, Par); % Par is vector of parameters% Final code:% calculate 2D jacobian the way Maple does itJacParam ar);That works, but then we had big performance problems!Thomas Ligon: MATLAB Symbolic Math Speed28.04.2018#8

Previous ExperienceMemory management can be very dangerousExample of a very efficient system: Call the operating system rarely and manage that buffer internally Allocate many pieces of memory (internally) and manipulate pointers instead of dataBut that’s low-level code, and we are using MATLABThomas Ligon: MATLAB Symbolic Math Speed28.04.2018#9

Matrices in MATLABNumeric matrices have a fixed size Number of cells times size of (double precision) number Pre-allocation is effectiveSymbolic matrices have a variable size Cells can be long expressions or polynomials Pre-allocation is still not enoughThomas Ligon: MATLAB Symbolic Math Speed28.04.2018# 10

Mathworks RecommendsMathworks highly recommends Pre-allocation andVectorization1234%5rowN zeros(numCols,1); pre-allocate rowNfor iCol 1:numColsrowN(iCol) matrix(n,iCol) % elementwise (loop)endalternate coderowN matrix(n,:) % vectorizedWithout line 1, line 3 changes size of rowNLine 5 puts all the logic inside of MATLABPerformance numbers later (in GenSSI code)Thomas Ligon: MATLAB Symbolic Math Speed28.04.2018# 11

Remove Zero Rows - Old1 remove Lie index [];2 for iRow JacParamx:-1:13if JacParam(iRow,:) 0;4JacParam(iRow,:) [];5remove Lie index [remove Lie index iRow];6end7 endLine 4 changes the size of JacParam Causes memory managementLine 5 also changes sizeThomas Ligon: MATLAB Symbolic Math Speed28.04.2018# 12

Remove Zero Rows - New1 [JacParam,tilde,useful Lie index] genssiRemoveZeroRows(JacParam);3 results.useful Lie index useful Lie index;4 function [matrixOut,keepBoolean,keepIndex] genssiRemoveZeroRows(matrixIn)5keepBoolean any(matrixIn 0,2);6matrixOut matrixIn(keepBoolean,:);7keepIndex find(keepBoolean)’;8 endLine 5 contains all logic, no “for” loop, no “if” “any” creates a Boolean array Line 6 returns rows with 1 in keepBooleanThomas Ligon: MATLAB Symbolic Math Speed28.04.2018# 13

Remove Zero Rows - ResultsThomas Ligon: MATLAB Symbolic Math Speed28.04.2018# 14

MuPAD Rank – Test ProgramTest Performance of Rank12345load('JacParamC1.mat’);A JacParamC1;tic;R1 rank(A); % MATLAB Rankdisp(['rank1 ',num2str(R1),', time ',num2str(toc)]);6789tic;R2 feval(symengine,'linalg::rank',A); % MuPAD Rankdisp(['rank2 ',char(R2),', time ',num2str(toc)]);disp('end');Thomas Ligon: MATLAB Symbolic Math Speed28.04.2018# 15

MuPAD Rank – ResultsThomas Ligon: MATLAB Symbolic Math Speed28.04.2018# 16

Symbolic to Numeric - CodeConvert Jacobian (symbolic) to Tableau (Boolean)Old code (vectorized single line)% JacParam01 zeros(sizeJacParam);JacParam01 double(JacParam 0);New codeJacParam01 zeros(size(JacParam));JacParam01(find(JacParam)) 1;Thomas Ligon: MATLAB Symbolic Math Speed28.04.2018# 17

Symbolic to Numeric - ResultsThomas Ligon: MATLAB Symbolic Math Speed28.04.2018# 18

Remaining LimitationsMATLAB solve In some cases, it cannot find a solution Warning: Explicit solution could not be found Limits information returnedMATLAB jacobian, rank, solve can be slow support for parallel processing would helpMemory every new derivative increases the size of the jabobian by a factor of Npar(number of parameters) Memory usage grows exponentiallyThomas Ligon: mRNA Delivery Model28.04.2018# 19

SummaryBeware of things that require memory managementPre-allocation is goodVectorization is goodSpecial functions including “any” and “find” are good but might require a version checkThanks to Mathworks for support cases provided quick help and workarounds fixed bugs and improved code in future versionsGeneral recommendations Support: Always provide a clear and concise test program and description. Change requests: Include a business case.Thomas Ligon: MATLAB Symbolic Math Speed28.04.2018# 20

GenSSI TeamsThomas S. Ligon1Oana-Teodora Chiş4Fabian Fröhlich2,3Eva Balsa-Canto5Jan Hasenauer2,3Julio R. Banga51Facultyof Physics, Ludwig-Maximilians-Universität, München, Germany,2Institute3Centerof Computational Biology, Helmholtz Zentrum München, Germany,of Mathematics, Technische Universität München, München, Germany,4Technological5(Bio)ProcessInstitute for Industrial Mathematics, Campus Vida, Santiago de Compostela, Spain,Engineering Group, Spanish National Research Council, IIM-CSIC, Vigo, SpainThomas Ligon: MATLAB Symbolic Math Speed28.04.2018# 21

End Ligoff Thomas Ligon: MATLAB Symbolic Math Speed28.04.2018# 22

AppendixAppendixThomas Ligon: MATLAB Symbolic Math Speed28.04.2018# 23

AbstractDynamical systems, especially in systems biology, are often modeled bysystems of ordinary differential equations (ODEs) and we want to know if itis possible to infer the parameters of these equations (e.g. reaction rates)from experimental data, in a process called parameter estimation or “theinverse problem”. If this is possible in principle, i.e. based on optimal data,the parameters are called “structurally identifiable”. GenSSI (GeneratingSeries for testing Structural Identifiability) uses generating series of Liederivatives to analyze the structural identifiability of parameters of a set ofODEs based on arbitrary analytical functions. We converted GenSSI fromversion 1, which uses the Maple toolbox and runs on MATLAB versionR2008a and older, to version 2, which uses the MATLAB Symbolic Mathtoolbox (based on MuPAD) and runs on MATLAB version R2008b andnewer. As part of this, we corrected some very significant performanceissues with the Symbolic Math toolbox, ultimately achieving betterperformance than the original version. This talk addresses thoseperformance aspects.Thomas Ligon: MATLAB Symbolic Math Speed28.04.2018# 24

Tools for identifiabilityPaper of first slide, supporting informationThomas Ligon: MATLAB Symbolic Math Speed28.04.2018# 25

Tools for identifiabilityPaper of first slide, supporting informationThomas Ligon: MATLAB Symbolic Math Speed28.04.2018# 26

Tools for mas Ligon: MATLAB Symbolic Math Speed28.04.2018# 27

mRNA TransfectionThomas Ligon: MATLAB Symbolic Math Speed28.04.2018# 28

mRNA TransfectionThomas Ligon: MATLAB Symbolic Math Speed28.04.2018# 29

mRNA TransfectionmRNA transfection model𝑑𝐺 𝑘 𝑇𝐿 𝑚 𝛽 𝐺𝑑𝑡𝑑𝑚 𝛿 𝑚𝑑𝑡Solution𝑘 𝑇𝐿 𝑚0𝐺𝑚𝑅𝑁𝐴 𝑡 1 𝑒 𝛿 𝛽 (𝑡 𝑡0 ) 𝑒 𝛽(𝑡 𝑡0 )𝛿 𝛽Problems 𝑘 𝑇𝐿 𝑚0 cannot be separated (identified). Solution is symmetric in 𝛽 and 𝛿. 2 equation for 4 parameters.Thomas Ligon: MATLAB Symbolic Math Speed28.04.2018# 30

mRNA TransfectionThomas Ligon: MATLAB Symbolic Math Speed28.04.2018# 31

mRNA TransfectionGenSSI: Nothing is identifiable-- WARNING: The number of parameters is Larger than the number ofrelations!An explicit solution cannot be given for this subset of parameters.PLEASE CONSIDER AN EXTRA DERIVATIVE!Structurally globally identifiable parameters:[]Structurally locally identifiable parameters:[]Structurally non-identifiable parameters:[]Thomas Ligon: MATLAB Symbolic Math Speed28.04.2018# 32

mRNA TransfectionExperimental solutionsMeasure not only GFP, but also mRNA but that is difficult in living cellsPerform multiple experiments GFP with different degradation rate (eGFP) mRNA with different degradation rate (poly-A-tail) antibiotic that inhibits translation (different rate)- Multi-experiment modelsThomas Ligon: MATLAB Symbolic Math Speed28.04.2018# 33

Programming MATLAB Symbolic Math Toolbox for Speed: Experience from