Non-Functional Properties

Transcription

Software Product Line Engineering: Non-Functional Properties
Christian Kästner (Carnegie Mellon University)
Sven Apel (Universität Passau)
Norbert Siegmund (Bauhaus-Universität Weimar)
Gunter Saake (Universität Magdeburg)

Introduction
Not considered so far:
- How to configure a software product line?
- How about non-functional properties?
- How to measure and estimate a variant's non-functional properties?

Agenda
- Configuration and non-functional properties
- Approaches for measurement and estimation
- Experience reports
- Outlook

Configuration of Software Product Lines

Recap: Configuration and Generation Process
[Figure: reusable artifacts are configured based on requirements; variant generation produces car variants]

Recap: Configuration and Generation Process
[Figure repeated for software: reusable artifacts (code, documentation, etc.) are configured based on requirements; variant generation produces variants]

Configuration with Feature Models
Partial feature selection based on functional requirements (e.g., Encryption).
[Figure: feature model of a Database Management System with or/alternative groups: Encryption, Reporting, Indexes (B-tree, R-tree, Hash), Cache Size (8MB, 32MB), Page Size (2K, 4K, 8K, 16K)]

Non-Functional Requirements
Not only functionality is important:
- Performance
- Memory consumption
- Footprint

Non-Functional Properties: Definition(s)
- Also known as quality attributes
- Over 25 definitions (see [6])
- In general: any property of a product that is not related to functionality is a non-functional property
- Different models describe relationships among non-functional properties

McCall's Quality Model I [7]
- Modelling of quality attributes and factors to simplify communication between developers and users
- Hierarchical model:
  - 11 factors (specify the product; external user view)
  - 23 quality criteria (for development; internal developer view)
  - Metrics (to control and evaluate results)

McCall's Quality Model I [7]
[Figure: mapping between the 11 factors (external view) and the 23 quality criteria (internal view)]

ISO Standard 9126 / ISO/IEC 25010:2011
ISO/IEC 25010:2011 defines:
1. A quality in use model composed of five characteristics (some of which are further subdivided into subcharacteristics) that relate to the outcome of interaction when a product is used in a particular context of use. This system model is applicable to the complete human-computer system, including both computer systems in use and software products in use.
2. A product quality model composed of eight characteristics (which are further subdivided into subcharacteristics) that relate to static properties of software and dynamic properties of the computer system. The model is applicable to both computer systems and software products.
Source: Wikipedia

Categorization
Quantitative:
- Response time (performance), throughput, etc.
- Energy and memory consumption
- Measurable properties, metric scale
- Easy to evaluate
Qualitative:
- Extensibility
- Error freeness
- Robustness
- Security
- No direct measurement (often, no suitable metric)

How to Configure with Non-Functional Properties in Mind?
Non-functional requirements: performance, footprint, memory consumption, energy consumption.
Example: maximize performance, but keep the footprint below 450 KB.
[Figure: the DBMS feature model (mandatory/optional features, or/alternative groups): Encryption, Reporting, Indexes (B-tree, R-tree, Hash), Cache Size (8MB, 32MB), Page Size (2K, 4K, 8K, 16K)]

Motivating Questions of Practical Relevance
- What is the footprint of a variant for a given feature selection? (e.g., 25 KB)
- What is the best feature selection to minimize memory consumption?
- What are the performance-critical features?
[Figure: the DBMS feature model annotated for each question]

Practical Relevance
Configuration complexity: [1] Xu et al. FSE'15:
- Developers and users are overwhelmed with configuration options
- Unused optimization opportunities (up to 80% of options ignored)
- Substantial increase in configurability

Why Should We Care?
- Outdated default configurations: [2] Van Aken et al. SIGMOD'17: the default configuration assumes 160 MB of RAM
- Non-optimal default configurations: [4] Herodotou et al. CIDR'11: the default configuration results in worst-case execution time; the best configuration is 480 times better than the worst configuration
- Non-optimal default configurations: [3] Jamshidi et al. MASCOTS'16: changing the configuration is key to tailoring the system to the use case; tweaking only 2 out of 200 options in Apache Storm produced a 100% change in latency

Relation
[Figure: domain engineering produces the feature model and reusable artifacts; application engineering makes a feature selection; the generator derives the final program]

Measuring Non-Functional Properties

Side Note: Theory of Measurement
Stevens defines different levels of measurement [4]:
- Nominal scale (example: sex)
- Ordinal scale (example: grades)
- Interval scale (example: time/date)
- Ratio scale (example: age)
Source: Wikipedia

Classification of Non-Functional Properties for Software Product Lines
- Not measurable properties: qualitative properties; properties without a sensible metric (maintainability?)
- Measurable per feature: properties that exist for individual features; source code properties, footprint, etc.
- Measurable per variant: properties that exist only in the final (running) variants; performance, memory consumption, etc.

Methods for Measuring Product Lines
How to measure non-functional properties of variants and whole product lines?
- Artifact-based
- Family-based
- Variant-based

Measurement: Artifact-based
- Features are measured in isolation from other features
- Linear effort with respect to the number of features
- Robust against changes of the product line
Drawbacks:
- Not all properties are measurable (performance?)
- Requires specific implementation techniques (#ifdef?)
- No black-box systems, since source code is required
- No feature interactions considered (accuracy?)
- Requires an artificial measurement environment
Summary: low effort (+); accuracy, applicability, generality, and measurement environment rate poorly (-)

Measurement: Family-based
- Measurement of all features and their combinations at the same time
- Requires a feature model to derive the influence of individual features on the measurement output
- Effort: O(1) if there are no constraints
Drawbacks:
- Not all properties measurable; artificial measurement setting
- Inaccurate with respect to feature interactions
- Requires tracing information from features to code
Summary: low effort (+); accuracy, applicability, generality, and measurement environment rate poorly (-)

Measurement: Variant-based
- Measure each individual variant
- Every property can be measured
- Works for black-box systems
- Independent of the implementation technique
- Interactions between features can be measured
Drawback: huge measurement effort, O(2^n)
Summary: effort rates poorly (-); accuracy, applicability, generality, and measurement environment rate well (+)

Example: SQLite
Configuration options include Exclusive Locking, Case Sensitivity, Thread Safety, and Atomic Write.
[Figure: the number of SQLite variants written out in full — an astronomically large number]

Approach 0: Brute Force
- SQLite: 3 × 10^77 variants
- 5 minutes per measurement (compilation + benchmark)
- 3 × 10^77 × 5 min ≈ 2.8 × 10^72 years!
[Figure: logarithmic time scale from the Big Bang (1.37 × 10^10 years ago) and the birth of Earth (9 × 10^9 years) to "measurement finished" after 2.8 × 10^72 years]
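The brute-force arithmetic on this slide can be checked in a few lines of Python; the variant count and the 5-minute measurement time are the values from the slide.

```python
# Brute-force cost: measure every variant of SQLite once.
variants = 3e77              # ~3 x 10^77 SQLite variants (from the slide)
minutes_per_variant = 5      # compile + benchmark one variant

total_minutes = variants * minutes_per_variant
years = total_minutes / (60 * 24 * 365)   # convert minutes to 365-day years

print(f"{years:.2e} years")  # on the order of the slide's 2.8 x 10^72 years
```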

Approach 1: Sampling
- Measure only a few, specific variants
- Predict the properties of unseen configurations
- State-of-the-art approaches use machine-learning techniques to learn a prediction model
- Problem: feature interactions — we need to measure many combinations of features to identify and quantify the influence of interactions
- Example: covering all order-6 interactions requires 13,834,413,152 measurements ≈ 131,605 years!
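The order-6 figure converts to years the same way; the measurement count is taken from the slide and the 5-minute cost per measurement from the brute-force slide.

```python
# Cost of measuring every order-6 feature combination (count from the slide).
measurements = 13_834_413_152
minutes = measurements * 5             # 5 minutes per measurement

years = minutes / (60 * 24 * 365)      # 365-day years
print(int(years))                      # 131605, matching the slide
```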

Approach 2: Family-Based Measurement
- Create a variant simulator
- Execute the simulator and measure the property
- Compute the influence of each feature based on the execution of the simulator
- Fully automated
[Figure: a workload drives annotated call graph(s) (0s, 3s, 5s, 15s); the result is a performance model: ⟨base, 15s⟩, ⟨f1, 5s⟩, ⟨f2, 3s⟩, ⟨f1#f2, 10s⟩, ...]

Prediction of Non-Functional Properties

Learning Techniques
- Regression
- Neural networks
- CART
- Bayesian networks
- MARS
- M5
- Cubist
- Principal component analysis
- Evolutionary algorithms

Goal: Prediction of Properties Based on the Influence of Features
Influence model:
⟨PageSize 1k, 15s⟩, ⟨PageSize 2k, 0s⟩, ⟨PageSize 4k, -10s⟩, ⟨CacheSize 8k, -5s⟩, ⟨Encryption, 20s⟩, ⟨Hash Index, -5s⟩, ⟨Encryption#PageSize 4k, 15s⟩
Partial feature selection (PageSize 4k, Hash Index) + objective function → predicted value (20s)
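Evaluating such an influence model is a sum over all terms whose features are selected. A minimal sketch, using the influence values shown on the slide; the base time of 35s is a hypothetical assumption (the slide does not state the base), chosen so that the example selection evaluates to the 20s shown.

```python
# Influence model: maps a set of options to its time contribution in seconds.
# Values from the slide; the base of 35s is a hypothetical assumption.
influences = {
    frozenset(): 35,                               # hypothetical base time
    frozenset({"PageSize 1k"}): 15,
    frozenset({"PageSize 2k"}): 0,
    frozenset({"PageSize 4k"}): -10,
    frozenset({"CacheSize 8k"}): -5,
    frozenset({"Encryption"}): 20,
    frozenset({"Hash Index"}): -5,
    frozenset({"Encryption", "PageSize 4k"}): 15,  # interaction term
}

def predict(selection):
    """Sum every influence term whose options are all selected."""
    sel = set(selection)
    return sum(t for opts, t in influences.items() if opts <= sel)

print(predict({"PageSize 4k", "Hash Index"}))  # 35 - 10 - 5 = 20
```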

Overview
Configuration space (size: 2^#options)
(1) Sampling:
- Cohen et al. TSE'08; Siegmund et al. SPLC'11, SQJ'12, ICSE'12, FSE'15; Sarkar et al. ASE'15; Henard et al. TSE'14, ICSE'15; Oh et al. FSE'17; Johansen et al. SPLC'12; Medeiros et al. ICSE'16; Dechter et al. AAAI'02; Gogate and Dechter CP'06; Chakraborty et al. AAAI'14
- Key domains: combinatorial testing, artificial intelligence, search-based software engineering, design of experiments
(2) Learning → performance model f: C → ℝ:
- Guo et al. ASE'13; Siegmund et al. ICSE'12, FSE'15; Sarkar et al. ASE'15; Oh et al. FSE'17; Zhang et al. ASE'15; Nair et al. FSE'17, arXiv'17; Jamshidi et al. SEAMS'17; Xi et al. WWW'04
- Key domains: machine learning, statistics
Goal: (3) Optimization → optimal configuration(s); (4) Analysis → system understanding (not covered here):
- Sayyad et al. ICSE'13, ASE'13; Henard et al. ICSE'15; White et al. JSS'09; Guo et al. JSS'12; Kai Shi ICSME'17; Olaechea et al. SPLC'14; Hierons et al. TOSEM'16; Tan et al. ISSTA'15; Siegmund et al. SQJ'12; Benavides et al. CAiSE'05; Zheng et al. OSR'07; Jamshidi et al. MASCOTS'16; Osogami and Kato SIGMETRICS'07; Filieri et al. FSE'15
- Key domains: search-based software engineering, meta-heuristics, machine learning, artificial intelligence, mathematical optimization

Sampling – Overview
Challenges:
- Exponentially sized configuration space
- Binary configuration options
- Numeric configuration options
Goal: find only the relevant configurations for measurement

Random Sampling
Or: how do we obtain randomness in the presence of constraints?
- Trivial approach: enumerate all configurations and randomly draw one ([12] Temple et al. TR'17; [13] Guo et al. ASE'13; [14] Nair et al. FSE'15; [15] Zhang et al. ASE'15)
  + Easy to implement; true randomness
  - Not scalable
- SAT approach: manipulate a SAT/CSP solver ([5] Henard et al. ICSE'15: randomly permute constraint and literal order and phase selection (order of true/false); [17] Siegmund et al. FSE'17: specify the distribution of configurations as constraints)
  + Easy to implement; better distribution
  - No guaranteed uniformity; limited scalability
- BDD approach: create a counting BDD to enumerate all configurations ([6] Oh et al. FSE'17)
  + True randomness; scales up to 2,000 options
  - BDD creation can be expensive
- Beyond SE, tailored algorithms: [7] Chakraborty et al. AAAI'14: hash the configuration space; [8] Gogate and Dechter CP'06 and [9] Dechter et al. AAAI'02: treat the CSP output as a probability distribution
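The trivial approach can be sketched in a few lines; the three options and the single constraint ("Encryption requires Compression") are made up for illustration.

```python
import itertools
import random

# Hypothetical example: three Boolean options and one constraint.
options = ["Encryption", "Compression", "Transactions"]

def valid(cfg):
    # Constraint: Encryption must not be selected without Compression.
    return cfg["Compression"] or not cfg["Encryption"]

# Trivial approach: enumerate the whole configuration space (2^n entries,
# hence not scalable) and draw uniformly from the valid configurations.
all_cfgs = [dict(zip(options, bits))
            for bits in itertools.product([False, True], repeat=len(options))]
valid_cfgs = [c for c in all_cfgs if valid(c)]

sample = random.choice(valid_cfgs)  # true uniform randomness over valid configs
```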

Sampling with Coverage I
Survey: [10] Medeiros et al. ICSE'16
Interaction coverage: t-wise (e.g., 2-wise = pair-wise):
[18] Cohen et al. TSE'08; [19] Johansen et al. SPLC'12; [11] Henard et al. TSE'14; [20] Siegmund et al. SPLC'11; [21] Siegmund et al. ICSE'12
Insights (Kuhn et al.):
- Many options do not interact
- 2-wise interactions are most common
- Hot-spot options

Sampling with Coverage II
Option coverage: cover all options, either minimizing or maximizing interactions (cf. Saltelli et al.):
- Leave-one-out / one-disabled sampling: [10] Medeiros et al. ICSE'16
- Option-wise sampling: [20,24] Siegmund et al. SPLC'11, IST'13
- Negative option-wise sampling: [22] Siegmund et al. FSE'15
- Option-frequency sampling: [23] Sarkar et al. ASE'15

Sampling Numeric Options

Plackett-Burman Design (PBD)
- Minimizes the variance of the estimates of the independent variables (numeric options) while using only a limited number of measurements
- The design specifies seeds depending on the number of experiments to be conducted (i.e., configurations to be measured)
[Figure: design matrix of configurations × numeric options; each numeric option's value range is sampled at min, center, and max]
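A sketch of the classic 12-run Plackett-Burman construction: 11 cyclic shifts of a standard seed row plus one all-minus run. The seed row is the textbook PB12 generator; mapping the -1/+1 levels to a numeric option's min/max values is an assumption about how the design is applied here (the slide itself only shows min/center/max sampling).

```python
# Classic 12-run Plackett-Burman design for up to 11 two-level factors.
SEED = [+1, +1, -1, +1, +1, +1, -1, -1, -1, +1, -1]  # standard PB12 seed row

def pb12():
    rows, row = [], SEED[:]
    for _ in range(11):                # 11 cyclic rotations of the seed
        rows.append(row[:])
        row = [row[-1]] + row[:-1]     # rotate right by one position
    rows.append([-1] * 11)             # final run: all factors at low level
    return rows

design = pb12()                        # 12 runs x 11 two-level factors

# Map the first factor's levels to a hypothetical numeric option,
# e.g. a cache size with value range 8..32 MB.
lo, hi = 8, 32
cache_sizes = [lo if row[0] < 0 else hi for row in design]
```

Each column of the resulting matrix is balanced (six +1s, six -1s) and orthogonal to every other column, which is what gives the design its variance-minimizing property.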

In Detail: Feature-wise Sampling

Determine the Influence of Individual Features
How shall we approach this? Example DBMS with features Core, Compression, Encryption, and Transactions:
- Π(Core) = 100s
- Π(Core, Compression) = 120s → Δ(Compression) = 20s
- Π(Core, Encryption) = 130s → Δ(Encryption) = 30s
- Π(Core, Transactions) = 110s → Δ(Transactions) = 10s
Prediction: Π(Core, Compression, Encryption, Transactions) = Δ(Core) + Δ(Compression) + Δ(Encryption) + Δ(Transactions) = 160s
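The feature-wise delta computation can be sketched directly, with the feature names and measured times from the slide:

```python
# Measured times: base variant and one variant per additional feature.
base = 100                                    # Pi(Core) in seconds
measurements = {"Compression": 120, "Encryption": 130, "Transactions": 110}

# Influence of each feature: Delta(F) = Pi(Core, F) - Pi(Core)
deltas = {f: t - base for f, t in measurements.items()}

# Prediction for the full variant: base plus the sum of all deltas.
prediction = base + sum(deltas.values())
print(prediction)  # 160
```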

Experience with Feature-wise Sampling

Footprint Material:

| Product | Domain | Origin | Language | Features | Variants | LOC |
|---|---|---|---|---|---|---|
| Prevayler | Database | Academic | Java | 5 | 24 | 4,030 |
| ZipMe | Compression | Academic | Java | 8 | 104 | 4,874 |
| PKJab | Messenger | Academic | Java | 11 | 72 | 5,016 |
| SensorNet | Simulation | Academic | C | 26 | 3,240 | 7,303 |
| Violet | UML editor | Academic | Java | 100 | ca. 10^20 | 19,379 |
| Berkeley DB | Database | Industrial | C | 8 | 256 | 209,682 |
| SQLite | Database | Industrial | C | 85 | ca. 10^23 | 305,191 |
| Linux kernel | Operating system | Industrial | C | 25 | ca. 3 × 10^7 | 13,005,842 |

Results: Footprint
- Average error rate of 5.5% without Violet; with Violet: 21.3%
- # measurements: SQLite: 88 vs. 2^85; Linux: 25 vs. 3 × 10^7
- Prevayler: 186% fault rate — why this error?

Analysis: Feature Interactions
Two features interact if their combined presence in a program leads to unexpected program behavior.
Expected: Π(Core, Compression, Encryption) = Δ(Core) + Δ(Compression) + Δ(Encryption) = 100s + 20s + 30s = 150s
Measured: Π(Core, Compression, Encryption) = 140s
Feature interaction: Δ(Compression#Encryption) = -10s — the delta between predicted and measured performance, since the encrypted data has previously been compressed
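The interaction term follows directly from the expected and measured values on this slide:

```python
base = 100                                     # Pi(Core)
deltas = {"Compression": 20, "Encryption": 30} # feature-wise influences

expected = base + deltas["Compression"] + deltas["Encryption"]  # 150s
measured = 140                                 # Pi(Core, Compression, Encryption)

# The interaction is the delta between measured and predicted performance.
interaction = measured - expected
print(interaction)  # -10 (encryption is faster on already-compressed data)
```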

Experience with Pair-wise Sampling

Pair-wise Measurement: Footprint
- Average error rate of 0.2% without Violet — a reduction of 4.3%
- Violet: partially improved, but still very bad (722% error rate)
- # measurements: SQLite: 3,306 vs. 2^85; Linux: 326 vs. 3 × 10^7

White-Box Interaction Detection: Footprint
Source code analysis revealed higher-order feature interactions in Violet.
[Table from the "Footprint Material" slide repeated, highlighting the Violet, Berkeley DB, SQLite, and Linux kernel rows]