Benchmarking Semantic Reasoning on Mobile Platforms: Towards Optimization Using OWL2 RL


Editor(s): Thomas Lukasiewicz, Oxford University, United Kingdom
Solicited review(s): Eduardo Mena, University of Zaragoza, Spain; Nick Bassiliades, Aristotle University of Thessaloniki, Greece; one anonymous reviewer

William Van Woensel* and Syed Sibte Raza Abidi
NICHE Research Group, Faculty of Computer Science, Dalhousie University, 6050 University Ave, Halifax, NS B3H 4R2, Nova Scotia, Canada

Abstract. Mobile hardware has advanced to a point where apps may consume the Semantic Web of Data, as exemplified in domains such as mobile context-awareness, m-Health, m-Tourism and augmented reality. However, recent work shows that the performance of ontology-based reasoning, an essential Semantic Web building block, still leaves much to be desired on mobile platforms. This presents a clear need to provide developers with the ability to benchmark mobile reasoning performance, based on their particular application scenarios, i.e., including reasoning tasks, process flows and datasets, to establish the feasibility of mobile deployment. In this regard, we present a mobile benchmark framework called MobiBench to help developers to benchmark semantic reasoners on mobile platforms. To realize efficient mobile, ontology-based reasoning, OWL2 RL is a promising solution since it (a) trades expressivity for scalability, which is important on resource-constrained platforms; and (b) provides unique opportunities for optimization due to its rule-based axiomatization. In this vein, we propose selections of OWL2 RL rule subsets for optimization purposes, based on several orthogonal dimensions. We extended MobiBench to support OWL2 RL and the proposed ruleset selections, and benchmarked multiple OWL2 RL-enabled rule engines and OWL reasoners on a mobile platform.
Our results show significant performance improvements by applying OWL2 RL rule subsets, allowing performant reasoning for small datasets on mobile systems.

Keywords: mobile computing; OWL2 RL; rule-based reasoning; OWL reasoning; reasoning optimization

* Corresponding author. E-mail: william.van.woensel@gmail.com.

1. Introduction

Advances in mobile technologies have enabled mobile applications to consume semantic data, with the goal of e.g., collecting context- [52], [59] and location-related data [6], [50], achieving augmented reality [41], [58], performing recommendations [60], accessing linked biomedical data (m-Health) [39] and enabling mobile tourism [27]. Automated reasoning, an essential Semantic Web pillar, involves the inference of useful information based on the semantics of ontology constructs, domain-specific if-then rules, or both. Given the availability of advanced mobile technology and large volumes of mobile-accessible semantic data, we hence consider it opportune to investigate the potential of semantic reasoning on mobile, resource-constrained platforms. In light of recent empirical work [11], [25], which indicates that mobile reasoning performance still leaves much to be desired, we choose to focus on benchmarking and optimizing mobile semantic reasoning. In particular, we discern a clear need for benchmarking specific application scenarios, including reasoning task (e.g., ontology or rule-based reasoning), process flow (e.g., frequent vs. incremental reasoning) and rule- and datasets, as it will allow mobile developers to make

more informed decisions – for instance, in case of poor performance of their particular application scenario, they may choose hybrid solutions that combine mobile- and server-deployed reasoning [2], [56].

In traditional Semantic Web reasoning applications, OWL2 DL is the most popular representation and reasoning approach. Regarding resource-constrained systems however, it has been observed that OWL2 DL is too complex and resource-intensive to achieve scalability [11], [25]. Reflecting this, most mobile semantic reasoners, i.e., those tailored to resource-constrained systems, instead focus on rule-based OWL axiomatizations, such as custom entailment rulesets [1], [28] or OWL2 RL rulesets [36], [47]. Indeed, OWL2 RL is an OWL2 profile with a stated goal of scalability, partially axiomatizing the OWL2 RDF-based semantics as a set of rule axioms. Further, a rule-based axiomatization allows easily adjusting reasoning complexity to the application scenario [47] or avoiding resource-heavy inferences [8], [43], by applying subsets of rule axioms. In contrast, transformation rules used in tableau-based DL reasoning are often hardcoded, making it hard to deselect them at runtime [47]. Also, most classic DL optimizations improve performance at the cost of memory [11], which is limited on mobile devices. At the same time, as only a partial axiomatization, OWL2 RL does not guarantee completeness for TBox reasoning [36]; and it places syntactic restrictions on ontologies to ensure that all correct inferences can be drawn. Nevertheless, we find this expressivity trade-off acceptable in case it would render semantic reasoning feasible on resource-constrained platforms.

In this regard, our objective is three-fold:

(1) developing a mobile reasoning benchmark framework (called MobiBench) that allows developers to evaluate the performance of reasoning on mobile platforms (reasoning times, memory usage), for specific scenarios and using standards-based rule- and datasets.
Key features of MobiBench include a uniform, standards-based rule and data interface across reasoning engines, as well as its extensibility and cross-platform nature, allowing benchmarks to be applied across multiple platforms;

(2) optimizing semantic reasoning on mobile platforms, by studying the following three OWL2 RL rule subset selections: (i) the Equivalent OWL2 RL rule subset, which leaves out logically equivalent rules, i.e., rules whose results are covered by other rules; (ii) Purpose- and reference-based subsets, which divide the ruleset based on purpose and referenced data; and (iii) Removal of resource-heavy rules that have a large performance impact – although this will result in missing certain inferences, we feel that developers should be able to weigh their utility vs. computational cost; and

(3) performing mobile reasoning benchmarks, which measure the performance of the materialization of ontology inferences, using the AndroJena and RDFStore-JS rule systems loaded with different OWL2 RL ruleset selections, as well as three OWL2 DL reasoners (HermiT, JFact and Pellet). We note that, although the proposed OWL2 RL subset selections were conceived and evaluated in the context of resource-constrained platforms, they may be applied in any kind of computing environment.

This paper builds on previous work, which presented a clinical benchmark [55] and an initial version of the Mobile Benchmark Framework [54]; the latter only supplied an API, restricted benchmarking to rule-based reasoning, and did not attempt optimizations or applications of OWL2 RL.

The paper is structured as follows. Section 2 introduces the MobiBench framework, presenting its architecture and main components, and Section 3 discusses how mobile developers can utilize MobiBench. In Section 4, we shortly discuss the OWL2 RL profile and our reasons for focusing on it, and detail its implementation as a ruleset. Section 5 elaborates on our selection of OWL2 RL rule subsets for optimization purposes.
Section 6 presents and discusses the benchmarks we performed using MobiBench. We review related work in Section 7, and end with conclusions and future work in Section 8.

2. Mobile Benchmark Framework

The goal of the MobiBench benchmark framework is to allow studying and comparing reasoning performance on mobile platforms, given particular application scenarios, including reasoning task, process flow and rule- and datasets. An important focus lies on extensibility, with clear extension points allowing different rule and data formats, tasks and flows to be plugged in. Moreover, given the multitude of mobile platforms currently in use (e.g., Android, iOS, Windows Phone, BlackBerry), MobiBench was implemented as a cross-platform system.

Fig. 1 shows the architecture overview of the MobiBench framework. The API supplies third parties with direct access to the MobiBench functionality. To facilitate developers in running benchmarks, the Automation Support allows automating large numbers of benchmarks, and comprises (1) a remote

Fig. 1. MobiBench Framework Architecture.

Automation Client, which generates a set of benchmark configurations; and (2) an Automation Web Service on the device that invokes the API for each configuration. This setup avoids re-deploying MobiBench for each benchmark (i.e., with a new hard-coded configuration); and even allows benchmarking without physical access to the device. The Analysis Tools aggregate the benchmark results, including reasoning times and memory dumps, into CSV files. The core of the framework, the Benchmark Engine, can perform different reasoning tasks, using different process flows, to better align benchmarks with real-world scenarios. Any Reasoning System can be plugged into this component by implementing the uniform plugin interface.

To support OWL2 RL, MobiBench was extended with the following services: (a) the Uniform Conversion Layer, to cope with the myriad of rule (and data) formats currently supported by rule-based reasoners; (b) the Pre-Processing Service, which pre-processes the ruleset and ontology if required (e.g., to support n-ary rules); and (c) the Ruleset Selection Service, which automatically applies OWL2 RL subset selections to optimize ontology-based reasoning.

A remote RESTful Web Service, deployed on a server (e.g., the developer's PC), comprises these services, and also hosts some utility services to persist benchmark output (Persistence Support). A Local Proxy component acts as an intermediary between the mobile system and the remote Web service.

For portability across platforms, MobiBench was implemented in JavaScript (JS) and deployed using Apache Cordova [62] for mobile platforms and JDK8 Nashorn [68] for PC (this version is used for testing), which allows native, platform-specific parts to be plugged in.
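To make the automation setup concrete, the following sketches what one benchmark configuration, as generated by the Automation Client, might look like; all field names are illustrative assumptions, not MobiBench's actual configuration schema.

```javascript
// Hypothetical benchmark configuration; every field name below is an
// assumption for illustration, not MobiBench's actual schema.
const benchmarkConfig = {
  engine: "androjena",                   // reasoning system plugin to load
  task: "owl2-materializing-inference",  // reasoning task (Section 2.4.2)
  flow: "frequent",                      // process flow (Section 2.4.3)
  ruleset: "owl2rl/full.spin",           // SPIN ruleset to convert and apply
  dataset: "data/sample-ontology.ttl",   // RDF input data
  runs: 10                               // repetitions, for stable timings
};
```

The Automation Web Service would then invoke the API once per such configuration, so a whole experiment grid can be run without redeploying the app.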
We note that this JavaScript-based implementation also allows MobiBench to easily benchmark JavaScript reasoners, which are usable in mobile websites or cross-platform, JavaScript-based apps (e.g., developed using Apache Cordova, Appcelerator Titanium [64]) with a write-once, deploy-everywhere philosophy. We currently rely on Android as the deployment platform, since most reasoners are either developed for Android or written in Java (which facilitates porting to Android),

but MobiBench could be easily deployed on other platforms as well. The MobiBench framework can be found online [51].

In the subsections below, we elaborate on the main MobiBench components, namely the Uniform Conversion Layer (Section 2.1), Ruleset Selection Service (Section 2.2), Pre-Processing Service (Section 2.3) and Benchmark Engine (Section 2.4); and indicate extension points for each component (see parts on Extensibility). Section 3 shows how developers can utilize the benchmark framework.

2.1. Uniform Conversion Layer

The goal of the Uniform Conversion Layer is to handle the multitude of rule (and data) formats currently supported by rule-based reasoners. It supplies a uniform, standards-based resource interface across reasoning engines, which dynamically converts the input to their supported formats. The major benefit of this layer is that it allows developers to re-use a single rule- and dataset across different reasoners.

A range of semantic rule standards are currently in use, including the Semantic Web Rule Language (SWRL) [23], Web Rule Language (WRL) [3], Rule Markup Language (RuleML) [12], and SPARQL Inferencing Notation (SPIN) [31]. Some reasoners also introduce their own custom formats (e.g., Apache Jena) or rely on non-Semantic Web syntaxes (e.g., Datalog: IRIS, PocketKrHyper). When benchmarking multiple systems, this multitude of formats prevents direct re-use of a single rule- and dataset. We chose to support SPIN rules and RDF data as standard input formats; Section 2.1.1 shortly discusses SPIN and our reasons for choosing it.

Since the only available SPIN API is developed for the Java Development Kit (JDK) [30], conversion functions are deployed on an external Web service. To convert incoming SPIN rules, the SPIN API is utilized to generate an Abstract Syntax Tree (AST), which is then visited by a Rule Converter to convert the rule. Section 2.1.2 discusses our current converters, and how new converters can be plugged in.
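For target engines with built-in SPARQL support, such a conversion can be nearly trivial. The sketch below is a simplification that works on the rule's SPARQL text rather than on a SPIN AST, rewriting a SPIN rule (serialized as a SPARQL CONSTRUCT query) into a SPARQL 1.1/Update INSERT query; `constructToInsert` is a hypothetical helper name, not part of MobiBench's API.

```javascript
// Rewrite a SPIN rule in SPARQL CONSTRUCT syntax into a SPARQL 1.1/Update
// INSERT query: the template and pattern stay identical, only the query
// form changes. (A real converter would visit the SPIN AST instead.)
function constructToInsert(spinRuleText) {
  return spinRuleText.replace(/^\s*CONSTRUCT/i, "INSERT");
}

const rule =
  "CONSTRUCT { ?x a :Adult } WHERE { ?x :age ?age . FILTER (?age >= 18) }";
console.log(constructToInsert(rule));
// → INSERT { ?x a :Adult } WHERE { ?x :age ?age . FILTER (?age >= 18) }
```

This is essentially the rule shape accepted by engines supporting SPARQL 1.1/Update (see Section 2.1.2).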
To convert incoming RDF data, a Data Converter can utilize Apache Jena [4] to query and manipulate the data.

2.1.1. SPIN

SPIN is a SPARQL-based rule and constraint language, which provides a natural, object-oriented way of dealing with constraints and rules associated with RDF(S)/OWL classes. In the object-oriented design paradigm, classes define the structure of objects (i.e., attributes) together with their behavior, which includes creating/changing objects (rules) and ensuring a consistent object state (constraints). Similarly, SPIN allows directly associating locally scoped rules and constraints to their related RDF(S)/OWL classes, using properties such as spin:rule and spin:constraint.

To serialize rules and constraints, SPIN relies on SPARQL [49], a W3C standard with sufficient expressivity to represent both queries and general-purpose rules and constraints. SPARQL is supported by most Semantic Web systems, and is well known by Semantic Web developers. As such, this rule format is more likely to be easily comprehensible to developers. Further, relying on SPIN also simplifies support for our current rule engines (see below).

2.1.2. Rule and Data Conversion

Regarding rule-based reasoners, our choice for SPIN greatly reduces conversion effort for systems with built-in SPARQL support. RDFStore-JS supports INSERT queries from SPARQL 1.1/Update [49], which are easy to obtain from SPIN rules in their SPARQL query syntax. Both AndroJena and RDFQuery support a triple-pattern-like syntax. Other rule engines lack built-in Semantic Web support, and require more transformation effort: PocketKrHyper and IRIS accept Datalog rules and facts in a Prolog-style input syntax. For these cases, we utilize the same first-order representation as in the W3C OWL2 RL specification [13], namely T(s, p, o) (since predicates may also be variables, a representation such as predicate(subject, object) is not an option in non-HiLog languages).

Currently, our converters support SPIN functions that represent primitive comparators (greater, equal, etc.) and logical connectors in FILTER clauses. Advanced SPARQL query constructs, such as (not-)exists, optional, minus and union, are not yet supported. None of the OWL reasoners (Section 2.4.1) required (data) conversion, since they can consume serializations of OWL in RDF out of the box.

Extensibility: To plug in a new resource format, developers can create a new converter class implementing the uniform converter interface. The class is then added to a configuration file (spin2s.txt / rdf2s.txt), used by the Web service to dynamically load converter class definitions at startup. Each converter identifies its own format via a unique ID,

allowing incoming conversion requests to be matched to the correct converter.

2.2. Ruleset Selection Service

To optimize OWL2 RL reasoning on mobile platforms, the Ruleset Selection Service automatically applies the OWL2 RL ruleset selections presented in this paper (Section 5), given one or more selection criteria. Indeed, due to its rule-based axiomatization, the OWL2 RL profile greatly facilitates applying subsets of axioms. In Section 5, we discuss relevant selection criteria in detail, such as logical equivalence with other rules, and subsets based on purpose and reference. As before, since the only available API for SPIN (i.e., the input rule format) [30] is developed for Java, this component is deployed in the Web service.

The Default Selection function selects an OWL2 RL subset, given a list of selection criteria indicating rules and axioms to leave out, replace or add. The Domain-based Selection function leaves out rules that are not relevant to a given ontology – i.e., rules that will not yield any additional inferences (Section 5.2.2). Typically, a ruleset selection is performed once, before reasoning takes place, and re-executed in case of ontology updates that affect the selection (e.g., schema updates; Section 5.2.1). Hence, the usefulness of selections will depend on whether the ontology is prone to frequent, relevant updates at runtime. This is especially true in our current setup, where such updates require re-invoking the remote service at runtime, causing considerable overhead. By deploying the service directly on the mobile device, and even integrating it with the reasoner, this drawback could be mitigated (see future work).

Extensibility: To support a new selection criterion that requires an a priori analysis of the ontology, developers can create a new subclass of the DomainBasedSelection class. Else, the developer can simply add a new subfolder under the owl2rl/ folder in MobiBench, which keeps a list of rules and axioms to be removed, replaced or added.

2.3. Pre-Processing Service

The Pre-Processing Service performs pre-processing of the ruleset and target ontology to support OWL2 RL-based reasoning, if required. In particular, the service implements 3 solutions to support n-ary rules (see Section 4.2.3): (1) instantiate the rules, based on schema assertions found in the ontology; (2) normalize (or "binarize") the input ontology to only contain binary versions of the n-ary assertions, and apply binary versions of the rules; and (3) replace each rule by 3 auxiliary rules.

When applying solutions (1) and (2), pre-processing needs to occur initially and each time the ontology is updated. Solution (3) does not have this drawback, but infers n−1 intermediary inferences for each "complete" inference for an n-ary assertion, which do not follow from the OWL2 RL semantics. The choice between these solutions thus depends on the scenario, e.g., whether the ontology is prone to frequent updates. As before, deploying this service on the mobile device could alleviate these drawbacks (see future work). Currently, it is deployed on the Web service since only a Java SPIN API is available.

Extensibility: To support a new pre-processing mechanism, developers can create a new subclass of the PreProcessor class. In case the mechanism requires ontology analysis (cf. solutions (1), (2)), OntologyBasedPreProcessor should be subclassed.

2.4. Benchmark Engine

The Benchmark Engine performs benchmarks of reasoning engines, following a particular reasoning setup. A reasoning setup includes a reasoning task and process flow. By supporting different setups, and allowing new ones to be plugged in, benchmarks can be better aligned to real-world scenarios.

In Section 2.4.1, we elaborate on the currently supported reasoning engines. Next, we discuss the available reasoning tasks (Section 2.4.2) and process flows (Section 2.4.3), as well as the supported benchmark measurement criteria (Section 2.4.4).

2.4.1. Reasoning Engines

Below, we categorize the currently supported engines according to their reasoning support. Engines that are neither indicated as Android systems nor JavaScript (JS) engines were manually ported to Android. In this categorization, we consider a rule engine to be any system that can calculate the deductive closure of a ruleset, i.e., execute a ruleset and output the resulting inferences (though not necessarily limited to this).

Rule-based systems

AndroJena [61] is an Android-ported version of Apache Jena [4]. It supplies a rule-based reasoner, which supports both forward and backward chaining, respectively based on the RETE algorithm [17] and SLG resolution [15].

RDFQuery [70] is a JavaScript RDF store that performs queries using a RETE network, and implements a naïve reasoning algorithm.

RDFStore-JS [71] is a JavaScript RDF store, supporting SPARQL 1.0 and parts of SPARQL 1.1. We extended this system with naïve reasoning, accepting rules as SPARQL 1.1 INSERT queries.

IRIS (Integrated Rule Inference System) [9] is a Java Datalog engine meant for Semantic Web applications. The system relies on bottom-up evaluation combined with Magic Sets [7].

PocketKrHyper [44] is a J2ME first-order theorem prover based on a hyper tableaux calculus, and is meant to support mobile semantic apps. It supplies a DL interface that accepts DL expressions and transforms them into first-order logic.

OWL reasoners

AndroJena supplies an OWL reasoner, which implements OWL Lite (incompletely) and supports full, mini and micro modes that indicate custom expressivities; and an RDFS reasoner, similarly with full, default and simple modes. For details, we refer to the Jena documentation [63].

The ELK reasoner [26] supports the OWL2 EL profile, and performs (incremental) ontology classification. Further, Kazakov et al. [25] have demonstrated that it can take advantage of the multi-core CPUs of modern mobile devices.

HermiT [18] is an OWL2 DL reasoner based on a novel hypertableaux calculus, and is highly optimized for performing ontology classification.

JFact [66] is a Java port of the FaCT++ reasoner, which implements a tableau algorithm and supports OWL2 DL expressivity.

Pellet [45] is a DL reasoner with sound and complete support for OWL2 DL, featuring a tableaux reasoner. It also supports incremental classification.

In Section 6.3, we list the reasoning engines utilized in our benchmarks.

Extensibility: To support a new JS reasoner, the developer writes a JS plugin object, which implements a uniform reasoner interface and specifies the accepted rule and data format, the process subflow (if any) dictated by the engine (Section 2.4.3), and its available settings (e.g., reasoning scope (OWL, RDFS)).
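As a rough illustration, such a JS plugin object might look as follows; the interface member names (ruleFormat, load, reason) are assumptions based on the description above, and the timer object stands in for the ExperimentTimer API so the sketch stays self-contained.

```javascript
// Stand-in for the ExperimentTimer API: records elapsed milliseconds per label.
const timer = {
  times: {},
  start(label) { this.times[label] = Date.now(); },
  stop(label)  { this.times[label] = Date.now() - this.times[label]; }
};

// Hypothetical JS reasoner plugin implementing a uniform reasoner interface.
const myJsReasonerPlugin = {
  id: "my-js-reasoner",
  ruleFormat: "sparql-update",   // accepted rule format
  dataFormat: "rdf/turtle",      // accepted data format
  facts: [],
  load(data) {                   // uniform operation: load data
    timer.start("load");
    this.facts.push(...data);
    timer.stop("load");
  },
  reason(rules) {                // uniform operation: execute rules
    timer.start("reason");
    const inferences = [];       // ...invoke the actual engine here...
    timer.stop("reason");
    return inferences;
  }
};

myJsReasonerPlugin.load([":tom a :Cat"]);
```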
To rule out communication, console output, etc. from influencing measurements, each plugin captures its own fine-grained result times using our ExperimentTimer API. Any required JavaScript libraries, as indicated by the plugin, are automatically loaded. Developers register their plugins in an engine.json file.

For native engines, the developer similarly implements a native plugin class, and supplies a skeleton JS plugin. The system wraps this skeleton plugin with a proxy object that delegates invocations to the native plugin over the Cordova bridge (see Fig. 1). In practice, native (Android) reasoners often have large numbers of dependencies, some of which may be conflicting (e.g., different versions of the same library). To circumvent this issue, we package each engine and its dependencies as jar-packaged .dex files, which are automatically loaded at runtime. For more details, we refer to our online documentation [51].

2.4.2. Reasoning Tasks

Currently, we support three reasoning tasks. Fig. 2 illustrates the dependencies between these tasks.

1) Rule-based materializing inference: computing the deductive closure of a ruleset for a dataset, and adding all inferences to the dataset.

2) OWL2 materializing inference: given an ontology, materialize all inferences based on an OWL2 expressivity (e.g., OWL2 Full, OWL2 DL, OWL Lite, or some other reduced expressivity). This task can also be performed by rule engines, e.g., using the rules axiomatizing the OWL2 RL semantics. Fig. 2 shows two types of OWL inference: "built-in" inference of any kind (e.g., OWL2 DL, QL, Lite, etc.), which only requires an input ontology; and OWL2 RL reasoning, which uses a rule engine and accepts both an OWL2 RL ruleset and ontology as input.

Regarding our choice for materializing inferences vs. reasoning per query (e.g., via resolution methods such as SLG [15]), we note that each has its advantages and drawbacks on mobile platforms.
Prior to data access, the former involves an expensive pre-processing step that may significantly increase the dataset scale, which is problematic on mobile platforms, but it then leaves query answering purely dependent on the speed of data access. In contrast, the latter incurs a reasoning overhead for each query, which depends on dataset scale and complexity. Another materialization drawback is that inferences need to be (re-)computed whenever new data becomes available. For instance, Motik et al. [36] combine materialization with a novel incremental reasoning algorithm, to efficiently update previously drawn conclusions. To allow benchmarking such incremental methods, our framework supports an "incremental reasoning" process flow (Section 2.4.3). For the purposes of this paper, we chose to focus on a materialization approach, although supporting resolution-based reasoning is considered future work. We note that many Semantic Web rule-based reasoners, including DLEJena [35], SAOR [22], OwlOntDb [16] and RuQAR [5], also follow a materialization approach.

3) Service matching: checks whether a user goal, which describes the services the user is looking for, matches a service description. In its rule-based implementation, a pre- or post-condition / effect from one description (e.g., the goal) acts as a rule; and its counterpart condition from the other (e.g., the service) serves as a dataset, which is obtained by "freezing" variables, i.e., replacing them by constants. A match is found when rule execution infers the consequent. We note that this rule-based task can easily be enhanced with ontology reasoning – i.e., by including an OWL2 RL ruleset with the match rule(s), and relevant ontology elements in the match dataset – which is one of the additional advantages of utilizing a rule-based OWL axiomatization. In mobile settings, service matching enables mobile apps to locate useful services in a smart environment, with all necessary computation taking place on the mobile platform (see e.g., [53]). While our benchmarks do not measure the performance overhead of service matching, this is considered future work.

Fig. 2. Reasoning types.

Extensibility: Reasoning tasks are implemented as JS classes, with a hierarchy as shown in Fig. 2. A new reasoning task class needs to implement the inference function, which realizes the task by either directly invoking the reasoner interface (see Section 2.4.1), delegating to another task class (e.g., Rule-based inference), or delegating to a subflow (see Section 2.4.3 – Extensibility). The Reasoning task superclass provides functions such as checking conformance, collecting result times, and logging inferences. A new task file should be listed in tasks.json.

2.4.3. Process Flows

To better align benchmarks with real-world use cases, MobiBench supports several process flows, which dictate the times at which operations (e.g., load data, execute rules / perform reasoning) are performed.
From previous work [54], [55], and in line with our choice for materializing inferences, we identified two useful process flows:

Frequent Reasoning: in this flow, the system stores all incoming facts directly in a data store (which possibly also includes an initial dataset). To generate new inferences, reasoning is periodically applied to the entire datastore. Concretely, this entails loading a reasoning engine with the entire datastore each time a certain timespan has elapsed, applying reasoning, and storing new inferences into the datastore.

Incremental Reasoning: here, the system applies reasoning for each new fact (currently, MobiBench only supports monotonic reasoning, and thus does not deal with deletions). In this case, the reasoning engine is first loaded into memory (possibly with an initial dataset). Then, reasoning is (re-)applied for each incoming fact, whereby the new fact and possible inferences are added to the dataset. Some OWL reasoners directly support incremental reasoning, such as ELK and Pellet. As mentioned, Motik et al. [36] implemented an algorithm to optimize this kind of reasoning, initially presented by Gupta et al. [20].

Further, we note that each reasoner dictates a subflow, which imposes a further ordering on reasoning operations. In case of OWL inference (implemented via e.g., tableau reasoning), data is typically first loaded into the engine, and then an inference task is performed (Load Data → Perform Inference). Similarly, RDFQuery, RDFStore-JS and AndroJena first load data and then execute rules. For the IRIS and PocketKrHyper engines, rules are first loaded (e.g., to build the Datalog KB), after which the dataset is loaded and reasoning is performed (Load Rules → Load Data → Execute). For more details, we refer to previous work [54].

Extensibility: Process flows are implemented as JS classes. Each main process flow is listed in flows.json, and will call a reason task at certain times (e.g., frequent vs. incremental) and with particular parameters (e.g., entire dataset vs. new fact). A subflow is specific to a particular reasoning task (see Section 2.4.2). A Reason task may thus utilize a subflow class behind the scenes, in case multiple subflows are possible. When called, a subflow class executes the uniform reasoning functions (e.g., load data, execute) in the appropriate order.
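To make the difference between the two main process flows concrete, the sketch below contrasts them; `engine` stands in for any plugged-in reasoner exposing the uniform operations, the datastore is a plain array, and the function names are illustrative, not MobiBench's actual classes.

```javascript
// Frequent reasoning: periodically reload the *entire* datastore into the
// engine, reason, and persist the new inferences back to the store.
function frequentReasoning(engine, datastore) {
  engine.load(datastore.slice());
  const inferences = engine.reason();
  datastore.push(...inferences);
  return datastore;
}

// Incremental reasoning: re-apply reasoning for each incoming fact; the fact
// and any inferences are added to the dataset (monotonic reasoning only).
function incrementalReasoning(engine, dataset, newFact) {
  engine.load([newFact]);
  const inferences = engine.reason();
  dataset.push(newFact, ...inferences);
  return dataset;
}
```

What differs between the two flows is only when, and on how much data, the engine is invoked; the uniform operations themselves stay the same.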

2.4.4. Measurement Criteria

The Benchmark Engine allows studying and comparing the metrics listed below.

Performance:
Loading times: time needed to load data and rules, ontologies, etc. into the engine.
Reasoning times: time needed to infer new facts or check for entailment.
Memory consumption: total memory consumed by the engine after reasoning. Currently, it is not feasible to measure this criterion for non-native engines; we revisit this issue in Section 6.4.

Conformance:
The Benchmark Engine can automatically compare inferences to the expected output for conformance checking (Section 5.4). As such, MobiBench allows investigating the completeness and soundness of inference as well (cf. [19]).

Other related works focus on measuring the fine-grained performance of specific components [33], such as large joins, Datalog recursion and default negation. In contrast, MobiBench aims to find the most suitable reasoner on a mobile platform given an application scenario (e.g., reasoning setup, dataset). Our performance metrics support this objective.

We further note that the performance of the remotely deployed services, i.e., the Uniform Conversion Layer (Section 2.1), Ruleset Selection (Section 2.2) and Pre-Processing (Section 2.3) services, is not measured.
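The conformance check described above can be sketched as follows, with triples represented as plain strings for simplicity; the report shape is an assumption for illustration.

```javascript
// Compare actual inferences against the expected output: missing triples
// indicate incompleteness, unexpected triples indicate unsoundness.
function checkConformance(actual, expected) {
  const act = new Set(actual), exp = new Set(expected);
  return {
    missing: [...exp].filter(t => !act.has(t)),     // expected but not inferred
    unexpected: [...act].filter(t => !exp.has(t)),  // inferred but not expected
    conforms: exp.size === act.size &&
              [...exp].every(t => act.has(t))
  };
}

const report = checkConformance(
  [":c rdfs:subClassOf :a"],
  [":c rdfs:subClassOf :a", ":c rdfs:subClassOf :b"]
);
// → report.missing: [":c rdfs:subClassOf :b"], report.conforms: false
```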
