Open Source BI Platforms: A Functional And Architectural . - Unibo.it

Open Source BI Platforms: a Functional and Architectural Comparison
Matteo Golfarelli
DEIS – University of Bologna

Agenda:
1. Introduction
2. Conduct of the Comparison
3. Platforms description
4. Discussion

DAWAK - 09

Open Source BI Software
The BI market is historically dominated by commercial vendors:
– Microstrategy
– Oracle BI suite
– Microsoft BI
Only recently have Open Source solutions appeared, as single tools first and as complete platforms later.
An OS BI Platform provides a full spectrum of BI capabilities within a unified system that reduces the overhead for the development and management of each application, and lets the user feel as if she were using a single BI solution.
Commercial platforms are commonly considered superior to OS ones, which, on the other hand, can evolve faster.

Conduct of the comparison
This work compares three different OS BI platforms:
– JasperSoft
– Pentaho
– SpagoBI
The versions considered are those released by December 31, 2008.
The outcome of the analysis is the fusion of our independent analysis and the testing and evaluation activities of three consulting firms specialized in BI projects.
1. We initially defined an evaluation grid describing in detail the aspects to be investigated.
2. The resulting grid was shared with the consulting firms and further discussed and integrated.
3. Each consulting firm carried out one or more portings of real projects previously implemented through commercial BI suites.
4. The compiled grids were finally shared and discussed with the other participants.

Top level of the comparison grid
Non-technical: platform philosophy, type of licensing and availability of enterprise editions.
Architectural: the global framework, modules and their relationships, programming languages and supported operating systems.
Functional: functionalities provided natively by the platforms or made available to the users through the integrated BI tools.
Metadata: expressiveness, completeness, standardization and level of reusability.
Security: functionalities provided for authentication and profiling of the users, interfaces to external authentication systems and secure data transmission.
Usability: both from the users' viewpoint, in terms of level of transparency in using the different tools, and from the developers' and system administrators' viewpoint, in terms of complexity of installation and administration, ease of application development, and quality of manuals and forums.
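As an illustration only, the top level of the grid can be held as a mapping from each dimension to the aspects it covers. This is a minimal sketch, assuming a Python representation that the study itself does not prescribe; the names `EVALUATION_GRID`, `PLATFORMS` and `empty_scorecard` are hypothetical.

```python
# Hypothetical sketch: the six top-level dimensions of the comparison grid,
# each mapped to the aspects named on the slide.
EVALUATION_GRID = {
    "Non-technical": ["platform philosophy", "type of licensing",
                      "availability of enterprise editions"],
    "Architectural": ["global framework", "modules and their relationships",
                      "programming languages", "supported operating systems"],
    "Functional": ["native functionalities",
                   "functionalities of integrated BI tools"],
    "Metadata": ["expressiveness", "completeness", "standardization",
                 "level of reusability"],
    "Security": ["authentication", "user profiling",
                 "external authentication interfaces",
                 "secure data transmission"],
    "Usability": ["user viewpoint", "developer viewpoint",
                  "administrator viewpoint"],
}

PLATFORMS = ["JasperSoft", "Pentaho", "SpagoBI"]


def empty_scorecard(grid, platforms):
    """Build one blank note slot per (platform, dimension, aspect)."""
    return {
        platform: {dim: {aspect: None for aspect in aspects}
                   for dim, aspects in grid.items()}
        for platform in platforms
    }


scorecard = empty_scorecard(EVALUATION_GRID, PLATFORMS)
print(len(EVALUATION_GRID), "dimensions,", len(scorecard), "platforms")
```

Each evaluator would fill the `None` slots with notes, so that the compiled grids can be compared cell by cell.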

Non-technical aspects: open source model
Commercial open source: provides for separate product releases.
– The community edition meets the user's basic needs and is completely free.
– The enterprise edition can be purchased and usually includes enhanced features as well as support and training services.
Jasper and Pentaho fit into this model.
Free and Open Source Software (FOSS): the product is completely free and no enterprise edition is available, so all the functionalities are available to the community for free. SpagoBI fits into this model.

Non-technical aspects: plugging
Integration: a software interface is defined in order to control and exploit module functionalities directly and transparently through the platform.
– The intellectual property of the software does not change, and the original developers remain in charge of maintaining and evolving the module.
– SpagoBI is strictly based on integration.
Acquisition: the intellectual property of the software is acquired and the original project terminated.
– The buyer is in charge of maintaining and evolving the module.
– Pentaho often resorts to acquisition (e.g. Pentaho ETL comes from the Kettle project).
Technological partnership: stands midway between integration and acquisition.
– The original project remains alive and is maintained by the original developers.
– The partner that incorporates the module influences its evolution and collaborates in its maintenance. The module usually appears under a different name in the new platform.
– Jasper mainly exploits partnerships (e.g. JasperETL was developed through a partnership with Talend, which still maintains Talend Open Studio).
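The integration approach can be pictured as an adapter layer: the platform talks to one interface, and each external module is wrapped behind it without being modified. The following is a minimal sketch under that assumption; `BIEngine`, `ReportEngineAdapter` and `Platform` are hypothetical names for illustration and do not correspond to the API of SpagoBI or any other platform.

```python
from abc import ABC, abstractmethod


class BIEngine(ABC):
    """Hypothetical interface the platform uses to drive any external module."""

    @abstractmethod
    def name(self) -> str:
        ...

    @abstractmethod
    def execute(self, document: str, parameters: dict) -> str:
        ...


class ReportEngineAdapter(BIEngine):
    """Stand-in wrapper around an independently maintained reporting module."""

    def name(self) -> str:
        return "reporting"

    def execute(self, document: str, parameters: dict) -> str:
        # A real adapter would delegate to the external engine here;
        # this stub only echoes the request.
        return f"report:{document}:{sorted(parameters.items())}"


class Platform:
    """Routes user requests to registered engines through the shared interface."""

    def __init__(self):
        self._engines = {}

    def register(self, engine: BIEngine) -> None:
        self._engines[engine.name()] = engine

    def run(self, engine_name: str, document: str, parameters: dict) -> str:
        return self._engines[engine_name].execute(document, parameters)


platform = Platform()
platform.register(ReportEngineAdapter())
result = platform.run("reporting", "sales", {"year": 2008})
print(result)
```

Because the platform depends only on the interface, the wrapped module can keep evolving under its original developers, which is the property the integration model relies on.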

Architectural aspects
The platforms adopted the same architecture.
OS BI platforms are developed using Java, since the modules they rely on are based on this technology.
[Figure: common architecture of the platforms. Recoverable labels: End User, Portal, Web server, Application server, BI Platform, BI engines, OLAP, Mining, ETL, Administrative tools, Metadata, DWH, Source DB.]

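Architectural aspects: modules
Many of the modules are shared.
Some of them are evolutions of a different open source project.
Some modules are de facto standards within BI OS (Mondrian, JPivot, Weka).

Module | Jasper | Pentaho | SpagoBI
Application server | JBoss | JBoss | JBoss
Authentication and user profiling | Acegi | Acegi | Integrated in [remainder garbled in the transcription]
Data mining | - | Weka | Weka
DBMS | MySQL, Oracle, SQL Server, PostgreSQL, etc. | MySQL, Oracle, SQL Server, PostgreSQL, etc. | MySQL, Oracle, SQL Server, PostgreSQL, etc.
ETL | JasperETL | Pentaho Data Integration | Talend Open Studio
Geo-referenciation | Google Maps | Google Maps | GEO
Job scheduler | Quartz | Quartz | Quartz
OLAP | Mondrian & JPivot | Mondrian & JPivot | Mondrian & JPivot
Portal | Liferay | JBoss Portal | ExoPortal, Liferay
Query by Example | - | - | Hibernate
Reporting | JasperReport | Pentaho Report Designer, JasperReport, BIRT | JasperReport, BIRT
Single sign-on | Acegi | CAS | CAS
Web server | Tomcat | Tomcat | Tomcat

[JFreeChart and OpenLaszlo also appear in the table; their row label and column assignment are not recoverable from the transcription.]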
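The claim that many modules are shared can be checked mechanically once the module table is put in a data structure. The sketch below is an illustration only: it encodes a subset of the rows that are legible in the slide, in the column order Jasper, Pentaho, SpagoBI, and the names `MODULES` and `shared_by_all` are hypothetical.

```python
# Subset of the module table, as (Jasper, Pentaho, SpagoBI).
# None marks a module category the platform does not provide.
MODULES = {
    "Application server": ("JBoss", "JBoss", "JBoss"),
    "Data mining": (None, "Weka", "Weka"),
    "ETL": ("JasperETL", "Pentaho Data Integration", "Talend Open Studio"),
    "OLAP": ("Mondrian & JPivot", "Mondrian & JPivot", "Mondrian & JPivot"),
    "Portal": ("Liferay", "JBoss Portal", "ExoPortal, Liferay"),
    "Single sign-on": ("Acegi", "CAS", "CAS"),
    "Web server": ("Tomcat", "Tomcat", "Tomcat"),
}


def shared_by_all(modules):
    """Return the categories where all three platforms use the same module."""
    return sorted(
        category
        for category, row in modules.items()
        if row[0] is not None and len(set(row)) == 1
    )


shared = shared_by_all(MODULES)
print(shared)
```

On this subset the shared categories are the application server, the OLAP engine and the web server; the platforms differ in ETL, portal and single sign-on.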

Metadata
Within a BI platform, metadata largely determine the behavior it can exhibit.
Metadata necessary to specific BI functionalities are usually created outside the platforms, by editing an XML file or by using simple graphical tools.
Although they model the same information, metadata belonging to different engines are coded differently and cannot be reused. This negatively affects development and maintenance.
Although all three platforms declare that their metadata are CWM-compliant, no interoperability tools have been released yet!

Functional aspects
The SpagoBI community edition outperforms those of Pentaho and Jasper, which make many of the advanced features available only in their enterprise editions.
All the platforms allow secure data transmission as well as user authentication, while they offer quite different functionalities for user profiling.
[Table: functionalities by platform. Columns: SpagoBI, Pentaho, Pentaho Ent. Ed., Jasper, Jasper Ent. Ed. Rows: Activities scheduling, Ad-hoc reporting, Auditing, Collaborative BI, Data Mining, Dashboard, Document export, ETL, Geo-referenced analysis, OLAP, Query by Example, Report validation workflow, Reporting, User profiling. The per-cell marks are not preserved in the transcription.]

Community vs Enterprise editions
Differences are not only in terms of functionalities available to the users but also in terms of utilities for administrators and developers:
– Improved administration consoles: the improvement is particularly relevant in Pentaho, where the Enterprise console closes the gap with Jasper in usability and functionalities.
– Wizard-based configuration: most configuration activities are based on wizards and do not require manual access to configuration files or repeated navigation through menus.
– Process monitoring: front-end (e.g. query execution) as well as back-end (e.g. ETL) processes can be monitored and analyzed in order to optimize their execution.
– ETL debugging environment: it is available and yields a strong reduction of the development effort.
Administrators and developers are further supported through wider documentation, a knowledge base, and consulting and training services.
Such enhancements, together with warranties and certification of the software, become more and more relevant when you are developing a mission-critical application or planning to adopt the platform in a large and complex organization.

Usability: users' point of view
Platform usability is largely determined by the BI engines composing the platforms.
We consider the usability of those engines qualitatively satisfactory. Although they do not reach the level of refinement of the commercial suites, their graphical features give the developed applications an appreciable look-and-feel.
OS BI platforms also succeed in hiding the access to different tools.

Usability: administrators' point of view
Complexity of the installation and configuration process
– Installation procedures are in general quite easy. This is particularly true for Pentaho and JasperSoft, whose installation procedures rely completely on a wizard.
Administration complexity
– In SpagoBI, and even more in Jasper, we appreciated the ease of the form-based procedures.
Problem solving and training effort
– Manuals are of good quality and allow most problems to be solved.
– Several practitioners' forums make a large number of tips available.
  Pentaho: more than 20,000 registered users.
  Jasper: about 90,000 registered users (but in many cases we experienced longer response times).
  The SpagoBI community is definitely smaller, and its forum is correspondingly less active.

Discussion
Our analysis shows that OS BI platforms provide added value with respect to single BI tools:
– several functionalities are accessed transparently
– a set of processes is centralized and simplified
The main shortcoming of the platforms is the absence of a fully centralized and unified metadata layer.
The capabilities of the administrative tools could also be improved in the community editions.
SpagoBI functionalities are comparable to those of the enterprise editions by Jasper and Pentaho.

Discussion
Although OS BI platforms are still not as sophisticated as commercial ones, they have reached a sufficient level of reliability and must be considered a valid alternative to commercial suites.
– This is particularly true in small and medium-sized enterprises, where the quantity of data and the workload are not critical points.
– Several companies are evaluating the use of OS BI in pilot projects, where budget constraints are typically very tight.
The main risks related to an investment in OS technology come from:
– Unexpected termination of the project.
– The adoption of more restrictive licensing for new releases.
– The impossibility of predicting whether, apart from the initial investment, the companies in charge of the platforms will earn enough from services and application development to stay on the market.
