Federated Data Usage Statistics In The Earth System Grid Federation

Transcription

Federated data usage statistics in the Earth System Grid Federation
A. Nuzzo, M. Mirto, P. Nassisi, S. Fiore, G. Aloisio
CMCC Foundation (Euro-Mediterranean Center on Climate Change)
6th Annual ESGF Conference
Dec 6-9 2016, Washington, DC

Outline
- Goals and main tasks
- Architecture in the large – single node level
- Architecture in the large – federation level
- Federation protocol
- New Dashboard-UI module

Goals and main tasks
The main goal of the DWT was to provide a distributed and scalable monitoring framework responsible for:
- capturing usage metrics, system status and aggregated information at the single site level and at the federated level
- providing the user with a user-friendly interface, including widgets showing aggregated statistics and monitoring information.
The Dashboard system addresses this challenge through two main components:

Architecture in the large – Single node level
ESGF DATA NODE
Features:
- Extended set of statistics
- Fine grain level
- Project specific views
- More scalable design
Diagram components: ESGF node manager filter; esgf dashboard; ESGCET; DASHBOARD QUEUE (ID, url path, duration, size, timestamp, success, processed); getMetadata(url path); Multi-Tier Database (1 data warehouse, a set of data marts); ETL; ESGF dashboard; METADATA; SOLR METADATA; ESGF DASHBOARD-UI

Architecture in the large – Federation level
Until now, the federated statistics have been collected by manually executing a set of different queries on the various data nodes and importing the results into a single database.
- The federated protocol is based on a hierarchical view of the system
- Two kinds of nodes:
  - Collector node
  - Leaf node

Architecture in the large – Federation level
Leaf node
- Dashboard back-end engine
  - A data warehouse storing all the data related to the downloads
  - A set of data marts containing specific statistics information
- A set of RESTful APIs giving the collector node access to the data marts to retrieve the statistics.
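As a rough illustration of how a leaf node might expose its data marts over a RESTful API, the sketch below uses only the Python standard library. The /stats/<mart> URL layout, the data mart names and the numbers in DATA_MARTS are placeholders invented for this example; the slides do not specify the actual endpoints or payloads.

```python
import json
from http.server import BaseHTTPRequestHandler

# Placeholder data marts with made-up values; a real leaf node would read
# these from its own database instead of a hard-coded dictionary.
DATA_MARTS = {
    "downloads_per_project": {"CMIP5": 120, "obs4MIPs": 45},
    "users_per_continent": {"Europe": 30, "North America": 25},
}

def leaf_response(path):
    """Map a request path such as /stats/<mart> to (status, JSON bytes)."""
    prefix = "/stats/"
    name = path[len(prefix):] if path.startswith(prefix) else ""
    if name in DATA_MARTS:
        return 200, json.dumps(DATA_MARTS[name]).encode("utf-8")
    return 404, json.dumps({"error": "unknown data mart"}).encode("utf-8")

class LeafHandler(BaseHTTPRequestHandler):
    """Read-only HTTP handler that serves one data mart per GET request."""

    def do_GET(self):
        status, body = leaf_response(self.path)
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Silence the default per-request logging for this sketch.
        pass
```

To try it locally, the handler can be served with http.server.HTTPServer(("", 8080), LeafHandler).serve_forever(); keeping the routing in the plain function leaf_response makes it testable without opening a socket.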

Architecture in the large – Federation level
Collector node
The collector node has a more complex structure because, in addition to making its own information available through the RESTful API, it is in charge of querying its leaf nodes.
The collector node is composed of:
- data warehouse and data marts
- RESTful API
- XML configuration file
- federation component
A first prototype of this protocol has been successfully installed and tested on four sites: CMCC, DKRZ, NASA/JPL, PCMDI
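The collector's role can be sketched as follows: read the list of leaf nodes from an XML configuration file, query each leaf, and sum the results. The XML schema, the example.org URLs, the /stats/<mart> path and the function names are assumptions made for illustration only; the real configuration format and federation protocol are not described in the slides.

```python
import json
import xml.etree.ElementTree as ET
from urllib.request import urlopen

# Hypothetical configuration: one <leaf> element per leaf node.
CONFIG_XML = """
<federation>
  <leaf name="leaf-a" url="http://leaf-a.example.org/stats"/>
  <leaf name="leaf-b" url="http://leaf-b.example.org/stats"/>
</federation>
"""

def read_leaves(xml_text):
    """Return (name, base_url) pairs for every leaf listed in the config."""
    root = ET.fromstring(xml_text)
    return [(leaf.get("name"), leaf.get("url")) for leaf in root.findall("leaf")]

def fetch_json(url, timeout=10):
    """Default fetcher: GET the URL and decode the JSON body."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)

def collect(xml_text, mart, fetch=fetch_json):
    """Query one data mart on every leaf and sum the per-key counts.

    Returns (totals, failed), where failed lists leaves that could not be
    reached, so one unavailable site does not abort the whole collection.
    """
    totals, failed = {}, []
    for name, base_url in read_leaves(xml_text):
        try:
            stats = fetch(f"{base_url}/{mart}")
        except OSError:  # urllib.error.URLError is a subclass of OSError
            failed.append(name)
            continue
        for key, value in stats.items():
            totals[key] = totals.get(key, 0) + value
    return totals, failed
```

Passing the fetch function as a parameter keeps the aggregation logic independent of the network, so it can be exercised with a stub before being pointed at real leaf nodes.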

New Dashboard-UI – Statistics Overview

New Dashboard-UI – Statistics Overview
Registered Users and Number of Downloads per Project

New Dashboard-UI
Number of Downloads by Continent and Countries

New Dashboard-UI
Number of Users by Continent and Countries

New Dashboard-UI
Number of Registered Users by IdPs

New Dashboard-UI – Data Usage Statistics section
Number of downloads over time
Number of downloads per host

New Dashboard-UI – Project specific section
Obs4MIPs project

New Dashboard-UI – Client statistics section
Number of users who made a download

New Dashboard-UI – Federated Data Archive section
Filter by data node

New Dashboard-UI – Federated Data Archive section
Total number of datasets and related size for each Model and Modeling Institute for the CMIP5 project (data obtained by the SOLR module).

New Dashboard-UI – Service status section
Deployment distribution

Thank you
