DATA SCIENCE COMPETENCY FRAMEWORK


COMMERCIAL IN CONFIDENCE

CONTENTS

INTRODUCTION
- Defining Competency and Competency Framework
- The Data Science Competency Framework
- Data Science Related Job Families

DATA SCIENCE/ANALYTICS SOLUTION LIFE-CYCLE COMPETENCIES
- Business Understanding
- Data Understanding
- Prepare and Preprocess Data for Modelling
- Create, Test and Validate Models
- Deploy Models
- Business Insights
- Monitor and Assess Models

TECHNICAL COMPETENCIES
- Data Science Tools
- Big Data Analytics

CORE COMPETENCIES
- Project Management
- Thinking and Problem Solving

DIRECTOR LEVEL COMPETENCIES
- Data Science/Analytics Solution Life-cycle Competencies
- Technical Competencies
- Core Competencies

DATA TO DECISIONS CRC

© 2017 D2D CRC Ltd

These materials are subject to copyright and are protected by the Copyright Laws of Australia. All rights are reserved. Any copying or distribution of these materials without the written permission of the copyright owner is not authorised.

INTRODUCTION

While various competency frameworks exist in relation to Information Technology, the needs of the evolving Data Science and Analytics field were previously undefined. After identifying this gap, Data to Decisions CRC (D2D CRC) embarked on a project to create a generic data science competency framework. By describing the skills, knowledge, experience and personal attributes relevant to working in data science/analytics, including Big Data, the Data Science Competency Framework aims to support the development of the Big Data workforce. In particular, the framework can be utilised to support workforce planning and recruitment, develop individuals and teams, highlight career pathways and enable competency recognition.

DEFINING COMPETENCY AND COMPETENCY FRAMEWORK

The terms ‘competency’ and ‘competencies’ focus on the inputs of an individual into the completion of a task. They can be defined as the behaviours (and technical attributes where appropriate) that individuals must have, or must acquire, to perform effectively at work.

Competencies generally include:
- Knowledge – the cognisance of concepts, theories, models and principles gained from formal training and/or experience
- Skill – a developed proficiency or dexterity in mental operations or physical processes that is often acquired through specialised training
- Experience – the accumulated application of skills and knowledge in practice
- Individual attributes – properties, qualities or characteristics of individuals that reflect one’s unique personal makeup

A ‘competency framework’ is a structure that sets out and defines each individual competency (such as problem-solving or people management) required by individuals working in an organisation or part of an organisation.

The competency framework does not attempt to describe everything that people within the field are required to do. For example, it does not include descriptions of software tools or technology-specific skills or knowledge, industry experience or qualifications, as these are often organisation specific. Similarly, the framework doesn’t identify the mix of competencies an individual, team or organisation may require, as these are often specific to the context. However, it does offer a suggestion as to which competencies are most likely to be found in different data science related job families and to what extent they are expected to be demonstrated.

(adapted from: The definition of competencies and competency frameworks, UK Chartered Institute of Personnel and Development and The definition of competencies and their application at UN, University of Nebraska-Lincoln)

THE DATA SCIENCE COMPETENCY FRAMEWORK

The structure of the Data Science Competency Framework is captured in the below diagram. It contains three key competency areas, each holding a number of relevant competencies.

- Data analytics solution life cycle – organises those competencies related to processing and managing data projects;
- Technical – includes competencies relating specifically to Big Data, technologies and tools;
- Core – identifies the data science related aspects of competencies that often have organisational relevance such as project management.

[Diagram: structure of the Data Science Competency Framework, showing the three competency areas (Data analytics solution life cycle, Technical, Core) and the competencies within each]

DATA RELATED JOB FAMILIES

After detailed research, workshops and meetings, three major job families were identified within the scope of the data science workforce. Covering the breadth of the data analytics solution life cycle and stack, the job families are identified as Data Scientist, Data Engineer and Data Analyst.

The job family descriptions provided below are not intended to be an exact match to an organisation’s role structures, as a wide variety of factors and organisational nuances impact role design. The descriptions are intended to be a guide that assists organisations to identify relevant competencies and the extent to which they might be expected to apply to their roles.

Additional job families may be added to the Data Science Competency Framework through future reviews, which will also incorporate any evolutionary changes or new trends impacting the data workforce and required competencies.

[Diagram: the Data Scientist, Data Analyst and Data Engineer job families mapped against the stages of the data solution life cycle: Business Understanding, Data Understanding, Data Preparation, Modelling, Test & Validate, Deployment, Communication of Insights, Ongoing Assessment]

The above diagram is a visual guide showing which job families have strengths over the data solution life cycle. The job families will also require some capability in the non-identified areas; however, these areas are less critical.

DATA SCIENTIST

A Data Scientist uses their knowledge of data mining techniques (statistics, machine learning, AI), their programming skills and business knowledge to extract insights from large sets of data. This includes cleaning, transforming and combining data sources, using mathematical models, machine learning and visualisations to analyse the data and communicating findings to business stakeholders. They are often required to produce answers in days, rather than months, and typically work via exploratory analysis using a variety of tools and languages. A Data Scientist uses their domain knowledge to interpret raw data and results.

A Data Scientist has an understanding of data storage and processing technologies, allowing them to perform analysis and develop models that will run efficiently and reliably within the technology constraints.

DATA ENGINEER

A Data Engineer designs, develops and maintains software architectures to collect and analyse large data sets. This includes installing, testing and configuring scalable databases and data processing systems and creating software components to collect, parse, manage, analyse and visualise data. They often tackle problems associated with data integration and unstructured data sets, employing a variety of programming languages and tools to combine data and systems and to improve data quality.

Data Engineers work with business stakeholders to determine the required data sets and analysis tools. Their goal is to provide clean, usable data and analytic output to other roles in the organisation.

There is an overlap between a Data Engineer and a Software Engineer, with the Data Engineer having a stronger emphasis on the data storage and processing layers, and a deeper understanding of data modelling and data lifecycle management.

DATA ANALYST

A Data Analyst interprets data and analytic output, using their domain knowledge to draw conclusions and support data driven decision making. This includes accessing, manipulating, querying and analysing data using a range of software and tools, applying statistical analysis techniques, and presenting the output of analysis to business stakeholders.

The goals of a Data Analyst are to identify and interpret trends or patterns in complex data sets, to recognise and define process improvement opportunities and to effectively summarise conclusions and recommendations.

There is an overlap between a Data Analyst and a Data Scientist, with the Data Analyst having more emphasis on interpreting results, presenting conclusions and making business recommendations.

MATURITY LEVELS

To help support implementation, the Data Science Competency Framework includes four broad groupings that reference the maturity levels within the Australian Public Service Work Level Standards (WLS) & Integrated Leadership System (ILS) and Skills Framework for the Information Age (SFIA). The following table maps the various maturity levels within APS WLS & ILS and SFIA against those within the Data Science Competency Framework. However, like the competencies themselves, the organisational and team context is likely to influence the maturity levels of roles in particular organisations, so the mapping is just a guide.

APS WLS & ILS    SFIA           D2D CRC
Level 4 & 5      Level 3 & 4    Practitioner
Level 6          Level 5        Senior
Exec. Level 1    Level 6        Lead
Exec. Level 2    Level 7        Director

The framework also includes an Awareness level to describe the level of competency for those who are not data science or data analytics practitioners but need to know about data science/analytics in order to maximise their effectiveness (for example, IT, marketing, service delivery, HR and intelligence analysts).
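For organisations that want to hold the maturity-level mapping in a machine-readable form (for example in a workforce planning tool), the table above could be encoded as a simple lookup. This is only a sketch: the dictionary layout and the function name `equivalents` are illustrative assumptions, not part of the framework, and the Awareness level is left out because the table gives no APS or SFIA equivalent for it.

```python
# Maturity-level equivalences copied from the mapping table.
# The structure and names below are hypothetical, chosen for illustration.
MATURITY_LEVELS = {
    "Practitioner": {"aps": "Level 4 & 5", "sfia": "Level 3 & 4"},
    "Senior": {"aps": "Level 6", "sfia": "Level 5"},
    "Lead": {"aps": "Exec. Level 1", "sfia": "Level 6"},
    "Director": {"aps": "Exec. Level 2", "sfia": "Level 7"},
}


def equivalents(level: str) -> dict:
    """Return the APS WLS & ILS and SFIA equivalents for a framework level."""
    try:
        return MATURITY_LEVELS[level]
    except KeyError:
        raise ValueError(f"Unknown maturity level: {level!r}") from None


lead = equivalents("Lead")
print(lead["aps"], "/", lead["sfia"])
```

As the framework itself notes, the mapping is a guide only, so any such lookup should be treated as a starting point rather than a fixed rule.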

JOB CHARACTERISTICS

The competency descriptions should be read in conjunction with the general maturity level information below. The job characteristics at each level offer additional information about the level of autonomy, influence, accountability and complexity.

Levels:
- Practitioner (equivalent to APS 4&5 or SFIA 3/4)
- Senior (equivalent to APS L6 or SFIA 5)
- Lead (equivalent to APS EL1 or SFIA 6)
- Director (equivalent to APS EL2 or SFIA 7)

Autonomy
- Practitioner: Works under general direction. Uses some discretion in less complex tasks. Work reviewed at frequent milestones.
- Senior: Works under general direction within a framework of accountability. Plans own work to meet given objectives and processes.
- Lead: Establishes team objectives and assigns responsibilities.
- Director: Strategy and policy formation and application on the organisational level.

Influence
- Practitioner: Interacts with and influences colleagues. Responsible for components of projects.
- Senior: Some responsibility for the work of others and allocation of resources. Influences the success of projects and team objectives.
- Lead: Influences policy and strategy formation. Initiates relationships with internal and external partners at senior management level.
- Director: Makes decisions critical to organisational success. Inspires and influences the organisation, and the industry at executive levels. Develops long-term strategic relationships with customers, partners, industry leaders and government.

Complexity
- Practitioner: Performs a variety of work with some level of complexity.
- Senior: Broad range of complex technical or professional activities. Resolves complex issues.
- Lead: Deep understanding of complex technical and business issues. Performs highly complex work activities. Contributes to the implementation of policy and strategy.
- Director: Leads formulation and implementation of policy and strategy. Deep understanding of the industry and the organisation including implications of emerging technologies.

Leadership and Accountability
- Practitioner: Analytical and systematic approach to work within deadlines. Effective communication and teamwork skills. Appreciates the business context.
- Senior: Selects applicable standards, methods, tools and applications. Communicates fluently and can present complex information. Facilitates collaboration. Plans, schedules and monitors work to meet time and quality targets. May coach other staff members.
- Lead: Authority and accountability for a significant area of work, including technical, financial and quality aspects. Absorbs complex information and communicates effectively at all levels. Manages and mitigates risk. Understands the implications of new technologies and industry changes. Demonstrates clear leadership. Understands applicable legislation and codes of practice, promoting compliance. Mentors and coaches other staff members.
- Director: Authority and accountability at an organisational level. High level strategic management and leadership skills. Understands and communicates complex ideas to all levels in a persuasive and convincing manner. Has a broad and deep business knowledge, including emerging technology and the activities of other organisations. Assesses the impact of legislation, and actively promotes compliance. Mentors and coaches other staff members.

DATA SCIENCE/ANALYTICS SOLUTION LIFE-CYCLE COMPETENCIES

BUSINESS UNDERSTANDING – PROBLEM IDENTIFICATION
Ability to establish the problem and whether it can be solved by analytics.

AWARENESS
- Being aware of what kind of business problems can be addressed by data-driven solutions.
- Understanding the principles and general ideas of creating a data science/analytics solution.

PRACTITIONER
- Utilises communication and collaboration skills to gain an understanding of the organisational problem from stakeholders.
- Utilises general knowledge of data science/analytics and experience of use cases and solutions to identify when an issue may be amenable to a data-driven solution and to assist with setting data science/analytics goals and deliverables.

SENIOR
- Utilises sound communication and collaboration skills to gain an understanding of the organisational problem from stakeholders.
- Establishes whether the problem is amenable to a data-driven solution using experience and knowledge of a variety of use cases and potential solutions.
- Ability to set data science/analytics goals and deliverables based on the established success criteria and to define key metrics of the solution’s success.

LEAD
- Utilises significant communication and collaboration skills to gain an understanding of complex organisational problems from multiple stakeholders.
- Knowledge of a variety of data science/analytics use cases that outline solutions to large-scale business issues.
- Experience in designing complicated, multi-staged data-driven solutions to real-world problems, including collecting business requirements and determining if the issue is amenable to a data-driven solution or finding the key components that are amenable to that.
- Ability to set data science/analytics goals and deliverables for complex projects and to identify key success metrics.

BUSINESS UNDERSTANDING – BUSINESS AND DATA UNDERSTANDING
Ability to utilise understanding of the organisation and its data.

AWARENESS
- Awareness of organisational data sources and the implications of utilising data to inform organisational strategy, decision making and service delivery.

PRACTITIONER
- General understanding of the organisational context, direction and key strategies.
- Understanding of a specific domain, the data applicable to the domain and the meaning of the data and its implications for the organisation and decision making.

SENIOR
- Understands the organisational context, functions and key strategies in-depth.
- Ability to apply detailed understanding of a specific domain, the data applicable to the domain and the meaning of the data for the organisation to enhance success, improve decision making and deliver insights.

LEAD
- Extensive understanding of the complexities of the organisation context, functions and strategy.
- Utilises organisational understanding and detailed knowledge of a range of data sets across a variety of domains to create insights impacting strategy, decision making and service delivery.

DATA UNDERSTANDING – DATA SOURCES IDENTIFICATION
Ability to establish the availability and accessibility of data.

AWARENESS
- Appreciate data requirements and issues of availability and accessibility.
- Appreciate the role of data wrangling/hacking in data science/analytics.

PRACTITIONER
- Ability to assist in establishing the key required internal and external data sources as well as data availability and accessibility.
- Utilises knowledge of data sources including how they are collected, where and how they are stored, within and external to the organisation, to verify relevance of potential data sources.

SENIOR
- Ability to establish the key required internal and external data sources as well as data availability and accessibility for moderate-scale projects.
- Utilises in-depth knowledge of a range of data sources including how they are collected, where and how they are stored, and interrelationships, both within and external to the organisation, to verify relevance of potential data sources.

LEAD
- Ability to lead the team to establish the key required internal and external data sources as well as data availability and accessibility for large-scale projects.
- Utilises extensive knowledge of a range of data sources both internal and external to the organisation, including how they are collected, where and how they are stored, and their interrelationships, to verify relevance of potential data sources.

DATA UNDERSTANDING – DATA ACQUISITION
Ability to acquire and wrangle data.

AWARENESS
- Awareness of privacy and ethical implications of sourcing internal and external data.

PRACTITIONER
- Ability to conduct data acquisition from relational databases and flat files.
- Ability to hack/wrangle low complexity data, selecting appropriate techniques, such as parsing, or an algorithm, to create a data structure relevant to the problem.

SENIOR
- Ability to conduct data acquisition from relational databases and flat files.
- Ability to hack/wrangle complex data, selecting appropriate techniques, such as parsing, and algorithms, to create a data structure relevant to the problem.
- Ability to apply advanced methods of data acquisition for multiple types of data sources including ETL batch processing, streaming ingestion, scrapers or APIs for open source data.

LEAD
- Uses experience and knowledge of appropriate techniques and their strengths, such as ETL batch processing, streaming ingestion, scrapers, APIs and crawlers, to acquire open source data.
- Applies a wide range of advanced data wrangling techniques, such as parsing, and algorithms for complex, multi-source data and large or complex projects.

DATA UNDERSTANDING – DATA AUDIT
Ability to conduct data quality assessment.

AWARENESS
- Awareness of the necessity of data audit.

PRACTITIONER
- Knowledge of basic data audit techniques and approaches.
- Ability to assist the team with data quality assessment using experience of relevant tools and programming languages and general understanding of the data, potential issues such as missing values and duplicate data, and the implications of data quality for the data science/analytics process.

SENIOR
- Experience with a range of data audit techniques and approaches.
- Utilises experience to design, review and monitor the optimal approach for data quality assessment for complex projects and to conduct data quality verification utilising a detailed understanding of the data, potential issues such as missing values, duplicates and inconsistencies, and the implications for the data science/analytics process.

LEAD
- Extensive and/or in-depth knowledge of data audit techniques and approaches and experience in applying them in complex settings.
- Ability to design, review and monitor the optimal approach for data quality assessment for complex or large projects utilising extensive knowledge of the data, potential issues such as missing values, duplicates and inconsistent formats, and the implications for the data science/analytics process.
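To make the data audit competency more concrete, the sketch below shows one possible basic check for two of the issues named above, missing values and duplicate data. It is an illustration only: the function name `audit_records`, the sample records and the rule that a blank string counts as missing are all assumptions, and the framework does not prescribe any particular tool or language.

```python
from collections import Counter


def audit_records(records, fields):
    """Count missing values per field and exact duplicate rows.

    `records` is a list of dicts; `fields` lists the keys to check.
    A value is treated as missing when it is None or a blank string
    (an assumption made for this sketch).
    """
    missing = Counter()
    for row in records:
        for field in fields:
            value = row.get(field)
            if value is None or str(value).strip() == "":
                missing[field] += 1

    # Two rows are duplicates when all checked fields match exactly.
    row_counts = Counter(tuple(row.get(f) for f in fields) for row in records)
    duplicate_rows = sum(count - 1 for count in row_counts.values() if count > 1)

    return {
        "rows": len(records),
        "missing": dict(missing),
        "duplicate_rows": duplicate_rows,
    }


# Hypothetical sample data for illustration.
sample = [
    {"id": 1, "name": "Ann", "joined": "2017-01-05"},
    {"id": 2, "name": "", "joined": "2017-02-11"},
    {"id": 1, "name": "Ann", "joined": "2017-01-05"},
    {"id": 3, "name": "Raj", "joined": None},
]
report = audit_records(sample, ["id", "name", "joined"])
print(report)
```

A fuller audit at the Senior or Lead level would typically also look at inconsistencies and inconsistent formats, which depend on knowledge of the specific data and are not covered by this sketch.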

DATA UNDERSTANDING – DATA CLEANING
Ability to identify and resolve established data issues.

AWARENESS
- Awareness of the necessity of data cleaning and potential data quality issues.

PRACTITIONER
- Knowledge of some basic data cleaning techniques and approaches such as data wrangling, batch processing, data mining, data enhancement, data harmonisation and data standardisation.
- Ability to assist the team with data cleaning of noisy and incomplete data using relevant tools and programming languages.
- Develops understanding of why data requires cleaning, including the organisational context, and the implications of this for data science/analysis processes.

SENIOR
- Experience in utilising a number of data cleaning techniques and approaches for structured and unstructured data such as data wrangling, batch processing, data mining, data enhancement, data harmonisation and data standardisation.
- Ability to conduct data cleaning of noisy, incomplete data or data with established data quality issues using experience of relevant tools and programming languages.
- Utilises knowledge of how the interaction of multiple data issues, such as missing data, outliers, multiple values and meaning of data, impacts analysis and identifies an appropriate cleaning approach.

LEAD
- Extensive and/or in-depth knowledge of best-practice data cleaning techniques and approaches for a variety of data types such as data wrangling, batch processing, data mining, data enhancement, data harmonisation and data standardisation.
- Extensive experience in utilising these for cleaning complex, large, incomplete data or data with established quality issues.
- Ability to design and implement data cleaning approach for complex data and projects.

DATA UNDERSTANDING – EXPLORATORY DATA ANALYSIS
Ability to establish data specifics and …

AWARENESS
- Awareness of the necessity of exploratory data analysis.

PRACTITIONER
- Knowledge of basic descriptive analysis techniques including data visualisations to establish variable distributions and inter-variable relationships.
- Ability to assist in exploratory data analysis tasks.

SENIOR
- Knowledge of a variety of exploratory data analysis methods and techniques, such as box plots, histograms, scatter plots and Pareto charts, suitable for various data types.
- Ability to conduct exploratory data analysis activities including data visualisations to establish variable distributions and inter-variable relationships for moderate-scale projects.
- Ability to conduct missing data analysis and imputation utilising relevant techniques and establish whether data enrichment is necessary for achieving the project objectives.

LEAD
- Extensive and/or in-depth knowledge of complex data exploration techniques and methods and experience in applying them creatively and effectively.
- Ability to design, lead and coordinate exploratory analysis activities including descriptive analysis techniques, identification of relevant trends and relationships within the data, data visualisation and conducting missing data analysis and imputation for complex (e.g. noisy or sparse) data.
- Establishes whether data enrichment is necessary for achieving large-scale project objectives as well as the nature and sources of the required data.

PREPARE AND PREPROCESS DATA FOR MODELLING – ANALYTICS SOLUTION DESIGN
Ability to design data science/analytics solutions based on the business and data …

AWARENESS
- … level awareness of a wide range of core data science/analytics techniques, their advantages, disadvantages and areas of application.

PRACTITIONER
- Uses familiarity with an increasing number of data science/analytics techniques, their advantages, disadvantages, assumptions and application to identify which techniques may work best.
- Knowledge of statistical and machine learning techniques such as classification, linear regression modelling, clustering and decision trees.
- Understands the data and performance requirements for the problem to select the best technique.
- Develops ability to identify the cause of errors, such as the impact of outliers, and works logically to identify potential solutions.
- Develops knowledge of current data science/analytics trends.

SENIOR
- Ability to design analytical modelling approach for moderate-scale projects or for components of large-scale projects utilising sound knowledge of data science techniques.
- Chooses the technique optimal for the task (for example, decision trees, advanced regression techniques such as LASSO methods, random forests etc.) based on knowledge of the data and the technique’s constraints, assumptions, interpretability, robustness and application.
- Understanding of business requirements and constraints including potential trade-offs between speed and accuracy.
- Maintains knowledge of data science trends.

LEAD
- Design and supervise the design of Data Science/Analytics solution for large or complex projects.
- Chooses the technique optimal for the task (for example, decision trees, advanced regression techniques such as LASSO methods, random forests etc.) based on detailed knowledge of the data and the technique’s constraints, assumptions, interpretability, robustness and areas of application from a wide range of best-practice Data Science/Analytics techniques.
- Utilises understanding of organisational context to negotiate performance requirements and implementation of best-practice solutions.
- Expands knowledge of data science/analytics trends.

PREPARE AND PREPROCESS DATA FOR MODELLING – DATA PRE-PROCESSING
Preprocess and transform the data to ensure that it is in the optimal format, layout or shape for the project purposes.

AWARENESS
- Awareness of the necessity of data preprocessing stage, its objectives and time and resource requirements.

PRACTITIONER
- Ability to create required data set utilising understanding of routine problems, data formats, applicability of the data to the problem and standard modelling techniques.
- Ability to fuse data sources using knowledge of data pre-processing techniques such as transformation, integration, normalisation and feature extraction, to identify and apply appropriate methods.

SENIOR
- Ability to create required data set utilising understanding of the organisational problem, applicability of the data to the problem, data format and a range of modelling techniques.
- Ability to fuse data sources using knowledge of a variety of data pre-processing techniques such as transformation, integration, normalisation and feature extraction, to identify and apply appropriate methods.

LEAD
- In-depth knowledge of and experience in using a wide variety of more complex data manipulation and transformation techniques such as transformation, integration, normalisation and feature extraction to fuse and reshape complex, multi-source data.
- Utilises in-depth knowledge and experience of organisational problems, data formats and data modelling to lead the team in application of data manipulation techniques for large or complex projects.
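As an illustration of one of the pre-processing techniques listed above, the sketch below shows two common forms of normalisation, assuming the term is used here in the sense of rescaling numeric features rather than database normalisation. The function names and sample values are hypothetical, and the choice between the two methods would depend on the data and the modelling technique.

```python
from statistics import mean, pstdev


def min_max_normalise(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardise(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical feature column for illustration.
ages = [20, 30, 40]
scaled = min_max_normalise(ages)
z_scores = standardise(ages)
print(scaled)
print(z_scores)
```

Min-max scaling keeps the original shape of the distribution within a fixed range, while standardisation is often preferred when a technique is sensitive to differences in variance between features.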

CREATE, TEST AND VALIDATE MODELS – CREATE MODELS
Ability to develop a data science/analytical model.

AWARENESS
- Awareness of the necessity of this stage, its objectives and time and resource requirements.

PRACTITIONER
- Knowledge of statistical methods and best-practice advanced modelling techniques (predictive modelling, advanced clustering, association rules etc).
- Experience in using modelling techniques to model structured, uncomplicated data.
- Ability to source additional information, ideas and solutions through a variety of sources such as research and relevant libraries.

SENIOR
- Knowledge of a wide range of statistical methods and best-practice advanced modelling techniques (for example, predictive modelling, advanced clustering, text mining, social network analysis, association rules etc).
- Experience in experimenting with, selecting and developing the modelling techniques most suitable for the organisational objective, organisational context and increasingly complex, unstructured and multiple data sets (for example including different types of data such as streaming data, raw text data).

LEAD
- Extensive knowledge of a range of best-practice advanced modelling techniques (for example, text mining, social network analysis, predictive modelling, advanced clustering, association rules etc).
- Significant experience in selecting and combining modelling techniques that are most likely to deliver maximal accurac
