The Principles Of Good Data Management - GOV.UK


Intra-governmental Group on Geographic Information
The Principles of Good Data Management
2nd Edition
July 2005
Office of the Deputy Prime Minister: London

The Office of the Deputy Prime Minister
Eland House
Bressenden Place
London SW1E 5DU
Telephone 020 7944 4400
Web site www.odpm.gov.uk

Queen’s Printer and Controller of Her Majesty’s Stationery Office 2005

Copyright in the typographical arrangement and design rests with the Crown.

This publication (excluding logos) may be reproduced free of charge in any format or medium provided that it is reproduced accurately and not used in a misleading context. The material must be acknowledged as Crown copyright with the title and source of the publication specified.

This publication is also available at www.iggi.gov.uk

Any enquiries relating to this publication should be addressed by email to: iggi@odpm.gsi.gov.uk

Further copies of this publication are available from:
ODPM Publications
PO Box 236
Wetherby
West Yorkshire
LS23 7NB
Tel: 0870 1226 236
Fax: 0870 1226 237
Textphone: 0870 1207 405
E-mail: iggi@odpm.gsi.gov.uk

Printed in Great Britain on material comprising 80% post-consumer waste and 20% ECF pulp.

July 2005
Product Code: 05 PLUS 03146

Acknowledgements

This guide, version 2.0, dated June 2005, was prepared by the IGGI Working Group on Data Management and Standards. Thanks to Jeremy Giles, British Geological Survey and his cross-departmental team for their considerable efforts in putting together this guide. Thanks also to David Lowe, British Geological Survey, for editing the document.

FOREWORD

By Philip White, IGGI Chairman

The Good Data Management booklet was originally produced in 2002 as part of IGGI’s commitment to encourage best practice when dealing with government geographic information. Recognising that the environment is constantly evolving, the guide has been revised to take account of current best practice and recent developments, and to reflect key legislation. In recent years, there has been an increased emphasis on facilitating data sharing both within and between organisations. The principle of “collect once, use many times” is well established as a concept but can only be achieved with data management. This booklet complements The Principles of Metadata Management, also published by IGGI.

The booklet provides best practice guidance for those responsible for managing data of a geographic nature. Specifically, it considers the benefits, drivers, principles and mechanisms needed for good data management.

This guide helps ensure that processes are in place for initial input and maintenance of reliable metadata for retrieval through ODPM’s Maps on Tap and the gigateway Data Locator. If the metadata policies recommended in this leaflet are followed, users can expect consistency in the data they are using and exchanging with their colleagues throughout central government.

I would like to thank the members of the group chaired by Jeremy Giles from British Geological Survey for their hard work and the commitment involved in completing this guide.


CONTENTS

1 Purpose of this guide
2 What is Data Management?
3 Why do we need to manage our data?
    Key drivers for improved Data Management
4 Benefits of good Data Management
    Benefits to Data Suppliers
    Benefits to Data Brokers/Intermediaries
    Benefits to users and customers
5 Principles of Good Data Management
    Avoid re-collecting data
    Data lifecycle control
    Data policy
    Data ownership
    Metadata
    Data quality
6 Establishing a Data Policy
    Data acquisition
    Data care – Stewardship
    Data use and exchange
    Review

7 Implementation – key roles
    Data Management Champion
    Data Manager
    Data Stewards
8 Further guidance
9 Glossary of Terms
10 World Wide Web URLs

1 Purpose of this guide

This guide provides general guidance on the management of data. The guide has been produced for those responsible for geographic information, although the principles are equally relevant to other types of government information.

Government departments and agencies collect, generate, store and use large amounts of data that have been obtained at considerable cost. Many of these data are geographical in that they are referenced to geographical locations, such as points, lines or areas; e.g. post codes.

The importance of good Data Management has become increasingly recognized over recent years and a body of legislation reflects this change in attitude. Key elements of the relevant legislation include:

- The Freedom of Information Act 2000 [1]
- The Environmental Information Regulations [2]
- The Human Rights Act 1998 [3]
- The Data Protection Act 1998 [4]
- The Public Records Act 1958 [5]

The European Commission is also having greater impact, as INSPIRE and the Directive on Public Sector Information demonstrate. These and other drivers mean that Data Management, and the associated activity of records management, need to be given priority by public bodies.

2 What is Data Management?

Data Management is a group of activities relating to the planning, development, implementation and administration of systems for the acquisition, storage, security, retrieval, dissemination, archiving and disposal of data. Such systems are commonly digital, but the term equally applies to paper-based systems where the term records management is commonly used. The term embraces all forms of data, whether these datasets are simple paper forms, the contents of relational databases, multi-media datasets such as images, or scientific data such as seismic records of the UK land mass.

The management of geographic data is in many ways no different to the management of other types of data. However, it is important to recognise that there may be geography-specific issues that need careful thought as part of Data Management activities; for example, ensuring that any geographic identifiers used are appropriate and resilient. Bearing in mind that one of the strengths of geographic data is the ability to link seemingly disparate pieces of information, it is absolutely critical to ensure that the chosen geographic identifiers allow this.

Key Data Management activities include:

- Data Policy development;
- Data Ownership;
- Metadata Compilation;
- Data Lifecycle Control;
- Data Quality; and
- Data Access and Dissemination.

This guide covers only the key aspects of Data Management. Extensive additional advice is now available from The National Archives [6], The Department for Constitutional Affairs [7] and UKgovtalk [8].
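The point made in this section about geographic identifiers linking otherwise unrelated information can be illustrated with a short sketch. It assumes, purely for illustration, that two datasets each carry a UK-style postcode typed inconsistently; the function names, field names and sample records are hypothetical and do not come from this guide.

```python
import re


def normalise_postcode(raw: str) -> str:
    """Upper-case a UK-style postcode and standardise its spacing.

    'sw1e5du', 'SW1E 5DU' and ' SW1E  5DU ' all become 'SW1E 5DU'.
    Assumption: the inward code is always the last three characters.
    """
    compact = re.sub(r"\s+", "", raw).upper()
    return f"{compact[:-3]} {compact[-3:]}"


def link_by_postcode(left, right):
    """Join two lists of dict records on their normalised postcode."""
    index = {}
    for rec in right:
        index.setdefault(normalise_postcode(rec["postcode"]), []).append(rec)

    linked = []
    for rec in left:
        key = normalise_postcode(rec["postcode"])
        for match in index.get(key, []):
            # Merge both records; the shared key is stored in normalised form.
            linked.append({**match, **rec, "postcode": key})
    return linked


# Hypothetical records held by two different teams.
schools = [{"postcode": "sw1e5du", "school": "A"}]
clinics = [
    {"postcode": "SW1E 5DU", "clinic": "B"},
    {"postcode": "LS23 7NB", "clinic": "C"},
]
linked = link_by_postcode(schools, clinics)
```

Without the normalisation step the two records would not match, which is one practical sense in which an identifier needs to be "appropriate and resilient" before it can link data.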

3 Why do we need to manage our data?

Government owns huge amounts of irreplaceable Geographic Information, potentially of use to a wide range of bodies, and there are increasing pressures on departments/agencies to manage these data properly. Examples of these pressures are identified below.

Key drivers for improved Data Management

The Freedom of Information (FoI) Act 2000 came into force in January 2005. General guidance on FoI is provided by The Department for Constitutional Affairs [9] and The National Archives (TNA) [10]. The FoI Act defines clear duties and responsibilities for all who manage public sector information in all its forms. The Act confers two statutory rights on applicants:

- To be told whether or not the public authority holds that information; and if so,
- To have that information communicated to them.

The Act allows twenty working days for a response to be prepared. This limited time means that good Data Management practices must be in place to ensure that public bodies can meet the requirements of the Act.

The FoI Act gives rights of access to a wide range of information. However, rights of access to environmental information are provided by a separate statutory regime, the Environmental Information Regulations (EIR). The aim of the regulations, which also came into force in January 2005, is to ensure access for the public to environmental information to enable them to participate in decision making and obtain justice in environmental matters. This is seen as essential for creating transparency and building trust within communities and between individuals and public authorities. Guidance on EIRs is provided by TNA [11].

The Human Rights Act 1998 and The Data Protection Act 1998 both provide for the protection of personal information from inappropriate use and the right of access to data held about the individual. These two acts place specific duties on Data Management concerning security and access to personal information.

Other, non-legislative drivers include:

- Increasing recognition that Government data, collected at public expense, must be properly managed in order to realize their full potential and justify their considerable production and maintenance costs.
- Increasing pressure from customers for easier and quicker access to the right information at little or no charge.
- Interoperability between systems and services, for so long seen as desirable, is now becoming a reality. The outputs and credibility of such services depend heavily upon the quality of the data provided. As the number of interoperable services increases, so too does the requirement to have ready access to data of known (maintained) quality.
- Stronger emphasis within Government on the need to rationalize and combine data in order to improve efficiency and add value.
- More reluctance from suppliers to provide data at affordable prices. Stricter control is required by Data Owners over the use of their data to safeguard their Intellectual Property Rights (IPR) and the confidentiality of sensitive data.

4 Benefits of good Data Management

Data Management policies and procedures ensure that data on all media are treated as a valued resource. Implementing such policies and procedures will give many benefits:

Benefits to Data Suppliers

- An increased confidence and trust that their data will be used according to their agreed conditions of use, without risk to confidentiality, copyright or IPR, and in compliance with all statutory and non-statutory obligations.
- A clear understanding of the use of their data, formally documented in a Memorandum of Agreement signed by both supplier and user.
- A fair return for the use of the data they have supplied.

Benefits to Data Brokers/Intermediaries

- Better quality, harmonized and coherent data from the use of common definitions, including geographic references, formats, validation processes and standard procedures.
- Better care of the data holdings through the use of effective data policies and best practice guidance.
- Better control over the data by the clear definition and use of the procedures for the care of data.
- Improved knowledge and understanding of data holdings, their availability, interpretation and use, with subsequent reduction of the risk of duplication or loss, through better cataloguing, metadata and, in time, better access to data via an integrated data environment.
- Improved business processes, including better and more efficient use and re-use of data, and the standardization of datasets that are frequently used by different parts of an organization.
- Increased confidence that the organization complies with statutory and non-statutory obligations, by the regular use of centrally coordinated, frequently updated guidance, codes of practice and training on legal, contractual and other obligations.
- Better control over access to data, both for internal and bona fide external customers, resulting from better data organization and maintenance following defined policies on release, disclosure control and data security.
- More sensible and consistent data charges and conditions of use, resulting from clear pricing and dissemination policies that recognize the need for free access by appropriate customers whilst recovering the appropriate income from customers who seek to make commercial gain.
- An increasing confidence by the customer in the quality of the data managed and in the reliability of outputs that are produced.

Benefits to users and customers

- Improved awareness and understanding of what data are available for current and future use, resulting from better cataloguing and data archiving.
- Improved access to data, free from unnecessary obstacles, safeguarded from disclosure of personal information or infringement of legal and contractual obligations.
- Better quality and more timely information, i.e. access to the right information at the right time, resulting from quicker identification of customer needs and the avoidance of wrong or conflicting information, through the use of effective metadata.
- Better value for money, resulting from clear, fair and consistent data charges and conditions of use, which recognize the need for free access by the appropriate customers.
- Better exploitation of data generally, enabled by easier data exchange and integration with other harmonized data.
- Efficiency gains across government and its agencies resulting from the use of better quality data.

5 Principles of Good Data Management

Good Data Management is essential for the effective use of the information resources of public bodies in all their forms. Section 2, above, identified a range of key Data Management activities; these are discussed below.

The key principles of Data Management are illustrated in Figure 1 and described in the text.

Avoid re-collecting data

The largest potential for waste in Data Management is reacquiring an existing dataset. This has been done frequently by public and private sector organizations and must be avoided. In the USA, Executive Order 12906 [12] requires government agencies to put internal procedures in place to ensure that they check whether other agencies have already collected information they plan to acquire. Although no equivalent instruction exists in the UK, it should be regarded as best practice to use the gigateway [13] Data Locator to search for existing geospatial datasets before new ones are created.

Data lifecycle control

Good Data Management requires that the whole life cycle of datasets be managed carefully. This includes:

- Business justification, to ensure that thought has been given to why new data are required rather than existing data amended or used in new ways, how data can be specified for maximum use including the potential to meet other possible requirements, and why the costs of handling, storing and maintaining these data are acceptable and recoverable.
- Data specification and modelling, processing, database maintenance and security, to ensure that data will be fit for purpose and held securely in their own databases.
- Ongoing data audit, to monitor the use and continued effectiveness of the data.
- Archiving and final destruction, to ensure that data are archived and maintained effectively until they are no longer needed or are uneconomical to retain.

Figure 1: Key Principles of Data Management
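The lifecycle stages listed under "Data lifecycle control" can be sketched as a small state machine. This is an illustrative sketch only: the stage names paraphrase the list, and the permitted transitions (for example, that an audit may send a dataset back to maintenance or forward to archiving) are assumptions rather than anything the guide prescribes.

```python
from enum import Enum


class Stage(Enum):
    JUSTIFICATION = "business justification"
    SPECIFICATION = "data specification and modelling"
    MAINTENANCE = "processing, maintenance and security"
    AUDIT = "ongoing data audit"
    ARCHIVED = "archived"
    DESTROYED = "destroyed"


# Assumed transitions: a dataset cannot skip justification or
# specification, and is destroyed only after it has been archived.
ALLOWED = {
    Stage.JUSTIFICATION: {Stage.SPECIFICATION},
    Stage.SPECIFICATION: {Stage.MAINTENANCE},
    Stage.MAINTENANCE: {Stage.AUDIT},
    Stage.AUDIT: {Stage.MAINTENANCE, Stage.ARCHIVED},
    Stage.ARCHIVED: {Stage.DESTROYED},
    Stage.DESTROYED: set(),
}


def advance(current: Stage, target: Stage) -> Stage:
    """Return the new stage, or raise ValueError if the step is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(
            f"cannot move from {current.value!r} to {target.value!r}"
        )
    return target
```

Encoding the transitions in one table makes it easy for an organization to adjust them to its own Data Policy without touching the checking logic.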

Data policy

The fundamental step for any organization wishing to implement good Data Management procedures is to define a Data Policy. The document may have different names in different public bodies but in each it should be a set of broad, high-level principles that form the guiding framework within which Data Management can operate. This is the document that is approved at senior levels in the public body, and the senior executive who owns the policy (Data Management Champion) manages the resources for its implementation. Section 6 includes a model Data Policy Statement.

Data ownership

One key aspect of good Data Management is the clear identification of the owner of the data. Normally this is the organization or group of organizations that originally commissioned the data acquisition or compilation and retains managerial and financial control of the data. The Data Owner has legal rights over the dataset, the IPR and the Copyright.

Data ownership implies the right to exploit the data, and if continued maintenance becomes unnecessary or uneconomical, the right to destroy them, subject to the provisions of the Public Records and Freedom of Information acts. Ownership can relate to a data item, a dataset or a value-added dataset. IPR can be owned at different levels. For example, a merged or value-added dataset can be owned by one organization, even though other organizations own the constituent data. If the legal ownership is unclear, there are risks that the data can be wrongly exploited, used without payment of royalty to the owner, neglected or lost.

It is therefore important for Data Owners to take action to establish and document:

- The ownership, IPR and Copyright of their data so that these can be safeguarded.
- The statutory and non-statutory obligations relevant to their business to ensure that the data are compliant.
- The departmental policies for data security, disclosure control, release, pricing and dissemination.
- The agreement reached with users and customers on the conditions of use in a signed Memorandum of Agreement, before data are released.

Metadata

All datasets must have appropriate metadata compiled for them. At the simplest level metadata are “data about data”. Metadata provide a summary of the characteristics of a dataset. A good metadata record enables the user of a dataset or other information resource to understand the content of what they are reviewing, its potential value and its limitations.

There are many metadata standards, but the ones that are most appropriate to GI are:

- ISO 19115:2003 [14] (Geographic Information – Metadata); and
- UK GEMINI (Geo-spatial Metadata Interoperability Initiative). The profile is the result of a collaboration between the AGI [15] and the e-Government Unit [16]. A profile is a subset of one or several information standards that adopts elements, structures or rules for different user communities. The UK GEMINI profile will replace the gigateway Discovery Metadata Specifications (the NGDF Standard) as the UK’s national geospatial metadata profile. Adherence to it allows the creation of discovery metadata that comply with both ISO 19115 (Geographic Information – Metadata) and the national e-Government Metadata Standard (eGMS).

Comprehensive advice on the compilation of metadata can be found in the IGGI booklet entitled “The Principles of Good Metadata Management” [17], the second edition of which was published in May 2004.

Data quality

Good Data Management also ensures that datasets are capable of meeting current needs successfully and are suitable for further exploitation. The ability to integrate data with other datasets is likely to add value, encourage ongoing use of the data and recover the costs of collecting the data. The creation, maintenance and development of quality data require a clear and well-specified management regime.

Data Steward

All datasets need to be managed by a named individual referred to here as the Data Steward; also known as dataset manager and data custodian. A Data Steward should be given formal responsibility for the stewardship of each major dataset. They should be accountable for the management and care of the data holdings assigned to them, in line with the defined data policy. Section 7 provides a list of the responsibilities of the Data Steward.

Data Management Plan

The Data Steward is responsible for the development of a Data Management Plan for each dataset under their responsibility. The objective of the Data Management Plan is to ensure:

- That the dataset is fit for the purpose for which it is required.
- That the long-term management of the dataset is considered for potential re-use.

The individual management plans should be compliant with the local data policy and include:

- Scope of the plan
- Link to metadata
- Responsibilities
- IPR and Copyright
- Quality objectives
- Standards (International, National and local) adopted during compilation of the data
- Staff resources required to manage the dataset
- Physical resources required to manage the dataset
- Long term management of the dataset
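The list of plan contents above maps naturally onto a simple record with one field per item. The sketch below is one possible way to hold a plan and report which sections are still empty; the class name, field names and free-text representation are hypothetical choices made for illustration, not a format defined by this guide.

```python
from dataclasses import dataclass, fields


@dataclass
class DataManagementPlan:
    """One free-text field per item in the plan contents list."""
    scope: str = ""
    metadata_link: str = ""
    responsibilities: str = ""
    ipr_and_copyright: str = ""
    quality_objectives: str = ""
    standards_adopted: str = ""
    staff_resources: str = ""
    physical_resources: str = ""
    long_term_management: str = ""


def missing_sections(plan: DataManagementPlan) -> list:
    """Return the names of sections that are empty or whitespace only."""
    return [f.name for f in fields(plan) if not getattr(plan, f.name).strip()]


# A hypothetical draft plan with only two sections written so far.
draft = DataManagementPlan(
    scope="Borehole index, England and Wales",
    responsibilities="Data Steward: named individual",
)
gaps = missing_sections(draft)
```

A Data Steward could run such a check before a plan is submitted for audit, so that incomplete plans are caught early.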

Data Management procedures

Individual datasets may require compilation of specific Data Management procedures. These may be needed where specific datasets require detailed operational procedures to ensure their quality; examples of this include scientific and statistical datasets.

Data access and dissemination

Although this aspect will depend upon the business and the financial policy of the organization, the following guidance should be followed.

- Public access to data should be provided in line with The Freedom of Information Act, The Data Protection Act and The Human Rights Act.
- IPR and Copyright of datasets owned by public bodies must be protected, as data should be regarded as an asset.
- IPR and Copyright of third-party data must be respected.
- The potential for commercial re-use and exploitation of the dataset should be considered.
- The right to use or provide access to data can be passed to a third party, subject to agreed pricing and dissemination policies.
- Consideration should be given to the impact of European developments such as the Public Sector Information Directive and INSPIRE.

Data audit

Data Management audits are recommended to ensure that the management environment for given datasets is being maintained. Their purpose is to provide assurance to the Data Management Champion that the resources expended are being used appropriately. Audits of major datasets should be commissioned to ascertain the level of compliance with data policies and the Data Management plans and procedures that have been prepared.

6 Establishing a Data Policy

IGGI has prepared the following model Data Policy Statement, which Government departments/agencies may wish to use or adapt to meet their own Data Management needs.

Data acquisition

- All projects and other activities that give rise to substantial datasets will establish at the outset whether suitable data already exist in a potentially usable form, or whether new data need to be acquired.
- Before projects are approved, they must establish how the data acquired will be exploited to the full, who will be responsible for full exploitation of the data, and how the benefits will be maximized and shared.
- Subsequent data handling and storage needs will be considered, and plans put in place to ensure that databases are maintained in such a way that maximum use can subsequently be made of them.

Data care – Stewardship

- Databases will be managed closely, with clear responsibility for stewardship established and individuals made accountable for ensuring Data Management procedures are followed.
- Data will be held securely within their own database, and adequate provision made for their long-term care.
- All data will be validated and quality assured before being used or archived.
- Easy access will be given to data holdings, both for staff and bona fide ‘customers’.
- Data that are not legally required to be retained will not be destroyed or put at risk without first exploring all other possibilities and then demonstrating clearly that the costs of retaining them cannot be justified by potential benefits, or that the replacement cost is less than the storage costs.

Data use and exchange

- Memoranda of Agreement will be drawn up with Users and Customers who receive data, with respect to the subsequent use of such data. These will include confidentiality declarations and conditions of use.
- Intellectual Property Rights will be protected in relation to any development of information, by specifying formally any restrictions on the use of the data in formal licensing arrangements.
- Adequate provision will be made for the widest possible public access to data and associated metadata.
- Costs will be recovered for the handling of data and information, in line with departmental policies, which will be made readily available.
- The appropriate return will be charged when data are passed on to other parties seeking to make commercial gain.
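The retention rule in the "Data care – Stewardship" list (data not legally required to be retained may be destroyed only after alternatives have been explored and the costs shown to outweigh the benefits, or replacement shown to be cheaper than storage) can be read as a simple decision test. The sketch below is one hedged reading of that rule. The function and parameter names are hypothetical, the cost of retaining and the storage cost are treated as a single figure, and data under a legal retention requirement are assumed never to qualify.

```python
def may_consider_disposal(
    legally_required: bool,
    alternatives_explored: bool,
    retention_cost: float,
    potential_benefit: float,
    replacement_cost: float,
) -> bool:
    """Return True only if the model policy's conditions for disposal hold.

    All monetary figures are assumed to cover the same period and to be
    expressed in the same currency.
    """
    if legally_required:
        # Assumption: legally required data are outside the disposal rule.
        return False
    if not alternatives_explored:
        # The policy requires all other possibilities to be explored first.
        return False
    cost_not_justified = retention_cost > potential_benefit
    cheaper_to_replace = replacement_cost < retention_cost
    return cost_not_justified or cheaper_to_replace
```

A result of True means only that disposal may be considered; the decision itself remains a matter for the Data Owner under the organization's Data Policy.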

Review

- The Data Policy will be monitored regularly and will be modified in the light of developments (e.g. technology and legislation) and experience. Information handling practices will be audited so that duplication can be minimized.

7 Implementation – key roles

To be successful, Data Management best practice must be implemented across the whole organization, under the guidance of a member of the Executive Board, i.e. the Data Management Champion. Other key roles are the Data Manager and the Data Stewards assigned to each key dataset.

The following list of responsibilities may help organizations to establish these key roles and implement good Data Management policies and procedures.

Data Management Champion

The Champion is responsible for:

- Ensuring that policies on Data Management are in line with legislation and Government Policies.
- Reporting progress to the Executive Board on the performance achieved against the targets set for the improvement of data quality and the value gained from effective Data Management.

In larger departments, particularly those spread over a number of sites, a Data Management Steering Group may also be required.

Data Manager

The Data Manager may require the help of Local Data Managers to discharge the following responsibilities:

- Developing and maintaining the Data Policy Statement and other corporate guidance.
- Directing the development, implementation and maintenance of the detailed data policies, standards, procedures and guidelines across the whole organization.
- Appointing and monitoring the performance of Data Stewards.
- Issuing guidance and training staff.
- Ensuring that local practice in individual business areas meets the standard set for the whole organization.
- Ensuring that the organization maintains a central metadata resource.

Data Stewards

Data Stewards are responsible for ensuring that the following minimum standards are applied for each dataset:

- The dataset must be documented in the organization’s catalogue following the standards for discovery metadata, to enable the ownership, Intellectual Property Rights, stewardship and accessibility to be determined.
- The policy for exploiting the dataset and making it available to third parties must be agreed and documented.
- The dataset and its conditions of use must comply with all the statutory and non-statutory obligations of the organization.
- The data must follow standard classifications and definitions where appropriate, and must comply with all relevant standards, codes of practice and other protocols.
- The data must be fully validated and quality assured with sufficient detailed metadata to enable their use by third parties without reference to the originator of the data.
- The data must be stored, managed and accessed in line with agreed Data Management and Security/Confidentiality policies.
- The release/use of data by internal and external customers must be authorized and agreement to the conditions of use documented.
- The costs and benefits of continuing to maintain the dataset must be reviewed periodically.

8 Further guidance

This guide is intended to give an introduction to the principles of good Data Management and has been prepared with the help of a number of organizations who have already benefited from adopting such Data Management principles.

IGGI will continue to support the use of good Data Management principles and will, where possible, provide detailed guidance on the website. This detailed guidance will take the form of proven guidelines, made available by IGGI members. Comments on this guide and on the detailed guidance are welcomed, and these can be made to the IGGI Secretariat by e-mailing IGGI@odpm.gsi.gov.uk.

9 Glossary of Terms

AGI: Association for Geographic Information.

Data Audit: A process to demonstrate assurance that Data Management is being undertaken to the required level.

Data Management Champion: The member of the Executive who is corporately responsible for Data Management.

Data Management Plan: A plan for the management of an individual dataset, compliant with the local Data Policy.

Data Management Procedure: A set of detailed operational parameters for the day-to-day management of a specific dataset.

Data Manager: The senior manager, reporting to the Data Management Champion, responsible for Data Management in an organization.

Data Owners: The individuals or groups of individuals who are held accountable, managerially and financially, for a dataset and who have legal ownership rights to a dataset even though that dataset may have been collected/collated/disseminated by another party.

Data Policy: A set of broad, high-level principles that form the guiding framework within which Data Management can operate.

Data Quality: A set of parameters, established according to specific local requirements and processes, to ensure that datasets may be seen to be fit for purpose.

Data Steward: An individual accountable for the management of a specific dataset or group of datasets.

Data: A collection of facts, concepts or instructions in a formalized manner suitable for communication or processing by human beings or by computer.

Dataset: A dataset is a collection of data that has been compiled to serve a specific business purpose. It may have been collected for a specific purpose (e.g. Census data) or it may be data previously collected for another purpose that is being reused for a purpose not envisaged at the time of collection (e.g. family history
