IMMUNIZATION INFORMATION SYSTEMS
PATIENT-LEVEL DE-DUPLICATION BEST PRACTICES

National Center for Immunization and Respiratory Disease (NCIRD)
Immunization Information Systems Support Branch (IISSB)

June 25, 2013

IIS Patient-Level De-duplication Best Practices

1 Executive Summary .... 4
  1.1 Findings and Best Practices .... 4
2 Background and Project Overview .... 9
  2.1 Project Approach .... 10
  2.2 CDC IIS Panel Scope of Work .... 13
3 Foundational Concepts for IIS Patient De-duplication .... 14
  3.1 Objective .... 14
  3.2 Patient De-duplication Processing Scenarios .... 15
  3.3 Inside the Black Box .... 16
  3.4 Five Typical Process Steps .... 17
  3.5 Data Preparation .... 18
  3.6 Look-up by Identifiers .... 20
  3.7 Find Candidate Records/Blocking and Scoring .... 22
  3.8 Take Action on Matching and De-duplication Outcomes .... 23
  3.9 Report Actions Taken .... 23
  3.10 Classification of Patient De-duplication Approaches .... 26
4 Advanced Practice Considerations .... 30
  4.1 Front-End Patient De-duplication .... 31
  4.2 Retrospective Review .... 33
  4.3 Data Preparation .... 34
  4.4 Blocking .... 38
  4.5 Expert Rule Development .... 39
  4.6 Field Matching Algorithms .... 40
  4.7 Establishing Threshold Tolerances .... 45
  4.8 Metrics .... 46
  4.9 Master Patient Index (MPI) .... 48
5 Best Practice Guidance .... 50
  5.1 General Observations .... 50
  5.2 Best Practice Guidance on IIS Operations .... 52
  5.3 Future Potential Considerations .... 55
Appendix A - Panel Membership .... 57
  The CDC Expert Panelists .... 57
  CDC Expert Reviewers .... 60
  Northrop Grumman Public Health Contractor Personnel .... 63
Appendix B - Patient De-duplication Literature Review .... 65
  Background .... 65
  Methodology .... 65
  Theoretical Roots .... 66
  Best Practice Development .... 68
  Selected Academic Literature .... 69
  Industry and Government Reports .... 70
  Master Patient Indexes .... 71
References .... 73
Appendix D - Vocabulary .... 75

1 Executive Summary

Patient-level de-duplication (also called patient matching, patient de-duplication, or patient identity management) is the process of finding and removing redundant patient records from a database. Patient matching and patient de-duplication are essential data processing capabilities for Immunization Information Systems (IIS). These capabilities ensure that updates and queries apply only to the correct patient record and prevent fragmented and duplicate information from being added to an individual’s health records. The inability to consistently determine which records represent the same patient and errors in combining the data contained in a patient’s record negatively affect the overall data quality, usefulness, and credibility of public health immunization record keeping.

This document, a result of a CDC-sponsored project, is designed to be read by programmatic, technical, and operational experts who are involved in creating or maintaining an IIS. The document intends to bridge the gap between technical and program staff so they can have a mutual understanding of the issue of patient-level de-duplication and target actions to address these recommendations.

Best practice guidelines on patient-level de-duplication, documented within this report, will positively affect immunization registries by encouraging common de-duplication practices. This will thereby improve overall data quality and usefulness of registry information. The best practices guidelines are also technology-neutral and foster collaboration and communication amongst IIS professionals.

1.1 Findings and Best Practices

Detailed best practice guidance for day-to-day IIS patient de-duplication operations is outlined in both the foundational and advanced practice sections of this document.
While specific best practice guidance has been summarized in the various practice-related sections of this report, it is believed that certain best practice information can broadly benefit the IIS national community and also help establish the discussion surrounding a long-term agenda.

Patient de-duplication is a multi-step process. Accordingly, best practices need to be understood within the context of a generalized process model. This model is presented in the body of the report (see Figure 3.2). Currently, there is not a single idealized best practice process for patient-level de-duplication. There are wide variations in needs, capabilities, resources, and business practices; however, the expert panel believes that there are a number of techniques that can support patient-level de-duplication efficiencies in virtually all circumstances. While idealized process discussions were considered non-productive, certain techniques were identified that can enable productivity.

The following table summarizes the project’s key findings and best practices by domain. Additional detailed information on the best practices can be found in section 5 of this document.

Domain: ... Specifications and Measurement Metrics

Findings:
• The national practice community needs to anticipate how to best leverage their de-duplication processes into the evolving state and national health information technology architectures and in conjunction with Meaningful Use
• Single, discontinuous efforts are not adequate to provide the functionality required to sustain continued improvements
• Consequences of inappropriately merging the records of two patients are more severe than duplicating a patient’s instance in the database
• Emerging role of Master Patient Indexes (MPIs) is uncertain relative to IIS patient de-duplication
• There is currently no standard road map for IIS integration into Health Information Exchanges (HIEs) or other arrangements integrating cross-jurisdictional health information
• Purely deterministic implementations eventually hit a ceiling of diminishing marginal returns

Best Practices:
• Formally document all facets of patient de-duplication processes, including the business rules for each step of the matching and de-duplication process
• Apply a business process approach to planning, implementing, and documenting patient de-duplication practices
• Understand the functional differences amongst de-duplication approaches for real-time, incoming, and retrospective processing and the strengths and weaknesses of each
• Err on the side of preventing false patient record data merges, even if this means failing to match two records for the same patient (also called false negatives)
• Participate in the on-going patient de-duplication process improvement dialogue whether a technical or non-technical subject matter expert (SME)
• Formalize a body of knowledge which can help further solidify implementations and drive efficiencies acceptable to the national IIS community
• Implement better mechanisms for sharing and collaboration amongst IIS around de-duplication best practices
• Have a greater understanding of de-duplication “black box” operations and the deterministic and probabilistic techniques being used
• Participate in de-duplication engine set-up and ongoing reviews
• Utilize active discussion and on-going review with SMEs and technical support to identify threshold scores in conjunction with the needs of local stakeholders, local constraints, and available data
• When evaluating de-duplication engines, look for the following functionality:
  o Recognize when records have previously been adjudicated, i.e., de-duplication software should prevent multiple redundant record reviews
  o Perform comprehensive edit checks on manual data entry to standardize data contained in the database
  o Evaluate incoming data for completeness, timeliness, and accuracy through online prompting and edit checks, pop-up windows, and other automated techniques
  o Provide on-line help as well as suggestions regarding formatting during manual data entry process
  o Illustrate potential duplicate records during manual data entry
  o Merge and unmerge patient records in more standardized ways
  o Utilize well-developed blocking techniques with a high number of unique values to reduce the overall number of candidate pairs to evaluate
  o Utilize machine learning to allow for further sophistication in correctly identifying and maintaining patient records
  o Implement more probabilistic methods for increased volumes and more complex problems
  o Adjust and configure algorithmic techniques and thresholds as data sets change and evolve
  o Use a combination of deterministic and probabilistic algorithmic methods
  o Apply advanced algorithms to process last name data (see page 36)
  o Utilize the five specific measures of sensitivity, specificity, accuracy, precision, and false positive rate (see page 46) to benefit practice efficiency
  o Utilize additional useful measures for understanding IIS operations and improving data quality (see page 47 and 48)

Domain: Incoming Data and Manual Data Entry

Findings:
• Despite increasing automation, manual data entry remains an important method of data origination
• Manual data entry and multiple data sources can introduce variations in data, typographical errors, and data omissions, affecting overall data quality
• Multi-tier approaches to public and private immunization provider data problems are needed
• The order in which records are examined may influence the outcome of record comparisons in certain situations
• A road map is needed to further the jurisdictional mapping of IIS data to National Vaccine Advisory Committee (NVAC) core data elements and functional standards

Best Practices:
• Provide the broad community of immunization data providers, including HMOs, pediatric associations, schools, pharmacies, insurance companies, and other institutions, with formal feedback regarding the data quality needs of IIS
• Utilize fact sheets, FAQs, dedicated expert calls, user group exchange webinars, and web-based training to improve the quality of data from immunization data providers
• Follow up on trends in data originating from provider interfaces and encourage providers to review and act upon response files and error messaging
• Encourage providers to utilize standardized HL7 messages
• Encourage providers to run Vaccines for Children (VFC) and assessment reports
• Do not accept data originating from a source that is not approved
• Provide well-documented options to providers for submission of their data
• Train data entry users on the best search methods supported by the IIS and provide detailed documentation and training
• Perform better screening and cleansing of incoming data, particularly regarding placeholder and missing data, to ensure that incoming data meet minimal processing requirements
• Perform systematic testing of format and content of incoming data within the on-boarding process for a new data source
• Implement the option of utilizing data fields such as birth order, race, and ethnicity for de-duplication processing
• Use specific unique identifiers such as social security number, medical record number, chart number, or birth certificate number to quickly find a match in the IIS and prevent calling the de-duplication engine
• Make business decisions not to utilize certain types of records that present themselves (e.g., baby boy or girl)
• Identify life status changes during the manual data entry process to aid in the identification of situations creating duplicate records or fragmented histories
• Apply standardization rules for each component of a client’s name (see page 35)
• Standardize address and phone number information through a detailed examination of available address components (see page 36 and 37)

Domain: Retrospective Processes

Findings:
• Retrospective examination of IIS data to find duplicate patient records and other forms of data quality problems should be considered a universal best practice
• The processes used in retrospective patient de-duplication may be different than front-end de-duplication processes. IIS SMEs need to understand and actively manage these differences to improve data quality.

Best Practices:
• Perform periodic retrospective examination and de-duplication of IIS patient records
• Actively monitor the results of retrospective processing as an important source for improvement

Domain: Manual Review Processes

Findings:
• Manual data review processes are expensive and time consuming

Best Practices:
• Utilize automated approaches to review audit trail artifacts to provide useful metadata
• Utilize audit trails and manual review files to identify improvement opportunities
• Utilize objective data quality measures

• Perform systematic reviews of pending logs to identify recurrent problems and logic gaps
• Research and access additional information to increase the likelihood of making accurate determinations and document these situations to provide insights into operational weaknesses
• Agree upon SME and technical activities that can reduce the burden of manual review processes
• Note and pass along new variations to technical personnel to incorporate into a new standardization process (e.g. MLK for Martin Luther King)
• Identify and apply culture-specific conventions (e.g., family members sharing the same date of birth)

Domain: Testing

The findings and recommendations for the development of new IIS patient de-duplication test cases can be found in a separate testing document (Volume 2).

Table 1.1: Summary of Project’s Key Findings

2 Background and Project Overview

Immunization Information Systems (IIS) are confidential, population-based, computerized information systems that collect vaccination data within a defined geographic area.
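
Table 1.1 recommends five measures for evaluating a de-duplication engine: sensitivity, specificity, accuracy, precision, and false positive rate. As a minimal sketch, assuming the engine's match decisions have been compared against a manually reviewed answer key, these could be computed from the four outcome counts as shown here. The names `MatchCounts` and `dedup_metrics` and the example counts are illustrative assumptions, not taken from the report; the formulas are the standard confusion-matrix definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchCounts:
    """Outcome counts from comparing engine decisions with a reviewed answer key."""
    true_positives: int   # record pairs correctly declared the same patient
    false_positives: int  # different patients wrongly declared a match (false merge)
    true_negatives: int   # different patients correctly kept apart
    false_negatives: int  # same patient not matched (duplicate left in place)


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def dedup_metrics(c: MatchCounts) -> dict:
    """Compute the five measures named in Table 1.1 from outcome counts."""
    total = (c.true_positives + c.false_positives
             + c.true_negatives + c.false_negatives)
    return {
        "sensitivity": _ratio(c.true_positives,
                              c.true_positives + c.false_negatives),
        "specificity": _ratio(c.true_negatives,
                              c.true_negatives + c.false_positives),
        "accuracy": _ratio(c.true_positives + c.true_negatives, total),
        "precision": _ratio(c.true_positives,
                            c.true_positives + c.false_positives),
        "false_positive_rate": _ratio(c.false_positives,
                                      c.false_positives + c.true_negatives),
    }


# Hypothetical counts for illustration only; they do not come from the report.
counts = MatchCounts(true_positives=90, false_positives=5,
                     true_negatives=895, false_negatives=10)
metrics = dedup_metrics(counts)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because the table treats a false merge as more severe than a missed duplicate, precision and false positive rate are probably the figures to watch most closely when thresholds are adjusted.
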
IIS are an important tool to increase and sustain high vaccination coverage by consolidating vaccination records from multiple providers into a single immunization record.

The ability for physicians, hospitals, and other healthcare providers to send immunization records to IIS electronically is a key element of what has been termed “Meaningful Use.” Meaningful Use is the ability to exchange complete and accurate electronic patient information, based upon the set of standards defined by the Centers for Medicare & Medicaid Services (CMS), in a way that can improve healthcare efficiency and patient outcomes.

Meaningful Use is defined by using certified Electronic Health Record (EHR) technology in a meaningful manner (for example electronic prescribing); ensuring that the certified EHR technology is connected in a way that provides for the electronic exchange of health information to improve the quality of care; and submitting information on quality of care and other measures to the Secretary of Health & Human Services (HHS). The sending of provider immunization data to public

health jurisdictional IIS has been incentivized by federal legislation, namely the American Recovery and Reinvestment Act (ARRA) and Health Information Technology for Economic and Clinical Health Act (HITECH). Accordingly, the volume of patient records being sent electronically to jurisdictional IIS has increased dramatically and will continue to increase. Therefore, IIS are under pressure to improve their overall data quality programs.

The last formal examination of IIS patient-level de-duplication methods, along with the development of patient de-duplication tools, was performed by the Centers for Disease Control and Prevention (CDC) in 2002. Much has changed since that time, and, given the importance of IIS in national Meaningful Use objectives, there is much to be gained from a fresh examination of patient de-duplication best practices.

The new CDC-sponsored patient-level de-duplication project specified best practices and developed test cases to test for both sensitivity and specificity and other accuracy measures. Additionally, the project examined and proposed practice-based solutions for the use of IIS data in a Master Patient Index (MPI) or similar environment to allow de-duplication engines to yield better and more accurate results through the use of a clean and complete data set.

In summary, the project sought to accomplish the following:
• Streamline, standardize, and improve overall IIS patient de-duplication processes
• Increase IIS expertise in patient de-duplication best practices
• Improve patient de-duplication and data quality best practices, which can lead to improved single patient hit rates from patient query/response use cases
• Make recommendations on how to implement improvements to an IIS
• Create a standardized set of test cases that can be used across all IIS

2.1 Project Approach

To address the problem of duplicate patient records in IIS, the project established a Patient Data De-duplication Expert Panel.
The panel consisted of 14 Subject Matter Experts (SMEs) and Expert Reviewers from the following organizations:
• American Immunization Registry Association (AIRA)
• Indian Health Service (IHS)
• EHR vendors
• IIS programs and vendors
• IIS consultants and de-duplication experts
• Academic institutions

The work of the expert panel was performed from August 2011 through March 2013. Work was assigned to one of two roles: 1) SMEs for primary content generation and 2) Expert Reviewers for content and product review.

As an expert panel, the group represented decades of profound expertise in IIS patient de-duplication procedures, tools, methods, and system administration. The membership of the expert panel is detailed in Appendix A.

The expert panel agreed upon the following mission statement: “The Patient De-duplication Expert Panel is comprised of key stakeholders focused on developing best practices and resources to improve patient de-duplication processes and the quality of IIS patient data.”

The expert panel work followed a well-defined approach:
• Panel Recruitment
• Off-Line Research and Preparation
• Panel Work
  o Phase 1: Orientation
  o Phase 2: Literature Review
  o Phase 3: Patient-Level De-Duplication National Practice Assessment (NPA) results and manuscript
  o Phase 4: Vocabulary Definition
  o Phase 5: New Test Case Specification Development
  o Phase 6: Best Practice Statement Development
  o Phase 7: Final Report Development and Dissemination
  o Phase 8: Test Case Development and Dissemination
• Conference Presentations
• Publications and Website Updates
• Lessons Learned

Orientation involved initial preparatory off-line work including literature reviews, assembly of pertinent materials, production of preparatory notes, analysis of processes, and development of preliminary drafts. This effort was performed by a small group of business analysts and SMEs.

The Patient-Level De-duplication NPA examined and reported trends, problems, and approaches currently being taken on a national basis. A peer-reviewed paper is the by-product of this effort and will be published separately.

The work of the expert panel was conducted utilizing formal project management and facilitation techniques.
Work methods included facilitated, pre-scheduled, bi-weekly teleconferences with expert panel SMEs that involved the following:
• Sharing of individual experiences
• Group discussions of patient de-duplication issues
• Voting via SurveyMonkey to stimulate and elicit best-practice agreement and disagreement
• Development and review of materials
• Drafting of consensus-based recommendations

The CDC sponsored an intensive, 3½-day in-person session in Atlanta, GA from February 21-24, 2012. This in-person meeting covered all of the domain areas in the scope of work, included the full workgroup of expert panel SMEs, and utilized facilitated modeling techniques. The development and formulation of consensus-based recommendations occurred during strategic group breakout sessions.

The work that followed the in-person session finalized the development of the best practice discussions and test case specifications. Additional teleconferences were dedicated to reviewing specific patient de-duplication practice questions; the work was divided among small groups of SMEs for development and then reviewed by the group in its entirety. The expert panel’s definition of consensus did not reflect 100% agreement, but rather “I can live with that and support it.”

To help organize and coordinate the expert panel’s work, Northrop Grumman Corporation’s (NGC) Public Health Division was retained as the project contractor. Northrop Grumman provided project management, IIS de-duplication subject matter expertise, and administrative support. Their scope of work included recruiting and constituting the expert panel; providing guidance toward collaborative examination, evaluation, and analysis; facilitation services; and proposing practice-based standardized solutions. Additionally, NGC supported test case development and final report authorship and production.

2.2 CDC IIS Panel Scope of Work

The major focus of this project was IIS patient-level de-duplication. This focus included the development of best practice guidelines and the creation of an updated set of test cases.

De-duplication of immunization records can be a two-fold problem that includes de-duplication at the patient level (e.g. two records describe the same patient) and de-duplication at the vaccination event level (e.g. two records describe the same immunization event). The scope of this project considered only the first of these two processes. Additionally, the project was not focused on the assembly of lifetime immunization records or on clinical decisions related to the immunization schedule.

The expert panel focused on five domain areas:

1. De-duplication software approaches, capabilities, specifications, and measurement metrics
   o Practice-based evaluation of the efficacy of de-duplication approaches
   o Validation of contextual models
   o Best practice guidance on the ability of de-duplication software to yield better and more accurate results
2. Incoming data and manual data entry de-duplication practices
   o Practice recommendations around the validation and cleansing of incoming data using unique identifiers to shortcut de-duplication process interrogation
   o Identification of the most problematic data sources and situations
   o Guidance on prescreening incoming records to reduce manual effort
   o Recommendations to external providers to procedurally avoid duplication situations
3. Retrospective de-duplication processes
   o Best practices around de-duplication of existing patient data
   o Specifications around what additional data elements, particularly from the immunization history, may be useful
   o Identification of idealized record merge and unmerge practices
4. Manual de-duplication review processes
   o Identification of strengths and weaknesses of approaches, processes, and techniques used
   o Consideration of merge and unmerge processes
   o Identification of manual review productivity improvements
5. De-duplication testing
   o Development of an updated and expanded set of test cases to help assess the ability of an IIS to detect and de-duplicate patient records
   o Specifications for more robust patient de-duplication test cases, including the considerations for measuring sensitivity and specificity
   o Specifications for the nature, type, and volume of test cases
   o Review of new test cases featuring updated and expanded data utilization
   o Recommendations for test case packaging, distribution, and use

The scope of this document includes domains 1 through 4. The outputs of Domain 5: De-Duplication Testing, are documented in a separate report (Test Case Development & Utilization) that will accompany the new test cases.

3 Foundational Concepts for IIS Patient De-duplication

3.1 Objective

The goal of patient-level de-duplication is to correctly match all records related to the same patient even when there are variances in the data used to establish the patient’s identity. Matching or linking records relating to the same patient from several or multiple data sources is often required to integrate the information needed to construct an accurate immunization history. Patient matching for record updates and detecting
