1. What Is Geocoding? - Census.gov


1. What is Geocoding?

Geocoding is an attempt to provide the geographic location (latitude, longitude) of an address by matching the address to an address range. The address ranges used in the geocoder are the same address ranges that can be found in the TIGER/Line Shapefiles, which are derived from the Master Address File (MAF). The address ranges are potential address ranges, not actual address ranges. Potential ranges include the full range of possible structure numbers even though the actual structures might not exist. The majority of the address ranges we have are for residential areas; only limited address ranges are available in commercial areas. Our address ranges are regularly updated with the most current information available to us.

The hypothetical graphic below may help customers understand the concept of geocoding and Census Geography (addresses displayed in this document are fictitious and shown for example only). If we look at Block 1001 in the example below, the address range in red, 101-199, is the range of numbers that spans the actual individual house numbers associated with the blue circles (e.g. 103, 117, 135 and 151 Main St) on that side of the street (i.e. the Left side; note the arrow is pointing to the right on Main Street). Based on this logic, the from address would be 101 and the to address would be 199 for this address range. Besides providing a user with the geographic location of an address, the Census Geocoder can also provide all of the additional Census geographic information associated with a location, for example a Census Block, Tract, County, and State.

For a definition of many of the Census terms discussed in this document please consult the Census Bureau’s Geography Reference page.
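
The idea of placing an address along an address range can be sketched in a few lines of Python. This is only an illustration, not the Census Geocoder's actual algorithm: it assumes a straight street segment and simple linear interpolation between the from and to ends, it ignores odd/even parity, and the class name, function name, and coordinates are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Point = Tuple[float, float]  # (longitude, latitude)


@dataclass
class AddressRange:
    """A potential address range on one side of a street segment (hypothetical type)."""
    from_number: int   # e.g. 101
    to_number: int     # e.g. 199
    from_point: Point  # coordinates at the "from" end of the segment
    to_point: Point    # coordinates at the "to" end of the segment


def interpolate(house_number: int, rng: AddressRange) -> Optional[Point]:
    """Return an interpolated (lon, lat), or None if the number is outside the range."""
    low, high = sorted((rng.from_number, rng.to_number))
    if not low <= house_number <= high:
        return None  # the address does not fall within this address range
    span = rng.to_number - rng.from_number
    # Fraction of the way from the "from" end to the "to" end.
    t = 0.0 if span == 0 else (house_number - rng.from_number) / span
    lon = rng.from_point[0] + t * (rng.to_point[0] - rng.from_point[0])
    lat = rng.from_point[1] + t * (rng.to_point[1] - rng.from_point[1])
    return (lon, lat)
```

With a 101-199 range, house number 150 lands halfway along the segment, whether or not a structure actually exists there. That is the sense in which the ranges are potential rather than actual.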

2. How do I create a digital file so that I can submit my addresses for batch geocoding?

An important first step in preparing your file for batch geocoding is to format your spreadsheet of addresses into 5 columns like the example below.

Note: City, State and Zip Code fields can be left blank. See example below.

The geocoder accepts input files in text (.csv, .txt, .dat) and Excel (.xls, .xlsx) formats. The output file is provided in the same format as the input file. Most database software products have the option of saving your file in these formats.

3. What is the Census Geocoder?

The Census Geocoder (https://geocoding.geo.census.gov/) allows customers to submit one or many addresses (10,000 addresses is the limit) to determine their geocode (interpolated latitude and longitude) if they fall within a census address range. The latitude and longitude coordinate system is NAD83. There are seven different geocoding options to choose from on the home page: three options under “Find Locations Using” and four options under “Find Geographies Using” (see example below).

We will discuss each of these below and group them based on the similarity of their output product.

A. Find Locations Using One Line.

For this selection the user types one address in the text box, using commas to separate the different address components (note example below).

The user can choose among three different address range (AR) benchmarks in the Benchmark pulldown menu. Benchmark refers to the time period when the address range was captured in TIGER. Public AR Current is the most current benchmark. More Benchmark information is discussed in Section #4, What does Vintage and Benchmark mean?

A discussion of the output is provided in Appendix A.

B. Find Locations Using Address.

This option allows the user to input an address through a series of text boxes to get a geocode (see example below).

A discussion of the output is provided in Appendix A.
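
For the batch options, the 5-column input file described in Section 2 could be produced with a short script rather than by hand. The sketch below is illustrative only. The column order (unique ID, street address, city, state, ZIP code) and the absence of a header row are assumptions, since the example figure is not reproduced in this text; the function name and sample addresses are hypothetical.

```python
import csv
import io

MAX_BATCH_ROWS = 10_000  # public geocoder limit stated in Section 3


def write_batch_file(rows, out):
    """Write (id, street, city, state, zip) records as 5-column CSV; return the row count."""
    rows = list(rows)
    if len(rows) > MAX_BATCH_ROWS:
        raise ValueError(f"{len(rows)} addresses exceeds the {MAX_BATCH_ROWS} limit")
    writer = csv.writer(out, lineterminator="\n")
    for record_id, street, city, state, zip_code in rows:
        if not str(street).strip():
            raise ValueError(f"record {record_id}: street address is required")
        # City, State and Zip Code may be left blank, but the columns must still exist.
        writer.writerow([record_id, street, city or "", state or "", zip_code or ""])
    return len(rows)


if __name__ == "__main__":
    buffer = io.StringIO()
    write_batch_file(
        [(1, "103 Main St", "Anytown", "MD", "20746"),
         (2, "117 Main St", None, None, None)],
        buffer,
    )
    print(buffer.getvalue(), end="")
```

Passing an open file object instead of `io.StringIO()` writes the same content to disk. Checking the row count up front avoids submitting a file the public geocoder would reject.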

C. Find Locations Using Address Batch.

This option allows the user to submit multiple addresses in a digital file formatted in text (.csv, .txt, .dat) or Excel (.xls, .xlsx).

To get started the user selects their digital address file for input with the Browse button, next chooses a Benchmark type, and finally clicks the Get Results button to receive the output information. After the processing is complete the user will be prompted by their web application to either open the GeocodResults.csv file or save it (note screen shot below).

The output information is shown in Appendix B.

D. Find Geographies Using

There are four options under the Find Geographies Using method, which are discussed below.

I. Find Geographies Using One Line.

For this option the user types one address in the text box, using commas to separate the different address components (see example below).

This option allows the user to choose a different address range (AR) benchmark (discussed above) and different Geography Vintage types (note image below). Vintage is the date when the geography information was captured. More Vintage information is discussed in Section #4, What does Vintage and Benchmark mean?

The output information is shown in Appendix C.

II. Find Geographies Using Address

This option allows the user to input an address through a series of text boxes (see example below).

Again the user has the option of choosing a different Benchmark and Vintage. The output information is shown in Appendix C.

III. Find Geographies Using Address Batch.

This option allows the user to submit multiple addresses in a digital file formatted in text (.csv, .txt, .dat) or Excel (.xls, .xlsx) to obtain the Census Geography information associated with each address.

To get started the user selects their digital address file for input with the Browse button, next chooses a Benchmark and Vintage type, and finally clicks the Get Results button to receive the output information. After the processing is complete the user will be prompted by their web application to either open the GeocodResults.csv file or save it (note screen shot below).

The output information is shown in Appendix D.

IV. Find Geographies Using Geographic Coordinates.

This option allows the user to submit Longitude (X) and Latitude (Y) coordinate values to determine the geography associated with that location.

The output information is shown in Appendix C. Note that this option does not output any address range information.

4. What does Vintage and Benchmark mean?

Benchmark refers to the date or time frame when the address range repository was last updated. Vintage refers to the date or time frame that the geography is from. If you choose an address range from an earlier time frame you will only be able to choose geography from that time or earlier. For example, if you choose a 2010 Benchmark Address Range you will only have the option of choosing Vintage geography from either 2000 or 2010. If you choose a current Benchmark Address Range you will have the option of choosing Vintage geography from Current, ACS2015, ACS2014, ACS2013 or Census 2010.

5. Possible reasons your address did not geocode.

There are several possible reasons why an address you submit may not geocode to a Census address range.

- The street and address that you submitted truly does not exist or has not been built.
- The address is a newly constructed home that has not yet been captured by our address capturing techniques.
- The address may have existed at one time but now does not exist (i.e. the housing unit may have been demolished or destroyed by natural or man-made causes).
- The address may have existed at one time but now does not exist (i.e. the housing unit address may have been changed to a non-residential one).
- The house number or street name may have changed because of renaming and/or renumbering due to E911 activities.
- The address submitted matches to a single address range street segment. Because of the Census Bureau’s commitment to Title 13, individual address information is considered confidential and cannot be released to the public. A single address range street segment essentially identifies the location and name of a single address, which is prohibited and cannot be released.

Our address ranges consist mainly of residential addresses. If you do not get a result and you know the approximate location of the address, we recommend you use our TIGERweb interactive map viewer. If you do not know the approximate location, we recommend you use outside sources to determine it. We are continually improving our addresses and address ranges. We release updated geography and address ranges at least once per year. Please send any questions or comments to geo.geocoding.services@census.gov.

6. LUCA participants

Entities submitting address lists for geocoding to the LUCA Geocoder will submit their addresses to the Census Bureau via the SWIM application. The address list file submitted to the LUCA geocoder can exceed 10,000 records and must be in a .csv format similar to the formats discussed above for the public geocoder. All files submitted to the LUCA geocoder must be formatted into 5 columns like the example shown in Section 2 above.

The output products that are returned to the LUCA participants are provided to assist them with their LUCA submission. We shall discuss these specific data products next.

Output file format for the Address Count List.

Address Geocoding Output File Layout for the Address Count List

Column Name: Example
MATCHED ‘M’: The total number of addresses that matched to an address range.
MATCHED ‘T’: The total number of addresses that tied between two or more Census address ranges. A Tie indicates multiple possible results for that address.
MATCHED ‘U’: The total number of addresses that did not match to an address range.
TABSTATE: Tabulation State FIPS Code
TABCOUNTY: Tabulation County FIPS Code
TABTRACT: Census Tabulation Tract Code
TABBLOCK: Census Tabulation Block Code
ADDRESS COUNT: The number of addresses that matched to a census address range in the particular block.

The output file format for the Address List:

Output Fields: Definition
LINE NUMBER: The unique number for each address.
INPUT ADDRESS: The address submitted by the participant.
MATCH INDICATOR: The results in this column refer to whether the address matched a census address range (Match), did not match (No Match), or was a tie between two or more Census address ranges (Tie). A Tie indicates multiple possible results for that address.
MATCH TYPE: Indicates if the match that occurred was exact (Exact), or equivocated (Equivocate).
TIGER OUTPUT ADDRESS: The equivocated address that matched the TIGER address range.
INTERPOLATED LONGITUDE, INTERPOLATED LATITUDE: The interpolated Latitude and Longitude value based on the address range location.
TIGERLINE ID: The specific TIGER Line Identifier number.

TIGERLINE SIDE: The side of the address range that the input address matched to, left or right.
STATE: The two digit state FIPS code.
COUNTY: The three digit county FIPS code.
TRACT: The six digit Census Tract code.
BLOCK: The Census Block code (4-6 digits).

Appendix A

The output data when using Find Locations Using One Line and Address.

The output information is defined in the table below:

Output Fields: Definition
Matched Address: The address that was used in the geocoding process.
Coordinates: The Longitude (X) and Latitude (Y) values based on an interpolation of where the address falls along the address range.
Tiger Line Id: The unique Tiger Line Id of the street segment.
Side: The side of the street the address range lies on, either L (Left) or R (Right).
From Address: The from address value.
To Address: The to address value.
PreQualifier: A word or phrase that precedes all other elements of the street name and modifies it, but is separated from the street name by a street name predirectional and/or pre-type.*
PreDirection: A word preceding the street name that indicates the direction taken by the thoroughfare from an arbitrary starting point, or the sector where it is located.*
PreType: The element of the complete street name preceding the street name element that indicates the type of street.*
Street Name: The official name of the street.
SuffixType: The element of the complete street name following the street name element that indicates the type of street.*
Suffix Direction: A word following the street name that indicates the direction taken by the thoroughfare from an arbitrary starting point, or the sector where it is located.*
SuffixQualifier: A word or phrase that follows all other elements of the street name and modifies it, but is separated from the street name by a street name suffix-type and/or suffix-directional.*
City: The city the address is located in.
State: The State the address is located in.
Zip: The Zip Code the address is located in.

(* Note documentation for this is available at the following data/05-11.2ndDraft.CompleteDoc.pdf).

Appendix B

The output data when using Find Locations Using Address Batch.

The output information is defined in the table below:

Output Fields: Definition
Record ID Number: The unique number for each address submitted. Note the output file may not return the records in the same order as submitted.
Input Address: The address submitted by the customer.
TIGER Address Range Match Indicator: Values in the Match Results column refer to whether the address matched a census address range (Match), did not match (No Match), or was a tie between two or more Census address ranges (Tie). A Tie indicates multiple possible results for that address.
TIGER Match Type: Indicates if the match that occurred was exact (Exact), non-exact (Non-Exact), tied with another address range (Tie) or no match (No Match).
TIGER Output Address: Standardized version of the input address that was used to match to the TIGER address range.
Interpolated Latitude and Longitude: The Longitude (X) and Latitude (Y) values based on an interpolation of where the address falls along the address range.
TIGERLine ID: The unique Tiger Line Id of the street segment.
TIGERLine ID Side: The side of the street the address range lies on, either L (Left) or R (Right).

Appendix C

Results from using Find Geographies Using One Line, Address or Geographic Coordinates.

For each successful match using any of these options the software returns the following information (note the four example screen shots below; the examples have been cut up to provide a natural break to help describe the output data).

This first section of the output data is similar to the output data given for the Find Locations using either One Line or Address (please see Appendix A).

Output screen shot #2 (Geographic information for the county that the address range is located in.)

The output information is defined in the table below:

Geography Output County: RNAMELSADC ID
Definition:
MAF/TIGER Object Identifier
State FIPS Code
Functional Status
The Length of the State Boundary
Water Area in Square Miles for the County
The Name of the County
Legal Statistical Area Description Code
Centroid Longitude (of the county)
The Geometric Area of the State
Base name portion of the Standardized Name
Internal Point Latitude
County FIPS Class Code
MAF/TIGER Feature Classification Code
County FIPS Code
Geographic Identifier – Fully Concatenated Geographic Code
Centroid Latitude
Internal Point Longitude
Land Area in Square Miles (of the County)
County National Standard Code
MAF/TIGER Object Identifier?

Output screen shot #3 (Geographic information for the tract that the address range is located in.)

The output information is defined in the table below:

Geography Output Tract: ONBASENAMEINTPTLATMTFCC
Definition:
MAF/TIGER Object Identifier
State FIPS Code
Functional Status
Census Tract Number
Water Area in Square Miles for the Tract
Legal Statistical Area Description Code
Centroid Longitude (of the tract)
Base name portion of the Standardized Name
Internal Point Latitude
MAF/TIGER Feature Classification Code
County FIPS Code
Geographic Identifier – Fully Concatenated Geographic Code
Centroid Latitude
Internal Point Longitude
Land Area in Square Miles of the Tract
MAF/TIGER Object Identifier
Tract Number

Output screen shot #4 (Geographic information for the block that the address range is located in.)

The output information is defined in the table below:

Geography Output Block: IXLSADCCENTLONLWBLKTYPBASENAME NDOBJECTIDTRACT
Definition:
Census Block Group Number
MAF/TIGER Object Identifier
Functional Status
State FIPS Code
Water Area in Square Miles for the Block
The Name of the Block
The Suffix of the Block
Legal Statistical Area Description Code
Centroid Longitude (of the block)
Land/Water Block Type
Base name portion of the Standardized Name
Block Number
Internal Point Latitude
MAF/TIGER Feature Classification Code
County FIPS Code
Geographic Identifier – Fully Concatenated Geographic Code
Centroid Latitude
Internal Point Longitude
Land Area in Square Miles for the Block
MAF/TIGER Object Identifier?
Tract Number

Output screen shot #5 (Geographic information for the State that the address range is located in.)

The output information is defined in the table below:

Geography Output State: EOIDCENTLATINTPTLONREGIONAREALANDOBJECTID
Definition:
MAF/TIGER Object Identifier
State FIPS Code
Functional Status
State Name
Water Area in Square Miles for the State
Legal Statistical Area Description Code
Centroid Longitude (of the State)
USPS State Abbreviation
Base name portion of the Standardized Name
Internal Point Latitude
Census Division Code
MAF/TIGER Feature Classification Code
State National Standard Code
Geographic Identifier – Fully Concatenated Geographic Code
Centroid Latitude
Internal Point Longitude
Census Region Code
Land Area in Square Miles for the State
MAF/TIGER Object Identifier?

Appendix D

Results from using Find Geographies Using Address Batch

The output information is defined in the table below.

Output Fields: Definition
Record ID Number: The unique number for each address submitted. Note the output file may not return the records in the same order as submitted.
Input Address: The address submitted by the customer.
TIGER Address Range Match Indicator: Values in the Match Results column refer to whether the address matched a census address range (Match), did not match (No Match), or was a tie between two or more Census address ranges (Tie). A Tie indicates multiple possible results for that address.
TIGER Match Type: Indicates if the match that occurred was exact (Exact), or equivocated (Equivocate).
TIGER Output Address: Standardized version of the input address that was used to match to the TIGER address range.

Interpolated Latitude and Longitude: The Longitude (X) and Latitude (Y) values based on an interpolation of where the address falls along the address range.
TIGERLine ID: The unique Tiger Line Id of the street segment.
TIGERLine ID Side: The side of the street the address range lies on, either L (Left) or R (Right).
State Code: The State FIPS Code Identifier
County Code: The County FIPS Code Identifier
Tract Code: Census Tract Number
Block Code: Census Block Number
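
A batch result file with the Appendix D fields could be read back with a script along the following lines. This is a sketch under several assumptions not confirmed by this document: the file is comma-separated with the fields in the order listed above, the coordinates arrive as a single "longitude,latitude" field, unmatched rows carry only the first few fields, and the state, county, tract and block codes can be concatenated into one block identifier. All names and sample values are hypothetical.

```python
import csv
import io
from typing import List, NamedTuple, Optional


class GeoResult(NamedTuple):
    record_id: str
    matched: bool
    lon: Optional[float]
    lat: Optional[float]
    block_geoid: Optional[str]  # state + county + tract + block, concatenated


def parse_row(row: List[str]) -> GeoResult:
    """Convert one output row (assumed Appendix D field order) into a GeoResult."""
    record_id, indicator = row[0], row[2]
    # Rows marked "No Match" or "Tie" are treated as unmatched here.
    if indicator != "Match" or len(row) < 12:
        return GeoResult(record_id, False, None, None, None)
    lon_text, lat_text = row[5].split(",")  # Longitude (X) first, then Latitude (Y)
    state, county, tract, block = row[8:12]
    return GeoResult(record_id, True, float(lon_text), float(lat_text),
                     state + county + tract + block)


def parse_results(text: str) -> List[GeoResult]:
    """Parse the full contents of a results file, skipping blank lines."""
    return [parse_row(row) for row in csv.reader(io.StringIO(text)) if row]
```

Because the output file may not return records in the order submitted, results are best joined back to the input by the Record ID Number, not by position.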
