Tech Talk: Metadata
February 25, 2011
The term metadata has been around for a while. For certain geospatial professionals, the term, its meaning and its relevance to their work are intrinsic to their practice. For other professionals, metadata is part of what they do, but they don’t think of it as metadata, nor do they make a big deal about it. And there are still others who don’t think about metadata at all. So what exactly is metadata? Simply put, metadata is data about data.
Surveyors may wonder why they need to care about metadata. First, realize that metadata is used outside the geospatial world, too. An image may include metadata that describes how large the picture is, the color depth, the image resolution, when the image was created and other data. A text document’s metadata may contain information about how long the document is, who the author is, when the document was written and a short summary of the document. Web pages often include metadata in the form of meta tags. Description and keyword meta tags are commonly used to describe the Web page’s content. Search engines can use this data when adding pages to their search index, although how much weight they give it varies. Many of the software products you use automatically create metadata to go with the data you create. Sometimes these metadata files are visible to you, and sometimes they are not. Ask the manufacturer what metadata its product creates and whether you can view it.
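To make the Web page case concrete, here is a minimal sketch that reads the description and keywords meta tags from a page using Python's standard html.parser module. The sample page, the class name MetaTagCollector and the function read_meta_tags are invented for this illustration; real pages vary widely in what they include.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collects <meta name="..." content="..."> pairs from an HTML page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # Only <meta> tags that carry both a name and a content attribute
        # are treated as descriptive metadata here.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content is not None:
            self.meta[name.lower()] = content


# Hypothetical page, made up for this example.
SAMPLE_PAGE = """
<html><head>
<title>Boundary Surveys</title>
<meta name="description" content="Boundary and topographic surveys.">
<meta name="keywords" content="surveying, plats, GPS">
</head><body></body></html>
"""


def read_meta_tags(html_text):
    """Return a dict mapping meta tag names to their content strings."""
    collector = MetaTagCollector()
    collector.feed(html_text)
    return collector.meta


if __name__ == "__main__":
    for name, content in read_meta_tags(SAMPLE_PAGE).items():
        print(name, "=", content)
```

The page body is empty in this sample, yet the meta tags still tell a reader (or an indexing program) what the page is about. That is the sense in which they are data about data.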
GIS professionals, geodesists and cartographers routinely publish data about their data. Why? Because when data is supplied from one entity to another, it may be useless without some explanatory information, such as whether the coordinates are in plane X, Y and up order; north, east and elevation order; or based on the origin of the east zone of New York’s State Plane Coordinate System of 1983.
But it can be a lot more basic than that. It wasn’t very long ago that local surveyors issued plats so incomplete that state agencies and professional associations responded by creating mandatory or voluntary standards of practice. A plat is data, too, but some of these plats lacked basic information, such as scale, north orientation, specificity in where the survey was done, the basis for directions given on the plat, owner’s name, names of the adjoiners, and the book and page number of the county records where the deed to the subject property was last recorded. Some may say these plats were missing data. True, but often what was really missing was data about data, i.e., metadata.
As a surveyor in private practice, I have encountered surveys that didn’t identify where the property was located even in the most general way. We sometimes rant about the many synonyms lawyers use when they write. Why use sell, mortgage, hypothecate, etc.? Why do lawyers--and some surveyors, as well--use words such as by, along, abutting and touching? Sometimes it is because they want to make sure there is no misinterpretation of intent. The same holds true with metadata: it is there to remove doubt about what the data means.
Data can be useless without metadata. The data you create can fall into many categories, so this is not the place to talk specifically about what metadata you ought to publish with each dataset. Keep in mind that the person most affected by missing metadata may be you, when you return to your own data later. There are general items you should consider including when you create any dataset. The items listed below may not all apply to you, and you may come up with additional items. The main thing is to establish a set of metadata publishing standards for yourself and your organization for every type of dataset you create--including final products, such as maps, plats and reports.
Items to publish include:
• Means of creation of the data. For surveyors, it is important to state whether the data was created with a total station, GPS static, GPS RTK or laser scanner. Sometimes it is also helpful to include the number of epochs or session times (see also standards below).
• Purpose of the data. Data used to create a topographic map is seldom good enough to also be used to determine a property line location.
• Time and date of creation. This is particularly important if there is a set of dates for the field work and another date for actual publication.
• Creator or author of data. Make sure to note whether there is a chain of creators or authors.
• Location of the data on the computer network where it was created. This information is increasingly important in many organizations.
• Standards used. This can include many things for surveyors. Some things to think about are the positional accuracy of different classes of data in the dataset and actual procedures used, such as the number of times an angle was measured, epochs, whether the leveling was double run, etc.
• Datum and projection information.
• Units. This may seem obvious, but the question is not only meters versus feet but also U.S. survey feet versus international feet.
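One way to turn a checklist like this into a habit is to store it as a small record that travels with each dataset. The sketch below is only an illustration under assumptions: the class name DatasetMetadata, its field names and every sample value (including the network path) are hypothetical, not a published standard. It also shows why the units item matters. The international foot is exactly 0.3048 meter, while the U.S. survey foot is exactly 1200/3937 meter, and the difference adds up over the large coordinate values common in state plane systems.

```python
import json
from dataclasses import dataclass, asdict

# Exact definitions of the two feet, in meters.
INTERNATIONAL_FOOT_M = 0.3048
US_SURVEY_FOOT_M = 1200 / 3937


@dataclass
class DatasetMetadata:
    """Hypothetical record mirroring the checklist items above."""
    creation_method: str   # e.g. total station, static GPS, RTK GPS, scanner
    purpose: str
    fieldwork_dates: list
    publication_date: str
    authors: list          # chain of creators, in order
    network_location: str
    standards: str
    datum: str
    projection: str
    units: str


def to_json(record):
    """Serialize the record so it can be saved next to the dataset."""
    return json.dumps(asdict(record), indent=2)


def feet_mismatch(distance_m):
    """Feet of disagreement if a metric distance is converted with the
    international foot when the U.S. survey foot was intended."""
    return distance_m / INTERNATIONAL_FOOT_M - distance_m / US_SURVEY_FOOT_M


# All values below are invented for illustration.
EXAMPLE = DatasetMetadata(
    creation_method="GPS RTK",
    purpose="Topographic mapping; not suitable for boundary determination",
    fieldwork_dates=["2011-01-10", "2011-01-11"],
    publication_date="2011-02-01",
    authors=["Field crew A", "Office reviewer B"],
    network_location="//server/projects/example-job/topo",
    standards="Each angle measured twice; levels double run",
    datum="NAD 83",
    projection="New York State Plane, East Zone",
    units="U.S. survey feet",
)

if __name__ == "__main__":
    print(to_json(EXAMPLE))
    # Disagreement for a coordinate value of one million meters.
    print(round(feet_mismatch(1_000_000), 2), "feet")
```

A plain-text or JSON file like this costs almost nothing to produce, and it answers the questions a recipient would otherwise have to guess at. For a coordinate value of one million meters, the two foot definitions disagree by more than 6 feet, which is far from negligible in boundary work.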
On its website, the USGS has posted a useful reference about metadata specifically as it relates to Federal Geographic Data Committee (FGDC) data.1 The FGDC is an interagency committee that promotes the coordinated development, use, sharing and dissemination of geospatial data on a national basis. If this discussion of metadata is all new to you, it will be helpful reading just to get familiar with the jargon.
It can be easy to overlook metadata. A good example is an incomplete statement of latitude and longitude. Someone may say, “I’m from 40.25 degrees, 76.73 degrees.” Generally, we assume the first number is latitude and the second longitude. If so, we then look it up and decide that must be Pennsylvania. The truth of the matter is we didn’t have the metadata to reach that conclusion with certainty. Even with the order assumed, there are actually four places on Earth it could be, since we have the Eastern and Southern Hemispheres as choices, too. Even software manufacturers sometimes forget about the importance of metadata. I know of a navigation system that fails when it enters the Eastern Hemisphere--perhaps because the Western Hemisphere programmers forgot about the other one.
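The four possibilities are easy to list. The short sketch below assumes, as the paragraph does, that the first number is latitude and the second is longitude, and that neither carries a sign or hemisphere letter. The function name candidate_positions is made up for this example; signed values follow the common convention of north and east positive.

```python
from itertools import product


def candidate_positions(first, second):
    """List every (label, latitude, longitude) that an unsigned pair could
    mean, assuming the first value is latitude and the second longitude."""
    candidates = []
    for lat_hemisphere, lon_hemisphere in product("NS", "EW"):
        latitude = first if lat_hemisphere == "N" else -first
        longitude = second if lon_hemisphere == "E" else -second
        label = f"{first}{lat_hemisphere}, {second}{lon_hemisphere}"
        candidates.append((label, latitude, longitude))
    return candidates


if __name__ == "__main__":
    for label, latitude, longitude in candidate_positions(40.25, 76.73):
        print(label, "->", latitude, longitude)
```

Only one of the four results, 40.25 north and 76.73 west, falls in Pennsylvania. A hemisphere letter or a sign on each number is all the metadata needed to remove the ambiguity.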