April 1, 2006
The zen of GIS is that everything is somewhere, right? And so it is. But we have to be able to find what we want in order to work with it. In GIS, we have powerful tools to help locate and organize our information.
A View of ArcCatalog
In the previous installment of this series ("Managing Survey Data in a GIS," POB February 2006) we focused on data storage technology. This time we will explore one of the methodologies for getting the most out of our data repositories using ESRI's ArcCatalog. ArcCatalog is one of the three applications in ESRI's ArcGIS package (ArcMap and ArcToolbox make up the rest of the suite).
The ArcCatalog window resembles Windows Explorer. And in some ways it works in a very similar fashion. The left pane is called the catalog tree (see Figure 1). The catalog tree displays all the drives and folders the user's computer is connected to. It is the window that displays the paths to all the data available to the user. The "Connect to Folder" icon (horizontal yellow arrow and globe) on the toolbar is the same as the one in the "Add Data" window of ArcMap.
Users can connect to folders on the hard drive, CD-ROM, local network drives or various other external drives using the catalog tree. Two additional folders in the catalog tree are the "Database Connections" folder, which supports connections to relational database management systems (RDBMS) queried with SQL (Structured Query Language), and the "Coordinate Systems" folder. The Coordinate Systems folder is the designated location for managing all available coordinate systems. It has a complete catalog of predefined systems and projections and also allows the user to add custom coordinate systems.
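For readers who like to see an idea in code, the catalog tree can be approximated with a short Python sketch that walks a connected folder and groups the data files it recognizes. This only illustrates the concept and is not how ArcCatalog itself works. The function name build_catalog and the extension table are assumptions made for the example, and ArcCatalog recognizes far more formats than the handful listed.

```python
import os
from collections import defaultdict

# Hypothetical mapping of file extensions to data categories. This short
# table is an assumption for illustration only.
DATA_TYPES = {
    ".shp": "shapefile",
    ".dwg": "cad drawing",
    ".dxf": "cad drawing",
    ".tif": "raster",
    ".mdb": "personal geodatabase",
}


def build_catalog(root):
    """Walk a folder tree and group recognized data files by category.

    Returns a dict that maps each category to a sorted list of paths
    relative to the root folder.
    """
    catalog = defaultdict(list)
    for folder, _subfolders, files in os.walk(root):
        for name in files:
            extension = os.path.splitext(name)[1].lower()
            category = DATA_TYPES.get(extension)
            if category is not None:
                full_path = os.path.join(folder, name)
                catalog[category].append(os.path.relpath(full_path, root))
    return {category: sorted(paths) for category, paths in catalog.items()}
```

Calling build_catalog on a project folder returns something like a flattened catalog tree: one list of drawings, one of shapefiles, one of rasters, each with the path needed to find the file again.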
Database Connections Explained
Two types of database connections are supported in ArcCatalog: ArcSDE (Spatial Database Engine) and OLE DB (Object Linking and Embedding, Database). OLE DB databases are usually non-spatial. See Figure 2 on p. 49.
A large enterprise or "federated" GIS typically has two components, the production environment and the publication environment. This is one of the more misunderstood aspects of GIS. But conceptually, it is really very basic. The production environment is where the data maintenance is performed. The publication environment is where the data is meant to be viewed or read.
It is important to understand why production data and published data must be handled separately. In the warehouse, or publication environment, the data remains static. All queries are pointed to that static, published data set. Editing continues in the live data set, but access is restricted to authorized editors until the next publication.
Querying data in real time while it is being edited produces errors, both in the reports and in the system itself, in a data-sharing environment. A healthy, vital GIS is perpetually under construction. Data users can be very frustrated by queries that fail or results that change because the data is actively being edited. The best solution for this type of problem is to frequently publish and "warehouse" data in a database. When the end user is connected to one of these managed databases, he typically has the ability to view or copy from the database. But he cannot perform edits.
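The separation described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a description of ArcSDE or any real warehouse: the class name GisRepository, its methods and the sample parcel record are all hypothetical. Edits go to a production copy, queries read only the last published snapshot, and the snapshot refuses changes.

```python
import copy
from types import MappingProxyType


class GisRepository:
    """Toy model of separate production and publication environments."""

    def __init__(self):
        self._production = {}                    # live data, editors only
        self._published = MappingProxyType({})   # static, read-only view

    def edit(self, feature_id, attributes):
        """Change the live data set; the published data is not touched."""
        self._production[feature_id] = dict(attributes)

    def publish(self):
        """Copy the current production data into a new static snapshot."""
        snapshot = copy.deepcopy(self._production)
        # MappingProxyType blocks adding, replacing or deleting features.
        self._published = MappingProxyType(snapshot)

    def query(self, feature_id):
        """Read a copy of a feature from the published snapshot only."""
        record = self._published.get(feature_id)
        return None if record is None else dict(record)

    @property
    def published(self):
        return self._published


repo = GisRepository()
repo.edit("parcel-1", {"owner": "Smith"})
repo.publish()
repo.edit("parcel-1", {"owner": "Jones"})   # editing continues after publication
print(repo.query("parcel-1"))               # still the published value
```

Because queries never touch the production dictionary, results stay stable between publications no matter how much editing is under way.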
Viewing the Data
The true essence of GIS is understanding the data. A critical step in understanding data is viewing and examining it in some detail. The catalog tree, as we have established, displays the hierarchy of all the data the user has access to. The entire aggregate is known as (surprise, surprise) the catalog. Clicking on an item in the catalog selects it for viewing. The right pane allows the user to view the elements of the data and perform a variety of maintenance tasks. The tabs in the right pane give the user the tools to review, examine, prepare and select data sets for projects.
The first tab is "Contents" (see Figure 3). The Contents tab displays the elements of data sets. There are four choices on the toolbar that the user can select from to view the contents: "List" (the default), "Large Icons," "Thumbnails" and "Details." The List and Large Icons views are more or less self-explanatory. The Details view is similar to details in Windows Explorer. It shows the descriptive characteristics of each item and allows the user to sort and manipulate the items in the view. The Thumbnails view shows a "snapshot" of each item in the contents list, displaying either the item's features or a description.
The second tab is "Preview." This window can be toggled between two settings, "Table" and "Geography" views. Setting the Preview box to Table view displays tabular data (if available) in the selected feature set. For non-spatial data sets, this is the only viewing option.
In the Geography view, the window displays feature data, vector (points, lines and polygons) or raster (image) data. The appearance of raster data sets in the Geography view depends on the number of raster bands. TIFF (Tagged Image File Format) and DOQQ (Digital Orthophoto Quarter Quadrangle) are common types of raster data (see Figure 4). When the user is navigating around the catalog tree looking for data to use in a project, ArcCatalog has a handy tool on the right side of the toolbar called the "Create Thumbnail" tool. This differs from the Thumbnails display option under the Contents tab; it allows the user to create a custom view using the pan and zoom tools.
Thumbnails created by the user are saved in a Map document (.mxd). When a document is open in ArcMap, the user can simply "drag and drop" the thumbnail into a Map document as a layer. In fact, all spatial data sets in the catalog can easily be dragged and dropped directly into an open ArcMap document.
Viewing CADD Data
Viewing CADD files in ArcCatalog is the same as viewing any other data set: the user simply selects the CADD file in the catalog tree. CADD drawings are represented as data sets in ArcCatalog. Drawing features are shown as individual feature classes. Remember, a feature class is a group of geometric objects such as points, lines or polygons and their annotations. All components of a feature class must be on the same coordinate system. Feature classes can be grouped into feature data sets (see Figure 5 on p. 50). But it is also important to remember that ArcCatalog displays those features in the default symbology. So CADD drawings will rarely if ever appear exactly the same as they appear in native CADD. Version 9.2 of ArcGIS promises to improve this condition.
From a data management standpoint, most drawings will appear almost instantly in the preview window at the click of a mouse. There is far less waiting for them to "open." And the user can still "zoom" all features as well as toggle them on and off in the feature class.
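The rule that every component of a feature class shares one coordinate system can be pictured as a small data structure. The Python sketch below is only a mental model, not ESRI's implementation. The class names, the fields and the coordinate system labels are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    geometry_type: str        # "point", "line" or "polygon"
    coordinates: list         # list of (x, y) pairs
    coordinate_system: str    # hypothetical label for the coordinate system


@dataclass
class FeatureClass:
    name: str
    coordinate_system: str
    features: list = field(default_factory=list)

    def add(self, feature):
        """Accept a feature only if it matches the class coordinate system."""
        if feature.coordinate_system != self.coordinate_system:
            raise ValueError(
                f"{self.name} uses {self.coordinate_system}, "
                f"not {feature.coordinate_system}"
            )
        self.features.append(feature)


# Hypothetical feature class built from one layer of a drawing.
monuments = FeatureClass("monuments", "NAD83 State Plane")
monuments.add(Feature("point", [(1000.0, 2000.0)], "NAD83 State Plane"))
```

Trying to add a feature tagged with a different coordinate system raises an error instead of silently mixing coordinates, which is the point of the rule.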
CADD drawings will also display tables in ArcCatalog. However, those tables will contain only elements embedded in the drawing's layers. Data in project files will appear as a separate data set. But ArcCatalog will not display all of these files as Windows Explorer does. As we build larger and larger data sets, finding individual files and documents becomes more problematic.
This CADD/GIS interoperability is not a new idea. It is, though, very powerful. It's more than just the seamless integration of CADD and GIS. It's the concept of managing all data under a sort of "central command" linked by a common denominator. In this case, it is geography that provides that critical link.
Creating Metadata
Returning to the right pane of the catalog, the third tab is Metadata. Metadata is the data about the data. Users can create metadata from basically any data managed in ArcCatalog. The metadata vehicle in ArcCatalog is called a stylesheet. (Some readers may recall we briefly touched on this in "The Data Tsunami," POB July 2003.) With the basics of ArcCatalog covered, let's create and edit some real metadata. ArcCatalog will actually create the metadata for the user if it doesn't already exist. This happens automatically after clicking the "Metadata" tab. After clicking, the user may ask, "Is it metadata yet?" Well, yes, but there is room for improvement.
Immediately above the tabs is the metadata toolbar. There are five buttons from left to right: "Edit metadata," "Metadata properties," "Create/Update metadata," "Import metadata" and "Export metadata" (see Figure 6 on p. 50). All of the buttons except "Create/Update metadata" bring up additional dialog boxes.
For the stylesheet, we will choose FGDC ESRI. When the Metadata tab is clicked, ArcCatalog displays all of the characteristics defined in the selected stylesheet as an HTML (Hypertext Markup Language) page in the right pane.
In this particular case I have selected an AutoCAD drawing file.
The "Edit metadata" button brings up a dialog box with two rows of tabs. And yes, each tab brings up an additional dialog box and some of those have yet additional tabs. But no one should be intimidated by this. In many cases the user need only populate a field one time, and others only for major changes. ArcCatalog caches those entries and pulls them from the system the next time metadata is edited. The user only needs to re-enter changes and information for new fields.
Further, the user only needs to fill in the fields of the tabs he feels are necessary to provide to other users of his data. In this case we choose the "Spatial Reference" tab, fill in the boxes and save. We have now created metadata for an AutoCAD drawing. The metadata is stored in Extensible Markup Language (XML) format automatically. And it can very easily be exported using the Export Metadata tool and sent to another user along with its parent file(s).
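To give a feel for what such an XML record holds, here is a minimal Python sketch that builds and reads a stripped-down metadata document. The element names (metadata, idinfo, citation, citeinfo, title, spref, horizsys, planar, gridsys, gridsysn) follow the short names of the FGDC content standard as we understand them. The record ArcCatalog actually writes has many more elements, including ESRI-specific ones, so this is an illustration and not a copy of its output. The function names and the drawing name site_plan.dwg are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_metadata(title, grid_system_name):
    """Build a minimal FGDC-style record with a title and spatial reference."""
    root = ET.Element("metadata")

    # Identification information: just the title of the data set.
    idinfo = ET.SubElement(root, "idinfo")
    citation = ET.SubElement(idinfo, "citation")
    citeinfo = ET.SubElement(citation, "citeinfo")
    ET.SubElement(citeinfo, "title").text = title

    # Spatial reference: the name of the grid coordinate system.
    spref = ET.SubElement(root, "spref")
    horizsys = ET.SubElement(spref, "horizsys")
    planar = ET.SubElement(horizsys, "planar")
    gridsys = ET.SubElement(planar, "gridsys")
    ET.SubElement(gridsys, "gridsysn").text = grid_system_name
    return root


def read_grid_system(xml_text):
    """Return the grid coordinate system name from a metadata record."""
    root = ET.fromstring(xml_text)
    return root.findtext("spref/horizsys/planar/gridsys/gridsysn")


# Hypothetical drawing name, used only for the example.
record = build_metadata("site_plan.dwg", "State Plane Coordinate System 1983")
xml_text = ET.tostring(record, encoding="unicode")
print(read_grid_system(xml_text))
```

Because the record is plain XML, the person who receives the exported file can read the spatial reference with any XML tool, whether or not ArcCatalog is installed.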
Successful Data Management
In this piece we have viewed some data using ArcCatalog. Some of the data was GIS data and some of it was CADD data. Then we took a few gentle "baby steps" into the world of metadata creation. But the overall purpose of the exercise was to demonstrate that even vast reservoirs of data and information can be managed with a relatively uncomplicated plan. It is easier to manage CAD in a GIS framework than just using Windows and the CAD program. The tools and technology are readily available.
The possibilities of this type of data management are only limited by the imagination of the user. Having the data all available in one place makes it simple for the user to quickly pull up point files to check values, check the most recent imagery to help lay out a project, or search for a project file. And, as always, there is help available at the ESRI Virtual Campus at http://campus.esri.com. "Using ArcCatalog: Tips and Tricks" and "Geoprocessing CAD With ArcGIS" will help both new and experienced users with the data management concept discussed here.