“There’s one thing I always wanted to do before I quit.”
“What’s that?”

--Groucho Marx, “Animal Crackers”

For several years now I have been advising (if not agitating) surveyors to get more involved in the burgeoning field of Geographic Information Systems (GIS). I have on occasion written a suggestion or two about how to develop a GIS from scratch. I have even been known to be critical about how some systems were constructed and operated. But I had a dirty little secret. I had never actually constructed a brand-new GIS from scratch.

I was happy managing a GIS. And I was even happier about having the opportunity to share my experience via this column. But I always had a secret desire to build a GIS from inception. And I wanted to accomplish it from a private sector perspective.

“I knew if I could get my chance, then I could make the people dance.”

--Don McLean, “American Pie”

In the fall of 2006, I retired from my position with the county of San Diego and waited to see what life after government service was like. I always said only two things could get me to “unretire.” One would be a job as county surveyor for a small rural county. The other would be the chance to build a GIS “from the ground up” for a small city or town.

I’m sure everyone has heard the old cliché “Be careful what you wish for because you just might get it.” And it didn’t take long. I was contracted by the city of El Cajon, Calif., to produce a blueprint for a new GIS from scratch. Perfect! El Cajon is a city of 95,000 people spread over 14.4 square miles in San Diego County. I had barely gotten started when I realized despite my “vast depth of experience,” I had never actually written one of these plans before. Of course I had written annual budget plans, project plans and a slew of reports and proposals. But amazingly, I had never written so much as a bare-bones outline for constructing a GIS from the ground up.

Well, not to worry, right? There’s plenty of canned stuff out there on the ‘Net to use as a model. And I know all this stuff anyway, right? Not! It didn’t take me long to realize there were going to be no significant shortcuts. There was no “Easy” button to be found anywhere. From scratch means just that--from scratch.

Despite my status as an outside consultant, I knew from experience that plan writing is anything but a solo effort. In GIS, perhaps even more so than surveying, team building is a fundamental key to a successful mission. Working in the proverbial vacuum will quickly steer your project into the nearest ditch. So before writing a single line, I knew I had to identify all the players and interview them. Because of my experience as a GIS manager, I knew that people from a variety of different disciplines dabble in GIS and GIS-type activities. So I wasn’t surprised to find about eight copies of ESRI ArcView of various releases pigeonholed in several city departments. This was actually a good start, because it demonstrated that there was interest. What had been lacking was “The Plan.”

Developing a plan to share data across internal organizational boundaries requires both a method and a design. It also requires everyone to acknowledge the one rule in GIS that never changes: “It’s about the data.” The presumption that GIS is all about the software is a popular misconception. The most common question I get as a GIS consultant is “What software do I need to do GIS work?” The second most frequent query is “How much does a copy of ArcView cost?” It isn’t that these questions are not reasonable and valid, but they often presume too much. My typical answers are “It depends” or “What is it you are trying to do?” And that is often the reason designing a GIS for any organization can be challenging. It really is all about the data.

This GIS diagram illustrates the complexity of a sample urban data model.

The Data Model

OK, if it’s about the data--what data? And for that matter, what is data? Good questions that deserve answers. And that brings us to the data model.

A core element for a GIS is a logical data model. The data model is the heartbeat that controls the pulse of the data throughout the system. It describes a complex version of the real world in a database. Logical data models are typically of three types: relational, object-oriented or object-relational. It is important to choose the data model carefully. All logical links will be made to the model you select. So, those links must be identified.

In plainer English, what data do we want to gather? And how do we want to use and present it? Once the data model is developed, all future layers and links will be based upon it. So it is important to make the right choices at the development stage. As a business rule, all GIS layers will conform to the approved model, as a matter of both necessity and consistency.

The Relational Data Model

The relational data model is the most common, as well as the simplest and easiest to work with. The relational data model typically utilizes the most basic relationships between features and attribute data. It is often the best choice for a new GIS of moderate size. In the relational model, data is stored in standard tabular format. These models are normally intuitive, easy to work with and fairly uncomplicated to modify and edit.

The relational data model is also described as a file-based model. The shapefile is by far the most common variety of the relational data model encountered. The relational data model easily links the geographic component to the attribute data of common municipal assets such as fire hydrants, valves, manholes, utility poles, traffic signs and structures.
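For readers who think better in code, here is a minimal sketch of that link between geometry and attributes, using Python's built-in sqlite3 module. The table names, column names and hydrant values are all made up for illustration; a real system would more likely keep this in shapefiles or a full database.

```python
import sqlite3

# In-memory database. All table names, column names and values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hydrant_geometry (
        hydrant_id INTEGER PRIMARY KEY,
        x REAL NOT NULL,   -- easting, in the city's projected units
        y REAL NOT NULL    -- northing
    );
    CREATE TABLE hydrant_attributes (
        hydrant_id INTEGER PRIMARY KEY REFERENCES hydrant_geometry(hydrant_id),
        install_year INTEGER,
        flow_gpm INTEGER
    );
""")
conn.executemany("INSERT INTO hydrant_geometry VALUES (?, ?, ?)",
                 [(1, 100.0, 250.0), (2, 180.5, 310.0)])
conn.executemany("INSERT INTO hydrant_attributes VALUES (?, ?, ?)",
                 [(1, 1987, 1000), (2, 2003, 1500)])

# The shared key joins "where it is" to "what we know about it".
rows = conn.execute("""
    SELECT g.hydrant_id, g.x, g.y, a.install_year, a.flow_gpm
    FROM hydrant_geometry AS g
    JOIN hydrant_attributes AS a ON g.hydrant_id = a.hydrant_id
    ORDER BY g.hydrant_id
""").fetchall()

for row in rows:
    print(row)
```

The point is the shared key: each feature's location sits in one table, its descriptive data in another, and a simple join brings them together.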

A typical GIS model.

The Object-Oriented Data Model

The object-oriented data model has become more popular as advances in technology allow greater levels of sophistication. Objects are real-world entities including both natural and manmade features like rivers and structures.

The geodatabase is an example of an object-oriented data model. In a geodatabase, the features can have intelligence. Autodesk Map 3D uses a different form of an object-oriented model. In Autodesk Map 3D, attribute fields are “encapsulated” into data fields, giving the objects intelligence.

Object-oriented data models often contain behaviors. For example, modeling the behaviors of natural phenomena like floods and wildfires is possible in real time. Facility performance models for sanitary or storm sewer systems can also be programmed into an object-oriented data model. The object-oriented data model requires a greater degree of skill and effort to create and maintain.
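A toy illustration of a feature that carries its own behavior might look like the Python sketch below. The SewerPipe class, its fields and the capacity figure are assumptions invented for this example. They are not drawn from the geodatabase or from Autodesk Map 3D, and a real facility performance model would be far more involved.

```python
from dataclasses import dataclass


@dataclass
class SewerPipe:
    """A hypothetical feature class: data plus behavior in one object."""
    pipe_id: str
    diameter_in: float
    capacity_gpm: float  # assumed design capacity, not a computed hydraulic value

    def utilization(self, flow_gpm: float) -> float:
        """Fraction of the design capacity used at the given flow."""
        return flow_gpm / self.capacity_gpm

    def is_surcharged(self, flow_gpm: float) -> bool:
        """True when the given flow exceeds the design capacity."""
        return flow_gpm > self.capacity_gpm


pipe = SewerPipe("P-001", diameter_in=8.0, capacity_gpm=400.0)
print(pipe.utilization(300.0))    # 0.75
print(pipe.is_surcharged(450.0))  # True
```

The attributes and the rules for interpreting them travel together, which is what gives the object its "intelligence."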

The Object-Relational Data Model

If you guessed that this model incorporates the best features of the relational and object-oriented data models, you might be a GIS guy. However, it is a bit more complex than that. The object-relational data model places the behaviors of the object-oriented model on top of a relational data model.

This is also referred to as an “intelligent” data model. The object-relational data model is constructed to work with extended environments such as SQL (Structured Query Language) and RDBMS (relational database management systems).

This model provides a robust engine to support enterprise-wide GISs where there are multiple data editors and a large base of end users. The object-relational model is almost a must for supporting applications like interactive web-based map pages. The ESRI Spatial Database Engine (SDE) is a good example of this type of data model.
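The layering idea, behaviors on top of relational storage, can be sketched in a few lines of Python. This is only an illustration of the concept and makes no claim about how SDE works internally. The valves table, the Valve class and the three-year exercise interval are all hypothetical.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Valve:
    """Hypothetical object layer: one instance per row of the valves table."""
    valve_id: int
    x: float
    y: float
    last_exercised_year: int

    def is_due_for_exercise(self, current_year: int, interval_years: int = 3) -> bool:
        # The interval is an assumed maintenance rule, used only for illustration.
        return current_year - self.last_exercised_year >= interval_years


# Relational layer: plain tables, queried with SQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE valves ("
    "valve_id INTEGER PRIMARY KEY, x REAL, y REAL, last_exercised_year INTEGER)"
)
conn.executemany("INSERT INTO valves VALUES (?, ?, ?, ?)",
                 [(1, 10.0, 20.0, 2003), (2, 15.0, 22.0, 2006)])

# Object layer: rows come out of the database and gain behavior.
query = "SELECT valve_id, x, y, last_exercised_year FROM valves ORDER BY valve_id"
valves = [Valve(*row) for row in conn.execute(query)]
due = [v.valve_id for v in valves if v.is_due_for_exercise(current_year=2007)]
print(due)  # [1]
```

The database handles storage, concurrent editing and queries, while the object layer supplies the rules. That division of labor is the appeal for a large, multi-user system.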

Content and Business Rules

Once an appropriate data model has been selected, there are a couple of other important choices to make before we start constructing our database. Of course, we need to determine exactly what data will populate our data model. That content will typically emerge as part of the data model development phase.

But there is another important issue: business rules. Business rules matter because they have a direct impact on the size and integrity of the data model. The format chosen for each data category is one such rule. Raster data consists of continuous numeric values like elevations. Continuous categories (e.g., soil types) use the raster model. Discrete features (e.g., parcel and road locations) use the vector model.
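The difference between the two formats is easier to see side by side. The Python sketch below uses made-up elevations, a hypothetical 10-unit cell size and an invented rectangular parcel. It also assumes that row 0 of the grid begins at y = 0, which is just one of several possible conventions.

```python
# Raster: a grid of cells, each holding a value (made-up elevations).
elevation_grid = [
    [410.0, 412.5, 415.0],
    [408.0, 411.0, 414.0],
    [405.5, 409.0, 412.0],
]
cell_size = 10.0  # ground distance covered by one cell (hypothetical)


def elevation_at(x, y):
    """Look up the cell containing (x, y); row 0 is assumed to start at y = 0."""
    col = int(x // cell_size)
    row = int(y // cell_size)
    return elevation_grid[row][col]


# Vector: a discrete feature stored as coordinate geometry (made-up parcel corners).
parcel = [(0.0, 0.0), (30.0, 0.0), (30.0, 20.0), (0.0, 20.0)]


def polygon_area(vertices):
    """Area of a simple polygon by the shoelace formula."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


print(elevation_at(12.0, 25.0))  # 409.0
print(polygon_area(parcel))      # 600.0
```

The raster answers "what is the value here?" for any location in the grid, while the vector feature knows its own boundary and can report exact measurements such as area.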

The choice between single and double precision can change the size of the database significantly, as can the inclusion of imagery and other referenced digital media. These issues also need to be addressed in the data model design phase. For more detailed information on data models, visit www.esri.com and search for the term “data models” to find case studies, samples and more.
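A rough back-of-the-envelope calculation shows why the precision choice matters. The Python sketch below assumes the standard 4-byte single and 8-byte double floating-point formats. The vertex count and the sample coordinate are hypothetical, and the figures cover raw coordinate storage only, ignoring indexes, attributes and file overhead.

```python
import struct

n_vertices = 1_000_000      # hypothetical number of vertices in the database
values_per_vertex = 2       # x and y only

bytes_per_single = struct.calcsize("<f")   # 4 bytes
bytes_per_double = struct.calcsize("<d")   # 8 bytes

size_single = n_vertices * values_per_vertex * bytes_per_single
size_double = n_vertices * values_per_vertex * bytes_per_double
print(size_single, size_double)  # 8000000 16000000

# The trade-off: single precision keeps only about seven significant digits.
easting = 6291234.567       # hypothetical large projected coordinate
as_single = struct.unpack("<f", struct.pack("<f", easting))[0]
print(as_single)            # 6291234.5
```

Double precision doubles the coordinate storage, but at coordinate values in the millions single precision can only resolve to about half a unit. Whether that is acceptable depends on the accuracy the data is expected to support.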

Is There More to Creating a GIS Than the Data?

The data model is a primary component of any useful GIS plan. But there is a lot more to consider. We will look a little more deeply into GIS plan development in my October column.