First, I want to express that I am honored to be here and to be able to chat with you about gazetteers. I guess I will start with a few words to position how the Open GIS Consortium belongs in this discussion. As I look at the people in this room, I see some of you are what I call "fire fighters". You are actively engaged in modeling urban development, or ecology, or forest management, or other applications of geospatial information. Some of you are not so much fighting the fires as you are filling up the tanker trucks so that others can fight fires more effectively. And you know what I mean. You are capturing data; you're contributing to gazetteers; you're building the reservoir of knowledge. Others in the audience are fire truck builders; I see several here who are members of industry who are building software, database tools, human interfaces, and other devices that are used by fire fighters. The Open GIS Consortium isn't any of those; instead, we are drawing blueprints for the next generation of fire trucks. Today the hose from one fire truck does not always fit the valve of another. You know exactly what I mean; there are a lot of issues centered on interoperability in GIS. But the next generation of fire trucks and the next generation of fire fighters need to be able to share their "information" more easily. We need to be able to build workflows that use best-of-breed components from various parts of industry, and to be able to mix and match these and get out of the "stovepipes" that were discussed earlier.
Another way of looking at the Open GIS Consortium is that we are implementing the standards that are being built in the various standards committees. Certainly we're cooperating very closely with ISO TC211. We are also developing relationships with TC204, the transportation people, and with JTC1 subcommittee 32, the SQL Multi-Media people, to make sure that we have common standards across our industry.
With that in mind, I want to show you what Bob Rugg was reticent to show you, I guess, and that's a little piece of Unified Modeling Language (UML) that ISO TC211 is using to clarify the meaning and behavior of gazetteers. This (Figure 1) says that in "ISO-land", features are positioned one of two ways. In the first way, features are positioned with coordinates - or, more precisely, they are positioned by a direct position (in the lower left of the figure). A direct position is coordinates along with a "type" that tells you what those coordinates mean. The "type" in this instance is usually a datum, a spheroid, and a projection. The second way of specifying the location of a feature is by "identifier" (in the upper right of the figure). The "identifier" might be an address, a place name, a phone area code, or other "name". The set of all the features that are positioned by identifier is called a gazetteer. So a gazetteer is the object in the upper left of Figure 1 and it is related somehow to name space "types", and I think that's what Linda Hill was just talking about. A gazetteer is also characterized by a bunch of location instances, and those are the footprints or points or however else you reference the feature. Both of these, names and locations, are related through some kind of generic name table, which has types of its own. So that's a simplified draft definition of what a gazetteer is. I wouldn't copy it down because it's certainly not finished and I'm not even sure it's right.
I'd rather introduce the next topic, which is "what is a gazetteer anyway?" I've heard the word a lot this morning. I don't hear a lot of consensus that we agree exactly about what a gazetteer is. Here's my offering of what a gazetteer is. This is the world's simplest definition of a gazetteer: a gazetteer is a mapping whose domain is a set of place names and whose range is a set of values. The "values" can be coordinates, temperatures, driving instructions, or most anything. Of course, once you implement that simple gazetteer into a computer, then it becomes easy to make the functional arrow point both ways, and then a gazetteer becomes a set of relationships that is either a set of functions or inverse functions (Figure 2). Again the mapping is between names and values, but now the values can be thought of as vectors with multiple components. I don't know if this defines the general case of a gazetteer or not, but I think it's a worthy thing to come up with a rigid and robust definition of gazetteer.
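The two-way mapping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name SimpleGazetteer, its methods, and the sample names and coordinates are hypothetical and are not taken from any OGC or ISO specification.

```python
from collections import defaultdict


class SimpleGazetteer:
    """Toy gazetteer: a mapping from place names to values, plus its inverse.

    All names here are hypothetical. Values are tuples so that they can carry
    multiple components and still be used as dictionary keys.
    """

    def __init__(self):
        self._forward = {}                 # name -> value
        self._inverse = defaultdict(set)   # value -> set of names

    def add(self, name, value):
        # Drop the stale inverse entry if the name is being re-registered.
        old = self._forward.get(name)
        if old is not None:
            self._inverse[old].discard(name)
        self._forward[name] = value
        self._inverse[value].add(name)

    def lookup(self, name):
        """Forward direction: name -> value (None if the name is unknown)."""
        return self._forward.get(name)

    def names_for(self, value):
        """Inverse direction: value -> sorted list of names."""
        return sorted(self._inverse.get(value, ()))


gaz = SimpleGazetteer()
gaz.add("Paris", (48.86, 2.35))            # illustrative (lat, lon) values
gaz.add("Ville Lumiere", (48.86, 2.35))    # a second name for the same value
gaz.add("Tokyo", (35.68, 139.69))

print(gaz.lookup("Tokyo"))
print(gaz.names_for((48.86, 2.35)))
```

The inverse direction returns a list rather than a single name because several names can share one value, which is why the two-way gazetteer is better thought of as a set of relationships than as a single function.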
In the Open GIS Consortium we are centered on asking what the behaviors of a gazetteer should be. What are the keywords or the protocols by which one should request gazetteer services? How do I engage a gazetteer? How do I trigger it to do something useful for me? If the industry does not have a common standard for that, then we will never be interoperable.
Let me start a list of attributes that any good gazetteer should possess.
First, gazetteers should be interoperable. Interoperability depends a lot on how gazetteers interface to other operations and to other gazetteers. So we need to write down the "protocols" or the "interface commands" or the "object invocations". We have to settle on those "names", the way we spell them, and the way we attach parameters to them, and the meanings of all of those parameters, and the valid ranges of all of those parameters. That's where interoperability comes from. And that's really what Open GIS is about.
Second, gazetteers should also be invertible - that's making the functional arrow go the other way.
Third, gazetteers should support nested environments because so many names are nested one inside of another.
Fourth, gazetteers should be seamless in the sense that if I join two gazetteer services together, they should look to the client as if they were one (smarter) gazetteer.
Fifth, gazetteers should be pan-thematic. One should be able to roam smoothly across the various themes that a gazetteer holds.
Sixth, gazetteers should be locational. The information in digital gazetteers needs to be tightly bound to spatial and topological information. Digital gazetteers will be able to create map-like views, and be able to relay geometric and topological information to their users.
Seventh, gazetteers should be computational. They should be smart. They should be able to answer questions like, how far is it from A to B? That is, from name A to name B. That means a gazetteer should be able not only to return coordinates, but also to do calculations on those coordinates.
Eighth, digital gazetteers should be dynamic. That is, it should be possible to update them at any time.
Ninth, gazetteers should be authentic. That is, the various records in them should come from the authority for those records. And of course I mean not just a single authority but families of authorities.
Tenth, digital gazetteers should be clever, and the standards that we stick on gazetteers should not prevent them from being clever. When I think of gazetteers, I'm always reminded of the Paris subway flier. It's just a simple front and back of a small piece of paper that tells you where all the Paris metro stops are. It provides their relative geometry; their topology; their names - and you can locate them by name or by stop number or by "end" destination. It's a very fancy gazetteer - it's very clever. We shouldn't preclude that kind of cleverness.
Eleventh, digital gazetteers should be self-aware. What does that mean? Well, that means that when a client knocks on a gazetteer server door and says "Gazetteer, are you able to do the following function?" the gazetteer should know about itself and say "yeah, I do it" or "no, I don't do it" or, more likely, "no, I don't do it, but I have some friend servers who do - can I pass you to them?" Sometimes I think a gazetteer should behave even more surreptitiously. You be the client and I'll be the server - you knock on my door and say "Can you do Task A?" and I, the server, looking at my list of capabilities, find that I can't do Task A, but I'm not telling the client that - I'll turn to another server and ask "Can you do Task A?" and if he says yes, then I'll come back to the client and say "yes, I can do Task A". Well, that's the kind of self-awareness that all GIS components should have, including gazetteers.
Last, gazetteers should be monolithic - that is, the set of all gazetteers should look to a client as one really big gazetteer. Gazetteers should meld into one another forming one consistent whole.
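Two of the attributes in the list above, "computational" and "self-aware", lend themselves to a small sketch. The Python below is a toy model only: the class names, the do_ method-naming convention, and the planar sample coordinates are assumptions made for illustration, not an OGC interface.

```python
import math


class GazetteerServer:
    """Hypothetical gazetteer service that knows its own capabilities."""

    def __init__(self, name, places=None, friends=None):
        self.name = name
        self.places = dict(places or {})    # place name -> (x, y), one planar system
        self.friends = list(friends or [])  # other servers it may consult

    def can_do(self, task):
        """Self-awareness: report whether this server itself supports a task."""
        return callable(getattr(self, "do_" + task, None))

    def request(self, task, *args):
        """Answer directly, or quietly pass the request to a friend server."""
        if self.can_do(task):
            return getattr(self, "do_" + task)(*args)
        for friend in self.friends:
            if friend.can_do(task):
                return friend.request(task, *args)
        raise NotImplementedError(f"{self.name} cannot do {task!r}")

    def do_locate(self, place):
        return self.places[place]


class ComputingGazetteer(GazetteerServer):
    """Adds the 'computational' attribute: distance from name A to name B."""

    def do_distance(self, a, b):
        (x1, y1), (x2, y2) = self.places[a], self.places[b]
        return math.hypot(x2 - x1, y2 - y1)


calc = ComputingGazetteer("calc", places={"A": (0.0, 0.0), "B": (3.0, 4.0)})
front = GazetteerServer("front", friends=[calc])

print(front.can_do("distance"))            # front has no distance capability itself
print(front.request("distance", "A", "B")) # answered by the friend server
```

The client only ever talks to front; it never learns that calc did the work, which is the "surreptitious" behavior described above. This sketch delegates one level deep only; a fuller version would also need to guard against servers that refer to each other in a loop.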
Please allow me to close with a word or two about the Open GIS Consortium. We are a not-for-profit membership organization, and our members are corporations, universities, and government entities. We have about 180 members now, and we're all dedicated to the complete integration of geospatial data and geoprocessing into mainstream IT, that is, mainstream Information Technology. Of course, what we are doing is developing consensus interface specifications that will enable precisely that vision. Figure 3 provides a good vantage point from which to think about Open GIS and the role of gazetteers in the future marketplace.
Here are some achievements of the Consortium. Two years ago the OGC issued its first suite of industry consensus interface specifications. This was a bucket of specifications called Simple Feature Access. These are specifications that enable a GIS component to exchange information, or to request information from another component. For example, suppose I have a feature here that happens to be Main Street. I say "Computer, give me all the parcels that are adjacent to Main Street," which is a topological query. Or I could say "Give me all the parcels within 30 feet of Main Street", which would be a geometric query. Or I could say "Give me all the parcels that have Main Street as a part of their address", which would be a feature attribute query. For each of these kinds of queries, the response to the query would be a feature collection that is provided with a very specific description in the specifications. The Open GIS Consortium's specifications provide a set of protocols that allow GIS components to interoperate, whether they come from the same vendor or not.
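The three kinds of query in the Main Street example can be illustrated with a toy sketch in Python. It does not use the Simple Feature Access interfaces themselves; the Parcel record, the function names, and the sample data are all hypothetical, and the geometric test is simplified to the distance from a parcel's centroid to a single street segment.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Parcel:
    parcel_id: str
    address: str
    centroid: tuple      # (x, y) in feet, hypothetical planar coordinates
    touches: frozenset   # ids of features this parcel shares a boundary with


def distance_to_segment(p, a, b):
    """Shortest distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


MAIN_STREET = {"id": "main-st", "name": "Main Street",
               "segment": ((0.0, 0.0), (1000.0, 0.0))}

PARCELS = [
    Parcel("p1", "12 Main Street", (100.0, 20.0), frozenset({"main-st"})),
    Parcel("p2", "7 Oak Avenue", (400.0, 25.0), frozenset({"oak-ave", "main-st"})),
    Parcel("p3", "98 Main Street", (700.0, 60.0), frozenset({"main-st"})),
]


def adjacent_to(parcels, feature):
    """Topological query: parcels sharing a boundary with the feature."""
    return [p for p in parcels if feature["id"] in p.touches]


def within(parcels, feature, feet):
    """Geometric query: parcels whose centroid lies within `feet` of the feature."""
    a, b = feature["segment"]
    return [p for p in parcels if distance_to_segment(p.centroid, a, b) <= feet]


def addressed_on(parcels, feature):
    """Attribute query: parcels whose address mentions the feature's name."""
    return [p for p in parcels if feature["name"] in p.address]


print([p.parcel_id for p in adjacent_to(PARCELS, MAIN_STREET)])
print([p.parcel_id for p in within(PARCELS, MAIN_STREET, 30.0)])
print([p.parcel_id for p in addressed_on(PARCELS, MAIN_STREET)])
```

Each function returns a plain list standing in for the feature collection that the specifications describe. The sample data is chosen so that the three queries give three different answers, which is the point of distinguishing topological, geometric, and attribute queries.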
A few months ago we issued the catalog service interfaces and also the gridded coverage interfaces. These specifications, like the Simple Feature Access specifications, are free to the public and you can access them from our web page (http://www.opengis.org) and download them.
Another major achievement is the Web Mapping Interoperability Testbed. Four weeks ago last Friday, the Open GIS Consortium held its first major demonstration and we demonstrated what we call Web Mapping Technology. Web Mapping Technology is a lot like that client-server scenario I just gave you. It's where a client may use a catalog to find data stores that contain features of interest, and it enables the client to make simultaneous requests from multiple distributed servers that may belong to another GIS vendor, or run on a different computer platform, or sit on a different network accessible over the Web. Essentially the client says "Listen up, all you data servers. I have this window here on my screen that I'm managing and in that window I want to see a weather map from server A, I want a street map overlaid from server B, I want to see place names from server C, I want to see hospitals from server D." It is very much like the scenario that Mike Goodchild talked about earlier; the difference is that Mike's system is in a stovepipe, and the system that I'm talking about is mix and match any GIS system anywhere. Thirty-three vendors participated in providing technology in support of that testbed. Some of them provided servers; some of them provided clients; some of them provided clients and servers; some provided data; and some offered technical support.
Oh, I didn't tell you what happens next. The client tells the servers exactly the datum, projection, scale, and neat line of the window where it wants the results painted. Well, when the client asks for information from the servers, the client asks "Are you able to serve the data back to me in a PNG or JPEG that I know how to interpret?" The server says, "yeah, I know how to do that" if it's Web-Mapping-Enabled. So a Web-Mapping-Enabled data server knows how to listen for these requests from clients; knows how to take those requests and structure them into JPEGs and PNGs that come back to the client. The client layers those, using transparent pixel technology, so that you can see all of this information stacked one on top of another. It's the vertical integration that Mike talked about.
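The request-and-layer flow just described can be mimicked with a toy model in Python. Nothing here is the testbed's actual protocol: the MapWindow fields, the LayerServer class, and the single-character "pixels" are assumptions made purely for illustration, with None standing in for a transparent pixel.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MapWindow:
    """What the client tells every server (all field names are hypothetical)."""
    datum: str
    projection: str
    bbox: tuple        # neat line: (min_x, min_y, max_x, max_y)
    width: int         # window size in pixels
    height: int
    image_format: str  # e.g. "PNG" or "JPEG"


class LayerServer:
    """Stand-in for a Web-Mapping-Enabled server; draws a fixed toy pattern."""

    def __init__(self, name, formats, painted):
        self.name = name
        self.formats = set(formats)
        self.painted = painted   # {(row, col): colour}; other pixels stay transparent

    def supports(self, image_format):
        return image_format in self.formats

    def render(self, window):
        if not self.supports(window.image_format):
            raise ValueError(f"{self.name} cannot serve {window.image_format}")
        return [[self.painted.get((r, c)) for c in range(window.width)]
                for r in range(window.height)]


def composite(layers):
    """Stack layers bottom to top; None marks a transparent pixel."""
    result = [row[:] for row in layers[0]]
    for layer in layers[1:]:
        for r, row in enumerate(layer):
            for c, pixel in enumerate(row):
                if pixel is not None:
                    result[r][c] = pixel
    return result


window = MapWindow("WGS84", "geographic", (-120.0, 34.0, -119.0, 35.0), 3, 2, "PNG")
servers = [
    LayerServer("weather", {"PNG", "JPEG"}, {(0, 0): "W", (0, 1): "W", (1, 0): "W"}),
    LayerServer("streets", {"PNG"}, {(0, 1): "S", (1, 1): "S"}),
    LayerServer("names", {"PNG"}, {(1, 2): "N"}),
]
# Every server receives the same window, so the returned images line up pixel for pixel.
layers = [s.render(window) for s in servers if s.supports(window.image_format)]
for row in composite(layers):
    print(row)
```

Because every server paints into the same window definition, the client needs no knowledge of how each server stores its data; it only stacks the returned images, letting upper layers show through wherever they are transparent.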
Coordinate transformation is coming next. In order to achieve industry consensus, we work in a double submission process. Essentially we write specifications for how we think the world ought to work. We call those Abstract Specifications. We issue those in Requests for Proposal, and those Requests for Proposal are open and public to anybody. We're asking vendors to come back with proposals that are, in fact, interface specification documents at the implementation level of detail. That means sufficient detail that if two software engineers develop software according to these specifications, then the two applications plug and play with each other (even if each was written with no knowledge of the other).
It's called a "double submission" process, because when the responses come back from industry that have the implementation specifications, we reject those - that's our policy. That's because we usually get more than one proposal the first round, and the way we handle that is that everyone gets a blue pencil and everybody opens everybody's proposal and everybody writes down what's good and what's bad and everybody sees everybody else's blue pencil. And the vendors who submitted those proposals receive all of those comments, and they can see where they're pretty good and where they're not so good, and so far, when the second submission comes (which is three months after the first submission), it comes from a single team. The proposers have worked out all of the differences among themselves and it becomes pretty easy to achieve industry consensus. That's the Request for Proposal process. It leads to Implementation Specifications.
The first submissions for Coordinate Transformations are due November 15, which is three weeks in advance of our December 1999 meeting, which will be at UCLA. I just came from our 27th bi-monthly meeting in Tokyo.
As I said, we start with the Abstract Specifications that describe how the world ought to work. Currently there are 16 volumes that make up the Abstract Specification:
Topic 1. Feature Geometry
Topic 2. Spatial Reference Systems
Topic 3. Locational Geometry Structures
Topic 4. Stored Functions and Interpolation
Topic 5. The OpenGIS™ Feature
Topic 6. The Coverage Type
Topic 7. Earth Imagery Case
Topic 8. Relationships Between Features
Topic 9. Quality
Topic 10. Feature Collections
Topic 11. Metadata
Topic 12. The OpenGIS™ Service Architecture
Topic 13. Catalog Services
Topic 14. Semantics and Information Communities
Topic 15. Image Exploitation Services
Topic 16. Image Coordinate Transformation Services
I will talk about a couple of these that are especially related to gazetteers.
Topic 3 is about Locational Geometry Structures, and that's where the spatial reference by identifier lives. So work is going on within the coordinate transformation working group to extend what's been done so far, and to include gazetteer-like structures.
The other activity related to gazetteers is in Topic 14. That topic volume has existed quietly for about three years, but a special interest group was established at the just-past meeting in Tokyo to carry it forward. That SIG is called the Semantics Work Group, and it is going to address all of the issues of schema translation from one information community to another.
We live in a world of stovepipes today. We're in a world where we are blessed with a lot of really good GIS systems. However, horizontal traffic between these systems (that would enable me to mix and match their services) doesn't work very well today. So, one could describe Open GIS as an attempt to erase those barriers and open up the horizontal channels. The result will be the vertical integration we talked about earlier.
In closing, let me emphasize that, from the Open GIS Consortium point-of-view, what this meeting is about is business, because what "gazetteer enabled technology" really means is better planning, better force management, better business, better directions to a traveler, better access to a pizza place. That's what the ultimate good is: it's business; it's geospatially enabled business - it's digital gazetteer enabled business.
Thank you.