The ADL Gazetteer Protocol
Linda L. Hill
Center for Global Georeferencing Research
Version 1.1
This document describes a protocol for accessing general-purpose
gazetteer services.
A gazetteer is a dictionary of geographic placenames.
Gazetteers have traditionally appeared as back-of-the-book indexes in
atlases; as place encyclopedias, such as the Columbia Gazetteer of the
World; as thesauri, such as the Getty
Thesaurus of Geographic Names; and as toponymic authority files,
such as NIMA's GEOnet Names Server
and the U.S. Geological Survey's Geographic Names Information
System. In an atlas, a gazetteer provides an alphabetical list of
the placenames that appear in the atlas, and it maps those names to
page numbers and map grid locations. Place encyclopedias often
include descriptive information for locations, as does the Getty
Thesaurus, and sometimes include latitude/longitude coordinates as
well. Toponymic authority files focus on distinguishing official
placenames from variant names, and they associate names with
coordinate locations primarily for disambiguation purposes. Other
toponymic reference works publish scholarly information about the
origins of geographic names.
A digital gazetteer builds on these traditional gazetteers. It
maps geographic placenames (the names of natural features such as
mountains and lakes and the names of human constructs such as cities
and states) to coordinate-based geographic locations. The services it
provides are largely oriented around searching: answering "Where
is...?" queries given all or a portion of a geographic name ("Where is
the place named 'Santa Barbara'?") and "What's there?" queries which
return all places, or all places of a specified class, within a given
region ("What schools are in Santa Barbara County?"). Digital
gazetteers augment traditional gazetteers by providing bidirectional
mappings among placenames, map locations, and classifications. And
they expand on the notion of named geographic features to include
virtually any category of feature that can be geolocated (e.g.,
weather events such as hurricanes), any type of name or label for a
place (e.g., postal codes and UTM grid names), and names with only
local or specialized scope (e.g., research study areas). Descriptive
information and associated data (e.g., population and elevation) can
also be included in digital gazetteers. The ADL gazetteer protocol
builds on this generalized concept of what a gazetteer is.
This document first semi-formally defines an abstract model of a
gazetteer. That model is then used as the basis for defining a set of
services (i.e., a set of network-invokable functions), several report
formats, and a query language.
A caveat: the gazetteer protocol described herein provides
relatively low-level services. The services are intended to be simple
enough that they can be implemented by all gazetteers, yet powerful
enough to be useful to clients both in their own right and for
combining into higher-level services. To get a sense of the level of
this specification, consider the common gazetteer functionality of
finding places by entering qualified placenames, as in "find 'Santa
Barbara, California'". The ADL gazetteer protocol does not provide
such high-level functionality, but it does provide sufficient building
blocks for achieving that functionality. Specifically, the protocol
supports 1) finding a place named "California" belonging to class
"states"; 2) disambiguation in the case of multiple returns; and 3)
finding a place named "Santa Barbara" that is contained within the
place named "California".
In this section we semi-formally define an abstract model of a
gazetteer. The ADL gazetteer protocol is built on (i.e., is written
against) this model.
A gazetteer is a set of gazetteer entries. A
gazetteer entry describes a single geographic place by an
identifier and several key attributes of the place: one or more names,
one or more footprints, and zero or more classes. There is no
intrinsic structure to a gazetteer beyond simple containment of
gazetteer entries, although relationships between entries may be
explicitly represented by the gazetteer (see below).
An identifier is a string that unambiguously identifies
the entry within the gazetteer. The identifier need not be
universally unique.
A name is a complete, unqualified name for the place. For
example, the name of the city of Los Angeles is "Los Angeles", not
"Los Angeles, California". A gazetteer entry can have more than one
name, in which case the names may denote alternative names for the
place (e.g., the city "Köln" is also known as "Cologne") or
varying names over time (e.g., the country "Thailand" was formerly
known as "Siam").
A footprint is an approximation, expressed in
latitude/longitude coordinates, of the subset of the Earth's surface
occupied by the place. Note that a footprint need not be contiguous.
For example, a footprint for the state of Hawaii might consist of a
union of disjoint polygons, one per island. A gazetteer entry can
have more than one footprint, in which case the footprints must
represent different approximations or resolutions of the same
conceptual footprint.
A class classifies the place with respect to a set of
terms. More specifically, a class is the association of the place
with a term drawn from a simple vocabulary of terms or thesaurus (a
vocabulary augmented with inter-term relationships). A gazetteer
entry may belong to multiple classes, and even to multiple classes
from the same thesaurus. Note that if a gazetteer consists of a
single class of places (consider "The Knopf Gazetteer of Cemeteries of
the Southwest"), its entries will not be considered to be classified
for the purposes of this specification unless each entry carries the
classification for searching and reporting purposes.
Each attribute of a gazetteer entry (i.e., each name, each
footprint, and each class) may be qualified as being primary
(i.e., the attribute is the preferred or official value) and/or
historical (i.e., the attribute is known to be no longer
valid). For example, a gazetteer entry for the city Köln may
mark the name "Köln" as primary but not "Cologne"; a gazetteer
entry for the country Thailand may mark the name "Siam" as
historical.
For each gazetteer entry, the following conditions on qualifiers
must hold:
- Exactly one name must be marked as primary.
- Exactly one footprint must be marked as primary.
- If the entry has been classified, at least one class must be
marked as primary.
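The three conditions above lend themselves to a mechanical check. The following Python sketch is illustrative only: the `Attribute` and `Entry` classes and the `qualifier_violations` function are hypothetical names that are not part of the protocol, and footprints are reduced to opaque strings for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Attribute:
    """A name, footprint, or class value with its two qualifiers."""
    value: str
    primary: bool = False
    historical: bool = False


@dataclass
class Entry:
    """Hypothetical in-memory form of a gazetteer entry."""
    identifier: str
    names: list
    footprints: list
    classes: list = field(default_factory=list)


def qualifier_violations(entry):
    """Return a list describing which qualifier conditions the entry breaks."""
    problems = []
    if sum(n.primary for n in entry.names) != 1:
        problems.append("exactly one name must be primary")
    if sum(f.primary for f in entry.footprints) != 1:
        problems.append("exactly one footprint must be primary")
    # The class condition applies only if the entry has been classified.
    if entry.classes and not any(c.primary for c in entry.classes):
        problems.append("a classified entry needs at least one primary class")
    return problems
```

An entry with no classes passes the third condition trivially, which mirrors the "if the entry has been classified" wording above.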
Finally, a gazetteer may be augmented with inter-entry
relationships. A relationship is a named, directed, binary
association between gazetteer entries. For example, a gazetteer might
support a capital-of relationship which relates capital
cities and administrative areas: the city of Sacramento is the capital
of the state of California, and so on. (The ADL gazetteer protocol
defines the necessary structures to support relationships in general,
but it does not define any particular relationships, just as it does
not define any particular classification scheme.)
Functionally speaking, the ADL gazetteer protocol consists of the
following six independent, stateless services. Each service follows
the classical model of function invocation: zero or more arguments are
passed to the service, the service executes synchronously, and a
result and/or an error indication is returned. Support for the
get-capabilities service is mandatory; all other services
are optional. Clients should anticipate that gazetteers may apply
different access control policies to different services.
- capabilities description <- get-capabilities()
Returns a description of the overall capabilities of the
gazetteer (the services and query types the gazetteer supports, the
thesauri the gazetteer uses, etc.). See Capabilities below.
- reports <- query(query, {"standard" | "extended"} [, geometry language])
Returns reports for the gazetteer entries selected by a query.
query is a query expressed in the gazetteer query language;
see Query language below. Either
standard or extended reports may be returned; see Reports below. The geometry language used in the
reports may optionally be requested. The geometry language(s) and the
subset of the query language that the gazetteer supports are described
in the gazetteer's capabilities; see Capabilities below. Clients should
anticipate that a gazetteer may return an error indication in response
to a nominally supported query due to implementation limitations.
Also, a gazetteer may return both reports and an error indication, as
when an internal result limit is reached during otherwise successful
query processing.
- reports <- download({"standard" | "extended"} [, geometry language])
Similar to the query service, the
download service returns standard or extended reports for
every entry in the gazetteer.
- identifier <- add-entry({standard report | extended report})
Adds an entry to the gazetteer and returns the identifier of
the new entry. The entry's attributes are specified by a standard or
extended report; see Reports below. (The
identifier in the report, if any, is ignored.) A gazetteer may
disallow addition of entries using standard reports.
- relate-entries(relationship, identifier1, identifier2)
Creates a relationship named relationship between the
gazetteer entries identified by identifier1 and
identifier2. The relationship must be one of the
relationships supported by the gazetteer; see Capabilities below.
- remove-entry(identifier)
Removes the entry identified by identifier from the
gazetteer. All relationships that reference the removed entry are
removed as well.
An XML-over-HTTP implementation of the services is described next.
In this formulation, a gazetteer service is invoked by submitting an
HTTP POST request to a URL representing the gazetteer's common access
point for all services. The format and discovery of this URL are
outside the scope of this document.
Both service requests and service responses must have MIME content
type text/xml and consist of a single
<gazetteer-service> element in namespace
"http://www.alexandria.ucsb.edu/gazetteer". The
version attribute of this element indicates the version
of the gazetteer protocol used by the client (in requests) or the
gazetteer implementation (in responses).
In a service request, the <gazetteer-service>
element must contain a single subelement expressing the request.
Subelement <S-request>
corresponds to service S above, e.g., subelement
<get-capabilities-request> corresponds to the
get-capabilities service. Arguments to the request, if
any, are encoded as subelements of the request subelement.
In a service response, the <gazetteer-service>
element must contain a single subelement containing the response.
Similar to requests, subelement
<S-response> corresponds to
service S. Each response subelement contains optional,
service-specific, "normal" content (e.g., reports in the case of the
query service) and a mandatory <error>
subelement, the latter of which is nillable. A successful response is
indicated by the presence of normal content and a nil
<error> element, while a non-nil
<error> element indicates an error and describes it
by an implementation-specific code and/or text description. An
implementation may return both normal content and an error,
such as when a query is successfully processed and results are
successfully returned, but the number of results returned is limited
due to an implementation constraint.
Gazetteer implementations should generally return HTTP status code
200 (OK), and should use HTTP error codes only for low-level errors
such as syntactically malformed requests and authentication problems.
Higher-level errors should be returned using the mechanism described
above.
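As a rough illustration of the request format described above, the following Python sketch builds a `<gazetteer-service>` request for an argument-less service and shows how it might be submitted. The function names `build_request` and `post_request` are hypothetical, and the access point URL must be supplied by the caller because its discovery is outside the scope of this document.

```python
import urllib.request
import xml.etree.ElementTree as ET

GAZ_NS = "http://www.alexandria.ucsb.edu/gazetteer"


def build_request(service, version="1.1"):
    """Wrap an empty <S-request> subelement in <gazetteer-service>."""
    # Serialize the gazetteer namespace as the default namespace.
    ET.register_namespace("", GAZ_NS)
    root = ET.Element(f"{{{GAZ_NS}}}gazetteer-service", {"version": version})
    ET.SubElement(root, f"{{{GAZ_NS}}}{service}-request")
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def post_request(url, body):
    """POST a request body as text/xml and return the raw response bytes.

    The url argument is an assumption: the protocol does not define it.
    """
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "text/xml"}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return response.read()


# Build (but do not send) a get-capabilities request.
request_body = build_request("get-capabilities")
```

Services that take arguments would add subelements to the request subelement before serialization.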
gazetteer-service.xsd
<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema"
xmlns:gaz="http://www.alexandria.ucsb.edu/gazetteer"
targetNamespace="http://www.alexandria.ucsb.edu/gazetteer"
elementFormDefault="qualified">
<include schemaLocation="gazetteer-capabilities.xsd"/>
<include schemaLocation="gazetteer-query.xsd"/>
<include schemaLocation="gazetteer-standard-report.xsd"/>
<element name="gazetteer-service">
<complexType>
<choice>
<element ref="gaz:get-capabilities-request"/>
<element ref="gaz:get-capabilities-response"/>
<element ref="gaz:query-request"/>
<element ref="gaz:query-response"/>
<element ref="gaz:download-request"/>
<element ref="gaz:download-response"/>
<element ref="gaz:add-entry-request"/>
<element ref="gaz:add-entry-response"/>
<element ref="gaz:relate-entries-request"/>
<element ref="gaz:relate-entries-response"/>
<element ref="gaz:remove-entry-request"/>
<element ref="gaz:remove-entry-response"/>
</choice>
<attribute name="version" type="string" use="required"/>
</complexType>
</element>
<element name="get-capabilities-request">
<complexType/>
</element>
<element name="get-capabilities-response">
<complexType>
<sequence>
<element ref="gaz:gazetteer-capabilities"
minOccurs="0"/>
<element ref="gaz:error"/>
</sequence>
</complexType>
</element>
<element name="query-request">
<complexType>
<sequence>
<element ref="gaz:gazetteer-query"/>
<element name="report-format">
<simpleType>
<restriction base="string">
<enumeration value="standard"/>
<enumeration value="extended"/>
</restriction>
</simpleType>
</element>
<element name="geometry-language" type="anyURI"
minOccurs="0"/>
</sequence>
</complexType>
</element>
<element name="query-response">
<complexType>
<sequence>
<choice minOccurs="0">
<element name="standard-reports">
<complexType>
<sequence>
<element ref="gaz:gazetteer-standard-report"
minOccurs="0" maxOccurs="unbounded"/>
</sequence>
</complexType>
</element>
<element name="extended-reports">
<complexType>
<sequence>
<any processContents="lax" minOccurs="0"
maxOccurs="unbounded"/>
</sequence>
</complexType>
</element>
</choice>
<element ref="gaz:error"/>
</sequence>
</complexType>
</element>
<element name="download-request">
<complexType>
<sequence>
<element name="report-format">
<simpleType>
<restriction base="string">
<enumeration value="standard"/>
<enumeration value="extended"/>
</restriction>
</simpleType>
</element>
<element name="geometry-language" type="anyURI"
minOccurs="0"/>
</sequence>
</complexType>
</element>
<element name="download-response">
<complexType>
<sequence>
<choice minOccurs="0">
<element name="standard-reports">
<complexType>
<sequence>
<element ref="gaz:gazetteer-standard-report"
minOccurs="0" maxOccurs="unbounded"/>
</sequence>
</complexType>
</element>
<element name="extended-reports">
<complexType>
<sequence>
<any processContents="lax" minOccurs="0"
maxOccurs="unbounded"/>
</sequence>
</complexType>
</element>
</choice>
<element ref="gaz:error"/>
</sequence>
</complexType>
</element>
<element name="add-entry-request">
<complexType>
<choice>
<element ref="gaz:gazetteer-standard-report"/>
<element name="extended-report">
<complexType>
<sequence>
<any processContents="lax"/>
</sequence>
</complexType>
</element>
</choice>
</complexType>
</element>
<element name="add-entry-response">
<complexType>
<sequence>
<element name="identifier" type="string"
minOccurs="0"/>
<element ref="gaz:error"/>
</sequence>
</complexType>
</element>
<element name="relate-entries-request">
<complexType>
<sequence>
<element name="relationship" type="string"/>
<element name="identifier" type="string" minOccurs="2"
maxOccurs="2"/>
</sequence>
</complexType>
</element>
<element name="relate-entries-response">
<complexType>
<sequence>
<element ref="gaz:error"/>
</sequence>
</complexType>
</element>
<element name="remove-entry-request">
<complexType>
<sequence>
<element name="identifier" type="string"/>
</sequence>
</complexType>
</element>
<element name="remove-entry-response">
<complexType>
<sequence>
<element ref="gaz:error"/>
</sequence>
</complexType>
</element>
<element name="error" nillable="true">
<complexType>
<sequence>
<element name="code" type="string" minOccurs="0"/>
<element name="description" type="string"
minOccurs="0"/>
</sequence>
</complexType>
</element>
</schema>
An example of a service request is shown below. The request asks a
gazetteer for standard reports for all populated places whose names
contain the phrase "las vegas".
<?xml version="1.0" encoding="UTF-8"?>
<gazetteer-service
xmlns="http://www.alexandria.ucsb.edu/gazetteer"
version="1.1">
<query-request>
<gazetteer-query>
<and>
<name-query operator="contains-phrase"
text="las vegas"/>
<class-query thesaurus="ADL Feature Type Thesaurus"
term="populated places"/>
</and>
</gazetteer-query>
<report-format>standard</report-format>
</query-request>
</gazetteer-service>
A possible successful response to the above request is shown below.
The response contains a single standard report for a place named "Las
Vegas", also known as "Sin City". The success of the response is
indicated by the nillity of the <error>
subelement.
<?xml version="1.0" encoding="UTF-8"?>
<gazetteer-service
xmlns="http://www.alexandria.ucsb.edu/gazetteer"
xmlns:gml="http://www.opengis.net/gml"
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
version="1.1">
<query-response>
<standard-reports>
<gazetteer-standard-report>
<identifier>1001652</identifier>
<names>
<name primary="true">Las Vegas</name>
<name>Sin City</name>
</names>
<bounding-box>
<gml:coord>
<gml:X>-115.25</gml:X>
<gml:Y>36.15</gml:Y>
</gml:coord>
<gml:coord>
<gml:X>-115.12</gml:X>
<gml:Y>36.25</gml:Y>
</gml:coord>
</bounding-box>
<footprints>
<footprint-reference xlink:href="http://..."
geometry-type="Polygon" num-points="4632"
primary="true"/>
</footprints>
<classes>
<class thesaurus="ADL Feature Type Thesaurus"
primary="true">populated places</class>
</classes>
</gazetteer-standard-report>
</standard-reports>
<error xsi:nil="true"/>
</query-response>
</gazetteer-service>
Finally, here's a possible error response to the above request:
<?xml version="1.0" encoding="UTF-8"?>
<gazetteer-service
xmlns="http://www.alexandria.ucsb.edu/gazetteer"
version="1.1">
<query-response>
<error>
<code>-908</code>
<description>Database connection failure.</description>
</error>
</query-response>
</gazetteer-service>
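A client has to distinguish three outcomes: success (nil error), failure (non-nil error and no normal content), and partial success (non-nil error accompanied by normal content). The following Python sketch shows one way to do so; the function name `classify_response` and the status labels are hypothetical and not defined by the protocol.

```python
import xml.etree.ElementTree as ET

GAZ = "{http://www.alexandria.ucsb.edu/gazetteer}"
XSI_NIL = "{http://www.w3.org/2001/XMLSchema-instance}nil"


def classify_response(xml_text):
    """Return (status, details) for a <gazetteer-service> response.

    status is "ok", "partial", or "error"; details is None on success,
    otherwise a (code, description) pair in which either part may be None.
    """
    root = ET.fromstring(xml_text)
    response = root[0]  # the single <S-response> subelement
    error = response.find(GAZ + "error")
    if error is None:
        raise ValueError("response lacks the mandatory <error> subelement")
    if error.get(XSI_NIL) in ("true", "1"):
        return "ok", None
    details = (error.findtext(GAZ + "code"), error.findtext(GAZ + "description"))
    # Any sibling of <error> is service-specific "normal" content.
    has_content = any(child is not error for child in response)
    return ("partial" if has_content else "error"), details
```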
The ADL gazetteer protocol is defined in terms of the abstract
model given in Gazetteer model above.
In practice, gazetteer implementations will differ from the abstract
model, typically by being more complex. To allow clients to take
advantage of this potentially richer information in a structured
manner, the gazetteer protocol defines two formats for gazetteer
entries: the standard report and the extended report.
The extended report of a gazetteer entry is a
gazetteer-specific format, and is undefined by the gazetteer protocol.
The intention is that all of the information a gazetteer possesses
about an entry be representable by the format. If a gazetteer
supports extended reports, the report format must be defined by an XML
schema; see Capabilities below.
The standard report of a gazetteer entry corresponds to
the abstract gazetteer model. An XML schema for the report format is
listed below. The schema defines element
<gazetteer-standard-report> in namespace
"http://www.alexandria.ucsb.edu/gazetteer". Subelements
<identifier>, <names>,
<footprints>, <classes>, and
<relationships> and attributes primary
and historical correspond directly to the model.
Each footprint may be described either directly using a
<footprint> element or indirectly using a
<footprint-reference> element. In the direct case
the footprint is defined as a single subelement (the
"footprint-defining element") of the <footprint>
element. In the indirect case, the footprint-defining element is
indirectly referred to by a URL, and the optional
geometry-type and num-points attributes can
be used to give clients an indication of the size and nature of the
footprint. Attribute geometry-type, if present, must be
the unqualified XML name of the footprint-defining element and
num-points must be the number of points in the
geometry.
In both of the above cases, the possible footprint-defining
elements may be drawn from the Open
GIS Consortium's Geography
Markup Language (GML) or from another geometry language supported
by the gazetteer; see Capabilities, below.
Support for GML is mandatory. GML's footprint-defining elements
(<gml:Box> and elements in class
gml:_Geometry) are defined in terms of an abstract
Cartesian coordinate system, but we mandate here that the coordinate
system must be the WGS84 latitude/longitude coordinate system.
Specifically, the first (X) coordinate must be longitude in signed
decimal degrees east of the Greenwich meridian and the second (Y)
coordinate must be latitude in signed decimal degrees north of the
equator.
Element <bounding-box> is the bounding box
(i.e., the smallest enclosing graticule-aligned rectangle) of the
entry's primary footprint.
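For a footprint given as a list of points, the bounding box can be computed from coordinate extremes. The Python sketch below is a simplification under stated assumptions: the function name `bounding_box` is hypothetical, points are (longitude, latitude) pairs in the coordinate order mandated above, and footprints that cross the 180° meridian or enclose a pole are not handled.

```python
def bounding_box(points):
    """Return ((min_lon, min_lat), (max_lon, max_lat)) for a point list.

    points is an iterable of (longitude, latitude) pairs in signed
    decimal degrees. Assumes the footprint does not cross the
    180-degree meridian and does not enclose a pole.
    """
    lons, lats = zip(*points)
    return (min(lons), min(lats)), (max(lons), max(lats))
```

The two returned corners correspond to the two `<gml:coord>` elements of a `<bounding-box>`.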
gazetteer-standard-report.xsd
<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema"
xmlns:gaz="http://www.alexandria.ucsb.edu/gazetteer"
xmlns:gml="http://www.opengis.net/gml"
xmlns:xlink="http://www.w3.org/1999/xlink"
targetNamespace="http://www.alexandria.ucsb.edu/gazetteer"
elementFormDefault="qualified">
<import namespace="http://www.opengis.net/gml"
schemaLocation="geometry.xsd"/>
<import namespace="http://www.w3.org/1999/xlink"
schemaLocation="xlinks.xsd"/>
<attributeGroup name="qualifiers">
<attribute name="primary" type="boolean" default="false"/>
<attribute name="historical" type="boolean"
default="false"/>
</attributeGroup>
<element name="gazetteer-standard-report">
<complexType>
<sequence>
<element name="identifier" type="string"/>
<element name="names">
<complexType>
<sequence>
<element name="name" maxOccurs="unbounded">
<complexType>
<simpleContent>
<extension base="string">
<attributeGroup ref="gaz:qualifiers"/>
</extension>
</simpleContent>
</complexType>
</element>
</sequence>
</complexType>
</element>
<element name="bounding-box" type="gml:BoxType"/>
<element name="footprints">
<complexType>
<choice maxOccurs="unbounded">
<element name="footprint">
<complexType>
<choice>
<element ref="gml:_Geometry"/>
<element ref="gml:Box"/>
<element name="other-footprint">
<complexType>
<sequence>
<any processContents="lax"/>
</sequence>
</complexType>
</element>
</choice>
<attributeGroup ref="gaz:qualifiers"/>
</complexType>
</element>
<element name="footprint-reference">
<complexType>
<attributeGroup ref="xlink:locatorLink"/>
<attribute name="geometry-type">
<simpleType>
<restriction base="string">
<enumeration value="Box"/>
<enumeration value="Point"/>
<enumeration value="LineString"/>
<enumeration value="Polygon"/>
<enumeration value="MultiPoint"/>
<enumeration
value="MultiLineString"/>
<enumeration value="MultiPolygon"/>
<enumeration value="other"/>
</restriction>
</simpleType>
</attribute>
<attribute name="num-points"
type="positiveInteger"/>
<attributeGroup ref="gaz:qualifiers"/>
</complexType>
</element>
</choice>
</complexType>
</element>
<element name="classes" minOccurs="0">
<complexType>
<sequence>
<element name="class" minOccurs="0"
maxOccurs="unbounded">
<complexType>
<simpleContent>
<extension base="string">
<attribute name="thesaurus"
type="string" use="required"/>
<attributeGroup ref="gaz:qualifiers"/>
</extension>
</simpleContent>
</complexType>
</element>
</sequence>
</complexType>
</element>
<element name="relationships" minOccurs="0">
<complexType>
<sequence>
<element name="relationship" minOccurs="0"
maxOccurs="unbounded">
<complexType>
<attribute name="name" type="string"
use="required"/>
<attribute name="identifier" type="string"
use="required"/>
</complexType>
</element>
</sequence>
</complexType>
</element>
</sequence>
</complexType>
</element>
</schema>
Here's an example of a standard report with an indirect
footprint:
<?xml version="1.0" encoding="UTF-8"?>
<gazetteer-standard-report
xmlns="http://www.alexandria.ucsb.edu/gazetteer"
xmlns:gml="http://www.opengis.net/gml"
xmlns:xlink="http://www.w3.org/1999/xlink">
<identifier>1001652</identifier>
<names>
<name primary="true">Las Vegas</name>
<name>Sin City</name>
</names>
<bounding-box>
<gml:coord>
<gml:X>-115.25</gml:X>
<gml:Y>36.15</gml:Y>
</gml:coord>
<gml:coord>
<gml:X>-115.12</gml:X>
<gml:Y>36.25</gml:Y>
</gml:coord>
</bounding-box>
<footprints>
<footprint-reference xlink:href="http://..."
geometry-type="Polygon" num-points="4632"
primary="true"/>
</footprints>
<classes>
<class thesaurus="ADL Feature Type Thesaurus"
primary="true">cities</class>
</classes>
<relationships>
<relationship name="principal-city-of"
identifier="1241232"/>
</relationships>
</gazetteer-standard-report>
The footprint corresponding to the above example might look
something like this:
<?xml version="1.0" encoding="UTF-8"?>
<Polygon xmlns="http://www.opengis.net/gml">
<outerBoundaryIs>
<LinearRing>
<coordinates>-115.12,36.25 -115.17,...</coordinates>
</LinearRing>
</outerBoundaryIs>
</Polygon>
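A client consuming a standard report might pull out the identifier, the primary and variant names, and the URLs of any indirect footprints. The following Python sketch is one possible approach; the function name `summarize_report` and the shape of the returned dictionary are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"gaz": "http://www.alexandria.ucsb.edu/gazetteer"}
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def _is_true(value):
    # XML Schema booleans may be written as "true" or "1".
    return value in ("true", "1")


def summarize_report(report):
    """Summarize a <gazetteer-standard-report> element as a dictionary."""
    names = report.findall("gaz:names/gaz:name", NS)
    primary = [n.text for n in names if _is_true(n.get("primary"))]
    variants = [n.text for n in names if not _is_true(n.get("primary"))]
    references = report.findall("gaz:footprints/gaz:footprint-reference", NS)
    return {
        "identifier": report.findtext("gaz:identifier", namespaces=NS),
        "primary_name": primary[0] if primary else None,
        "variant_names": variants,
        "footprint_urls": [r.get(XLINK_HREF) for r in references],
    }
```

A name without a `primary` attribute is treated as non-primary, which matches the schema default of `false`.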
The query service, described under Services above, returns all gazetteer entries
that satisfy one or more constraints placed against entry attributes.
The constraints are expressed in the form of a language.
The gazetteer query language consists of boolean combinations
(and, or, and and-not) of five types of queries. Support for any given
type of query is optional. The query types are as follows:
identifier-query identifier
- Returns the gazetteer entry identified by identifier.
name-query operator text
- Returns all gazetteer entries having at least one name that matches
text according to text-matching operator operator.
If a gazetteer supports name queries, it must support the following
operator:
equals
- A gazetteer entry name matches text if it equals
text, ignoring insignificant differences in whitespace.
Gazetteers are encouraged to support the following additional
text-matching operators:
contains-all-words
- A gazetteer entry name matches text if it contains all of
the words in text. For example, entry name "San Luis Obispo"
matches text "obispo luis" under this operator.
contains-any-words
- A gazetteer entry name matches text if it contains any of
the words in text. For example, entry name "Hope Ranch"
matches text "hope" under this operator.
contains-phrase
- A gazetteer entry name matches text if it contains all of
the words in text in the same consecutive order. For
example, entry name "Black Forest Drive" matches text "forest drive"
under this operator, but entry names "Forest Lake Drive" and "Drive
Forest" do not.
matches-pattern
- A gazetteer entry name matches text if it matches
text when the latter is treated as a regular expression.
Specifically, an asterisk ("*") in text matches
zero or more characters and a question mark ("?") matches
any single character. Note that a gazetteer implementation may limit
the regular expressions it accepts. For example, a gazetteer may
support right truncation only (i.e., it may accept asterisks only at
the end of text).
The semantics of all of the above operators have deliberately been
left somewhat fuzzy to accommodate differing implementations.
Specifically, exactly what constitutes a word is left undefined, and
it is unspecified whether the gazetteer implementation employs word
stemming or other fuzzy word matching techniques. In any case, the
above operators should be case-insensitive.
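Because the operator semantics are deliberately loose, any concrete implementation has to choose its own definitions. The Python sketch below shows one possible reading, under these assumptions: words are whitespace-separated tokens, comparison is case-insensitive via case folding, and no stemming is applied. The function name `name_matches` is hypothetical.

```python
import re


def _words(s):
    # Assumption: a "word" is a whitespace-separated, case-folded token.
    return s.casefold().split()


def name_matches(name, operator, text):
    """Return True if an entry name matches text under the given operator."""
    n, t = _words(name), _words(text)
    if operator == "equals":
        return n == t  # token comparison ignores whitespace differences
    if operator == "contains-all-words":
        return all(w in n for w in t)
    if operator == "contains-any-words":
        return any(w in n for w in t)
    if operator == "contains-phrase":
        # Look for t as a consecutive run of words inside n.
        return any(n[i:i + len(t)] == t for i in range(len(n) - len(t) + 1))
    if operator == "matches-pattern":
        # Translate "*" and "?" and escape every other character.
        regex = "".join(
            ".*" if c == "*" else "." if c == "?" else re.escape(c) for c in text
        )
        return re.fullmatch(regex, name, re.IGNORECASE | re.DOTALL) is not None
    raise ValueError(f"unknown operator: {operator}")
```

The examples given for each operator above hold under this reading; a gazetteer that stems words or tokenizes differently may legitimately return different matches.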
footprint-query operator {polygon | box | identifier}
- Returns all gazetteer entries having a footprint that matches a
query region according to spatial operator operator. (If a
gazetteer entry has multiple footprints, it is unspecified which
footprint(s) are used for matching.) The query region may take any of
the three forms listed next; note that support for any given form is
optional.
- polygon
- A simple polygon with geodesic edges, defined in WGS84
latitude/longitude coordinates.
- box
- A rectangle whose edges are aligned with the WGS84
latitude/longitude graticule.
- identifier
- One of the footprints of the gazetteer entry identified by
identifier (which footprint is unspecified).
If a gazetteer supports footprint queries, it must support the
following operator:
within
- A gazetteer entry footprint matches the query region if the
footprint is a subset of the region.
Gazetteers are encouraged to support the following additional
spatial operators:
contains
- A gazetteer entry footprint matches the query region if the
footprint is a superset of the region.
overlaps
- A gazetteer entry footprint matches the query region if the
footprint intersects the region.
A gazetteer implementation may limit the query regions it accepts.
For example, an implementation may disallow polygons that enclose a
pole. Also, an implementation may support matching on footprint
bounding boxes only.
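For an implementation that matches on bounding boxes only, the three spatial operators reduce to interval comparisons. The following Python sketch assumes boxes are (west, south, east, north) tuples in decimal degrees that do not cross the 180° meridian, and it treats boxes that merely touch as intersecting; the function name `box_matches` is hypothetical.

```python
def box_matches(footprint, operator, region):
    """Compare a footprint box with a query region box.

    Both boxes are (west, south, east, north) tuples. Assumes neither
    box crosses the 180-degree meridian.
    """
    f_west, f_south, f_east, f_north = footprint
    r_west, r_south, r_east, r_north = region
    if operator == "within":  # footprint is a subset of the region
        return (r_west <= f_west and f_east <= r_east
                and r_south <= f_south and f_north <= r_north)
    if operator == "contains":  # footprint is a superset of the region
        return (f_west <= r_west and r_east <= f_east
                and f_south <= r_south and r_north <= f_north)
    if operator == "overlaps":  # footprint intersects the region
        return (f_west <= r_east and r_west <= f_east
                and f_south <= r_north and r_south <= f_north)
    raise ValueError(f"unknown operator: {operator}")
```

Note that bounding-box matching can report an overlap where the actual footprints are disjoint, which is one reason clients should not assume exact geometric semantics.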
class-query thesaurus term
- Returns all gazetteer entries belonging to class term, or
any subclass of term recursively (if the gazetteer supports
subclasses or thesaurus relationships), where term is a term
drawn from a thesaurus or simple vocabulary associated with the
gazetteer. For example, if class "capital cities" is a subclass
(i.e., a specialization) of class "cities", then a class query of
"cities" will return all cities (capital and not) whereas a query of
"capital cities" will return only capital cities.
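The recursive subclass behavior can be sketched as a traversal of narrower terms followed by a membership test. In the Python sketch below, the `narrower` mapping, the `entries` mapping, and both function names are hypothetical, and a single thesaurus is assumed so that the thesaurus argument is omitted.

```python
def expand_term(term, narrower):
    """Return the set containing term and all of its subclasses."""
    seen, stack = set(), [term]
    while stack:
        current = stack.pop()
        if current not in seen:  # guards against cycles in the thesaurus
            seen.add(current)
            stack.extend(narrower.get(current, ()))
    return seen


def class_query(entries, term, narrower):
    """Return identifiers of entries in class term or any subclass of it.

    entries maps an identifier to the list of terms classifying it;
    narrower maps a term to its immediate subclasses.
    """
    wanted = expand_term(term, narrower)
    return [ident for ident, classes in entries.items() if wanted & set(classes)]
```

A gazetteer without subclass support would simply skip the expansion step and match on the given term alone.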
relationship-query relationship identifier
- Returns all gazetteer entries having relationship
relationship to a target gazetteer entry identified by
identifier. Note that a gazetteer must not consider a
relationship query with an inappropriate target to be malformed or
erroneous. For example, suppose a gazetteer supports the
capital-of relationship, but only for target gazetteer
entries that are countries. A relationship query in which the target
is a cemetery is not to be considered malformed, but should simply
yield zero results.
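The "zero results, not an error" rule falls out naturally if relationships are stored as plain triples. The Python sketch below assumes such a store; the triple layout (relationship, source identifier, target identifier), the sample identifiers, and the function name `relationship_query` are all hypothetical.

```python
def relationship_query(triples, relationship, target_id):
    """Return identifiers of entries related to target_id by relationship.

    triples is an iterable of (relationship, source_id, target_id).
    An inappropriate target simply matches nothing; no error is raised.
    """
    return sorted(
        source
        for name, source, target in triples
        if name == relationship and target == target_id
    )
```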
Clients should be aware that a gazetteer implementation may not be
able to search over all attributes of a gazetteer entry. For example,
an implementation may be able to search over primary names only.
An XML schema for the gazetteer query language is listed below.
The schema defines element <gazetteer-query> in
namespace "http://www.alexandria.ucsb.edu/gazetteer".
Subelements <identifier-query>,
<name-query>, <footprint-query>,
<class-query>, and
<relationship-query> correspond to the query types
described above. The elements <and>,
<or>, and <and-not> support
boolean combinations of queries.
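Queries can also be assembled programmatically. The Python sketch below builds a query for places named "Santa Barbara" that lie within the footprint of another entry, using the identifier form of a footprint query. The helper name `gaz` and the identifier value `ca-0001` are hypothetical; a real client would first obtain the identifier of the enclosing place from an earlier query.

```python
import xml.etree.ElementTree as ET

GAZ_NS = "http://www.alexandria.ucsb.edu/gazetteer"


def gaz(tag, *children, content=None, **attributes):
    """Create an element in the gazetteer namespace.

    content becomes the element's text; keyword arguments become
    XML attributes.
    """
    element = ET.Element(f"{{{GAZ_NS}}}{tag}", attributes)
    element.text = content
    element.extend(children)
    return element


# name equals "Santa Barbara" AND footprint within that of entry ca-0001
query = gaz(
    "gazetteer-query",
    gaz(
        "and",
        gaz("name-query", operator="equals", text="Santa Barbara"),
        gaz(
            "footprint-query",
            gaz("identifier", content="ca-0001"),  # hypothetical identifier
            operator="within",
        ),
    ),
)

query_xml = ET.tostring(query, encoding="unicode")
```

The resulting element can be placed inside a `<query-request>` alongside a `<report-format>` element.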
Query regions in footprint queries may be specified using the Open GIS Consortium's Geography Markup Language (GML)
or another geometry language supported by the gazetteer; see Capabilities, below. Support for GML is
mandatory. GML defines the <gml:Box> and
<gml:Polygon> elements in terms of an abstract
Cartesian coordinate system, but we mandate here that the coordinate
system must be the WGS84 latitude/longitude coordinate system.
Specifically, the first (X) coordinate must be longitude in signed
decimal degrees east of the Greenwich meridian and the second (Y)
coordinate must be latitude in signed decimal degrees north of the
equator.
gazetteer-query.xsd
<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema"
xmlns:gaz="http://www.alexandria.ucsb.edu/gazetteer"
xmlns:gml="http://www.opengis.net/gml"
targetNamespace="http://www.alexandria.ucsb.edu/gazetteer"
elementFormDefault="qualified">
<import namespace="http://www.opengis.net/gml"
schemaLocation="geometry.xsd"/>
<element name="gazetteer-query">
<complexType>
<sequence>
<group ref="gaz:query"/>
</sequence>
</complexType>
</element>
<group name="query">
<choice>
<element ref="gaz:identifier-query"/>
<element ref="gaz:name-query"/>
<element ref="gaz:footprint-query"/>
<element ref="gaz:class-query"/>
<element ref="gaz:relationship-query"/>
<element ref="gaz:and"/>
<element ref="gaz:or"/>
<element ref="gaz:and-not"/>
</choice>
</group>
<element name="identifier-query">
<complexType>
<attribute name="identifier" type="string"
use="required"/>
</complexType>
</element>
<element name="name-query">
<complexType>
<attribute name="operator" use="required">
<simpleType>
<restriction base="string">
<enumeration value="contains-all-words"/>
<enumeration value="contains-any-words"/>
<enumeration value="contains-phrase"/>
<enumeration value="equals"/>
<enumeration value="matches-pattern"/>
</restriction>
</simpleType>
</attribute>
<attribute name="text" type="string" use="required"/>
</complexType>
</element>
<element name="footprint-query">
<complexType>
<choice>
<element ref="gml:Box"/>
<element ref="gml:Polygon"/>
<element name="identifier" type="string"/>
<element name="other-region">
<complexType>
<sequence>
<any processContents="lax"/>
</sequence>
</complexType>
</element>
</choice>
<attribute name="operator" use="required">
<simpleType>
<restriction base="string">
<enumeration value="contains"/>
<enumeration value="overlaps"/>
<enumeration value="within"/>
</restriction>
</simpleType>
</attribute>
</complexType>
</element>
<element name="class-query">
<complexType>
<attribute name="thesaurus" type="string"
use="required"/>
<attribute name="term" type="string" use="required"/>
</complexType>
</element>
<element name="relationship-query">
<complexType>
<attribute name="relationship" type="string"
use="required"/>
<attribute name="identifier" type="string"
use="required"/>
</complexType>
</element>
<element name="and">
<complexType>
<sequence>
<group ref="gaz:query" maxOccurs="unbounded"/>
</sequence>
</complexType>
</element>
<element name="or">
<complexType>
<sequence>
<group ref="gaz:query" maxOccurs="unbounded"/>
</sequence>
</complexType>
</element>
<element name="and-not">
<complexType>
<sequence>
<group ref="gaz:query" minOccurs="2" maxOccurs="2"/>
</sequence>
</complexType>
</element>
</schema> |
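The structural rules the schema places on queries can also be checked in client code before a query is sent. The sketch below is illustrative only: check_query and the two lookup tables are hypothetical names, the rules are copied by hand from the schema above, and the check does not inspect the region inside a footprint query. It is not a substitute for validating against the schema itself.

```python
import xml.etree.ElementTree as ET

GAZ_NS = "{http://www.alexandria.ucsb.edu/gazetteer}"

# Required attributes of each simple query element, copied from the schema.
REQUIRED_ATTRIBUTES = {
    "identifier-query": ("identifier",),
    "name-query": ("operator", "text"),
    "footprint-query": ("operator",),
    "class-query": ("thesaurus", "term"),
    "relationship-query": ("relationship", "identifier"),
}

# Enumerated operator values, copied from the schema.
OPERATORS = {
    "name-query": {"contains-all-words", "contains-any-words",
                   "contains-phrase", "equals", "matches-pattern"},
    "footprint-query": {"contains", "overlaps", "within"},
}


def check_query(element):
    """Return a list of structural problems found in a query element."""
    if not element.tag.startswith(GAZ_NS):
        return [f"unexpected element {element.tag}"]
    name = element.tag[len(GAZ_NS):]
    children = list(element)
    problems = []

    if name in REQUIRED_ATTRIBUTES:
        for attribute in REQUIRED_ATTRIBUTES[name]:
            if element.get(attribute) is None:
                problems.append(f"<{name}> lacks attribute '{attribute}'")
        operator = element.get("operator")
        if (name in OPERATORS and operator is not None
                and operator not in OPERATORS[name]):
            problems.append(f"<{name}> has unknown operator '{operator}'")
        if name == "footprint-query" and len(children) != 1:
            problems.append("<footprint-query> needs exactly one region")
        return problems  # the region's own content is not inspected

    if name == "gazetteer-query" and len(children) != 1:
        problems.append("<gazetteer-query> needs exactly one query")
    elif name == "and-not" and len(children) != 2:
        problems.append("<and-not> needs exactly two operands")
    elif name in ("and", "or") and not children:
        problems.append(f"<{name}> needs at least one operand")
    elif name not in ("gazetteer-query", "and", "or", "and-not"):
        problems.append(f"unknown query element <{name}>")
    for child in children:
        problems.extend(check_query(child))
    return problems


# A deliberately malformed query: one operand and an unlisted operator.
SAMPLE = """
<gazetteer-query xmlns="http://www.alexandria.ucsb.edu/gazetteer">
  <and-not>
    <name-query operator="starts-with" text="santa"/>
  </and-not>
</gazetteer-query>
"""
for problem in check_query(ET.fromstring(SAMPLE)):
    print(problem)
```

The sample reports two problems: the <and-not> element has one operand instead of two, and "starts-with" is not among the enumerated name-query operators.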
An example of a gazetteer query is shown below. It requests all
places whose names contain the phrase "santa barbara", whose footprints
overlap a given spatial region, and that are neither populated
places nor cemeteries. A place named "Santa Barbara County Hospital"
might match such a query.
<?xml version="1.0" encoding="UTF-8"?>
<gazetteer-query
    xmlns="http://www.alexandria.ucsb.edu/gazetteer"
    xmlns:gml="http://www.opengis.net/gml">
  <and-not>
    <and>
      <name-query operator="contains-phrase"
                  text="santa barbara"/>
      <footprint-query operator="overlaps">
        <gml:Box>
          <gml:coordinates>-140,30 110,35</gml:coordinates>
        </gml:Box>
      </footprint-query>
    </and>
    <or>
      <class-query thesaurus="ADL Feature Type Thesaurus"
                   term="populated places"/>
      <class-query thesaurus="ADL Feature Type Thesaurus"
                   term="cemeteries"/>
    </or>
  </and-not>
</gazetteer-query>
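A client would typically assemble such a query programmatically rather than by hand. The following sketch rebuilds the example query above with Python's standard xml.etree.ElementTree module. The helper names gaz, box, and build_example_query are hypothetical, and the sketch only constructs the document; how it is submitted to a gazetteer is outside its scope.

```python
import xml.etree.ElementTree as ET

GAZ = "http://www.alexandria.ucsb.edu/gazetteer"
GML = "http://www.opengis.net/gml"


def gaz(tag, attributes=None, children=()):
    """Create an element in the gazetteer namespace."""
    element = ET.Element(f"{{{GAZ}}}{tag}", attributes or {})
    element.extend(children)
    return element


def box(west, south, east, north):
    """Create a <gml:Box>; tuples are longitude,latitude."""
    element = ET.Element(f"{{{GML}}}Box")
    coordinates = ET.SubElement(element, f"{{{GML}}}coordinates")
    coordinates.text = f"{west},{south} {east},{north}"
    return element


def build_example_query():
    named = gaz("name-query",
                {"operator": "contains-phrase", "text": "santa barbara"})
    located = gaz("footprint-query", {"operator": "overlaps"},
                  [box(-140, 30, 110, 35)])
    excluded = gaz("or", children=[
        gaz("class-query",
            {"thesaurus": "ADL Feature Type Thesaurus", "term": term})
        for term in ("populated places", "cemeteries")
    ])
    # <and-not> takes exactly two operands: the query to keep and
    # the query to exclude.
    difference = gaz("and-not",
                     children=[gaz("and", children=[named, located]),
                               excluded])
    return gaz("gazetteer-query", children=[difference])


# Serialize with the same prefixes the example uses.
ET.register_namespace("", GAZ)
ET.register_namespace("gml", GML)
document = ET.tostring(build_example_query(), encoding="unicode")
print(document)
```

Note that ET.register_namespace changes module-wide state, so a larger program may prefer to set the prefixes once at start-up.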
The get-capabilities service described under Services above returns a description of a
gazetteer's overall capabilities. An XML schema for the description
is listed below. The schema defines the element
<gazetteer-capabilities> in the namespace
"http://www.alexandria.ucsb.edu/gazetteer". This element
contains the following subelements:
<version>
- The version of the gazetteer protocol the gazetteer supports.
<description>
- A human-readable description of the gazetteer. It is suggested
that the description include: the scope and purpose of the gazetteer;
details on the gazetteer's interpretation and implementation of the
protocol; appropriate usage guidelines; and rights and liability
clauses.
<extended-report-schema>
- If the gazetteer supports extended reports, the URL of the
reports' XML schema.
<thesauri>
- The thesauri (or simple vocabularies) the gazetteer uses to
classify its entries. Each thesaurus is described by a name and the
URL of its ADL
Thesaurus Protocol interface.
<relationships>
- The names of the relationships the gazetteer is capable of
representing.
<other-geometry-languages>
- The geometry languages the gazetteer supports (other than GML, which is required). Each
language is described by an XML namespace.
<services>
- The services the gazetteer supports.
<query-types>
- The types of queries the gazetteer supports.
<name-query-operators>
- If the gazetteer supports name queries, the text-matching
operators the gazetteer supports.
<footprint-query-operators> and
<footprint-query-operands>
- If the gazetteer supports footprint queries, the spatial operators
and geometry types the gazetteer supports.
gazetteer-capabilities.xsd
<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:gaz="http://www.alexandria.ucsb.edu/gazetteer"
        xmlns:xlink="http://www.w3.org/1999/xlink"
        targetNamespace="http://www.alexandria.ucsb.edu/gazetteer"
        elementFormDefault="qualified">
  <import namespace="http://www.w3.org/1999/xlink"
          schemaLocation="xlinks.xsd"/>
  <element name="gazetteer-capabilities">
    <complexType>
      <sequence>
        <element name="version" type="string" default="1.1"/>
        <element name="description" type="string"
                 minOccurs="0"/>
        <element name="extended-report-schema" minOccurs="0">
          <complexType>
            <attributeGroup ref="xlink:locatorLink"/>
          </complexType>
        </element>
        <element name="thesauri" minOccurs="0">
          <complexType>
            <sequence>
              <element name="thesaurus" minOccurs="0"
                       maxOccurs="unbounded">
                <complexType>
                  <attribute name="name" type="string"
                             use="required"/>
                  <attributeGroup ref="xlink:locatorLink"/>
                </complexType>
              </element>
            </sequence>
          </complexType>
        </element>
        <element name="relationships" minOccurs="0">
          <complexType>
            <sequence>
              <element name="relationship" type="string"
                       minOccurs="0" maxOccurs="unbounded"/>
            </sequence>
          </complexType>
        </element>
        <element name="other-geometry-languages"
                 minOccurs="0">
          <complexType>
            <sequence>
              <element name="geometry-language" minOccurs="0"
                       maxOccurs="unbounded">
                <complexType>
                  <attribute name="namespace" type="anyURI"/>
                </complexType>
              </element>
            </sequence>
          </complexType>
        </element>
        <element name="services">
          <complexType>
            <attribute name="get-capabilities" type="boolean"
                       fixed="true"/>
            <attribute name="query" type="boolean"
                       default="false"/>
            <attribute name="download" type="boolean"
                       default="false"/>
            <attribute name="add-entry" type="boolean"
                       default="false"/>
            <attribute name="relate-entries" type="boolean"
                       default="false"/>
            <attribute name="remove-entry" type="boolean"
                       default="false"/>
          </complexType>
        </element>
        <element name="query-types" minOccurs="0">
          <complexType>
            <attribute name="identifier" type="boolean"
                       default="false"/>
            <attribute name="name" type="boolean"
                       default="false"/>
            <attribute name="footprint" type="boolean"
                       default="false"/>
            <attribute name="class" type="boolean"
                       default="false"/>
            <attribute name="relationship" type="boolean"
                       default="false"/>
          </complexType>
        </element>
        <element name="name-query-operators" minOccurs="0">
          <complexType>
            <attribute name="contains-all-words"
                       type="boolean" default="false"/>
            <attribute name="contains-any-words"
                       type="boolean" default="false"/>
            <attribute name="contains-phrase"
                       type="boolean" default="false"/>
            <attribute name="equals" type="boolean"
                       fixed="true"/>
            <attribute name="matches-pattern"
                       type="boolean" default="false"/>
          </complexType>
        </element>
        <element name="footprint-query-operators"
                 minOccurs="0">
          <complexType>
            <attribute name="contains" type="boolean"
                       default="false"/>
            <attribute name="overlaps" type="boolean"
                       default="false"/>
            <attribute name="within" type="boolean"
                       fixed="true"/>
          </complexType>
        </element>
        <element name="footprint-query-operands"
                 minOccurs="0">
          <complexType>
            <attribute name="box" type="boolean"
                       default="false"/>
            <attribute name="identifier" type="boolean"
                       default="false"/>
            <attribute name="polygon" type="boolean"
                       default="false"/>
          </complexType>
        </element>
      </sequence>
    </complexType>
  </element>
</schema>
An example of a gazetteer capabilities description is shown below.
<?xml version="1.0" encoding="UTF-8"?>
<gazetteer-capabilities
    xmlns="http://www.alexandria.ucsb.edu/gazetteer"
    xmlns:xlink="http://www.w3.org/1999/xlink">
  <version>1.1</version>
  <description>This gazetteer...</description>
  <extended-report-schema xlink:href="http://..."/>
  <thesauri>
    <thesaurus name="ADL Feature Type Thesaurus"
               xlink:href="http://www.alexandria.ucsb.edu/..."/>
  </thesauri>
  <relationships>
    <relationship>adjacent-to</relationship>
    <relationship>capital-of</relationship>
  </relationships>
  <other-geometry-languages>
    <geometry-language
        namespace="http://www.esri.com/ArcXML"/>
  </other-geometry-languages>
  <services query="true" add-entry="true"/>
  <query-types identifier="true" name="true" footprint="true"
               class="true"/>
  <name-query-operators contains-all-words="true"
                        contains-any-words="true" contains-phrase="true"/>
  <footprint-query-operators contains="true"/>
  <footprint-query-operands box="true" identifier="true"/>
</gazetteer-capabilities>
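Most boolean attributes in a capabilities description are optional, so a client that parses the description without a schema-aware parser has to supply the schema's default and fixed values itself. The sketch below shows one way to do that for the <services> and <query-types> elements. The names read_flags and FLAG_DEFAULTS are hypothetical, the defaults are copied by hand from gazetteer-capabilities.xsd above, and treating an omitted element as "no flags set" is an assumption rather than something the schema states.

```python
import xml.etree.ElementTree as ET

GAZ_NS = "{http://www.alexandria.ucsb.edu/gazetteer}"

# Values a schema-aware parser would supply for absent attributes,
# copied from gazetteer-capabilities.xsd.
FLAG_DEFAULTS = {
    "services": {
        "get-capabilities": True,  # fixed="true" in the schema
        "query": False, "download": False, "add-entry": False,
        "relate-entries": False, "remove-entry": False,
    },
    "query-types": {
        "identifier": False, "name": False, "footprint": False,
        "class": False, "relationship": False,
    },
}


def read_flags(capabilities, element_name):
    """Return the boolean attributes of one capabilities subelement."""
    defaults = FLAG_DEFAULTS[element_name]
    element = capabilities.find(GAZ_NS + element_name)
    if element is None:
        # Assumption: an omitted element means none of its flags are set.
        return dict.fromkeys(defaults, False)
    flags = dict(defaults)
    for name in defaults:
        value = element.get(name)
        if value is not None:
            # xsd:boolean allows the lexical forms true, false, 1 and 0.
            flags[name] = value.strip() in ("true", "1")
    return flags


# An abbreviated version of the example description above.
SAMPLE = """
<gazetteer-capabilities xmlns="http://www.alexandria.ucsb.edu/gazetteer">
  <version>1.1</version>
  <services query="true" add-entry="true"/>
  <query-types identifier="true" name="true" footprint="true"
               class="true"/>
</gazetteer-capabilities>
"""
capabilities = ET.fromstring(SAMPLE)
services = read_flags(capabilities, "services")
query_types = read_flags(capabilities, "query-types")
print(services)
print(query_types)
```

For the sample, the download service and relationship queries come out as unsupported because their attributes are absent, while get-capabilities is reported as supported even though the description never mentions it.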
1.1
- Swapped the interpretation of the GML first (X) and second (Y)
coordinates. Added a <description> subelement to
the <gazetteer-capabilities> element.
1.0a
- Clarified the meaning of a gazetteer entry having more than one
footprint. Other, minor changes.
1.0
- Original version.
Greg Janée
Last modified: 2002-12-09 19:58
|