This page describes some incomplete work undertaken in 2003 to define a standard, XML-based language for describing geographic regions, i.e., geometric regions on the Earth's surface. For brevity, we'll call any such language a geometry language. A geometry language defines a set of possible shapes and standard representations and encodings of those shapes, and also addresses the handling of cartographic quantities (Earth datums, projections, and coordinate systems), either by mandating standard quantities or by providing standard declaration mechanisms.
The motivation for a standard geometry language is rooted in the observation that every system, service, or effort that has had to deal with geographic regions has ended up defining its own geometry language. All these languages have broadly similar capabilities to varying degrees, yet all have enough idiosyncrasies to bedevil easy interoperability. It is instructive to compare and contrast the geometry languages embedded in specifications such as:
- <spatial-value> in ADL-bucket-report.dtd and <spatial-constraint> in ADL-query.dtd
- <bounding-box> and <footprints> in gazetteer-standard-report.xsd and <footprint-query> in gazetteer-query.xsd

(A number of additional geometry languages are derived from one or more of the above.) A standard geometry language would facilitate interoperability across different systems, particularly among consumers of geographic regions such as renderers and spatial indexers.
From the perspective of distributed geospatial digital libraries and distributed gazetteer services, which use geometry only for the limited purposes of representing object footprints and query regions and performing spatial comparisons between the two, a geometry language must satisfy three requirements:
The Open GIS Consortium's Geography Markup Language (GML), version 3.0, is one well-known attempt to define a standard geometry language. It is a comprehensive specification having many desirable characteristics, but it suffers from two defects that are shared by many of the aforementioned geometry languages. First, in balancing the concerns of consumers of the language, who generally prefer uniformity and simplicity, against those of producers, who generally prefer expressiveness and flexibility, GML weighs heavily in favor of producers. It defines a great many possible shapes and shape-related options. The effect of this imbalance is that, in practice, consumers cannot and do not accept more than an idiosyncratic fraction of the entire GML language. The second defect is that GML does not meet any of the conditions of requirement 3 above.
The XML schema below represents a first effort at defining a geometry language that addresses these concerns. The language is a profile of GML, that is, a subset and logical restriction of GML such that any instance document that adheres to the language below also adheres to GML and can be interpreted by any GML consumer.
This deliberately simple geometry language supports just three
possible shapes: points, polylines ("linestrings" in GML parlance),
and simple (i.e., self-intersection-free and hole-free) polygons.
Each shape is represented in the language by both an XML schema type
(e.g., PolygonType) and an XML element (e.g.,
<Polygon>). However, the intention of the language
is that only schema type AbstractFeatureType be
referenced by application schemas; this usage forces a bounding box to
be associated with every shape in instance documents. SRSs can be
declared using the srsName attribute.
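To make the shape inventory concrete, here is a minimal Python sketch of a consumer-side data model for the three shapes, each able to report a bounding box. The names `Shape`, `BoundingBox`, and `make_shape` are hypothetical and do not come from the schema; the sketch assumes planar (x, y) coordinates, does not check that polygons are simple, and ignores boxes that cross a coordinate discontinuity.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

Coord = Tuple[float, float]  # (x, y), e.g. (longitude, latitude); an assumption


@dataclass(frozen=True)
class BoundingBox:
    min_x: float
    min_y: float
    max_x: float
    max_y: float


@dataclass(frozen=True)
class Shape:
    """Hypothetical stand-in for a point, polyline, or polygon."""
    kind: str                  # "point", "polyline", or "polygon"
    coords: Tuple[Coord, ...]  # polygon ring is stored unclosed here
    srs_name: str = ""         # mirrors the srsName attribute; value left open

    def bounding_box(self) -> BoundingBox:
        # Naive min/max box; valid only when no discontinuity is crossed.
        xs = [x for x, _ in self.coords]
        ys = [y for _, y in self.coords]
        return BoundingBox(min(xs), min(ys), max(xs), max(ys))


# Minimum number of vertices per shape kind (an assumption of this sketch).
_MIN_VERTICES = {"point": 1, "polyline": 2, "polygon": 3}


def make_shape(kind: str, coords: Sequence[Coord], srs_name: str = "") -> Shape:
    """Build a shape, rejecting kinds outside the three supported ones."""
    if kind not in _MIN_VERTICES:
        raise ValueError(f"unsupported shape kind: {kind!r}")
    if kind == "point" and len(coords) != 1:
        raise ValueError("a point has exactly one coordinate pair")
    if len(coords) < _MIN_VERTICES[kind]:
        raise ValueError(f"a {kind} needs at least {_MIN_VERTICES[kind]} vertices")
    return Shape(kind, tuple(coords), srs_name)
```

Restricting the constructor to three kinds mirrors the consumer-friendly intent of the language: a consumer written against such a model has only three cases to handle, and can always fall back to the bounding box.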
| ADL-geometry.xsd |
|---|
<?xml version="1.0" encoding="UTF-8"?> |
The above geometry language, expressed as a profile of GML, has a number of nice properties, not the least of which is that it weeds out 99% of the 600-plus-page GML specification. However, there are a number of serious deficiencies which are still unresolved:
srsName attribute to be placed on
the <Envelope> element. Unfortunately, at the time
of this work, there appears to be no standard means of referring to
SRSs.

<Envelope>
element "defines an extent using a pair of positions defining opposite
corners," that is, using a pair of minimum and maximum coordinate
values. A consequence of being defined this way, as opposed to being
defined in terms of explicitly-labeled east and west boundaries, is
that it is not possible to describe a bounding box that crosses the
±180° meridian (or other discontinuity).
If east/west
bounding coordinates are mapped to minimum/maximum coordinates
according to their values, then a bounding box such as Russia's will
be misinterpreted (its east bounding coordinate, being less than its
west, will be considered the minimum coordinate value), with the
result that the GML envelope will describe the longitudinal complement
of the desired bounding box. Always mapping the west (east) bounding
coordinate to the minimum (maximum) coordinate value, even when west
is numerically greater than east, would solve the problem (this is
effectively equivalent to explicitly labeling the east and west
boundaries), but the GML specification gives no indication that this
is admissible or that SRSs may employ such modular arithmetic.
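The difference between the two mappings can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, longitudes are assumed to be in degrees in the range -180 to 180, and the bounding longitudes used for Russia (west 19, east -169) are rough values chosen for the example.

```python
from typing import Tuple

Envelope = Tuple[float, float]  # a pair of longitudes, in degrees


def envelope_by_value(west: float, east: float) -> Envelope:
    """Map bounding longitudes to (minimum, maximum) by numeric value."""
    return (min(west, east), max(west, east))


def envelope_by_label(west: float, east: float) -> Envelope:
    """Always put west in the first slot and east in the second."""
    return (west, east)


def contains_min_max(envelope: Envelope, lon: float) -> bool:
    """Plain minimum/maximum reading of an envelope."""
    low, high = envelope
    return low <= lon <= high


def contains_west_east(envelope: Envelope, lon: float) -> bool:
    """West/east reading that allows wrapping across the +/-180 meridian."""
    west, east = envelope
    if west <= east:
        return west <= lon <= east
    # West is numerically greater than east: the box wraps around +/-180.
    return lon >= west or lon <= east


# Rough, illustrative bounding longitudes for Russia.
RUSSIA_WEST, RUSSIA_EAST = 19.0, -169.0
MOSCOW_LON, LONDON_LON = 37.6, -0.1

by_value = envelope_by_value(RUSSIA_WEST, RUSSIA_EAST)
by_label = envelope_by_label(RUSSIA_WEST, RUSSIA_EAST)
```

With these values the value-based envelope excludes Moscow and includes London, which is the longitudinal complement of the intended box, while the labeled envelope gives the expected answers. The catch, as noted above, is that nothing in the GML specification sanctions the wrapping interpretation used by `contains_west_east`.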
It seems that the
only unambiguous and correct method of encoding a bounding box that
crosses the ±180° meridian is to convert the bounding box
to a whole-world band. But this loss of shape fidelity results in
many false positives by spatial search engines and is
unacceptable.

AbstractFeatureType element type. Thus, instead of being
able to say
    <element name="my-element" type="gml:PolygonType"/>
the application schema must say
    <element name="my-element">
      <complexType>
        <complexContent>
          <extension base="gml:AbstractFeatureType"/>
        </complexContent>
      </complexType>
    </element>
But notice in the above that the ability has been lost to restrict the
possible shapes <my-element> can take on to, say,
polygons. The geometry language is further misleading because
declarations such as PolygonType and
<Polygon> are publicly visible, and application
schemas will naturally assume that they can be directly referenced.
An alternative approach would be to abandon
AbstractFeatureType altogether, and use GML's
<metaDataProperty> element for storing associated
bounding boxes. In this approach, an application could say
    <element name="my-element" type="gml:PolygonType"/>
and an instance document would look like
    <my-element>
      <gml:metaDataProperty>
        <gml:GenericMetaData>
          <gml:boundedBy>
            <gml:Envelope>
              <gml:coordinates>...</gml:coordinates>
            </gml:Envelope>
          </gml:boundedBy>
        </gml:GenericMetaData>
      </gml:metaDataProperty>
      <gml:exterior>
        ...
      </gml:exterior>
    </my-element>
Whether the language should support aggregate shapes (i.e., sets of
disjoint shapes treated as first-order shapes)—and if so, which
kinds—is an open question. On the one hand, aggregate shapes
are desirable because they can offer vastly greater fidelity to true
region shapes: consider the footprint of the United States described
as an aggregate of three shapes (contiguous 48 states; Alaska; Hawaii)
versus as a convex hull or bounding box of those shapes. On the other
hand, aggregate shapes bring concomitantly large increases in
interface and implementation complexity. Then again, if the language
were to support aggregates, consumers would always have the option of
falling back to bounding boxes.

Finally, below is an extension to the above geometry language that adds a disk shape (defined by center and radius) and several convenience declarations. As an extension, it is necessarily incompatible with GML.
| ADL-geometry-extended.xsd |
|---|
<?xml version="1.0" encoding="UTF-8"?> |
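As a rough illustration of how a consumer might use a disk for spatial comparison, the Python sketch below tests whether a point lies within a disk. It rests on assumptions the schema does not confirm: the center is taken to be a longitude/latitude pair in degrees, the radius is taken to be in kilometers, and the Earth is treated as a sphere with a mean radius of 6371 km. The function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean spherical radius; an assumption of this sketch


def great_circle_km(lon1: float, lat1: float, lon2: float, lat2: float) -> float:
    """Haversine distance in kilometers between two lon/lat points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def disk_contains(center_lon: float, center_lat: float, radius_km: float,
                  lon: float, lat: float) -> bool:
    """True if the point lies within radius_km of the disk's center."""
    return great_circle_km(center_lon, center_lat, lon, lat) <= radius_km
```

One degree of latitude is roughly 111 km under these assumptions, so a point one degree north of the center falls inside a 120 km disk but outside a 100 km one. Because the test works on distances rather than coordinate ranges, a disk centered near the ±180° meridian needs no special handling, unlike a bounding box.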
created 2004-08-25; last modified 2009-11-20 10:17