Each item in a geospatial digital library or geographic information system has associated with it a subset of the Earth's surface that represents the item's spatial coverage or spatial relevance [1]. We'll refer to this subset as the item's footprint. Visualizing footprints against a background map is a useful way to contextualize and evaluate the associated items, especially the items belonging to a query result set. In the specific case of a query result set, visualization can also yield information about the query itself: for example, whether the query was underspecified and, if so, how it might be further constrained.
Visualizing footprints is relatively easy if all footprints are points:
However, visualizing shapes that have areal extent introduces several complications (for this discussion we'll focus on boxes, but the same complications arise with polygons, circles, etc.):
As of this writing, few spatial query systems support visualization of query result sets. Most systems (Geospatial One-Stop and the Geography Network are representative) support map-based spatial search, but revert to simple linear textual listings when presenting results. Those systems that do support visualization of spatial result sets (Google Local, Yahoo! Local, and MSN are recent examples) support point footprints only.
The Alexandria Digital Library's default webclient supports visualization of boxes. Currently it can display only one result footprint at a time, a soon-to-be-removed limitation of the third-party map software it is bundled with. Improved map software will allow multiple footprints to be displayed simultaneously, raising the issue of how query result sets can best be visualized.
To evaluate different footprint visualization techniques on real data, we have developed a test collection of result sets that illustrates the wide variety of result sets users encounter in practice and the kinds of visualization challenges such result sets pose. The collection was generated by randomly selecting 100 queries from the ADL log files (among queries that yielded at least one result, that is), categorizing the associated result sets, and hand-selecting the following 11 representatives. Footprints are visualized here as red rectangles [2].
Ideal.
Here, the query "online images containing phrase 'simi valley'"
yielded 8 results having 8 homogeneous, clustered, yet clearly
distinct footprints. This is the simplest and ideal case; if only it
were always this easy.
Duplicates.
A ubiquitous problem is that footprints are not unique; not only do
they overlap and occlude one another, they are often completely
coincident. For example, this query ("cartographic works overlapping
a query region and containing words 'new haven'"; the query region is
indicated here in green) yielded 38 results but only 7 distinct
footprints.
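
Coincident footprints can be detected before drawing by grouping results on their box coordinates. The following is a minimal sketch, not ADL's implementation: the (west, south, east, north) tuple layout, the function name group_coincident, the rounding tolerance, and the sample coordinates are all assumptions made for illustration.

```python
from collections import Counter


def group_coincident(footprints, ndigits=6):
    """Count how many results share each distinct box footprint.

    Each footprint is assumed to be a (west, south, east, north) tuple
    in degrees. Coordinates are rounded so that boxes differing only by
    floating-point noise are treated as coincident.
    """
    counts = Counter()
    for box in footprints:
        key = tuple(round(coord, ndigits) for coord in box)
        counts[key] += 1
    return counts


# Hypothetical result set: three results, two of them coincident.
results = [
    (-73.0, 41.2, -72.8, 41.4),
    (-73.0, 41.2, -72.8, 41.4),
    (-73.5, 41.0, -72.5, 41.6),
]
groups = group_coincident(results)
print(len(results), "results,", len(groups), "distinct footprints")
```

A visualization could then draw each distinct box once and use the count to label or style it, rather than drawing identical rectangles on top of one another.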
Inside.
In queries that use the "within" spatial operator, the query region
often provides good context for interpreting the results.
Outside.
The "overlaps" and "within" spatial operators account for
approximately 80% and 16% of spatial queries, respectively; the
"contains" operator (as in, "find items that completely contain the
query region") is used only 4% of the time. When it is, it often
provides very poor context for interpreting the results, as this
example shows.
Range of sizes.
A result set can hold footprints of wildly different sizes. Here, the
query "containing phrase 'point dume'" yielded the cluster of 22
results shown on the right, along with another 6 (coincident) results
whose footprints are over 1,000× larger in terms of area.
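
The size range of a result set can be quantified as the ratio of the largest footprint area to the smallest. The sketch below measures area in square degrees, under several assumptions: boxes are (west, south, east, north) tuples in degrees, no box crosses the antimeridian, and every box has nonzero area. The function names are hypothetical.

```python
def box_area_deg2(box):
    """Area of a (west, south, east, north) box in square degrees."""
    west, south, east, north = box
    return (east - west) * (north - south)


def area_ratio(footprints):
    """Ratio of the largest to the smallest footprint area.

    Assumes every footprint has nonzero area; a point footprint
    would cause a division by zero here.
    """
    areas = [box_area_deg2(box) for box in footprints]
    return max(areas) / min(areas)


# A whole-world box covers 360 x 180 = 64,800 square degrees.
WORLD = (-180.0, -90.0, 180.0, 90.0)
ONE_DEGREE_CELL = (0.0, 0.0, 1.0, 1.0)
ratio = area_ratio([WORLD, ONE_DEGREE_CELL])
```

Square degrees are a convenient but latitude-dependent unit, so a ratio computed this way is only a rough indicator of how far apart the footprint sizes are.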
High density.
Result set density varies considerably. Here, a relatively
underspecified query ("digital items overlapping a query region")
yielded 244 unique footprints. Given the query limit of 250 results,
a reasonable inference the user could draw from this visualization is
that there is plenty of data distributed over the query region, and
some additional criteria must be supplied.
Medium density.
Low density.
This query ("online, containing word 'oil'") yielded a very low
density result set. The footprints are so small in relation to their
spatial distribution that iconic representation might be more
appropriate in this case.
Telescope effect.
An "overlaps" spatial query (the tiny green dot) produced this classic
result set. The results span the spectrum from the specific (a result
with a footprint slightly larger than the query region; approximately
1 deg²) to the most general possible (a political map of
the world; 64,800 deg²). Given the relative specificity of
the query in this case, the user would likely find most of these
results to be irrelevant, and thus the whole-world view of the result
set that we've shown here is less than desirable. This is a
frequently occurring pattern.
Tiled.
The tiling of this result set is due to the tiled nature of the
underlying collection.
Extreme.
A result set that is extreme in terms of the number of footprints (250
results; 218 unique footprints), distribution of footprints
(worldwide), density of footprints (it's not apparent in this
visualization, but the density is high in the Southern California
region), and range of footprint sizes (the ratio of the areas of the
largest and smallest footprints is 1.8 × 10⁶). The query in this case
("online maps") was overly general, and so the goal of visualizing the
result set should be to communicate that fact and to indicate how the
user might profitably refine the query.
Typically, icons are used in visualization when literal representations would be too small or too dense. The Acme GeoRSS Map Viewer offers some nice examples of, and an algorithm for, automatically clustering point footprints and visualizing the clusters with icons.
Here's an idea for using icons for the opposite reason:
for visualizing items that are too large. The figure to the
right is result set 9 above ("telescope effect"), but zoomed in so
that (the edges of) some of the larger footprints are no longer
visible. The existence of these footprints has instead been indicated
by arrow icons on the right side of the map. If the figure were an
active map, clicking on the icons would presumably zoom the map out to
reveal the selected footprint.
There are obviously many variants and refinements of this core
idea. Icons could represent distinct footprints, or they could
represent fixed, larger zoom levels at which more footprints are
visible. A rule is needed to determine when a footprint is visualized
literally versus iconically. One possibility is to visualize a
footprint literally whenever at least two of the footprint's edges are
at least partially visible (i.e., at least one corner or two parallel
edges). In any case, the general idea is to use icons to indicate the
existence of items that would not otherwise be apparent.
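
The "at least two edges" rule suggested above can be sketched as follows. This is one possible reading, not a tested design: boxes and the map viewport are assumed to be (west, south, east, north) tuples in the same coordinate system, an edge counts as visible only if part of it lies strictly inside the viewport, and the function names are hypothetical.

```python
def overlaps(lo1, hi1, lo2, hi2):
    """True if the ranges lo1..hi1 and lo2..hi2 share interior points."""
    return lo1 < hi2 and lo2 < hi1


def visible_edge_count(box, view):
    """Number of box edges at least partially inside the viewport."""
    w, s, e, n = box
    vw, vs, ve, vn = view
    count = 0
    # West and east edges: fixed x, spanning south..north.
    for x in (w, e):
        if vw < x < ve and overlaps(s, n, vs, vn):
            count += 1
    # South and north edges: fixed y, spanning west..east.
    for y in (s, n):
        if vs < y < vn and overlaps(w, e, vw, ve):
            count += 1
    return count


def draw_literally(box, view):
    """Apply the rule: literal if two or more edges are visible."""
    return visible_edge_count(box, view) >= 2


view = (0.0, 0.0, 10.0, 10.0)
inside = (2.0, 2.0, 4.0, 4.0)               # all four edges visible
corner = (5.0, 5.0, 100.0, 100.0)           # one corner visible
strip = (3.0, -100.0, 5.0, 100.0)           # two parallel edges visible
one_edge = (-100.0, -100.0, 5.0, 100.0)     # only the east edge visible
enclosing = (-100.0, -100.0, 100.0, 100.0)  # no edges visible
```

Under this rule the first three boxes would be drawn as rectangles, while the last two would be candidates for iconic representation.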
There doesn't appear to be a lot of previous work in this area. The closest is the MetaViz project [3], which experimented with a number of techniques for visualizing geographic metadata. That work did not really address footprint occlusion and disambiguation, or the disappearance of footprints as a function of map zoom. Nevertheless, the techniques are intriguing, and it would be nice to apply and extend this work to the real-world result sets given above.
A little farther afield is the University of Maryland's work on Generalized Query Previews.
[1] There are some subtleties in defining spatial relevance that we're ignoring here. See: Douglas R. Caldwell, Unlocking the Mysteries of the Bounding Box, Coordinates: Online Journal of the Map and Geography Round Table, ser. A, no. 2 (American Library Association; August 29, 2005).
[2] Background maps and rendering courtesy of Google Maps.
[3] Volker Jung, MetaViz: Visual Interaction with Geospatial Digital Libraries, International Computer Science Institute (Berkeley, CA) technical report TR-99-017 (October 1999).
created 2006-01-24; last modified 2012-05-07 11:21