Spatial Footprint Visualization

Background

Each of the items in a geospatial digital library or geographic information system has associated with it a subset of the Earth's surface that represents the item's spatial coverage or spatial relevance [1]. We'll refer to this subset as the item's footprint. Visualizing footprints against a background map is a useful way to contextualize and evaluate the associated items, especially the items belonging to a query result set. And in the specific case of a query result set, visualization can also yield information about the query itself: for example, whether the query was underspecified and, if so, how it might be further constrained.

Visualizing footprints is relatively easy if all footprints are points:

However, visualizing shapes that have areal extent introduces several complications (for this discussion we'll focus on boxes, but the same complications arise with polygons, circles, etc.):

As of this writing, few spatial query systems support visualization of query result sets. Most systems (Geospatial One-Stop and the Geography Network are representative) support map-based spatial search, but revert to simple linear textual listings when presenting results. Those systems that do support visualization of spatial result sets (Google Local, Yahoo! Local, and MSN are recent examples) support visualization of point footprints only.

The Alexandria Digital Library's default webclient supports visualization of boxes. Currently it can display only one result footprint at a time, a soon-to-be-removed limitation of the third-party map software it is bundled with. Improved map software will allow multiple footprints to be displayed simultaneously, raising the issue of how query result sets can best be visualized.

Result set test collection

To evaluate different footprint visualization techniques on real data, we have developed a test collection of result sets that illustrates the wide variety of result sets users encounter in practice and the kinds of visualization challenges such result sets pose. The collection was generated by randomly selecting 100 queries from the ADL log files (considering only queries that yielded at least one result), categorizing the associated result sets, and hand-selecting the following 11 representatives. Footprints are visualized here as red rectangles [2].

  1. result set 01 Ideal. Here, the query "online images containing phrase 'simi valley'" yielded 8 results having 8 homogeneous, clustered, yet clearly distinct footprints. This is the simplest and ideal case; if only it were always this easy.
  2. result set 02 Duplicates. A ubiquitous problem is that footprints are not unique; not only do they overlap and occlude one another, they are often completely coincident. For example, this query ("cartographic works overlapping a query region and containing words 'new haven'"; the query region is indicated here in green) yielded 38 results but only 7 distinct footprints.
  3. result set 03 Inside. In queries that use the "within" spatial operator, the query region often provides good context for interpreting the results.
  4. result set 04 Outside. The "overlaps" and "within" spatial operators account for approximately 80% and 16% of spatial queries, respectively; the "contains" operator (as in, "find items that completely contain the query region") is used only 4% of the time. When it is, it often provides very poor context for interpreting the results, as this example shows.
  5. result set 05 Range of sizes. A result set can hold footprints of wildly different sizes. Here, the query "containing phrase 'point dume'" yielded the cluster of 22 results shown on the right, along with another 6 (coincident) results whose footprints are over 1,000× larger in terms of area.
  6. result set 06 High density. Result set density varies considerably. Here, a relatively underspecified query ("digital items overlapping a query region") yielded 244 unique footprints. Given the query limit of 250 results, a reasonable inference the user could draw from this visualization is that there is plenty of data distributed over the query region, and that additional criteria must be supplied.
  7. result set 07 Medium density.
  8. result set 08 Low density. This query ("online, containing word 'oil'") yielded a very low density result set. The footprints are so small in relation to their spatial distribution that iconic representation might be more appropriate in this case.
  9. result set 09 Telescope effect. An "overlaps" spatial query (the tiny green dot) produced this classic result set. The results span the spectrum from the specific (a result with a footprint slightly larger than the query region; approximately 1 deg²) to the most general possible (a political map of the world; 64,800 deg²). Given the relative specificity of the query in this case, the user would likely find most of these results to be irrelevant, and thus the whole-world view of the result set that we've shown here is less than desirable. This is a frequently occurring pattern.
  10. result set 10 Tiled. The tiling of this result set is due to the tiled nature of the underlying collection.
  11. result set 11 Extreme. A result set that is extreme in terms of the number of footprints (250 results; 218 unique footprints), distribution of footprints (worldwide), density of footprints (it's not apparent in this visualization, but the density is high in the Southern California region), and range of footprint sizes (the ratio of the areas of the largest and smallest footprints is 1.8 × 10⁶). The query in this case ("online maps") was overly general, and so the goal of visualizing the result set should be to communicate that fact and to indicate how the user might profitably refine the query.
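
The statistics quoted for the result sets above (number of results, number of distinct footprints, ratio of largest to smallest footprint area) can be computed along the following lines. This is only a minimal sketch under stated assumptions: footprints are axis-aligned boxes in degrees that do not cross the antimeridian, areas are measured in square degrees (as in the telescope-effect example), and the names Box, area_deg2, and summarize are hypothetical, not part of ADL.

```python
from typing import NamedTuple, Optional


class Box(NamedTuple):
    """Hypothetical footprint type: an axis-aligned box in degrees."""
    west: float
    south: float
    east: float
    north: float


def area_deg2(box: Box) -> float:
    """Area in square degrees (assumes the box does not cross the antimeridian)."""
    return (box.east - box.west) * (box.north - box.south)


def summarize(footprints: list) -> dict:
    """Count results and distinct footprints, and compute the area ratio."""
    # Coincident footprints compare equal as tuples, so a set removes them.
    distinct = set(footprints)
    # Ignore degenerate (zero-area) boxes when forming the ratio.
    areas = [a for a in (area_deg2(b) for b in distinct) if a > 0]
    ratio: Optional[float] = max(areas) / min(areas) if areas else None
    return {
        "results": len(footprints),
        "distinct": len(distinct),
        "area_ratio": ratio,
    }
```

For example, the whole-world footprint Box(-180, -90, 180, 90) has an area of 360 × 180 = 64,800 square degrees, which matches the figure given for the political map of the world in result set 09.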

Iconic representation of boxes

Typically, icons are used in visualization when literal representations would be too small or too dense. The Acme GeoRSS Map Viewer offers some nice examples of, and an algorithm for, automatically clustering point footprints and visualizing the resulting clusters with icons.
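
To make the idea of clustering point footprints concrete, here is a minimal sketch of one simple approach: grouping points by fixed-size grid cell and placing one icon per occupied cell. This is an assumption chosen for illustration and is not the Acme GeoRSS Map Viewer's algorithm; the function name grid_clusters and the cell_deg parameter are hypothetical.

```python
import math
from collections import defaultdict


def grid_clusters(points, cell_deg):
    """Group (lon, lat) points by grid cell of side cell_deg degrees.

    Returns a list of (centroid_lon, centroid_lat, count) tuples,
    one per occupied cell; each tuple could be drawn as a single icon.
    """
    cells = defaultdict(list)
    for lon, lat in points:
        # Points falling in the same cell share the same integer key.
        key = (math.floor(lon / cell_deg), math.floor(lat / cell_deg))
        cells[key].append((lon, lat))

    clusters = []
    for members in cells.values():
        n = len(members)
        centroid_lon = sum(p[0] for p in members) / n
        centroid_lat = sum(p[1] for p in members) / n
        clusters.append((centroid_lon, centroid_lat, n))
    return clusters
```

In an interactive map, cell_deg would presumably shrink as the user zooms in, so that clusters break apart into individual points.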

Here's an idea for using icons for the opposite reason: visualizing items that are too large. The figure to the right is result set 9 above ("telescope effect"), but zoomed in so that (the edges of) some of the larger footprints are no longer visible. The existence of these footprints has instead been indicated by arrow icons on the right side of the map. If the figure were an active map, clicking on the icons would presumably zoom the map out to reveal the selected footprint.

There are obviously many variants and refinements of this core idea. Icons could represent distinct footprints, or they could represent fixed, larger zoom levels at which more footprints are visible. A rule is needed to determine when a footprint is visualized literally versus iconically. One possibility is to visualize a footprint literally whenever at least two of the footprint's edges are at least partially visible (i.e., at least one corner or two parallel edges). In any case, the general idea is to use icons to indicate the existence of items that would not otherwise be apparent.
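
The literal-versus-iconic rule suggested above can be sketched as follows. This is one possible reading, under stated assumptions: footprints and the map view are axis-aligned boxes with nonzero extent, an edge counts as partially visible only if part of it lies strictly inside the view, and the names Box, visible_edge_count, and render_mode are hypothetical.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Hypothetical axis-aligned box (footprint or map view), in degrees."""
    west: float
    south: float
    east: float
    north: float


def _overlaps(lo1: float, hi1: float, lo2: float, hi2: float) -> bool:
    """True if the two intervals share more than a single point."""
    return max(lo1, lo2) < min(hi1, hi2)


def visible_edge_count(fp: Box, view: Box) -> int:
    """Number of footprint edges at least partially inside the view."""
    count = 0
    # Vertical edges (west, east): x must fall inside the view,
    # and the edge's latitude range must overlap the view's.
    for x in (fp.west, fp.east):
        if view.west < x < view.east and _overlaps(
            fp.south, fp.north, view.south, view.north
        ):
            count += 1
    # Horizontal edges (south, north): the symmetric test.
    for y in (fp.south, fp.north):
        if view.south < y < view.north and _overlaps(
            fp.west, fp.east, view.west, view.east
        ):
            count += 1
    return count


def render_mode(fp: Box, view: Box) -> str:
    """Return 'literal' if at least two edges are visible, else 'icon'."""
    return "literal" if visible_edge_count(fp, view) >= 2 else "icon"
```

With a view of Box(0, 0, 10, 10), a footprint that completely contains the view has no visible edges and is drawn as an icon, while a footprint with one corner inside the view shows two edges and is drawn literally.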

Previous work

There doesn't appear to be much previous work in this area. The closest is the MetaViz project [3], which experimented with a number of techniques for visualizing geographic metadata. That work did not really address footprint occlusion and disambiguation, or footprint disappearance as a function of map zoom. Nevertheless, the techniques are intriguing, and it would be nice to apply and extend this work to the real-world result sets given above.

A little farther afield is the University of Maryland's work on Generalized Query Previews.

Footnotes

[1] There are some subtleties in defining spatial relevance that we're ignoring here. See: Douglas R. Caldwell, Unlocking the Mysteries of the Bounding Box, Coordinates: Online Journal of the Map and Geography Round Table, ser. A, no. 2 (American Library Association; August 29, 2005).

[2] Background maps and rendering courtesy of Google Maps.

[3] Volker Jung, MetaViz: Visual Interaction with Geospatial Digital Libraries, International Computer Science Institute (Berkeley, CA) technical report TR-99-017 (October 1999).

created 2006-01-24; last modified 2012-05-07 11:21