A spatial similarity function is a scalar function of two regions in the Euclidean plane. We intuitively think of the function as indicating the similarity of the regions in terms of size, shape, and location. Here's a quick comparison of three candidate spatial similarity functions.
[If your browser can't display mathematical symbols, try reading this version instead.]
Given regions P and Q, functions A (think addition) and U (think union) are given by
A(P, Q) = 2 area(P∩Q) / (area(P) + area(Q))
U(P, Q) = area(P∩Q) / area(P∪Q)
Function H, the Hausdorff distance, is given by
H(P, Q) = max(h(P, Q), h(Q, P))
where
h(P, Q) = max{∀p∈P | min{∀q∈Q | dist(p, q)}}
The helper function h(P, Q) can be thought of as the "directed distance" from P to Q: it is the distance between the point p∈P that is farthest from any point in Q and the point q∈Q that is closest to p. Intuitively, the Hausdorff distance is a measure of the mismatch between P and Q; if the Hausdorff distance is d, then every point of P is within distance d of some point of Q and vice versa. H(P, Q) is a true distance function and forms a metric space over the set of all closed, bounded sets in the Euclidean plane.
If both regions are rectangular boxes, calculating the Hausdorff distance is relatively easy, and sample Python code to do so is given below.
Some examples...
![]() |
Two boxes that overlap slightly. |
||||||||||
![]() |
The small box has moved diagonally so that it is now entirely contained in the larger box. A and U increase, H decreases. |
||||||||||
![]() |
The small box has moved some more. Notice that A and U are unchanged from the previous example because the intersection area is the same, but H has decreased because it takes into account the relative locations of the boxes: the centers of the boxes are now closer. |
||||||||||
![]() |
As the boxes become more and more similar, A and U approach 1 and H approaches 0. |
||||||||||
![]() |
A and U are always zero if there is no intersection; H reflects the distance between the boxes. |
||||||||||
![]() |
A long, skinny box intersects a square box. |
||||||||||
![]() |
The long, skinny box has been replaced by a squatter box having the same area. Because the intersection area is the same, A and U are unchanged. H has decreased because the boxes are now more similarly shaped and located. |
||||||||||
![]() |
One box contained in another. Notice the sizes of the boxes. |
||||||||||
![]() |
Same example, scaled larger. Being fractions, A and U are unchanged. H has increased because it is based on absolute distances. In other words, A and U are independent of scale, H is not. |
||||||||||
![]() |
A small box contained within a much larger box. |
||||||||||
![]() |
Same example, but the small box has been replaced by a box that occupies almost half the large box. A and U increase dramatically, but H is virtually unchanged. This is because Hausdorff distance takes into account only extremal distances, not preponderances of distances. |
The above examples show that Hausdorff distance has many of the characteristics we desire for a spatial similarity function with the exception of being entirely dependent on extremal distances. An improvement over Hausdorff distance might be made by incorporating more information about the distribution of points, for example, by computing the average distance between all pairs of points in the two regions. Unfortunately, analytic solutions to average distance problems are surprisingly complex and burdensome to compute; see Birendranath Ghosh, "Random Distances Within a Rectangle and Between Two Rectangles," Bulletin of the Calcutta Mathematical Society 43 (1951): 17-24; and Steven R. Dunbar, "The Average Distance Between Points in Geometric Figures," The College Mathematics Journal 28(3) (May 1997): 187-197.
The Python code below computes the Hausdorff distance between two rectangular boxes P and Q. The algorithm is based on the following observation. If h(P, Q) = d, then the distance d must be achieved by one of the corners of P; call it p. The closest point to p in Q will be either the corner of Q nearest p or the perpendicular projection of p onto the nearest boundary of Q.
import math
class Point:
def __init__ (self, x, y):
self.x = x
self.y = y
class Box:
def __init__ (self, min, max):
# min and max are Points representing the
# box's extremal corners
self.min = min
self.max = max
def H (P, Q):
return max(h(P, Q), h(Q, P))
def h (P, Q):
return max(dist(Point(P.min.x, P.min.y), Q),
dist(Point(P.min.x, P.max.y), Q),
dist(Point(P.max.x, P.min.y), Q),
dist(Point(P.max.x, P.max.y), Q))
def dist (p, Q):
q = Point(min(max(p.x, Q.min.x), Q.max.x),
min(max(p.y, Q.min.y), Q.max.y))
return math.sqrt((p.x-q.x)*(p.x-q.x) +
(p.y-q.y)*(p.y-q.y))
Addendum: in general, the Hausdorff distance between any two closed, convex (and bounded and nonempty) sets is achieved on the boundaries of those sets. See: M. D. Wills, Hausdorff Distance and Convex Sets, Journal of Convex Analysis 14(1) (January 2007): 109-118.
created 2003-05-21; last modified 2009-01-20 09:35