11 PUBLICATIONS


This section covers publications that have resulted from project-related activities. We first present selected abstracts from research papers describing the results of our work, organized by research area. We then present a bibliography of papers.

11.1 Abstracts of Selected Articles

11.1.1 Testbed Research and Development

Alexandria Digital Library: Rapid Prototype and Metadata Schema C. Fischer, J. Frew, M. Larsgaard, and T. R. Smith

The Alexandria Project is focused on the design, implementation, and deployment of a digital library for spatially-indexed information. We describe and evaluate the architecture and functioning of a rapid prototype system (RPS) for the Alexandria Digital Library. The RPS was constructed to permit investigation of a number of important issues and to provide a functioning digital library. It is populated with a limited but heterogeneous set of spatially-indexed information, including digitized maps, aerial photographs, and satellite images. The RPS is being used to evaluate the implementation of spatial metadata standards in a relational schema; the relative merits of map-based versus form-based user interfaces; and the use of high-level scripting languages to customize large software packages. A central aspect of a digital library for spatially-indexed information is the catalog component and metadata schema. We evaluated both the "Federal Geographic Data Committee - Content Standard for Digital Geospatial Metadata" (FGDC) and the "United States MAchine Readable Cataloging" (USMARC) standards during the construction of the rapid prototype. We chose to employ a hybrid schema that combined the best aspects of both standards. The rapid prototype system is currently serving as a facility

The WWW Prototype of the Alexandria Digital Library D. Andresen, L. Carver, R. Dolin, C. Fischer, J. Frew, M. Goodchild, O. Ibarra, R. Kothuri, M. Larsgaard, B. Manjunath, D. Nebert, J. Simpson, T. Smith, T. Yang, Q. Zheng

The Alexandria Digital Library (ADL) is focused on providing broad access to distributed collections of spatially-indexed information. ADL has a four-component architecture involving collections, catalog, interfaces, and ingest facilities. The first stage in the construction of ADL resulted in the design and implementation of a rapid prototype (RP) system. The second stage, which is described in this paper, involves an expansion of the functionality of the RP and its extension to the World-Wide-Web (WWW). We describe issues arising in each of the components of the architecture in extending the library to WWW as well as our current resolution of these issues. We also discuss an extension of the class of supportable queries to include simple, content-based queries involving geographic "features" and image textures. The metadata of ADL has been extended to include gazetteer information supporting the first class of extended queries. We discuss image processing and parallel computing support for ADL.

The Alexandria Digital Library: Overview and WWW Prototype Terence R. Smith, D. Andresen, L. Carver, R. Dolin, C. Fischer, J. Frew, M. Goodchild, O. Ibarra, R. Kothuri, M. Larsgaard, B. Manjunath, D. Nebert, J. Simpson, T. Yang, Q. Zheng.

The goal of the Alexandria Digital Library (ADL) is to provide online access to distributed collections of geographically-referenced information. The ADL will comprise a set of Internet nodes implementing various combinations of collections, catalogs, interfaces, and ingest facilities (the four primary components of the ADL architecture). ADL development efforts to date have concentrated on the catalog and user interface components. The first ADL development cycle yielded a stand-alone rapid prototype (RP) system, based on commercial database management system (DBMS) and geographic information system (GIS) technology. The second (current) development cycle is assembling a "Web prototype" (WP) spatial data library accessible from the World Wide Web (WWW). In addition to addressing the metadata issues associated with the catalog and the functionality issues associated with a complex WWW interface, the WP includes preliminary applications of image processing and parallel computing technologies.

11.1.2 Information Systems Research

Indexing Hierarchical Data R. V. Kothuri et al.

Map and imagery data in geographic information systems are inherently hierarchical with multiple levels of spatial nesting. Although this type of hierarchy is widely prevalent in spatial data domains, the issue of indexing such nested data has not received much attention in the database and indexing community. In this paper, we address several issues that arise while designing index structures for hierarchical data. B-trees and related structures can only index unidimensional point data. We extend B-trees (to IB-trees) to handle data objects that span over a range of values rather than single-valued points in the data space. There are two main advantages of our proposal: first, an acceptable worst case bound exists on the various operations, and second, searching of multiple paths as in R-trees is completely avoided. Using the IB-trees we propose the Level-based IB-tree (LIB) structure that adequately reflects the nesting of data objects. Objects in d-dimensional data space are decomposed into an interval in each dimension and indexed using the LIB-structure for that dimension. We conclude our analysis with experimental results comparing the performance of LIB structures with R* trees on a sample of real data obtained from the Alexandria gazetteer.
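The decomposition step described above (one interval per dimension for each object) can be illustrated with a minimal sketch. This is not the IB-tree or LIB structure from the paper: a linear scan stands in for the per-dimension index, and the class name `PerDimensionIndex`, the object names, and the coordinates are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

Interval = Tuple[float, float]   # closed interval [lo, hi]
Box = Tuple[Interval, ...]       # one interval per dimension


def overlaps(a: Interval, b: Interval) -> bool:
    """Closed intervals overlap when neither lies entirely beyond the other."""
    return a[0] <= b[1] and b[0] <= a[1]


class PerDimensionIndex:
    """Stores each d-dimensional box as d independent intervals.

    Assumption: a real system would keep each dimension in an interval
    index; here a plain dictionary scan plays that role.
    """

    def __init__(self, dims: int) -> None:
        self.dims = dims
        self.by_dim: List[Dict[str, Interval]] = [{} for _ in range(dims)]

    def insert(self, name: str, box: Box) -> None:
        if len(box) != self.dims:
            raise ValueError("box dimensionality mismatch")
        for d, interval in enumerate(box):
            self.by_dim[d][name] = interval

    def query(self, box: Box) -> Set[str]:
        """Return names whose interval overlaps the query in every dimension."""
        if len(box) != self.dims:
            raise ValueError("query dimensionality mismatch")
        result = None
        for d, q in enumerate(box):
            hits = {n for n, iv in self.by_dim[d].items() if overlaps(iv, q)}
            result = hits if result is None else result & hits
        return result or set()


# Nested objects, as in a gazetteer: a county lying inside a state.
index = PerDimensionIndex(dims=2)
index.insert("state", ((0.0, 10.0), (0.0, 10.0)))
index.insert("county", ((2.0, 4.0), (2.0, 4.0)))
index.insert("other", ((20.0, 30.0), (0.0, 10.0)))
print(sorted(index.query(((3.0, 3.5), (3.0, 3.5)))))
```

A query point inside the county matches both the county and the enclosing state, which is the nesting behavior the abstract refers to.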

Content Based Placement and Browsing of Image Data S. Prabhakar et al.

With rapid advances in, and the fusion of, computer and communication technologies, there is an increasing demand to build large image repositories. The Alexandria project at UC Santa Barbara has been initiated to build a "digital library" for maps and image data. One of the major hurdles that needs to be overcome in the design of such a library is the storage and retrieval of image data. One of the approaches being pursued in Alexandria is to provide content-based retrieval of images: image processing techniques are used to extract image features, and these features are then used to organize the image data in a multidimensional vector space. A typical user session will usually involve browsing through a rather large collection of images, with user input being the deciding factor in determining the exact answer to a query. Since query results may contain redundant images, this may result in a significant waste of I/O and network bandwidth. To minimize the waste of these precious resources, the Alexandria architecture plans to use a multi-resolution representation of images based on the wavelet transform. The wavelet approach represents an image by several coefficients, one of which is visually similar to the original image but at a lower resolution. This coefficient can thus be thought of as the "thumbnail" or "icon" of the original image.

This paper addresses the problem of storing the wavelet coefficients on disk(s) so that thumbnail browsing as well as image reconstruction can be performed efficiently. Several strategies for storing the image coefficients on parallel disks are evaluated. These strategies fall into two broad classes, depending on whether or not the content of the images and content-based metrics are used in the placement of image coefficients. Disk simulation is used to evaluate the performance of these strategies. The data used in the simulation are of two types: the entire set of 1,856 textures from a standard collection of textures, as well as 10,000 to 50,000 real Landsat images. The results indicate that if content-based retrieval is used to access the images, then this information should also be used for the placement of images on the disk. In particular, when content-based placement is used to store image coefficients on disks, performance improvements of up to 40% are achieved using as few as four disks.
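To make the "thumbnail" idea concrete, here is a minimal sketch of one level of a 2-D Haar decomposition in its averaging form. The abstract does not say which wavelet or normalization the paper uses, so the choice of Haar, the function names, and the sample image are assumptions. The low-pass band is a half-resolution copy of the image, and the three detail bands are what must additionally be read to reconstruct the full-resolution image.

```python
def haar2d_level(image):
    """One level of the averaging 2-D Haar transform.

    Returns (ll, lh, hl, hh); ll is the half-resolution thumbnail.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image sides must be even")
    ll, lh, hl, hh = [], [], [], []
    for r in range(0, rows, 2):
        ll_row, lh_row, hl_row, hh_row = [], [], [], []
        for c in range(0, cols, 2):
            a, b = image[r][c], image[r][c + 1]          # top pair
            p, q = image[r + 1][c], image[r + 1][c + 1]  # bottom pair
            ll_row.append((a + b + p + q) / 4)  # block average
            lh_row.append((a + b - p - q) / 4)  # top minus bottom
            hl_row.append((a - b + p - q) / 4)  # left minus right
            hh_row.append((a - b - p + q) / 4)  # diagonal difference
        ll.append(ll_row)
        lh.append(lh_row)
        hl.append(hl_row)
        hh.append(hh_row)
    return ll, lh, hl, hh


def haar2d_inverse(ll, lh, hl, hh):
    """Rebuild the full-resolution image from the four bands."""
    image = []
    for r in range(len(ll)):
        top, bottom = [], []
        for c in range(len(ll[0])):
            s, v, h, d = ll[r][c], lh[r][c], hl[r][c], hh[r][c]
            top += [s + v + h + d, s + v - h - d]
            bottom += [s - v + h - d, s - v - h + d]
        image += [top, bottom]
    return image


image = [
    [4, 8, 0, 0],
    [4, 8, 0, 0],
    [12, 12, 16, 0],
    [12, 12, 0, 16],
]
thumbnail, lh, hl, hh = haar2d_level(image)
print(thumbnail)  # 2x2 block averages of the 4x4 image
```

Browsing needs only the `ll` band, so a placement strategy can store it separately from the detail bands that are fetched on demand.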

Efficient Retrieval for Browsing Large Image Databases D. Wu et al.

The management of large image databases poses several interesting and challenging problems. These problems range from ingesting the data and extracting metadata to the efficient storage and retrieval of the data. Of particular interest are the retrieval methods and user interactions with an image database during browsing. In image databases, the response to a given query is not an exact well-defined set; rather, the user poses a query and expects a set of responses that should contain many possible candidates from which the user chooses the answer set.

In this paper we start by exploring the browsing model in Alexandria, a digital library for maps and satellite images. Because the library is designed for content-based retrieval, the relevant information in an image is encoded in the form of a multi-dimensional feature vector. Various techniques have previously been proposed for retrieving such vectors efficiently by reducing their dimensionality. We first show that even for moderately large databases (in fact, only 1,856 texture images), these approaches do not scale well for exact retrieval. However, as a browsing tool, these dimensionality reduction techniques hold much promise.

11.1.3 Image Processing Research

Texture Features and Learning Similarity W. Y. Ma and B. S. Manjunath

This paper addresses two important issues related to texture pattern retrieval: feature extraction and similarity search. A Gabor feature representation for textured images is proposed, and its performance in pattern retrieval is evaluated on a large texture image database. The basic idea is to extract features at multiple scales and orientations using Gabor filters. These features compare favorably with other existing texture representations. In particular, a comprehensive comparison with multiresolution simultaneous autoregressive (MR-SAR) features, orthogonal wavelets, and tree-structured wavelet features is made using the entire Brodatz album. We conclude that the Gabor features provide the best representation for texture pattern retrieval among these different features.
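The basic idea of extracting features at multiple scales and orientations can be sketched as follows. This is an illustration only: the abstract does not give the filter design, so the real-valued (cosine) kernel, the use of the mean absolute response as the feature, and all parameter values (`half_size`, `sigma`, the frequency, the orientations) are assumptions.

```python
import math


def gabor_kernel(half_size, sigma, frequency, theta):
    """Real (cosine) Gabor kernel as a square list of rows."""
    kernel = []
    for y in range(-half_size, half_size + 1):
        row = []
        for x in range(-half_size, half_size + 1):
            # Coordinate measured along the filter's orientation.
            along = x * math.cos(theta) + y * math.sin(theta)
            envelope = math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            row.append(envelope * math.cos(2.0 * math.pi * frequency * along))
        kernel.append(row)
    return kernel


def mean_abs_response(image, kernel):
    """Mean absolute filter response over positions where the kernel fits."""
    k = len(kernel)
    rows, cols = len(image), len(image[0])
    if rows < k or cols < k:
        raise ValueError("image is smaller than the kernel")
    total, count = 0.0, 0
    for r in range(rows - k + 1):
        for c in range(cols - k + 1):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    acc += kernel[i][j] * image[r + i][c + j]
            total += abs(acc)
            count += 1
    return total / count


def gabor_features(image, frequencies, orientations, half_size=4, sigma=2.0):
    """One feature per (frequency, orientation) pair, frequency-major order."""
    return [
        mean_abs_response(image, gabor_kernel(half_size, sigma, f, t))
        for f in frequencies
        for t in orientations
    ]


# Synthetic texture: vertical stripes with a period of four pixels.
stripes = [[math.cos(2.0 * math.pi * 0.25 * c) for c in range(16)]
           for _ in range(16)]
features = gabor_features(stripes, [0.25], [0.0, math.pi / 2.0])
print(features)
```

For this striped texture, the filter oriented across the stripes responds far more strongly than the perpendicular one, which is what lets such a feature vector discriminate between textures.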

In the second part of the paper, we discuss learning similarity in the texture feature space. A hybrid neural network learning algorithm is used to cluster the image patterns in the feature space. It achieves the objective of maintaining the topology while reducing the dimensionality, and groups perceptually similar patterns into the same cluster. With learning similarity, the performance of similar pattern retrieval improves significantly.

An important aspect of this work is its application to real image data. Gabor feature extraction with similarity learning is used to search through aerial photographs of 30-100MB using texture content. Feature clustering enables efficient search of the database. Our preliminary results on searching over 280,000 image patterns from the airphotos indicate that search time can be easily reduced by a factor of 50-100.

An Approach to Efficient Storage, Retrieval, and Browsing of Large Scale Image Databases N. Strobel, S. K. Mitra, and B. S. Manjunath

This paper suggests a wavelet-transform-based multiresolution approach as a viable solution to the problems of storage, retrieval, and browsing in a large image database. We also investigate the performance of an optimal uniform mean square quantizer in representing all transform coefficients, to ensure that the disk space necessary for storing a multiresolution representation does not exceed that of the original image. In addition, popular wavelet filters are compared with respect to their reconstruction performance and computational complexity. We conclude that, for our application, the Haar wavelet filters offer an appropriate compromise between reconstruction performance and computational effort.
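The role of the quantizer can be shown with a minimal uniform quantizer sketch. The step size here is fixed by hand rather than chosen to minimize mean square error as in the paper, and the function names and coefficient values are hypothetical.

```python
def quantize(coefficients, step):
    """Map each coefficient to the index of the nearest multiple of step."""
    if step <= 0:
        raise ValueError("step must be positive")
    return [round(c / step) for c in coefficients]


def dequantize(indices, step):
    """Recover approximate coefficients from their integer indices."""
    return [i * step for i in indices]


coefficients = [6.0, -0.3, 12.2, 7.9, 0.1]
step = 0.5
indices = quantize(coefficients, step)
restored = dequantize(indices, step)
worst = max(abs(a - b) for a, b in zip(coefficients, restored))
print(indices, worst)
```

Storing small integer indices instead of real-valued coefficients is what keeps the multiresolution representation from growing past the size of the original image; the price is a reconstruction error of at most half a step per coefficient.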

An Eigenspace Update Algorithm for Image Analysis B. S. Manjunath, S. Chandrasekaran, and Y. F. Wang

During the past few years several interesting applications of eigenspace representation of the images have been proposed. These include face recognition, video coding, pose estimation, etc. However, the vision research community has largely overlooked parallel developments in signal processing and numerical linear algebra concerning efficient eigenspace updating algorithms. These new developments are significant for two reasons. Adopting them will make some of the current vision algorithms more robust and efficient. More important is the fact that incremental updating of eigenspace representations will open up new and interesting research applications in vision such as active recognition and learning. The main objective of this paper is to put these in perspective and discuss a recently introduced updating scheme that has been shown to be numerically stable and optimal. We will provide an example of one particular application to 3D object representation projections and give an error analysis of the algorithm. Preliminary experimental results are shown.

11.1.4 Parallel Processing Research

SWEB: Towards a Scalable World Wide Web Server on Multicomputers D. Andresen, T. Yang, V. Holmedahl, O. Ibarra

We investigate the issues involved in developing a scalable World Wide Web (WWW) server on a cluster of workstations and parallel machines. The objective is to strengthen the processing capabilities of such a server by utilizing the power of multicomputers to match the huge demand of simultaneous access requests from the Internet. We have implemented a system called SWEB on a distributed memory machine, the Meiko CS-2, and on networked workstations. The scheduling component of the system actively monitors the usage of CPU, I/O channels, and the interconnection network in order to distribute HTTP requests effectively across processing units and exploit task and I/O parallelism. We present experimental results on the performance of this system.
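A load-aware dispatch decision of the kind described above can be sketched as follows. This is not SWEB's actual cost model, which the abstract does not specify: the weighted-sum cost, the weights, and the node names and load figures are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class NodeLoad:
    name: str
    cpu: float   # fraction of CPU in use, 0.0 to 1.0
    disk: float  # fraction of I/O channel capacity in use
    net: float   # fraction of network capacity in use


def estimated_cost(node: NodeLoad, cpu_w: float, disk_w: float, net_w: float) -> float:
    """Hypothetical cost: a weighted sum of the monitored load figures."""
    return cpu_w * node.cpu + disk_w * node.disk + net_w * node.net


def pick_node(nodes: Sequence[NodeLoad], cpu_w: float = 1.0,
              disk_w: float = 1.0, net_w: float = 1.0) -> NodeLoad:
    """Choose the node with the lowest estimated cost for this request."""
    if not nodes:
        raise ValueError("no nodes available")
    return min(nodes, key=lambda n: estimated_cost(n, cpu_w, disk_w, net_w))


cluster = [
    NodeLoad("node-a", cpu=0.9, disk=0.1, net=0.1),
    NodeLoad("node-b", cpu=0.2, disk=0.7, net=0.1),
    NodeLoad("node-c", cpu=0.4, disk=0.4, net=0.4),
]
# A request dominated by file transfer weights disk and network load heavily.
print(pick_node(cluster, cpu_w=0.0, disk_w=3.0, net_w=1.0).name)
```

Changing the weights per request type lets an I/O-bound request avoid a node with a busy disk even when that node's CPU is idle.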

Experimental Studies on a Compact Storage Scheme for Wavelet-based Multiresolution Subregion Retrieval A. Poulakidas, A. Srinivasan, O. Egecioglu, O. Ibarra, and T. Yang

Subregion retrieval is an important feature of a digital library system for browsing large-scale images. The challenge is to access desired subregions efficiently from compressed image data. We have developed a wavelet-based image storage scheme that provides fast image subregion retrieval at progressively higher resolutions while achieving good image compression ratios. The method is based on the quadtree and Huffman coding schemes. Our preliminary experiments with sample satellite images show that a 70-90% space reduction is achieved for quantized image coefficient data.
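The Huffman coding half of the scheme can be illustrated with a short sketch. It covers only entropy coding of quantized coefficients, not the quadtree organization or the paper's actual storage layout, and the function names and sample data are hypothetical. Quantized wavelet coefficients are typically dominated by zeros, which is why a variable-length code saves space.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Build a prefix code (symbol -> bit string) from symbol frequencies."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (count, tie-breaker id, partial code table).
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        c1, _, codes1 = heapq.heappop(heap)
        c2, _, codes2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in codes1.items()}
        merged.update({s: "1" + code for s, code in codes2.items()})
        heapq.heappush(heap, (c1 + c2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(symbols, code):
    return "".join(code[s] for s in symbols)


def decode(bits, code):
    reverse = {bitstring: sym for sym, bitstring in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:  # prefix-free, so the first match is correct
            out.append(reverse[current])
            current = ""
    return out


# Hypothetical quantized coefficients: mostly zeros, a few small values.
quantized = [0] * 12 + [1, 1, -1, 5]
code = huffman_code(quantized)
bits = encode(quantized, code)
print(len(bits), "bits versus", 2 * len(quantized), "with a fixed 2-bit code")
```

On this sample the variable-length code needs 22 bits against 32 for a fixed-width code; the reduction reported in the abstract comes from the authors' experiments on real satellite images, not from this toy input.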

11.2 References

Agrawal, D., J. Bruno, A. El Abbadi, and M. Krishnaswamy, Managing Concurrent Activities in Collaborative Environments, International Conference on Cooperative Systems, Vienna, Austria, 1995.

Alexandrov, A. D., W. Y. Ma, A. El Abbadi and B. S. Manjunath, "Adaptive Filtering and Indexing for Image Databases," in Proc. SPIE conf. on Storage and Retrieval of Image and Video Databases-III, San Jose, CA, pp. 12-23, Feb. 1995.

Andresen, D., L. Carver, R. Dolin, C. Fischer, J. Frew, M. Goodchild, O. Ibarra, R. Kothuri, M. Larsgaard, B. Manjunath, D. Nebert, J. Simpson, T. Smith, T. Yang, Q. Zheng, "The WWW Prototype of the Alexandria Digital Library," Proceedings of ISDL'95: International Symposium on Digital Libraries, Tsukuba, Japan, August 1995.

Andresen, D., T. Yang, V. Holmedahl, O. Ibarra, SWEB: Towards a Scalable World Wide Web Server on Multicomputers. To appear in Proceedings of the 10th International Parallel Processing Symposium (IPPS '96), IEEE, Hawaii, April 1996.

Andresen, D., T. Yang, O. Ibarra, and T. R. Smith, Scalability Issues for High Performance Digital Libraries on the World Wide Web. To appear in Advances in Digital Libraries 96, 1996.

Buttenfield, B. P. Evaluating User Requirements for a Digital Library Testbed. Proceedings, AUTO-CARTO 12, Charlotte, North Carolina, 27 February - 1 March: 207-214, 1995.

Buttenfield, B. P. GIS and Digital Libraries: Issues of Size and Scalability. In Smith, L. C. (ed.) GIS and Libraries. Champaign-Urbana: University of Illinois Press (forthcoming), 1995.

Buttenfield, B. P. and M. P. Kumler, Tools for Browsing Environmental Data: The Alexandria Digital Library Interface. Proceedings, Third International Conference on Integrating Geographic Information Systems and Environmental Modeling, Santa Fe, New Mexico, January 21-25, 1996.


Fischer, C., J. Frew, M. Larsgaard, and T. R. Smith, Alexandria Digital Library: Rapid Prototype and Metadata Schema, in Proceedings of ADL95, N. Adam, B. Bhargava, M. Halem and Y. Yesha (editors), Lecture Notes in Computer Science, Springer Verlag, 1995.

Frank, S. M., M. F. Goodchild, H. J. Onsrud, and J. K. Pinto (1995) A survey on user requirements for framework GIS data. Proceedings, URISA 95, San Antonio, TX, July 16-20, 1: 637-651.

Frew, J., L. Carver, C. Fischer, M. Goodchild, M. Larsgaard, T. Smith, and Q. Zheng, "The Alexandria Rapid Prototype: building a digital library for spatial information," in Proceedings of the 1995 ESRI User Conference, Environmental Systems Research Institute, Inc., Redlands, CA, May 22-25, 1995.

Goodchild, M. F., (1994) Future directions for geographic information science. Geographic Information Sciences 1: 1-7.

Goodchild, M. F., (1995) Future directions for geographic information science. Proceedings, GeoInformatics '95, Hong Kong, May 26-28, 1995, 1: 1-10.

Goodchild, M. F., (1995) Sharing imperfect data. In H. J. Onsrud and G. Rushton, editors, Sharing Geographic Information. New Brunswick, NJ: Rutgers University Press, pp. 413-425.

Goodchild, M. F., (1995) Sharing spatial data among physical scientists. In H. J. Onsrud and G. Rushton, editors, Sharing Geographic Information. New Brunswick, NJ: Rutgers University Press, pp. 475-489.

Goodchild, M. F., (1995) Spatial databases for global environmental issues. In Toward Global Planning of Sustainable Use of the Earth, edited by S. Murai. Proceedings of the Eighth TOYOTA Conference, Mikkabi, November 8-11, 1994. Amsterdam: Elsevier, pp. 43-58.

Goodchild, M. F., (1995) The application of advanced information technology in assessing environmental impacts. Proceedings, 1995 Bouyoucos Conference: Applications of GIS to the Modeling of Non-Point Source Pollutants in the Vadose Zone, Riverside, CA, May 1-3, 20-32.

Goodchild, M. F., (1995) Technical advances in spatial data sharing. Proceedings, URISA 95, San Antonio, TX, July 16-20, 1: 651-661.

Grumbach, S. and J. Su, Towards Practical Constraint Databases, Proc. ACM Symp. on Principles of Database Systems 1996 (to appear).

Haley, G. M. and B. S. Manjunath, "Rotation-invariant texture classification using modified Gabor filters," Proc. second international conference on image processing, ICIP 95, Vol. I, October 1995, pp. 262-265.

Kothuri, R. and A. K. Singh, Indexing Hierarchical Data. Technical Report TR95-14, UCSB CS Department, UCSB, 1995.

Lee, C., T. Yang, and Y.-F. Wang, Partitioning and Scheduling for Image Processing Operations. To appear in Proc. of IEEE Symp. on Parallel and Distributed Processing, Texas, Oct. 1995.

Lee, C., Y.-F. Wang, and T. Yang, Static Global Scheduling for Optimal Computer Vision and Image Processing Operations on Distributed-Memory Multiprocessors. To appear in Proc. of 6th International Conference on Computer Analysis of Images and Patterns, September 1995.

Ma, W. Y. and B. S. Manjunath, "A comparison of wavelet transform features for texture image annotation," Proceedings of IEEE International Conference on Image Processing, vol. II, pp. 256-259, Washington D. C., October 1995.

Ma, W. Y. and B. S. Manjunath, "Image indexing using a texture dictionary," Proceedings of SPIE conference on Image Storage and Archiving System, vol. 2606, pp. 288-298, Philadelphia, Pennsylvania, Oct. 1995.

Ma, W. Y. and B. S. Manjunath, "Texture features and learning similarity," to be presented at the IEEE International Conference on Computer Vision and Pattern Recognition, San Francisco, CA, June 1996.

Manjunath, B. S. and W. Y. Ma, "Texture features for browsing and retrieval of image data," in technical report CIPR95-06, Univ. of California at Santa Barbara, July, 1995.

Ma, W. Y. and B. S. Manjunath, "Texture features for browsing and retrieval of image data," to appear in the IEEE Transactions on Pattern Analysis and Machine Intelligence (Special Issue on Digital Libraries), Nov. 1996.

Ma, W. Y. and B. S. Manjunath, "Texture-based pattern retrieval from image databases," Journal of Multimedia Tools and Applications, vol. 2, no. 1, pp. 35-51, Jan. 1996.

Ma, W. Y., and B. S. Manjunath, "Pattern Retrieval in Image Database Based on Adaptive Signal Decompositions," in Proc. of the 28th Asilomar Conf. on Signals, Systems and Computers, Pacific Grove, CA, pp. 1351-1355, Oct 31-Nov 2, 1994.

Manjunath, B. S., S. Chandrasekaran, and Y. F. Wang, "An Eigenspace update algorithm for image analysis," Proc. IEEE International Symposium on Computer Vision 1995, Coral Gables, Florida (November 1995), pp. 551-556.

Nebert, D. D. 1995. Trends in Internet service of maps and spatial data sets, presented at Association of American Geographers Conference, March 1995, Panel Discussion on Project Alexandria

(http://h2o.er.usgs.gov/public/AAG/page1.html)

Nebert, D. D. and J. Fullton. 1995. Use of Z39.50 to search and retrieve geospatial data, in: Proceedings, Digital Libraries '95, Austin, TX, June 11-13, 1995.

(http://h2o.er.usgs.gov/public/DLIpaper395.html)

NRCS (1995) Data Rich and Information Poor: A Report to the Chief of the Natural Resources Conservation Service by the Blue Ribbon Panel on Natural Resource Inventory and Performance Measurement. Washington, DC: Natural Resources Conservation Service, US Department of Agriculture (contributing author).

Plewe, Brandon 1995 "The GEOWEB Project: Map Tiling for Distributed Data." Master's Thesis, Department of Geography, SUNY-Buffalo, Buffalo, NY 14261.

Poulakidas, A., A. Srinivasan, O. Egecioglu, O. Ibarra, and T. Yang, Experimental Studies on a Compact Storage Scheme for Wavelet-based Multiresolution Subregion Retrieval, Proceedings of NASA 1996 Combined Industry, Space and Earth Science Data Compression Workshop, Utah, April 1996.

Poulakidas, A., A. Srinivasan, O. Egecioglu, O. Ibarra, T. Yang, Wavelet-based storage compression and fast subregion retrieval (Extended Abstract), Working Paper. 1995, UCSB.

Ramponi, G., Norbert Strobel, Sanjit K. Mitra, and Tian Hu-Yu "Nonlinear Unsharp Masking Methods for Image Contrast Enhancement" (submitted to Journal of Electronic Imaging, Special Issue on Nonlinear Image Processing)

Smith, T. R., 1996. The Alexandria Digital Library: Overview and WWW Prototype, IEEE Computer.

Strobel, N., and Sanjit K. Mitra, "Quadratic Filters for Image Contrast Enhancement," in Proc. of the 28th Asilomar Conf. on Signals, Systems and Computers, Pacific Grove, CA, pp. 208-212, October 1994.

Strobel, N., Sanjit K. Mitra, and B. S. Manjunath, "An Approach to Efficient Storage, Retrieval, and Browsing of Large Scale Image Databases," Proceedings of the SPIE on Digital Image Storage and Archiving Systems, vol. 2606, pp. 324-335, Philadelphia, Pennsylvania, October 1995.


Last modified on 1996-02-27 at 18:19 GMT by the Alexandria Web Team