
Colorado Component of the Interface Evaluation Team

Membership: Buttenfield (leader), Larsen, Reitsma, Kersky, Tsou, Rokoske, Rock, Smith, Kole

Mission Statement of the Team

The goal of the Colorado component of the Interface Evaluation Team is to evaluate the effectiveness of ADL from the user's perspective, and to establish the needs and requirements of three classes of target users. Knowledge gained from these activities informs system design and implementation. A variety of evaluation methods are applied to assess system functionality and user perspectives, and part of the Team's mission is to compare the results of these methods to gauge their relative effectiveness.

Overviews of the ADL User Evaluation effort are reported in Buttenfield and Goodchild (1996), and in Buttenfield (1997a). Overviews of the educational priorities are reported in Buttenfield and McLafferty (1996).

Progress with Transaction Log Analyses

The first area of research progress concerns analyses of the ADL Web transaction logs. In conventional empirical user testing, users complete pre-defined tasks in a controlled (lab) environment. Task performance is subsequently decomposed into quantifiable variables (time-on-task and performance measures) and into descriptive variables gleaned from entry- and exit-interviews. Videotaping can augment these data with observed behavior patterns. These methods have been applied with success in numerous published studies, and were used to evaluate early interface versions of ADL.

However, the assumptions underlying conventional user testing do not strictly apply when the product under evaluation is on the Internet, because:

  1. Internet users may run multiple Internet sessions concurrently. They may stop a task periodically in response to distractions on the Internet or within their local environment. This invalidates the "structured work pattern" assumption.
  2. Internet users are not necessarily local, therefore comparative lab testing for a local subject pool may be biased.
  3. Entry- and exit-interviews may be ignored. Internet users characterize an ideal access route as one that is unimpeded by prerequisite self-identification. The incentive to describe a Web site experience diminishes once the site has been visited. The Internet effectively distances the users, the evaluators, and the system designers.
These findings are drawn from user surveys run on the ADL Rapid Prototype at Buffalo and Colorado, and from comments taken during focus group sessions at Colorado; they have been reported in Buttenfield and Kumler (1996), Buttenfield (1996d), and most recently in Buttenfield and Larsen (1997).

Instead of monitoring single users or single sessions on the Web Testbed, we aggregate transaction log data for user activity as a whole, using a public domain transaction log parsing tool called ``Analog''. We can summarize ADL Web traffic by month, week, day, or hour; we can disaggregate to observe traffic on a single hotlink, or to measure frequent paths across multiple Web pages taken by users (Buttenfield, 1996a). Results of analyzing frequencies prove most informative. For example, during one month a large volume of ``hits'' on the ``Bad'' button was observed on the gazetteer and catalog search results pages.
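The kind of monthly, per-page aggregation described above can be sketched as follows. This is an illustration rather than how Analog itself works: it assumes NCSA Common Log Format (standard for Web servers of this era), and the host names and page paths in the sample lines are hypothetical.

```python
import re
from collections import Counter

# Matches NCSA Common Log Format lines, capturing the date, request
# path, and HTTP status code.
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<day>\d+)/(?P<month>\w+)/(?P<year>\d+)[^\]]*\] '
    r'"(?:GET|POST) (?P<path>\S+)[^"]*" (?P<status>\d+)'
)

def hits_by_month_and_page(lines):
    """Count successful (2xx) requests per (month, page) pair."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m and m.group("status").startswith("2"):
            month = m.group("month") + " " + m.group("year")
            counts[(month, m.group("path"))] += 1
    return counts

# Hypothetical log lines, not actual ADL traffic.
sample = [
    'host1 - - [03/Feb/1997:10:15:00 -0700] "GET /adl/gazetteer.html HTTP/1.0" 200 512',
    'host2 - - [03/Feb/1997:10:16:12 -0700] "GET /adl/bad_button.cgi HTTP/1.0" 200 128',
    'host1 - - [04/Feb/1997:09:01:45 -0700] "GET /adl/gazetteer.html HTTP/1.0" 404 0',
]
counts = hits_by_month_and_page(sample)
```

The same counter, keyed on pairs of consecutive requests per host, yields the hotlink and path frequencies mentioned above.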

Checking the original logs, we discovered that users would perform a search and then jump back to the map browser page without tagging the footprints they wished to display. When the map browser display came up ``empty'', they returned to the Search Results pages and ``expressed their frustration''. By parsing the logs automatically, we could focus our attention on sequences of user activity indicating problem areas in the interface (Buttenfield, 1996c).
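Detecting such a problem sequence can be sketched as a scan over each session's ordered page requests. The page names and the three-step pattern below are illustrative stand-ins for the search-results/map-browser round trip described above, not ADL's actual URLs.

```python
# Hypothetical problem pattern: search results, then the map browser,
# then straight back to search results (no footprint tagged in between).
PROBLEM = ("search_results", "map_browser", "search_results")

def flag_problem_sessions(sessions):
    """Return ids of sessions whose page sequence contains PROBLEM."""
    flagged = []
    for sid, pages in sessions.items():
        for i in range(len(pages) - 2):
            if tuple(pages[i:i + 3]) == PROBLEM:
                flagged.append(sid)
                break
    return flagged

# Two invented sessions: u1 exhibits the frustrating round trip,
# u2 tags a footprint before visiting the map browser.
sessions = {
    "u1": ["home", "search_results", "map_browser", "search_results", "bad_button"],
    "u2": ["home", "search_results", "tag_footprint", "map_browser"],
}
flagged = flag_problem_sessions(sessions)
```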

We envisioned graphical methods for observing patterns of use, as another tool to reduce the large volume of transaction log information for analysis and interpretation. This proved to be most challenging, since modifications to the interface are coupled with modified use patterns: we could create frequency plots, but could not create a visual display of the use patterns of one page in relation to other pages. How ``close'' in a user's mind is the online help file to the catalog, for example? We wanted to create a landscape of the frequency with which users jump between pages, and anchor each month's activity to view changes in use patterns over time (Buttenfield, 1996d).

We solved the problem using multidimensional scaling to recover coordinates from the frequencies of lag-one jumps between Web pages. Preliminary maps of the "use pattern landscapes" were presented at a UCSB colloquium in late January, and a paper on them has been submitted (Rock, Buttenfield and Reitsma, 1997). Our work in this area will continue, applying gravity models and spatial interaction models to the problem. Our intention is to use the models to infer changes in use patterns that might result from adding or deleting Web links. In theory, this will allow us to model use patterns prior to modifying the Web interface.
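The coordinate-recovery step can be sketched with classical (Torgerson) multidimensional scaling. The jump-frequency matrix and the frequency-to-dissimilarity transform below are assumptions for illustration, not the team's actual data or model.

```python
import numpy as np

def classical_mds(dissim, k=2):
    """Classical (Torgerson) MDS: embed a dissimilarity matrix in k dims."""
    n = dissim.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ (dissim ** 2) @ J         # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:k]       # largest eigenvalues first
    return vecs[:, order] * np.sqrt(np.maximum(vals[order], 0.0))

# Hypothetical symmetrized lag-one jump counts among four Web pages.
jumps = np.array([[0, 40, 5, 1],
                  [40, 0, 30, 2],
                  [5, 30, 0, 25],
                  [1, 2, 25, 0]], dtype=float)
dissim = 1.0 / (1.0 + jumps)                 # frequent jumps -> small distance
np.fill_diagonal(dissim, 0.0)
coords = classical_mds(dissim)               # one 2-D point per page
```

Plotting `coords` gives a "landscape" in which pages that users frequently jump between sit close together; repeating the embedding for each month's logs supports the anchored, month-over-month comparison described above.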

Spatial Metaphors for Retrieving Information from Very Large Catalogs

This effort began in Buffalo and has continued this year due in large part to the dissertation research of Skupin, who studied the following problem: in a catalog or gazetteer search, a user may define an initial query that returns hundreds of 'hits'. Systems present search results either as a list, or by alerting the user that the result set is too large. If a large result set could be organized so that a user could view it all at once, then query refinement might proceed by direct manipulation of a symbolic representation of the set, rather than by tagging individual items in a long list or by trial-and-error query refinement.

We have developed a graphical display, not unlike a scatter plot, in which retrieved items are represented by points. Point locations are derived from statistical clustering of keyword similarities, so that similar items are placed closer together. (In ADL, keywords can be derived from metadata records.) The displays are hierarchical: one can zoom in to the scatter plots to uncover additional items.
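The grouping step can be sketched with a Jaccard similarity over each item's keyword set and a simple single-linkage merge. The items, keywords, and threshold below are hypothetical, and the actual clustering method used for the displays may differ.

```python
def jaccard(a, b):
    """Similarity of two keyword sets: |intersection| / |union|."""
    return len(a & b) / len(a | b) if (a | b) else 0.0

def single_linkage_groups(items, threshold=0.25):
    """Merge groups whenever any cross-group pair exceeds the threshold."""
    groups = [{name} for name in items]
    merged = True
    while merged:
        merged = False
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                if any(jaccard(items[a], items[b]) > threshold
                       for a in groups[i] for b in groups[j]):
                    groups[i] |= groups.pop(j)
                    merged = True
                    break
            if merged:
                break
    return groups

# Invented catalog items with keyword sets standing in for metadata records.
items = {
    "map_A": {"california", "topography", "relief"},
    "map_B": {"california", "topography", "geology"},
    "img_C": {"landsat", "vegetation"},
    "img_D": {"landsat", "vegetation", "california"},
}
groups = single_linkage_groups(items)
```

Each resulting group can then be assigned a screen region, with within-group similarities governing point placement; zooming into a region would re-run the grouping on that subset, giving the hierarchical behavior described above.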

Initial user response to these displays has been very positive, and navigation within the scatter space is easy to learn. At present we have developed an operable solution for small data sets (under 100 items in the result set) (Skupin and Buttenfield, 1996; Skupin and Buttenfield, 1997), although the solution appears to be scalable. We are currently implementing this for Web queries. Additional work on direct manipulation interface tools is underway at Colorado (Tsou and Buttenfield, 1996; Buttenfield and Tsou, 1997).

Representing Metadata and Testing User Reactions

One of the big challenges in browsing a large archive such as ADL is recovering information about an item prior to retrieving the item itself, which for spatial data often occupies considerable disk space. The geographic community has paid a great deal of attention to graphical representations of metadata in general, and of uncertainty measures (data accuracy or precision) in particular. If users can visualize variations in data quality from the metadata record, they can judge the data's fitness for use without going through the download procedure (Beard and Buttenfield, 1997).

For spatial data, accuracy can vary substantially across a map or image file, so in most cases a single number does not capture what a user needs to know. The challenge is that we do not yet understand the extent to which users comprehend data accuracy displays. Research by two other Buffalo doctoral students focuses on effective graphical tools for displaying uncertainty (Leitner and Buttenfield, 1996; Leitner and Buttenfield, 1997).

Accomplishments of Team not Included under Research or Testbed

Collaborating with ADL staff at UCSB, the CU Team prepared a report summarizing the findings from data collected by various user evaluation methods over an eight-month period. The data included videotapes of user sessions, think-aloud and talk-aloud reports, tape recordings of patron-librarian interactions in conventional library searches, user survey analyses, and logs of Web transactions.

The Colorado team focused on analyzing surveys from the UNIX Rapid Prototype, and on transaction log data. At present, the Teams are working through the summaries to recover commonalities in research results. For example, we have found from user surveys as well as from the cognitive affect buttons (the Good, Bad and Comment Buttons on the Web) that users are alienated by some of the ADL vocabulary and jargon. Another finding that has surfaced in several studies is that users are uncomfortable with a library that does not 'exist' at a single place in time or space, and this disorienting feeling impedes their navigation through the interface. A complete copy of the joint UIE report will be ready for the May site visit.






Terence R. Smith
Thu Feb 20 13:50:53 PST 1997