|
/ modules / paradigms / Adaptor_IndivisibleConcatenation.py
SYNOPSIS
Adaptor_IndivisibleConcatenation (paradigms,
mapping=TextUtils.mappings.nonAlphanumericToWhitespace,
deleteList=TextUtils.deleteLists.keepAll)
paradigms
A list of one or more underlying paradigms.
mapping
A Python character mapping table (i.e., a string of
length 256, indexed by character code) used to process
constraint text. Defaults to
'nonAlphanumericToWhitespace', which preserves
alphanumeric characters and maps all other characters to
whitespace (i.e., to word separators). Used only when
the constraint operator is "contains-all-words".
deleteList
A string of zero or more characters to delete from
constraint text. The default is the empty string, which
keeps all characters. Used only when the constraint
operator is "contains-all-words".
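As an illustration of how 'mapping' and 'deleteList' act on constraint
text, the following is a minimal sketch, not the module's own code. It
assumes Python 3, where the closest analogue of a 256-character mapping
string is a 256-byte table passed to bytes.translate. The names
NON_ALNUM_TO_WHITESPACE, KEEP_ALL, and split_constraint_text are
hypothetical stand-ins for the TextUtils defaults and are not taken
from the source.

```python
import string

# Hypothetical stand-in for TextUtils.mappings.nonAlphanumericToWhitespace:
# a 256-entry table that keeps ASCII letters and digits and maps every
# other byte value to a space.
_ALNUM = (string.ascii_letters + string.digits).encode("ascii")
NON_ALNUM_TO_WHITESPACE = bytes(
    b if b in _ALNUM else ord(" ") for b in range(256)
)

# Hypothetical stand-in for TextUtils.deleteLists.keepAll: delete nothing.
KEEP_ALL = b""


def split_constraint_text(text, mapping=NON_ALNUM_TO_WHITESPACE,
                          delete_list=KEEP_ALL):
    """Delete characters, map the rest, then split on whitespace."""
    # Assumption: text fits in Latin-1, so each character is one byte and
    # the 256-entry table covers it. Other text raises UnicodeEncodeError.
    raw = text.encode("latin-1")
    # bytes.translate removes the bytes in delete_list first, then maps
    # the remaining bytes through the table.
    return raw.translate(mapping, delete_list).decode("latin-1").split()


print(split_constraint_text("Santa-Barbara, CA"))
print(split_constraint_text("O'Neil", delete_list=b"'"))
```

With the default arguments, punctuation acts as a word separator, so
"Santa-Barbara, CA" yields three words. Passing the apostrophe in the
delete list instead joins "O'Neil" into the single word "ONeil".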
DESCRIPTION
An adaptor that adds support for bucket-level textual searching
to a set of constituent paradigms (the "underlying" paradigms),
each of which supports some kind of textual search, by treating
a bucket-level search as a virtual search over the logical
concatenation of the constituent textual content.
This paradigm is identical to Adaptor_Concatenation except that
Adaptor_Concatenation also supports field-level search and
explicitly associates each underlying paradigm with a field,
whereas this paradigm supports bucket-level search only. As a
consequence, the underlying paradigms are specified as a simple list.
As in Adaptor_Concatenation, a bucket-level constraint (O, T),
where O is a textual operator and T is constraint text, is
handled as follows. If O is "contains-any-words" or
"contains-phrase", the constraint is passed to all underlying
paradigms and the resulting queries are UNIONed together.
Otherwise, if O is "contains-all-words", this paradigm parses T
into one or more words (W1, W2, W3, ...) by: 1) deleting from T
any characters that appear in 'deleteList'; 2) mapping the
remaining characters using 'mapping'; and 3) treating sequences
of whitespace characters as word separators. For each word W,
this paradigm then passes a new constraint (O, W) to each
underlying paradigm and UNIONs the resulting queries; the
per-word UNIONs are then INTERSECTed. If underlying paradigm i
returns query Qi(W) for word W, the overall returned query has
the form:
(Q1(W1) UNION Q2(W1) UNION Q3(W1) ...) INTERSECT
(Q1(W2) UNION Q2(W2) UNION Q3(W2) ...) INTERSECT ...
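The composition rule above can be sketched as follows. This is an
illustration under stated assumptions, not the adaptor's
implementation: queries are modeled as Python sets of bucket
identifiers rather than query objects, each underlying paradigm is a
plain callable taking (operator, text), and words are split with
str.split rather than with 'mapping' and 'deleteList'. The names
make_paradigm and bucket_level_query are hypothetical. The source does
not say what happens for other operators, so rejecting them here is
also an assumption.

```python
def make_paradigm(field_text):
    """Toy underlying paradigm over {bucket id: text of one field}."""
    def paradigm(operator, text):
        words = text.lower().split()
        hits = set()
        for bucket, content in field_text.items():
            tokens = content.lower().split()
            if operator == "contains-phrase":
                n = len(words)
                match = any(tokens[i:i + n] == words
                            for i in range(len(tokens) - n + 1))
            elif operator == "contains-any-words":
                match = any(w in tokens for w in words)
            else:  # "contains-all-words"
                match = all(w in tokens for w in words)
            if match:
                hits.add(bucket)
        return hits
    return paradigm


def bucket_level_query(paradigms, operator, text):
    """Combine per-paradigm results as described for constraint (O, T)."""
    if operator in ("contains-any-words", "contains-phrase"):
        # Pass the constraint through unchanged and UNION the results.
        return set().union(*(p(operator, text) for p in paradigms))
    if operator == "contains-all-words":
        words = text.split()  # simplified word parsing (assumption)
        if not words:
            raise ValueError("no query words specified")
        # One UNION per word, then INTERSECT the per-word UNIONs.
        per_word = [set().union(*(p(operator, w) for p in paradigms))
                    for w in words]
        return set.intersection(*per_word)
    raise ValueError("unsupported operator: " + operator)  # assumption


titles = make_paradigm({1: "santa barbara map", 2: "goleta map",
                        3: "goleta chart"})
abstracts = make_paradigm({1: "coastal survey",
                           2: "santa barbara county survey",
                           3: "coastal notes"})

print(bucket_level_query([titles, abstracts],
                         "contains-all-words", "goleta coastal"))
```

In this toy data, bucket 3 is the only match for "goleta coastal":
"goleta" is found only through the titles paradigm and "coastal" only
through the abstracts paradigm, so neither paradigm alone would match,
but the logical concatenation of the two does.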
Exceptions thrown:
no query words specified
AUTHOR
Greg Janee
gjanee@alexandria.ucsb.edu
HISTORY
$Log: Adaptor_IndivisibleConcatenation.py,v $
Revision 1.2 2003/12/15 23:54:17 peter
Modified source code documentation so that it formats properly when
creating HTML documents with happydoc.
Revision 1.1 2003/12/08 23:32:56 valentin
update to oct2003
Revision 1.1 2003/10/21 20:04:57 gjanee
Initial revision
|