ADL_query_translator modules/paradigms/Textual_LikeDelimitedSubstring.py


 SYNOPSIS

     Textual_LikeDelimitedSubstring (table, idColumn, textColumn,
         delimiter, cardinality,
         mapping=TextUtils.mappings.uppercaseAlphanumericOthersToWhitespace,
         deleteList=TextUtils.deleteLists.keepAll, function=None)

         table
             A table to query, e.g., "holding".

         idColumn
             The table's object identifier column (i.e., the column
             to be selected), e.g., "holding_id".

         textColumn
             The table column containing the text to search over
             (i.e., the column against which the constraint is to be
             placed), e.g., "subject_text".

         delimiter
             A single character that serves to delimit words in
             'textColumn', e.g., "^".

         cardinality
             A Cardinality object representing the cardinality of
             'table' with respect to 'textColumn'.

         mapping
             A Python character mapping table (i.e., a string of
             length 256, indexed by ASCII character code) used to
             process constraint text.  Defaults to
             'uppercaseAlphanumericOthersToWhitespace', which maps
             alphanumeric characters to their uppercase equivalents
             and all other characters to whitespace (i.e., to word
             separators).

         deleteList
             A string of zero or more characters to delete from
             constraint text.  Defaults to 'keepAll' (the empty
             string), which keeps all characters.

         function
             A function to apply to 'textColumn' (e.g., "UPPER"), or
             None.  Defaults to None.

 DESCRIPTION

     Translates a textual constraint to a boolean combination of one
     or more substring matches using SQL LIKE operators.

     This paradigm assumes that the text in 'textColumn' has been
     encoded such that words are delimited by a common delimiter
     character and phrases are separated by two or more delimiter
     characters.  For example, assuming the delimiter character is
     "^", a column value containing the two phrases "I am Sam" and
     "Sam I am" would be encoded as:

         ^I^am^Sam^^Sam^I^am^
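
     The encoding above can be reproduced with a short sketch.  The
     helper name 'encode_phrases' is hypothetical and is not part of
     this module; it only illustrates the layout the paradigm
     expects, assuming words within a phrase are separated by
     whitespace before encoding.

```python
def encode_phrases(phrases, delimiter="^"):
    """Encode phrases so that words are separated by one delimiter
    and adjacent phrases by two (hypothetical helper)."""
    encoded = []
    for phrase in phrases:
        words = phrase.split()
        # Each phrase is wrapped in delimiters; concatenating two
        # wrapped phrases yields the doubled delimiter between them.
        encoded.append(delimiter + delimiter.join(words) + delimiter)
    return "".join(encoded)


print(encode_phrases(["I am Sam", "Sam I am"]))  # ^I^am^Sam^^Sam^I^am^
```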

     Given a textual constraint (B, O, T) where B is a textual
     bucket, O is one of the standard textual operators, and T is a
     text string, this paradigm parses T into a sequence of one or
     more words (W1, W2, W3, ...) by: 1) deleting from T any
     characters that appear in 'deleteList'; 2) mapping the remaining
     characters using 'mapping'; and 3) treating sequences of
     whitespace characters as word separators.  The paradigm then
     returns one of the following queries (we use "^" here to
     represent the delimiter character).  If O is
     "contains-all-words":

         SELECT idColumn FROM table
             WHERE textColumn LIKE '%^W1^%' AND
                   textColumn LIKE '%^W2^%' AND
                   textColumn LIKE '%^W3^%' ...

     If O is "contains-any-words":

         SELECT idColumn FROM table
             WHERE textColumn LIKE '%^W1^%' OR
                   textColumn LIKE '%^W2^%' OR
                   textColumn LIKE '%^W3^%' ...

     If O is "contains-phrase":

         SELECT idColumn FROM table
             WHERE textColumn LIKE '%^W1^W2^W3^...^%'
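
     The parsing steps and the three query forms above can be
     sketched as follows.  This is an illustration written for
     Python 3, not the module's actual code: 'make_default_mapping',
     'parse_words' and 'build_query' are hypothetical names, the
     mapping is an ASCII-only stand-in for
     'uppercaseAlphanumericOthersToWhitespace', the exception type is
     an assumption (only its message appears in this document), and
     no LIKE escaping is attempted.

```python
import string


def make_default_mapping():
    """ASCII-only stand-in for uppercaseAlphanumericOthersToWhitespace."""
    keep = string.ascii_letters + string.digits
    # str.translate accepts a dict keyed by code point; characters
    # above 255 are not in the dict and pass through unchanged.
    return {code: (chr(code).upper() if chr(code) in keep else " ")
            for code in range(256)}


def parse_words(text, mapping, delete_list=""):
    """Steps 1-3: delete, map, then split on runs of whitespace."""
    kept = "".join(ch for ch in text if ch not in delete_list)
    return kept.translate(mapping).split()


def build_query(table, id_column, text_column, delimiter, operator, words):
    """Return the SELECT statement for one textual operator."""
    if not words:
        # Exception type is assumed; the message is from the docs.
        raise ValueError("no query words specified")
    d = delimiter
    if operator == "contains-phrase":
        where = f"{text_column} LIKE '%{d}{d.join(words)}{d}%'"
    elif operator in ("contains-all-words", "contains-any-words"):
        joiner = " AND " if operator == "contains-all-words" else " OR "
        where = joiner.join(f"{text_column} LIKE '%{d}{w}{d}%'" for w in words)
    else:
        raise ValueError(f"unsupported operator: {operator}")
    return f"SELECT {id_column} FROM {table} WHERE {where}"


words = parse_words("Sam, I am!", make_default_mapping())
print(build_query("holding", "holding_id", "subject_text", "^",
                  "contains-phrase", words))
```

     Because the default mapping uppercases the constraint text, the
     stored text must also be uppercase (or be uppercased at query
     time) for these patterns to match on a case-sensitive database.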

     If a text column function is specified (e.g., "UPPER"), the
     returned query will have the form:

         SELECT idColumn FROM table
             WHERE UPPER(textColumn) LIKE ...
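
     A minimal sketch of how the 'function' argument might change the
     left-hand side of each LIKE term.  The helper name
     'column_expression' is hypothetical; the module's own code may
     be organized differently.

```python
def column_expression(text_column, function=None):
    """Wrap the column in an SQL function call when one is given
    (hypothetical helper)."""
    if function is None:
        return text_column
    return f"{function}({text_column})"


print(column_expression("subject_text"))           # subject_text
print(column_expression("subject_text", "UPPER"))  # UPPER(subject_text)
```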

     Under certain circumstances the query

         SELECT idColumn FROM table
             WHERE 1 = 0

     may be returned; it matches no rows.

     The semantics of the "contains-all-words" operator will
     generally be correct only if the cardinality is "1" or "1?".  If
     the cardinality is "0+" or "1+", wrap this paradigm in an
     Adaptor_IndivisibleConcatenation paradigm.
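
     One plausible reading of this caveat can be demonstrated with
     Python's standard 'sqlite3' module.  The sketch assumes that a
     cardinality of "1+" means an object may have several rows in
     'table', one per text value; the sample data is invented for
     illustration.  Because the AND-ed LIKE terms are evaluated
     against one row at a time, an object whose words are spread
     across different rows is not returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE holding (holding_id INTEGER, subject_text TEXT)")
conn.executemany(
    "INSERT INTO holding VALUES (?, ?)",
    [
        (1, "^SANTA^BARBARA^MAPS^"),  # both words in a single row
        (2, "^SANTA^BARBARA^"),       # holding 2: words split ...
        (2, "^MAPS^"),                # ... across two rows
    ],
)

# The "contains-all-words" form for the words SANTA and MAPS.
query = (
    "SELECT holding_id FROM holding "
    "WHERE subject_text LIKE '%^SANTA^%' AND subject_text LIKE '%^MAPS^%'"
)
matches = [row[0] for row in conn.execute(query)]
print(matches)  # [1]
```

     Holding 2 contains both words overall but is missed, which is
     the kind of mismatch the wrapping adaptor is presumably meant to
     address.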

     Exceptions raised:

         no query words specified

 AUTHOR

     Greg Janee
     gjanee@alexandria.ucsb.edu

 HISTORY

     $Log: Textual_LikeDelimitedSubstring.py,v $
     Revision 1.4  2003/10/21 20:34:37  gjanee
     Minor (but critical) documentation change.

     Revision 1.3  2003/01/29 21:13:50  gjanee
     Recoded slightly to take advantage of new paradigm convenience
     functions.

     Revision 1.2  2003/01/24 04:14:12  gjanee
     Minor update to conform to the "transparent immutable objects"
     programming model.  Fixed an obscure bug.

     Revision 1.1  2002/10/31 22:32:13  gjanee
     Initial revision

Imported Modules   

import UniversalTranslator
import edu.ucsb.adl.middleware
import paradigms
import string
import types

Classes   

Textual_LikeDelimitedSubstring


This document was automatically generated Thu Mar 4 12:45:30 2004 by HappyDoc version WORKING