Re: Attribute Architecture -- new type?

On Fri, Aug 15, 2003 at 12:58:57PM +0100, Robert Sanderson wrote:
> access=title
> comparison=any
> format=string
> Term=Utz Jaws Skyscraper
> 
> If we allow any to be applied to a single term rather than being an error, 
> then is this a single, whacked out, term or is it 3 terms?

I would have said 1 term, because 'format=string' to me would say
to treat the Term as a single string. If it was 'format=coordinate'
then 'Term=12,41 54,41 52,54' would have been 3 terms.

> > Would knowing the
> > number of terms change how the parser did its job? If parsing words
> 
> In the above case, yes, it would split 1 string in to 3 strings.

How would it know to split into 3 strings? Does comparison=any + format=string
mean split terms on whitespace? Or does it mean at word boundaries?
(Words may be separated by punctuation without any whitespace.)

I am just trying to clarify in my own mind what you are suggesting. I
think you are suggesting (2) and I am suggesting (3), but here are the
alternative interpretations I can see.

(1) Use of any/all/adj implies the query term consists of words, irrespective
    of what format= says (ie: comparison=any + format=string means input
    string is words, not a single string).

(2) Use of any/all/adj as a comparison means split the term on white space
    (no connotations of words - just split on whitespace). Then treat each
    value separated by whitespace as input into the normal comparison
    process.

(3) Use of any/all/adj does not affect how to extract multiple terms from
    the query string. format=string alone does this. If multiple terms are
    extracted from the query string, then any/all/adj kicks in. (Otherwise
    all 3 are identical in their behaviour - the single term must match.)

(Let me know if I have captured the semantics you are proposing correctly.)
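
A rough sketch of the three readings in Python may make the difference
easier to see. Everything here is made up for illustration: the function
names, the 'interpretation' switch, and the regex used as a stand-in word
parser. None of it comes from the spec. For (2), each whitespace-separated
piece would still go through normal format= processing, which is not
modelled here.

```python
import re

MULTI = ("any", "all", "adj")

def by_format(term, fmt):
    # Assumed stand-in: format=word extracts runs of word characters,
    # any other format (e.g. string) keeps the term whole.
    return re.findall(r"\w+", term) if fmt == "word" else [term]

def extract_terms(term, comparison, fmt, interpretation):
    """Return the terms extracted under reading (1), (2) or (3)."""
    if interpretation == 1:
        # (1): any/all/adj forces word parsing, whatever format= says.
        return by_format(term, "word" if comparison in MULTI else fmt)
    if interpretation == 2:
        # (2): any/all/adj splits on whitespace only, no notion of words.
        return term.split() if comparison in MULTI else [term]
    if interpretation == 3:
        # (3): format= alone decides; the comparison plays no part here.
        return by_format(term, fmt)
    raise ValueError("interpretation must be 1, 2 or 3")

for n in (1, 2, 3):
    print(n, extract_terms("Utz Jaws Skyscraper", "any", "string", n))
```

Under this sketch, (1) and (2) both give three terms for the example
above, and (3) gives one.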

I dislike (1) because any/all/adj overrides the format attribute.
I don't think you are proposing this. (2) is a valid option (more below).
(3) is what I had been pushing.

The implication of (2) is that, before the query string is changed into
terms, the parser has to check whether any/all/adj is in effect and, if
so, split the query string on whitespace and then do the processing
as if 3 query strings had been supplied, joined by AND/OR/PROX nodes.
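
The rewrite that (2) implies could be sketched like this in Python. The
dict-based tree, the name 'rewrite' and the any->OR, all->AND, adj->PROX
mapping are assumptions for illustration. Which comparison each resulting
leaf should carry is left open, since nothing above settles it.

```python
# Assumed mapping from the comparison to the joining boolean node.
OPERATOR = {"any": "OR", "all": "AND", "adj": "PROX"}

def rewrite(access, comparison, fmt, term):
    """Expand one any/all/adj clause into whitespace-split leaves."""
    pieces = term.split() if comparison in OPERATOR else [term]
    # The comparison on each leaf is deliberately omitted (unspecified).
    leaves = [{"access": access, "format": fmt, "term": p} for p in pieces]
    if len(leaves) == 1:
        # A single piece needs no joining node.
        return leaves[0]
    return {"op": OPERATOR[comparison], "operands": leaves}

tree = rewrite("title", "any", "string", "Utz Jaws Skyscraper")
print(tree["op"], [leaf["term"] for leaf in tree["operands"]])
```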

But what if format=word is specified and the word parser splits on hyphens,
so that 'book-case' has two terms extracted ('book' and 'case')?
What does 'any' (and 'all' and 'adj') mean with

    access=title
    comparison=any
    format=word
    Term=child's book-case 

versus

    access=title
    comparison=any
    format=string
    Term=child's book-case 

Does the first mean the title must contain 'child's' or 'book' or 'case'?
Does the second mean the title must equal 'child's' or 'book-case'?
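
To make the two candidate splits concrete, here is a small Python sketch.
Both helpers are hypothetical: 'word_parse' stands in for a word parser
that breaks on whitespace and hyphens but keeps apostrophes, and
'whitespace_split' is the plain whitespace split from (2). A real server's
word parser may well behave differently.

```python
import re

def word_parse(text):
    # Hypothetical word parser: break on whitespace and hyphens,
    # keep apostrophes inside a word, drop empty pieces.
    return [w for w in re.split(r"[\s-]+", text) if w]

def whitespace_split(text):
    # Plain whitespace split, with no notion of words.
    return text.split()

term = "child's book-case"
print(word_parse(term))        # candidate terms for format=word
print(whitespace_split(term))  # candidate terms for format=string under (2)
```

With these assumptions the first split yields 'child's', 'book' and 'case',
and the second yields 'child's' and 'book-case'.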

Just trying to work out the exact semantics of what you are suggesting.
I do not agree or disagree yet - I am not sure what text you would put
into the spec to describe the behaviour.

> > from a string, how would the client know the exact parsing rules
> > the server is going to use? (eg: book-case - how many terms?)
> 
> It could just say 'multiple'.  But the client knows (in theory) what the
> Term means, at least to the point that it was a single term or multiple
> terms.  Exact number of terms is probably too much, so 
> null/single/multiple/unknown is a better division.

I agree that if this route is taken, removing 'exact number' is better.

Alan

Received on Sunday, 17 August 2003 23:58:52 UTC