[Bug 4715] [FT] editorial: 3.5.3 Distance Selection

http://www.w3.org/Bugs/Public/show_bug.cgi?id=4715


jmdyck@ibiblio.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |




------- Comment #4 from jmdyck@ibiblio.org  2008-02-16 09:25 -------
(In reply to comment #3)
> [2] Resolved by removing "adjacent" and "consecutive" where relevant in the
> document.

Changing "adjacent tokens [etc]" to "no intervening tokens [etc]" doesn't
resolve the problem, because "intervening" is no more defined than "adjacent"
was.
   [2a] Given 2 matches, what does it mean for there to be no intervening
        tokens/sentences/paragraphs?
   [2b] Given n>2 matches, what does it mean?

> [1] No change. Decided not to move the bullet because this is the first place
> that the concept of distance arises and it is appropriate to place this
> sentence here.

I disagree on both counts. How one computes distances in the search context is
one thing, and how one expresses conditions on such distances is another, and
there's no need to jumble them up.

More constructively, I suggest the following.

(1) Take:

    A distance selection may cross element boundaries when computing distance.

and merge it with the later sentence:

    The distances computed by a distance selection are not affected by the
    presence or absence of element boundaries in the text being searched.

The two are basically the same. The latter is perhaps slightly more
informative, so you could just drop the former.

(2) Take:

    The following rule applies to the computation of distance:
    o Zero words (sentences, paragraphs) means no intervening tokens
      (sentences, paragraphs).

Reword it to something like:

    A distance of zero words (...) means ...

And add it to the "distances computed" para.

(3) Add sentences to answer [2a+b] above.

(4) If you like, move the whole para on computing distances earlier. It would
fit roughly where the "Distance is specified" sentence is. E.g.:

    ... matched tokens and phrases satisfy the specified distance conditions.

    Distances in the search context are measured in units of tokens,
    sentences, or paragraphs. Roughly speaking, the distance between
    two matches is the number of intervening units, so a distance of
    zero tokens (...) means no intervening tokens (...).
    More precisely, ...
    {sentence re element boundaries}
    {sentence re stop words}

    An FTDistance expresses a distance condition in terms of an FTUnit
    and an FTRange. An FTUnit can be <code>words</code>,
    <code>sentences</code>, or <code>paragraphs</code>, where
    <code>words</code> refers
    to token distances.  An FTRange specifies a range of integer values,
    providing a minimum and maximum value for the distance in question.
    Each one of ...
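
To make the "number of intervening units" reading concrete, here is a minimal
sketch in Python of one possible interpretation. It is not spec text and not
the working group's definition. Every name in it (token_distance,
pairwise_distances, satisfies) is hypothetical, and it assumes that a match is
an inclusive (start, end) pair of token positions, that overlapping matches
have distance zero, and that for n>2 matches the condition is applied to each
pair of neighbouring matches in document order. Those assumptions are exactly
the points that [2a] and [2b] ask the document to settle.

```python
def token_distance(a, b):
    """Count the tokens strictly between two matches.

    Each match is an inclusive (start, end) pair of token positions.
    Assumption: overlapping or touching matches have distance 0.
    """
    first, second = sorted([a, b])
    return max(0, second[0] - first[1] - 1)


def pairwise_distances(matches):
    """Distances between neighbouring matches in document order.

    Assumption: for n>2 matches, only neighbouring pairs are compared.
    """
    ordered = sorted(matches)
    return [token_distance(x, y) for x, y in zip(ordered, ordered[1:])]


def satisfies(matches, minimum, maximum):
    """True if every neighbouring distance lies in [minimum, maximum]."""
    return all(minimum <= d <= maximum for d in pairwise_distances(matches))


# Tokens: the(0) quick(1) brown(2) fox(3)
quick, fox = (1, 1), (3, 3)
print(token_distance(quick, fox))   # one intervening token: "brown"
print(satisfies([quick, fox], 0, 0))
```

Under this reading, "quick" and "fox" are at a distance of one token, so a
condition of exactly zero words is not satisfied, while "quick" and "brown"
would be at distance zero. A different choice for n>2 (say, measuring from the
first match to the last) would give different answers, which is why the
document needs to say which one it means.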

Received on Saturday, 16 February 2008 09:25:24 UTC