Re: Rough Draft of Robust Anchoring: the RangeFinder API

Hi, Liam–

Thanks for the use cases.

I'm sorry for being dense, but I'm not sure how this fits in with the 
RangeFinder API.

Both of these cases are about using XPath to locate multiple ranges in a 
single pass, while RangeFinder is an iterative API that finds one range 
at a time, within a particular scope of the document tree, with an 
optional initial starting point (hence the CSS or XPath selector).

I'm not an expert in XPath, so I'm also not sure how to interpret your 
examples absent markup examples to apply them to.


That being said, here's a quick reaction to the prose aspects of the use 
cases:

1) Find (annotate) all cells in which the net revenue is negative: in 
this case, with the RangeFinder API, you'd narrow the scope to the table 
element, look for instances of the minus sign, then use a regex in JS to 
check whether each one is followed by a number. If you were looking for a 
specific negative number, that would be more straightforward. I 
considered adding some sort of "wildcard/regex" syntax to the search 
string component, but was discouraged from doing that, for performance 
reasons; it might still be a worthwhile idea to explore.
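
As a rough sketch of that last step only (this is not RangeFinder itself; 
the function name and the sample cell strings are made up for 
illustration), the JS check might look something like this, assuming you 
already have the text of each cell in scope:

```javascript
// Hypothetical helper, not part of the RangeFinder draft: given the text of
// one table cell, report whether it contains a minus sign followed by a number.
// Accepts the ASCII hyphen-minus (-) and the Unicode minus sign (U+2212),
// with optional whitespace and a currency symbol between sign and digits.
function hasNegativeNumber(cellText) {
  return /[-\u2212]\s*[$€£]?\s*\d/.test(cellText);
}

// Made-up sample cell contents, standing in for the text of each matched cell.
const cells = ["1,204", "-350", "\u2212 $12.5", "n/a", "pre-tax"];

// Keep only the cells that look like negative amounts.
const negatives = cells.filter(hasNegativeNumber);
console.log(negatives);
```

This is deliberately naive: it would also match a range like "10-20", and 
it ignores the accounting convention of writing negatives in parentheses, 
so a real check would need to know how the table formats its numbers.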

2) Find all students whose tutor is not listed: this sort of operation 
could be done in a manner similar to the example above (finding 
instances of the student's name, then looking for related course 
information in JS by scanning the DOM, assuming you know the DOM 
structure); but this is not really the point of RangeFinder. It's not 
intended as a generic pattern matcher, but rather as a narrowly-focused 
API to find instances of text, or other known ranges, with some ability 
to apply fuzzy logic around location in the document, text edit 
distance, and a few other factors.
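
To make "text edit distance" a bit more concrete, here is a minimal 
sketch. I'm assuming plain Levenshtein distance purely for illustration; 
nothing in this thread fixes the metric, and the function name is 
hypothetical:

```javascript
// Hypothetical helper: classic Levenshtein edit distance between two strings,
// i.e. the minimum number of single-character insertions, deletions, and
// substitutions needed to turn `a` into `b`.
function editDistance(a, b) {
  // prev[j] = distance between the first (i - 1) chars of a and the first j chars of b.
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i]; // turning i chars of a into the empty string takes i deletions
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // delete a[i - 1]
        curr[j - 1] + 1,    // insert b[j - 1]
        prev[j - 1] + cost  // substitute, or keep if the chars match
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// A lightly edited quote is still "close" to the originally anchored text.
console.log(editDistance("net revenue", "net revenues"));
```

A fuzzy matcher could then accept a candidate range whose distance from 
the original text falls under some threshold, ideally weighted by how far 
the candidate has moved in the document.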

The functionality you're describing sounds interesting, but it seems 
like a different technology; in fact, since you already have a solution 
in XPath, is anything else needed to solve your use case?


As a side note regarding XPath, I'm most interested in the robust/fuzzy 
aspects that I understand were left out of XPath, but which were under 
consideration; can you share any info on that?

Regards–
–Doug

On 3/24/15 7:36 PM, Liam R. E. Quin wrote:
> On Wed, 2015-02-25 at 00:48 -0500, Doug Schepers wrote:
>> Hi, folks–
>>
>> Just a quick note. Rob asked me to move this file, to keep the
>> deliverables organized. It's now located at:
>>
>>    http://w3c.github.io/web-annotation/api/rangefinder/
>
> And now at https://specs.webplatform.org/rangefinder/w3c/master/
>
> I promised Doug at least a couple of use cases for the XPath
> selector. I can write them up in more detail if they're felt to be
> reasonable.
>
> (1) consider a table such as a profit/loss statement in an annual
> report; let's annotate all cells in which the net revenue is negative.
> The XPath expression might be something like
>      //table[@id = 'profit-and-loss']//th[. = 'Net Revenue']/following-sibling::td[. < 0]
>
> (2) Find all students whose tutor is not listed:
>
>    //li[@class = 'student']
>        [
>           [@class='tutor']
>           [
>              not(//li[@class='tutor']/@id = concat('#', @href))
>           ]
>        ]
>
> These are both fairly complex examples in the spirit of "make the easy
> easy and the complex possible". Note that any identifier pointing at
> actual text will not be possible with CSS selectors, although a
> combination of selectors and byte ranges within a containing element
> can be used. But there should also be a checksum and/or text
> comparison in case the wrong text is highlighted, of course.
>
> Hope this helps. I have both simpler and more complex examples of
> course, if needed.
>
> Liam
>
>
>>
>> Even this is a temporary location, though... I'll be moving it to
>> specs.webplatform.org soon, and adding the annotation capability to
>> it.
>>
>> Feel free to review, but be aware that the URL is transitory.
>>
>> Regards–
>> –Doug
>>
>> On 2/24/15 1:33 PM, Doug Schepers wrote:
>>> Hi, folks–
>>>
>>> After talking about Robust Anchoring with many people over the
>>> course of the last couple years (!), with encouragement and good
>>> criticisms, I've refined my notion of what's needed for a client-
>>> side API for Robust Anchoring.
>>>
>>> I've drawn up a strawman of my current thinking for an API called
>>> RangeFinder [1].
>>>
>>> It's very rough in places, but I'd appreciate any feedback on the
>>> spec as it stands. I'd greatly appreciate any thoughts or opinions
>>> on it at this stage.
>>>
>>> I'm not sure it's mature enough for this yet, but at some point,
>>> I'd like to engage the research and academic communities and the
>>> experts who've published on text search algorithms, to polish this
>>> up and make it not quite as embarrassing as it is currently. If
>>> anyone knows who we should contact in that regard, please chime
>>> in. This is a great opportunity to leverage all that research in
>>> the service of Web developers and browsers!
>>>
>>> [1] http://w3c.github.io/web-annotation/rangefinder-api/
>>>
>>> Regards–
>>> –Doug
>>>
>>
>>
>

Received on Wednesday, 25 March 2015 05:46:46 UTC