Re: FindText API Updated Editor's Draft from Doug Schepers on 2015-10-06 (public-annotation@w3.org from October 2015)

From: Doug Schepers <schepers@w3.org>
Date: Tue, 6 Oct 2015 11:26:30 -0400
To: Ivan Herman <ivan@w3.org>
Cc: W3C Public Annotation List <public-annotation@w3.org>
Message-ID: <5613E826.7020406@w3.org>

Hi, Ivan–

On 10/6/15 10:47 AM, Ivan Herman wrote:
> Hey Doug,
>
> After a first read, I have two questions/comments.
>
> - (This is minor:) the idea of using an edit distance for
> suffix/prefix is great. However: the way you specify the (maximal)
> edit distance is through a number, ie, the number of editing steps.
> However, shouldn't this edit distance limit be expressed (or at least
> alternatively express) through a percentage of the editing distance
> over the size of the suffix/prefix? I mean: if the suffix is 4
> characters long, then an edit distance of 3 is significant, whereas
> the same distance is insignificant if the suffix is 100 characters
> long. Would a percentage be a good alternative?

I'd considered that, but actually, it's very easy for the developer to 
develop their own threshold, whether it's a percentage or some other 
consideration, based on the target strings.

For instance, if you have decided that 5% is the absolute maximum edit 
distance, then you multiply the length of the target string by 0.05 and 
floor it. So, a 100-character string gets a maximum edit distance of 5, a 
20-character string gets 1, and any shorter string gets 0 (edit distance 
operations can't be fractions).
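
As a rough sketch of that arithmetic (the function name and the percent 
parameter are made up for illustration and are not part of the spec):

```javascript
// Hypothetical helper, not part of the FindText draft.
// Converts a percentage limit into a whole number of edit operations.
function maxEditDistance(target, percent) {
  // Multiply before dividing so the intermediate value stays an exact integer,
  // then floor, because an edit distance cannot be fractional.
  return Math.floor((target.length * percent) / 100);
}

console.log(maxEditDistance("x".repeat(100), 5)); // 100-character string
console.log(maxEditDistance("x".repeat(20), 5));  // 20-character string
console.log(maxEditDistance("x".repeat(19), 5));  // anything shorter
```

The result would then be passed in as the numeric edit distance for the 
prefix or suffix.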

This way, you could even apply a non-linear pattern to your edit 
distance criteria, for example, or any number of other patterns that we 
don't have to bake in.
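
One possible non-linear rule, sketched here purely as an assumption (the 
name, the percentage, and the cap are all invented for the example), is a 
percentage that stops growing once it reaches a fixed ceiling:

```javascript
// Hypothetical helper, not part of the FindText draft.
// Percentage-based limit that is capped, so very long strings
// do not get an unboundedly large edit distance.
function cappedEditDistance(target, percent, cap) {
  const proportional = Math.floor((target.length * percent) / 100);
  return Math.min(proportional, cap);
}

console.log(cappedEditDistance("x".repeat(100), 5, 10));  // below the cap
console.log(cappedEditDistance("x".repeat(1000), 5, 10)); // held at the cap
```

Any other curve (square root, step function, and so on) would slot in the 
same way, since the API only ever sees the final number.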


> - (This may be major, but may simply be a result of my own
> ignorance:) I have read about, and actually used in a simple setting,
> Promises, but they still twist my mind, I must admit. One thing that
> seems to be fairly complex when using Promises is when one has to
> create cycles using them, primarily when the number of steps in the
> cycle is unknown in advance. On the other hand, using the search()
> method in the current spec would require exactly that: you do some
> sort of an iterative go through the search results. Maybe there is an
> easy way to express that with promises which I simply do not know,
> but if this really is complex then what this tells me is that the
> searchAll() might become the method of choice (and one could then run
> a traditional cycle on the results). There are, obviously,
> performance issues, though.

Promises mess me up, too. And you're right.

That's why I sought help from Alex, Bill, and Chris… (and Doug, and 
E-van Herman, and F…orget it… bad joke…), who helped me understand it a 
bit better.

As Bill mentioned, they suggested that I change the API to have a 
searchAll() method, which seems reasonable. I've incorporated some of 
their suggestions, and I imagine that with implementer feedback it will 
improve dramatically over the next few months.



BTW, I've started to put out the call to see if we can find someone 
who'll create a polyfill, for faster iteration on the spec. Since part 
of my spec is loosely based on the Hypothes.is robust anchoring code 
(which in turn is based on some previous experiments and serious 
academic research), it should be easy to pull out that (open source) JS 
code and create a wrapper that matches the spec.


> B.t.w., I believe that the example:
>
> var rf = new FindText({ text: "Rage, rage" }); var result =
> rf.search(); // result is 1st instance of string result =
> rf.search(); // result is 2nd instance of string result =
> rf.search(); // result is 3rd instance of string, the target
> instance
>
> would not work, exactly for this reason. Each rf.search() returns a
> Promise, ie, one has to use a rf.search().when(function{…}) pattern
> for each entry, and it is not clear in my mind how the iteration
> materializes in the code.

Yeah, as I mentioned (and as is noted in the spec), the examples are all 
completely out of date and wrong at this point. That's what I'm fixing 
today.
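
To make the iteration question concrete, here is a minimal sketch of 
stepping through a Promise-returning search() when the number of matches 
isn't known in advance. It does not use the real FindText interface: 
FakeFinder is a stand-in invented for this example, and the idea that 
search() resolves with null once the matches run out is an assumption, 
not something the draft says.

```javascript
// Hypothetical stand-in for the draft API; NOT the real FindText interface.
class FakeFinder {
  constructor(haystack, needle) {
    this.haystack = haystack;
    this.needle = needle;
    this.position = 0; // where the next search() call resumes
  }

  // Resolves with the index of the next match,
  // or with null when there are no more matches (assumed behaviour).
  search() {
    const index = this.haystack.indexOf(this.needle, this.position);
    if (index === -1) return Promise.resolve(null);
    this.position = index + this.needle.length;
    return Promise.resolve(index);
  }
}

// Plain Promise chaining: each step schedules the next one
// until search() reports that nothing is left.
function collectAll(finder, found = []) {
  return finder.search().then((index) =>
    index === null ? found : collectAll(finder, found.concat(index))
  );
}

// The same idea with async/await, stopping at the nth match.
// Resolves with null if there are fewer than n matches.
async function nthMatch(finder, n) {
  let index = null;
  for (let i = 0; i < n; i++) {
    index = await finder.search();
    if (index === null) break;
  }
  return index;
}

const text = "Rage, rage. Rage, rage. Rage, rage.";
collectAll(new FakeFinder(text, "Rage, rage")).then((all) => console.log(all));
nthMatch(new FakeFinder(text, "Rage, rage"), 3).then((third) => console.log(third));
```

The chained version is the "cycle of unknown length" case; the async/await 
version reads more like the traditional loop in the old example.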


> Apologies if I am completely wrong in terms of these Promises...

Nope, I think you were right.

Regards–
–Doug
Received on Tuesday, 6 October 2015 15:26:33 UTC