Re: FindText API Updated Editor's Draft from Ivan Herman on 2015-10-06 (public-annotation@w3.org from October 2015)

From: Ivan Herman <ivan@w3.org>
Date: Tue, 6 Oct 2015 16:47:45 +0200
To: Doug Schepers <schepers@w3.org>
Cc: W3C Public Annotation List <public-annotation@w3.org>
Message-Id: <DCA59A08-4FD9-4A4C-A32E-B49EC23AA288@w3.org>

Hey Doug,

After a first read, I have two questions/comments.

- (This is minor:) The idea of using an edit distance for suffix/prefix is great. However, the (maximal) edit distance is specified as a number, i.e., the number of editing steps. Shouldn't this limit be expressed (or at least alternatively expressible) as a percentage, i.e., the edit distance relative to the size of the suffix/prefix? I mean: if the suffix is 4 characters long, then an edit distance of 3 is significant, whereas the same distance is insignificant if the suffix is 100 characters long. Would a percentage be a good alternative?

- (This may be major, but may simply be a result of my own ignorance:) I have read about Promises, and actually used them in a simple setting, but they still twist my mind, I must admit. One thing that seems fairly complex with Promises is creating loops with them, particularly when the number of iterations is not known in advance. Yet using the search() method in the current spec would require exactly that: one iterates through the search results one by one. Maybe there is an easy way to express that with Promises which I simply do not know, but if this really is complex, then it tells me that searchAll() might become the method of choice (one could then run a traditional loop over the results). There are, obviously, performance issues, though.
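
On the first point, a rough sketch of what a relative limit could look like. This is only an illustration: editDistance() is a plain Levenshtein implementation, and withinRelativeDistance() and its maxRatio parameter are hypothetical names that do not appear in the draft.

```javascript
// Classic dynamic-programming Levenshtein distance:
// the minimum number of insertions, deletions and substitutions.
function editDistance(a, b) {
  var prev = [];
  for (var j = 0; j <= b.length; j++) prev[j] = j;
  for (var i = 1; i <= a.length; i++) {
    var curr = [i];
    for (var k = 1; k <= b.length; k++) {
      var cost = a.charAt(i - 1) === b.charAt(k - 1) ? 0 : 1;
      curr[k] = Math.min(
        prev[k] + 1,        // deletion
        curr[k - 1] + 1,    // insertion
        prev[k - 1] + cost  // substitution (or match)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical helper (not part of the draft): accept a candidate when
// the edit distance, divided by the length of the expected prefix/suffix,
// does not exceed maxRatio (e.g. 0.25 for 25%).
function withinRelativeDistance(expected, candidate, maxRatio) {
  if (expected.length === 0) return candidate.length === 0;
  return editDistance(expected, candidate) / expected.length <= maxRatio;
}

withinRelativeDistance("abcd", "axyz", 0.25); // false: 3 edits out of 4 characters
```

With a ratio, the same three edits are rejected for a 4-character suffix but accepted for a 100-character one, which is the behaviour I was asking about.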

B.t.w., I believe that the example:

var rf = new FindText({ text: "Rage, rage" });
var result = rf.search(); // result is 1st instance of string
    result = rf.search(); // result is 2nd instance of string
    result = rf.search(); // result is 3rd instance of string, the target instance

would not work, exactly for this reason. Each rf.search() returns a Promise, i.e., one has to use an rf.search().then(function (result) {…}) pattern for each entry, and it is not clear to me how the iteration would be expressed in the code.
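
One pattern that might work is a function that chains itself from inside then(). The sketch below is not the draft's API: FakeFinder is a stand-in written only for this illustration, and it assumes (the draft does not say so) that search() resolves to null once there are no further matches.

```javascript
// Stand-in for the draft's FindText object, for illustration only.
// Each call to search() returns a Promise for the next match, or for
// null when the text is exhausted (an assumption, not from the draft).
function FakeFinder(text, query) {
  this.text = text;
  this.query = query;
  this.pos = 0;
}
FakeFinder.prototype.search = function () {
  // Treat an empty query as "no match" so the position always advances.
  if (this.query.length === 0) return Promise.resolve(null);
  var index = this.text.indexOf(this.query, this.pos);
  if (index === -1) return Promise.resolve(null);
  this.pos = index + this.query.length;
  return Promise.resolve({ start: index, end: this.pos });
};

// Loop with an unknown number of steps: each then() callback either
// stops (returns the collected results) or returns the Promise of the
// next round, which the outer Promise then waits for.
function collectAll(finder, found) {
  found = found || [];
  return finder.search().then(function (result) {
    if (result === null) return found;
    found.push(result);
    return collectAll(finder, found);
  });
}

collectAll(new FakeFinder("ab ab ab", "ab")).then(function (matches) {
  console.log(matches.length + " matches found");
});
```

The stopping condition inside the callback can be anything, e.g. "stop at the third match", so the same shape would cover the example above. Whether this is simpler than searchAll() followed by a traditional loop is another matter.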

Apologies if I am completely wrong in terms of these Promises...


Cheers

Ivan


> On 05 Oct 2015, at 21:03 , Doug Schepers <schepers@w3.org> wrote:
> 
> Hi, folks–
> 
> This weekend, I made substantial changes to the FindText API [1] (formerly called the RangeFinder API).
> 
> I improved the internationalization aspects and options, based on feedback from the I18n WG and from their updated CharMod spec (Character Model for the World Wide Web: String Matching and Searching… which seems tailor-made for us!).
> 
> I also fleshed out the algorithm for search (though it still needs lots of work), which was one of two critical changes needed before FPWD.
> 
> The remaining critical change is for me to update the examples, which is important because those will shape many people's first impressions of the spec (because examples are easy to read and understand). This is my plan for the rest of the day. This involves describing the workflow in terms of Promises, which I'm sad to admit I've never used in running code before.
> 
> Luckily, I have two meetings set up for this afternoon with folks to help me with that:
> 
> * Chris Birk and Bill Hunt, from OpenGov Foundation
> * Alexander Schmidtz, from jQuery
> 
> These guys are very familiar with Promises, and so my examples and API design will have at least a bit of vetting and validation before pushing FPWD. There will always be room for improvements, but we should be ready to go by tomorrow.
> 
> 
> I welcome feedback from any of you on this spec!
> 
> 
> [1] http://w3c.github.io/findtext/
> [2] http://w3c.github.io/charmod-norm/
> 
> Regards–
> –Doug
> 


----
Ivan Herman, W3C
Digital Publishing Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
ORCID ID: http://orcid.org/0000-0003-0782-2704
Received on Tuesday, 6 October 2015 14:47:55 UTC