Re: FindText API Updated Editor's Draft

Hi, Bill–

Thanks for your feedback. Replies inline…

On 10/12/15 8:22 AM, Bill Hunt wrote:
> Sorry to have been out of the loop for a few days, so I'll try to only
> touch on the highlights here -
>
> 1) Randall's proposed code is a good way to handle this. The calling
> code doesn't have access to the underlying promises here, but that's not
> strictly necessary - a promise is a promise, as they say.
>
> 2) Ivan, I still feel very strongly that a searchAll() is necessary.
> This is a very common use case, as common as searching for one item,
> and forcing users to re-implement that boilerplate code over and over is
> less than ideal. It means that there will always be wrapper libraries
> used, which is what I think we've been trying to avoid in the
> web-javascript world for years (since the "browser wars").

I'm also inclined to think we need at least a searchAll(). Based on
early feedback I got, I think we need search() ("find first instance")
as well, for performance reasons.
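
To make the pairing concrete, here is a rough sketch over a plain string. The names, signatures, and result shape are placeholders I'm assuming for illustration, not anything in the draft; the point is only that searchAll() is the loop callers would otherwise rewrite around search().

```javascript
// Hypothetical sketch: neither signature comes from the draft.
// search() resolves with the first match at or after `from`, or null.
function search(text, query, from = 0) {
  const index = text.indexOf(query, from);
  return Promise.resolve(
    index === -1 ? null : { start: index, end: index + query.length }
  );
}

// searchAll() collects every non-overlapping match by calling search()
// repeatedly, which is the boilerplate callers would otherwise repeat.
async function searchAll(text, query) {
  const results = [];
  if (query.length === 0) return results; // avoid looping forever on ""
  let from = 0;
  for (;;) {
    const match = await search(text, query, from);
    if (match === null) return results;
    results.push(match);
    from = match.end;
  }
}
```

A caller who only wants the first hit stops after one search() and never pays for the full scan, which is the performance argument for keeping both.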


> Moreover, it's almost universal to have the find one / find many pairing
> in most other get-data interfaces that users will be familiar with.
> E.g., the Waterline ORM used by Sails has findOne and find (all/many),
> as do most ORMs.
> https://github.com/balderdashy/waterline#user-content-query-methods

Interesting reference point, thanks.


> And of course for any REST interface, users are able to reach the index
> for a list, or a particular resource by identifier; it's a bit of a
> stretch, but the most common interface I can think of.
>
> Regular Expressions are probably the closest use case for this
> interface, so users would probably expect something similar to the
> native regexp interface, which only has find many:
> https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/match
>
> On the other hand, iterators for getting data have all but disappeared
> from modern JavaScript - it's purely anecdotal, but I can't think of a
> time I've used an iterator interface in the last few years.  You
> certainly see it in things like WordPress's loop, but it'd be a bit
> strange to me to encounter one in JavaScript.  Again, where they do
> exist (as with database interfaces), they've been hidden by thick layers
> of wrapper libraries.

I'd like wider opinion here… having some sort of next() method seems 
like it could be useful, even if iterators are not common in modern JS.
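
As a strawman for discussion, a next()-style interface could look something like the generator below. findMatches is a name I'm making up here, and it searches a plain string rather than a document; it is only meant to show that results can be produced lazily, one at a time.

```javascript
// Hypothetical sketch; findMatches is not part of the draft.
// A generator gives callers a next() method and also works with
// for...of and spread, so matches are computed only when asked for.
function* findMatches(text, query) {
  if (query.length === 0) return; // nothing sensible to match
  let from = 0;
  for (;;) {
    const index = text.indexOf(query, from);
    if (index === -1) return;
    yield { start: index, end: index + query.length };
    from = index + query.length;
  }
}

const matches = findMatches("one fish, two fish", "fish");
const first = matches.next(); // { value: { start: 4, end: 8 }, done: false }
const rest = [...matches];    // the remaining match, computed only now
```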


> Last, if the calling code has access to textDistance, I assume it will
> be very common to instead return all results to the end user in order
> of best to worst match, which can't be done easily as-is.  Again,
> you'd need a wrapper to iterate over all the results and aggregate them,
> then sort that list.

I'm thinking we should return the match details with each result (e.g.
the edit distance for each of text, prefix, and suffix) so that apps can
decide how to sort their results.
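
For illustration only, assuming each result carried its three edit distances, an app could rank results however it likes. The field names and the ranking policy below are my assumptions, not part of the draft.

```javascript
// Hypothetical result shape: one edit distance each for the matched
// text, its prefix, and its suffix. Field names are assumptions.
const results = [
  { start: 120, distances: { text: 2, prefix: 0, suffix: 1 } },
  { start: 40, distances: { text: 0, prefix: 3, suffix: 0 } },
  { start: 300, distances: { text: 0, prefix: 0, suffix: 0 } },
];

// One possible app policy: rank by text distance first, then break
// ties on the combined prefix and suffix distance.
function byBestMatch(a, b) {
  const context = (r) => r.distances.prefix + r.distances.suffix;
  return a.distances.text - b.distances.text || context(a) - context(b);
}

// Copy before sorting so the original result order is preserved.
const ranked = [...results].sort(byBestMatch);
// ranked start offsets: 300, 40, 120
```

A different app might weight the context more heavily than the text; returning the raw values leaves that choice to the caller.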


> 3) It might be nice (as I think was proposed earlier) to allow users to
> use either promises or synchronous code, as the Waterline example above
> does, and leave it to the end user to choose which they want - but that
> again complicates matters and can lead to confusion.  Of the options,
> I'd lean to the side of promises-only, again for large documents' sake.

That seems like it might complicate things. Waterline needs to do that 
because it encapsulates multiple underlying storage systems; I don't 
think we have the same design constraints for FindText.


> 4) Doug, I'm not sure if I've read the document closely enough so I
> apologize if this is an incorrect reading, but as it stands this API is
> only for text-matching, correct?

At this point, yes. Earlier on, the RangeFinder API let you find 
arbitrary ranges that may or may not be text, but I stripped that out 
for the more focused FindText API.


> So users would be unable to do any
> sort of pattern-matching (i.e., regular expressions)?  That seems like a
> useful thing to have, the very first plugin I added to Chrome was a
> find-in-page regex replacement for the default search.

That might be interesting to include. I absolutely see the utility of that.

I have some concerns about performance (based on much earlier
conversations) when pattern matching is combined with the edit distance
functionality; I see edit distance as the more needed feature for
robust anchoring and general search.

If we did do this, we'd have to define a strict subset of RegEx that 1) 
works today in browsers and 2) doesn't have much of a performance hit.

This could turn out to be a non-trivial standardization effort (e.g. 
defining each expression precisely, doing performance evaluations, 
getting agreement about what to include), and might be worthy of a v2 or 
a separate module that can be used with FindText.

Could you file an issue about this on GitHub?


> I apologize if I've missed any nuance here, I've tried to catch up on
> the week or so I've missed.

No apology needed. Thanks for following up!

Regards–
-Doug

Received on Wednesday, 14 October 2015 09:26:20 UTC