- From: Julian Reschke <julian.reschke@gmx.de>
- Date: Sat, 10 Apr 2010 13:25:52 +0200
- To: Mark Davis ☕ <mark@macchiato.com>
- CC: Ian Hickson <ian@hixie.ch>, Ted Hardie <ted.ietf@gmail.com>, "Martin J. Dürst" <duerst@it.aoyama.ac.jp>, Maciej Stachowiak <mjs@apple.com>, Larry Masinter <LMM@acm.org>, Marc Blanchet <Marc.Blanchet@viagenie.ca>, Sam Ruby <rubys@intertwingly.net>, Paul Cotton <Paul.Cotton@microsoft.com>, Martin Duerst <duerst@w3.org>, Michel SUIGNARD <Michel@suignard.com>, public-html <public-html@w3.org>, "public-iri@w3.org" <public-iri@w3.org>
On 09.04.2010 19:41, Mark Davis ☕ wrote:
> When you would actually implement it, there are a few different kinds
> of APIs that you would use, such as:
>
> end = lookingAt(string, startPosition);
> if there is an IRI starting at startPosition, return the end of it -
> otherwise return an error.
>
> <start, end> = scan(string, startPosition);
> find the first instance of an IRI in a string at or after startPosition,
> returning where it starts and ends.
>
>
> The key is that if the Issue#1 specification can return the first error
> point (as I outlined in the message), then one can design and implement
> fast code to implement the above (or other kinds of APIs). The reference
> code for /testing/ lookingAt would implement the algorithm in Issue#1
> (as amended). The reference code for /testing/ scan would just call
> lookingAt in a loop, starting at position 0, returning if something is
> found, and otherwise going to the next character. This would just be
> reference code; the reference code can be much faster.
>
> Mark
> ...

Hi,

I'd agree with you if we were talking about an Application Programming Interface. But this is just about the specification interface between HTML (& friends) and IRI.

That being said, defining a sane Javascript API for handling web addresses would be great. Maybe it could be developed and deployed similar to the way the JSON thingy was.

Finally: parsing addresses out of content is highly context dependent. Do you consider angle brackets as URI delimiters in plain text? Can whitespace appear in angle-bracket quoted addresses? In unquoted addresses? Is whitespace a delimiter between addresses (such as in a few set-of-URI-typed HTML attributes), or part of the address?

I'd rather not have to think about this in the IRI spec. Maybe in a BCP-like companion spec, though.

Best regards, Julian
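Editorial sketch (not part of the original message): the relationship described in the quoted text, where scan simply calls lookingAt at successive positions, might look roughly like the Python below. The names `looking_at` and `scan` mirror the quoted pseudocode. The pattern `_TOY_ADDRESS` is a hypothetical stand-in and is deliberately not the IRI grammar or the Issue#1 algorithm. Returning `None` instead of an error is also an assumption.

```python
import re
from typing import Optional, Tuple

# Hypothetical stand-in matcher: a scheme, "://", then a run of characters
# that are neither whitespace nor angle brackets. NOT the real IRI grammar.
_TOY_ADDRESS = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*://[^\s<>]+")


def looking_at(text: str, start: int) -> Optional[int]:
    """Return the end offset of an address beginning exactly at start, else None."""
    match = _TOY_ADDRESS.match(text, start)
    return match.end() if match else None


def scan(text: str, start: int = 0) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the first address at or after start, else None."""
    # Reference-style loop: try every position in turn, as the quoted text describes.
    for pos in range(start, len(text)):
        end = looking_at(text, pos)
        if end is not None:
            return pos, end
    return None
```

Note that the toy pattern hard-codes one answer to the context questions raised in the reply: it treats both whitespace and angle brackets as delimiters, so `<http://example.org/a b>` yields only `http://example.org/a`. A different context (for example a whitespace-separated attribute value) could reasonably choose differently.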
Received on Saturday, 10 April 2010 11:26:44 UTC