- From: Julian Reschke <julian.reschke@gmx.de>
- Date: Fri, 09 Apr 2010 19:12:49 +0200
- To: Mark Davis ☕ <mark@macchiato.com>
- CC: Ian Hickson <ian@hixie.ch>, Ted Hardie <ted.ietf@gmail.com>, "Martin J. Dürst" <duerst@it.aoyama.ac.jp>, Maciej Stachowiak <mjs@apple.com>, Larry Masinter <LMM@acm.org>, Marc Blanchet <Marc.Blanchet@viagenie.ca>, Sam Ruby <rubys@intertwingly.net>, Paul Cotton <Paul.Cotton@microsoft.com>, Martin Duerst <duerst@w3.org>, Michel SUIGNARD <Michel@suignard.com>, public-html <public-html@w3.org>, "public-iri@w3.org" <public-iri@w3.org>
On 09.04.2010 18:54, Mark Davis ☕ wrote:
> For Issue #1, I like the formulation. However, I'd like to see one
> more piece of information (logically) returned: if the parse could not
> continue to the end, then what was the last character successfully parsed.
>
> That is, in "http://google.com/<space>/", it would return the offset
> between the "m" and the space.
>
> So why do this? It is because a very common problem is to find an IRI in
> plain text, where the end is not known. This needs to be done in email,
> word processors, HTML editors, and a host of other products. By having
> an explicit specification that lets us know what the last character is,
> one can then (logically) call the function again to determine whether
> the segment up to the error point is a valid IRI.

Hmm. Not convinced.

1) If you want to parse IRIs out of content, wouldn't you also need to consider *leading* non IRI characters?

2) What's wrong with just adding up the individual segments (plus delimiters)?

> ...

Best regards, Julian
Received on Friday, 9 April 2010 17:13:36 UTC