For Issue #1, I like the formulation. However, I'd like to see one more piece of information (logically) returned: if the parse could not continue to the end, what was the last character successfully parsed? That is, in "http://google.com*<space>*", it would return the offset between the "m" and the space.

So why do this? Because a very common problem is finding an IRI in plain text, where the end is not known. This needs to be done in email, word processors, HTML editors, and a host of other products. With an explicit specification that tells us what the last character is, one can then (logically) call the function again to determine whether the segment up to the error point is a valid IRI.

Once we have the spec all sorted out, then on that basis someone can write a fast parser that returns all and only those instances that can be complete IRIs -- and more lenient ones that allow some information (such as the scheme) to be defaulted.

Mark

-- Il meglio è l’inimico del bene --

On Thu, Apr 8, 2010 at 18:40, Ian Hickson <ian@hixie.ch> wrote:

> Issue 1:
> ========================================================================
> Update the IRI specification to define an algorithm with the following
> characteristics:
>

Received on Friday, 9 April 2010 16:55:33 GMT
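
To make the two-step idea in the message concrete (parse as far as possible, report the stopping offset, then check the segment up to that point), here is a rough Python sketch. It is not the algorithm the IRI specification would define: the character set, the scheme check, and the names `parse_prefix`, `is_complete_iri`, and `find_iri_at` are all assumptions made up for illustration.

```python
import re

# Assumption: a rough stand-in for the IRI character repertoire, namely the
# RFC 3986 unreserved and reserved characters plus "%". Non-ASCII characters
# that are not whitespace are also accepted. The real grammar is stricter.
_ASCII_ALLOWED = set(
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789"
    "-._~:/?#[]@!$&'()*+,;=%"
)

# Scheme syntax as in RFC 3986: a letter, then letters, digits, "+", "-", ".".
_SCHEME = re.compile(r"[A-Za-z][A-Za-z0-9+.\-]*:")


def parse_prefix(text, start=0):
    """Return the offset just past the last character accepted from `start`."""
    pos = start
    while pos < len(text):
        ch = text[pos]
        if ch in _ASCII_ALLOWED or (ord(ch) > 0x7F and not ch.isspace()):
            pos += 1
        else:
            break
    return pos


def is_complete_iri(segment):
    """Hypothetical second pass: require a scheme and something after it."""
    m = _SCHEME.match(segment)
    return m is not None and m.end() < len(segment)


def find_iri_at(text, start=0):
    """Return (segment or None, stop offset) for a candidate at `start`."""
    end = parse_prefix(text, start)
    segment = text[start:end]
    return (segment, end) if is_complete_iri(segment) else (None, end)
```

With the example from the message, `find_iri_at("http://google.com and more")` stops at the offset between the "m" and the space and hands back the segment before it. A more lenient caller could relax `is_complete_iri` so that a missing scheme is defaulted rather than rejected.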