- From: <martin.hepp@ebusiness-unibw.org>
- Date: Fri, 12 Sep 2014 08:32:25 +0200
- To: David Sheets <sheets@alum.mit.edu>
- Cc: Austin William Wright <aaa@bzfx.net>, Damian Steer <pldms@mac.com>, Semantic Web <semantic-web@w3.org>
Dear David:

FYI: There was a comprehensive discussion on URI comparison in the Semantic Web in the mailing list archive, starting with

http://lists.w3.org/Archives/Public/public-lod/2011Jan/0134.html

It would be good to consider these aspects when evolving any URI/IRI-related specs.

Best wishes
Martin

-------------------------------------------------------
martin hepp
e-business & web science research group
universitaet der bundeswehr muenchen

e-mail: martin.hepp@unibw.de
phone: +49-(0)89-6004-4217
fax: +49-(0)89-6004-4620
www: http://www.unibw.de/ebusiness/ (group)
     http://www.heppnetz.de/ (personal)
skype: mfhepp
twitter: mfhepp

Check out GoodRelations for E-Commerce on the Web of Linked Data!
=================================================================
* Project Main Page: http://purl.org/goodrelations/

On 12 Sep 2014, at 02:05, David Sheets <sheets@alum.mit.edu> wrote:

> On Fri, Sep 12, 2014 at 12:27 AM, Austin William Wright <aaa@bzfx.net> wrote:
>> Since I maintain URI and IRI libraries, and numerous programs that use URIs
>> for stating relationships (JSON Schema, RDF Interfaces, Turtle parser, and
>> more), I'm interested in getting involved, pending some questions about the
>> purpose of the proposed Community Group. Certainly there's been a lot of
>> drama, since I sent this message, on public-webapps, www-tag, and
>> public-w3process about the fork of the "URL" document. Will a Community
>> Group be able to positively impact the issue?
>
> I believe that a Community Group which communicates regularly and
> openly about its progress on a formal specification will be able to
> positively affect the present issues. In particular, I think that a
> Community Group offers a place to work on a well-engineered
> specification using modern tools without requiring immediate buy-in
> from existing groups. Once our methods have been demonstrated, I
> expect work to move to other, more traditional specification venues.
>
>> Will we be able to shed light on the Semantic Web uses of the URI, IRI, and
>> URI Reference? (The current documents seem to think that only Web browsers
>> consume URIs.)
>
> The Semantic Web consumers of URI references (which, to my view,
> encompasses URL, URI, and IRI) are an important constituency of any
> URI specification document. However, I do not currently see a place
> for Semantic Web or Linked Data specific content in such a document.
> That is not to say that SemWeb concerns shouldn't be considered --
> just that SemWeb uses of URI references should be clearly possible but
> not called out specifically.
>
>> Most importantly, I don't think it's necessary -- or even normatively
>> possible -- to re-define how to parse URIs in HTML or any other spec. This
>> is normatively done _only_ by RFC 3986 or a published successor that
>> obsoletes it.
>
> I intend to incubate a successor with this Community Group. It is my
> sincere hope that we will, before the end of 2015, have begun the IETF
> RFC process for a new URI reference standard.
>
>> I would like to see a "URI/IRI API" that correctly uses the RFC3986/3987
>> terminology. Would publishing an ECMAScript API be in scope?
>
> Yes, publishing an ECMAScript API would eventually be in scope as such
> an API would expose functions which the specification describes. I am
> personally the maintainer of the ocaml-uri library
> <https://github.com/mirage/ocaml-uri> and I would very much like to
> see a test suite and test oracle for use against ECMAScript and other
> languages' libraries.
>
> Initially, the definition of the ECMAScript API could be sketched but
> defining it more elaborately should probably wait until the functions
> being specified are more clear. At that time, it may be the case that
> the ECMAScript API we propose actually exposes only composites of the
> specified functions (e.g.: compose parse normalize resolve).
>
>> And as mentioned earlier, I'm interested in research into current
>> implementation bugs of user agents and non-Web applications that consume
>> IRIs, and if there's a way to fix them that's not (net) harmful. This is
>> also one of the intended purposes, correct? For instance, there could
>> possibly be a document describing how to fix invalid URI References, if that
>> is acceptable (i.e. no "URI Strict Mode").
>
> It's not clear to me if you are referring to fixing implementations or
> fixing URIs. In general, there doesn't seem to be a valid way to fix
> URIs that may have been used in a SW context as the only general
> equality is byte-for-byte. With that said, I am very much interested
> in specifying functions that consume potentially invalid URIs and
> normalize them to be valid. If one understands the risks, such a
> function could be used to "fix" invalid URIs.
>
> There are a number of different normalizations:
>
> 1. valid -> normal
> 2. invalid -> valid
> 3. invalid -> normal
>
> Ideally, 3 is 2 compose 1. 1 should be a fixpoint over normal. These
> functions would be most useful at the publication side and could be
> used to great effect in careful consumers.
>
>> Generally, the goal is to work all the current issues of interoperability
>> between Working Groups out? Wouldn't e.g. appsawg at the IETF, or another WG
>> that deals with the URI, also be suited for this purpose?
>
> A goal is to work out the issues of interoperability between the
> Working Groups and the Real World. In addition, another goal is to
> produce a single specification document that describes as fully as
> possible the structure and interpretation of URI references, URLs,
> IRIs, URNs, etc. This single source can then be used to generate a
> text document, an executable test oracle, theorems about URIs, and
> potentially an exhaustive test suite.
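[Editorial sketch] The three normalizations listed above can be illustrated in a few lines of Python. This is only a sketch under stated assumptions: the function names `invalid_to_valid`, `valid_to_normal` and `invalid_to_normal` are hypothetical and are not taken from ocaml-uri or from any proposed specification. The "valid -> normal" step covers only part of the syntax-based normalization in RFC 3986 section 6.2.2 (lowercase scheme and host, uppercase percent-encoding hex digits, decode percent-encoded unreserved characters), and the "invalid -> valid" step only percent-encodes disallowed characters without repairing structure.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Character classes from RFC 3986; everything else here is a hypothetical sketch.
UNRESERVED = ("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
              "0123456789-._~")
ALLOWED = set(UNRESERVED + ":/?#[]@!$&'()*+,;=")
PCT = re.compile(r"%([0-9A-Fa-f]{2})")
TOKEN = re.compile(r"%[0-9A-Fa-f]{2}|.", re.DOTALL)


def invalid_to_valid(text):
    """Normalization 2: percent-encode every character a URI may not contain.

    Well-formed percent triplets are kept; a stray '%' becomes '%25'.
    """
    def fix(match):
        token = match.group(0)
        if len(token) == 3 or token in ALLOWED:
            return token
        return "".join("%{:02X}".format(b) for b in token.encode("utf-8"))
    return TOKEN.sub(fix, text)


def _normalize_pct(text):
    """Decode triplets for unreserved characters, uppercase the hex of the rest."""
    def fix(match):
        char = chr(int(match.group(1), 16))
        return char if char in UNRESERVED else "%" + match.group(1).upper()
    return PCT.sub(fix, text)


def valid_to_normal(uri):
    """Normalization 1: a subset of RFC 3986 syntax-based normalization.

    Percent-encoding inside the authority is left alone in this sketch.
    """
    parts = urlsplit(uri)
    userinfo, at, hostport = parts.netloc.rpartition("@")
    return urlunsplit((
        parts.scheme.lower(),
        userinfo + at + hostport.lower(),
        _normalize_pct(parts.path),
        _normalize_pct(parts.query),
        _normalize_pct(parts.fragment),
    ))


def invalid_to_normal(text):
    """Normalization 3: apply 2 first, then 1."""
    return valid_to_normal(invalid_to_valid(text))


normal = invalid_to_normal("HTTP://Example.COM/a b/%7euser?q=caf\u00e9")
print(normal)
# Fixpoint property: normalizing an already normal URI changes nothing.
print(valid_to_normal(normal) == normal)
```

Two spellings such as `http://example.com/%7Euser` and `HTTP://EXAMPLE.com/~user` differ byte-for-byte but map to the same string under `valid_to_normal`, which is why such a function is safest on the publication side: a consumer that rewrites identifiers changes what a byte-for-byte comparison sees.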
>
> The venues you mention would be the ideal place for this work if the
> use of formal methods, specifically specification using the Lem
> <http://www.cl.cam.ac.uk/~pes20/lem/> tool, would be accepted. I do
> not have high hope that these venues are yet ready for such a
> proposal. Therefore, I am starting a Community Group in which to
> incubate this human-readable and machine-executable specification. I
> believe we should have demonstrable proof that our methods work well
> and provide value before we approach traditional standardization
> bodies.
>
> I hope that you'll join me in supporting a single, readable source of
> URI specification which is guaranteed to stay in sync with an
> executable model and is robust enough to be used to enumerate its own
> test suite. I will begin with IPv4 and IPv6 address parsing including
> interface identifiers. I am the primary author of
> <https://github.com/mirage/ocaml-ipaddr> which does precisely this but
> does not yet handle interface identifiers. I believe this subcomponent
> of the specification can easily be written in fewer than 20 hours.
>
> Perhaps one of the hardest parts of this specification process will be
> writing the proofs to demonstrate that high-level properties (e.g.
> grammars) are satisfied by low-level specifications. Another difficult
> point will be error recovery and handling. This issue in particular
> will likely require nearly every syntactic component to allow error
> variants which describe the issues with parsing but allow processing
> to continue. Higher level functions can then specify precisely which,
> if any, errors are allowed.
>
> I understand this is a large amount of work but I believe, together,
> we can put in place a system of specification that will capture the
> behavior of URI objects and serve us powerfully for decades to come.
>
> Thanks for your interest,
>
> David
>
>> Thanks,
>>
>> Austin.
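[Editorial sketch] The error-recovery idea in David's reply, where each syntactic component carries error variants and a higher-level function decides which of them to tolerate, might look roughly like the following for dotted-quad IPv4 addresses. Every name here (`Issue`, `Octet`, `parse_ipv4`, `accept`) is hypothetical; none of it reflects the ocaml-ipaddr API or a Lem specification, and IPv6 and interface identifiers are left out entirely.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class Issue(Enum):
    """Hypothetical error variants recorded while parsing continues."""
    LEADING_ZERO = "leading zero in octet"
    OUT_OF_RANGE = "octet greater than 255"
    NOT_DECIMAL = "octet is not made of decimal digits"
    WRONG_COUNT = "address does not have four octets"


@dataclass(frozen=True)
class Octet:
    value: Optional[int]        # None when no value could be recovered
    issues: Tuple[Issue, ...]


@dataclass(frozen=True)
class ParsedIPv4:
    octets: Tuple[Octet, ...]
    issues: Tuple[Issue, ...]   # all issues, in order of appearance


def parse_octet(text):
    """Parse one component, describing problems instead of raising."""
    if not (text.isascii() and text.isdigit()):
        return Octet(None, (Issue.NOT_DECIMAL,))
    issues = []
    if len(text) > 1 and text[0] == "0":
        issues.append(Issue.LEADING_ZERO)
    value = int(text)
    if value > 255:
        return Octet(None, tuple(issues) + (Issue.OUT_OF_RANGE,))
    return Octet(value, tuple(issues))


def parse_ipv4(text):
    """Low-level parser: never fails, always reports what it saw."""
    octets = tuple(parse_octet(part) for part in text.split("."))
    issues = [issue for octet in octets for issue in octet.issues]
    if len(octets) != 4:
        issues.append(Issue.WRONG_COUNT)
    return ParsedIPv4(octets, tuple(issues))


def accept(parsed, allowed=frozenset()):
    """Higher-level policy: return four octet values, or None if rejected."""
    if any(issue not in allowed for issue in parsed.issues):
        return None
    if len(parsed.octets) != 4:
        return None
    if any(octet.value is None for octet in parsed.octets):
        return None
    return tuple(octet.value for octet in parsed.octets)


lenient = {Issue.LEADING_ZERO}
print(accept(parse_ipv4("192.168.1.1")))
print(accept(parse_ipv4("192.168.01.1")))
print(accept(parse_ipv4("192.168.01.1"), allowed=lenient))
```

Keeping the policy in `accept` rather than in the parser means a strict consumer and a lenient one can share the same low-level parsing function and differ only in the set of issues they allow.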
>>
>> On Thu, Sep 11, 2014 at 12:58 PM, David Sheets <sheets@alum.mit.edu> wrote:
>>>
>>> On Mon, Aug 18, 2014 at 3:22 PM, Damian Steer <pldms@mac.com> wrote:
>>>> On 18/08/14 12:54, Austin William Wright wrote:
>>>>> As the maintainer of a library that converts and parses URIs and IRIs,
>>>>> as well as many Semantic Web-related libraries that use it, I was
>>>>> reading through the HTML draft, and it appears that the core ingredient
>>>>> of RDF and Semantic Web--the URI [1] and IRI [2]--is not, in current
>>>>> draft, normatively referenced from its key hypertext technology, HTML
>>>>> [3].
>>>>
>>>> For the lazy, what is being referenced is:
>>>>
>>>> <http://url.spec.whatwg.org/>
>>>>
>>>> Hmm.
>>>
>>> I have just proposed a community group to do this properly. Please
>>> consider supporting it and beginning the discussion of formal
>>> specification of URI:
>>> <http://www.w3.org/community/groups/proposed/#urispec>.
>>>
>>> Thanks,
>>>
>>> David Sheets
>>>
>>>> Damian
>>>>
>>>
>>
>
Received on Friday, 12 September 2014 06:32:54 UTC