- From: Randall Sawyer <srandallsawyer@gmail.com>
- Date: Sun, 7 Aug 2011 21:56:55 -0700
- To: public-iri@w3.org
- Message-ID: <CACJYzX32b5VeLhULSj-eLnLn94ZZu-nmXgAfkasd1ZDMSDc5UA@mail.gmail.com>
Hello, All!

Only recently have I stumbled upon the need to parse and normalize URLs for a couple of projects I'm working on. In doing my research - including reading all of RFC 3986 and part of A. Barth's "Parsing URLs for Fun and Profit" - I have been frustrated by the amount of effort required to anticipate and correct malformed URLs.

I have a suggestion as to how content-providers and client-developers may voluntarily make their services and products work better together. [I have searched the archives for something like this, and have not found anything so far.]

What I have in mind is something comparable to SGML/XML validation. Just as a *ML document may contain a declaration at the top stating that it is compliant with a specific template, what if we made it possible for an organization to declare that every existing path on their site is compliant with a specific path-syntax template?

Imagine going to visit a city - and instead of just running in headlong, hoping you'll be able to catch on to the local customs - you first pause at the gates long enough to read the placard listing the local customs. The former case is very much like the status quo of parsing and correcting each path segment, hoping for success. If a browser - on the other hand - were provided a set of guidelines as to the characteristics of a normalized path on that site, then computation time would decrease, and access to content would be facilitated.

I already anticipate some issues:

1) Where to put the placard, and what to name it. These need to be the same for every site - or perhaps there could be some universally named meta-data pointing TO the placard. [By 'placard', I mean path-syntax-template]

2) Declared compliance is not the same as actual compliance - the same goes for an *ML file, though. That is the responsibility of the author(ity).

3) What if a content-provider decides to opt for a path syntax which covers MOST, but NOT ALL, of its existing paths? The template would then also need to include a list of exceptional paths (perhaps using a wildcard if the offending path is an upper-level directory).

Any thoughts? Is this desirable? Would it potentially interfere with existing protocols or standards?

Randall
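
To make the placard idea a little more concrete, here is a minimal Python sketch of how a client might check a URL against a path-syntax template that carries a list of exceptional paths. Everything in it is hypothetical: the PathTemplate class, the particular regular expression, and the "/legacy/*" wildcard are illustrative assumptions, not part of any existing standard or of the proposal itself.

```python
import re
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Tuple
from urllib.parse import urlsplit


@dataclass
class PathTemplate:
    """Hypothetical 'placard': one regex that every normalized path on the
    site should match, plus wildcard patterns for known exceptions."""
    pattern: str
    exceptions: Tuple[str, ...] = ()

    def classify(self, url: str) -> str:
        # An empty path is treated as the root path "/".
        path = urlsplit(url).path or "/"
        # Exceptional paths are checked first (issue 3 above).
        if any(fnmatchcase(path, pat) for pat in self.exceptions):
            return "exception"
        if re.fullmatch(self.pattern, path):
            return "conforms"
        return "violates"


# Assumed example template: lowercase segments of letters, digits and
# hyphens, with an optional file extension on the last segment.
template = PathTemplate(
    pattern=r"/(?:[a-z0-9-]+/)*(?:[a-z0-9-]+(?:\.[a-z0-9]+)?)?",
    exceptions=("/legacy/*",),
)

for url in (
    "http://example.com/docs/intro.html",
    "http://example.com/Docs/My%20File",
    "http://example.com/legacy/Old_Page.HTM",
):
    print(url, "->", template.classify(url))
```

A client receiving "conforms" could presumably skip its corrective parsing, while "exception" or "violates" would send it back to the status quo of parsing and correcting each segment. Declared compliance is still only declared (issue 2), so this checks a path against the placard, not the placard against the site.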
Received on Monday, 8 August 2011 04:57:23 UTC