- From: Al Gilman <asgilman@iamdigex.net>
- Date: Tue, 14 Aug 2001 09:20:27 -0400
- To: Dan Connolly <connolly@w3.org>, Mark Nottingham <mnot@akamai.com>
- Cc: Aaron Swartz <aswartz@upclink.com>, uri@w3.org
At 10:47 AM 2001-08-12 , Dan Connolly wrote:
>Mark Nottingham wrote:
>>
>> I've started sketching out a class-based URI module to replace the
>> function-based urlparse one distributed with Python... don't know how
>> much time I'll have to work on it, but if you (or anyone else) is
>> interested, we could give it a go.
>
>I've got a few thoughts on URI API design that I haven't managed
>to code up. But while we're talking about it...
>
>Developers tend to learn about URIs from APIs, and I'd like
>to clarify some things from that perspective.
>
>For example, a URI object shouldn't have any state.
>Several APIs bundle URI parsing with network access, putting GET and POST
>methods on the same object as getFragID. Bad news.

Please say getFrag.  You don't know that the Frag is an ID until you have
recovered the resource and determined its type by inspection.

The 'fragment' that is the heuristic reason for the naming of this terminal
in the parsing model is _a fragment of the URI-reference string_, not a
"fragment" of the resource.  It is just what follows the '#'.  In general.

It is commonly used for an ID to indicate a proper sub-object (not general
fragment) of the recovered value of the indicated resource.  But that's not
definitive, i.e. not universal.

So in an OO context getFrag is stateful because the class of the object
returned -- what you can do with it -- depends on the state variable
knowResourceRecoveredValueType.

>So I'd prefer a URIOracle class that knows how to parse
>URIs; its interface is exposed with static methods. (this
>is pretty much the same thing as a python module with functions).
>
>Another opportunity I'd like to exploit is explaining the
>difference between when it's OK to peek into which parts of a URI.
>
>At one level, the only methods are:
>  URIOracle.getFragID(aURIRef): # returns fragid
>  URIOracle.combine(absBaseURI, aURIRef) # returns absolute URIref
>  URIOracle.refTo(fromHere, toThere) # URI "subtraction"

This level is pertinent to the topic of URIref methods, not URI methods,
precisely.  These are two closely related classes, but the abstract URI
comprised of the equivalence class of all strings provable to indicate the
same resource (by equivalence under the escaping rules, etc.) is worth
regarding as a separate class from the URIref that one finds in the HREF of
a hyperlink, for example.  The URI is fully qualified and needs no context.
The URIref appears in context and may be relative, depending for its
interpretation on a BASE available from the context.

> (and maybe some escaping/unescaping methods...
> and maybe something for encoding form arguments...
> gotta think about that).
>
>At this level, you can't peek in enough to tell the difference
>between one scheme and another. ...

If you can't tell the scheme, you are not dealing with URIs.

GetScheme is perhaps the sole universal proper method for URIs.  Everything
else hangs on it.

URIs expose the class of their indicated resources by means of the scheme
component.  That is the first, most important production in the reference
model for defining the world in which we use URIs.  If you haven't captured
that, start over.

ftp: URLs indicate resources which have a GET method. [no PUT]
mailto: URLs indicate resources which have a PUT method. [no GET]
data: URLs indicate resources which have a PARSE method. [no GET or PUT]

To handle a URIref, one contextualizes and normalizes to obtain the
associated URI, and cases on scheme to determine the applicable proper
methods of the resource indicated.  More information on which of these
methods is indicated by some activation may be available from the context
of the URIref, as for example when it appears as the ACTION for a FORM.

Al

>... This level corresponds to
>the application and/or presentation objects in TimBL's
>diagrams of the web model
>
> <http://www.w3.org/DesignIssues/Model>
>
>Then there's a separate interface for use by code that does
>network access; at this level, you can parse the scheme,
>the host, the username/password, the path segments, etc.
>
>Anyway... as I say, I haven't worked out the details. I
>have a formal specification of these interfaces in progress...
>
> <http://www.w3.org/XML/9711theory/URIclient.lsl>
> <http://www.w3.org/XML/9711theory/URI.lsl>
>
>
>--
>Dan Connolly, W3C <http://www.w3.org/People/Connolly/>
>
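
For concreteness, the flow described in the reply above (take what follows
the '#', contextualize the URIref against a BASE, then case on scheme) might
be sketched roughly as follows in Python 3.  This is only an illustration
under stated assumptions: the function names and the SCHEME_METHODS table
are hypothetical, the table holds just the three examples given in the
message, and only the urllib.parse calls are real standard-library API.

```python
from urllib.parse import urldefrag, urljoin, urlsplit

# Hypothetical table: only the three schemes named in the message.
SCHEME_METHODS = {
    "ftp": {"GET"},
    "mailto": {"PUT"},
    "data": {"PARSE"},
}


def get_frag(uri_ref):
    """Return the part of the URI-reference string after '#'.

    This is purely string-level: nothing here says the fragment is an ID.
    An absent fragment and an empty one both come back as ''.
    """
    return urldefrag(uri_ref).fragment


def combine(abs_base_uri, uri_ref):
    """Contextualize a possibly relative URIref against an absolute base."""
    return urljoin(abs_base_uri, uri_ref)


def get_scheme(uri):
    """Return the scheme component (urlsplit lowercases it)."""
    return urlsplit(uri).scheme


def applicable_methods(abs_base_uri, uri_ref):
    """Resolve the URIref, drop the fragment, and case on scheme."""
    absolute = combine(abs_base_uri, uri_ref)
    uri, _frag = urldefrag(absolute)
    # Schemes missing from the illustrative table yield an empty set.
    return SCHEME_METHODS.get(get_scheme(uri), set())
```

Under these assumptions, applicable_methods("ftp://example.org/pub/",
"file.txt") resolves the relative reference first and only then looks at
the scheme, which is the ordering the reply argues for.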
Received on Tuesday, 14 August 2001 09:02:57 UTC