RE: Talked to the xml.gov people from Bullard, Claude L (Len) on 2003-05-23 (www-tag@w3.org from May 2003)

From: Bullard, Claude L (Len) <clbullar@ingr.com>
Date: Fri, 23 May 2003 11:00:17 -0500
To: "'Paul Prescod'" <paul@prescod.net>, WWW-Tag <www-tag@w3.org>
Message-ID: <15725CF6AFE2F34DB8A5B4770B7334EE022DC35F@hq1.pcmail.ingr.com>

I was surprised to see the URN in the xml.gov policy. 
It seemed to me they were being clear that they *want* 
the most extreme separation of system and content preferring 
that all aspects of the namespace assignment be under 
the control of the agency and the US policy.  IOW, they 
chose to separate it from control by the IETF, W3C, etc.

1.  Use URN.  Make a statement to the world that the 
namespace is controlled by the authority over that 
named space.  Optionally resolvable, but that is 
local policy.  Don't expect the WWW system to support 
this.  It is a legal XML namespace but meant to be 
a disambiguating string and nothing else unless we 
tell you otherwise.  It won't confuse web software.

2. Use URL.  Could be HTTP.  Could be FTP.  It is a 
legal namespace.  Make a statement to the world that 
you might click on this and your screen will definitely 
do something.  It might not be what you expect unless 
you read the documentation for our schema, or if that 
documentation is returned as a representation of a 
resource.  In other words, the namespace value may 
be both a disambiguating string AND a resource identifier.
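
To make the two options above concrete, here is a minimal sketch in Python, 
assuming the standard library's ElementTree parser.  Both namespace names 
are hypothetical examples and are not taken from the xml.gov policy.  It 
suggests that a namespace-aware parser treats either form as an opaque 
string: it compares the names and never dereferences them.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace names, one per option; neither comes from xml.gov.
URN_DOC = '<report xmlns="urn:example:agency:reports"/>'
URL_DOC = '<report xmlns="http://example.org/agency/reports"/>'


def namespace_of(xml_text):
    """Return the namespace name of the root element, or None if unqualified."""
    # ElementTree spells a qualified name as "{namespace}localname".
    tag = ET.fromstring(xml_text).tag
    if tag.startswith("{"):
        return tag[1:].split("}", 1)[0]
    return None


urn_ns = namespace_of(URN_DOC)
url_ns = namespace_of(URL_DOC)

# Parsing needs no network access for either name; they are only strings,
# and two namespaces are the same only if the strings are identical.
print(urn_ns)
print(url_ns)
print(urn_ns == url_ns)
```

Whether the second name also resolves to something useful is a separate 
question that the parser never asks.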

It seems to me that this isn't hard to understand so 
I am still mystified that Tim went to explain this 
unless of course, his mission was to proselytize for 
the HTTP string because that is the web thing to do. 
I can't dismiss that possibility because too much 
has been made of the "ubiquitous support" issue.

This still comes down to system politics.  Perhaps XML.GOV 
does not want to be a member of that party.  If so, 
they chose correctly.  If it is remotely possible that 
they don't understand that, should they decide in the 
future to go beyond the namespace rec and use a namespace 
identifier as a resource identifier, they will have to 
get a catalog resolver, then that should be explained.  OTW, 
it is a local choice and not really the business of the W3C 
unless asked.
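
As a rough illustration of what "get a catalog resolver" could mean in 
practice, here is a minimal sketch, assuming a catalog is nothing more than 
a locally maintained table from namespace names to resources.  The catalog 
entry, the path, and the function name are hypothetical and do not describe 
any particular catalog product or format.

```python
# Hypothetical local catalog: namespace name -> resource chosen by local policy.
CATALOG = {
    "urn:example:agency:reports": "/schemas/reports.xsd",
}


def resolve(namespace_name, catalog=CATALOG):
    """Return the local resource mapped to a namespace name, or None.

    None means the name is only a disambiguating string as far as this
    catalog is concerned.
    """
    return catalog.get(namespace_name)


print(resolve("urn:example:agency:reports"))
print(resolve("urn:example:agency:unlisted"))
```

The point of the sketch is that resolution is a matter of local policy: the 
same URN resolves to whatever the owner of the catalog says, or to nothing.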

My problem with the "it's just a string but if it's the 
right string, you can do more with it" is that there is 
so little clarity in what the URN/URL/URzed divisions 
achieve.  There appears to be no purpose for these except 
politics as long as we maintain that http is just a 
string.  It isn't.  It is a string with a reserved semantic 
just as xml: is reserved so that semantics to be reserved 
to the xml processor can be identified.

This is too much like the xml:id and patent issues.  The 
press provided is obscure so people believe one thing, 
while the probable outcome is another.

o xml:id - abandon declarative means in favor of hardwired 
processor semantics while insisting that this is a declarative 
means.  Not a bad outcome but let's be clear:  this is 
taking more options away from the locals. This is pretty 
clear.

o patents - let the press tell the world that the W3C has 
abandoned patents when the actual text is that it is the 
director's prerogative to choose.  Not a bad outcome, but 
let's not fool ourselves that given the smart business model 
of using IP to fund other activities, it means that the role of 
the W3C will be more narrowed (and it should be) into the 
fundamentals of WWW plumbing vs applications, and that the 
applications will go to other organizations, the current 
trend.  Had standards not been conflated with specifications, 
this would not be an issue, but semantic games were played 
with the press and the resulting feedback, positive and 
negative, is forcing a fracture into competing specifications 
and standards.  I think that natural, so it doesn't bother me, 
but I don't think it is clear.

In my experience, if one has to choose between clarity and control 
when things are not clear, certainly don't grab for control.

The folks at xml.gov chose to keep the urn in there for 
a reason.  Anyone know what it was?  If not clear about 
that, why recommend a different solution when the one 
they have is perfectly legitimate even if from one point of 
view, considered shortsighted?  Be clear.

len

-----Original Message-----
From: Paul Prescod [mailto:paul@prescod.net]
Sent: Thursday, May 22, 2003 9:39 PM
To: Bullard, Claude L (Len); WWW-Tag
Subject: Re: Talked to the xml.gov people


Bullard, Claude L (Len) wrote:

> Then maybe the TAG should be talking about getting rid of
> URNs altogether,

Imagine the brouhaha! Better to let them persist (heh!) not hurting 
anybody but not helping much either.

>  or explaining that HTTP really is meaningless
> to systems that provide PUBLIC to SYSTEM catalog mapping?

It isn't meaningless. It is that HTTP-based dereferencing is an 
*optional* *feature* that you can take advantage of or not depending on 
your needs.

> PUBLIC identifiers were about systems that assigned names
> but said nothing about resolution, eg, identity is assigned.
> That is the semantic.
>
> SYSTEM identifiers were about system specific locations of
> entities.  Resolving an address is the semantic.

That's true. I also came from this background and fought for PUBLIC to 
be added to XML. It took me years to conclude that I was wrong back then.

> HTTP identifiers are about names that identify locations
> of entities.  They are a SYSTEM id.  If one wants to use
> them as a name, they have two semantics.  Fine.

Fair enough. They are names that can be used as locators if that is 
convenient.

> HTTP is a protocol identifier.  Saying it is a meaningless
> string until it gets handed to an HTTP handler doesn't add
> much to clarify the situation.   It just means the semantic
> to be implemented is in the handler and is fuzzy in the spec
> because the namespace specification fuzz'd it.

Not at all. The namespace specification is quite clear that when HTTP 
URIs are used in namespaces, they are used as names, not locators. The 
semantics of the specification do not depend on dereferencing. There is 
no fuzz whatsoever. Confusion in the minds of many readers, yes. Fuzz 
in the specification, no.

> In essence, it makes no difference what goes in that namespace
> id value as long as it is unique within scope. 

That is not true. RDDL demonstrates that some namespace identifiers, 
discovered context-less in the wild, are more useful (in practice) than 
other namespace identifiers. Or to put it another way, you can do 
something with this:

SYSTEM "http://www.w3.org/2001/XMLSchema"

that you _cannot_ do with this:

PUBLIC "urn:uuid:89793274983729473298473928"

To me, that IS a difference and a significant one.
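
The practical difference between the two identifiers can be sketched in a 
few lines of Python, assuming the only question is whether a generic HTTP 
client could even attempt a fetch. The function name is hypothetical, and 
the sketch deliberately makes no network request; it only inspects the URI 
scheme.

```python
from urllib.parse import urlsplit


def http_dereferenceable(name):
    """Return True if a plain HTTP client could attempt to fetch this name."""
    return urlsplit(name).scheme in ("http", "https")


# The two identifiers quoted above.
print(http_dereferenceable("http://www.w3.org/2001/XMLSchema"))
print(http_dereferenceable("urn:uuid:89793274983729473298473928"))
```

A True result says only that dereferencing is possible with ordinary web 
software; what comes back, if anything, is up to whoever controls the name.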

>   So why did
> Tim bother to go to xml.gov, and what of value or clarification
> did he tell them since there is no reason to prefer any string
> over another in there given a policy for mapping it to a handler,
> the semantic of which is indeterminate for the purpose of it
> being a namespace identifier?

Tim went to xml.gov to tell them that the former type of string is (in 
his and my opinion) superior to the latter because using stone age 
techniques the former can be connected to a RDDL file and a RDDL file 
can contain a wealth of important information for both a computer and a 
person.

This isn't theoretical mumbo jumbo. The URI above pointing to XML Schema 
is _really useful_. If you stumbled upon it in the wild, HTTP 
dereference would _really help you_. I don't understand under what 
circumstance it is better to cut people off from this sort of value.

  Paul Prescod
Received on Friday, 23 May 2003 12:00:27 UTC