Re: speech grammar spec recommends xsi:schemaLocation [namespaceDocument-8]

Dan Connolly wrote:

> On Tue, 2005-12-13 at 11:06 -0500, noah_mendelsohn@us.ibm.com wrote:
> > Dan Connolly writes:
> > 
> > > I would have thought that
> > 
> > > <grammar version="1.0"
> > >          xmlns="http://www.w3.org/2001/06/grammar">
> > 
> > > was sufficient info to ground the document in the
> > > web and, among other things, find the standard
> > > schema.
> > 
> > I think that's one way to do it, but not in all cases the best or only
> > way.
> 
> You're suggesting I said this is the only
> way. Please don't. I stipulated to exceptions:

I didn't intend to put those words in your mouth, and I'm sorry if I 
appeared to.  As you say, you clearly stipulated to exceptions.
 
> | I would think
> | that schemaLocation was only useful/necessary in case the
> | author meant for the document to match some more constrained
> | schema.
> 
> As to the rest, it's self-fulfilling prophecy; all
> the more reason for the TAG to make the
> grounded-in-the-web case sooner rather than later.
> 
> >   I certainly support the convention of publishing RDDL or something
> > similar at the namespace URI, and using it when practical.   Here are some
> > of the reasons you might want to use xsi:schemaLocation in addition or
> > instead:
> > 
> > * Because xsi:schemaLocation was documented in the Schema Recommendation
> > long before the community moved toward consensus on RDDL, there's a lot of
> > schema-aware software out there that knows how to use xsi:schemaLocation.
> > At least until RDDL-aware parsers become ubiquitous, using the attribute
> > seems reasonable to me.
> 
> Using the attribute for what? The schemaLocation
> attribute doesn't actually contribute anything
> novel to the intent of the document, does it?
> 
> Can you tell me a story about speech grammars or
> other documents where including the schemaLocation
> attribute is useful/necessary?

See below.

> If the consumer wants to schema-check a speech
> grammar document, surely they know where to find
> the schema, no?

> > * We know that for versioning and perhaps for other reasons, multiple
> > schema documents describing the same namespace may be published over a
> > period of time.
> 
> Yes, the consensus about the syntax of a language
> may evolve, and the namespace document, or the
> things it links to, should evolve to document this
> evolving consensus. I don't see how that's
> relevant.

I think you're oversimplifying here.  The schema workgroup has been 
informed of use cases in which XML implementations run in environments 
where updates are available only rarely (on floppy disk in certain cases!) 
and in which various versions of a given namespace are thus widely 
deployed for extended periods of time.  Sometimes messages are sent 
between systems that were written to different versions of the 
vocabularies and associated namespace definitions.  To some degree, each 
receiver will validate against its own expectations, but it can also be 
useful for a message to have a standard place to say: "this is the version 
I was using when creating this message."  Some receivers will alter 
behavior and expectations based on such hints.
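
As a rough illustration of that last point (not taken from any particular
implementation), a receiver could read the xsi:schemaLocation pairs from an
incoming message and use them to pick among schema versions it already holds
locally. The sketch below is Python; the schema URI under example.org, the
version labels, and the function names are all hypothetical, and it inspects
only the root element even though the attribute may appear elsewhere in an
instance.

```python
import xml.etree.ElementTree as ET

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical message: the schema location is made up for illustration.
SAMPLE = """<grammar version="1.0"
    xmlns="http://www.w3.org/2001/06/grammar"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.w3.org/2001/06/grammar
                        http://example.org/schemas/grammar-2004.xsd"/>"""


def schema_location_hints(xml_text):
    """Return {namespace URI: schema document URI} from the root element."""
    root = ET.fromstring(xml_text)
    # The attribute value is a whitespace-separated list of
    # (namespace, location) pairs.
    tokens = root.get("{%s}schemaLocation" % XSI_NS, "").split()
    if len(tokens) % 2:
        raise ValueError("xsi:schemaLocation needs an even number of URIs")
    return dict(zip(tokens[0::2], tokens[1::2]))


def pick_local_version(xml_text, known_versions, default):
    """Map the sender's hint to a locally held schema version, if any.

    known_versions is an assumed table {schema document URI: local label};
    when the hint is absent or unrecognized, the receiver's default applies.
    """
    root = ET.fromstring(xml_text)
    # ElementTree spells a qualified tag as "{namespace}local".
    namespace = root.tag[1:].split("}")[0] if root.tag.startswith("{") else ""
    hinted = schema_location_hints(xml_text).get(namespace)
    return known_versions.get(hinted, default)
```

A message without the attribute simply falls back to the receiver's default,
which corresponds to the "validate against its own expectations" case.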

> 
> If the owner of a namespace publishes or endorses
> *conflicting* schemas, on purpose, for an extended
> period of time, surely that's anti-social, no?

In my experience:

a) Such conflicts can't in all cases be avoided when fixing bugs
b) There are times in which one gradually deprecates and eventually 
forbids previously legal constructions.  Obviously, doing so has 
consequences and the decision shouldn't be made lightly.  The net result 
is that, over time, the rules for the earliest and the latest versions come 
to conflict.

That said, I'll gladly acknowledge that such messes are to be avoided 
wherever possible, and that the results of incompatibilities can easily 
become "anti-social" in the sense you mean.
 
> As to the HTML case, I don't think XHTML strict
> conflicts with XHTML Transitional. And to the
> extent that they do conflict, it very much is
> anti-social. The world would be better off if we
> could just publish one schema for XHTML.
> 
> 
> >   Presumably RDDL purposes or other properties can be
> > developed for designating either all of the versions that have ever been
> > published, or else the latest (if you maintain linear versioning for your
> > NS, which is common but certainly not in general required.)
> 
> Yes, but that's rarely worthwhile. For most
> documents, we just update them in place. Only for
> a very special few do we maintain archival copies
> of old versions. Perhaps the balance will be a bit
> different for schemas, but they're not
> fundamentally different. It's not as though a
> schema document is really all that different from
> a natural language text document or a picture,
> when it comes to versioning issues.

I agree that you're talking about perhaps an 80% case, but not a 99% case.  
I'm not prepared to promote xsi:schemaLocation as a deeply appealing 
solution even for what it tries to do, but the general notion of being 
able to designate in the instance a particular version of a vocabulary or 
namespace definition does seem to me to be useful at times. 
 
> > xsi:schemaLocation allows an instance document to say explicitly "this is
> > the version of the schema that was in force at the time I was written."
> > That seems useful to me.
> 
> Quite; that's the exception I stipulate to.

So, we're agreeing. Good.

> > Though I don't think it relates to this particular use case, there is
> > another factor relating to namespace descriptions that I think is worth
> > mentioning:
> > 
> > * While we were designing the schema language, at least one vendor
> > described implementation experience with a production quality system that
> > by default dereferenced the NS URI to get a schema.  They found that in
> > many cases this was impractical, because in fact so many namespaces do not
> > have retrievable representations,
> 
> And that's where the bug is.
> 
> "A URI owner SHOULD provide representations of the
> resource it identifies"
> http://www.w3.org/TR/2004/REC-webarch-20041215/#pr-describe-resource
> 
> If following hypertext links in the web were only
> 10% reliable, rather than 94% reliable, the web
> would be a pretty un-interesting place, though
> everybody would be, technically, conforming to all
> the specs. Right now, the web of XML namespaces is
> profoundly boring.
> 
> >  and they could not tell in advance which
> > did and which didn't.  Network timeouts tend to be quite long on public
> > networks, and typically much longer than the time required to successfully
> > retrieve a representation (especially if that representation is cached.)
> > So, their parsers spent long periods waiting on failed connection
> > attempts.  While the same concern applies up to a point for
> > xsi:schemaLocation, at least someone is explicitly warranting that
> > retrieval is a good bet for that one.  Maybe over time high speed
> > retrievability of NS representations will become nearly universal,
> 
> Yes, the TAG should do everything it can to get
> that to happen soon.

Indeed.  Still, even if resolution of namespace names were "94%" 
successful, I can think of high performance environments in which waiting 
for the timeouts on the remaining 6% would be completely unacceptable. 
Then again, the same concern applies to some degree to xsi:schemaLocation.
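
One way a consumer might bound that cost is sketched here in Python, under
stated assumptions: a short per-request timeout, a negative cache so that a
URI which just failed is not retried for a while, and trying the explicit
xsi:schemaLocation hint before the namespace URI. The class and function
names, the timeout values, and the injected fetch callable are hypothetical;
nothing in this thread prescribes them.

```python
import time


class SchemaFetcher:
    """Fetch schema documents with a bounded wait and a negative cache."""

    def __init__(self, fetch, timeout_s=0.5, negative_ttl_s=3600.0,
                 clock=time.monotonic):
        # fetch is an assumed callable (uri, timeout_s) -> bytes that raises
        # OSError on failure; a real one could wrap urllib.request.urlopen.
        self._fetch = fetch
        self._timeout_s = timeout_s
        self._negative_ttl_s = negative_ttl_s
        self._clock = clock
        self._cache = {}         # uri -> retrieved bytes
        self._failed_until = {}  # uri -> time before which we do not retry

    def get(self, uri):
        if uri in self._cache:
            return self._cache[uri]
        now = self._clock()
        if self._failed_until.get(uri, float("-inf")) > now:
            return None  # failed recently; do not wait on it again
        try:
            body = self._fetch(uri, self._timeout_s)
        except OSError:
            self._failed_until[uri] = now + self._negative_ttl_s
            return None
        self._cache[uri] = body
        return body


def resolve_schema(fetcher, namespace_uri, hinted_location=None):
    """Try the explicit hint first, then the namespace URI itself.

    Returns (uri, body) for the first candidate retrieved, else (None, None).
    """
    for uri in (hinted_location, namespace_uri):
        if uri:
            body = fetcher.get(uri)
            if body is not None:
                return uri, body
    return None, None
```

The hint is tried first because, as noted in the quoted text, it is the one
retrieval that someone has explicitly warranted; it remains subject to the
same timeout as any other.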

> >  but in
> > the meantime we've had implementation reports suggesting that one wants
> > explicit hints as to which retrievals should be attempted and which not.
> 
> Can you argue that these hints are of long-term
> value to the web?  That they actually make the
> meaning of documents more clear?

Yes, but only insofar as I noted our area of agreement above.
 

Noah

--------------------------------------
Noah Mendelsohn 
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------

Received on Monday, 9 January 2006 23:02:50 UTC