- From: Arjun Ray <aray@q2.net>
- Date: Fri, 20 Jun 1997 02:23:07 -0400 (EDT)
- To: w3c-sgml-wg@w3.org
On Fri, 20 Jun 1997, Sam Hunting wrote:

> Tim Bray writes:
>
> > >Perhaps in a future revision process DTDs will transmogrify, and
> > >son-of-8879 will have a much more ambitious world-view as to what
> > >constitutes a markup declaration. In the meantime, these guys are
> > >convinced that they need namespaces and they need them well-defined
> > >by Q4 '97, and we shouldn't tell them that they can't have them.
>
> Well, then *this* is the fundamental requirement, then, isn't it?

It certainly appears so -- an essentially external process has deemed a protean notion yclept "namespace" to be a Good Thing To Have On Hand ASAP. The issue, then, is the syntactic machinery to permit this.

> I buy David Durand's plea for a conservative approach to the namespaces
> issue.

I'll add my Me Too. Assuming that *some* syntactic device will have to be invented, I would prefer that it (a) be least invasive in terms of the appropriate changes in SGML-bis, (b) not be *limited* to 97q4 "namespaces" in its overt syntactic function.

In fact, the syntactic requirements appear to be relatively orthogonal to the whole business of why namespaces are so urgent. If it were just a matter of associating a "wider context" to an element -- typically for semantic purposes -- the AF solution via attributes *is* an answer. This can't be the problem.

Rather, the problem seems to be how to "uniquify" GIs in contexts where name clashes can't be ruled out. Quite apart from the serious validation issues raised (Paul Grosso's AAP example comes to mind), there's also the chance that the app downstream from the parser, i.e. the "semantic engine", will get confused too. Hey, waitaminnit. Preventing that is supposed to be a raison d'etre of SGML...

So, what are we *really* talking about here when we say that "name clashes can't be ruled out"? IOW, *why* can't they be ruled out?

It appears that the Canonical Problem is not inclusion of data from multiple domains.
It's such inclusion *on an ad hoc basis*; this is what forces the need for syntactic distinguishability in the instance. We're talking about a syntactically explicit Cut'N'Paste mechanism.

It may help, then, to work backwards from a standard SGML answer to this, notations and (external) entities, and cast this as what happens when you inline such a notated entity. Clearly, the requirement is to preserve the notation information: the data content of the entity will still need to be in a portable or transferable form (on General Principles: some day we'll need the external reference mechanism anyway, and then it will do no good if the actual data content has to assume different syntactic forms depending on where it gets to be plunked.)

The argument against the AF answer is that the naming attribute even when included isn't distinguishable in the instance: it requires extra DTD machinery to work.

In my own long-winded way, I've arrived at the point where I believe the three kinds of (nominally sans-DTD) proposals can be understood in terms of their different approaches to distinguishability:

1. Lexical - add a character to the set of name characters and use it as a name-compounder. HTML:A, TEILITE:NOTE, etc.

2. Syntactic - use a PI to stuff the disambiguating information.

3. Structural - use a newfangled marked section as an explicit scoping device.

I would argue against (1) on the grounds that (a) it is overkill when the problem context is (or rates to be) essentially ad hoc in its incidence, (b) unnecessarily verbose, if not goofy, when the content might need to be reusable as an external entity (indeed, why couldn't it have started out that way?), and (c) it doesn't scale to the situation where multiple domains/namespaces/whathaveyous might need to be encoded (cf. two or more AF attributes.)

I have no strong argument against (2). It works like a DTD in absentia, in that the information has to be separately parsed *and* buffered while the data content is parsed.
That is, it's not syntactically (more accurately, lexically) explicit in its scope; we need a full parse even to start.

Nevertheless, my push-come-to-shove preference is for something like (3). (a) the scope of <![ .... ]]> is lexically distinct in a reasonably opaque fashion (sort of like figuring out the extent of an IGNORE MS), (b) like a PI, we keep the "notation" information separate from the actual data content, and (c) also like a PI, it focuses on a general purpose syntactic mechanism that can be specialized for the needs of namespaces.

Currently, only status keywords are allowed between the DSOs. I would propose a variant on Henry Thompson's proposal with syntax like this:

    <![ (name-group) [ ... ]]>

Assuming PEs aren't dead in the water, hiding the namegroup also becomes possible (and in extreme cases, the PE might be redefinable to CDATA and the buck of parsing the content passed to the application, which could invoke another parser instance ... hey, ad hoc come, ad hoc go.) Indeed, the entire marked section can also be stuffed into an entity declaration if need be.

> Wouldn't it be possible to enable the ":" to be added to XML names, and
> then enable namespaces themselves at the "application specific
> instructions level"-equivalent in XML, which I would take to be a set of
> processing instructions in the Misc section of the Prolog?

Name-munging certainly looks like an easy way out. But it smacks too much of forcefitting a solution whose essential appeal derives from a different paradigm (C++?). Sure, the programmers will grok it and love it. But it messes with the content (the need to "resolve" GIs gives them a data quality beyond their markup function) when what we need is just markup.

> That way, the Q4 guys are happy, experimentation with namespaces can
> proceed apace, and any failures wouldn't bring XML or SGML down.
Perhaps, but any solution that won't scale and won't work with other mechanisms such as declarations for external entities and notations rates to be penny wise and pound foolish. IMHO, of course.

Arjun
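
To illustrate point (a) in the case made for option (3), here is a minimal Python sketch, not part of any proposal, of how the extent of a `<![ (name-group) [ ... ]]>` section could be found by lexical scanning alone, with no full parse. The function name `scoped_sections`, the separators accepted inside the name group, and the choice to end a section at the first `]]>` (so no nesting) are all assumptions made for the example; the syntax itself is only the variant floated in the message.

```python
import re

# Hypothetical opener for the floated syntax: "<![", a parenthesised
# name group, then "[". Whitespace around the group is optional here
# (an assumption; the message does not pin this down).
OPEN = re.compile(r"<!\[\s*\(([^()]*)\)\s*\[")
CLOSE = "]]>"


def scoped_sections(text):
    """Yield (names, content) for each scoped section found in text.

    The scan is purely lexical: the content between the opener and the
    first "]]>" is returned untouched and is never parsed as markup.
    Nested marked sections are NOT handled in this sketch.
    """
    pos = 0
    while True:
        match = OPEN.search(text, pos)
        if match is None:
            return
        end = text.find(CLOSE, match.end())
        if end == -1:
            raise ValueError("unterminated marked section")
        # Assumed separators inside the name group: space, "|", "," or "&".
        names = [n for n in re.split(r"[\s|,&]+", match.group(1)) if n]
        yield names, text[match.end():end]
        pos = end + len(CLOSE)


if __name__ == "__main__":
    doc = 'x <![ (HTML) [<A HREF="u">t</A>]]> y <![(TEILITE|AF)[<NOTE>n</NOTE>]]>'
    for names, content in scoped_sections(doc):
        print(names, content)
```

The content comes back verbatim, which is the property argued for: the "notation" information (the name group) stays separate from data content that could later be moved into an external entity unchanged.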
Received on Friday, 20 June 1997 02:21:39 UTC