[XMLVersioning] Definition of an XML Language (was: Re: (Partial) review of Versioning XML)

Norm writes:

> >    There are many different systems for exchanging texts in languages,
> >    such as SQL, Java, XML, ECMAScript, C#. We will briefly describe some
> >    key refinements to our lexicon for XML. An XML language has a
> >    vocabulary that may use terms from one or more XML Namespaces (or
> >    none), each of which has a namespace name. [Definition: An XML
> >    language is an identifiable set of vocabulary terms with defined XML
> >    syntactic and semantic constraints. ] By XML language, we mean the
> >    set of elements and attributes, or instances, used by a particular
> >    application.
> 
> Really? How does "used by a particular application" fit in? I would have
> thought that we meant the set of instances that conform to the rules of
> the language independent of any particular application. Surely my XML
> language is a language even before there are any applications that are
> expecting to process it.

I agree with Norm on this.  Furthermore, the draft formulation comes close
to suggesting that it's the markup alone that defines the XML language.
I think the content within the markup is equally significant.  For
example, consider the two instance fragments:

<phone>555-1212</phone>

and

<phone>011-44-332-557-9367</phone>

and assume that the schema for the first requires exactly 7 digits with
one hyphen, while the schema for the second allows a digit sequence of
any length with embedded hyphens.  Are these the same language?  I claim
not, both on commonsense grounds and per the terminology we've been
building up in part 1.  There are documents that are members of the
second language and not members of the first, even though the markup is
identical.
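
To make the membership argument concrete, here is a minimal Python sketch.
It is an illustration only, not anything taken from the draft: the two
regular expressions are my assumed readings of the content rules described
above (the 3-4 digit grouping in particular is an assumption), and the
names STRICT_PHONE, LOOSE_PHONE and in_language are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Assumed content rules; the message describes them only in prose.
# Language 1: exactly 7 digits with one hyphen (3-4 grouping is assumed).
STRICT_PHONE = re.compile(r"\d{3}-\d{4}")
# Language 2: digits of any length, with hyphens only between digit runs.
LOOSE_PHONE = re.compile(r"\d+(?:-\d+)*")


def in_language(fragment, content_rule):
    """Return True if fragment is a well-formed <phone> element whose
    text content satisfies content_rule."""
    try:
        root = ET.fromstring(fragment)
    except ET.ParseError:
        return False  # not even well-formed XML
    # The markup rule is the same for both languages: a bare <phone>
    # element with no attributes and no child elements.
    if root.tag != "phone" or root.attrib or len(root) > 0:
        return False
    return content_rule.fullmatch(root.text or "") is not None


if __name__ == "__main__":
    for doc in ("<phone>555-1212</phone>",
                "<phone>011-44-332-557-9367</phone>"):
        print(doc,
              in_language(doc, STRICT_PHONE),
              in_language(doc, LOOSE_PHONE))
```

Under these assumptions the markup check is shared and only the content
rule differs, yet the two functions accept different sets of documents,
which is the sense in which the two are different languages.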

No doubt there are some interesting things to say about how markup is used 
when languages are versioned, and I expect that a significant fraction of 
part 2 will be devoted to such issues, but I think it's very important to 
discuss evolution of markup and content in combination, and I'm reluctant 
to say that by definition "An XML language is an identifiable set of 
vocabulary terms."  As Norm says and part 1 strongly suggests, an XML 
language is a class of texts, all of which are constrained by the rules of 
XML 1.x to be at least well formed, and some of which are further 
constrained for the purpose of conveying some particular sort of 
information (technical report, purchase order, TAG Finding, whatever.)  I 
think that part 2 should discuss the versioning and evolution of such 
languages.


--------------------------------------
Noah Mendelsohn 
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------

Received on Monday, 14 May 2007 13:21:35 UTC