Ext/Vers terminology with generic/xml split from David Orchard on 2006-02-24 (public-xml-versioning@w3.org from February 2006)

From: David Orchard <dorchard@bea.com>
Date: Fri, 24 Feb 2006 15:08:23 -0800
To: <public-xml-versioning@w3.org>
Message-ID: <E16EB59B8AEDF445B644617E3C1B3C9CDB8A98@repbex01.amer.bea.com>
I've updated the terminology section including diagrams to do the
generic/xml split.  


1.1 Terminology


The terminology for describing languages, namespaces, constraints,
evolvability, etc. follows. Let us consider an example. Two systems need
to exchange name information. Names may not be the perfect choice of
example because of internationalization issues, but they resonate
strongly with a very large audience. The Name language is created to be
exchanged. [Definition: A producer is an agent that creates an
instance.] [Definition: A production is the creation of an instance.] A
producer produces an instance with the intent of conveying information.
[Definition: A consumer is an agent that consumes an instance.]
[Definition: A consumption is the processing of an instance of a
language.] A consumer is impacted by the instance that it consumes.
That is, it interprets that instance and bases future processing, in
part, on the information that it believes was present in that instance.
An instance can be consumed many times, by many consumers, and have many
different impacts.
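
As a rough illustration of the producer and consumer roles, here is a
minimal Python sketch. The function names and the sample values are
hypothetical, not part of the draft; it only assumes that a Name
instance is a name element containing first and last. One producer
creates an instance, and two consumers process the same instance with
different impacts.

```python
import xml.etree.ElementTree as ET


def produce_name(first: str, last: str) -> str:
    """Producer: creates an instance of the Name language (a production)."""
    name = ET.Element("name")
    ET.SubElement(name, "first").text = first
    ET.SubElement(name, "last").text = last
    return ET.tostring(name, encoding="unicode")


def consume_for_greeting(instance: str) -> str:
    """Consumer 1: bases its processing on the first name only."""
    root = ET.fromstring(instance)
    return f"Hello, {root.findtext('first')}!"


def consume_for_sorting(instance: str) -> tuple:
    """Consumer 2: derives a sort key from the same instance."""
    root = ET.fromstring(instance)
    return (root.findtext("last"), root.findtext("first"))


# One production, two consumptions, two different impacts.
instance = produce_name("Ada", "Lovelace")
greeting = consume_for_greeting(instance)
sort_key = consume_for_sorting(instance)
```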

Generally, a language has one or more vocabularies that each may have
multiple terms. Formally, [Definition: A language is an identifiable set
of vocabulary terms with defined syntactic and semantic constraints.]
By language, we mean the set of texts that are members of that language
and used by a particular application. [Definition: A vocabulary is a set
of terms.] The Name language consists of three terms: name, first, and
last. In order to identify the terms in the Name language in XML, a
namespace is assigned to the terms. Other examples of vocabularies
include the elements and attributes of XHTML 1.0 or the names of
built-in functions in XPath 2.0. The Name language could also include
terms from other vocabularies, such as Dublin Core or UBL. These terms
each have their own namespaces, illustrating that a language can
comprise vocabularies from multiple namespaces.

The Name language takes the three terms and specifies the constraints:
that a name consists of a first and a last. [Definition: A language has
a set of constraints that apply to the vocabulary terms in the
language.] These constraints can be defined in machine-processable
syntactic constraint languages such as XML Schema, in human-readable
textual descriptions such as HTML descriptions, or embodied in software.
Languages may or may not be defined by a schema in any particular schema
language. The constraints on a language will govern the membership of
instances in the language, which may be considered the set of strings
that are in the language.
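
The idea that constraints govern membership can be sketched as
constraints "embodied in software". The following Python check is only
an illustration: the function name is hypothetical, namespaces are left
out for brevity, and it assumes the constraint means exactly one first
followed by exactly one last, which the text does not spell out.

```python
import xml.etree.ElementTree as ET


def is_name_instance(text: str) -> bool:
    """Return True if `text` is a member of the (sketched) Name language."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        # Not well-formed XML, so not a member of any XML language.
        return False
    if root.tag != "name":
        return False
    # Assumed constraint: a name consists of a first and then a last.
    return [child.tag for child in root] == ["first", "last"]
```

With this check, "<name><first>Ada</first><last>Lovelace</last></name>"
is a member of the language, while a name with only a first element, or
a string that is not XML at all, is not.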

In general, the intended meaning of a vocabulary term is scoped by the
language in which the term is found. However, there is some expectation
that terms drawn from a given vocabulary will have a consistent meaning
across all languages in which they are used. Confusion often arises when
terms have inconsistent meaning across languages. The Name terms might be
used in other languages, but it is generally expected that they will
still be "the same" in some meaningful sense.

[Definition: Text is a specific, discrete sequence of characters]. Given
that there are constraints on a language, any particular text may or may
not have membership in a language. Indeed, a particular string of
characters may be a member of many languages, and there may be many
different strings of characters that are members of a given language.
The texts of a language are the units of exchange. Documents are texts
of a language.

These terms and their relationships are shown in the attached diagram
(ext-vers-generic-uml.png).

There are many different systems for exchanging texts in languages, such
as SQL, Java, XML, ECMAScript, C#. We will briefly describe some key
refinements to our lexicon for XML. An XML language has a vocabulary
that may use terms from one or more XML Namespaces (or none), each of
which has a namespace name. [Definition: An XML language is an
identifiable set of vocabulary terms with defined XML syntactic and
semantic constraints. ] By XML language, we mean the set of elements and
attributes, or instances, used by a particular application. The Name
language - consisting of name, first, last - has a namespace assigned
to its terms. We use the prefix "namens" to refer to that namespace. The
Name language could consist of terms from other vocabularies, such as
Dublin Core or UBL. These terms each have their own namespaces,
illustrating that a language can comprise vocabularies from multiple
namespaces. An XML Namespace is a convenient container for collecting
terms that are intended to be used together within a language or across
languages. It provides a mechanism for creating globally unique names. 
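
To make the point about multiple namespaces concrete, here is a small
Python sketch. The namespace name http://example.org/name is a
hypothetical stand-in (the draft gives only the prefix "namens"), the
use of dc:date inside a name is invented for illustration, and
namespaces_used is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

NAME_NS = "http://example.org/name"         # hypothetical namespace name
DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core elements namespace

# An instance whose terms come from two vocabularies (illustrative only).
INSTANCE = f"""
<namens:name xmlns:namens="{NAME_NS}" xmlns:dc="{DC_NS}">
  <namens:first>Ada</namens:first>
  <namens:last>Lovelace</namens:last>
  <dc:date>2006-02-24</dc:date>
</namens:name>
""".strip()


def namespaces_used(text: str) -> set:
    """Collect the namespace names of all elements in an instance."""
    found = set()
    for element in ET.fromstring(text).iter():
        # ElementTree expands qualified names to "{namespace-name}local".
        if element.tag.startswith("{"):
            found.add(element.tag[1:].split("}", 1)[0])
        else:
            found.add(None)  # element in no namespace
    return found
```

Calling namespaces_used(INSTANCE) yields both namespace names, showing
one language instance drawing on two vocabularies.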

We shall use the term instance when speaking of sequences of characters
(aka text) in XML. [Definition: An instance is a specific, discrete
sequence of terms.] Documents are instances of a language. In XML, they
must have a root element. A name document might have a name element as
its root element. Alternatively, the name vocabulary may be used by a
language such as purchase orders. The purchase order documents may
contain name elements. Thus instances of a language are always part of a
document and may be the entire document.
XML instances (and all other instances of markup languages) consist of
markup and content. In the name example, the first and last elements
including the end markers are the markup. The values between the start
and end markers are the content. An instance has an information model.
There are a variety of data models within and without the W3C, and the
one standardized by the W3C is the XML infoset.
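
The markup/content distinction can be approximated in a few lines of
Python. This is a simplification under stated assumptions: element
names stand in for the markup (start and end markers), only element
text is treated as content, and split_markup_and_content is a
hypothetical helper rather than anything defined by the draft or the
infoset.

```python
import xml.etree.ElementTree as ET


def split_markup_and_content(text: str) -> tuple:
    """Return (element names, non-empty text values) for an XML instance."""
    root = ET.fromstring(text)
    # Element names approximate the markup.
    markup = [element.tag for element in root.iter()]
    # Non-whitespace text directly inside elements approximates the content.
    content = [
        element.text.strip()
        for element in root.iter()
        if element.text and element.text.strip()
    ]
    return markup, content


markup, content = split_markup_and_content(
    "<name><first>Ada</first><last>Lovelace</last></name>"
)
```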

The XML-related terms and their relationships are shown in the attached
diagram (ext-vers-xml-uml.png).

A stylesheet processor is a consumer of the XML document that it is
processing (the producer isn't mentioned); in the Web services context
the roles of producer and consumer alternate as messages are passed back
and forth. Note that most Web service specifications provide definitions
of inputs and outputs. By our definitions, a Web service that updates
its output schema is considered a new producer. A service that updates
its input schema is a new consumer. 

We now return to our discussion of languages in general. Extensibility
is a property that enables evolvability of software. It is perhaps the
biggest contributor to loose coupling in systems as it enables the
independent and potentially compatible evolution of languages.
[Definition: A language is extensible if instances of the language can
include terms from other vocabularies.] The Name language is extensible
if it can include terms from other vocabularies, such as a new middle
term.
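
One way to picture the difference between an extensible and a
non-extensible Name language is the following Python sketch. Both
namespace names and the is_member function are hypothetical, and the
rule that terms from other namespaces may appear anywhere among the
children is an assumption made for the example, not something the
definition above requires.

```python
import xml.etree.ElementTree as ET

NAME_NS = "http://example.org/name"  # hypothetical namespace name
EXT_NS = "http://example.org/ext"    # hypothetical extension vocabulary

PLAIN = (
    f'<n:name xmlns:n="{NAME_NS}">'
    "<n:first>Ada</n:first><n:last>Lovelace</n:last></n:name>"
)
EXTENDED = (
    f'<n:name xmlns:n="{NAME_NS}" xmlns:e="{EXT_NS}">'
    "<n:first>Ada</n:first><e:middle>King</e:middle>"
    "<n:last>Lovelace</n:last></n:name>"
)


def is_member(text: str, extensible: bool) -> bool:
    """Membership check for the Name language, with or without extension."""
    root = ET.fromstring(text)
    own_prefix = f"{{{NAME_NS}}}"
    if root.tag != own_prefix + "name":
        return False
    own = [c.tag for c in root if c.tag.startswith(own_prefix)]
    foreign = [c.tag for c in root if not c.tag.startswith(own_prefix)]
    # The language's own terms must still satisfy its constraints.
    if own != [own_prefix + "first", own_prefix + "last"]:
        return False
    # Terms from other vocabularies are allowed only if extensible.
    return extensible or not foreign
```

Under these assumptions, EXTENDED is a member only when extensible is
True, while PLAIN is a member either way.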

Attachments

application/octet-stream attachment: ext-vers-uml.violet
image/png attachment: ext-vers-xml-uml.png
image/png attachment: ext-vers-generic-uml.png
application/octet-stream attachment: ext-vers-generic-uml.violet
Received on Friday, 24 February 2006 23:10:23 UTC