
Re: DTD/Schema level of validity

From: Al Gilman <asgilman@iamdigex.net>
Date: Sat, 29 Mar 2003 08:36:47 -0500
Message-Id: <5.1.0.14.2.20030329075149.02c46580@pop.iamdigex.net>
To: www-qa@w3.org

At 03:14 AM 2003-03-29, Bjoern Hoehrmann wrote:

>* Karl Dubost wrote:
> >Right now, we define the level of validity of a document with
> >regard to its DTD. For example, the Markup validator validates
> >documents against the DTD.
> >
> >With the coming of new technologies, often a schema AND a DTD are developed.
> >The Schema being often more refined, more detailed than the DTD.
> >
> >For validity:
> >
> >       How do we define the validity of a document
> >       when a schema and a DTD for the same technology
> >       do not express the same constraints?
>
>Validity is defined by the schema language and the schema document. When
>a document instance meets the requirements of the associated schema
>instance it's said to be valid. For example, XML 1.0 SE defines:
>
>   An XML document is valid if it has an associated document type
>   declaration and if the document complies with the constraints
>   expressed in it.
>
>When there are multiple associated schema instances for a given document
>instance, you have multiple notions of Validity. If there is a DTD and a
>W3C XML Schema and the document meets all their requirements, it's both
>XML-1.0-DTD-valid and W3C-XML-Schema-1.0-valid. I think it's not a good
>idea to redefine document Validity, regardless of how many associated
>schemas exist. For example, an XHTML 1.0 document with <img alt='text'
>src='image' width='damn fat!' /> could be XML-1.0-DTD-valid, but calling
>it a Valid XHTML 1.0 document is at least misleading. XHTML 1.0 does not
>(re-)define the term "valid" in its context; defining validity should
>be subject to schema languages only.

Disagree.  The bare word 'valid' should be restricted to "schema-valid" only
in places where that sense is worth treating as the dominant sense.  In more
general contexts, compound terms such as "valid to schema" or "PSV validity"
should be used to make the binding to that sense explicit.

In particular, documents may assert constraints over and above the
constraints systematically imported from a schema.  For a document to be
valid in its own terms, these constraints must be satisfied as well as
those imported from a schema.

The distinguishing trait is that every instance of an asserted constraint
pattern holds, whether the pattern is asserted locally or indirectly through
an external reference.  There is no need for all such constraint patterns to
be articulated in an external schema.  Articulation in an external schema
will be the dominant case in Web Services, which is where we will begin to
get a public that cares about valid documents, but it is not definitive.

The simple 'valid' should generally be used to mean "valid on its own terms"
which is to say that _all_ constraint assertions declared in the document
check out.  This *must include* the constraints of a cited-as-controlling
schema but *is not limited to* such external constraint patterns.

Local declarations of constraint patterns may appear in a 'metadata' section
of an SVG or SRGS document, for example, alongside the constraints inherited
from a more general schema.

In the common ideology, documents aren't valid; assertions are valid.  It is
possible for a document to make multiple independent or partially dependent
allegations which fall under the general semantic of 'dc:conforms-to.'  The
historical use of 'valid' has to do with the validity of the constraint
conformance assertions implied by a DOCTYPE declaration.

But the issue is "well-posed constraint assertions" not "in a schema."

A document is valid iff all well-posed constraint assertions are satisfied.

The technology we need to work on is the method of enumerating the checkpoints
that are the instances of all well-posed constraint assertions.  Schemas can
and do contribute to this collection but have no exclusive role in that regard.
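
As a rough illustration of that enumeration, here is a minimal Python sketch.
Everything in it is an assumption made for the example: the names Checkpoint,
schema_checkpoints, local_checkpoints, enumerate_checkpoints, is_valid and
failures are hypothetical, and the two rules are hard-coded stand-ins (one
playing the part of a schema-imported constraint, one playing the part of a
constraint the document asserts locally) rather than anything read from a
real DTD, schema, or metadata section.  It only shows validity computed as
the conjunction of all enumerated checkpoints, whatever their source.

```python
import re
import xml.etree.ElementTree as ET
from typing import Callable, Iterator, List, NamedTuple, Tuple


class Checkpoint(NamedTuple):
    """One instance of a well-posed constraint assertion (hypothetical type)."""
    source: str                 # where the assertion came from
    description: str            # human-readable statement of the constraint
    check: Callable[[], bool]   # returns True when the constraint is satisfied


def schema_checkpoints(root: ET.Element) -> Iterator[Checkpoint]:
    # Stand-in for constraints imported from a cited schema:
    # every <img> element must carry an alt attribute.
    for img in root.iter("img"):
        yield Checkpoint("schema", "img has alt",
                         lambda img=img: "alt" in img.attrib)


def local_checkpoints(root: ET.Element) -> Iterator[Checkpoint]:
    # Stand-in for constraints asserted outside the schema:
    # a width, when present, must be a pixel count or a percentage.
    for img in root.iter("img"):
        width = img.get("width")
        if width is not None:
            yield Checkpoint("local", "img width is a length",
                             lambda w=width: re.fullmatch(r"\d+%?", w) is not None)


def enumerate_checkpoints(root: ET.Element) -> Iterator[Checkpoint]:
    # Schemas contribute checkpoints but have no exclusive role.
    yield from schema_checkpoints(root)
    yield from local_checkpoints(root)


def is_valid(root: ET.Element) -> bool:
    # Valid iff every enumerated checkpoint is satisfied.
    return all(cp.check() for cp in enumerate_checkpoints(root))


def failures(root: ET.Element) -> List[Tuple[str, str]]:
    # List the (source, description) of each unsatisfied checkpoint.
    return [(cp.source, cp.description)
            for cp in enumerate_checkpoints(root) if not cp.check()]


if __name__ == "__main__":
    doc = ET.fromstring("<p><img alt='text' src='image' width='damn fat!' /></p>")
    print(is_valid(doc), failures(doc))
```

With the img element from the quoted example, the stand-in schema rule passes
and the stand-in local rule fails, so the document is not valid on its own
terms even though it would pass a check limited to the schema-side rule.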

Schemas are a dominant mechanism to collect and reuse constraint assertions,
but not the only way they can be well-posed.

Al
Received on Saturday, 29 March 2003 08:39:13 GMT
