Re: SVG schema: Relax NG or not?

[ personal opinion, not representing the HTML WG's view ]

Tobias Reif <tobiasreif@pinkjuice.com> wrote:

> Which W3C working groups besides the XHTML one chose RNG before now?

RDF Core [1], Web Services Description [2], XML Signature [3]
(not officially), ... an infamous HLink draft [4] was the first one
to adopt it normatively, though.

> The XHTML people say:
> 
> http://www.w3.org/TR/2003/WD-xhtml2-20030506/
> "This version includes an early implementation of XHTML 2.0 in RELAX NG 
> [RELAXNG], but does not include the implementations in DTD or XML Schema 
> form. Those will be included in subsequent versions, once the content of 
> this language stabilizes."

The original plan was to wait until the language became more stable
and provide all 3 schemas at the same time, but since many people
asked for some sort of machine-readable schema sooner rather than
later, we chose to provide a RELAX NG schema first, simply because
the editor (me) was too lazy to work out modular DTDs and XML Schemas
while the WG is actively changing the design of the language.  For
HLink we already provided schemas in DTD, RELAX NG and XML Schema,
and the next draft of XFrames will also include schemas in DTD,
RELAX NG and XML Schema.

> It seems they chose RNG as the format for the master schema. Why? 
> Because RNG meets their requirements best, I would think.
> (Also see
> http://lists.w3.org/Archives/Public/www-svg/2003May/0048.html)

This may be a bit off-topic for www-svg, but I'll explain why, if
my experience helps the SVG community (sane people may stop here).

Last summer I did some investigation to see whether it's feasible
to have one master schema (in some schema language) and generate
other schemas nearly automatically.  It was intended from the start
that XHTML 2.0 would provide multiple schemas, and developing
complex schemas in different schema languages independently was
considered quite error-prone.

One of our primary interests was to generate a reasonable XML Schema
from a DTD, so I tried various conversion tools.  Interestingly, for
a monolithic schema, converting the DTD to RELAX NG with DTDinst [5]
and then converting that to XML Schema with Trang [6] produced the
best result (at that time - now Trang can convert DTD to XML Schema
directly).  Trang rocks.  Actually XHTML 1.0 in XML Schema [7]
was produced that way.  So we happen to have RELAX NG schemas
because we needed them to produce XML Schemas.  For simple monolithic
schemas, it's certainly possible to write a DTD and automatically
convert it to RELAX NG and XML Schema.

Things were not so easy for modular schemas, although DTDinst
(now Trang) managed to convert modular DTDs into modular RELAX NG
schemas amazingly well.  Generating modular XML Schemas didn't
look very promising, and converting modular XML Schemas into
other schemas looked even less promising.

So if we accept limiting ourselves to the expressive power of DTD,
starting from DTD was considered the most pragmatic choice at that
time, especially when you already have DTDs.  Both DocBook and TEI
have a long history of developing modular DTDs since the SGML era,
so it would be reasonable for them to generate RELAX NG and XML
Schema from DTD.

On the other hand, we had enough trouble dealing with namespaces
in DTD, and XHTML 2.0 is inherently a multi-namespace vocabulary,
so starting from DTD didn't seem appropriate for developing a new
schema from scratch.

So the other possibility was to start from either RELAX NG or XML
Schema, and converting RELAX NG to XML Schema seemed more promising
than the other way around.

Also we have had great trouble making modular XML Schemas work
across different implementations, while modular RELAX NG schemas
were more interoperable.  And the modular design of RELAX NG is more
straightforward than that of XML Schema, so when we develop a new
modular language and try various design patterns, working with
RELAX NG is much easier.  At least I was able to write up the first
draft of the XHTML 2.0 schema in RELAX NG in a few hours, but I don't
think I could do that in XML Schema.
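
To illustrate the kind of modular design I mean, here is a minimal
sketch of a common RELAX NG modularization pattern.  This is not the
actual XHTML 2.0 module set: the file names (core.rng,
object-module.rng) and the simplified content models are made up for
this example.  A core module defines a named class:

    <!-- core.rng (hypothetical file name) -->
    <grammar xmlns="http://relaxng.org/ns/structure/1.0">
      <start>
        <element name="body">
          <zeroOrMore>
            <ref name="Block.class"/>
          </zeroOrMore>
        </element>
      </start>
      <!-- the class that other modules are expected to extend -->
      <define name="Block.class">
        <element name="p">
          <text/>
        </element>
      </define>
    </grammar>

Another module includes it and adds one more alternative to that
class, without touching the core file:

    <!-- object-module.rng (hypothetical file name) -->
    <grammar xmlns="http://relaxng.org/ns/structure/1.0">
      <include href="core.rng"/>
      <!-- combine="choice" merges this definition with the one in
           core.rng, so Block.class now matches p or object -->
      <define name="Block.class" combine="choice">
        <element name="object">
          <text/>
        </element>
      </define>
    </grammar>

Since the definitions are combined as a choice, it doesn't matter in
which order modules extend the class.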

There is also the issue of expressive power.  XHTML 2.0 is not such
a complex language, but there are still some constraints that not
all schema languages can express.  For example, the content model
of the 'object' element in XHTML 2.0 is something like this [8]:

    ( caption?, standby?, param*, (PCDATA | Flow)* )

This can be straightforwardly expressed in RELAX NG like this [9]:

    <define name="object">
      <element name="object">
        <ref name="object.attlist"/>
        <optional>
          <ref name="caption"/>
        </optional>
        <optional>
          <ref name="standby"/>
        </optional>
        <zeroOrMore>
          <ref name="param"/>
        </zeroOrMore>
        <ref name="Flow.model"/>
      </element>
    </define>

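But neither XML Schema nor DTD can fully express this kind of mixed
content: a DTD only allows mixed content of the form
(#PCDATA | a | b)*, and XML Schema's mixed content cannot restrict
where the character data may appear (see Uche Ogbuji's article at
xmlhack.com [10] if you are interested in this topic).  Some elements
in SVG (e.g. 'text') have a similar issue, BTW.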
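
If you want to try the RELAX NG pattern above in a validator, a
minimal self-contained sketch might look like the following.  The
stub definitions for 'caption', 'standby', 'param', 'object.attlist'
and 'Flow.model' are my simplifications for this example only, not
the real XHTML 2.0 definitions:

    <grammar xmlns="http://relaxng.org/ns/structure/1.0">
      <start>
        <ref name="object"/>
      </start>

      <!-- same structure as the 'object' definition above -->
      <define name="object">
        <element name="object">
          <ref name="object.attlist"/>
          <optional>
            <ref name="caption"/>
          </optional>
          <optional>
            <ref name="standby"/>
          </optional>
          <zeroOrMore>
            <ref name="param"/>
          </zeroOrMore>
          <ref name="Flow.model"/>
        </element>
      </define>

      <!-- stubs (assumptions): no attributes, trivial children -->
      <define name="object.attlist">
        <empty/>
      </define>
      <define name="caption">
        <element name="caption"><text/></element>
      </define>
      <define name="standby">
        <element name="standby"><text/></element>
      </define>
      <define name="param">
        <element name="param"><empty/></element>
      </define>

      <!-- stand-in for (PCDATA | Flow)*, with 'p' as the only
           flow element -->
      <define name="Flow.model">
        <zeroOrMore>
          <choice>
            <text/>
            <element name="p"><text/></element>
          </choice>
        </zeroOrMore>
      </define>
    </grammar>

With this grammar, an 'object' whose 'caption' comes first and is
followed by text and 'p' elements should validate, while one where
'caption' appears after some non-whitespace text should be rejected.
That ordering constraint is exactly the part that the DTD and XML
Schema versions cannot express.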

So I chose to start from RELAX NG, primarily to make *my* life easier ;-)
This is just my case, and the conclusion might differ depending on what
vocabulary you are developing, which schema language you are most
familiar with, etc.

Whether we can achieve the original goal of automatic conversion
remains to be seen, but I'm discussing it with James Clark and we
(well, *he*) might be able to come up with a solution - possibly.
Let's see.

> But they will provide RNG, WXS (.xsd), and DTD versions of the XHTML 
> schema, probably all normative (they don't "abandon" WXS). So if your 
> tools are DTD based, or WXS based, you will be able to feed them what 
> they like.

Right.  As Dean said, the way we go about producing schemas isn't
as important.  The SVG WG's editor is much more diligent than me
so he could provide all schemas from the start ;-)

[1] http://www.w3.org/TR/2003/WD-rdf-syntax-grammar-20030123/#section-RELAXNG-Schema
[2] http://www.w3.org/TR/2003/WD-wsdl12-20030303/#other-schemalang
[3] http://www.w3.org/Signature/2002/07/xmldsig-core-schema.rng
[4] http://www.w3.org/TR/2002/WD-hlink-20020913/#a_RELAXNG_definition
[5] http://thaiopensource.com/relaxng/dtdinst/
[6] http://thaiopensource.com/relaxng/trang.html
[7] http://www.w3.org/TR/2002/NOTE-xhtml1-schema-20020902
[8] http://www.w3.org/TR/2003/WD-xhtml2-20030506/mod-object.html#s_objectmodule
[9] http://www.w3.org/TR/2003/WD-xhtml2-20030506/relax_module_defs.html#a_rmodule_Object
[10] http://xmlhack.com/read.php?item=1880

Regards,
-- 
Masayasu Ishikawa / mimasa@w3.org
W3C - World Wide Web Consortium

Received on Friday, 9 May 2003 09:54:17 UTC