Well-formed (was: Re: F2F Proposed Resolutions Draft Updates)

Hi Gez, Roberto and others,

At 17:09 17/06/2005, Gez Lemon wrote:

>On 17/06/05, Yvette Hoitink <y.p.hoitink@heritas.nl> wrote:
> > However, well-formedness is a concept that does not apply to all
> > technologies but only to technologies that use tags such as XML-based
> > technologies and HTML. That's why we limited the requirement to SGML-based
> > delivery units.
>
>HTML isn't required to be well-formed. Well-formedness was introduced
>with XML, so I'm confused as to why an SGML language like HTML would
>have to be well-formed, which isn't required by spec, but not
>necessarily valid, which is required by spec.

In Brussels, we tried to define a level of "correctness" that is lower than
validity and that still makes sense for formats that are not based on XML.
It is true that SGML does not define well-formedness, but if you say that a
well-formed document is essentially "one that can unambiguously be parsed
to create a logical tree in memory" (Jon Bosak, at
http://www.isgmlug.org/n3-1/n3-1-18.htm),
then you can apply this concept also to SGML. It means that you use the SGML
declaration [1] and the DTD to check correct nesting of elements,
that attributes only occur in start tags, that naming rules for elements are
obeyed, that the delivery unit uses the charset defined in the SGML
declaration, etc.
Because "SGML applications" have an SGML declaration, I think it is possible
to define (and, consequently, require) something that may be called
"well-formedness" for HTML.
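
As a rough illustration of the first check listed above (correct nesting of
elements), here is a minimal sketch in Python. It is not an SGML parser and
does not read the SGML declaration or the DTD: it relies on the lenient
tokenizer in Python's html.parser, takes the EMPTY elements from the HTML 4.01
DTD, and uses an illustrative, incomplete set of elements whose end tag may be
omitted. The names NestingChecker and check_nesting are made up for this
example. The other checks (attributes, naming rules, charset) are not covered.

```python
from html.parser import HTMLParser

# Elements declared EMPTY in the HTML 4.01 DTD (they take no end tag).
EMPTY = {"area", "base", "basefont", "br", "col", "frame", "hr",
         "img", "input", "isindex", "link", "meta", "param"}

# Assumption: an illustrative subset of elements whose end tag may be omitted.
OPTIONAL_END = {"p", "li", "dt", "dd", "option", "tr", "td", "th",
                "thead", "tfoot", "tbody", "colgroup", "html", "head", "body"}


class NestingChecker(HTMLParser):
    """Tracks open elements on a stack and records nesting errors."""

    def __init__(self):
        super().__init__()
        self.open_elements = []
        self.errors = []

    def handle_starttag(self, tag, attrs):
        # EMPTY elements never stay open, so they are not pushed.
        if tag not in EMPTY:
            self.open_elements.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Treat <br/> like <br>; other elements open and close at once.
        self.handle_starttag(tag, attrs)
        if tag not in EMPTY:
            self.handle_endtag(tag)

    def handle_endtag(self, tag):
        if tag in EMPTY:
            self.errors.append(f"end tag for EMPTY element: </{tag}>")
            return
        if tag not in self.open_elements:
            self.errors.append(f"unexpected end tag: </{tag}>")
            return
        # Pop until the matching element; anything skipped on the way must
        # be an element whose end tag is allowed to be omitted.
        while self.open_elements:
            top = self.open_elements.pop()
            if top == tag:
                break
            if top not in OPTIONAL_END:
                self.errors.append(f"<{top}> not closed before </{tag}>")


def check_nesting(markup):
    """Return a list of nesting errors; an empty list means none were found."""
    checker = NestingChecker()
    checker.feed(markup)
    checker.close()
    for tag in checker.open_elements:
        if tag not in OPTIONAL_END:
            checker.errors.append(f"<{tag}> never closed")
    return checker.errors
```

With this sketch, "<ul><li>one<li>two</ul>" produces no errors even though it
is not well-formed in the XML sense, while "<b><i>text</b></i>" is reported as
incorrectly nested. A real check would have to derive the two sets from the
DTD rather than hard-code them.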

Regards,

Christophe Strobbe



[1] For an SGML declaration for XML, see
http://www.w3.org/TR/NOTE-sgml-xml-971215.html;
for the HTML 4.01 SGML declaration, see
http://www.w3.org/TR/1999/REC-html401-19991224/sgml/sgmldecl.html.


-- 
Christophe Strobbe
K.U.Leuven - Departement of Electrical Engineering
Research Group on Document Architectures
Kasteelpark Arenberg 10 - 3001 Leuven-Heverlee - BELGIUM
tel: +32 16 32 85 51
http://www.docarch.be/ 

Received on Friday, 17 June 2005 16:41:47 UTC