Re: DTD of natural language

From: Rick Jelliffe <ricko@gate.sinica.edu.tw>
Date: Wed, 8 Nov 2000 18:21:54 +0800 (CST)
To: "Richard A. O'Keefe" <ok@atlas.otago.ac.nz>
cc: html-tidy@w3.org, johnspeter@hotmail.com
Message-ID: <Pine.GSO.4.21.0011081815010.29520-100000@gate>

On Wed, 8 Nov 2000, Richard A. O'Keefe wrote:

> SGML and XML just aren't *useful* for expressing the structure of English
> or any other natural language.  Lisp data values, yes.  Prolog data values,
> yes.  Either would be considerably more compact than XML.  You might raise
> RDF as a counter-example, but RDF is inexpressibly clumsy and bulky, and
> the structural constraints could not be expressed as a DTD.

You mean SGML and XML DTDs: you can express any kind of structure (that
can be captured as a directed, cyclic graph) in SGML and XML markup.  A
grammar abstracts away certain aspects to reduce the amount of markup needed.
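
As a rough illustration of the graph point, here is a minimal Python sketch,
not anything from a particular toolkit. It assumes a made-up <node> element
whose "id" and "next" attributes play the role that ID and IDREFS attributes
would play in a DTD; the names DOC and edges are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: "id" names a node, "next" lists the ids it points to.
# The three nodes below form a cycle, which no tree-shaped nesting could show.
DOC = """<graph>
  <node id="a" next="b"/>
  <node id="b" next="c"/>
  <node id="c" next="a"/>
</graph>"""


def edges(xml_text):
    """Return the (source, target) pairs encoded by the id/next attributes."""
    root = ET.fromstring(xml_text)
    known_ids = {node.get("id") for node in root.iter("node")}
    result = []
    for node in root.iter("node"):
        # "next" may hold several whitespace-separated ids, or be absent.
        for target in node.get("next", "").split():
            if target not in known_ids:
                raise ValueError(f"dangling reference: {target}")
            result.append((node.get("id"), target))
    return result


if __name__ == "__main__":
    print(edges(DOC))  # [('a', 'b'), ('b', 'c'), ('c', 'a')]
```

The element nesting stays flat here; the cycle lives entirely in the
attribute references.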

People involved in this area may be interested in exploring the Schematron
toolkit, which is a schema language for XML based on making
assertions about the presence or absence of patterns (guarded XPaths) in
an XML document.  This allows non-regular-expression-based patterns in a
document to be reported or required.

http://www.ascc.net/xml/resource/schematron/schematron.html

It is a very simple language, and people find it pretty friendly.
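
To give a feel for the idea of asserting patterns in a document, here is a
small Python sketch. It is not Schematron and does not use Schematron
syntax; it only imitates the "context path plus assertion" style with the
standard library. The <text>, <s>, <np> and <vp> elements, the "num"
attribute, and the names RULES and validate are all assumptions made up
for the example.

```python
import xml.etree.ElementTree as ET


def has_np_and_vp(s):
    """Assertion: the sentence contains both an np and a vp child."""
    return s.find("np") is not None and s.find("vp") is not None


def agrees(s):
    """Assertion: np and vp carry the same "num" value.

    Only checked when both parts exist; missing parts are reported
    by the other rule.
    """
    np, vp = s.find("np"), s.find("vp")
    return np is None or vp is None or np.get("num") == vp.get("num")


# Hypothetical rules: (context path, assertion, message if it fails).
RULES = [
    (".//s", has_np_and_vp, "a sentence needs both an np and a vp"),
    (".//s", agrees, "np and vp must agree in number"),
]

SAMPLE = """<text>
  <s><np num="sg">The dog</np><vp num="sg">barks</vp></s>
  <s><np num="pl">The dogs</np><vp num="sg">barks</vp></s>
  <s><np num="sg">The cat</np></s>
</text>"""


def validate(xml_text):
    """Return the message of every assertion that fails, in rule order."""
    root = ET.fromstring(xml_text)
    failures = []
    for context, assertion, message in RULES:
        for node in root.findall(context):
            if not assertion(node):
                failures.append(message)
    return failures


if __name__ == "__main__":
    for message in validate(SAMPLE):
        print(message)
```

The agreement rule is the kind of co-occurrence constraint that a DTD
content model cannot state, since it compares attribute values on two
sibling elements.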

Cheers
Rick Jelliffe
Received on Wednesday, 8 November 2000 05:22:19 GMT
