Re: DTD interrogations

Dubuc Jean-Pierre wrote:
> 
> I'm learning how to read and understand DTD and came up with those 
> interrogations.
> 
> ref.: http://www.w3.org/TR/REC-html40/intro/sgmltut.html#h-3.3

This is a tutorial, and isn't normative.  What actually defines how to 
interpret HTML DTDs is the SGML specification (which isn't a free 
download).  However, I would suggest that, for anyone with any 
familiarity with regular expressions, the meanings of most things 
should be clear.

The XML specification is normative for XML DTDs, but few new 
developments use XML DTDs.

> 
>    
> 
> According to this statement :
> 
> A | B   Either A or B must occur, but not both.
> 
>  
> 
> This element declaration :
> 
> <!ELEMENT DL    - - (DT|DD)+>
> 
>  
> 
> Should read in my perception :
> 
> The DL element must contain one or more of either DT or DD elements in 
> any order, but not both.

Exactly one of DT or DD must occur on each pass through the 
expression, but the contents of the parentheses are re-interpreted on 
each pass, so successive passes may choose differently.  Your 
interpretation would be written as (DT+)|(DD+), although I'm not sure 
if the parentheses are needed.  How would you represent (DT|DD)+ with 
your interpretation of the notation?
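
For anyone who prefers to see the difference run, here is a rough sketch 
using Python regular expressions as a stand-in for the two content 
models.  The single-letter encoding (T for DT, D for DD) and the names 
ANY_MIX, SINGLE_KIND and accepts are my own assumptions for 
illustration; this is not how an SGML parser works.

```python
import re

# Assumed encoding for this sketch: "T" is a DT child, "D" is a DD child.
ANY_MIX = re.compile(r"(?:T|D)+")           # analogue of (DT|DD)+
SINGLE_KIND = re.compile(r"(?:T+)|(?:D+)")  # analogue of (DT+)|(DD+)


def accepts(pattern, children):
    """Return True if the whole child sequence matches the pattern."""
    return pattern.fullmatch(children) is not None


if __name__ == "__main__":
    # Print each child sequence with the verdict of both patterns.
    for children in ("TD", "TDTDD", "TTT", "DD"):
        print(children, accepts(ANY_MIX, children), accepts(SINGLE_KIND, children))
```

A mixed sequence such as TDTDD is accepted only by the first pattern, 
while a run of one kind, such as TTT, is accepted by both.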


> 
> Also, I'd like to know according to this statement :
> 
> +(A)    A may occur.

In this position + is not monadic.  (B)+(A) means any number of As may 
occur anywhere within the structure described by B.  A* means there can 
be a run of As, but they must be at the current expression level and 
consecutive.  At most one would be A?.

Note that dyadic + and - are not regular expression operators.
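
As a rough illustration of how an inclusion differs from A* and A?, 
here is a sketch in Python.  It assumes single-letter element names and 
a hypothetical helper, accepts_with_inclusion, that simply ignores the 
included elements before matching.  It models one level of children 
only; real SGML inclusions also apply inside descendant elements, which 
this does not attempt.

```python
import re


def accepts_with_inclusion(model, children, included):
    """Match children against model, ignoring any included element names."""
    remaining = "".join(c for c in children if c not in included)
    return re.fullmatch(model, remaining) is not None


if __name__ == "__main__":
    # Analogue of (X,Y) +(A): As may be scattered anywhere among the children.
    print(accepts_with_inclusion(r"XY", "AXAAYA", {"A"}))
    # Analogue of (A*,X,Y): As must form one consecutive run at that position.
    print(re.fullmatch(r"A*XY", "AXAY") is not None)
    # Analogue of (A?,X,Y): at most one A, again only at that position.
    print(re.fullmatch(r"A?XY", "AAXY") is not None)
```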

-- 
David Woolley
Emails are not formal business letters, whatever businesses may want.
RFC1855 says there should be an address here, but, in a world of spam,
that is no longer good advice, as archive address hiding may not work.

Received on Thursday, 25 September 2008 21:42:25 UTC