
Re: DogFood (and inline/block constraints)

From: T.V Raman <raman@google.com>
Date: Wed, 12 Dec 2007 08:16:06 -0800
Message-ID: <18272.2374.624116.565727@retriever.corp.google.com>
To: davidc@nag.co.uk
Cc: public-html@w3.org


As someone who came to HTML markup from the world of LaTeX, I
strongly agree with David.

We authored the XForms spec in xmlspec, and not having the
bizarre HTML restriction of li elements not containing paragraphs
made life a lot easier.

David Carlisle writes:
 > 
 > 
 > Ian
 > 
 > > However, this is far from a resolved issue. What would be especially 
 > > useful is a study of use cases -- occurrences where people mix inlines and 
 > > blocks, and why unmixed alternatives don't really address the needs of the 
 > > author. (Your blog would be a great start to look for such cases.)
 > 
 > HTML has always stood out amongst marked up document formats in having a
 > very restricted content model for paragraphs that doesn't allow block
 > level markup. I always viewed div as "p with a fixed content model"
 > (which isn't really the intention of div, but a very plausible way of
 > using it.)
 > 
 > DocBook, TEI, and the W3C's xmlspec markup all allow block level markup in
 > paragraphs, as does (La)TeX.
 > 
 > Consider the following two paragraphs:
 > 
 > The subject of this paragraph is the equation
 >   E=mc^2
 > where c denotes the speed of light.
 > 
 > 
 > I have a list of three things
 > 1, the first thing,
 > 2, the second thing and
 > 3, the third thing.
 > This list is not very interesting.
 > 
 > 
 > In the first case the paragraph consists of a single sentence: the
 > "where..." is not a new paragraph, and it should not be marked up as 
 > <p class="no-indentation">.. 
 > It is just the end of the sentence and the end of the paragraph, so it
 > should be in the same block as the start of the sentence:
 > <p>The subject...
 > 
 > 
 > 
 > The second case with a list is similar, although there at least the text
 > following the list is a different sentence.
 > 
 > 
 > Apart from forcing users to mark up the text in a way that is at
 > variance with the intended meaning (you can't have a sentence that spans
 > two paragraphs, even if that sentence contains a quotation that itself
 > has block structure), this restricted content model causes many problems
 > when mapping from other markup languages to HTML (if HTML p is used to
 > model paragraphs).
 > 
 > 
 > See for example a recent comment about coming from the W3C's xmlspec markup:
 > 
 > http://lists.w3.org/Archives/Public/spec-prod/2007OctDec/0002.html
 > 
 >   I made a couple of other fixes to the standard XSLT but the one that
 >   probably needs doing the most that I haven't done is that a list inside
 >   a paragraph in xmlspec generates invalid XHTML. 
 > 
 > 
 > A lightning survey of other document types:
 > 
 > DocBook paragraphs
 > 
 > http://www.docbook.org/tdg/en/html/para.html
 > 
 >    A Para is a paragraph. Paragraphs in DocBook may contain almost all
 >   inlines and most block elements. 
 > 
 > 
 > TEI paragraphs
 > 
 > http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-p.html
 > 
 >   [ a bit inscrutable but note that the content model contains (at least)
 >   lists as well as inline text]
 > 
 > 
 > XHTML2 paragraphs
 > 
 > http://www.w3.org/TR/xhtml2/mod-structural.html#sec_8.6.
 > 
 >    In comparison with earlier versions of HTML, where a paragraph could
 >    only contain inline text, XHTML2's paragraphs represent the
 >    conceptual idea of a paragraph, and so may contain lists,
 >    blockquotes, pre's and tables as well as inline text. 
 > 
 > 
 > LaTeX paragraphs
 >     [couldn't find a good URI to cite, but trust me the LaTeX system
 >     goes to some lengths to support nested block structures such as
 >     displayed mathematics and lists within a paragraph]
 > 
 > 
 > All of the above document types are commonly used for authoring and
 > converted to HTML for display. The proposal to restrict the content
 > model for div but not follow XHTML2 in opening up the content model for
 > p makes that conversion significantly harder, and makes the resulting
 > HTML significantly less structurally useful, as you need to introduce
 > spurious paragraphs together with extra CSS styling to suppress any
 > typographic display that would normally be associated with a paragraph.
 > 
 > 
 > 
 > David
 > [Ian, apologies, the initial mail to you that cc'ed the list had a mistyped address]
 > 

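To make David's list example concrete, here is a rough sketch of what
happens to a list written inside a p when the paragraph's end tag is
implied at the start of a block-level element. It is only an
illustration, not a conforming HTML parser: the names BLOCK_TAGS,
ParagraphSplitter and split_paragraph are made up for this note, the
set of block-level tags is a deliberately small subset, and the only
library used is Python's standard html.parser, which by itself just
reports tags and text.

```python
from html.parser import HTMLParser

# Assumption: a small, illustrative subset of block-level elements.
BLOCK_TAGS = {"ul", "ol", "table", "pre", "blockquote", "div", "p"}


class ParagraphSplitter(HTMLParser):
    """Record which text stays inside a <p> once its end tag is implied."""

    def __init__(self):
        super().__init__()
        self.in_p = False
        self.inside_p = []   # text that remains inside a paragraph
        self.outside_p = []  # text that falls outside any paragraph

    def handle_starttag(self, tag, attrs):
        # A block-level start tag implicitly closes an open paragraph.
        if tag in BLOCK_TAGS and self.in_p:
            self.in_p = False
        if tag == "p":
            self.in_p = True

    def handle_endtag(self, tag):
        if tag == "p":
            self.in_p = False

    def handle_data(self, data):
        text = data.strip()
        if text:
            target = self.inside_p if self.in_p else self.outside_p
            target.append(text)


def split_paragraph(markup):
    """Return (text inside p, text outside p) for the given markup."""
    parser = ParagraphSplitter()
    parser.feed(markup)
    parser.close()
    return parser.inside_p, parser.outside_p


inside, outside = split_paragraph(
    "<p>I have a list of three things"
    "<ol><li>the first thing,</li><li>the second thing and</li>"
    "<li>the third thing.</li></ol>"
    "This list is not very interesting.</p>"
)
print(inside)
print(outside)
```

Under these assumptions only the opening words remain in the
paragraph. The list items and the closing sentence end up outside it,
which is why a converter has to emit a second, spurious p for the
trailing text.
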
-- 
Best Regards,
--raman

Title:  Research Scientist      
Email:  raman@google.com
WWW:    http://emacspeak.sf.net/raman/
Google: tv+raman 
GTalk:  raman@google.com, tv.raman.tv@gmail.com
PGP:    http://emacspeak.sf.net/raman/raman-almaden.asc
Received on Wednesday, 12 December 2007 16:17:21 UTC