
Re: 'HTML 5' and some poem markup?

From: Dr. Olaf Hoffmann <Dr.O.Hoffmann@gmx.de>
Date: Sat, 6 Oct 2007 13:41:13 +0200
To: public-html@w3.org
Message-Id: <200710061341.13229.Dr.O.Hoffmann@gmx.de>

James Graham wrote:

> I don't understand how readers would benefit from a poem element. Is
> there some special UA behavior you imagine?

Authors normally have specific expectations about how a line of a
poem is presented: they do not want a line break within a line if it
can be avoided. That is, I think, why it is called a 'line' in English
and not 'one-or-more-lines'.
With limited space the line has to be broken anyway, of course,
perhaps after some padding has been removed from the complete
poem. Typically the continuation is then indented to tell the
reader about the problem. In browsers another approach would be
to add a specific symbol after the undesired line break as a
warning. Such useful behaviour is not completely trivial to style
with CSS, but if there is a line element it is easy for the browser
to detect such problems and take care of them.
I think there are more types of literature with something like
a defined rhythm that require a more careful presentation than,
for example, ordinary paragraphs. These could be gathered in such
an element; 'poem' or 'lyrics' is then just the generic element
name for such problematic content, which has this line break
problem (not to be confused with pre) and requires a list-like
structure without being just an ordered list (of course order
normally matters in poems or lyrics, so it is surely not an
unordered list).
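
As a rough sketch of the indentation approach mentioned above: a hanging
indent can be approximated with existing CSS if each line of the poem is
wrapped in its own block-level box. The class names 'stanza' and 'line'
are hypothetical, chosen only for illustration, and the sketch covers
only the indentation, not a warning symbol after the break.

```html
<style>
  /* Hypothetical class names; not part of any specification. */
  .line {
    display: block;
    padding-left: 2em;  /* leaves room for the hanging indent */
    text-indent: -2em;  /* pulls the first row back to the left edge */
  }
</style>
<div class="stanza">
  <span class="line">First line of the stanza, long enough to wrap in a narrow window</span>
  <span class="line">Second line of the stanza</span>
</div>
```

Only the wrapped continuation of a line is indented here; the author
still has to mark every line by hand, which is what a dedicated line
element would make unnecessary.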

> Unfortunately depending on the use of explicit semantics in this way
> doesn't seem to work so well in practice. In this case I would imagine
> that the biggest problem would be search engines only picking up the
> small fraction of total poetry marked as <poem>, thus making such a
> facility too unhelpful to be worth deploying, although one can imagine
> problems with e.g. spam, for example, spammers swamping the relatively
> small amount of poetry content with much more rubbish.

Sure, there will always be abuse of such possibilities, but this is also
a chance for search engines to analyse content better than other search
engines. There is a lot of nonsense in search results today; still, many
people believe in search engines, and if robots get more clever by
reading high-quality literature with sufficient markup, maybe one day
they will be more clever than spammers ;o)
The intellectual capacities of spammers are much more limited than the
hard disk space of robots, so there is still some hope.
The IQ of human spammers does not increase significantly; that
of robots/software/AI does.


Peter Krantz wrote:

> I was refering to the extension mechanism RDFa, not the default
> elements of XHTML2. Please see http://www.w3.org/TR/xhtml-rdfa-primer/
> for an introduction.

It is already hard to create a sufficient structure for a traditional
poem with the current (X)HTML. It is true that this is slightly improved
in the HTML5 and XHTML2 drafts.
But what to do with poems now, how to mark them up? This question comes
up before I add any values to a class attribute or a role attribute:
I simply have to put such attributes on suitable elements somehow.
Using a definition list for stanzas? It has the correct structure, but
to claim that the lines define the stanzas is a bit of a stretch.
Still, it is maybe the best approach I have seen yet.
Using p for a stanza? Impossible; it can currently contain only inline
content, and lines in a stanza are more like block elements.
Using pre? Not really. Sure, one can use span to mark up the lines,
but there is no really significant structure. If an author starts to
require pre for a poem, I would recommend using SVG, because
such poems are already a mixture of graphics and text, and SVG has
'text' and 'tspan' and some more. This is no less than HTML currently
offers for poems, and one can add the same values to the class
attribute in SVG as well.
Doing everything with div and class attributes? This seems to
be the state of the art with RDF too, but I can create lists, tables,
paragraphs and most other elements as well using just divs with class
attributes. This is the dead end of semantics in HTML: then we could
even write <div class="html:html"> instead of <html> ;o) It mainly
bloats the source code. And no viewer will care about
<div class="olaf:Gedicht"> from my private set of semantics for
lyrics/poems written in German. Of course I will use
'Gedicht' instead of 'poem' there, because as an arbitrary attribute
value 'Gedicht' has much more semantic meaning for a poem written in
German than just 'poem'. Of course French, Spanish, Russian
etc. authors will use different attribute values with semantic meaning,
and viewers would have to understand thousands or millions of semantic
sets just to identify three poetic core elements correctly.
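
To make two of the options above concrete, here is a minimal sketch of
the definition list variant and the div variant. The poem text is
placeholder content, the use of dt as a stanza label is an assumption,
and the class values are hypothetical private names of exactly the kind
criticised above.

```html
<!-- Variant 1: definition list; dt labels the stanza, each dd is one line -->
<dl>
  <dt>Stanza 1</dt>
  <dd>First line of the first stanza</dd>
  <dd>Second line of the first stanza</dd>
  <dt>Stanza 2</dt>
  <dd>First line of the second stanza</dd>
  <dd>Second line of the second stanza</dd>
</dl>

<!-- Variant 2: div only; the structure lives entirely in private class values -->
<div class="poem">
  <div class="stanza">
    <div class="line">First line of the first stanza</div>
    <div class="line">Second line of the first stanza</div>
  </div>
  <div class="stanza">
    <div class="line">First line of the second stanza</div>
    <div class="line">Second line of the second stanza</div>
  </div>
</div>
```

Both variants give the same nesting of poem, stanza and line, but in the
second one a viewer learns nothing from the element names themselves.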





Doug Schepers wrote:

> Hi, Dr. O-
>
> I'd like to note that in addition to poetry, the same solution could be
> applied to song lyrics, which are very widespread content on the Web.
> There are many sites devoted to nothing else, and sites like MySpace
> (and many blogs) have a lot of lyrical content.

Yes, of course; this is closely related and can be combined. Most
lyrics have a structure similar to that of classical poems.
In most cases both are a list of stanzas, where each stanza is a
list of lines. This is a specific structure of text, just as lists,
tables and paragraphs are.

>
> I personally favor the idea of loosening up the definition of <p> into
> just that of a block of text (since the idea of a paragraph is not
> universal among natural-language orthographies), and using some other
> semantic system to annotate specialities of written language (where you
> could, for example, choose between a simple poetry markup and a more
> complex one that notates free verse or  sonnets or even structural
> elements of iambic pentameter).  

Yes, if this model is extended, what you can do is put an li or a dd
element into the p to create a line.
If HTML5 specifies that a p element containing li elements is lyric or
poem content such as a stanza/strophe, this may be a minimal solution.
Together with something like <section role="poem"> this may work,
if 'poem' or 'lyrics' is specified as something meaningful in a core
RDFa recommendation, or if such a recommendation references an
existing poem/lyrics scheme containing all such nice things as
'sonnet', 'haiku', 'concretePoetry', 'visualPoetry'. These details are
really out of the scope of a general text markup language.
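
A sketch of how this minimal solution might look. It is purely
hypothetical: li inside p is not allowed by any current (X)HTML
version, the role value 'poem' is not defined anywhere yet, and the
poem text is placeholder content.

```html
<!-- Hypothetical extended content model: NOT valid in current (X)HTML. -->
<section role="poem">
  <p>
    <li>First line of the first stanza</li>
    <li>Second line of the first stanza</li>
  </p>
  <p>
    <li>First line of the second stanza</li>
    <li>Second line of the second stanza</li>
  </p>
</section>
```

Served as text/html today, a parser would typically end the p before
the first li, so this only illustrates the proposed structure and is
not something that can be deployed as it stands.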


> This might be RDFa, or spans marked 
> with microformats tags.  You'll be able to get much more precision than
> with a blunt tool like HTML.


Currently I'm not looking for precision. I'm mainly looking for any
element other than div that is suitable for containing ordinary poems
and has a somewhat related semantic meaning.
And sure, such things are useful for details, like the role attribute.
But the minimal requirement I can see is to have about three elements
defined in HTML as containers for such additional information.
Or one can use the poem element as a defined container for a more
specific poem XML or even SVG. As I found out, SVG is usable for
marking up concrete poetry; not perfect, but it works where things
start to get completely hopeless with HTML.
But having one to three core elements in HTML itself to provide some
semantic information about the content is still useful as a meaningful
starting point, rather than just saying this is an object or something
miscellaneous and meaningless.

It is an exaggeration to say that I am trying to bloat HTML with very
specific elements. I would just like to have any meaningful element to
put poem-like content in. And none of the current suggestions solves
this problem yet. Like most authors, I do not need a specific element
for 'sonnet', even though I have several of them waiting for some
useful markup.
Received on Saturday, 6 October 2007 11:46:10 UTC
