Re: Against 'start' and 'value' attributes

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello Ian, Etan, Daniel, Tantek, dear list members,


On Tuesday, 11 March 2003 23:18, Ian Hickson wrote:
> On Tue, 11 Mar 2003, Etan Wexler wrote:
> > An attribute is not content.
>
> This is fundamentally untrue.
Oh, despite your good arguments, I think the statement that an attribute is
not content is neither right nor wrong according to the XML spec.

In the grammar productions, content is used only in [39]:
[39]    element       ::=  EmptyElemTag | STag content ETag
...and [78]:
[78]    extParsedEnt  ::=  TextDecl? content
...and defined in [43]:
[43]    content       ::=  CharData? ((element | Reference | CDSect | PI
                           | Comment) CharData?)*

This is, of course, called element content, which narrows the term content.
But on the other hand, what else has content if not an element? Yes, an
attribute's value could be said to be the attribute's content. But an
attribute _itself_ is not content. Element content is what sits between the
start tag and the end tag.
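
To make the distinction concrete, here is a small sketch in Python using the
standard xml.etree.ElementTree module. The sample list markup and the helper
name describe are my own illustration and do not come from the spec. It only
shows that a typical XML API hands you the attributes and the content of an
element as two separate things.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: an ordered list that carries a start attribute.
SAMPLE = '<ol start="5"><li>five</li><li>six</li></ol>'

def describe(xml_text):
    """Return (attributes, content) of the root element, kept apart."""
    root = ET.fromstring(xml_text)
    # Attributes live inside the start tag.
    attributes = dict(root.attrib)
    # Content (production [43]) is what sits between start and end tag.
    content = [(child.tag, child.text) for child in root]
    return attributes, content

attributes, content = describe(SAMPLE)
print(attributes)
print(content)
```

The start attribute shows up only in the attribute mapping, never among the
children, which matches the reading that an attribute itself is not content.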

The XML specification is very unclear here: it never defines the term
content, it only uses it.

Some parts of the spec (those that talk about entities or the standalone
declaration) treat all information as content, while other parts say that
content is what lies between a start tag and an end tag.

I think in textual markup, and that's what HTML (or e.g. DocBook) is, the 
/main information/ should be encoded as element content, while attributes 
should be used for meta information.
Of course there are exceptions to that rule, and the distinction between
information and meta information is not always clear.

For instance, in HTML there are at least three levels of meta information.
* The structure (a level of meta information nearly always present in XML 
documents).
* Attributes, like start, cite or title.
* Meta elements, which provide meta information on the document as such.


So I agree with both of you. I agree that content is not just element
content, which is what I think Ian meant. But I also agree that, in a
stricter sense, only element content should be treated as content, which is
what I think Etan meant. The rest is meta information, at least in a markup
language for text-based information. (Other markup languages like SVG or XML
Schema need a different approach.)



> In fact, the question about whether to put content in attributes or
> elements during the development of markup languages is one of the most
> hotly debated, and, ironically, one of the least important.
I don't think so, at least as long as a DTD is required. In a DTD, an
element's content simply can't be declared as ID, IDREF, IDREFS, NMTOKEN or
NMTOKENS; only attribute values can.
This might be important for some applications.
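
As a rough sketch of what these attribute types buy you, the following Python
snippet emulates by hand the two checks a validating parser performs for ID
and IDREF attributes: IDs must be unique, and every reference must point to
an existing ID. The document, the attribute names id and ref, and the
function check_ids are all assumptions for illustration. Python's
ElementTree does not validate against a DTD, so this is not real DTD
validation.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: "id" plays the role of an ID attribute,
# "ref" the role of an IDREF attribute. Both names are assumptions.
SAMPLE = """
<doc>
  <note id="n1">First note</note>
  <note id="n2">Second note</note>
  <see ref="n2"/>
  <see ref="n9"/>
</doc>
"""

def check_ids(xml_text, id_attr="id", ref_attr="ref"):
    """Return (duplicate_ids, dangling_refs) found in the document."""
    root = ET.fromstring(xml_text)
    seen, duplicates = set(), []
    # First pass: collect IDs and note any value used more than once.
    for elem in root.iter():
        value = elem.get(id_attr)
        if value is None:
            continue
        if value in seen:
            duplicates.append(value)
        seen.add(value)
    # Second pass: find references that point to no known ID.
    dangling = [elem.get(ref_attr) for elem in root.iter()
                if elem.get(ref_attr) is not None
                and elem.get(ref_attr) not in seen]
    return duplicates, dangling

duplicates, dangling = check_ids(SAMPLE)
print(duplicates, dangling)
```

With a DTD that declares these attributes as ID and IDREF, a validating
parser would report such problems itself. For text in element content there
is no comparable declaration.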

I think a few questions help with the decision between element and
attribute:
* Might I want to nest something inside it later?
For instance, a chapter title must be an element because you might want to 
nest other markup like emphasis or code.
* Must it be unique, referable or a reference?
Then it's an attribute.
* Is it ordered or unordered? (ordered -> element, unordered -> attribute)
* Is it the information itself or is it meta information? (info -> element, 
meta -> attribute)
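
The first question can be illustrated with a short Python sketch. The chapter
markup below is made up for this example. It shows that a title stored as an
element can carry nested markup, while a title stored as an attribute is
always a flat string.

```python
import xml.etree.ElementTree as ET

# Hypothetical chapter markup; element and attribute names are made up.
AS_ELEMENT = '<chapter><title>Using <code>start</code> wisely</title></chapter>'
AS_ATTRIBUTE = '<chapter title="Using start wisely"/>'

title = ET.fromstring(AS_ELEMENT).find("title")
# As an element, the title keeps its nested markup...
nested_tags = [child.tag for child in title]
# ...and the plain text can still be recovered when needed.
plain_text = "".join(title.itertext())

# As an attribute, the title can only ever be a flat string.
attr_value = ET.fromstring(AS_ATTRIBUTE).get("title")

print(nested_tags, plain_text, attr_value)
```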

A good approach is to look at markup in use, such as XHTML, SVG, Ant's build
files or XML Schema, to find out what to use when.

But on the other hand, imagine you have a DTD and see that it can be
improved, but doing so would require changes to all existing documents. If
there are only a few and you use a good editor like vim or emacs, making the
changes by hand is no problem. If there are many, hundreds or thousands or
more, just write a transformation.

So really, people are often too cautious when defining a new DTD. Just go
ahead, I tell my students: if you make mistakes and detect them late, XSLT
will help you correct them.
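
To give an idea of how small such a migration can be, here is a sketch of one.
Note that it is written in Python with xml.etree.ElementTree as a stand-in,
not in XSLT, and that the scenario is invented: a hypothetical DTD change
that turns a title attribute on chapter into a title child element. The
function name title_attr_to_element is likewise an assumption.

```python
import xml.etree.ElementTree as ET

def title_attr_to_element(root):
    """Move each chapter's title attribute into a leading <title> child."""
    # Copy the matches into a list first, since the tree is modified below.
    for chapter in list(root.iter("chapter")):
        value = chapter.attrib.pop("title", None)
        if value is None:
            continue  # already migrated, or never had a title
        title = ET.Element("title")
        title.text = value
        # Text that used to open the chapter now follows the new element.
        title.tail = chapter.text
        chapter.text = None
        chapter.insert(0, title)
    return root

old = ET.fromstring('<book><chapter title="Lists"><p>Text</p></chapter></book>')
new = ET.tostring(title_attr_to_element(old), encoding="unicode")
print(new)
```

Run over a directory of documents, a script of this kind does the same job as
the transformation mentioned above. An XSLT stylesheet would express it as an
identity transform plus one template for chapter.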


Back to the original topic, whether to have start and value attributes on the
list or a continue reference.
I completely agree with Etan's argument that a more logical approach is
needed.
I also completely agree with Daniel's argument that the physical approach
must not be eliminated in favour of the logical approach, because there are
situations where the logical approach simply can't replace the physical
approach.
I support both.

So I really like Tantek's statement:
> I see no problem with having both solutions coexist.
And I am looking forward to seeing both solutions in the next drafts.


Bye
- --
ITCQIS GmbH
Christian Wolfgang Hujer
Managing Partner
Phone: +49  (0)89  27 37 04 37
Fax: +49  (0)89  27 37 04 39
E-Mail: Christian.Hujer@itcqis.com
WWW: http://www.itcqis.com/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.7 (GNU/Linux)

iD8DBQE+bnXszu6h7O/MKZkRAhcuAJ9TeSZVTQeW0k3UJtsFTOVJOs169ACfSg83
3lKer3UvCpiD9DOIsnNGBCY=
=eAtn
-----END PGP SIGNATURE-----

Received on Tuesday, 11 March 2003 18:49:10 UTC