- From: Herr Christian Wolfgang Hujer <Christian.Hujer@itcqis.com>
- Date: Wed, 12 Mar 2003 00:49:00 +0100
- To: Ian Hickson <ian@hixie.ch>, Etan Wexler <ewexler@stickdog.com>, glazman@netscape.com (Daniel Glazman), Tantek Çelik <tantek@cs.stanford.edu>
- Cc: "www-html@w3.org" <www-html@w3.org>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello Ian, Etan, Daniel, Tantek, dear list members,

On Tuesday, 11 March 2003 23:18, Ian Hickson wrote:
> On Tue, 11 Mar 2003, Etan Wexler wrote:
> > An attribute is not content.
>
> This is fundamentally untrue.

Despite your good arguments, I think the statement that an attribute is not content is neither right nor wrong according to the XML specs.

In the grammar rules, content is used only in [39]:

  [39] element ::= EmptyElemTag | STag content ETag

...and [78]:

  [78] extParsedEnt ::= TextDecl? content

...and it is defined in [43]:

  [43] content ::= CharData? ((element | Reference | CDSect | PI | Comment) CharData?)*

This of course is called element content, which is a restriction on the term content, but on the other hand, what else has content, if not an element?

Yes, an attribute's value could be said to be the attribute's content. But an attribute _itself_ is not content. The element content is what lies between the start tag and the end tag.

The XML specification is very unclear here: it does not define the term content, it only uses it. In some parts (when discussing entities or the standalone declaration) the spec treats all information as content; other parts say content is what lies between a start tag and an end tag.

I think that in textual markup, and that's what HTML (or e.g. DocBook) is, the /main information/ should be encoded as element content, while attributes should be used for meta information. Of course there are exceptions to that rule, and the distinction between information and meta information is not always clear.

For instance, in HTML there are at least three levels of meta information.
* The structure (a level of meta information nearly always present in XML documents).
* Attributes, like start or cite or title
* Meta elements, which provide meta information on the document as such

So I agree with both of you. I agree that content is not just element content, which is what I think Ian meant, but also that, in a narrower sense, only element content should be treated as content, which is what I think Etan meant. The rest is meta information, at least in a markup language for text-based information. (Other markup languages like SVG or XML Schema need a different approach.)

> In fact, the question about whether to put content in attributes or
> elements during the development of markup languages is one of the most
> hotly debated, and, ironically, one of the least important.

I don't think so, at least as long as a DTD is required. An element's content simply can't be of type ID, IDREF, IDREFS, NMTOKEN or NMTOKENS. This might be important for some applications.

I think a few main questions help with the decision between element and attribute:
* Might I want to nest something inside it later? For instance, a chapter title must be an element because you might want to nest other markup like emphasis or code.
* Must it be unique, referable or a reference? Then it's an attribute.
* Is it ordered or unordered? (ordered -> element, unordered -> attribute)
* Is it the information itself or is it meta information? (info -> element, meta -> attribute)

A good approach is to look at markup in use, like XHTML, SVG, Ant's build files or XML Schema, to find out what to use when.

But on the other hand, imagine you have a DTD and see that it can be improved, but if you do so, all documents require changes. If it's a small number and you use a good editor like vim or emacs, the change is no problem for the editor. If it's a huge number, hundreds or thousands or more, just write a transformation.

So really, people are often too cautious when defining a new DTD. Just go ahead, I tell my students: if you make mistakes and detect them late, XSLT will help you correct them.
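To illustrate the kind of transformation I mean, here is a minimal sketch. It is written in Python with the standard xml.etree.ElementTree module rather than in XSLT, and the element names chapter, title and para, as well as the function name promote_title, are made up for this example. It assumes a DTD revision that turns a chapter's title attribute into a title child element, so that markup can be nested inside it later.

```python
import xml.etree.ElementTree as ET


def promote_title(root):
    """Turn each chapter's title attribute into a leading <title> child."""
    # Collect the chapters first so the tree is not modified while iterating.
    for chapter in list(root.iter("chapter")):
        text = chapter.attrib.pop("title", None)
        if text is None:
            continue  # this chapter has no title attribute to migrate
        title = ET.Element("title")
        title.text = text
        # Keep any leading character data of the chapter after the new element.
        title.tail = chapter.text
        chapter.text = None
        chapter.insert(0, title)
    return root


# Hypothetical document written against the old DTD.
old = '<book><chapter title="Attributes"><para>Text</para></chapter></book>'
new = ET.tostring(promote_title(ET.fromstring(old)), encoding="unicode")
print(new)
```

Run over all documents in a directory, something like this would take care of the huge-number case; an XSLT stylesheet with an identity template plus one template for chapter would do the same job.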
Back to the original topic: whether to have start and value attributes on the list or a continue reference.

I completely agree with Etan's argument that a more logical approach is needed.
I also completely agree with Daniel's argument that the physical approach may not be eliminated in favour of the logical approach, because there are situations where the logical approach simply can't replace the physical approach.

I support both. So I really like Tantek's statement:
> I see no problem with having both solutions coexist.

And I am looking forward to seeing both solutions in the next drafts.

Bye
- -- 
ITCQIS GmbH
Christian Wolfgang Hujer
Managing Partner
Phone: +49 (0)89 27 37 04 37
Fax: +49 (0)89 27 37 04 39
E-mail: Christian.Hujer@itcqis.com
WWW: http://www.itcqis.com/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.7 (GNU/Linux)

iD8DBQE+bnXszu6h7O/MKZkRAhcuAJ9TeSZVTQeW0k3UJtsFTOVJOs169ACfSg83
3lKer3UvCpiD9DOIsnNGBCY=
=eAtn
-----END PGP SIGNATURE-----
Received on Tuesday, 11 March 2003 18:49:10 UTC