The many roles of HTML.

This is something I thought about some time ago, but I decided to
post my thoughts here after reading the heading (<h>/<hx>) discussion.

HTML does several things that it perhaps shouldn't do at all, or that
should be modified further.

1) It attaches metadata to the document. Personally I'm of the mind
that HTML shouldn't do this, since it should be the responsibility of
the OS to store this data in a general sense as part of the file
descriptor. Each document format having its own metadata structure
just makes it that much harder for the OS to work like that. However,
a generic container for metadata that serves only as a transfer
protocol could be useful.

2) It organizes documents into a container. Each top-level section in
a document is a sub-document. Each has its own title, and while the
sub-documents are linked, the contents of each are not continuations
of each other. Each sub-document could live on its own, though in
this case splitting them up would probably detract from our usage of
the documents. With this in mind I would deprecate <body> in favor of
multiple sections. More on this below.

3) It marks up outline structure. This is something it actually does
very well, as long as <hn> goes away. We should also get rid of
<div>, but that's another problem entirely.

4) It marks up content for semantic classification. This it doesn't do
so well. It does an adequate job, but seeks to apply tree structure to
data that doesn't have a tree structure. It rails against empty,
replaced, semantic elements, which are a perfectly valid case. While
entities could be used instead, they aren't necessarily beneficial.
Also, the semantic classes in HTML (e.g. <code>, <blockcode>) seem
more programmer-oriented than suited to other, more traditional bodies
of work. I'm guessing that if you could find someone in the literary
world, they would ask for a few more, especially in the references
section. Maybe subtitle and the like. Content classification, I
believe, should be more plug-and-play. This should be sub-classed out.

To make some of these changes, I would do the following.

1) Take out linking structures and turn them into their own language,
but simplify them (XLink goes overboard). A simple @href attribute
would suffice.

2) The metadata part of a document should be its own specification,
usable by anyone, and should be limited to a transmission protocol so
that OSes can manage their own metadata.

3) Deprecate <body>, replacing it with <section>, <table>, <object> or
<list> (or other top-level elements from other specs: svg,
spreadsheet, slides, etc.).

4) Create an element called <package> used to place multiple top-level
elements into it.

5) Fix namespaces (just thought I'd throw this in everywhere I get the chance).
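
To make points 1, 3 and 4 a bit more concrete, here is a rough sketch
of what such a document might look like. None of this is real markup
from any spec: <package> is only the element proposed above, the bare
href attribute on an arbitrary element is my assumption about how the
simplified linking would look, and the content is placeholder text.

```xml
<!-- Hypothetical sketch: <package> and a free-standing href attribute
     are proposals from this post, not part of any HTML/XHTML spec. -->
<package>
  <!-- Each child is a top-level element that could stand alone;
       there is no <body> wrapper. -->
  <section id="intro">
    <h>First sub-document</h>
    <p>This section has its own title and links to the
       <span href="#figures">table</span> packaged next to it.</p>
  </section>
  <table id="figures">
    <tr><td>placeholder cell</td><td>placeholder cell</td></tr>
  </table>
</package>
```

A file with only one top-level element would simply drop the <package>
wrapper and start at <section> or <table> directly.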

Orion Adrian

Received on Monday, 13 June 2005 12:59:33 UTC