- From: Jonny Axelsson <jonny@metastasis.net>
- Date: Wed, 09 Feb 2000 11:25:37 +0100
- To: www-html@w3.org
____________________________________________
1. FORM CONTROLS, NAME: SCOPE AND UNIQUENESS

I am a little confused about the uniqueness of the name attribute in
different forms on the same HTML page.

NAME for form controls has a grouping function, a bit of ID and a bit
of CLASS. Any control inside the same FORM with the same NAME belongs
to the same group. Two controls with the same NAME inside two
different FORMs do not:

  HTML 4.0 standard section 17.2, Controls
    A control's "control name" is given by its name attribute. The
    scope of the name attribute for a control within a FORM element
    is the FORM element.

I presume from this, and from the lack of contrary evidence, that
having the same NAME in two different FORMs is not only legal, but
that NAME essentially has a separate namespace for each FORM. This is
unlike every other namespace, which is unique for the entire document
(URI). It is also problematic, since references like [HTML401, sect
12.2.3] imply an equality between the ID namespace, the A NAME
namespace and the <formcontrol> NAME space.

(cite: originally aired:
<http://buzz.builder.com/cgi-bin/WebX?14@179.wMviaFO9iQX^27@.ee7e11d/24>)

________________________________
2. NAMESPACE INSIDE A SINGLE URI

Speaking of namespaces and scope, the ID namespace inside a single URI
is flat: no two elements anywhere may have the same ID [HTML401, sect
7.5.2]. In XML, unique ID is a validity constraint [XML10, sect
3.3.1]. I am not proposing a change to this, but what would happen if
the mapping URI# - ID were indeed hierarchical?
For instance like this:

  <body>
    <div id="intro">
      <h2 id="first">Intro.first</h2>
      <a id="pointer1" href="#first">Go first</a>
      <a id="pointer2" href="#second">Go second</a>
    </div>
    <div id="partI">
      <h2 id="first">partI.first</h2>
      <h2 id="second">partI.second</h2>
      <a id="pointer3" href="#first">Go first</a>
      <a id="pointer4" href="#second">Go second</a>
    </div>
    <a id="pointer5" href="#first">Go first</a>
  </body>

pointer1 would point to Intro.first, pointer3 to partI.first, pointer2
and pointer4 to partI.second, and pointer5 to either:

  A: NONE (there is no #first in scope)
  B: UNDEFINED (ambiguous, two equal candidates for #first)
  C: Intro.first (the first #first)
  D: partI.first (the last #first)

Alternative A would be "OOHTML", but would break all current HTML
pages, and generally be a pain for everyone involved. Alternative B
would be like the current namespace, except that pointer1 and pointer3
would be defined (pointer2 and pointer4 would always be defined).
Alternative C or D would mean that every href would be defined if the
corresponding ID exists at all in the URI.

Alternatives B-D would give an "ID event hierarchy":

  1. Is the ID inside my content?
  2. Is the ID inside my containing element?
  ...
  N. Is the ID inside the root element (HTML)?

The difference between alternatives B, C and D is what "the ID" is.
Given alternative C, for instance, the "HTML document" and "Alt. C"
bodies below are equivalent; the other columns show the equivalent
body under alternatives B and D:

  HTML document   Alt. B     Alt. C          Alt. D
  <body>          <body>     <body>          <body>
  <p id="id1">    <p>        <p id="id1">    <p>
  <p id="id1">    <p>        <p>             <p id="id1">
  </body>         </body>    </body>         </body>

This will have a cost at href resolution: every containing element has
its own namespace, which increases the maximum number of lookups from
one to the maximum number of containing elements (rarely more than
five). There is no way to dynamically change an ID (is there??); an ID
of "here" will always remain id="here", so once the lookup tables are
made, they will never change.
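The "ID event hierarchy" above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions, not a
proposal for how a browser would do it: the Node class, walk(),
resolve() and build_example() are hypothetical names, the tree is
built by hand instead of being parsed from HTML, and a scope is taken
to include the element itself as well as its descendants. Alternative
A is left out because it depends on which elements open a scope, which
is not pinned down above.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional


@dataclass(eq=False)
class Node:
    """A bare-bones element: a label, an optional id, and child elements."""
    label: str
    id: Optional[str] = None
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def add(self, *kids: "Node") -> "Node":
        for kid in kids:
            kid.parent = self
            self.children.append(kid)
        return self


def walk(scope: Node) -> Iterator[Node]:
    """Yield scope and all of its descendants in document order."""
    yield scope
    for child in scope.children:
        yield from walk(child)


def resolve(source: Node, target_id: str, policy: str) -> Optional[Node]:
    """Resolve target_id as seen from source under alternative B, C or D."""
    if policy not in ("B", "C", "D"):
        raise ValueError("policy must be 'B', 'C' or 'D'")
    scope: Optional[Node] = source
    while scope is not None:
        # Steps 1..N: look in my content, then in each containing element.
        hits = [n for n in walk(scope) if n.id == target_id]
        if len(hits) == 1:
            return hits[0]
        if hits:  # several candidates at the same level
            if policy == "B":
                return None  # ambiguous, hence undefined
            return hits[0] if policy == "C" else hits[-1]
        scope = scope.parent
    return None  # the ID does not exist anywhere in the document


def build_example() -> Dict[str, Node]:
    """Rebuild the <body> example from the text, keyed by label."""
    names = ["Intro.first", "partI.first", "partI.second"]
    ids = ["first", "first", "second"]
    nodes = {name: Node(name, id_) for name, id_ in zip(names, ids)}
    for i in range(1, 6):
        nodes[f"pointer{i}"] = Node(f"pointer{i}", f"pointer{i}")
    intro = Node("intro", "intro").add(
        nodes["Intro.first"], nodes["pointer1"], nodes["pointer2"])
    part = Node("partI", "partI").add(
        nodes["partI.first"], nodes["partI.second"],
        nodes["pointer3"], nodes["pointer4"])
    nodes["body"] = Node("body").add(intro, part, nodes["pointer5"])
    return nodes
```

Calling resolve(nodes["pointer5"], "first", policy) with each of "B",
"C" and "D" in turn shows where the three alternatives differ; for
pointer1 to pointer4 they all agree. The sketch rescans the subtree at
every level for clarity; the per-element lookup tables mentioned above
would replace that scan.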
Neither do I think it should have an adverse effect on DOM (but the
element reached with any id handle could be different from what it
would have been using a flat namespace).

There are some benefits with a hierarchical namespace scenario. Here
are two scenarios:

Case 1: "Semantic" IDs (With a view to a database)

Often the source of an HTML document would be structured tables, like
from a relational database. These tables would usually have their own
keys, unique IDs that are later easy to hook up to for other systems
and often ideal IDs. It is not the ultimate infosystem with RDF and
all that, and you might not always want to expose your internal db
keys to the world, but in general a simple mapping like this gets the
job done, easily:

  COL = dbFIELD,  COL.CLASS = dbFIELDNAME
  TR  = dbRECORD, TR.ID     = dbRECORDID

The equivalent in XML would be

  <person id="Employee-ID">
    <field1>Content</field1>
    ...
    <fieldN>Content</fieldN>
  </person>

The problem arises when two or more db tables (or "XML records") are
on the same page. It is possible that two TRs (or <person>s) would
have the same ID. Indeed, given the nature of RDBMS, it is highly
likely.

Case 2: Generated pages on demand

The web designer often has no direct control over the IDs used on a
single page, as the HTML can come from several sources. There are two
cases in particular:

  a) Server side includes (SSI)
  b) "merge" pages for print version

Often a page consists of several HTML parts merged with SSI (it could
be header, footer, navigation bar...). While each part can have a
clean namespace, it is hard to guarantee that there will not be a
namespace collision when these namespaces merge.

It is often relatively easy to have several HTML versions of the same
document. It can be a report split up in sections for convenient
on-screen reading and in a single document for printout. Again, when
the HTML is merged, namespaces may collide.

__________________________________
3.
BAD ID: THE NAME GENERATION GAP

There is a highly restrictive subset of characters allowed in an ID:

  "ID and NAME tokens must begin with a letter ([A-Za-z]) and may be
  followed by any number of letters, digits ([0-9]), hyphens ("-"),
  underscores ("_"), colons (":"), and periods (".")."
  [HTML401, sect 6.2]

The corresponding XML rule is

  "A Name is a token beginning with a letter or one of a few
  punctuation characters, and continuing with letters, digits,
  hyphens, underscores, colons, or full stops, together known as name
  characters." [XML10, sect 2.3]

That is, XML allows any letter, not just A-Z; otherwise the definition
is identical. A NAME by comparison is "cdata".

You cannot have an ID with the value "Here I am" (spaces), or for that
matter "Here%20I%20am" (% is not alphanumeric) nor "HereIam" ("&#;",
and the ID value isn't parsed anyway). Nor can you have an ID
beginning with a digit, so <tag id="1"> is not valid.

Is this really desired behaviour? What advantages are there to this?

Jonny Axelsson,
Net asset, Metastasis design
Received on Wednesday, 9 February 2000 05:26:52 UTC