Re: Name namespace, namespaces and names in general

On Sun, 13 Feb 2000, Jonny Axelsson wrote:

> Yes, I consciously "overloaded" ID by giving the ID a meaning
> (reference to some external scheme), but I did it for good
> practical reasons. 

The trouble is that ID already has a definite meaning.  (Netploder not
having much of a clue - if any at all - is not proof that ID is open
to creative interpretation.)  The purpose of the ID declared value is
to establish a *dedicated* namespace for cross-referencing within a
document - the arcs that make and support a directed graph out of a
strictly hierarchical tree.

So, when you wrote:

> As for the "Big Question" in my original message, now rephrased
> "Would it be (dis)advantageous to have non-unique IDs in the same
> HTML/XML document (URI)?", I am still curious to the answer.

The answer just about writes itself: isn't there a generic need for
cross-referencing so that one can have a directed graph rather than
just a tree?  Wouldn't that mean unique *targets*?  So, even if ID
were repurposed, the general case would force an implementation of the
underlying concept!  NIH notwithstanding, why reinvent the wheel?
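
To make the point about unique targets concrete, here is a minimal
sketch in Python.  It is only an illustration: without a DTD the
parser cannot know which attributes have the ID or IDREF declared
value, so the attribute names 'id' and 'ref', the sample document, and
the function names are all assumptions of the sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: 'id' plays the role of an ID attribute and
# 'ref' the role of an IDREF.  Both names are assumptions.
DOC = """
<doc>
  <sec id="intro">Introduction</sec>
  <sec id="detail">Details, see <xref ref="intro"/></sec>
</doc>
"""

def index_ids(root):
    """Map each id value to its element, rejecting duplicates."""
    targets = {}
    for elem in root.iter():
        key = elem.get("id")
        if key is None:
            continue
        if key in targets:
            # A second element with the same id makes the target ambiguous.
            raise ValueError("duplicate ID: " + key)
        targets[key] = elem
    return targets

def resolve(root):
    """Return (referring element, target element) pairs, i.e. the arcs."""
    targets = index_ids(root)
    return [(e, targets[e.get("ref")]) for e in root.iter() if e.get("ref")]

root = ET.fromstring(DOC)
arcs = resolve(root)
```

Each arc is only resolvable because its target is unique; with
duplicate ids, index_ids() has no principled way to pick one element
and so refuses the document.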

If you say that ID as given is useless to you - you can't imagine ever
wanting, much less needing, intradocument cross-references - then the
answer is simply to eschew them:)

> Not doing so would add a "conversion layer" to (X)HTML if some
> other process or person want to access.  It doesn't have to be
> much, a two column table where one column is the (X)HTML ID, the
> other the "real" reference.  Though if the process adding IDs is
> unconnected to the one maintaining the external data store, that
> table is not so simple to make.  Not *having to* use such a table
> can easily make (X)HTML "live" today.

The table is unnecessary.  The external reference (i.e. *from* the
document *to* the database source) should simply be in the instance
data, either as the content of an element or as the value of an
attribute.  This need not have anything to do with some element in the
document having an ID, only that such content be recognized as an
external reference.  (This is actually two requirements - recognizing
referential intent and identifying the external context in which the
target is meaningful.)

The fact that HTML doesn't have a "slot" for your specific need is
simply a reflection of the finitude of *any* document type, and thus
the basic need for extensibility/embeddability of document types, in
an open environment like the Web.  

More generally, the extension/embedding distinction is largely
artificial.  Operationally, they amount to the same thing: pruning a
surrounding context to isolate the "base" (that which has been
extended or embedded, depending on one's point of view).  As such, if
you feel constrained to take the extension view, just add the
attributes you need:

   <table db="dbView" dbViewName="myfoos" dbQuery="select * from foo">
     <tr db="dbRecord" dbRecName="foo" dbKey="wefe142343">
       <td db="dbField" dbCol="bar">bar-blah</td>
       <td db="dbField" dbCol="baz">baz-blah</td>
       <td db="dbField" dbCol="blort">blort-blah</td>
     </tr>
     <!-- more <tr>s here -->
   </table>

This would be a straightforward application of architectural forms to
map the tactical <table> markup via a control attribute ('db') to a
generic database-centric schema:

 <!NOTATION sql      SYSTEM
            >
 <!NOTATION dbID     SYSTEM
            >
 <!ELEMENT  dbView   (dbRecord+)
            >
 <!-- the DATA declared value type is new from the WebSGML TC -->
 <!ATTLIST  dbView
            dbViewName  NAME        #IMPLIED
            dbQuery     DATA  sql   #IMPLIED
            >
 <!ELEMENT  dbRecord (dbField+)
            >
 <!ATTLIST  dbRecord
            dbRecName NAME       #IMPLIED
            dbKey    DATA  dbID  #IMPLIED
            >
 <!ELEMENT  dbField  (#PCDATA)
            >
 <!ATTLIST  dbField
            dbCol    NAME  #REQUIRED
            >
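
As a rough illustration of what the mapping amounts to, the sketch
below rebuilds the <table> instance under the names given by the 'db'
control attribute.  It is not an architecture engine: the function
name to_form and the rule "keep attributes whose name starts with the
control attribute's name" are assumptions standing in for the
attribute lists a real processor would read from the declarations
above.

```python
import xml.etree.ElementTree as ET

HTML_DOC = """
<table db="dbView" dbViewName="myfoos" dbQuery="select * from foo">
  <tr db="dbRecord" dbRecName="foo" dbKey="wefe142343">
    <td db="dbField" dbCol="bar">bar-blah</td>
    <td db="dbField" dbCol="baz">baz-blah</td>
  </tr>
</table>
"""

def to_form(elem, control="db"):
    """Rebuild elem under the name given by its control attribute.

    Only attributes whose names start with the control prefix are
    kept (an assumption of this sketch); the control attribute itself
    is dropped, since it has become the element name.
    """
    form = ET.Element(elem.get(control))
    for name, value in elem.attrib.items():
        if name != control and name.startswith(control):
            form.set(name, value)
    form.text = elem.text
    for child in elem:
        # Children without the control attribute are pruned.
        if child.get(control) is not None:
            form.append(to_form(child, control))
    return form

db_view = to_form(ET.fromstring(HTML_DOC))
```

The result is a dbView tree with dbRecord and dbField children, which
is all a database-centric consumer needs to see.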

The embedded view would make the database schema "primary" in terms
of the markup, but the similarity of structures makes the inverse map
to HTML categories just as simple:

 <dbView dbViewName="myfoos" dbQuery="select * from foo" html="table">
   <dbRecord dbRecName="foo" dbKey="wefe142343" html="tr">
     <dbField dbCol="bar" html="td">bar-blah</dbField>
     <dbField dbCol="baz" html="td">baz-blah</dbField>
     <dbField dbCol="blort" html="td">blort-blah</dbField>
   </dbRecord>
   <!-- more <dbRecord>s here -->
 </dbView>

(How a DTD subset could save some of the verbosity via defaults in an
ATTLIST declaration should be obvious.  DTDs are good, use 'em!)
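
For anyone who wants it spelled out, here is a small sketch using XML
syntax for the internal subset and Python's expat binding to read it.
The declarations and the helper name html_roles are illustrative
assumptions; the sketch relies on expat reporting declared attribute
defaults alongside specified attributes, which is its default
behaviour.

```python
import xml.parsers.expat

# The instance carries no 'html' attributes at all; the internal
# subset supplies them as defaults.
DOC = """<?xml version="1.0"?>
<!DOCTYPE dbView [
  <!ATTLIST dbView   html CDATA "table">
  <!ATTLIST dbRecord html CDATA "tr">
  <!ATTLIST dbField  html CDATA "td">
]>
<dbView dbViewName="myfoos">
  <dbRecord dbKey="wefe142343">
    <dbField dbCol="bar">bar-blah</dbField>
  </dbRecord>
</dbView>
"""

def html_roles(text):
    """Return (element name, value of 'html') for every start tag."""
    roles = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: roles.append(
        (name, attrs.get("html")))
    parser.Parse(text, True)
    return roles

roles = html_roles(DOC)
```

The instance stays terse, and the application still sees an 'html'
value on every element.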

The advantage here is that any application that only *cares* about the
database doesn't have to bother at all with the HTML markup (or its
limitations, real or imagined), and by the same token, an application
interested in only the HTML need only look to the 'html' attribute for
guidance.  In either case, with a parser as the interface between the
application (or its semantics-specific modules) and the document, the
essential "extraction" procedure - of *relevant data only* - is
completely mechanical.
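
A sketch of the database-only consumer, to show how little it has to
know: it reads the embedded form and never looks at an 'html'
attribute.  The function name records and the choice of a dict keyed
by dbKey are assumptions made for the illustration.

```python
import xml.etree.ElementTree as ET

EMBEDDED = """
<dbView dbViewName="myfoos" dbQuery="select * from foo" html="table">
  <dbRecord dbRecName="foo" dbKey="wefe142343" html="tr">
    <dbField dbCol="bar" html="td">bar-blah</dbField>
    <dbField dbCol="baz" html="td">baz-blah</dbField>
    <dbField dbCol="blort" html="td">blort-blah</dbField>
  </dbRecord>
</dbView>
"""

def records(view):
    """Return {dbKey: {dbCol: text}} for each dbRecord in the view."""
    return {
        rec.get("dbKey"): {f.get("dbCol"): f.text for f in rec.iter("dbField")}
        for rec in view.iter("dbRecord")
    }

rows = records(ET.fromstring(EMBEDDED))
```

Nothing in records() would change if the 'html' attributes were
removed, renamed, or pointed at some other presentation vocabulary.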

In an even better world, you could have markup like this:

   <myfoos html="table" db="dbView" dbQuery="select * from foo">
     <foo html="tr" db="dbRecord" dbKey="wefe142343">
       <bar html="td" db="dbField" dbCol="bar">bar-blah</bar>
       <baz html="td" db="dbField" dbCol="baz">baz-blah</baz>
       <blort html="td" db="dbField" dbCol="blort">blort-blah</blort>
     </foo>
     <!-- more <foo>s here -->
   </myfoos>

where the "primary" markup is semantic, and the HTML and database
"views" are both extractable using a precisely defined and controlled
extraction process.
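
As a hedged sketch of one such extraction, the code below derives the
HTML view from semantic markup of this kind by renaming each element
to the value of its 'html' attribute.  The function name html_view
and the policy of pruning elements that carry no 'html' attribute are
assumptions of the sketch, not part of any defined process.

```python
import xml.etree.ElementTree as ET

SEMANTIC = """
<myfoos html="table" db="dbView" dbQuery="select * from foo">
  <foo html="tr" db="dbRecord" dbKey="wefe142343">
    <bar html="td" db="dbField" dbCol="bar">bar-blah</bar>
    <baz html="td" db="dbField" dbCol="baz">baz-blah</baz>
  </foo>
</myfoos>
"""

def html_view(elem):
    """Rename elem to the value of its 'html' attribute.

    All attributes are dropped, and children without an 'html'
    attribute are pruned together with their subtrees (an assumed
    policy for this sketch).
    """
    out = ET.Element(elem.get("html"))
    out.text = elem.text
    for child in elem:
        if child.get("html") is not None:
            view = html_view(child)
            view.tail = child.tail  # keep the original whitespace layout
            out.append(view)
    return out

table = html_view(ET.fromstring(SEMANTIC))
markup = ET.tostring(table, encoding="unicode")
```

What comes out is plain table/tr/td markup with no trace of the
semantic names or the database attributes.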

> The /semantic web/ however implemented would add such a layer,
> making it possible for *any* system with proper access to link up
> the page with that data store or other stores. Shortcuts, such as
> the direct HTML<->database linking above, may make the road there
> easier,

Overloading constructs at cross-purposes is not a strategy to make
transitions easier.  The road to hell is paved with kludgery.
SGML/XML IDs have nothing to do with your problem, or its proper
solution:)


Arjun

Received on Monday, 14 February 2000 05:27:09 UTC