Re: DPH from Jeremy Carroll on 2001-10-22 (w3c-rdfcore-wg@w3.org from October 2001)

From: Jeremy Carroll <jjc@hplb.hpl.hp.com>
Date: Tue, 23 Oct 2001 00:16:13 +0100
To: w3c-rdfcore-wg@w3.org
Message-ID: <3BD4A8BC.1C6981F7@hplb.hpl.hp.com>

Pat Hayes wrote:
> 
> Wow. In the year 2001, fundamental design decisions in programing
> languages are critically influenced by the need to protect low-level
> hackers from the burden of implementing a simple stack. IBM 704
> assembler beats LISP after 45 years and about ten (twelve?) orders of
> magnitude increase in processing efficiency.
> 
> To hell with the DPH. If he can't parse a nested bracket structure,
> then he doesn't deserve the outrageous salary he is probably earning;
> tell him to take up gardening instead.
> 
> Pat
> --

Well, you had me laughing out loud, ... but I'll still wade in to defend
the DPH.

Considering whether we are creating unnecessary burdens is an important
test of any proposal. We can gain useful insight by understanding how
the DPH does useful stuff despite not having various mathematically
appealing properties embodied in their code.

I think the example which was *on topic* at the beginning of this thread
was parseType="Literal".

My suggestion on the table is that we use XML Canonicalization. This is
moderately difficult to implement, and currently the DPH doesn't do it
(nor does the sophisticated programmer, for that matter). What are the
attractions? Well, actually there is only one, and it is theoretical: it
becomes possible to provide a clean definition of equality.

This allows some of the things that this WG has found important to
happen. 
+ We can define test cases. (They depend on equality). 
+ We can have a well-founded abstract syntax, the graph, with
well-defined literals. (Although Pat keeps asserting that any literals
will do, it is made a lot harder if, as with M&S parseType="Literal", we
can't say that A=A and B=B).
+ We can build a model theory on top of the abstract syntax.
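
As a rough illustration of the equality point, here is a minimal sketch
in Python. It is not part of the proposal: it assumes the standard
library's xml.etree.ElementTree.canonicalize (Python 3.8 or later), which
implements C14N 2.0 rather than the Canonical XML that was current when
this thread was written. The helper name literals_equal and the sample
literals are hypothetical.

```python
from xml.etree.ElementTree import canonicalize


def literals_equal(a, b):
    """Treat two XML literals as equal when their canonical forms match.

    Assumption: C14N 2.0 as implemented by the standard library stands in
    for whichever canonicalization the WG would actually pick.
    """
    return canonicalize(a) == canonicalize(b)


# Two spellings of one literal: attribute order, quoting style and the
# way the ampersand is escaped all differ.
first = '<em class="x" id="y">Tom &#38; Jerry</em>'
second = "<em id='y'   class='x'>Tom &amp; Jerry</em>"
# A genuinely different literal (the id attribute has another value).
different = '<em class="x" id="z">Tom &amp; Jerry</em>'

print(first == second)                  # plain string comparison
print(literals_equal(first, second))    # comparison of canonical forms
print(literals_equal(first, different))
```

Plain string comparison treats the first two literals as distinct, while
comparing canonical forms lets a test case say that they are the same
literal and that the third is not.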

Meanwhile, without a well-defined representation of a
parseType="Literal" value, and without well-defined equality, the DPH
has done useful things, like shunt metadata around, help make the web
work in one way or another. Things that are valuable rather than merely
important.


I am currently experimenting with squaring this circle with a bit of
philosophy. The purpose of our rearticulation is not so much to change
RDF but to allow people (DPHs included) a better understanding of what
they are already doing. 

So let's suppose we go with my suggestion of XML Canonicalization
for this part of the pie.
The DPH who needs to move metadata around will still pick up the literal
string and pass it from an input stream to an output stream unchanged;
but now they can be enlightened and understand that this string is one
of the many acceptable representations for the XML Canonicalized
version. This helps clarify the issues to do with equality, if this
interests the DPH. It also helps identify what the DPH may need to do -
expansion of XML character and entity references, a bit of worry about
namespaces. Often in the restricted environments that the DPH works in
they know ahead of time the namespaces in scope on the input, and can
ensure that the output environment for the literal also has these
namespaces. Maybe the DPH will decide to use Ron's approach and
explicitly pass namespace information in parallel to the actual text.
The receiving end can, if it so chooses, now recreate the XML
Canonicalization, but since that too was written by a DPH, it probably
won't.
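
The pass-through scenario above can be sketched as follows. This is only
an illustration under stated assumptions: the function names, the
wrapper element name "literal" and the sample data are hypothetical, the
namespace URIs are assumed to need no attribute escaping, and Python's
C14N 2.0 implementation again stands in for the canonicalization under
discussion. It is not a description of Ron's actual approach, only of
the general idea of carrying namespace bindings next to the text.

```python
from xml.etree.ElementTree import canonicalize


def pass_through(literal, namespaces):
    """What the DPH does: hand the text and its bindings on unchanged."""
    return literal, namespaces


def wrap(literal, namespaces):
    """Re-attach the namespace bindings that travelled with the text.

    The wrapper element name is arbitrary; it only gives the prefixes
    used inside the literal something to resolve against.
    """
    decls = "".join(
        ' xmlns:{}="{}"'.format(prefix, uri)
        for prefix, uri in sorted(namespaces.items())
    )
    return "<literal{}>{}</literal>".format(decls, literal)


def canonical(literal, namespaces):
    """What a receiver may do, if it so chooses: recreate a canonical form."""
    return canonicalize(wrap(literal, namespaces))


ns = {"dc": "http://purl.org/dc/elements/1.1/"}
sent = '<dc:title lang="en" id="t">RDF</dc:title>'
respelled = "<dc:title id='t' lang='en'>RDF</dc:title>"

received, bindings = pass_through(sent, ns)
print(received == sent)                                        # text is untouched
print(canonical(received, bindings) == canonical(respelled, ns))
```

The sender never canonicalizes anything. The receiver can compare
canonical forms only because the prefix bindings arrived with the text;
without them the literal on its own would not even parse.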

So, we *can* have our cake and (the DPH can) eat it. We provide a good
theoretical framework, but remind application developers (and metadata
infrastructure developers) that they only need to bite off as much of it
as suits their problem space. The symbolic manipulation done in the
application is valid inasmuch as it corresponds to an equivalent
manipulation of its theoretical counterpart. In a restricted domain,
not moving to the more abstract level may be quite a bit easier.

I am pretty sure that the same sort of argument can be made about all
the graph and sub-graph isomorphism stuff, and the rdf-entail and the
rdfs-entail. These are issues of theoretical importance; a lack of
understanding of them may in the past have put a brake on RDF escaping
from the metadata arena; but if metadata in a restricted schema is what
an application is about, then that complexity is an irrelevance. (But
not dangerous).

Jeremy
Received on Monday, 22 October 2001 19:11:44 UTC