RE: Metonyms, # or /, mimetypes, named graphs from McBride, Brian on 2004-06-14 (www-rdf-logic@w3.org from June 2004)

From: McBride, Brian <brian.mcbride@hp.com>
Date: Mon, 14 Jun 2004 16:39:58 +0100
To: Jeremy Carroll <jjc@hplb.hpl.hp.com>, rdf-logic <www-rdf-logic@w3.org>
Message-ID: <E864E95CB35C1C46B72FEA0626A2E808026A12FA@0-mail-br1.hpl.hp.com>
 

> -----Original Message-----
> From: www-rdf-logic-request@w3.org 
> [mailto:www-rdf-logic-request@w3.org] On Behalf Of Jeremy Carroll
> Sent: Friday, June 11, 2004 1:24 PM
> To: rdf-logic
> Subject: Metonyms, # or /, mimetypes, named graphs
> 
> 
> 
> This is long and rambling, apologies, ... I am hoping that maybe danbri
> or pat or anyone else might choose to read it all and comment, and then
> maybe in a bit a somewhat more succinct version might emerge (or not).

Despite lacking the mips between the ears of Pat and Danbri, who probably
have the mips to know better than to chase this particular rabbit ...

> 
> I was sitting on my balcony just after dawn this morning, smoking a 
> cigarette and drinking some coffee (instant, apologies; I had a 
> cappuccino at the bar later).

I am sitting on a crowded train south from Scotland, with a cappuccino.

> 
> I was thinking about xsd:int, the URI
> 
> http://www.w3.org/2001/XMLSchema#int
> 
> When you use this URI with type rdfs:Datatype, then you mean 
> a datatype, 
> (the one defined by XML Schema).
> 
> We can say
> 
> [A]
> 
> xsd:int rdf:type rdfs:Datatype .
> 
> On the other hand, if you read this URI according to some XML spec (I
> am not sure which one), apparently the frag ID identifies an element in
> the XML document

There is a difference between identifies and denotes.  

It seems to me when I hear folks argue about web architecture that many seem
to believe that identification is an intrinsic property of URIs.  That
doesn't seem right to me.  Identification is a mapping from identifiers to
some set.  In principle at least, given a set of identifiers, one can have
many different identification functions with that set as their domain.  In
general, identification is intrinsic to the function, not to the
identifiers.
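
A minimal Python sketch of that point, assuming two made-up identification
functions over the same identifier.  The names web_identify and rdf_denote
are hypothetical, and the target strings are placeholders for illustration,
not claims about what either mapping really returns.

```python
# Two hypothetical identification functions sharing one domain.
# The values are placeholder strings, chosen only for illustration.
XSD_INT = "http://www.w3.org/2001/XMLSchema#int"

web_identify = {XSD_INT: "element in the XML Schema document"}
rdf_denote = {XSD_INT: "the int datatype"}


def same_domain(f, g):
    """True when both mappings are defined over exactly the same identifiers."""
    return set(f) == set(g)


def disagreements(f, g):
    """Identifiers that both mappings cover but send to different things."""
    return sorted(k for k in set(f) & set(g) if f[k] != g[k])
```

Nothing in the identifier itself settles which mapping applies; the
difference lives entirely in the two functions.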

Now web architecture wants to put some constraints on identification
functions using URIs as identifiers.  The principle is that the web should
always map a URI to the same thing.  This allows users to include URLs in
hyperlinks and know that everywhere on the planet a GET on that URL will
return the same(ish) thing.  It means that on an engineering level caching
will work.  It means that one can cut and paste URLs between SVG and html
documents and the hyperlinks will refer to the same thing.

Along comes the semantic web and it wants to use URIs to 'denote', and to
denote things that are not web pages, e.g. the int datatype from XML schema
datatypes.  Now the formal semantics of RDF allows many interpretations and
I'm not sure what the right term to use here is, but it seems like there is
a 'standard' denotation for common URIs such as the URI corresponding to
xsd:int.

To denote is to identify.  A question that arises is whether it is ok for
the standard denotation of a URI to be different from what the web thinks a
URI identifies.

Can xsd:int denote a datatype and xsd:int identify on the web a ... Uhmmm
what?

I think we get a bit confused about what identification function the web
implements and the term 'resource' seems to get me really confused.  So I'm
going to talk about Roys and ignore lots of the complexity and just think
about GET.

So GET maps URL x MimeType x Time x ... other HTTP headers to SEQUENCE of
OCTET x MimeType x ReturnCode.  (I'll have the details wrong, but I hope
this is close enough).  So let's say that mapping is composed of:

GET1: URL -> ROY
GET2: ROY x MimeType x Time x .. -> SEQUENCE of OCTET x MimeType x
ReturnCode
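
The decomposition can be sketched in Python as plain function composition.
This is only an illustration under loud assumptions: Roy, Response,
compose_get and the toy_* functions are hypothetical names, the extra
header arguments are dropped, and nothing here touches the network.

```python
from typing import Callable, NamedTuple


class Roy(NamedTuple):
    """Stand-in for whatever GET1 maps a URL to (a 'Roy' in the text above)."""
    name: str


class Response(NamedTuple):
    octets: bytes
    mime_type: str
    return_code: int


def compose_get(
    get1: Callable[[str], Roy],
    get2: Callable[[Roy, str, float], Response],
) -> Callable[[str, str, float], Response]:
    """Build GET as GET2 applied to the result of GET1."""
    def get(url: str, mime_type: str, time: float) -> Response:
        return get2(get1(url), mime_type, time)
    return get


# Toy stages, invented purely so the composition can be exercised.
def toy_get1(url: str) -> Roy:
    return Roy(name=url)


def toy_get2(roy: Roy, mime_type: str, time: float) -> Response:
    body = f"representation of {roy.name}".encode("utf-8")
    return Response(body, mime_type, 200)


toy_get = compose_get(toy_get1, toy_get2)
```

The only stage that looks at the URL is GET1, which is why the questions
about denotation below are all phrased in terms of GET1.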

There are several possible answers to the question of whether denotation may
differ from what the web identifies.  And some distinguish
between URLs with and without fragment identifiers, so I'll use URL/ for URLs
without fragment identifiers and URL# for URLs with frag ids.

Answer A) GET1(url) need not equal Denote(url).  So http://connolly.org/ can
denote Dan Connolly whilst GET1(http://connolly.org/) can return a Roy.  Dan
Connolly is not a Roy :)  With this answer we have the problem that if
http://connolly.org/ does not denote the Roy it identifies on the web,
what does?  We could steal a trick from Larry Masinter that I will come to
later, and propose that urn:roy:http://connolly.org/ does.

Answer B) GET1(url/) must equal Denote(url/) but GET1(url#) need not equal
Denote(url#).  This answer tries to take advantage of the fact that HTTP
only ships whole representations, so maybe we can argue that
GET1(url#) is undefined, so we are free to define it in Denote and not be
inconsistent. I'm not sure I buy that, and in any case were it true, if
xsd:int denotes the datatype, what denotes the fragment of the xsd schema
doc corresponding to int?  urn:roy:...?

Answer C) GET1(URL) must equal Denote(URL).  In which case what do we use to
name abstractions like the datatype int?  Larry Masinter has suggested a
URN scheme so that if http://www.w3.org/.../schema#int identifies on the web
the fragment in the xml schema defining int, then
urn:tdb:http://www.w3.org/.../schema#int denotes the datatype.
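
Read as constraints on a pair of mappings, the three answers can be checked
mechanically.  A rough Python sketch, assuming GET1 and Denote are given as
dictionaries and that only URLs present in both are compared (so an
undefined GET1(url#) is simply skipped).  The function name allowed_by, the
example.org URLs and the placeholder values are all hypothetical.

```python
def has_fragment(url):
    """URL# in the notation above: a URL carrying a fragment identifier."""
    return "#" in url


def allowed_by(get1, denote):
    """Return which of answers A, B, C permit this pair of mappings."""
    urls = set(get1) & set(denote)
    agree = {u for u in urls if get1[u] == denote[u]}

    answers = ["A"]  # Answer A requires no agreement at all.
    if all(u in agree for u in urls if not has_fragment(u)):
        answers.append("B")  # Agreement needed only on URL/.
    if agree == urls:
        answers.append("C")  # Agreement needed everywhere.
    return answers


# Placeholder mappings: they agree on the document, differ on the fragment.
get1 = {
    "http://example.org/doc": "roy:doc",
    "http://example.org/doc#int": "roy:fragment",
}
denote = {
    "http://example.org/doc": "roy:doc",
    "http://example.org/doc#int": "datatype:int",
}
```

With these placeholders allowed_by(get1, denote) gives ["A", "B"]: the pair
is ruled out only by Answer C.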


> 
> http://www.w3.org/2001/XMLSchema
> 
> at a guess that is this XML document
> 
> http://www.w3.org/2001/XMLSchema.xsd
> 
> and hence this element
> 
> 
>    <xs:simpleType name="int" id="int">
>      <xs:annotation>
>        <xs:documentation
>          source="http://www.w3.org/TR/xmlschema-2/#int"/>
>      </xs:annotation>
>      <xs:restriction base="xs:long">
>        <xs:minInclusive value="-2147483648" id="int.minInclusive"/>
>        <xs:maxInclusive value="2147483647" id="int.maxInclusive"/>
>      </xs:restriction>
>    </xs:simpleType>
> 
> Thus, from a different point of view, we could say
> 
> [B]
> 
> xsd:int owl:sameAs """<xs:simpleType name="int" id="int">
>      <xs:annotation>
>        <xs:documentation
>          source="http://www.w3.org/TR/xmlschema-2/#int"/>
>      </xs:annotation>
>      <xs:restriction base="xs:long">
>        <xs:minInclusive value="-2147483648" id="int.minInclusive"/>
>        <xs:maxInclusive value="2147483647" id="int.maxInclusive"/>
>      </xs:restriction>
>    </xs:simpleType>"""^^rdf:XMLLiteral .
> 
> (modulo XML canonicalization)
> 
> Both [A] and [B] are legitimate (I think), but put them 
> together to form 
> a single graph [A U B] and we get nonsense (a contradiction).
> 
> Basically [A] and [B] take different views of the world, and are 
> appropriate for different tasks.
> 
> This is reminiscent of:
> - named graphs
> See
> http://www.hpl.hp.com/techreports/2004/HPL-2004-57.html
> 
> - metonymity
>     Putting the sentences:
>      "Smith summed up for the crown"
> with
>      "The crown weighs 4 kilos"
>    gives nonsense, since the former sentence takes a view of 
> "the crown" 
> as standing for something other than a crown (the power of 
> the state in 
> a monarchy)
> 
> - the problem of whether an http URL (without a frag ID) can 
> ever stand 
> for something other than a document.
> 
> 
> ====
> 
> We already have a mechanism for taking different views of a resource: 
> mime types.
> 
> ====
> 
> We could, as always, resolve this by adding another level of 
> indirection; using more sophisticated modelling. e.g. maybe instead of
> 
> xsd:int rdf:type rdfs:Datatype .
> 
> we should have said
> 
> _:d rdf:type rdfs:Datatype .
> _:d rdfs:isDefinedBy xsd:int .
> 
> However, since we can always say that, but in practice we 
> have to decide 
> how many levels of indirection to have, and usually it is 
> easier to have 
> none. I think the "your modelling is poor" argument, at the 
> end of the 
> day, is uncompelling. It is unfalsifiable, any modelling is 
> always poor, 
> because by its very nature it approximates the way the world 
> is (as if 
> we knew) rather than giving a perfect reflection (oh, the joy of 
> omniscience).

Interesting sense of falsifiable.  This statement is falsifiable because it
is true, and must be true of any attempt to do better :)

[...]

> 
> So, since I am underwhelmed by the modelling argument, I consider 
> metonymity.
> 
> In [A] we are taking one view of xsd:int, that represented by the 
> (hypothetical) mimetype "x-semantics/rdf-datatype"; in [B] 
> our view is 
> with the (hypothetical?) mimetype "application/xml-frag".
> 
> Perhaps we should just add this information to the class definitions.
> 
> e.g.
> 
> rdfs:Datatype eg:hasMimeTypes [
>     "x-semantics/rdf-datatype"
> ] .
> eg:XMLFragment eg:hasMimeTypes [
>    "application/xml-frag"
>    "application/xml"
>    "application/xhtml"
> ] .
> 
> (Perhaps the list of possible mimetypes should be open rather 
> than closed)
> 
> If we then merge two graphs, the view of any specific 
> resource taken in 
> the merge must be using a mimetype occurring in each of 
> eg:hasMimeTypes 
> list for every class that that resource is in. Thus, if as in 
> this case, 
> the lists have an empty intersection, we know that the views 
> taken are 
> contradictory, and we appropriately choose not to merge them. 
> (Using the 
> named graphs approach)
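
The merge rule quoted above amounts to a set intersection.  A small Python
sketch, assuming the eg:hasMimeTypes lists are closed and that a class with
no list imposes no constraint (both are assumptions; the quoted text leaves
the open/closed question undecided).  HAS_MIME_TYPES, shared_views and
can_merge are hypothetical names; the mimetypes are the hypothetical ones
from the quoted example.

```python
# Class -> mimetypes under which its members may be viewed (quoted example).
HAS_MIME_TYPES = {
    "rdfs:Datatype": {"x-semantics/rdf-datatype"},
    "eg:XMLFragment": {
        "application/xml-frag",
        "application/xml",
        "application/xhtml",
    },
}


def shared_views(classes, table=HAS_MIME_TYPES):
    """Mimetypes common to every listed class; None if no class is listed."""
    listed = [table[c] for c in classes if c in table]
    if not listed:
        return None  # Nothing constrains the view of this resource.
    return set.intersection(*listed)


def can_merge(classes, table=HAS_MIME_TYPES):
    """A resource is mergeable if some common view exists or none is required."""
    views = shared_views(classes, table)
    return views is None or len(views) > 0
```

For xsd:int typed as both rdfs:Datatype and eg:XMLFragment the intersection
is empty, so can_merge returns False and the two graphs stay separate.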
> 
> ===
> 
> A different example might be a URI of an image e.g.
> 
> http://www.hpl.hp.com/personal/jjc/images/venn.png
> 
> We might want to discuss this image as a bitmap, talk about 
> its height 
> and width in pixels (this discussion would be meaningless for an SVG 
> version of the image)
> We might want to discuss this image as an abstract image, 
> talk about the 
> colours used, mention that it is a geometric design ... (this discussion 
> would be meaningless for say a text document giving the 18 
> coordinates 
> which define the image)
> I, at least, might want to discuss the deeper significance of 
> the image 
> as a piece of abstract mathematics: I would see such a discussion as 
> typically appropriate to many variations of this image, 
> including ones, 
> where the image was abstracted to being simply a planar 
> graph, with no 
> indication of the (Euclidean) geometrical representation.
> 
> These three different discussions, are, in my view, all legitimate 
> discussions about the image. They seem to be discussing 
> different facets 
> of the image. They are incompatible discussions. We have to switch 
> world-views to go from one to the other.
> 
> Could (an extension of) mimetypes be used to distinguish 
> these different 
> discussions, like with the xsd:int example?
> I think so.
> 
> ====
> 
> # or / ?
> 
> I think the dogmatic # position in this debate is predicated on http 
> URLs denoting documents, and not more abstract concepts. If 
> we can see 
> the more abstract concepts as merely another metonymic 
> representation of 
> the (abstract) document, with an appropriate mimetype, then maybe the 
> problem goes away.
> I haven't yet understood the dogmatic / position, only the 
> pragmatic one.
> 
> ====
> 
> I smoked another cigarette and started my day.
> 

This reminds me of an essay I wrote recently where the arguments I was trying
to refute relied on confusing the different senses in which person-referring
terms like 'she' are used.

When you are talking about the image, you are referring to different senses
in which the word might be used.  Maybe we can model those senses
explicitly, but not using mime types.

:bitMap a BitMap;
        senseOf :image;
        height ...;
        width ....;
        ...

:abstract a AbstractImage;
          senseOf :image;
          usesColor :red;
          ...

:mathematical a MathematicalDiagram;
              senseOf :image;
              discussedBy :jjc;
              ...

But in practice folks will not put in the indirection; I know I don't.

Maybe we could do that automagically, i.e.

:image a BitMap;
       height ...;
       width ....;
       ...

Transforms to

:bitMap a BitMap;
        senseOf :image;
        height ...;
        width ....;
        ...

And we can do this transform until contradictions amongst merged data
disappear.
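
A rough Python sketch of that transform, under several assumptions of mine:
graphs are sets of (subject, predicate, object) tuples, a "contradiction"
means one subject typed with two classes declared disjoint, and each sense
node gets a generated name.  The function names, the "-sense0" naming and
the disjointness list are hypothetical; none of it is a worked-out proposal.

```python
def split_sense(graph, subject, sense):
    """Move every statement about `subject` onto `sense` and link the two."""
    moved = {(sense, p, o) if s == subject else (s, p, o) for (s, p, o) in graph}
    moved.add((sense, "senseOf", subject))
    return moved


def contradictions(graph, disjoint):
    """Subjects typed with both members of some disjoint pair of classes."""
    types = {}
    for s, p, o in graph:
        if p == "a":
            types.setdefault(s, set()).add(o)
    return sorted(s for s, ts in types.items()
                  if any(pair <= ts for pair in disjoint))


def merge_with_senses(graphs, disjoint):
    """Merge graphs, giving each graph its own sense of any clashing subject."""
    graphs = [set(g) for g in graphs]
    merged = set().union(*graphs)
    for subject in contradictions(merged, disjoint):
        for i, g in enumerate(graphs):
            if any(s == subject for s, _, _ in g):
                graphs[i] = split_sense(g, subject, f"{subject}-sense{i}")
    return set().union(*graphs)
```

One pass is enough here because every source graph gets a fresh sense node;
a graph that contradicts itself internally would need something smarter.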

Brian
Received on Monday, 14 June 2004 11:40:45 UTC