New Issue: Range of URI+fragment dereference function (new issue?) from Paul Prescod on 2002-08-15 (www-tag@w3.org from August 2002)

From: Paul Prescod <paul@prescod.net>
Date: Thu, 15 Aug 2002 15:28:30 -0400
To: Dan Connolly <connolly@w3.org>
CC: www-tag@w3.org
Message-ID: <3D5C00DE.E39E5F38@prescod.net>
Dan Connolly wrote:
> 
>...
> If the question is
>         What do URI references refer to?
> the answer is:
>         They're abbreviations for URIs,
>         which refer to resources.

That was the question.

> > For example, the document says "In the case of a graphics format, a URI
> > reference might designate a circle or spline.". Does it designate a
> > "circle or spline" or circle _element_ or spline _element_.
> 
> Uh... it said 'circle or spline'; if it meant circle element,
> it should have said so. 

Fine. But the XPointer specification says it addresses elements. So the
architecture document is at odds with the XPointer specification.

 * http://www.w3.org/TR/xptr/#bare-names

>...
> > For
> > instance, given two different elements I know that they are really
> > different. I can infer distinctness just by virtue of the fact that XML
> > elements have unique identity.
> 
> Really? Where is XML element identity defined?
> Not from the XML 1.0 spec, and not from the infoset spec.
>....
> You can tell that XML elements are distinct by looking at their
> infoset properties, meanwhile, which is perhaps sufficient
> for the point you were making...

Right. So why are you being pedantic? Two element information items
that share the same parent and share the same child count in that
element are the same element. Similar identity tests can be constructed
for all information items. After all, this notion of identity is
necessary for XPath and XPointer to function.
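
A rough sketch of that identity test, assuming Python's standard
xml.etree.ElementTree as a stand-in for the infoset. The helper name
node_keys is hypothetical, and the index here counts element children
only, whereas the infoset also counts character items.

```python
import xml.etree.ElementTree as ET

def node_keys(root):
    """Map each element to the tuple of child indexes leading to it."""
    keys = {root: ()}
    stack = [root]
    while stack:
        parent = stack.pop()
        for index, child in enumerate(parent):
            # Same parent plus same position gives the same key.
            keys[child] = keys[parent] + (index,)
            stack.append(child)
    return keys

# Two circles with identical content are still distinct elements.
doc = ET.fromstring("<svg><circle r='1'/><circle r='1'/></svg>")
keys = node_keys(doc)
first, second = doc[0], doc[1]
same_node = keys[doc.find("circle")] == keys[first]
distinct = keys[first] != keys[second]
```

Under these assumptions the two circles get different keys even though
their names and attributes are equal, which is the distinctness I was
relying on.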

>...
> > I think that it is dangerous to declare that the referent is the SVG
> > abstraction and not the XML abstraction because how then do I talk about
> > the XML element?
> 
> This doesn't generalize to formats that aren't XML based. You'd
> agree we want to talk about postscript pages, PNG pixels
> and regions, MPEG frames, etc, no?

 * http://www.prescod.net/groves/shorttut/

"Even more powerful, though, are property sets for things that are not
even XML. SQL databases and OLE objects can have property sets. LaTeX
files can have property sets. People have defined experimental property
sets for CSS, CGM and for something as abstract as legal documents.
After all, a property set is just a simple data model."

"once you define a property set for a data object type, that data object
becomes addressable. This means that every subcomponent of every data
object in an enterprise is potentially addressable. The important point
is that you do not have to convert all of your data resources into XML
or HTML to make them addressable."

>...
> I think this belongs on the TAG issue list.

I strongly agree!

> It's been suggested that there are two kinds of dereferencing:
> xlink:href-style, for pointing at parts of documents,
> and rdf:resource-style, for pointing at things described
> by or mentioned in documents. A new tdb: URI scheme
> (thing-described-by) has been suggested.
> I'm not comfortable with either of those solutions, yet,
> but I agree it merits investigation.

My understanding is that this is more or less the topic map solution.
You say whether you are talking about a node or about the concept
discussed by that node. This is part of why topic map people and RDF
people have trouble communicating. When Steve Newcomb says he wants to
always know whether two pointers point to the same thing, he means in
node-space, not in concept-space.

> I keep thinking
> I should capture the RDF/XLink issue in a test case, but
> I haven't gotten around to it yet.
> 
> > The grove view is that by default we address elements and explicitly ASK
> > to address beyond elements into other layers of abstractions.
> 
> Hmm... the grove view assumes everything has element structure?

No. The grove assumes everything has node structure. I think that most
media types will have a syntactic grove that more or less corresponds to
the XML infoset and pointers beyond the syntactic grove into abstract
groves. Here is a stack of groves I would anticipate for XML-family
vocabularies:

 * First there is the low-level XML view. It would have elements,
attributes, characters and so forth.

 * Then there is the namespace view built on top of that. A "namespace
engine" would add some namespace information to the tree. It would
probably hide namespace attributes that were visible in the lower view.

 * Then there is yet another view that adds hyperlinking information.
The engine that provides this view can let us know whether a node is an
anchor or a link.

 * On top of that there could be a view built specifically for that
document type. It would understand the constructs in the document type
and make them available to a programmer as objects with properties.

A CGM file would have different property sets and groves.
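
The first three layers of that stack can be sketched as follows. This
is only an illustration, not a grove implementation: Node,
namespace_view and hyperlink_view are hypothetical names, plain dicts
stand in for property sets, and the link test is reduced to "has an
xlink:href attribute".

```python
from dataclasses import dataclass, field

XLINK_NS = "http://www.w3.org/1999/xlink"

@dataclass
class Node:
    """Lowest view: names and attributes exactly as written."""
    name: str
    attrs: dict
    children: list = field(default_factory=list)

def namespace_view(node, scope=None):
    """Second view: resolve prefixes and hide the xmlns attributes."""
    scope = dict(scope or {})
    for key, value in node.attrs.items():
        if key == "xmlns":
            scope[""] = value
        elif key.startswith("xmlns:"):
            scope[key[len("xmlns:"):]] = value

    def expand(qname, is_attr=False):
        prefix, sep, local = qname.rpartition(":")
        if is_attr and not sep:
            return (None, local)  # unprefixed attributes have no namespace
        return (scope.get(prefix), local)

    return {
        "name": expand(node.name),
        "attrs": {
            expand(key, is_attr=True): value
            for key, value in node.attrs.items()
            if key != "xmlns" and not key.startswith("xmlns:")
        },
        "children": [namespace_view(child, scope) for child in node.children],
    }

def hyperlink_view(ns_node):
    """Third view: add a flag saying whether the node is a link."""
    return {
        **ns_node,
        "is_link": (XLINK_NS, "href") in ns_node["attrs"],
        "children": [hyperlink_view(child) for child in ns_node["children"]],
    }

low = Node("doc", {"xmlns:x": XLINK_NS}, [
    Node("ref", {"x:href": "#c1"}),
    Node("p", {"class": "note"}),
])
top = hyperlink_view(namespace_view(low))
```

Each function consumes the view below it and returns a richer one, so
a pointer can say which layer it is addressing.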

> > I think
> > the Web needs to formalize its view.
> 
> Maybe, but maybe not. Maybe we don't need any more constraints
> here.

How will I know what RDF assertions point to if we don't differentiate
between elements and concepts?

> > >...
> > > > and if I have a resource identified
> > > > by the URIRef http://example.com/someResource#otherResource, how do I
> > > > reference a fragment of that resource (assuming it has one)?
> > >
> > > OK, assuming it has one, I can coin a new URI
> > >
> > >   mid:2002-08-14.thismessage@w3.org#abc
> > >         (pretend that's the MID for this message)
> > >
> > > to refer to it.
> >
> > But we've lost the benefits of the HTTP URI scheme.
> 
> How so?

How do I dereference this URI ("mid:2002-08-14.thismessage@w3.org#abc")
to get at the element?

> > ...
> > Historically, fragments pointed at things that were NOT resources
> 
> Really? Where is that documented?
> The documentation I'm aware of says that everything with identity
> is a resource, so of course the things fragments point to
> are resources.

RFC 2396 is careful not to call a fragment a resource.

>...
> I don't think that changing the words used to describe a concept
> makes issues go away. 

No, but merging two concepts, "Resource" and "Fragment", can raise
questions of logical completeness where there were none before. If I can
get a fragment of a resource and a fragment is a resource, then why
isn't there a syntax for a fragment of a fragment? If there is a
mismatch between model and syntax, that is an issue from a purely
pedagogic point of view!
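
To make the syntax side concrete, here is a small sketch using
urldefrag from Python's standard urllib.parse. The doubled-"#" string
is a made-up illustration, not valid URI-reference syntax; the parser
is merely lenient about it.

```python
from urllib.parse import urldefrag

# One level of fragment is all the syntax offers.
single = urldefrag("http://example.com/someResource#otherResource")

# A naive "fragment of a fragment" is not a second level: the split
# happens at the first "#", and the rest is one opaque fragment string.
nested = urldefrag("http://example.com/someResource#otherResource#abc")

print(single.url, single.fragment)
print(nested.url, nested.fragment)
```

Both calls report the same base URI, and the second "#" ends up inside
the fragment text rather than naming a part of the first fragment.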
-- 
 Paul Prescod
Received on Thursday, 15 August 2002 15:31:12 UTC