Re: Fragment in HTML + RDF from Xiaoshu Wang on 2007-10-25 (www-tag@w3.org from October 2007)

From: Xiaoshu Wang <wangxiao@musc.edu>
Date: Thu, 25 Oct 2007 19:22:42 +0100
To: Richard Cyganiak <richard@cyganiak.de>
CC: Tim Berners-Lee <timbl@w3.org>, "Booth, David (HP Software - Boston)" <dbooth@hp.com>, "Williams, Stuart (HP Labs, Bristol)" <skw@hp.com>, W3C-TAG Group WG <www-tag@w3.org>, Alan Ruttenberg <alanruttenberg@gmail.com>, Jonathan A Rees <jar@mumble.net>, Dan Connolly <connolly@w3.org>
Message-ID: <4720DEF2.2010003@musc.edu>

Richard Cyganiak wrote:
> On 24 Oct 2007, at 12:01, Tim Berners-Lee wrote:
>> There are three possible attitudes:
>>
>> 1) don't mix HTML and RDF, HTML will always have anchors. I think 
>> that this doesn't meet the need.
>>
>> 2) Do mix RDF and HTML, allow one file to define both anchors and 
>> arbitrary things.  Don't let the same fragid be used for both an 
>> anchor and a thing.
>>
>> 3) Do mix them, and by the way, allow the same fragid to be used as 
>> an ID for an anchor and an ID for a thing, with RDF clients and HTML 
>> clients doing different things.  I think that this path leads to 
>> madness, as in a script for example, I may want to use a URI to refer 
>> to one or the other unambiguously. It also makes it impossible for 
>> HTML+RDF clients.
>>
>> So (2) is my preference, and I feel it should be written down 
>> somewhere. Actually the MIME type registration for HTML would be the 
>> logical place.  A TAG finding could be a pragmatic place too.
>
> I agree that it's an important issue. I agree that 2) would be a huge 
> improvement over 1), and that 3) is a recipe for disaster.
>
> Regarding Xiaoshu's point that 2) doesn't allow us to click on an RDF 
> hash URI and end up scrolled to the right spot in the HTML: I agree 
> that this is a problem. But I think it's a Web browser implementation 
> issue, and can be worked around with in-browser trickery, such as a 
> bit of Javascript. So I'd say there's no need to bend the architecture.
I am not sure I like bringing Javascript into play.  So far, I think 
most, if not all, pages on w3.org are free of Javascript.

The truth is, I do not perceive any potential madness.  I am guessing 
that the potential madness refers to the following situation.  Suppose a 
resource, say http://example.com/something, has two representations, one 
"text/html" and the other "application/xml", and each of them contains 
an element with an id of 'cool'.  When someone is given the URI

"http://example.com/something#cool"

do they know which element it refers to?

I would say "neither" or "both", because 
http://example.com/something#cool refers to an abstract thing, just like 
any regular URI.  If you want to refer to the element in a particular 
representation, say so clearly, as in "the html representation of 
http://example.com/something#cool".  Of course, in real life, when the 
resource has only one representation, we are often thrifty with our 
words and just say "http://example.com/something#cool".  But if madness 
results, it is not because the semantics of URIs are wrong but because 
we are not careful about the meaning of our words.  Sure, we can avoid 
it by choosing (2), but then we unnecessarily split the same concept 
across different URIs.  That is the reason I support (3).
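
To make the situation concrete, here is a minimal Python sketch of one 
fragment selecting a different element in each representation.  The two 
document bodies, the resolve() helper, and the treatment of a plain "id" 
attribute as an identifier in the XML case are all assumptions made for 
illustration; nothing here is taken from a specification.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Two hypothetical representations of http://example.com/something
HTML_REP = '<html><body><h2 id="cool">Cool section</h2></body></html>'
XML_REP = '<doc><item id="cool">Cool thing</item></doc>'


class IdFinder(HTMLParser):
    """Records the tag name of the first element whose id matches."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self.found = None

    def handle_starttag(self, tag, attrs):
        if self.found is None and dict(attrs).get("id") == self.wanted:
            self.found = tag


def resolve(fragment, media_type):
    """Return the element name the fragment selects in one representation,
    or None if that representation has no such id."""
    if media_type == "text/html":
        finder = IdFinder(fragment)
        finder.feed(HTML_REP)
        return finder.found
    if media_type == "application/xml":
        # Assumption: a plain "id" attribute is treated as the identifier.
        for elem in ET.fromstring(XML_REP).iter():
            if elem.get("id") == fragment:
                return elem.tag
        return None
    raise ValueError("unknown media type: " + media_type)
```

Calling resolve("cool", "text/html") and resolve("cool", 
"application/xml") picks out different elements, which is why the 
representation has to be named whenever a specific element is meant.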

Approaching the topic from another angle, I would like to propose a 
minor extension to the current syntax of fragment identifiers: 
designate a default fragment identifier, e.g., "#_", as a catch-all for 
all unidentified fragment identifiers.

The reason for this proposal is that there are situations where (1) a 
resource has some fragment identifiers in one media type but not in 
another, and (2) not all fragment identifiers have to be explicitly 
defined.  Hence, when a fragment identifier is requested and the client 
software does not find it, the client can go to the catch-all fragment 
id, which explains the namespace policy, such as whether the requested 
URI is prohibited or whether it can only be found in a particular 
representation, etc.  This gives a URI fragment identifier semantics 
similar to the HTTP 3xx/4xx status codes.  Right now, for instance, if 
I am given the URI "http://www.w3.org#foo" and try to find out what it 
is in my browser, I am not sure which of the following is the case:

(1) http://www.w3.org/#foo is the top element of the html 
representation of "http://www.w3.org".
(2) My browser has failed to function properly.
(3) The fragment identifier is not defined in this representation.

Sure, I can confirm (3) by looking at the source code, but I still 
cannot tell whether the fragment exists in some other representation, 
since I cannot possibly exhaust all the possibilities.

But with a catch-all id, the browser can scroll to "#_", where we can 
tell the client what is going on.
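
The fallback rule could be sketched roughly as follows in Python.  This 
is only an illustration of the proposal, not how any browser behaves 
today; the scroll_target() helper and its return labels are hypothetical 
names.

```python
from html.parser import HTMLParser

CATCH_ALL = "_"  # the proposed default fragment identifier


class IdCollector(HTMLParser):
    """Collects every non-empty id attribute in an HTML document."""

    def __init__(self):
        super().__init__()
        self.ids = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value:
                self.ids.add(value)


def scroll_target(html, fragment):
    """Decide where a client would scroll for the given fragment."""
    if not fragment:
        return ("top", None)
    collector = IdCollector()
    collector.feed(html)
    if fragment in collector.ids:
        return ("found", fragment)
    if CATCH_ALL in collector.ids:
        # Unknown fragment: land on the element explaining the policy.
        return ("catch-all", CATCH_ALL)
    # No catch-all either: the ambiguous situation described above.
    return ("undefined", None)
```

With a page that contains an element whose id is "_", a request for an 
unknown fragment such as "foo" yields ("catch-all", "_") rather than 
silently leaving the reader at the top of the page.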

Regards,

Xiaoshu

Received on Thursday, 25 October 2007 18:23:36 UTC