Re: Document fragment vocabulary from Michael Hausenblas on 2011-08-30 (uri@w3.org from August 2011)

From: Michael Hausenblas <michael.hausenblas@deri.org>
Date: Tue, 30 Aug 2011 18:05:02 +0100
To: Erik Wilde <dret@berkeley.edu>, Sebastian Hellmann <hellmann@informatik.uni-leipzig.de>
Cc: URI list <uri@w3.org>, John Cowan <cowan@mercury.ccil.org>
Message-Id: <2AC8705D-9A49-4227-AD15-8178B23065EC@deri.org>

> that simply depends on how you define "working". ranges/lines always  
> select something, but not necessarily what you wanted them to  
> select. the fact that (some) fragment identifiers can break is a  
> good thing, in the same way as it is good that the web has 404s. in  
> decentralized systems things change and break and you have to deal  
> with it. if an XML document is broken, you cannot feed it into an  
> XML pipeline, and therefore it's just not suitable for processing  
> anymore.

+1
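
A minimal sketch of that point in Python. The helper names and the sample document are made up for illustration, and I am assuming zero-based positions that fall between lines/characters, roughly in the spirit of RFC 5147. Line and range selection always return something, even on a broken document, while tree-based selection fails outright:

```python
import xml.etree.ElementTree as ET

# A deliberately broken XML document: the second <b> is never closed.
BROKEN = "<a>\n<b>one</b>\n<b>two\n</a>\n"

def select_lines(text, start, end):
    """Lines between line positions start and end (position 0 = before line 1)."""
    return "".join(text.splitlines(keepends=True)[start:end])

def select_range(text, start, end):
    """Characters between character positions start and end."""
    return text[start:end]

def select_element(text, path):
    """Tree-based selection; only possible if the document parses."""
    return ET.fromstring(text).find(path)

print(repr(select_lines(BROKEN, 1, 2)))   # some line, whatever it now contains
print(repr(select_range(BROKEN, 0, 3)))   # some characters, likewise
try:
    select_element(BROKEN, "b")
except ET.ParseError as err:
    print("tree-based selection fails:", err)
```

Whether the line or range that comes back is still the one you meant is a separate question, which is exactly the caveat above.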


> this is just my opinion, of course, and i am looking forward to  
> seeing what you will end up doing.


Agreed, same here ...


Cheers,
	Michael
--
Dr. Michael Hausenblas, Research Fellow
LiDRC - Linked Data Research Centre
DERI - Digital Enterprise Research Institute
NUIG - National University of Ireland, Galway
Ireland, Europe
Tel. +353 91 495730
http://linkeddata.deri.ie/
http://sw-app.org/about.html

On 30 Aug 2011, at 17:41, Erik Wilde wrote:

> hello.
>
> On 2011-08-29 10:08 , Sebastian Hellmann wrote:
>> Maybe http://en.wikipedia.org/wiki/Text_file
>> is closest to your definition of text, i.e. what can be edited in a
>> text editor.
>
> in that case XML would be plain text, which does not make a whole  
> lot of sense. XML is a tree which happens to be text-encoded, but  
> there is a reason why all XML technologies are based on the tree  
> (XDM) and not on the text serialization. if something has a  
> text-based serialization that's convenient, but if the standard  
> application-level access to that data uses parsing into some form of  
> higher-level data structure, then it's not plain text anymore.
>
>> I would argue this directly.
>> If e.g. file://myfile.csv or file://myfile.xml
>> have a syntax error (not well-formed) then #line=10,11 or
>> #range=88,105 will perform much better than CSV-specific things or
>> XPath, which do not work any more.
>
> that simply depends on how you define "working". ranges/lines always  
> select something, but not necessarily what you wanted them to  
> select. the fact that (some) fragment identifiers can break is a  
> good thing, in the same way as it is good that the web has 404s. in  
> decentralized systems things change and break and you have to deal  
> with it. if an XML document is broken, you cannot feed it into an  
> XML pipeline, and therefore it's just not suitable for processing  
> anymore.
>
>> Furthermore, it will be much more interoperable as implementors could
>> implement fragment identification once and it will work for many
>> other formats.
>
> how would that work? even if you had some cross-media-type fragment  
> identifiers, the actual mapping of identifiers to fragments would  
> need to be implemented for each individual media type.
>
>> So there is another usefulness to it. I agree that matching the
>> semantic model has certain benefits; reusing general fragment IDs,
>> however, should also be considered.
>
> it's a good idea in theory, but very hard in practice. pretty much  
> the only thing you can probably do would be to have ids, and even  
> then the lexical structure of these probably would start to  
> interfere badly with some of the targeted media types. i think  
> there's an important reason why cross-media-type fragment  
> identifiers never got off the ground: it would make the  
> decentralized nature of media type definition much harder (they  
> would need to be coordinated to support fragment identifiers of a  
> certain kind), and it would be impossible to enforce retroactively.
>
> this is just my opinion, of course, and i am looking forward to  
> seeing what you will end up doing. cheers,
>
> dret.
>
> -- 
> erik wilde | mailto:dret@berkeley.edu  -  tel:+1-510-6432253 |
>           | UC Berkeley  -  School of Information (ISchool) |
>           | http://dret.net/netdret http://twitter.com/dret |

Received on Tuesday, 30 August 2011 17:05:32 UTC