Re: Fwd: Re: Document fragment vocabulary from Sebastian Hellmann on 2011-08-29 (uri@w3.org from August 2011)

From: Sebastian Hellmann <hellmann@informatik.uni-leipzig.de>
Date: Mon, 29 Aug 2011 19:08:44 +0200
To: John Cowan <cowan@mercury.ccil.org>
CC: Erik Wilde <dret@berkeley.edu>, uri@w3.org, Michael Hausenblas <michael.hausenblas@deri.org>
Message-ID: <4E5BC79C.7060403@informatik.uni-leipzig.de>

On 29.08.2011 17:34, John Cowan wrote:
> Sebastian Hellmann scripsit:
>
>> The basic problem seems to be the definition of what plain text is. I
>> guess you are talking about the media type, while I am talking about
>> plain text in general. My definition would be a bit broader such
>> as:  "Plain text is basically anything that makes sense to open in a
>> text editor. " or negatively "Not a binary format." or "a character
>> sequence".
> I would say that what can be edited in a text editor is text.  Plain
> text, then, is a particular form of text that doesn't have any explicit
> presentation or semantic markup, with the significant exception of
> horizontal and vertical whitespace encoded as characters.  There may be
> markup, but it's implicit.
Fair enough, I guess you can define it any way you want.  I am not 100% 
sure what exactly "semantic markup" is.
Wikipedia makes three distinctions:
http://en.wikipedia.org/wiki/Text_%28disambiguation%29
http://en.wikipedia.org/wiki/Plain_text
http://en.wikipedia.org/wiki/Formatted_text
http://en.wikipedia.org/wiki/Enriched_text

Although plain text is defined as the opposite of formatted text,
programming source code is also defined as plain text.
Maybe http://en.wikipedia.org/wiki/Text_file
is closest to your definition of text, i.e. what can be edited in a text
editor.


>> I am a little worried as fragment-ids are so restricted to media
>> types, especially since you could easily reuse them, i.e. plain text
>> RFC 5147 for CSV and *ML
> You could, but fragment ids that match the semantic model would be much
> more robust and useful, like row/column for CSV and XPath for XML.
>
I would argue against this directly.
If e.g. file://myfile.csv or file://myfile.xml
has a syntax error (is not well-formed), then #line=10,11 or #range=88,105
will perform much better than CSV-specific schemes or XPath, which no
longer work.
Furthermore, it would be much more interoperable, as implementors could
implement fragment identification once and have it work for many other
formats.
So generic fragment ids are useful in another way. I agree that matching
the semantic model has certain benefits; reusing general fragment ids,
however, should also be considered.
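
To illustrate the robustness point, here is a minimal sketch in Python.
It assumes the RFC 5147 reading of the line scheme (positions counted
from 0 between lines, so line=2,3 selects the third line) and handles
only the "line=START,END" form; integrity checks and the char scheme are
left out. The function names and the sample document are made up for
this example.

```python
import xml.etree.ElementTree as ET


def resolve_line_fragment(text, fragment):
    """Return the lines selected by a 'line=START,END' fragment.

    Positions are counted between lines, starting at 0 (assumed
    RFC 5147 semantics), so this is a plain slice of the line list.
    """
    scheme, _, value = fragment.partition("=")
    if scheme != "line":
        raise ValueError("this sketch only handles the line scheme")
    start_s, _, end_s = value.partition(",")
    start, end = int(start_s), int(end_s)
    return text.splitlines()[start:end]


def xml_is_well_formed(text):
    """Report whether the standard-library XML parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample: the closing tag on the third line lacks its '>'.
broken_xml = "<doc>\n  <a>first</a>\n  <b>second</b\n  <c>third</c>\n</doc>\n"

# An XPath-style lookup needs a parsed tree, which is not available here.
print(xml_is_well_formed(broken_xml))

# The line-based fragment still resolves, because it only needs characters.
print(resolve_line_fragment(broken_xml, "line=2,3"))
```

The same resolver would work unchanged on a CSV file with a broken row,
which is the "implement once" argument in a nutshell.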
Cheers,
Sebastian


-- 
Dipl. Inf. Sebastian Hellmann
Department of Computer Science, University of Leipzig
Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
Research Group: http://aksw.org

Received on Monday, 29 August 2011 17:09:21 UTC