Re: Fwd: Re: Document fragment vocabulary

Hello all,
this discussion really helped me a lot to get a different perspective on 
the whole issue.
I also see now that my terminology regarding plain text was wrong. Having 
thought about it, I understand that it is not so easy to define a 
universal fragment identifier for the Web.

For my main use case (interoperability of NLP tools) this limitation is 
not really relevant, as the focus is on text and text annotation. One big 
problem in this domain is, for example, handling multi-layered and 
overlapping annotations, which is sometimes solved with milestones 
embedded in XML. [1] proposes a docuverse, which seems to be a bit of 
overkill. Overall, however, the question of which media type is used is 
not so relevant in this domain. I also intend to make it possible to embed 
the text into the RDF as an RDF literal. Then the media type would be fixed.


To wrap up this discussion, here is what I plan to do:

• First, I will collect most approaches to fragment identifiers and make 
a table "media type vs. possible fragment ids". In a next step I will 
write down some use cases and derive criteria for fragment ids from them. 
Then I will benchmark the approaches against these criteria and create a 
comparison table. I have just submitted the LOD2 EU deliverable, so some 
of this is already done.
• Based on the deliverable, we will specify NIF version 1.0, implement 
it for several tools and do a field test. The results will be collected 
in a NIF 2.0 draft. NIF 1.0 will have the recipes I already mentioned: 
offset-based and context-hash-based. I think we will also fix '#' as the 
separator and not leave the choice of #, ?nif= or / to the implementor. 
While working with NIF 1.0 we will see whether any problems come up doing 
it this way.
• At the end of September I will give a presentation at a W3C workshop 
[2]. There I will try to talk to David Filip (LRC/CNGL/LT-Web; LT-Web: 
Meta-data interoperability between Web CMS, Localization tools and 
Language Technologies at the W3C).
• We hope to submit the NIF 2.0 draft to an organization that will 
standardize it (W3C and ISO are both options to be considered).
• Lastly, if we have time, we might pick up and continue/extend the 
liveURL project [3]. Maybe we could also implement one of the RFCs along 
with it. It is of course just one plugin for one browser, but it would be 
a start. It will take some time, though, before we get to this. If 
anybody is willing to join, please mail me ;)
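
To make the two NIF 1.0 recipes a bit more concrete, here is a rough 
sketch in Python. Please note that the concrete URI layout is only an 
assumption for illustration and not the final NIF syntax: the "offset_" 
and "hash_" prefixes, the underscore separators, the default context 
length of 10 characters and the use of MD5 are all placeholders, and the 
function names are hypothetical.

```python
import hashlib


def offset_fragment(text: str, begin: int, end: int) -> str:
    """Offset-based recipe: address a substring by its character positions."""
    if not 0 <= begin <= end <= len(text):
        raise ValueError("offsets out of range")
    return f"offset_{begin}_{end}"


def context_hash_fragment(text: str, begin: int, end: int, context: int = 10) -> str:
    """Context-hash-based recipe: hash the substring plus its surroundings.

    The fragment does not contain absolute positions, so it stays the same
    when text is inserted far away from the annotated substring.
    """
    if not 0 <= begin <= end <= len(text):
        raise ValueError("offsets out of range")
    left = text[max(0, begin - context):begin]
    right = text[end:end + context]
    anchor = text[begin:end]
    # Brackets mark the anchor so that the same characters with different
    # anchor boundaries do not produce the same digest.
    digest = hashlib.md5(f"{left}({anchor}){right}".encode("utf-8")).hexdigest()
    return f"hash_{context}_{end - begin}_{digest}"


def nif_uri(document_uri: str, fragment: str) -> str:
    """Attach the fragment with a fixed '#' separator."""
    return f"{document_uri}#{fragment}"


if __name__ == "__main__":
    text = "The Semantic Web"
    print(nif_uri("http://example.org/doc.txt", offset_fragment(text, 4, 12)))
    print(nif_uri("http://example.org/doc.txt", context_hash_fragment(text, 4, 12)))
```

The offset-based id changes as soon as anything is inserted before the 
annotated string, while the context-hash-based id only changes when the 
string or its immediate context changes. That trade-off is one of the 
things the field test should tell us more about.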


Overall, I am a little bit sad that compatibility with the RFCs cannot 
be achieved so easily. In particular, the "optional" parts need to be 
stripped because of the "owl:sameAs" dilemma I sketched in a previous 
email. For now, I will probably make a page that describes the relation 
between the NIF URIs and the relevant RFCs. Maybe it is possible to 
find some convergence along the way.

Thanks a lot for your patience and for answering all my questions,
Sebastian

[1] http://palindrom.es/phd/research/earmark/
[2] http://www.multilingualweb.eu/documents/limerick-workshop/limerick-program
[3] http://liveurls.mozdev.org/index.html

Received on Friday, 2 September 2011 13:36:30 UTC