- From: David Wood <david@3roundstones.com>
- Date: Sat, 24 Aug 2013 07:20:01 -0400
- To: Ivan Herman <ivan@w3.org>
- Cc: Pat Hayes <phayes@ihmc.us>, W3C RDF WG <public-rdf-wg@w3.org>, Felix Sasaki <fsasaki@w3.org>
The Open Annotation Community Group [1] is the best fit, I think. Section 2.1.4 of their spec [2] is entitled "Fragment URIs Identifying Body or Target" and attempts to define an RDF- and URI-friendly way to identify a particular part of a resource to annotate. Having worked with their spec, I don't think they have quite succeeded either. It may not be possible to do this cleanly given the state of the specs.

Regards,
Dave
--
http://about.me/david_wood

[1] http://www.w3.org/community/openannotation/
[2] http://www.openannotation.org/spec/core/20130208/core.html#FragmentURIs

On Aug 24, 2013, at 2:10, Ivan Herman <ivan@w3.org> wrote:

> That is a good point, but it may still be good for the records of the ITS WG to, at the minimum, share our opinion without requesting a change. The ITS WG may then decide to contact, e.g., the TAG if they want... The problem is that I do not really see which group owns this thing.
>
> Actually... we are not completely out of this. After all, the concepts document does talk about fragments, i.e., we do go beyond a purely opaque IRI...
>
> Not sure.
>
> Ivan
>
> ---
> Ivan Herman
> Tel:+31 641044153
> http://www.ivan-herman.net
>
> (Written on mobile, sorry for brevity and misspellings...)
>
> On 24 Aug 2013, at 06:19, Pat Hayes <phayes@ihmc.us> wrote:
>
>> Ivan
>>
>> While I sympathise with, and share, your discomfort, I don't see that this is an issue particularly for RDF to comment upon. RDF, as you note, treats IRIs as opaque, so this entire discussion seems irrelevant to RDF-WG. Maybe some other WG, or the TAG, should be asked to take up this issue with ITS WG?
>>
>> Pat
>>
>> On Aug 23, 2013, at 5:57 AM, Ivan Herman wrote:
>>
>>> As recorded as an action (wait, it was not recorded on the call because tracker got confused by several ivan-s:-) I reviewed the ITS 2.0 document, as requested by the ITS WG via Felix Sasaki[1]. The section that is relevant for this Working Group is the mapping to an external ontology, called NIF[2]. Actually, the details of that ontology are also not relevant for this Working Group; the issue is to map the attributes set on the textual content of an HTML (or XML) document into RDF.
>>>
>>> To take the example of the document:
>>>
>>> <html><body><h2 translate="yes">Welcome to <span
>>> its-ta-ident-ref="http://dbpedia.org/resource/Dublin" its-within-text="yes"
>>> translate="no">Dublin</span> in
>>> <b translate="no" its-within-text="yes">Ireland</b>!</h2></body></html>
>>>
>>> the goal is to produce a set of RDF statements of the form:
>>>
>>> <URI_TO_IDENTIFY_A_TEXT_PORTION>
>>>   nif:property1 value1;
>>>   nif:property2 value2;
>>>   nif:prop <URI_TO_IDENTIFY_A_TEXT_POSITION>
>>>   ...
>>>
>>> The really interesting question is how to define the two URI-s <URI_TO_IDENTIFY_A_TEXT_PORTION> and <URI_TO_IDENTIFY_A_TEXT_POSITION>, where, say, the first should somehow refer to "Welcome to Dublin in Ireland!" and the other should tell the world that this text is within the <h2> element of the file.
>>>
>>> The current mapping uses the following two URI-s
>>>
>>> <http://example.com/exampledoc.html#char=0,29>
>>> <http://example.com/exampledoc.html#xpath(/html/body[1]/h2[1])>
>>>
>>> Although it is quite obvious what these are for, I sense some sort of a problem with these. We may end up in a rathole, but...
>>>
>>> - We refer to IRI-s in our concept document: RFC3987
>>> - IRI-s map to URI-s: RFC3987
>>> - What RFC3987 says about fragments is:
>>>
>>> "The fragment's format and resolution is therefore dependent on the media type [RFC2046] of a potentially retrieved representation, even though such a retrieval is only performed if the URI is dereferenced. If no such representation exists, then the semantics of the fragment are considered unknown and are effectively unconstrained."
>>>
>>> The way I translate this is that if I want to have a proper URI, where I expect the media type to be BLA, then the fragment ID should somehow be defined for BLA. Although RDF regards IRI-s as opaque, I would still feel uneasy doing otherwise.
>>>
>>> Looking at the URI-s above
>>>
>>> - The 'char' fragment is defined by RFC 5147, but is defined for text/plain only. ITS talks about XML and HTML, i.e., talks about resources whose media types are definitely _not_ text/plain
>>> - The xpath fragment id is fine for XML. But it is not defined for text/html and, knowing how XML is frowned upon by the HTML WG, I do not expect that to ever change.
>>>
>>> In view of this, I do not feel comfortable with the choice of the mapping. The URI-s are not dereferenceable, neither are they correct...
>>>
>>> That being said, I may be too picky and we could let this go, also considering the fact that this section is _not_ normative in ITS.
>>>
>>> I had some discussion with Felix and also with Sebastian Hellmann, who is the author of NIF; one proposal I had was to use a URI of the form
>>>
>>> http://www.w3.org/its?resource=http://example.com/exampledoc.html&char=0,29
>>>
>>> which, if some simple service is provided, can provide some simple information back, and is ok as a URI. I think that would be acceptable to them. But again, this WG may decide that I am just way too pedantic...
>>>
>>> Ivan
>>>
>>> P.S. It is of course possible to radically change the mapping with some blank nodes in the middle to avoid the issue...
>>>
>>> [1] http://lists.w3.org/Archives/Public/public-rdf-wg/2013Aug/0000.html
>>> [2] http://www.w3.org/TR/2013/WD-its20-20130820/#conversion-to-nif
>>>
>>> ----
>>> Ivan Herman, W3C
>>> Home: http://www.w3.org/People/Ivan/
>>> mobile: +31-641044153
>>> FOAF: http://www.ivan-herman.net/foaf.rdf
>>
>> ------------------------------------------------------------
>> IHMC (850)434 8903 home
>> 40 South Alcaniz St. (850)202 4416 office
>> Pensacola (850)202 4440 fax
>> FL 32502 (850)291 0667 mobile (preferred)
>> phayes@ihmc.us http://www.ihmc.us/users/phayes
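
For readers comparing the two URI shapes discussed in the quoted review, here is a minimal Python sketch. It assumes the example document URI from the thread and treats http://www.w3.org/its purely as the hypothetical service endpoint from the proposal; no such service is claimed to exist. The function names are illustrative only. Unlike the URI written out in the email, this sketch percent-encodes the nested document URI, which is an assumption about how such a service would want its parameters passed.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

DOC = "http://example.com/exampledoc.html"   # example document from the thread
SERVICE = "http://www.w3.org/its"            # hypothetical endpoint from the proposal


def fragment_uri(doc: str, start: int, end: int) -> str:
    """Current mapping: a 'char' fragment (RFC 5147 syntax) on the document URI."""
    return f"{doc}#char={start},{end}"


def query_uri(doc: str, start: int, end: int, service: str = SERVICE) -> str:
    """Proposed form: document URI and character range as query parameters."""
    return f"{service}?{urlencode({'resource': doc, 'char': f'{start},{end}'})}"


def parse_query_uri(uri: str) -> tuple[str, int, int]:
    """Recover (document URI, start, end) from the query-style URI."""
    params = parse_qs(urlsplit(uri).query)
    start, end = (int(n) for n in params["char"][0].split(","))
    return params["resource"][0], start, end


text = "Welcome to Dublin in Ireland!"
frag = fragment_uri(DOC, 0, len(text))
query = query_uri(DOC, 0, len(text))

# The fragment is a client-side part of the URI; its meaning depends on the
# media type of the retrieved representation.
print(frag, "->", urlsplit(frag).fragment)
# The query form carries the same information in a part of the URI that a
# service would actually receive.
print(query, "->", parse_query_uri(query))
```

The point of the query form is that the character range becomes part of what is sent to a server, so its interpretation no longer hinges on a fragment syntax being registered for text/html or XML media types.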
Received on Saturday, 24 August 2013 11:20:21 UTC