RE: [dpub identifiers] Please review updated Identifiers TF wiki from Bill Kasdorf on 2015-03-23 (public-digipub-ig@w3.org from March 2015)

From: Bill Kasdorf <bkasdorf@apexcovantage.com>
Date: Mon, 23 Mar 2015 14:13:13 +0000
To: Ivan Herman <ivan@w3.org>
CC: W3C Digital Publishing IG <public-digipub-ig@w3.org>
Message-ID: <CO2PR06MB57298BAF46947F9CF577847DF0D0@CO2PR06MB572.namprd06.prod.outlook.com>
Thanks for moving us yet another step forward, Ivan.

Okay if I add this to the wiki? I'd like to create a section, after the background section, with this content, slightly edited to make it a bit less e-mail-like.

May I go ahead and do that? Or do you want to do that?

--Bill

-----Original Message-----
From: Ivan Herman [mailto:ivan@w3.org] 
Sent: Monday, March 23, 2015 5:39 AM
To: Bill Kasdorf
Cc: W3C Digital Publishing IG
Subject: Re: [dpub identifiers] Please review updated Identifiers TF wiki

Bill,

I think we have to separate two categories here:

- Purely media type specific fragments (that includes xpointer, the media fragments as mentioned by Thierry, xpath, svgview, etc)
- Package level fragments like CFI and/or the Packaging Fragments (let me refer to this as PFrag for now)

In an EPUB-WEB approach we should _not_ deal with the first category at all. Those are specified by other groups, registered by IETF, etc; the DPUB community should be a user of those just as they are users of the specific media. In future, the variety of media that can be added to a portable document will just increase and will be open ended; we should be 'clients' of that evolution.

CFI and PFrag have a different concern: the question is how to find a specific document *within* a package and then, within that document, a finer way of identifying an anchor.

I think what we should specify, as a set of requirements, is what an EPUB-WEB Fragment (say EWFrag) has to fulfill. Here is a tentative list, based on CFI and PFrag:

1. EWFrag should have a clear way of identifying a document *within* the media
2. EWFrag should have a way to follow "paths" of references through several 'hops'
3. EWFrag should have a way to reuse externally defined fragment id specifications for specific media types
4. EWFrag should have clear (and simple) conceptual equivalents to URI-s with fragment ID-s when the document is directly accessed on the Web
5. EWFrag should be based, as far as possible, on technologies widely deployed on the Web (and hence in Web browsers)

For CFI:

- (1) is fulfilled (for EPUB) starting from the package file
- (2) is fulfilled through the usage of the '!' character, though the definition seems to rely on XHTML and SVG elements only, ie, is not really extensible
- (3) is not fulfilled, as far as I can see; instead, it uses its own identification down to the character level in a document
- (4) is not fulfilled, it uses its own identification
- (5) is fulfilled today but may not work tomorrow: it is deeply rooted in XML both for the package file and the target documents; if some packages are defined in other formats (eg, JSON) then this may break down; I am not even sure it would work with HTML5 (does the '/' approach, making this differentiation between elements and text children work the same way?)
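
To make the 'hops' of point (2) concrete, here is a minimal Python sketch that splits a CFI into one path per document visited, using the '!' character as the hop separator. The function name split_cfi_hops is hypothetical, the sample CFI is only illustrative, and the sketch assumes that no '!' appears inside a bracketed assertion; it does not handle CFI escaping and is not a full CFI parser.

```python
from typing import List


def split_cfi_hops(cfi: str) -> List[str]:
    """Split an epubcfi(...) expression into one path per 'hop'.

    Assumption: '!' only occurs as the indirection marker, never
    inside a [bracketed] assertion (escaping is not handled here).
    """
    prefix, suffix = "epubcfi(", ")"
    if not (cfi.startswith(prefix) and cfi.endswith(suffix)):
        raise ValueError("not an epubcfi(...) expression")
    # Strip the wrapper, then cut the path at each indirection marker.
    body = cfi[len(prefix):-len(suffix)]
    return body.split("!")


if __name__ == "__main__":
    # First path: a step sequence in the package file.
    # Second path: a step sequence in the referenced content document.
    for hop in split_cfi_hops("epubcfi(/6/4[chap01ref]!/4[body01]/10[para05]/3:10)"):
        print(hop)
```

Each returned path is meant to be resolved against a different document: the first against the package file, every later one against the document that the previous path pointed to.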

For PFrag:

- (1) is fulfilled, using the list of headers within the package
- (2) is not fulfilled, it can only go one step (from the package down to a document within the package)
- (3) is fulfilled; in fact PFrag is concerned _only_ with the identification of a document within the package and is oblivious to the rest
- (4) is sort of fulfilled (per documentation), but is a bit convoluted
- (5) is fulfilled; relies on, essentially, HTTP headers, which is part of the basics on the Web

There may be other requirements (Human readability? Ease of generation?) and some of the requirements above are not really important (eg, I am not sure about the importance of (2)). But I believe these are the kinds of requirements that we should really formulate.


Cheers

Ivan





Bill Kasdorf wrote:
> Thanks to Tzviya, we have some substantive content for review on the Identifiers TF wiki at [1].
>
> This initial draft of background information gives brief descriptions, links, discussion, and examples of three possible options for consideration as the basis for our initial work on a Fragment Identifier:
> --EPUB CFI
> --W3C Packaging for the Web Fragment Identifiers
> --The Open Annotations Fragment Selector
>
> In addition, there's a placeholder for XPath, and we need to collect suggestions for other relevant specs or technologies to take into account, e.g. XPointer.
>
> Please take a look at this before the Monday IG call and suggest any others we should add. Feel free to add a placeholder (ideally with a link) if you aren't prepared to add the prose.
>
> And although we now have a good list of participants in this TF, please add your name if you'd like to participate as well. We will discuss next steps on the call Monday, which will probably involve a TF conference call later this week if we can find a time that works for everybody.
>
> --Bill K
>
> [1] https://www.w3.org/dpub/IG/wiki/Task_Forces/identifiers#Background


----
Ivan Herman, W3C
Digital Publishing Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
ORCID ID: http://orcid.org/0000-0003-0782-2704
Received on Monday, 23 March 2015 14:13:42 UTC