RE: [Locators] Re: While it is still fresh in our minds: '!' is not just a funny fragment identifier...

I may be out of date (or missing the point completely), but I thought the
'!' character was reserved in URI Generic Syntax to serve as a sub-delimiter
within a component of a URI. The main difference between '!' and '#' is that
the former is meant to delimit sub-components within a URI component
(scheme, authority, path, query, fragment) and the latter is used to delimit
a specific component (fragment). 
 
As a reserved sub-component delimiter, the meaning of '!' can be
scheme-, component- or implementation-specific (much as the meaning of the
URL component delimited by '#' is specific to the MIME type). '=' and '&' are
examples of reserved sub-component delimiters (i.e., the same class of
reserved character as '!') that have well-known roles as delimiters in the
query component of a URI. 
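
A small illustration of the '=' and '&' case, offered as a sketch only: it
uses Python's standard urllib.parse, and the URL and its parameter names are
made up for the example.

```python
from urllib.parse import parse_qsl, urlsplit

# Hypothetical URL, used only to illustrate the query delimiters.
url = "http://example.org/search?title=A&chapter=B"

# urlsplit isolates the query component; parse_qsl then applies the
# well-known convention: '&' separates pairs, '=' separates key from value.
query = urlsplit(url).query
pairs = parse_qsl(query)

print(query)   # title=A&chapter=B
print(pairs)   # [('title', 'A'), ('chapter', 'B')]
```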
 
So in that sense http://example.org/myRoot/A!B has a path component
(/myRoot/A!B) that has been explicitly divided (according to the URL spec) into
2 sub-components: /myRoot/A and B. The meaning of these 2 sub-components and
what the server is supposed to do with them is not generically defined, but
one can reasonably expect that the fact that the path has been divided into
sub-components makes this URI different from, say,
http://example.org/myRoot/A-B where the path has not been divided into
sub-components (because '-' is an allowed but not reserved character in URL
syntax). Certainly if you wanted to refer to the font file of a PWP
resource held somewhere separate from the rest of the PWP, using the base
locator of the PWP, it could make sense to do so by appending its name as a
path sub-component, though of course this would have to be clearly spelled
out and uptake would be uncertain, and there are other approaches as well.
But starting with a reserved character does seem like a good idea. 
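
To make the reserved/unreserved distinction concrete, here is a minimal
sketch using Python's standard urllib.parse. Splitting the path on '!' is
only the hypothetical convention discussed above, not something the generic
syntax mandates, and the helper name is invented for the example.

```python
from urllib.parse import quote, urlsplit

# RFC 3986 sub-delims (reserved) and the non-alphanumeric unreserved characters.
SUB_DELIMS = set("!$&'()*+,;=")
UNRESERVED_MARKS = set("-._~")

def path_subcomponents(url):
    """Hypothetical helper: split the path of url on the '!' sub-delimiter."""
    return urlsplit(url).path.split("!")

divided = path_subcomponents("http://example.org/myRoot/A!B")
undivided = path_subcomponents("http://example.org/myRoot/A-B")
print(divided)     # ['/myRoot/A', 'B']
print(undivided)   # ['/myRoot/A-B']

# quote() leaves unreserved characters alone but percent-encodes '!' by
# default, which reflects the different status of the two characters.
print(quote("A-B"), quote("A!B"))   # A-B A%21B
```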
 
Probably all of this has been entirely implicit for the other posters to
this thread, but I just wanted to make sure.
 
-Tim Cole
 
 
From: AUDRAIN LUC [mailto:LAUDRAIN@hachette-livre.fr] 
Sent: Tuesday, December 22, 2015 2:36 AM
To: Ivan Herman <ivan@w3.org>
Cc: Shane P McCarron <shane@aptest.com>; Leonard Rosenthol
<lrosenth@adobe.com>; Romain Deltour <rdeltour@gmail.com>; Bill Kasdorf
<bkasdorf@apexcovantage.com>; Tzviya Siegman <tsiegman@wiley.com>; W3C
Digital Publishing IG <public-digipub-ig@w3.org>
Subject: Re: [Locators] Re: While it is still fresh in our minds: '!' is not
just a funny fragment identifier...
 
Looks like EPUB CFI…
Luc
 
 
From: Ivan Herman <ivan@w3.org>
Date: Tuesday, 22 December 2015 09:34
To: AUDRAIN LUC <laudrain@hachette-livre.fr>
Cc: Shane McCarron <shane@aptest.com>, Leonard Rosenthol
<lrosenth@adobe.com>, Romain Deltour <rdeltour@gmail.com>, Bill Kasdorf
<bkasdorf@apexcovantage.com>, Tzviya Siegman <tsiegman@wiley.com>, W3C
Digital Publishing IG <public-digipub-ig@w3.org>
Subject: [Locators] Re: While it is still fresh in our minds: '!' is not just
a funny fragment identifier...
 
 
On 22 Dec 2015, at 09:22, AUDRAIN LUC <LAUDRAIN@hachette-livre.fr> wrote:
 
Sorry, perhaps I am not at the same level of abstraction. 
And yes, it may certainly be a matter of a server-side trick.
 
But from a resource producer's point of view, if "http://www.example.org/A!B
and http://www.example.org/A are two completely different resources", is B a
sub-resource of A? 
 
By default there is nothing that says that as far as the HTTP protocol is
concerned.



*         If yes, « in A!B, B is a sub-resource of A », then resource
producers have to build « two completely different resources » for a common
content B,
*         If no, « in A!B, B is not a sub-resource of A », what does A!B
mean as a locator for B? Why not use http://www.example.org/B?

Good question. And to make it clear: I did not propose the usage of the '!'
character, it is just mentioned as a possible avenue. I believe it was used
in a very restricted manner (and not generally):
 
                • http://www.example.org/A is the URL yielding the PWP
manifest (or something similar)
                • http://www.example.org/A!B was to access a resource within
the PWP (but that must either be aided by the server, or the client has to
have some built-in logic to manage that URI instead of issuing a direct
HTTP GET)
 
I seem to remember that Readium uses this trick in its Service Worker
experimentation.
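
A minimal sketch of the kind of built-in logic mentioned above, assuming
(purely for illustration) that the first '!' in the path separates the URL of
the PWP from the name of a resource inside it. The function name and the
convention are hypothetical; nothing here is taken from Readium or from any
specification.

```python
from urllib.parse import urlsplit, urlunsplit

def split_pwp_locator(url):
    """Split a locator into (URL of the PWP, inner resource name or None).

    Assumed convention: the first '!' in the path separates the two parts.
    """
    parts = urlsplit(url)
    outer_path, bang, inner = parts.path.partition("!")
    # Rebuild the URL of the PWP itself from everything before the '!'.
    pwp_url = urlunsplit(parts._replace(path=outer_path))
    return pwp_url, (inner if bang else None)

print(split_pwp_locator("http://www.example.org/A"))
# ('http://www.example.org/A', None)
print(split_pwp_locator("http://www.example.org/A!B"))
# ('http://www.example.org/A', 'B')
```

A client (or a server-side handler) could fetch the first element and then
look up the second inside whatever it yields, instead of issuing a direct
GET on the full URI.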
 
Ivan
                
 



Luc
 
From: Ivan Herman <ivan@w3.org>
Date: Tuesday, 22 December 2015 09:03
To: AUDRAIN LUC <laudrain@hachette-livre.fr>
Cc: Shane McCarron <shane@aptest.com>, Leonard Rosenthol
<lrosenth@adobe.com>, Romain Deltour <rdeltour@gmail.com>, Bill Kasdorf
<bkasdorf@apexcovantage.com>, Tzviya Siegman <tsiegman@wiley.com>, W3C
Digital Publishing IG <public-digipub-ig@w3.org>
Subject: Re: While it is still fresh in our minds: '!' is not just a funny
fragment identifier...
 
 
On 22 Dec 2015, at 07:47, AUDRAIN LUC <LAUDRAIN@hachette-livre.fr> wrote:
 
Snippet: if I request http://www.example.org/A!B then the server is supposed
to deliver http://www.example.org/A!B to the client

This means that A!B as a sub-resource can be served by the server. Depending
on the kind of resource, it may not « naturally » exist.
 
If it’s a specific position in an audio or video file, it may be fine in
streaming, but as a position in text, can the server send this specific
portion of text without sending the beginning of the HTML file?
 
I am not sure I 100% understand the question.
 
By default, http://www.example.org/A!B and http://www.example.org/A are two
completely different resources, just as http://www.example.org/A is
completely different from http://www.example.org/C. Of course, the server can
implement some tricks whereby the '!' character is interpreted in a
particular way, but that is really a matter of server
setup/programming/whatever. The '!' character is nothing special, afaik.
 
But I am not sure I answered your question…
 
Ivan
 
 



 
 
From: Shane McCarron <shane@aptest.com>
Date: Tuesday, 22 December 2015 03:10
To: Leonard Rosenthol <lrosenth@adobe.com>
Cc: Romain Deltour <rdeltour@gmail.com>, Ivan Herman <ivan@w3.org>, Bill
Kasdorf <bkasdorf@apexcovantage.com>, Tzviya Siegman <tsiegman@wiley.com>,
W3C Digital Publishing IG <public-digipub-ig@w3.org>
Subject: Re: While it is still fresh in our minds: '!' is not just a funny
fragment identifier...
Resent-From: <public-digipub-ig@w3.org>
Resent-Date: Tuesday, 22 December 2015 03:11
 
I am personally wary of any use of '#' in a URL, even if it is in a
different scheme.  While it would be perfectly legitimate to define and
register a new scheme that has different semantics for '#', it would be
potentially confusing for developers.  I am sure there is some other
separator you could use if you really want to identify a sub-resource.
Heck, you could even make it part of a query string.
 
On Mon, Dec 21, 2015 at 6:09 PM, Leonard Rosenthol <lrosenth@adobe.com> wrote:


I would also add that it would be extremely valuable that any such fragment
idents for PWP be format agnostic, since we are already seeing that EPUB is
but a single profile of PWP and that there may be others – and these idents
need to work for all.
 
Leonard
 
From: Romain Deltour [mailto:rdeltour@gmail.com] 
Sent: Monday, December 21, 2015 1:17 PM
To: Ivan Herman <ivan@w3.org>
Cc: Bill Kasdorf <bkasdorf@apexcovantage.com>; Tzviya Siegman
<tsiegman@wiley.com>; W3C Digital Publishing IG <public-digipub-ig@w3.org>
Subject: Re: While it is still fresh in our minds: '!' is not just a funny
fragment identifier...
 
 
This is a major difference that we should not forget about.
 
Absolutely, right.
 
I was more thinking in terms of spec work:  we should not try to (re)invent
the wheel and touch fragment IDs where they're already well-defined (like
HTML), but on the other hand, for new media types (for instance a JSON PWP
manifest?) we have new grounds to explore and it may be relevant to consider
a fragment identifier-based approach (which is, as you correctly point
out, technically different from a custom-URL-separator-based approach).
 
Romain.
 
On 21 Dec 2015, at 18:21, Ivan Herman <ivan@w3.org> wrote:
 
This came up today, I think maybe Romain mentioned it: that the '!' approach
for content URL looks very much like a fragment ID, so why do we make a
differentiation? (But I may have misunderstood the remark, in which case my
apologies!)
 
There is one aspect that we should not forget about where '!' and '#' are
very different. Per HTTP the fragment identifier is resolved, and acted
upon, on the client side. Ie, the approach is that if I request
 
http://www.example.org/A#B
 
then the GET request will deliver the http://www.example.org/A as a whole to
the client, which will then select, in a second step, B out of A. 
 
However, a '!' is a bona fide part of a URI. Ie, if I request
 
http://www.example.org/A!B
 
then the server is supposed to deliver http://www.example.org/A!B to the
client, not http://www.example.org/A (whatever that is). 
 
This is a major difference that we should not forget about.
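
The difference can be seen with a few lines of Python, given here only as a
sketch: urldefrag from the standard urllib.parse separates the fragment that
the client keeps for itself from the URL that goes into the request. The
function name request_target is made up for the illustration.

```python
from urllib.parse import urldefrag

def request_target(url):
    """Return the URL a client would send in a GET: the fragment is dropped."""
    return urldefrag(url).url

# '#B' stays on the client; only A is requested.
print(request_target("http://www.example.org/A#B"))  # http://www.example.org/A
print(urldefrag("http://www.example.org/A#B").fragment)  # B

# '!B' is part of the path, so the whole thing is requested.
print(request_target("http://www.example.org/A!B"))  # http://www.example.org/A!B
```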
 
Happy holidays and lots of rest to all of you/us!
 
Ivan
 
 

----
Ivan Herman, W3C 
Digital Publishing Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
ORCID ID: http://orcid.org/0000-0003-0782-2704


 
 



 
-- 
Shane McCarron
Managing Director, Applied Testing and Technology, Inc.
 




 
 




 

Received on Tuesday, 22 December 2015 17:01:51 UTC