Re: [Locators] Re: While it is still fresh in our minds: '!' is not just a funny fragment identifier...

> On 22 Dec 2015, at 18:00, Timothy Cole <t-cole3@illinois.edu> wrote:
> 
> I may be out of date (or missing the point completely), but I thought the '!' character was reserved in URI Generic Syntax to serve as a sub-delimiter within a component of a URI. The main difference between '!' and '#' is that the former is meant to delimit sub-components within a URI component (scheme, authority, path, query, fragment) and the latter is used to delimit a specific component (fragment).
> 
> As a reserved character sub-component delimiter, the meaning of '!' can be scheme, component or implementation-specific (much as the meaning of URL components delimited by '#' is specific to the MIME type). '=' and '&' are examples of reserved sub-component delimiters (i.e., the same class of reserved character as '!') that have well-known roles as delimiters in a query component of a URI.
> 
> So in that sense http://example.org/myRoot/A!B has a path component (/myRoot/A!B) that has been explicitly divided (according to the URL spec) into 2 sub-components: /myRoot/A and B. The meaning of these 2 sub-components and what the server is supposed to do with them is not generically defined, but one can reasonably expect that the fact that the path has been divided into sub-components makes this URI different from, say, http://example.org/myRoot/A-B, where the path has not been divided into sub-components (because '-' is an allowed but not reserved character in URL syntax). Certainly if you wanted to refer to the font file of a PWP resource held somewhere separate from the rest of the PWP, using the base locator of the PWP, it could make sense to do so by appending its name as a path sub-component, though of course this would have to be clearly spelled out and uptake would be uncertain, and there are other approaches as well. But starting with a reserved character does seem like a good idea.

You are right, I was not entirely correct (and Tzviya's reference to the URI syntax is a good starting point!). But the important point is what you do say:

	"The meaning of these 2 sub-components and what the server is supposed to do with them is not generically defined"

which means that it relies on extra knowledge that the server and/or the client has to establish. By default what I said does hold, I believe, namely that the server is supposed to deliver the whole resource. In this sense, its usage is shaky. That is why I think we should try to look for a solution that demands the least new features (compared to what clients and servers routinely do already).
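
A minimal sketch of this point in Python, using only the standard library's urllib.parse. The helper names request_target and split_on_bang are hypothetical, and splitting the path on '!' is just one convention that a server and client might agree on; nothing in HTTP or the URI syntax mandates it.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit


def request_target(url: str) -> str:
    """Return the path (plus query, if any) that a client would put in a GET.

    The fragment is deliberately left out: it stays on the client side.
    """
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target


def split_on_bang(url: str) -> Tuple[str, Optional[str]]:
    """Hypothetical convention: treat the first '!' in the path as a separator.

    Returns (container_path, inner_name); inner_name is None without a '!'.
    """
    path = urlsplit(url).path
    container, sep, inner = path.partition("!")
    return container, (inner if sep else None)


# '#B' never reaches the server; '!B' is part of the path and does.
print(request_target("http://www.example.org/A#B"))  # /A
print(request_target("http://www.example.org/A!B"))  # /A!B

# Only code that knows the convention can recover A and B from A!B.
print(split_on_bang("http://www.example.org/A!B"))   # ('/A', 'B')
```

Without something like split_on_bang on one side or the other, /A!B is simply an opaque path, which is why the approach depends on extra agreement between client and server.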

Ivan

> 
> Probably all of this has been entirely implicit for the other posters to this thread, but I just wanted to make sure.
> 
> -Tim Cole
> 
> 
> From: AUDRAIN LUC [mailto:LAUDRAIN@hachette-livre.fr]
> Sent: Tuesday, December 22, 2015 2:36 AM
> To: Ivan Herman <ivan@w3.org>
> Cc: Shane P McCarron <shane@aptest.com>; Leonard Rosenthol <lrosenth@adobe.com>; Romain Deltour <rdeltour@gmail.com>; Bill Kasdorf <bkasdorf@apexcovantage.com>; Tzviya Siegman <tsiegman@wiley.com>; W3C Digital Publishing IG <public-digipub-ig@w3.org>
> Subject: Re: [Locators] Re: While it is still fresh in our minds: '!' is not just a funny fragment identifier...
> 
> Looks like EPUB CFI…
> Luc
> 
> 
> From: Ivan Herman <ivan@w3.org>
> Date: Tuesday, 22 December 2015 09:34
> To: AUDRAIN LUC <laudrain@hachette-livre.fr>
> Cc: Shane McCarron <shane@aptest.com>, Leonard Rosenthol <lrosenth@adobe.com>, Romain Deltour <rdeltour@gmail.com>, Bill Kasdorf <bkasdorf@apexcovantage.com>, Tzviya Siegman <tsiegman@wiley.com>, W3C Digital Publishing IG <public-digipub-ig@w3.org>
> Subject: [Locators] Re: While it is still fresh in our minds: '!' is not just a funny fragment identifier...
> 
> 
>> On 22 Dec 2015, at 09:22, AUDRAIN LUC <LAUDRAIN@hachette-livre.fr> wrote:
>> 
>> Sorry, perhaps I am not at the same level of abstraction.
>> And yes, it may certainly be a matter of a server trick.
>> 
>> But from a resource producer's point of view, if "http://www.example.org/A!B and http://www.example.org/A are two completely different resources", is B a sub-resource of A?
> 
> By default there is nothing that says that as far as the HTTP protocol is concerned.
> 
> 
>> · If yes, « in A!B, B is a sub-resource of A », then resource producers have to build « two completely different resources » for a common content B,
>> · If no, « in A!B, B is not a sub-resource of A », what does A!B mean as a locator for B? Why not use http://www.example.org/B?
> 
> Good question. And to make it clear: I did not propose the usage of the '!' character, it is just mentioned as a possible avenue. I believe it was used in a very restricted manner (and not generally):
> 
>     • http://www.example.org/A is the URL yielding the PWP manifest (or something similar)
>     • http://www.example.org/A!B was to access a resource within the PWP (but that must either be aided by the server, or the client has to have some built-in logic to manage that URI instead of issuing a direct HTTP GET)
> 
> I seem to remember that Readium uses this trick in its Service Worker experimentation.
> 
> Ivan
> 
> 
> 
> 
>> Luc
>> 
>> From: Ivan Herman <ivan@w3.org>
>> Date: Tuesday, 22 December 2015 09:03
>> To: AUDRAIN LUC <laudrain@hachette-livre.fr>
>> Cc: Shane McCarron <shane@aptest.com>, Leonard Rosenthol <lrosenth@adobe.com>, Romain Deltour <rdeltour@gmail.com>, Bill Kasdorf <bkasdorf@apexcovantage.com>, Tzviya Siegman <tsiegman@wiley.com>, W3C Digital Publishing IG <public-digipub-ig@w3.org>
>> Subject: Re: While it is still fresh in our minds: '!' is not just a funny fragment identifier...
>> 
>> 
>>> On 22 Dec 2015, at 07:47, AUDRAIN LUC <LAUDRAIN@hachette-livre.fr> wrote:
>>> 
>>> Snippet: "if I request http://www.example.org/A!B then the server is supposed to deliver http://www.example.org/A!B to the client"
>>> This means that A!B, as a sub-resource, can be served by the server. Depending on the kind of resource, it may not « naturally » exist.
>>> 
>>> If it’s a specific position in an audio or video file, it may be fine with streaming, but for a position in text, can the server send this specific portion of text without sending the beginning of the HTML file?
>> 
>> I am not sure I 100% understand the question.
>> 
>> By default, http://www.example.org/A!B and http://www.example.org/A are two completely different resources, just as http://www.example.org/A is completely different from http://www.example.org/C. Of course, the server can implement some tricks whereby the '!' character is interpreted in a particular way, but that is really a matter of server setup/programming/whatever. The '!' character is nothing special, afaik.
>> 
>> But I am not sure I answered your question…
>> 
>> Ivan
>> 
>> 
>> 
>> 
>>> 
>>> 
>>> From: Shane McCarron <shane@aptest.com>
>>> Date: Tuesday, 22 December 2015 03:10
>>> To: Leonard Rosenthol <lrosenth@adobe.com>
>>> Cc: Romain Deltour <rdeltour@gmail.com>, Ivan Herman <ivan@w3.org>, Bill Kasdorf <bkasdorf@apexcovantage.com>, Tzviya Siegman <tsiegman@wiley.com>, W3C Digital Publishing IG <public-digipub-ig@w3.org>
>>> Subject: Re: While it is still fresh in our minds: '!' is not just a funny fragment identifier...
>>> Resent-From: <public-digipub-ig@w3.org>
>>> Resent-Date: Tuesday, 22 December 2015 03:11
>>> 
>>> I am personally wary of any use of '#' in a URL, even if it is in a different scheme.  While it would be perfectly legitimate to define and register a new scheme that has different semantics for '#', it would be potentially confusing for developers.  I am sure there is some other separator you could use if you really want to identify a sub-resource.  Heck, you could even make it part of a query string.
>>> 
>>> On Mon, Dec 21, 2015 at 6:09 PM, Leonard Rosenthol <lrosenth@adobe.com> wrote:
>>> 
>>>> I would also add that it would be extremely valuable that any such fragment idents for PWP be format agnostic, since we are already seeing that EPUB is but a single profile of PWP and that there may be others – and these idents need to work for all.
>>>> 
>>>> Leonard
>>>> 
>>>> From: Romain Deltour [mailto:rdeltour@gmail.com]
>>>> Sent: Monday, December 21, 2015 1:17 PM
>>>> To: Ivan Herman <ivan@w3.org>
>>>> Cc: Bill Kasdorf <bkasdorf@apexcovantage.com>; Tzviya Siegman <tsiegman@wiley.com>; W3C Digital Publishing IG <public-digipub-ig@w3.org>
>>>> Subject: Re: While it is still fresh in our minds: '!' is not just a funny fragment identifier...
>>>> 
>>>> 
>>>>> This is a major difference that we should not forget about.
>>>> 
>>>> Absolutely, right.
>>>> 
>>>> I was thinking more in terms of spec work: we should not try to (re)invent the wheel and touch fragment IDs where they're already well-defined (like HTML), but on the other hand, for new media types (for instance a JSON PWP manifest?) we have new ground to explore and it may be relevant to consider a fragment identifier-based approach (which is, as you correctly point out, technically different from a custom-URL-separator-based approach).
>>>> 
>>>> Romain.
>>>> 
>>>>> On 21 Dec 2015, at 18:21, Ivan Herman <ivan@w3.org> wrote:
>>>>> 
>>>>> This came up today, I think maybe Romain mentioned it: that the '!' approach for content URL looks very much like a fragment ID, so why do we make a differentiation? (But I may have misunderstood the remark, in which case my apologies!)
>>>>> 
>>>>> There is one aspect that we should not forget about where '!' and '#' are very different. Per HTTP the fragment identifier is resolved, and acted upon, on the client side. I.e., the approach is that if I request
>>>>> 
>>>>> http://www.example.org/A#B
>>>>> 
>>>>> then the GET request will deliver http://www.example.org/A as a whole to the client, which will then select, in a second step, B out of A.
>>>>> 
>>>>> However, a '!' is a bona fide part of a URI. I.e., if I request
>>>>> 
>>>>> http://www.example.org/A!B
>>>>> 
>>>>> then the server is supposed to deliver http://www.example.org/A!B to the client, not http://www.example.org/A (whatever that is).
>>>>> 
>>>>> This is a major difference that we should not forget about.
>>>>> 
>>>>> Happy holidays and lots of rest to all of you/us!
>>>>> 
>>>>> Ivan
>>>>> 
>>>>> 
>>>>> 
>>>>> ----
>>>>> Ivan Herman, W3C
>>>>> Digital Publishing Lead
>>>>> Home: http://www.w3.org/People/Ivan/
>>>>> mobile: +31-641044153
>>>>> ORCID ID: http://orcid.org/0000-0003-0782-2704
>>>>> 
>>>>> 
>>>> 
>>>> 
>>> 
>>> 
>>> 
>>> 
>>> --
>>> Shane McCarron
>>> Managing Director, Applied Testing and Technology, Inc.
>> 
>> 
>> 
>> 
>> 
>> 
> 
> 
> 
> 
> 
> 


----
Ivan Herman, W3C
Digital Publishing Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
ORCID ID: http://orcid.org/0000-0003-0782-2704

Received on Tuesday, 22 December 2015 17:32:49 UTC