
Re: [Locators] Re: While it is still fresh in our minds: '!' is not just a funny fragment identifier...

From: Nick Ruffilo <nickruffilo@gmail.com>
Date: Tue, 22 Dec 2015 12:13:32 -0500
Message-ID: <CA+Dds5-UW-k84Rwns1kpb8KmDQwcV-eL0+AgH7Yh9Mp=mdHc1A@mail.gmail.com>
To: Tim Cole <t-cole3@illinois.edu>
Cc: AUDRAIN LUC <LAUDRAIN@hachette-livre.fr>, Ivan Herman <ivan@w3.org>, Shane P McCarron <shane@aptest.com>, Leonard Rosenthol <lrosenth@adobe.com>, Romain Deltour <rdeltour@gmail.com>, Bill Kasdorf <bkasdorf@apexcovantage.com>, Tzviya Siegman <tsiegman@wiley.com>, W3C Digital Publishing IG <public-digipub-ig@w3.org>
I've briefly read through this thread and believe I am more confused than
when I started.  I've never actually seen ! used in a URL or URI, so I'm
having some difficulty understanding its current use.  It may be that this
has been answered previously, and I'll admit to being lazy and not wanting
to read through everything again, but hopefully I can ask concise questions
that are easy to answer:

1) Is there a real-world example of ! in use?
2) Is ! a new proposal or an existing spec we wish to make use of?
3) Is ! just for URIs or for URLs as well?
4) Can a URL/URI make use of multiple special characters?  For example,
is //folder/subfolder/item.html?param1=val1#section!somethingelse allowed?
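
For question 4, here is a minimal sketch of how a generic parser splits such a
URL, using Python's standard urllib.parse. The scheme and host
(http://example.org) are assumptions added to make the example an absolute
URL; as written in the question, the leading // would make "folder" the
authority rather than a path segment.

```python
from urllib.parse import urlsplit

# Question 4's example, with an assumed scheme and host in front of it.
url = "http://example.org/folder/subfolder/item.html?param1=val1#section!somethingelse"
parts = urlsplit(url)

print(parts.path)      # /folder/subfolder/item.html
print(parts.query)     # param1=val1
print(parts.fragment)  # section!somethingelse

# The form exactly as written in the question: "//" introduces an authority.
relative = urlsplit("//folder/subfolder/item.html?param1=val1#section!somethingelse")
print(relative.netloc)  # folder
print(relative.path)    # /subfolder/item.html
```

In this form the '!' simply ends up inside the fragment, because everything
after the first '#' belongs to the fragment.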

Again - sorry if this asks for information to be repeated.  I can only
assume that if I'm a bit lost at least 1 other person is.

-Nick

On Tue, Dec 22, 2015 at 12:00 PM, Timothy Cole <t-cole3@illinois.edu> wrote:

> I may be out of date (or missing the point completely), but I thought the
> '!' character was reserved in URI Generic Syntax to serve as a
> sub-delimiter within a component of a URI. The main difference between '!'
> and '#' is that the former is meant to delimit sub-components within a URI
> component (scheme, authority, path, query, fragment) and the latter is used
> to delimit a specific component (fragment).
>
>
>
> As a reserved character sub-component delimiter, the meaning of '!' can be
> scheme, component or implementation-specific (much as the meaning of URL
> components delimited by '#' is specified by MIME type). '=' and '&' are
> examples of reserved sub-component delimiters (i.e., the same class of
> reserved character as '!') that have well-known roles as delimiters in a
> query component of a URI.
>
>
>
> So in that sense http://example.org/myRoot/A!B has a path component (/myRoot/A!B)
> that has been explicitly divided (according to the URL spec) into 2
> sub-components: /myRoot/A and B. The meaning of these 2 sub-components
> and what the server is supposed to do with them is not generically defined,
> but one can reasonably expect that the fact that the path has been divided
> into sub-components makes this URI different from, say,
> http://example.org/myRoot/A-B, where the path has not been divided into
> sub-components (because '-' is an allowed but not reserved character in URL
> syntax). Certainly if you wanted to refer to the font file of a PWP
> resource held somewhere separate from the rest of the PWP, using the base
> locator of the PWP, it could make sense to do so by appending its name as a
> path sub-component, though of course this would have to be clearly spelled
> out and uptake would be uncertain, and there are other approaches as well.
> But starting with a reserved character does seem like a good idea.
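
To make the A!B versus A-B contrast concrete, here is a minimal sketch in
Python. The sub-delims set is the one listed in RFC 3986, section 2.2. The
helper split_package_path and the "container!member" reading of the path are
hypothetical conventions assumed for illustration only; the generic syntax
reserves '!' but assigns it no meaning.

```python
from urllib.parse import urlsplit

# sub-delims as listed in RFC 3986, section 2.2
SUB_DELIMS = set("!$&'()*+,;=")


def split_package_path(url):
    """Hypothetical helper: split the path at the first '!'.

    Returns (container, member); member is None when the path has no '!'.
    """
    path = urlsplit(url).path
    container, sep, member = path.partition("!")
    return container, (member if sep else None)


print("!" in SUB_DELIMS)  # True: reserved sub-delimiter
print("-" in SUB_DELIMS)  # False: '-' is an unreserved character

print(split_package_path("http://example.org/myRoot/A!B"))  # ('/myRoot/A', 'B')
print(split_package_path("http://example.org/myRoot/A-B"))  # ('/myRoot/A-B', None)
```

Whether a server or client actually applies such a split is a separate
agreement that would have to be specified.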
>
>
>
> Probably all of this has been entirely implicit for the other posters to
> this thread, but I just wanted to make sure.
>
>
>
> -Tim Cole
>
>
>
>
>
> *From:* AUDRAIN LUC [mailto:LAUDRAIN@hachette-livre.fr]
> *Sent:* Tuesday, December 22, 2015 2:36 AM
> *To:* Ivan Herman <ivan@w3.org>
> *Cc:* Shane P McCarron <shane@aptest.com>; Leonard Rosenthol <
> lrosenth@adobe.com>; Romain Deltour <rdeltour@gmail.com>; Bill Kasdorf <
> bkasdorf@apexcovantage.com>; Tzviya Siegman <tsiegman@wiley.com>; W3C
> Digital Publishing IG <public-digipub-ig@w3.org>
> *Subject:* Re: [Locators] Re: While it is still fresh in our minds: '!'
> is not just a funny fragment identifier...
>
>
>
> Looks like EPUB CFI…
>
> Luc
>
>
>
>
>
> *From: *Ivan Herman <ivan@w3.org>
> *Date: *Tuesday, 22 December 2015 09:34
> *To: *AUDRAIN LUC <laudrain@hachette-livre.fr>
> *Cc: *Shane McCarron <shane@aptest.com>, Leonard Rosenthol <
> lrosenth@adobe.com>, Romain Deltour <rdeltour@gmail.com>, Bill Kasdorf <
> bkasdorf@apexcovantage.com>, Tzviya Siegman <tsiegman@wiley.com>, W3C
> Digital Publishing IG <public-digipub-ig@w3.org>
> *Subject: *[Locators] Re: While it is still fresh in our minds: '!' is not
> just a funny fragment identifier...
>
>
>
>
>
> On 22 Dec 2015, at 09:22, AUDRAIN LUC <LAUDRAIN@hachette-livre.fr> wrote:
>
>
>
> Sorry, perhaps I am not at the same level of abstraction.
>
> And yes, it may certainly be a question of a server trick.
>
>
>
> But from a resource producer's point of view, if "http://www.example.org/A!B
> and http://www.example.org/A are
> two completely different resources", is B a sub-resource of A?
>
>
>
> By default there is nothing that says that as far as the HTTP protocol is
> concerned.
>
>
>
> ·         If yes, "in A!B, B is a sub-resource of A", then resource
> producers have to build "two completely different resources" for a common
> content B,
>
> ·         If no, "in A!B, B is not a sub-resource of A", what does A!B
> mean as a locator for B? Why not use http://www.example.org/B?
>
> Good question. And to make it clear: I did *not* propose the usage of the
> '!' character, it is just mentioned as a possible avenue. I believe it was
> used in a very restricted manner (and not generally):
>
>
>
>                 • http://www.example.org/A is
> the URL yielding the PWP manifest (or something similar)
>
>                 • http://www.example.org/A!B was
> to access a resource within the PWP (but that must either be aided by the
> server, or the client has to have some built in logic to manage that URI
> instead of issuing a direct HTTP GET)
>
>
>
> I seem to remember that Readium uses this trick in its Service Worker
> experimentation.
>
>
>
> Ivan
>
>
>
>
>
>
>
> Luc
>
>
>
> *From: *Ivan Herman <ivan@w3.org>
> *Date: *Tuesday, 22 December 2015 09:03
> *To: *AUDRAIN LUC <laudrain@hachette-livre.fr>
> *Cc: *Shane McCarron <shane@aptest.com>, Leonard Rosenthol <
> lrosenth@adobe.com>, Romain Deltour <rdeltour@gmail.com>, Bill Kasdorf <
> bkasdorf@apexcovantage.com>, Tzviya Siegman <tsiegman@wiley.com>, W3C
> Digital Publishing IG <public-digipub-ig@w3.org>
> *Subject: *Re: While it is still fresh in our minds: '!' is not just a
> funny fragment identifier...
>
>
>
>
>
> On 22 Dec 2015, at 07:47, AUDRAIN LUC <LAUDRAIN@hachette-livre.fr> wrote:
>
>
>
> Snippet: if I request http://www.example.org/A!B then
> the server is supposed to deliver http://www.example.org/A!B to
> the client
>
> This means that A!B as a sub-resource can be served by the server.
> Depending on the kind of resource, it may not "naturally" exist.
>
>
>
> If it's a specific position in an audio or video file, it may be fine in
> streaming, but as a position in text, can the server send this specific
> portion of text without sending the beginning of the HTML file?
>
>
>
> I am not sure I 100% understand the question.
>
>
>
> By default, http://www.example.org/A!B
> and http://www.example.org/A are
> two completely different resources, just as http://www.example.org/A is
> completely different from http://www.example.org/C.
> Of course, the server can implement some tricks whereby the '!' character
> is interpreted in a particular way, but that is really a matter of server
> setup/programming/whatever. The '!' character is nothing special, afaik.
>
>
>
> But I am not sure I answered your question…
>
>
>
> Ivan
>
>
>
>
>
>
>
>
>
>
>
> *From: *Shane McCarron <shane@aptest.com>
> *Date: *Tuesday, 22 December 2015 03:10
> *To: *Leonard Rosenthol <lrosenth@adobe.com>
> *Cc: *Romain Deltour <rdeltour@gmail.com>, Ivan Herman <ivan@w3.org>,
> Bill Kasdorf <bkasdorf@apexcovantage.com>, Tzviya Siegman <
> tsiegman@wiley.com>, W3C Digital Publishing IG <public-digipub-ig@w3.org>
> *Subject: *Re: While it is still fresh in our minds: '!' is not just a
> funny fragment identifier...
> *Resent-From: *<public-digipub-ig@w3.org>
> *Resent-Date: *Tuesday, 22 December 2015 03:11
>
>
>
> I am personally wary of any use of '#' in a URL, even if it is in a
> different scheme.  While it would be perfectly legitimate to define and
> register a new scheme that has different semantics for '#', it would be
> potentially confusing for developers.  I am sure there is some other
> separator you could use if you really want to identify a sub-resource.
> Heck, you could even make it part of a query string.
>
>
>
> On Mon, Dec 21, 2015 at 6:09 PM, Leonard Rosenthol <lrosenth@adobe.com>
> wrote:
>
> I would also add that it would be extremely valuable that any such
> fragment idents for PWP be format agnostic, since we are already seeing
> that EPUB is but a single profile of PWP and that there may be others – and
> these idents need to work for all.
>
>
>
> Leonard
>
>
>
> *From:* Romain Deltour [mailto:rdeltour@gmail.com]
> *Sent:* Monday, December 21, 2015 1:17 PM
> *To:* Ivan Herman <ivan@w3.org>
> *Cc:* Bill Kasdorf <bkasdorf@apexcovantage.com>; Tzviya Siegman <
> tsiegman@wiley.com>; W3C Digital Publishing IG <public-digipub-ig@w3.org>
> *Subject:* Re: While it is still fresh in our minds: '!' is not just a
> funny fragment identifier...
>
>
>
>
>
> This is a major difference that we should not forget about.
>
>
>
> Absolutely, right.
>
>
>
> I was more thinking in terms of spec work:  we should not try to
> (re)invent the wheel and touch fragment IDs where they're already
> well-defined (like HTML), but on the other hand, for new media types (for
> instance a JSON PWP manifest?) we have new grounds to explore and it may be
> relevant to consider a fragment identifier-based approach (which is, as
> you correctly point out, technically different from a
> custom-URL-separator-based approach).
>
>
>
> Romain.
>
>
>
> On 21 Dec 2015, at 18:21, Ivan Herman <ivan@w3.org> wrote:
>
>
>
> This came up today, I think maybe Romain mentioned it: that the '!'
> approach for content URL looks very much like a fragment ID, so why do we
> make a differentiation? (But I may have misunderstood the remark, in which
> case my apologies!)
>
>
>
> There is one aspect that we should not forget about where '!' and '#' are
> very different. Per HTTP the fragment identifier is resolved, and acted
> upon, *on the client side*. Ie, the approach is that if I request
>
>
>
> http://www.example.org/A#B
>
>
>
> then the GET request will deliver the http://www.example.org/A
>  *as a whole* to the client, which will then select, in a second step, B *out
> of* A.
>
>
>
> However, a '!' is a bona fide part of a URI. Ie, if I request
>
>
>
> http://www.example.org/A!B
>
>
>
> then the server is supposed to deliver http://www.example.org/A!B to
> the client, *not* http://www.example.org/A (whatever
> that is).
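
The client-side versus server-side difference can be sketched in a few lines
of Python. The function request_target is a hypothetical helper that mimics
what a typical HTTP client puts on the request line: the path and query are
sent, while the fragment is kept back for local processing. It is an
illustration under that assumption, not a full HTTP client.

```python
from urllib.parse import urlsplit


def request_target(url):
    """Hypothetical helper: the part of the URL a client sends in a GET.

    The fragment never leaves the client, so it is dropped here.
    """
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target


print(request_target("http://www.example.org/A#B"))  # /A
print(request_target("http://www.example.org/A!B"))  # /A!B
```

With '#', the server only ever sees /A; with '!', the server sees /A!B and has
to decide what that path means.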
>
>
>
> This is a major difference that we should not forget about.
>
>
>
> Happy holidays and lots of rest to all of you/us!
>
>
>
> Ivan
>
>
>
>
>
>
> ----
> Ivan Herman, W3C
> Digital Publishing Lead
> Home: http://www.w3.org/People/Ivan/
> mobile: +31-641044153
>
> ORCID ID: http://orcid.org/0000-0003-0782-2704
>
>
>
>
>
>
>
>
>
> --
>
> Shane McCarron
>
> Managing Director, Applied Testing and Technology, Inc.
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>



-- 
- Nick Ruffilo
@NickRuffilo
http://Aerbook.com
http://twitch.tv/TheWizardLlewyn
http://ZenOfTechnology.com
Received on Tuesday, 22 December 2015 17:14:04 UTC
