Re: Maybe it's time to change CFI syntax to support CSS Selector? from Zheng Xu on 2022-02-17 (public-publishingcg@w3.org from February 2022)

From: Zheng Xu <zxu@gardenia-corp.com>
Date: Thu, 17 Feb 2022 08:56:45 -0500
To: Ivan Herman <ivan@w3.org>
Cc: Lars Wallin <lars@colibrio.com>, W3C Publishing Community Group <public-publishingcg@w3.org>
Message-ID: <CAAq=bxf-4M1od1ExRFsJqSDSY_v1W1BSaDuAm_enL+thTUTxfg@mail.gmail.com>
Thanks. Yeah, that cleared up a lot for me.

In terms of technical layering, it seems the Selector sits in a lower layer,
and the Annotation sits on top of it with the heavier parts such as notes and
attachments.
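
To make that layering concrete, here is a rough sketch (not taken from either
spec verbatim) of how a selector nests inside an annotation in the Web
Annotation data model. The type names CssSelector, TextPositionSelector and
TextualBody and the refinedBy property come from the model; the URLs, the id
and the note text are made-up placeholders.

```python
import json

# Lower layer: a selector that only says *where* in the resource.
# A CssSelector picks an element; refinedBy narrows it to a character range.
selector = {
    "type": "CssSelector",
    "value": "#home-header",
    "refinedBy": {"type": "TextPositionSelector", "start": 8, "end": 15},
}

# Upper layer: the annotation wraps the selector and adds the "beefy" parts
# (here a textual note). The id and source URLs are hypothetical.
annotation = {
    "@context": "http://www.w3.org/ns/anno.jsonld",
    "id": "http://example.org/anno1",
    "type": "Annotation",
    "body": {"type": "TextualBody", "value": "A note attached to the highlight"},
    "target": {"source": "http://example.org/page1", "selector": selector},
}

print(json.dumps(annotation, indent=2))
```

The selector dictionary can be reused on its own (for example, serialized into
a URL) without the annotation wrapper, which is the point of the split.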

Cheers,
Zheng


On Thu, Feb 17, 2022 at 8:52 AM Ivan Herman <ivan@w3.org> wrote:

>
>
> On 17 Feb 2022, at 13:35, Xu Zheng (gardenia) <zxu@gardenia-corp.com>
> wrote:
>
> Yeah, it is already a web browser standard for identifying an HTML element.
>
> However, I have not found that the rest (locating text) is well defined and
> implemented yet.
>
> What is the current status of https://www.w3.org/TR/selectors-states/? It
> has a Selectors section which looks similar to the annotation model.
>
>
> That is intentional. Here is the history of that document:
>
> The whole selector model originated from earlier work on the annotation
> model which was taken to the Working Group and then standardized. The
> nature of the document is such that the selector model is closely
> intertwined with the rest of the annotation model. I.e., it is part of a
> document that also has other concepts (e.g., the abstract data structure
> for an annotation) which have nothing to do with what we are discussing.
>
> At some point some members of the WG realized that, in fact, the selector
> model may have a wider usage than "just" annotation. But it was also
> realized that, for potential users, it might be very difficult to "extract"
> the relevant concept out of the annotation model standard. Hence came the
> idea to do this "extraction" once and for all, i.e., produce a document
> whose only purpose is to expose the selector states' model, regardless of
> its root in annotation. That is the selector-states document's purpose and
> 90% of the concepts therein. The only thing that was added
> (non-normatively) is the translation of the model concepts into URIs.
>
> I hope this clarifies this.
>
> Cheers
>
> Ivan
>
>
>
>
> Back to how to let a web browser select some text through a URI. To select
> some text in a webpage, I found that we need the node index inside an HTML
> element and the char offset inside that node.
>
> CFI has the closest definition. The annotation model (
> https://www.w3.org/TR/annotation-model/) is very close, but it still
> needs some modification to get a mixture of CSS Selector and Text Position
> Selector to work together. I can also see that the annotation model has the
> added benefit of defining how to serialize an annotation (which is wider
> than a selector).
>
>
> https://www.w3.org/TR/selectors-states/#serializing-iri-to-url is
> interesting as well. I don't think a selector like this is supported by web
> browsers yet, though maybe I am wrong. So it has a similar difficulty to CFI
> when it comes to adoption by web browsers.
>
> http://example.org/page1
>     #selector(type=TextQuoteSelector,exact=annotation,
>        prefix=this%20is%20an%20,suffix=%20that%20has%20some)
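
Since browsers do not resolve such a fragment natively, a reading system would
have to do it itself. The following is a minimal sketch, assuming a flat
selector(...) fragment with no nested refinedBy part; the function names
parse_selector_fragment and find_text_quote are made up for this example and
are not defined by any spec.

```python
from urllib.parse import unquote


def parse_selector_fragment(url: str) -> dict:
    """Turn a flat '#selector(key=value,...)' fragment into a dict."""
    _, _, fragment = url.partition("#")
    # The example is wrapped over several lines; drop that layout whitespace.
    # Real spaces are percent-encoded (%20), so this does not lose data.
    fragment = "".join(fragment.split())
    if not (fragment.startswith("selector(") and fragment.endswith(")")):
        raise ValueError("not a selector(...) fragment")
    inner = fragment[len("selector("):-1]
    pairs = (item.split("=", 1) for item in inner.split(","))
    return {key: unquote(value) for key, value in pairs}


def find_text_quote(text: str, selector: dict):
    """Return (start, end) of the 'exact' text, using prefix/suffix as context."""
    prefix = selector.get("prefix", "")
    suffix = selector.get("suffix", "")
    hit = text.find(prefix + selector["exact"] + suffix)
    if hit < 0:
        return None
    start = hit + len(prefix)
    return start, start + len(selector["exact"])


URL = """http://example.org/page1
    #selector(type=TextQuoteSelector,exact=annotation,
       prefix=this%20is%20an%20,suffix=%20that%20has%20some)"""

selector = parse_selector_fragment(URL)
page_text = "this is an annotation that has some text"  # stand-in for page text
print(selector)
print(find_text_quote(page_text, selector))
```

The matching here is an exact substring search on plain text; a real
implementation would also need to map the resulting offsets back to DOM text
nodes.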
>
>
> Cheers,
>
> Zheng
>
>
> On 2022-02-17 2:34 a.m., Lars Wallin wrote:
>
> Good point. But, as you say, it's a very good starting point indeed 🙂
> No need to reengineer the wheel just for the fun of it 😉
>
> Cheers,
> Lars
>
> On Thu, 17 Feb 2022, 08:25 Ivan Herman, <ivan@w3.org> wrote:
>
>>
>>
>> On 17 Feb 2022, at 07:33, Lars Wallin <lars@colibrio.com> wrote:
>>
>> Hey Xu Zheng 👋
>>
>> There is actually already a standard for this. Have a look at the Web
>> Annotation Selectors and States document
>>
>> https://www.w3.org/TR/selectors-states/#CssSelector_def
>>
>> https://www.w3.org/TR/selectors-states/#serializing-iri-to-url
>>
>>
>> Just to be precise here…. the selector part *is* indeed a standard
>> (defined formally in [1]) and just quoted in the note that you cite. But
>> the URI translation thereof[2] is not.
>>
>> (I am all in favor of reusing the model, of course, if we can. But we
>> should be careful with the expectations…)
>>
>> Ivan
>>
>>
>> [1] https://www.w3.org/TR/annotation-model/
>> [2] https://www.w3.org/TR/selectors-states/#frags
>>
>>
>>
>> Let's show the Web Annotations spec the love it deserves 🙂
>>
>> EPUB CFI also supports "extensions", which would let you tag these selectors
>> on to the CFI, making it EPUB compatible 👌
>>
>> http://idpf.org/epub/linking/cfi/epub-cfi.html#sec-extensions
>>
>> Cheers,
>> Lars
>>
>> On Thu, 17 Feb 2022, 03:37 Xu Zheng (gardenia), <zxu@gardenia-corp.com>
>> wrote:
>>
>>> Hi folks
>>>
>>> This week's POC (
>>> https://wysebee.com/?startSelector=%23home-header%3A0%3A8&endSelector=%23home-header%3A0%3A15)
>>> for selecting a random string in a web browser brings me back to one
>>> question again.
>>>
>>> "Can we change the CFI spec a bit to support CSS Selectors?"
>>>
>>>
>>> As a reading system programmer, I find the most difficult parts of
>>> implementing CFI for highlights are:
>>> 1. We need to walk through all elements to identify the element for a CFI
>>> such as the following:
>>>
>>> /6/4!/4/10/2/1
>>>
>>> 2. Once this CFI has been generated for a highlight, it is still
>>> difficult to walk the DOM to locate the element when rendering the highlight.
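
To illustrate the walk described in the two points above, here is a minimal
sketch that resolves only the even (element) steps of a CFI path against a
parsed document. In CFI an even step n selects the (n/2)-th child element.
Odd steps (text), ID assertions in brackets, the "!" indirection and character
offsets are all left out. The sample document and the function name
resolve_element_steps are made up for illustration.

```python
import xml.etree.ElementTree as ET


def resolve_element_steps(root, path: str):
    """Follow even CFI steps such as '/4/4' down from the root element."""
    node = root
    for raw in path.strip("/").split("/"):
        step = int(raw)
        if step % 2:
            raise ValueError("odd steps address text and are not handled here")
        children = list(node)          # child elements only
        index = step // 2 - 1          # /2 -> first child, /4 -> second, ...
        if not 0 <= index < len(children):
            raise IndexError(f"step /{step} does not exist under <{node.tag}>")
        node = children[index]
    return node


DOC = (
    "<html><head><title>t</title></head>"
    "<body><p>one</p><p>two</p></body></html>"
)
root = ET.fromstring(DOC)
target = resolve_element_steps(root, "/4/4")   # body, then its second <p>
print(target.tag, target.text)                 # prints "p two"
```

Every lookup depends on counting siblings from the root, which is why both
generating and resolving a CFI means traversing the tree rather than calling
something like querySelector.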
>>>
>>>
>>> I went through CSS selectors again as well. It is possible to use a CSS
>>> selector to locate every element (see, for example, an implementation like
>>> https://www.npmjs.com/package/css-selector-generator).
>>>
>>> And actually, I remember that when someone brought CFI to the CSS WG and
>>> web browser vendors a few years ago, they mostly raised two points:
>>>
>>> 1. Why not use CSS selectors?
>>>
>>> 2. We should avoid inventing a new way of attaching things to a URI, such as
>>> the following:
>>>
>>> #epubcfi(/6/4!/4/10/2/1:3[%d0%a4-%22spa%20ce%22-99%25-aa^[bb^]^^])
>>>
>>>
>>>
>>> The rest of CFI is, I think, very useful: it defines the charOffset within
>>> each text node and the node index of the target node, which are not covered
>>> by CSS Selectors (you might want to ask: can we ask CSS Selectors to support
>>> selecting a charOffset? I am not sure, since CSS Selectors seem to select
>>> only elements (https://www.w3.org/TR/selectors-4/#context) ).
>>>
>>>
>>> So I wonder whether we can still find a unified way to define annotations,
>>> but with a slightly changed CFI.
>>>
>>>
>>> In the POC I tried a format like
>>>
>>> ?startSelector=<CSS Selector>:<node index>:<charOffset>&endSelector=<CSS
>>> Selector>:<node index>:<charOffset>
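
For illustration, the format above can be read back with a few lines of code.
This is only a sketch of the string handling, not the POC's actual
implementation; the names TextPoint, parse_point and parse_range are made up.
It splits from the right because a CSS selector may itself contain colons
(pseudo-classes such as :nth-child).

```python
from typing import NamedTuple
from urllib.parse import parse_qs, urlsplit


class TextPoint(NamedTuple):
    css_selector: str   # element that contains the text
    node_index: int     # which child node of that element
    char_offset: int    # character offset inside that node


def parse_point(value: str) -> TextPoint:
    """Parse '<CSS Selector>:<node index>:<charOffset>'."""
    # rsplit keeps colons that belong to the selector itself.
    selector, node_index, char_offset = value.rsplit(":", 2)
    return TextPoint(selector, int(node_index), int(char_offset))


def parse_range(url: str):
    """Read startSelector and endSelector from the URL query string."""
    query = parse_qs(urlsplit(url).query)   # parse_qs also percent-decodes
    return (
        parse_point(query["startSelector"][0]),
        parse_point(query["endSelector"][0]),
    )


POC_URL = (
    "https://wysebee.com/?startSelector=%23home-header%3A0%3A8"
    "&endSelector=%23home-header%3A0%3A15"
)
start, end = parse_range(POC_URL)
print(start)
print(end)
```

In a browser, the two points would then be handed to the DOM (for example via
a Range) to produce the actual highlight; that part is outside this sketch.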
>>>
>>> But if we make it closer to "CFI", we might have something like this.
>>> Even if web browsers do not support it natively, as long as EPUB
>>> supports it we would have something unified, so that end users could share
>>> highlights in the same book across different reading apps.
>>>
>>> ?epubcfi=<CSS Selector>:<node index>:<charOffset>,<CSS Selector>:<node
>>> index>:<charOffset>
>>>
>>>
>>> And this format can satisfy "path-relative-scheme-less-URL string
>>> <https://url.spec.whatwg.org/#path-relative-scheme-less-url-string>"
>>>
>>>
>>> Cheers,
>>>
>>> --
>>> Zheng Xu
>>>
>>> Chief Executive Officer
>>> Gardenia Corp https://gardenia-corp.com/
>>>
>>>
>>
>> ----
>> Ivan Herman, W3C
>> Home: http://www.w3.org/People/Ivan/
>> mobile: +33 6 52 46 00 43
>>
>>
>> --
> Zheng Xu
>
> Chief Executive Officer
> Gardenia Corp https://gardenia-corp.com/
>
>
>
> ----
> Ivan Herman, W3C
> Home: http://www.w3.org/People/Ivan/
> mobile: +33 6 52 46 00 43
>
>
>
Received on Thursday, 17 February 2022 13:57:11 UTC