Re: Maybe it's time to change CFI syntax to support CSS Selector?

> On 17 Feb 2022, at 13:35, Xu Zheng (gardenia) <zxu@gardenia-corp.com> wrote:
> 
> Yeah, it is already a web browser standard for identifying an HTML element.
> 
> However, I have not found that the rest (locating text) is well defined or implemented yet.
> 
> What is the current status of https://www.w3.org/TR/selectors-states/? It has a Selectors section which looks similar to the annotation model.
> 
> 

That is intentional. Here is the history of that document:

The whole selector model originated from earlier work on the annotation model, which was taken to the Working Group and then standardized. The nature of that document is such that the selector model is closely intertwined with the rest of the annotation model. I.e., it is part of a document that also contains other concepts (e.g., the abstract data structure for an annotation) which have nothing to do with what we are discussing.

At some point some members of the WG realized that, in fact, the selector model may have a wider usage than "just" annotation. But it was also realized that, for potential users, it might be very difficult to "extract" the relevant concepts out of the annotation model standard. Hence came the idea to do this "extraction" once and for all, i.e., to produce a document whose only purpose is to expose the selector and state model, regardless of its roots in annotation. That is the purpose of the selector-states document, and it accounts for 90% of the concepts therein. The only thing that was added (non-normatively) is the translation of the model concepts into URIs.
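
To make that URI translation a bit more concrete, here is a minimal sketch (in Python, not taken from the Note itself) of how a flat selector might be turned into the fragment form quoted further down in this thread. The function name and the dict representation are assumptions made purely for illustration; nested selectors (refinedBy) and the Note's exact escaping rules are not handled.

```python
from typing import Mapping
from urllib.parse import quote


def selector_to_fragment(selector: Mapping[str, str]) -> str:
    """Serialize a flat selector as a selector(...) fragment (hypothetical helper)."""
    # safe="" means commas, parentheses, '=' and spaces inside values
    # are all percent-encoded, so they cannot be confused with the
    # separators of the fragment syntax itself.
    pairs = ",".join(
        f"{key}={quote(value, safe='')}" for key, value in selector.items()
    )
    return f"selector({pairs})"


# The Text Quote Selector from the example quoted in this thread.
text_quote = {
    "type": "TextQuoteSelector",
    "exact": "annotation",
    "prefix": "this is an ",
    "suffix": " that has some",
}

url = "http://example.org/page1#" + selector_to_fragment(text_quote)
print(url)
```

Keep in mind that this only illustrates the shape of the fragment; as said, that part is not normative, and a browser will not act on such a fragment by itself.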

I hope this clarifies things.

Cheers

Ivan



> 
> Back to how to let a web browser select some text through a URI. To select some text in a webpage we need the node index inside an HTML element and the char offset inside that node, as I found.
> 
> CFI has the closest definition. The annotation model (https://www.w3.org/TR/annotation-model/) is very close, but it still needs some modification to get a mixture of CSS Selector and Text Position Selector to work together. I can also see that annotation-model has the further benefit of defining how to serialize an annotation (which is wider than a selector).
> 
> 
> 
> https://www.w3.org/TR/selectors-states/#serializing-iri-to-url is interesting as well. I don't think a selector like this is supported by web browsers yet, though maybe I am wrong. So it has a similar difficulty to CFI when it comes to adoption by web browsers.
> 
> http://example.org/page1
>     #selector(type=TextQuoteSelector,exact=annotation,
>        prefix=this%20is%20an%20,suffix=%20that%20has%20some)
> 
> 
> Cheers,
> 
> Zheng
> 
> 
> 
> On 2022-02-17 2:34 a.m., Lars Wallin wrote:
>> Good point. But, as you say, it's a very good starting point indeed 🙂
>> No need to reengineer the wheel just for the fun of it 😉
>> 
>> Cheers,
>> Lars
>> 
>> On Thu, 17 Feb 2022, 08:25 Ivan Herman, <ivan@w3.org> wrote:
>> 
>> 
>>> On 17 Feb 2022, at 07:33, Lars Wallin <lars@colibrio.com> wrote:
>>> 
>>> Hey Xu Zheng 👋
>>> 
>>> There is actually already a standard for this. Have a look at the Web Annotation Selectors and States document
>>> 
>>> https://www.w3.org/TR/selectors-states/#CssSelector_def
>>> 
>>> https://www.w3.org/TR/selectors-states/#serializing-iri-to-url
>> Just to be precise here… the selector part is indeed a standard (defined formally in [1]) and is just quoted in the note that you cite. But the URI translation thereof [2] is not.
>> 
>> (I am all in favor of reusing the model, of course, if we can. But we should be careful with the expectations…)
>> 
>> Ivan
>> 
>> 
>> [1] https://www.w3.org/TR/annotation-model/
>> [2] https://www.w3.org/TR/selectors-states/#frags
>> 
>> 
>>> 
>>> Let's show the Web Annotations spec the love it deserves 🙂
>>> 
>>> EPUB CFI also supports "extensions", which would let you tag these selectors on to the CFI, making it EPUB compatible 👌
>>> 
>>> http://idpf.org/epub/linking/cfi/epub-cfi.html#sec-extensions
>>> 
>>> Cheers,
>>> Lars
>>> 
>>> On Thu, 17 Feb 2022, 03:37 Xu Zheng (gardenia), <zxu@gardenia-corp.com> wrote:
>>> Hi folks
>>> 
>>> This week's POC (https://wysebee.com/?startSelector=%23home-header%3A0%3A8&endSelector=%23home-header%3A0%3A15) for selecting a random string in a web browser brings me back to one question again.
>>> 
>>> "Can we change the CFI spec a bit to support CSS Selectors?"
>>> 
>>> 
>>> 
>>> As a reading system programmer, the most difficult parts of implementing CFI for highlights are:
>>> 1. We need to go through all elements to identify the element for a CFI such as the following:
>>> 
>>> /6/4!/4/10/2/1
>>> 2. Once this CFI has been generated for a highlight, it is still difficult to go through the DOM to locate the element when rendering the highlight.
>>> 
>>> 
>>> 
>>> I went through CSS selectors again as well; it is possible to use a CSS selector to locate every element (see, for example, an implementation like https://www.npmjs.com/package/css-selector-generator).
>>> 
>>> And actually I remember that when someone brought up CFI with the CSS WG and web browser vendors a few years ago, they mostly raised two points:
>>> 
>>> 1. Why not use CSS selectors?
>>> 
>>> 2. We should avoid inventing a new way of attaching things to a URI, such as the following:
>>> 
>>> #epubcfi(/6/4!/4/10/2/1:3[%d0%a4-%22spa%20ce%22-99%25-aa^[bb^]^^])
>>> 
>>> 
>>> 
>>> The rest of CFI is, I think, very useful: it defines the charOffset within each text node and the node index of the target node, which are not covered by CSS Selectors. (You might want to ask: can we ask CSS Selectors to support selecting a charOffset? I am not sure, since CSS Selectors seem to select only elements (https://www.w3.org/TR/selectors-4/#context).)
>>> 
>>> 
>>> 
>>> So I wonder whether we can still find a unified way to define annotations, but with a slightly changed CFI.
>>> 
>>> 
>>> 
>>> In the POC I tried a format like:
>>> 
>>> ?startSelector=<CSS Selector>:<node index>:<charOffset>&endSelector=<CSS Selector>:<node index>:<charOffset>
>>> 
>>> But if we make it closer to "CFI", might we have something like the format below? Even if web browsers do not support it natively, as long as EPUB supports it we would have something unified that lets end users share highlights in the same book across different reading apps.
>>> 
>>> ?epubcfi=<CSS Selector>:<node index>:<charOffset>,<CSS Selector>:<node index>:<charOffset>
>>> 
>>> 
>>> 
>>> And this format can satisfy "path-relative-scheme-less-URL string <https://url.spec.whatwg.org/#path-relative-scheme-less-url-string>"
>>> 
>>> 
>>> 
>>> Cheers,
>>> 
>>> --
>>> Zheng Xu
>>> 
>>> Chief Executive Officer
>>> Gardenia Corp
>>> https://gardenia-corp.com/
>> 
>> ----
>> Ivan Herman, W3C
>> Home: http://www.w3.org/People/Ivan/
>> mobile: +33 6 52 46 00 43
>> 
>> 
> --
> Zheng Xu
> 
> Chief Executive Officer
> Gardenia Corp
> https://gardenia-corp.com/

----
Ivan Herman, W3C
Home: http://www.w3.org/People/Ivan/
mobile: +33 6 52 46 00 43

Received on Thursday, 17 February 2022 13:52:15 UTC