Minutes for the call on 2022-10-12

Hi all,

below are the minutes of today's call. They are also available at

https://www.w3.org/2022/10/12-rch-minutes.html,text


pa


12 October 2022

    [2]Agenda. [3]IRC log.

       [2]https://www.w3.org/events/meetings/51e8f278-b556-4090-b538-7928b3c628b6/20221012T110000

       [3]https://www.w3.org/2022/10/12-rch-irc


Attendees

    Present
           AndyS, dlehn, dlehn1, dlongley, gkellogg, Kazue, manu,
           markus_sabadello, pchampin, phila, TallTed, yamdan

    Regrets
           -

    Chair
           phila

    Scribe
           AndyS, phila

Contents

     1. [4]Editors

Meeting minutes

    phila: Any new attendees?

    Today: comparing the algorithms, deciding editors.

    <phila> [5]https://github.com/w3c/rch-rdc/issues/6


       [5]https://github.com/w3c/rch-rdc/issues/6


    phila: How do we compare the two algorithms?
    … described as being similar
    … as a group we need to decide how to proceed
    … various ideas - need to be open and fair.

    <Zakim> manu, you wanted to suggest one approach to compare
    algorithms.

    phila: if the graph is simple then it is a simple algorithm, as
    bnode structures increase it gets harder.

    manu: Aiden presented his algorithm. Comparing algorithms A
    and B could work, as the stages are similar.
    … run both in parallel
    … guided tour of the process

    gkellogg: if we are able to pick some examples to run through,
    that would be useful
    … complex category of graphs - set of problematic use cases -
    performance testing

    <Zakim> dlongley, you wanted to ask about criteria for making
    choices if we knew what the differences were to understand what
    differences to look for

    dlongley: if we figure out the differences - need to keep
    criteria in mind.
    … how formal do we need to get in understanding the differences
    … worried about the amount of work

    manu: add an input:
    … formal analysis

    <manu> Technical Report on the Universal RDF Dataset
    Normalization Algorithm:
    [6]https://lists.w3.org/Archives/Public/public-credentials/2021Apr/att-0032/Mirabolic_Graph_Iso_Report_2020_10_19.pdf

       [6]https://lists.w3.org/Archives/Public/public-credentials/2021Apr/att-0032/Mirabolic_Graph_Iso_Report_2020_10_19.pdf


    manu: we might consider bringing in that group
    … introduction to the formal analysis

    AndyS: You mentioned - might not be all graphs that were
    covered. I think we ought to target all graphs as you don't
    know what you'll encounter in the real world.

    <Zakim> manu, you wanted to comment on the "all graphs" thing

    dlongley: solve for all graphs (within resource limits)
    … by default solve all "normal graphs"
    … special flag for all graphs

    <Zakim> manu, you wanted to comment on the "all graphs" thing
    -- concerns around "as big as the web"

    AndyS: I'd push back a little on that as it means deciding what
    is and is not normal

    <dlongley> "don't (try to) canonicalize the Web"

    manu: at a higher level ... potential formal objections on
    charter ... e.g. very very large graphs
    … working on documents that are bounded
    … general algorithm ... state caveats e.g. not unbounded graphs
    … "poison graphs" as an attack vector.
    … we can eat up a lot of time on this.
    … scoping of graph needed

    phila: we have to limit the scope
    … we could create an algorithm for all graphs, but that is not
    the requirement

    [7]Explainer doc

       [7]https://www.w3.org/2022/07/rch-wg-charter/explainer.html


    phila: UCR says what we are trying to solve
    … (editors needed)

    <manu> +1 to explainer document to set the boundary of what
    we're trying to do.

    <gkellogg> SHACL doesn't do datasets, only graphs.

    phila: is there a condition we can check as a preprocessing
    step on the graph?

    AndyS: If you take a FOAF graph built up from bnodes - can
    become complex in a small file
    … I'd rather an approach that recognizes that sometimes you
    can't execute, rather than defining upfront what you can't
    compute

    <Zakim> dlongley, you wanted to say i think you'd have to
    formally prove a preprocessing step would protect you if there
    will be no false safe constraints in the processing algorithm

    dlongley: a preprocessing step would need proving

    <Zakim> manu, you wanted to speak about "multiple phased
    solutions" not THE algorithm.

    manu: we are not generating one algorithm. There exists today
    some impls in the field.
    … we might look at whether it is good enough
    … then consider next version
    … not all or nothing

    AndyS: What are the limitations? Assumptions?

    <Zakim> dlongley, you wanted to say we also know that RDF-star
    is coming -- and we'll need another algorithm for that

    dlongley: current limitations/assumptions of URDNA2015 - any
    bounded dataset
    … bail out at cost points.

    AndyS: I'm happy with bailing out. But you can go further and
    say it doesn't handle all graphs. I'm happy with all graphs,
    with a bail out if it takes too much computing
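
    The "accept all graphs, bail out if it costs too much" idea can
    be sketched as below. This is a deliberately naive brute-force
    relabeling, not URDNA2015 or either algorithm under discussion;
    the names canonical_form, BudgetExceeded and max_steps are
    hypothetical, and triples are plain tuples of strings in which
    blank nodes start with "_:".

```python
from itertools import permutations


class BudgetExceeded(Exception):
    """Raised when the work budget runs out before a result is found."""


def canonical_form(triples, max_steps=1000):
    """Naive canonical labeling with a bail-out (illustration only).

    Tries every relabeling of the blank nodes and keeps the smallest
    sorted list of triples. No graph shape is rejected up front; the
    function simply gives up once max_steps relabelings were tried.
    """
    # Collect blank nodes from subject and object positions.
    bnodes = sorted(
        {term for s, _, o in triples for term in (s, o) if term.startswith("_:")}
    )
    best = None
    steps = 0
    for perm in permutations(range(len(bnodes))):
        steps += 1
        if steps > max_steps:
            raise BudgetExceeded(f"gave up after {max_steps} relabelings")
        mapping = {b: f"_:c{i}" for b, i in zip(bnodes, perm)}
        relabeled = sorted(
            tuple(mapping.get(term, term) for term in triple) for triple in triples
        )
        if best is None or relabeled < best:
            best = relabeled
    return best


# Two isomorphic graphs that differ only in blank node labels.
g1 = [("_:a", "knows", "_:b"), ("_:b", "name", '"Bob"')]
g2 = [("_:x", "name", '"Bob"'), ("_:y", "knows", "_:x")]
same = canonical_form(g1) == canonical_form(g2)
```

    A caller would catch BudgetExceeded and report that the input
    could not be canonicalized within the configured limit, rather
    than classifying graphs as acceptable or not beforehand.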

    AndyS: Defining a shape beforehand is not something we should
    do

    <manu> +1 to what AndyS is saying -- sounds like we're agreeing
    :)

    phila: Others?

    Kazue: thinking that external criteria are hard to decide

    phila: and it is political

    yamdan: also important to be clear about processing.
    … A difference of the two algorithms is scope - dataset vs
    graph.

    <manu> +1 to yamdan's points.

    dlongley: criteria important. Formally defining the differences
    is itself difficult.

    <Zakim> gkellogg, you wanted to suggest identifying specific
    categories of graphs in our hypothetical dataset that are known
    to create computational problems.

    phila: please think of two criteria

    gkellogg: want a collection of cases beyond test cases e.g.
    known expensive.

    <dlongley> 1. ease of implementation, 2. existing incubation /
    use in the marketplace, 3. time / resource complexity in
    solving common datasets, 4. time / resource complexity in
    solving complex (or poison?) datasets

    dlongley: not an ordered list

    <manu> 5. Existence of formal proofs for the algorithms

    <manu> 6. Demonstration of review of formal proofs for the
    algorithms

    phila: ease of implementation - yes.
    … incubation - yes
    … resource complexity - yes
    … formal proofs - yes

    AndyS: Ease of implementation and complexity of algorithm can
    be in opposition

    <dlongley> yes, there is a tension between ease of
    implementation and time complexity (sometimes)

    <manu> +1 to create an issue to track this.

    <dlongley> 7. reusing existing primitives that are available on
    various platforms

    <Kazue> coverage of target RDF?

    dlongley: reuse primitives e.g. hashing algorithms.
    … existing RDF serialization.

    Kazue: cover real life examples

    phila: need to note that only unusual graphs trigger the
    failsafes.

    <Zakim> manu, you wanted to note "hashed data" as the output...
    for BBS.

    manu: BBS signatures do a statement-by-statement signature

    <dlongley> 8. allow signatures on individual statements and
    components of statements

    manu: criteria: has to support selective disclosure. Hashing
    alternatives.
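
    To show why statement-level output matters for selective
    disclosure, here is a minimal sketch that digests each
    canonical statement separately. It is not BBS and performs no
    signing; SHA-256 and the helper names statement_digests and
    disclosed_subset_matches are assumptions made for illustration
    only.

```python
import hashlib


def statement_digests(nquads):
    """Return one SHA-256 hex digest per canonical N-Quads statement."""
    return [hashlib.sha256(q.encode("utf-8")).hexdigest() for q in nquads]


def disclosed_subset_matches(disclosed, digests):
    """Check that every disclosed statement has a digest in the list."""
    known = set(digests)
    return all(
        hashlib.sha256(q.encode("utf-8")).hexdigest() in known for q in disclosed
    )


# Hypothetical canonical statements; the holder reveals only the first one.
statements = [
    '_:c14n0 <http://example.org/name> "Alice" .',
    '_:c14n0 <http://example.org/age> "42" .',
]
digests = statement_digests(statements)
ok = disclosed_subset_matches(statements[:1], digests)
```

    A single hash over the whole dataset would not allow this: the
    verifier would need every statement to recompute it.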

    <Zakim> AndyS, you wanted to give criteria

    <yamdan> +1 to BBS-friendly hash

    AndyS: Dataset, not graph, no shape excluded, cover RDF-star

    <gkellogg> +1 to AndyS

    AndyS: Translates as do stuff with the longest life

    <manu> I was with AndyS all the way up to "cover RDF-star" :)

    <gkellogg> Also, Generalized RDF (bnode predicates, literal
    subjects)

    +1 to gkellogg.

    dlongley: RDF-star. Do existing use cases.

    phila: rdf-star is a nice to have but should not fail because
    of rdf-star

    URDNA2015 FPWD

    phila: URDNA2015 as FPWD.
    … likes explanatory examples.

   Editors

    phila: need to do a test suite and an explainer.

    <Zakim> gkellogg, you wanted to volunteer to edit one or both
    of the documents and help with the test suites.

    gkellogg: have been active in CG
    … hat in the ring

    For the C14N spec: ...
    … For the C14N spec ...

    <manu> Thank you, Gregg for Editor-ing the canonicalization
    spec! :)

    dlongley: can contribute as backup editor

    phila: would anyone like to be an editor or contribute in some
    way?
    … hash doc
    … "RDH"

    <Zakim> manu, you wanted to note they might be the same doc?

    manu: might be the same doc.
    … hashing is C14N input, hash it. -- one page?
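
    The "take the C14N output, hash it" step could look like the
    sketch below. It assumes the statements are already
    canonicalized N-Quads strings; the choice of SHA-256, the
    sorting and newline joining, and the name
    hash_canonical_nquads are assumptions, not something the group
    has decided.

```python
import hashlib


def hash_canonical_nquads(nquads):
    """Hash a canonicalized dataset given as N-Quads statement strings.

    Statements are sorted and joined with newlines so the digest does
    not depend on the order in which they were supplied.
    """
    document = "".join(line + "\n" for line in sorted(nquads))
    return hashlib.sha256(document.encode("utf-8")).hexdigest()


# Hypothetical canonical statements, supplied in two different orders.
quads = [
    '_:c14n0 <http://example.org/name> "Alice" .',
    '_:c14n0 <http://example.org/knows> _:c14n1 .',
]
digest_a = hash_canonical_nquads(quads)
digest_b = hash_canonical_nquads(list(reversed(quads)))
```

    Because the input is sorted first, digest_a and digest_b are
    equal, which is the property a dataset hash needs.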

    <manu> Woo! Thanks Tobias for volunteering to be an Editor!

    Tobias_: happy to help edit esp hashing

    <dlongley> +1

    <manu> (for the second part)

    <pchampin> +1

    <Zakim> gkellogg, you wanted to discuss testing implications.

    gkellogg: testing may be easier as 2 docs

    <pchampin> a contrario, the C14N itself may be a complex
    document. That could justify keeping the hashing part out.

    yamdan: interested in hashing part

    <dlongley> +1 to Phil

    phila: end meeting

    <manu> Note: Ahmad Alobaid volunteered to be a first-time
    Editor in this group.


     Minutes manually created (not a transcript), formatted by
     [8]scribe.perl version 192 (Tue Jun 28 16:55:30 2022 UTC).

       [8]https://w3c.github.io/scribe2/scribedoc.html

Received on Wednesday, 12 October 2022 16:56:07 UTC