Minutes for the call on 2022-10-12 from Pierre-Antoine Champin on 2022-10-12 (public-rch-wg@w3.org from October 2022)

From: Pierre-Antoine Champin <pierre-antoine@w3.org>
Date: Wed, 12 Oct 2022 18:56:04 +0200
To: public-rch-wg@w3.org
Message-ID: <af6317d3-786e-f0b1-9ad7-d56d2481c6d3@w3.org>

Hi all,

below are the minutes of today's call. They are also available at

https://www.w3.org/2022/10/12-rch-minutes.html,text

12 October 2022

[2]Agenda. [3]IRC log.

[2]https://www.w3.org/events/meetings/51e8f278-b556-4090-b538-7928b3c628b6/20221012T110000

[3]https://www.w3.org/2022/10/12-rch-irc

Attendees

Present
AndyS, dlehn, dlehn1, dlongley, gkellogg, Kazue, manu,
markus_sabadello, pchampin, phila, TallTed, yamdan

Regrets
-

Chair
phila

Scribe
AndyS, phila

Contents

1. [4]Editors

Meeting minutes

phila: Any new attendees?

Today: comparing the algorithms, deciding editors.

<phila> [5]https://github.com/w3c/rch-rdc/issues/6

[5]https://github.com/w3c/rch-rdc/issues/6

phila: How do we compare the two algorithms?
… described as being similar
… as a group we need to decide how to proceed
… various ideas - need to be open and fair.

<Zakim> manu, you wanted to suggest one approach to compare
algorithms.

phila: if the graph is simple then it is a simple algorithm, as
bnode structures increase it gets harder.
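
A minimal sketch of why blank node structure matters, loosely
modelled on the first-degree hashing idea used in URDNA2015-style
algorithms. The function names and the triples-as-tuples
representation are illustrative assumptions, not part of either
algorithm under discussion. Each blank node is hashed from the
statements that mention it; nodes with a unique hash are easy to
label, while nodes that share a hash need further, more expensive
work.

```python
import hashlib
from collections import defaultdict


def first_degree_hash(bnode, triples):
    """Hash a blank node from the statements that mention it.

    The node itself is written as "_:a" and every other blank node
    as "_:z", so the result does not depend on the input labels.
    """
    lines = []
    for triple in triples:
        if bnode not in triple:
            continue
        terms = []
        for term in triple:
            if term == bnode:
                terms.append("_:a")
            elif term.startswith("_:"):
                terms.append("_:z")
            else:
                terms.append(term)
        lines.append(" ".join(terms) + " .\n")
    return hashlib.sha256("".join(sorted(lines)).encode("utf-8")).hexdigest()


def ambiguous_bnodes(triples):
    """Return the blank nodes whose first-degree hash is not unique."""
    bnodes = {t for triple in triples for t in triple if t.startswith("_:")}
    by_hash = defaultdict(list)
    for bnode in bnodes:
        by_hash[first_degree_hash(bnode, triples)].append(bnode)
    return sorted(b for group in by_hash.values() if len(group) > 1 for b in group)


# Simple graph: each blank node is distinguished by its own literal.
simple = [
    ("_:b0", "<http://example.org/name>", '"Alice"'),
    ("_:b1", "<http://example.org/name>", '"Bob"'),
]

# Symmetric graph: the two blank nodes look identical from the inside.
cycle = [
    ("_:x", "<http://example.org/p>", "_:y"),
    ("_:y", "<http://example.org/p>", "_:x"),
]
```

In this sketch `ambiguous_bnodes(simple)` is empty, while
`ambiguous_bnodes(cycle)` returns both nodes, which is the kind of
case where a canonicalization algorithm has to do extra work.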

manu: Aiden presented his algorithm. We could compare how
algorithms A and B work, as the stages are similar.
… run both in parallel
… guided tour of the process

gkellogg: if we are able to pick some examples to run through,
that would be useful
… complex category of graphs - set of problematic use cases -
performance testing

<Zakim> dlongley, you wanted to ask about criteria for making
choices if we knew what the differences were to understand what
differences to look for

dlongley: if we figure out the differences - need to keep
criteria in mind.
… how formal do we need to get in understanding the differences
… worried about the amount of work

manu: add an input:
… formal analysis

<manu> Technical Report on the Universal RDF Dataset
Normalization Algorithm:
[6]https://lists.w3.org/Archives/Public/public-credentials/2021Apr/att-0032/Mirabolic_Graph_Iso_Report_2020_10_19.pdf

[6]https://lists.w3.org/Archives/Public/public-credentials/2021Apr/att-0032/Mirabolic_Graph_Iso_Report_2020_10_19.pdf

manu: we might consider bringing in that group
… introduction to the formal analysis

AndyS: You mentioned - might not be all graphs that were
covered. I think we ought to target all graphs as you don't
know what you'll encounter in the real world

<Zakim> manu, you wanted to comment on the "all graphs" thing

dlongley: solve for all graphs (within resource limits)
… by default solve all "normal graphs"
… special flag for all graphs

<Zakim> manu, you wanted to comment on the "all graphs" thing
-- concerns around "as big as the web"

AndyS: I'd push back a little on that as it means deciding what
is and is not normal

<dlongley> "don't (try to) canonicalize the Web"

manu: at a higher level ... potential formal objections on
charter ... e.g. very very large graphs
… working on documents that are bounded
… general algorithm ... state caveats e.g. not unbounded graphs
… "poison graphs" as an attack vector.
… we can eat up a lot of time on this.
… scoping of graph needed

phila: we have to limit the scope
… we could create an algorithm for all but not the requirement

[7]Explainer doc

[7]https://www.w3.org/2022/07/rch-wg-charter/explainer.html

phila: UCR says what we are trying to solve
… (editors needed)

<manu> +1 to explainer document to set the boundary of what
we're trying to do.

<gkellogg> SHACL doesn't do datasets, only graphs.

phila: is there a condition we can check as a preprocessing step
on the graph?

AndyS: If you take a FOAF graph built up from bnodes - can
become complex in a small file
… I'd rather an approach that recognizes that sometimes you
can't execute, rather than defining upfront what you can't
compute

<Zakim> dlongley, you wanted to say i think you'd have to
formally prove a preprocessing step would protect you if there
will be no false safe constraints in the processing algorithm

dlongley: a preprocessing step would need proving

<Zakim> manu, you wanted to speak about "multiple phased
solutions" not THE algorithm.

manu: we are not generating one algorithm. There exists today
some impls in the field.
… we might look at whether it is good enough
… then consider next version
… not all or nothing

AndyS: What are the limitations? Assumption?

<Zakim> dlongley, you wanted to say we also know that RDF-star
is coming -- and we'll need another algorithm for that

dlongley: current limitations/assumption of URDNA2015 - any
bounded dataset
… bail out at cost points.

AndyS: I'm happy with bailing out. But you can go further and
say it doesn't handle all graphs. I'm happy with all graphs,
with a bail out if it takes too much computing
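
A minimal sketch of the "bail out" idea, assuming a simple step
counter as the cost measure. The class and function names are
hypothetical, and the permutation loop is only a toy stand-in for
the expensive part of a canonicalization algorithm, not URDNA2015
itself. No graph shape is excluded up front; the run stops only
when the work budget is exhausted.

```python
import itertools


class WorkBudgetExceeded(Exception):
    """Raised when processing costs more than the caller allows."""


class WorkBudget:
    def __init__(self, max_steps):
        self.max_steps = max_steps
        self.steps = 0

    def spend(self, n=1):
        # Count work as it happens and stop as soon as the limit is passed.
        self.steps += n
        if self.steps > self.max_steps:
            raise WorkBudgetExceeded(
                f"gave up after {self.max_steps} steps"
            )


def smallest_ordering(bnodes, budget):
    """Toy expensive step: try every ordering of the ambiguous nodes."""
    best = None
    for candidate in itertools.permutations(bnodes):
        budget.spend()
        if best is None or candidate < best:
            best = candidate
    return best
```

With two nodes and a budget of 10 steps the call completes; with
six nodes (720 orderings) and a budget of 100 steps it raises
`WorkBudgetExceeded` instead of running to completion.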

AndyS: Defining a shape beforehand is not something we should
do

<manu> +1 to what AndyS is saying -- sounds like we're agreeing
:)

phila: Others?

Kazue: thinking external criteria hard to decide

phila: and it is political

yamdan: also important to be clear about processing.
… A difference of the two algorithms is scope - dataset vs
graph.

<manu> +1 to yamdan's points.

dlongley: criteria important. Formally defining the differences
is itself difficult.

<Zakim> gkellogg, you wanted to suggest identifying specific
categories of graphs in our hypothetical dataset that are known
to create computational problems.

phila: please think of two criteria

gkellogg: want a collection of cases beyond test cases e.g.
known expensive.

<dlongley> 1. ease of implementation, 2. existing incubation /
use in the marketplace, 3. time / resource complexity in
solving common datasets, 4. time / resource complexity in
solving complex (or poison?) datasets

dlongley: not an ordered list

<manu> 5. Existence of formal proofs for the algorithms

<manu> 6. Demonstration of review of formal proofs for the
algorithms

phila: ease of implementation - yes.
… incubation - yes
… resource complexity - yes
… formal proofs - yes

AndyS: Ease of implementation and complexity of algorithm can
be in opposition

<dlongley> yes, there is a tension between ease of
implementation and time complexity (sometimes)

<manu> +1 to create an issue to track this.

<dlongley> 7. reusing existing primitives that are available on
various platforms

<Kazue> coverage of target RDF?

dlongley: reuse primitives e.g. hashing algorithms.
… existing RDF serialization.

Kazue: cover real life examples

phila: need to note that only unusual graphs trigger the
failsafes.

<Zakim> manu, you wanted to note "hashed data" as the output...
for BBS.

manu: BBS signatures do a statement-by-statement signature

<dlongley> 8. allow signatures on individual statements and
components of statements

manu: criteria: has to support selective disclosure. Hashing
alternatives.
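
A minimal sketch of per-statement hashing, assuming the input is
a list of already canonicalized N-Quads statements. This is not
BBS; it only illustrates that hashing each statement on its own
lets a subset be checked without the rest. The function names are
hypothetical.

```python
import hashlib


def statement_digests(canonical_nquads):
    """Return one SHA-256 digest per canonical statement."""
    return [
        hashlib.sha256(statement.encode("utf-8")).hexdigest()
        for statement in canonical_nquads
    ]


def check_disclosed(disclosed, all_digests):
    """Check that every disclosed statement matches a known digest."""
    known = set(all_digests)
    return all(
        hashlib.sha256(statement.encode("utf-8")).hexdigest() in known
        for statement in disclosed
    )


statements = [
    '_:c14n0 <http://example.org/name> "Alice" .\n',
    '_:c14n0 <http://example.org/age> "42" .\n',
]
digests = statement_digests(statements)
```

Here `check_disclosed([statements[0]], digests)` succeeds without
the second statement being revealed, while an altered statement
fails the check.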

<Zakim> AndyS, you wanted to give criteria

<yamdan> +1 to BBS-friendly hash

AndyS: Dataset, not graph, no shape excluded, cover RDF-star

<gkellogg> +1 to AndyS

AndyS: Translates as do stuff with the longest life

<manu> I was with AndyS all the way up to "cover RDF-star" :)

<gkellogg> Also, Generalized RDF (bnode predicates, literal
subjects)

+1 to gkellogg.

dlongley: RDF-star. Do existing use cases.

phila: rdf-star is a nice to have but should not fail because
of rdf-star

URDNA2015 FPWD

phila: URDNA2015 as FPWD.
… likes explanatory examples.

Editors

phila: need to do a test suite and an explainer.

<Zakim> gkellogg, you wanted to volunteer to edit one or both
of the documents and help with the test suites.

gkellogg: have been active in CG
… hat in the ring

For the C14N spec: ...
… For the C14N spec ...

<manu> Thank you, Gregg for Editor-ing the canonicalization
spec! :)

dlongley: can contribute as backup editor

phila: would anyone like to be an editor or contribute in some
way?
… hash doc
… "RDH"

<Zakim> manu, you wanted to note they might be the same doc?

manu: might be the same doc.
… hashing is C14N input, hash it. -- one page?
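
A minimal sketch of "take the C14N output and hash it", assuming
the canonical form is a list of N-Quads lines and that SHA-256 is
the chosen hash. Both are assumptions for illustration, since the
group had not yet decided on either, and the function name is
hypothetical.

```python
import hashlib


def hash_canonical_nquads(lines):
    """Sort the canonical statements, join them, and hash the result."""
    # Make sure every statement ends with exactly one newline.
    normalized = [line.rstrip("\n") + "\n" for line in lines]
    # Python sorts strings by code point, so the order is deterministic.
    document = "".join(sorted(normalized))
    return hashlib.sha256(document.encode("utf-8")).hexdigest()
```

Because the statements are sorted before hashing, two inputs that
list the same canonical statements in a different order produce
the same digest.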

<manu> Woo! Thanks Tobias for volunteering to be an Editor!

Tobias_: happy to help edit esp hashing

<dlongley> +1

<manu> (for the second part)

<pchampin> +1

<Zakim> gkellogg, you wanted to discuss testing implications.

gkellogg: testing may be easier as 2 docs

<pchampin> a contrario, the C14N itself may be a complex
document. That could justify keeping the hashing part out.

yamdan: interested in hashing part

<dlongley> +1 to Phil

phila: end meeting

<manu> Note: Ahmad Alobaid volunteered to be a first-time
Editor in this group.

Minutes manually created (not a transcript), formatted by
[8]scribe.perl version 192 (Tue Jun 28 16:55:30 2022 UTC).

[8]https://w3c.github.io/scribe2/scribedoc.html

Received on Wednesday, 12 October 2022 16:56:07 UTC