- From: Pierre-Antoine Champin <pierre-antoine@w3.org>
- Date: Wed, 12 Oct 2022 18:56:04 +0200
- To: public-rch-wg@w3.org
- Message-ID: <af6317d3-786e-f0b1-9ad7-d56d2481c6d3@w3.org>
Hi all,
below are the minutes of today's call. They are also available at
https://www.w3.org/2022/10/12-rch-minutes.html,text
pa
12 October 2022
[2]Agenda. [3]IRC log.
[2]https://www.w3.org/events/meetings/51e8f278-b556-4090-b538-7928b3c628b6/20221012T110000
[3]https://www.w3.org/2022/10/12-rch-irc
Attendees
Present
AndyS, dlehn, dlehn1, dlongley, gkellogg, Kazue, manu,
markus_sabadello, pchampin, phila, TallTed, yamdan
Regrets
-
Chair
phila
Scribe
AndyS, phila
Contents
1. [4]Editors
Meeting minutes
phila: Any new attendees?
Today: comparing the algorithms, deciding editors.
<phila> [5]https://github.com/w3c/rch-rdc/issues/6
[5]https://github.com/w3c/rch-rdc/issues/6
phila: How do we compare the two algorithms?
… described as being similar
… as a group we need to decide how to proceed
… various ideas - need to be open and fair.
<Zakim> manu, you wanted to suggest one approach to compare
algorithms.
phila: if the graph is simple then it is a simple algorithm; as
bnode structures increase it gets harder.
manu: Aiden presented his algorithm. We could compare how
algorithms A and B work, as the stages are similar.
… run both in parallel
… guided tour of the process
gkellogg: if we are able to pick some examples to run through,
that would be useful
… complex category of graphs - set of problematic use cases -
performance testing
<Zakim> dlongley, you wanted to ask about criteria for making
choices if we knew what the differences were to understand what
differences to look for
dlongley: if we figure out the differences - need to keep the
criteria in mind.
… how formal do we need to get in understanding the differences
… worried about the amount of work
manu: add an input:
… formal analysis
<manu> Technical Report on the Universal RDF Dataset
Normalization Algorithm: [6]https://lists.w3.org/Archives/
Public/public-credentials/2021Apr/att-0032/
Mirabolic_Graph_Iso_Report_2020_10_19.pdf
[6]https://lists.w3.org/Archives/Public/public-credentials/2021Apr/att-0032/Mirabolic_Graph_Iso_Report_2020_10_19.pdf
manu: we might consider bringing in that group
… introduction to the formal analysis
AndyS: You mentioned - might not be all graphs that were
covered. I think we ought to target all graphs as you don't
know what you'll encounter in the real world
<Zakim> manu, you wanted to comment on the "all graphs" thing
dlongley: solve for all graphs (within resource limits)
… by default solve all "normal graphs"
… special flag for all graphs
<Zakim> manu, you wanted to comment on the "all graphs" thing
-- concerns around "as big as the web"
AndyS: I'd push back a little on that as it means deciding what
is and is not normal
<dlongley> "don't (try to) canonicalize the Web"
manu: at a higher level ... potential formal objections on
charter ... e.g. very very large graphs
… working on documents that are bounded
… general algorithm ... state caveats e.g. not unbounded graphs
… "poison graphs" as an attack vector.
… we can eat up a lot of time on this.
… scoping of graph needed
phila: we have to limit the scope
… we could create an algorithm for all graphs, but that is not
the requirement
[7]Explainer doc
[7]https://www.w3.org/2022/07/rch-wg-charter/explainer.html
phila: UCR says what we are trying to solve
… (editors needed)
<manu> +1 to explainer document to set the boundary of what
we're trying to do.
<gkellogg> SHACL doesn't do datasets, only graphs.
phila: is there a condition we can check on the graph as a
preprocessing step?
AndyS: If you take a FOAF graph built up from bnodes - can
become complex in a small file
… I'd rather an approach that recognizes that sometimes you
can't execute, rather than defining upfront what you can't
compute
<Zakim> dlongley, you wanted to say i think you'd have to
formally prove a preprocessing step would protect you if there
will be no false safe constraints in the processing algorithm
dlongley: a preprocessing step would need proving
<Zakim> manu, you wanted to speak about "multiple phased
solutions" not THE algorithm.
manu: we are not generating one algorithm. There exists today
some impls in the field.
… we might look at whether it is good enough
… then consider next version
… not all or nothing
AndyS: What are the limitations? Assumption?
<Zakim> dlongley, you wanted to say we also know that RDF-star
is coming -- and we'll need another algorithm for that
dlongley: current limitations/assumption URDNA2015 - any bound
dataset
… bail out at cost points.
AndyS: I'm happy with bailing out. But you can go further and
say it doesn't handle all graphs. I'm happy with all graphs,
with a bail out if it takes too much computing
AndyS: Defining a shape beforehand is not something we should
do
<manu> +1 to what AndyS is saying -- sounds like we're agreeing
:)
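A minimal sketch of the "accept all graphs, bail out when it costs
too much" position, in Python. It is not URDNA2015 or either of the
algorithms under comparison: it is a naive brute-force labelling,
chosen only because it is short. The names canonicalize,
BudgetExceeded and max_steps are hypothetical, and triples are
assumed to be plain string tuples with blank nodes written "_:x".

```python
from itertools import permutations


class BudgetExceeded(Exception):
    """Raised when the work budget runs out before a result is found."""


def canonicalize(triples, max_steps=10_000):
    """Brute-force canonical blank node labelling with a work budget.

    Illustration only, NOT URDNA2015: every assignment of canonical
    labels to blank nodes is tried and the smallest sorted result wins.
    """
    triples = list(triples)
    bnodes = sorted(
        {term for triple in triples for term in triple if term.startswith("_:")}
    )
    best = None
    steps = 0
    for perm in permutations(range(len(bnodes))):
        steps += 1
        if steps > max_steps:
            # No input shape is rejected up front; we stop on cost alone.
            raise BudgetExceeded(f"gave up after {max_steps} candidate labellings")
        mapping = {b: f"_:c{i}" for b, i in zip(bnodes, perm)}
        candidate = sorted(
            {tuple(mapping.get(term, term) for term in triple) for triple in triples}
        )
        if best is None or candidate < best:
            best = candidate
    return best
```

Two inputs that differ only in blank node labels give the same
result, while a graph with eight blank nodes (8! = 40,320
labellings) already exceeds the default budget and raises
BudgetExceeded instead of running on.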
phila: Others?
Kazue: thinking external criteria hard to decide
phila: and it is political
yamdan: also important to be clear about processing.
… A difference of the two algorithms is scope - dataset vs
graph.
<manu> +1 to yamdan's points.
dlongley: criteria important. Formally defining the differences
is itself difficult.
<Zakim> gkellogg, you wanted to suggest identifying specific
categories of graphs in our hypothetical dataset that are known
to create computational problems.
phila: please think of two criteria
gkellogg: want a collection of cases beyond test cases e.g.
known expensive.
<dlongley> 1. ease of implementation, 2. existing incubation /
use in the marketplace, 3. time / resource complexity in
solving common datasets, 4. time / resource complexity in
solving complex (or poison?) datasets
dlongley: not an ordered list
<manu> 5. Existence of formal proofs for the algorithms
<manu> 6. Demonstration of review of formal proofs for the
algorithms
phila: ease of implementation - yes.
… incubation - yes
… resource complexity - yes
… formal proofs - yes
AndyS: Ease of implementation and complexity of algorithm can
be in opposition
<dlongley> yes, there is a tension between ease of
implementation and time complexity (sometimes)
<manu> +1 to create an issue to track this.
<dlongley> 7. reusing existing primitives that are available on
various platforms
<Kazue> coverage of target RDF?
dlongley: reuse primitives e.g. hashing algorithms.
… existing RDF serialization.
Kazue: cover real life examples
phila: need to note that only unusual graphs trigger the
failsafes.
<Zakim> manu, you wanted to note "hashed data" as the output...
for BBS.
manu: BBS signatures do a statement-by-statement signature
<dlongley> 8. allow signatures on individual statements and
components of statements
manu: criteria: has to support selective disclosure. Hashing
alternatives.
<Zakim> AndyS, you wanted to give criteria
<yamdan> +1 to BBS-friendly hash
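A rough sketch of why per-statement output matters for selective
disclosure, in Python. This is not BBS: plain SHA-256 digests stand
in for whatever a real signature scheme would do with each
statement. The function names are hypothetical, and the input is
assumed to be already-canonical N-Quads with one statement per line.

```python
import hashlib


def statement_digests(canonical_nquads: str) -> list[str]:
    """Hash each non-empty line (one statement per line) separately."""
    return [
        hashlib.sha256((line + "\n").encode("utf-8")).hexdigest()
        for line in canonical_nquads.splitlines()
        if line
    ]


def disclosed_subset_is_valid(disclosed: str, signed_digests: list[str]) -> bool:
    """Check that every disclosed statement matches one of the known digests."""
    known = set(signed_digests)
    return all(digest in known for digest in statement_digests(disclosed))
```

A holder could then reveal a subset of the statements, and a
verifier could check them against the per-statement digests without
seeing the rest. A single hash over the whole document would not
allow that.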
AndyS: Dataset, not graph, no shape excluded, cover RDF-star
<gkellogg> +1 to AndyS
AndyS: Translates as do stuff with the longest life
<manu> I was with AndyS all the way up to "cover RDF-star" :)
<gkellogg> Also, Generalized RDF (bnode predicates, literal
subjects)
+1 to gkellogg.
dlongley: RDF-star. Do existing use cases.
phila: rdf-star is a nice to have but should not fail because
of rdf-star
URDNA2015 FPWD
phila: URDNA2015 as FPWD.
… likes explanatory examples.
Editors
phila: need to do a test suite and an explainer.
<Zakim> gkellogg, you wanted to volunteer to edit one or both
of the documents and help with the test suites.
gkellogg: have been active in CG
… hat in the ring
For the C14N spec: ...
… For the C14N spec ...
<manu> Thank you, Gregg for Editor-ing the canonicalization
spec! :)
dlongley: can contribute as backup editor
phila: would anyone like to be an editor or contribute in some
way?
… hash doc
… "RDH"
<Zakim> manu, you wanted to note they might be the same doc?
manu: might be the same doc.
… hashing is C14N input, hash it. -- one page?
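A minimal sketch of the "take the C14N output, hash it" step, in
Python. SHA-256 and the name hash_canonical_nquads are assumptions
for illustration; the minutes do not fix a hash function. The input
is assumed to be a string that has already been canonicalized.

```python
import hashlib


def hash_canonical_nquads(canonical_nquads: str) -> str:
    """Return the hex SHA-256 digest of an already-canonical N-Quads string."""
    return hashlib.sha256(canonical_nquads.encode("utf-8")).hexdigest()
```

The hashing step is this small only because the canonicalization
step has already made the serialization deterministic. Hashing
non-canonical output would give different digests for the same
dataset.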
<manu> Woo! Thanks Tobias for volunteering to be an Editor!
Tobias_: happy to help edit esp hashing
<dlongley> +1
<manu> (for the second part)
<pchampin> +1
<Zakim> gkellogg, you wanted to discuss testing implications.
gkellogg: testing may be easier as 2 docs
<pchampin> a contrario, the C14N itself may be a complex
document. That could justify keeping the hashing part out.
yamdan: interested in hashing part
<dlongley> +1 to Phil
phila: end meeting
<manu> Note: Ahmad Alobaid volunteered to be a first-time
Editor in this group.
Minutes manually created (not a transcript), formatted by
[8]scribe.perl version 192 (Tue Jun 28 16:55:30 2022 UTC).
[8]https://w3c.github.io/scribe2/scribedoc.html
Received on Wednesday, 12 October 2022 16:56:07 UTC