- From: Pierre-Antoine Champin <pierre-antoine@w3.org>
- Date: Wed, 12 Oct 2022 18:56:04 +0200
- To: public-rch-wg@w3.org
- Message-ID: <af6317d3-786e-f0b1-9ad7-d56d2481c6d3@w3.org>
Hi all,

below are the minutes of today's call. They are also available at https://www.w3.org/2022/10/12-rch-minutes.html,text

pa

12 October 2022

[2]Agenda. [3]IRC log.

[2]https://www.w3.org/events/meetings/51e8f278-b556-4090-b538-7928b3c628b6/20221012T110000
[3]https://www.w3.org/2022/10/12-rch-irc

Attendees

Present: AndyS, dlehn, dlehn1, dlongley, gkellogg, Kazue, manu, markus_sabadello, pchampin, phila, TallTed, yamdan
Regrets: -
Chair: phila
Scribe: AndyS, phila

Contents

1. [4]Editors

Meeting minutes

phila: Any new attendees? Today: comparing the algorithms, deciding editors.
<phila> [5]https://github.com/w3c/rch-rdc/issues/6
phila: How do we compare the two algorithms
… described as being similar
… as a group we need to decide how to proceed
… various ideas - need to be open and fair.
<Zakim> manu, you wanted to suggest one approach to compare algorithms.
phila: if the graph is simple then it is a simple algorithm, as bnode structures increase it gets harder.
manu: Aidan presented his algorithm. Could compare how algorithms A and B work, as the stages are similar.
… run both in parallel
… guided tour of the process
gkellogg: if we are able to pick some examples to run through, that would be useful
… complex category of graphs - set of problematic use cases - performance testing
<Zakim> dlongley, you wanted to ask about criteria for making choices if we knew what the differences were to understand what differences to look for
dlongley: if we figure out the differences - need to keep in mind criteria.
… how formal do we need to get in understanding the differences
… worried about the amount of work
manu: add an input:
… formal analysis
<manu> Technical Report on the Universal RDF Dataset Normalization Algorithm: [6]https://lists.w3.org/Archives/Public/public-credentials/2021Apr/att-0032/Mirabolic_Graph_Iso_Report_2020_10_19.pdf
manu: we might consider bringing in that group
… introduction to the formal analysis
AndyS: You mentioned - might not be all graphs that were covered. I think we ought to target all graphs as you don't know what you'll encounter in the real world
<Zakim> manu, you wanted to comment on the "all graphs" thing
dlongley: solve for all graphs (within resource limits)
… by default solve all "normal graphs"
… special flag for all graphs
<Zakim> manu, you wanted to comment on the "all graphs" thing -- concerns around "as big as the web"
AndyS: I'd push back a little on that as it means deciding what is and is not normal
<dlongley> "don't (try to) canonicalize the Web"
manu: at a higher level ... potential formal objections on charter ... e.g. very very large graphs
… working on documents that are bounded
… general algorithm ... state caveats e.g. not unbounded graphs
… "poison graphs" as an attack vector.
… we can eat up a lot of time on this.
… scoping of graph needed
phila: we have to limit the scope
… we could create an algorithm for all but not the requirement
[7]Explainer doc
[7]https://www.w3.org/2022/07/rch-wg-charter/explainer.html
phila: UCR says what we are trying to solve
… (editors needed)
<manu> +1 to explainer document to set the boundary of what we're trying to do.
<gkellogg> SHACL doesn't do datasets, only graphs.
phila: is there a condition we can do as a preprocessing graph
AndyS: If you take a FOAF graph built up from bnodes - can become complex in a small file
… I'd rather an approach that recognizes that sometimes you can't execute, rather than defining upfront what you can't compute
<Zakim> dlongley, you wanted to say i think you'd have to formally prove a preprocessing step would protect you if there will be no false safe constraints in the processing algorithm
dlongley: a preprocessing step would need proving
<Zakim> manu, you wanted to speak about "multiple phased solutions" not THE algorithm.
manu: we are not generating one algorithm. There exist today some impls in the field.
… we might look at whether it is good enough
… then consider next version
… not all or nothing
AndyS: What are the limitations? Assumptions?
<Zakim> dlongley, you wanted to say we also know that RDF-star is coming -- and we'll need another algorithm for that
dlongley: current limitations/assumptions, URDNA2015 - any bounded dataset
… bail out at cost points.
AndyS: I'm happy with bailing out. But you can go further and say it doesn't handle all graphs. I'm happy with all graphs, with a bail out if it takes too much computing
AndyS: Defining a shape beforehand is not something we should do
<manu> +1 to what AndyS is saying -- sounds like we're agreeing :)
phila: Others?
Kazue: thinking external criteria hard to decide
phila: and it is political
yamdan: also important to be clear about processing.
… A difference of the two algorithms is scope - dataset vs graph.
<manu> +1 to yamdan's points.
dlongley: criteria important. Formally defining the differences is itself difficult.
<Zakim> gkellogg, you wanted to suggest identifying specific categories of graphs in our hypothetical dataset that are known to create computational problems.
phila: please think of two criteria
gkellogg: want a collection of cases beyond test cases e.g. known expensive.
<dlongley> 1. ease of implementation, 2. existing incubation / use in the marketplace, 3. time / resource complexity in solving common datasets, 4. time / resource complexity in solving complex (or poison?) datasets
dlongley: not an ordered list
<manu> 5. Existence of formal proofs for the algorithms
<manu> 6. Demonstration of review of formal proofs for the algorithms
phila: ease of implementation - yes.
… incubation - yes
… resource complexity - yes
… formal proofs - yes
AndyS: Ease of implementation and complexity of algorithm can be in opposition
<dlongley> yes, there is a tension between ease of implementation and time complexity (sometimes)
<manu> +1 to create an issue to track this.
<dlongley> 7. reusing existing primitives that are available on various platforms
<Kazue> coverage of target RDF?
dlongley: reuse primitives e.g. hashing algorithms.
… existing RDF serialization.
Kazue: cover real life examples
phila: need to note that only unusual graphs trigger the failsafes.
<Zakim> manu, you wanted to note "hashed data" as the output... for BBS.
manu: BBS signatures do a statement-by-statement signature
<dlongley> 8. allow signatures on individual statements and components of statements
manu: criteria: has to support selective disclosure. Hashing alternatives.
<Zakim> AndyS, you wanted to give criteria
<yamdan> +1 to BBS-friendly hash
AndyS: Dataset, not graph, no shape excluded, cover RDF-star
<gkellogg> +1 to AndyS
AndyS: Translates as do stuff with the longest life
<manu> I was with AndyS all the way up to "cover RDF-star" :)
<gkellogg> Also, Generalized RDF (bnode predicates, literal subjects)
+1 to gkellogg.
dlongley: RDF-star. Do existing use cases.
phila: rdf-star is a nice to have but should not fail because of rdf-star

URDNA2015 FPWD

phila: URDNA2015 as FPWD.
… likes explanatory examples.

Editors

phila: need to do a test suite and an explainer.
<Zakim> gkellogg, you wanted to volunteer to edit one or both of the documents and help with the test suites.
gkellogg: have been active in CG
… hat in the ring
For the C14N spec: ...
… For the C14N spec ...
<manu> Thank you, Gregg for Editor-ing the canonicalization spec! :)
dlongley: can contribute as backup editor
phila: anyone like to be an editor or contribute in some way?
… hash doc
… "RDH"
<Zakim> manu, you wanted to note they might be the same doc?
manu: might be the same doc.
… hashing is C14N input, hash it. -- one page?
<manu> Woo! Thanks Tobias for volunteering to be an Editor!
Tobias_: happy to help edit esp hashing
<dlongley> +1
<manu> (for the second part)
<pchampin> +1
<Zakim> gkellogg, you wanted to discuss testing implications.
gkellogg: testing may be easier as 2 docs
<pchampin> a contrario, the C14N itself may be a complex document. That could justify keeping the hashing part out.
yamdan: interested in hashing part
<dlongley> +1 to Phil
phila: end meeting
<manu> Note: Ahmad Alobaid volunteered to be a first-time Editor in this group.

Minutes manually created (not a transcript), formatted by [8]scribe.perl version 192 (Tue Jun 28 16:55:30 2022 UTC).

[8]https://w3c.github.io/scribe2/scribedoc.html
Received on Wednesday, 12 October 2022 16:56:07 UTC