Re: regression testing [was Re: summarizing proposed changes to charter]

Hi Peter,

On 08/14/2014 11:45 AM, Peter F. Patel-Schneider wrote:
> On 08/13/2014 08:14 PM, David Booth wrote:
>> On 08/13/2014 10:04 PM, Peter F. Patel-Schneider wrote:
>>> OK, even though regression testing doesn't need canonicalization, it is
>>> useful to have RDF canonicalization to support a particular regression
>>> testing system.
>>>
>>> But how is the lack of a W3C-blessed method for RDF canonicalization
>>> hindering the development or deployment of this system?  How would a
>>> W3C-blessed method for RDF canonicalization help the development or
>>> deployment of this system?
>>>
>>> The system could use any canonical form whatsoever, after all, right?
>>
>> Yes and no.  The lack of a W3C-blessed method of RDF canonicalization
>> makes the comparison dependent on the particular canonicalization tool
>> that is used, which means that RDF data produced by different tools (or
>> different versions of the same tool) could not be reliably compared.  In
>> many scenarios this won't be an issue, but it will in some.
>
> Maybe, but regression testing appears to me to be a scenario where this
> is definitely not an issue.  The regression testing system can simply
> have one particular canonicalization tool or method that it uses.

No, it cannot.  Again, the data comparison process -- regression testing 
-- is *separate* from the data serialization process.  They are 
completely decoupled.  The data comparison process knows nothing about 
the datatype -- it just compares bytes.  Canonicalization is done in the 
*serialization* process, which *must* understand the datatype anyway -- 
not in the comparison process, which does not and should not have to 
know anything about the datatype.  This decoupling is done so that the 
Framework can uniformly handle any kind of data.
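
To make the decoupling concrete, here is a minimal sketch in Python.  It 
is only an illustration, not the Framework's actual code: the function 
names are hypothetical, and the serializer assumes ground triples (no 
blank nodes) whose terms are already written in N-Triples syntax.

```python
def serialize_canonical(triples):
    """Serializer side: understands the data and emits deterministic bytes.

    Assumption: every triple is a (subject, predicate, object) tuple of
    strings already in N-Triples term syntax, with no blank nodes.
    """
    # Deduplicate and sort so the output does not depend on input order.
    lines = sorted({f"{s} {p} {o} ." for s, p, o in triples})
    return ("\n".join(lines) + "\n").encode("utf-8")


def same_bytes(expected: bytes, actual: bytes) -> bool:
    """Comparison side: knows nothing about RDF, it just compares bytes."""
    return expected == actual


# Hypothetical example data; the same two triples in a different order.
run_1 = [
    ("<http://example.org/a>", "<http://example.org/p>", '"1"'),
    ("<http://example.org/b>", "<http://example.org/p>", '"2"'),
]
run_2 = list(reversed(run_1))

print(same_bytes(serialize_canonical(run_1), serialize_canonical(run_2)))
```

The comparison step never parses anything, so the same step works for 
any kind of data, provided each serializer is deterministic.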

>
>> But more importantly, the lack of a standard RDF canonicalization method
>> discourages the development of canonicalization tools.
>
> Well, maybe, but in the absence of a need for a standard for RDF
> canonicalization then this does not appear to be a problem to be
> addressed by standardization.

Well, I've described my need for it, but I understand if you're using 
RDF differently and don't feel the same need that I feel.

>
>> Canonicalization has gotten little attention in RDF tools, in my view
>> largely *because* of the difficulty of doing it and the lack of a
>> W3C-blessed method.  It is non-trivial to implement, and if one's
>> implementation would just end up as one's own idiosyncratic
>> canonicalization anyway, instead of being an implementation of a
>> standard, then there isn't as much motivation to do it.  I think a
>> W3C-blessed method would help a lot.
>>
>> Would you be okay with canonicalization being an OPTIONAL deliverable?
>
> Without a demonstrated need for W3C standardization of RDF
> canonicalization, and, further, a demonstrated need in the context of
> this WG's general activities, I don't see that RDF canonicalization
> should be a part of the WG.

I'm disappointed, but I'll drop my request that it be included.

Having evangelized RDF for over 10 years, I've become convinced that RDF 
has subtleties that ultimately make it more complex and harder to use 
than it should be, and this inhibits adoption.  The inability to easily 
compare two RDF graphs for equality is one good example: it is trivially 
easy in most data representations, but embarrassingly difficult in RDF.  
I was hoping that we could make a small step toward fixing that flaw, 
but I guess not today.
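
A small sketch of why the comparison is hard, again in Python with 
hypothetical names and example data.  Sorting serialized triples is 
enough for ground data, but blank node labels are arbitrary, so two 
graphs that say the same thing can still differ byte for byte.

```python
def naive_serialize(triples):
    """Sort N-Triples-style lines; this does NOT relabel blank nodes."""
    return "\n".join(sorted(f"{s} {p} {o} ." for s, p, o in triples))


KNOWS = "<http://example.org/knows>"  # hypothetical predicate IRI

# Same structure, different blank node labels: the graphs are isomorphic.
graph_1 = [("_:a", KNOWS, "_:b")]
graph_2 = [("_:x", KNOWS, "_:y")]

# The naive serializations differ even though the graphs are equivalent.
print(naive_serialize(graph_1) == naive_serialize(graph_2))
```

Closing that gap means choosing blank node labels deterministically 
from the graph's structure, which is the non-trivial part.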

Still, I'm glad to see other work on RDF validation going forward, 
because that's another gap that we've had for a long time.

David

Received on Thursday, 14 August 2014 18:09:20 UTC