Re: compact-0018 from Gregg Kellogg on 2012-10-22 (public-linked-json@w3.org from October 2012)

From: Gregg Kellogg <gregg@greggkellogg.net>
Date: Mon, 22 Oct 2012 11:39:25 -0400
To: Markus Lanthaler <markus.lanthaler@gmx.net>
CC: Dave Longley <dlongley@digitalbazaar.com>, "public-linked-json@w3.org" <public-linked-json@w3.org>
Message-ID: <FFD06522-5DE6-4993-B69E-DC0E671EACD0@greggkellogg.net>

On Oct 22, 2012, at 8:32 AM, Markus Lanthaler <markus.lanthaler@gmx.net> wrote:

> I didn’t want to reflect my algorithm but wanted to test corner cases. From an end-user's perspective I would say that term2 is clearly a better match than term1 for the second list below; do we disagree on that?

With either algorithm, it's possible to construct tests that have slightly different list contents but attempt to (illegally) use the same term. There is indeed a difference in how the algorithms operate, and yours is apparently more correct for this one use case. FWIW, I think we're probably spending too much time testing bizarre corner cases, and you could argue that no such tests are appropriate, as having two list values for a term is illegal.

I think the only purpose of this test is to indirectly test the Term Rank algorithm; in that sense, the test is inconsistent with the existing expression of that algorithm.

Gregg

> From: Dave Longley [mailto:dlongley@digitalbazaar.com] 
> Sent: Monday, October 22, 2012 3:42 PM
> To: Gregg Kellogg
> Cc: Markus Lanthaler; public-linked-json@w3.org
> Subject: Re: compact-0018
>  
> It looks like you may have already sorted this out, but my understanding was that Markus had changed that test to reflect his term ranking algorithm rather than the one in the spec -- as the term ranking stuff was still under discussion. I'm waiting to update my processor implementations based on what gets finalized on that issue.
> 
> On 10/20/2012 05:47 PM, Gregg Kellogg wrote:
> I think there's a problem in compact-0018 regarding finding the appropriate terms for term1 and term2.
>  
> The test includes two lists, associated with an IRI shared between term1 and term2. The difference is that term1 has no language defined, and term2 has a language different from the default of the context ("en" vs "de").
>  
> The result comes down to calculating the term ranks for each value in the list. I come up with the following calculations:
>  
>  
> value                                     term1  term2
> { "@value": "v1.1", "@language": "de" },      3      0
> { "@value": "v1.2", "@language": "de" },      3      0
> { "@value": "v1.3", "@language": "de" },      3      0
> 4,                                            2      1
> { "@value": "v1.5", "@language": "en" },      1      3
> { "@value": "v1.6", "@language": "en" }       1      3
> total (term1)                                13      7
>  
> { "@value": "v2.1", "@language": "en" },      1      3
> { "@value": "v2.2", "@language": "en" },      1      3
> { "@value": "v2.3", "@language": "en" },      1      3
> 4,                                            2      1
> { "@value": "v2.5", "@language": "de" },      3      0
> { "@value": "v2.6", "@language": "de" }       3      0
> total (term2)                                11     10
>  
> (pardon the formatting)
>  
> Basically, I find that term1 is selected in both cases, which results in an illegal compaction, as a term with @container: @list can't have two list values.
>  
> The playground, and presumably Markus' implementation, do allocate between term1 and term2, so it seems that there's an inconsistency.
>  
> I think the test would be just as valid if v1.5 were "de" and v2.5 were "en", which would give the totals of 15 and 4 for the v1.x values and 9 and 13 for the v2.x values, which would result in the proper allocation.
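
The totals quoted in this thread can be rechecked with a short Python sketch. It is only an illustration of the arithmetic, not the Term Rank algorithm itself: the per-value ranks are copied from the figures in this message, and the names RANKS, totals and best_term are hypothetical. Picking term1 on a tie is an assumption (no tie occurs with these numbers).

```python
# Ranks per kind of list value as (term1, term2), copied from the
# figures in this thread. They are NOT computed by a real Term Rank
# implementation; this only re-adds the numbers.
RANKS = {"de": (3, 0), "en": (1, 3), "number": (2, 1)}


def totals(kinds):
    """Sum the per-value ranks of a list for term1 and term2."""
    term1 = sum(RANKS[k][0] for k in kinds)
    term2 = sum(RANKS[k][1] for k in kinds)
    return term1, term2


def best_term(kinds):
    """Pick the term with the higher total (tie -> term1, an assumption)."""
    term1, term2 = totals(kinds)
    return "term1" if term1 >= term2 else "term2"


# Lists as they appear in the test: v1.1..v1.6 and v2.1..v2.6.
list1 = ["de", "de", "de", "number", "en", "en"]
list2 = ["en", "en", "en", "number", "de", "de"]

# Proposed variant: v1.5 becomes "de" and v2.5 becomes "en".
list1_alt = ["de", "de", "de", "number", "de", "en"]
list2_alt = ["en", "en", "en", "number", "en", "de"]

print(totals(list1), best_term(list1))          # (13, 7) term1
print(totals(list2), best_term(list2))          # (11, 10) term1
print(totals(list1_alt), best_term(list1_alt))  # (15, 4) term1
print(totals(list2_alt), best_term(list2_alt))  # (9, 13) term2
```

With the original values both lists land on term1, which is the conflict described above; with the swapped values the second list moves to term2.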
>  
> Am I missing some detail in the algorithms?
>  
> Gregg Kellogg
> gregg@greggkellogg.net
>  
> 
> 
> 
> -- 
> Dave Longley
> CTO
> Digital Bazaar, Inc.
> http://digitalbazaar.com

Received on Monday, 22 October 2012 15:44:53 UTC