Re: bugs in 5.10 Express the quality of a linkset from Antoine Isaac on 2016-07-29 (public-dwbp-comments@w3.org from July 2016)

From: Antoine Isaac <aisaac@few.vu.nl>
Date: Fri, 29 Jul 2016 14:48:37 +0200
To: Riccardo Albertoni <albertoni@ge.imati.cnr.it>
CC: public-dwbp-comments <public-dwbp-comments@w3.org>, Vladimir Alexiev <vladimir.alexiev@ontotext.com>, Public DWBP WG <public-dwbp-wg@w3.org>
Message-ID: <579B50A5.3000608@few.vu.nl>
Hi Riccardo,

Thanks!
I may make a couple of minor editorial changes later today, but it looks considerably clearer to me!

Cheers,

Antoine

On 27/07/16 18:55, Riccardo Albertoni wrote:
> Hi Antoine,
> I revised the example following "way 1"; see my last push.
>
>
> On 26 July 2016 at 19:48, Antoine Isaac <aisaac@few.vu.nl> wrote:
>> Hi Riccardo,
>>
>> The gain is 'oriented' so anything that will clarify that it is so will be
>> beneficial.
>>
>> I think this can be done in two ways, depending on which level of
>> 'hardcoding' of the direction in the metric you prefer.
>>
>>
>> 1. Adding the statements void:subjectsTarget and void:objectsTarget on the
>> Linkset, and declaring that computing the Metric
>> :importingForPropertyPercentage requires these statements to be present
>> on the Linkset. What is measured is then the completion of
>> the dataset in void:subjectsTarget using the data from the dataset in
>> void:objectsTarget.
>> Note that you may also want to reflect the same sort of 'hardcoding' of the
>> direction of completion on the :completenessGain Dimension and the
>> :complementationGain Category. A dimension that gathers metrics that are
>> computed in different directions may be confusing.
>>
>>
>> 2. Leaving the void:target statements on the Linkset as they are. But then
>> the Metric (and the Measurement) needs to have two parameters - one for
>> specifying the completed dataset, and one for the completing one.
>>
>> #2 is more elegant, and it avoids the theoretical hesitation on the
>> dimension and the category.
>> But it adds two parameters, which makes the example much more complex (this
>> example already involves two parameters).
>> So I'd rather go for #1.
>>
>> Note that in any case, the following sentence will have to be made sharper
>> by mentioning 'subject' and 'object':
>> "It quantifies the information gain when adding the preferred labels or the
>> alternative labels of the concepts from a linked dataset to the descriptions
>> of the concepts from the other dataset, which these concepts have been
>> matched with a skos:exactMatch statement from the linkset."
>
> I've tried to make this sentence sharper ;)
>
>>
>> Note also that we can avoid some of the theoretical thinking on the
>> Dimension and the Category by removing the :complementationGain Category. I
>> think it's not crucial to the example, and its name is not very clear.
>>
>
> I am not sure that deleting it would give us a clearer example, so I
> have left it.
>
>> Finally, as having different measurements on different days is
>> not core to the example, I'd suggest removing these extra measurements. The
>> example is quite complex already. What do you think?
>
> Ok I have deleted the repeated measurements.
> Cheers,
>   Riccardo
>
>>
>> cheers,
>>
>> Antoine
>>
>>
>> On 26/07/16 16:27, Riccardo Albertoni wrote:
>>>
>>> Hi Antoine,
>>>
>>>
>>> On 25 July 2016 at 15:24, Antoine Isaac <aisaac@few.vu.nl> wrote:
>>>>
>>>> Dear Riccardo, Vladimir,
>>>>
>>>> I'm looking again at the DQV after the updates on the linkset section,
>>>> triggered by Vladimir's comment.
>>>> And I'm quite puzzled. To me there was a key difference between say,
>>>> measurement_exactMatchAltLabelItDataset1 and
>>>> measurement_exactMatchAltLabelItDataset2.
>>>> What I understood is that the same linkset can indeed lead to quite different
>>>> 'completion gain' depending on which dataset the gain is evaluated on.
>>>>
>>>> To take a concrete example that will be familiar to Vladimir: say a linkset
>>>> aligns one local, monolingual vocabulary with Getty's Art and Architecture
>>>> Thesaurus, which has several languages and can have several labels for one
>>>> concept in one language.
>>>> If we try to pull the labels of one vocabulary into the other vocabulary,
>>>> then it's likely that such 'pulling' will complement the local
>>>> vocabulary more than Getty, as Getty was originally richer.
>>>>
>>>> Saying that the measurements are done on different dates doesn't really
>>>> capture the fundamental distinction.
>>>
>>>
>>> In the example, some of the measurements done on
>>> different dates return different values,
>>> which implies that some changes have occurred.
>>>
>>>> Now, maybe the measurement should indicate clearly which dataset is
>>>> the 'completed one', on which the gain is measured, and which is the
>>>> 'completing one'.
>>>
>>>
>>> You are right, I just realized that we were reading the example
>>> differently. I was taking for granted that dataset1 and dataset2
>>> were respectively the subject and the object of the linkset, and that
>>> is not necessarily the case.
>>> In our importing metric, the complemented dataset is the
>>> "void:subjectsTarget", whereas the completing one is the
>>> "void:objectsTarget". So we had probably better specify which
>>> dataset is the subject and which is the object of the linkset.
>>>
>>> That can be easily done, by replacing
>>>
>>> :myLinkset
>>>       a dcat:Dataset, void:Linkset ;
>>>       dcterms:title "A Linkset between My dataset 1 and My dataset 2";
>>>       void:linkPredicate skos:exactMatch ;
>>>       void:target :myDataset1 ;
>>>       void:target :myDataset2
>>>       .
>>>
>>> With
>>>
>>> :myLinkset
>>>       a dcat:Dataset, void:Linkset ;
>>>       dcterms:title "A Linkset from My dataset 1 to My dataset 2";
>>>       void:linkPredicate skos:exactMatch ;
>>>       void:subjectsTarget :myDataset1 ;
>>>       void:objectsTarget :myDataset2
>>> .
>>> If this is ok for you, I can change it.
>>>
>>>
>>>
>>> If you want to see the impact of myDataset1 on myDataset2, you should
>>> assess the importing of myLinkset's reciprocal,
>>> :MyLinkset2, which is the linkset obtained by inverting myLinkset,
>>> defined as
>>>
>>> :MyLinkset2
>>>       a dcat:Dataset, void:Linkset ;
>>>       dcterms:title "A Linkset from My dataset 2 to My dataset 1";
>>>       void:linkPredicate skos:exactMatch ;
>>>       void:subjectsTarget :myDataset2 ;
>>>       void:objectsTarget :myDataset1
>>>    .
>>>
>>> A side comment: in the LOD, MyLinkset2 and myLinkset are very often
>>> managed by distinct publishers, and in reality these linksets
>>> might not be reciprocal. That is why I think it is better to
>>> treat linksets as "oriented" and the two linksets as distinct. That is
>>> also consistent with the definition of linkset provided by VoID.
>>>
>>>>
>>>> In any case, I'm tempted to put back the 'dataset1' and 'dataset2' into the
>>>> identifiers of the measurement.
>>>>
>>>> What do you think?
>>>
>>>
>>> After my comments, are you still tempted to put back the dataset1 and
>>> dataset2? If yes, I would rather suggest introducing MyLinkset2 in
>>> the example, just to make it clearer that the linksets are oriented.
>>> Though I am not sure that level of complexity is worth it.
>>>
>>> Best,
>>> Riccardo
>>>
>>>
>>>
>>>>
>>>> Best,
>>>>
>>>> Antoine
>>>>
>>>>
>>>> On 14/07/16 21:28, Riccardo Albertoni wrote:
>>>>>
>>>>>
>>>>> Dear Vladimir,
>>>>> Thanks for your feedback.
>>>>>
>>>>> On 6 July 2016 at 17:40, Vladimir Alexiev <vladimir.alexiev@ontotext.com> wrote:
>>>>>>
>>>>>>
>>>>>> Bugs in example https://www.w3.org/TR/vocab-dqv/#ExpressQualLinkset
>>>>>> 5.10 Express the quality of a linkset:
>>>>>>
>>>>>> - uses property dqv:hasObservation, apparently inverse of
>>>>>> dqv:isMeasurementOf.
>>>>>>      However, no such property is defined in dqv.ttl.
>>>>>
>>>>>
>>>>>
>>>>> If you take a look at the in-progress version,
>>>>> http://w3c.github.io/dwbp/vocab-dqg.html, you will notice that
>>>>> dqv:hasObservation is no longer included in the document. We corrected
>>>>> this some time ago ;)
>>>>>
>>>>>>
>>>>>> - there is no difference whatsoever between
>>>>>> measurement_exactMatchAltLabelItDataset1 and
>>>>>> measurement_exactMatchAltLabelItDataset2,
>>>>>>      respectively measurement_exactMatchAltLabelEnDataset1 and
>>>>>> measurement_exactMatchAltLabelEnDataset2
>>>>>>      and measurement_exactMatchPrefLabelItDataset1 and
>>>>>> measurement_exactMatchprefLabelItDataset2.
>>>>>>      They both refer to :myLinkset, not to the one or another linked
>>>>>> datasets.
>>>>>
>>>>>
>>>>>
>>>>> The pairs you have mentioned are meant to be repeated measurements
>>>>> of the quality of the same linkset. Actually, in the "in progress"
>>>>> version, we have added dcterms:date, which makes that a little
>>>>> clearer. I have also added a sentence to point this out.
>>>>> I acknowledge that the name measurement_exactMatchAltLabelItDataset
>>>>> is quite confusing, as the measurements are about linksets and not
>>>>> the datasets, so I have removed the "dataset" part.
>>>>>
>>>>>
>>>>>>
>>>>>> - (minor) measurement_exactMatchprefLabelItDataset2 should use
>>>>>> capitalized "Pref"
>>>>>
>>>>>
>>>>> Done! Thanks.
>>>>>
>>>>>>
>>>>>> - defines this twice:
>>>>>>        qb:component [ qb:measure dqv:value;];
>>>>>>
>>>>> I think it isn't duplicated in the "in progress" version.
>>>>>
>>>>> Cheers,
>>>>> Riccardo
>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> This message has been scanned by E.F.A. Project and is believed to be
>>>>>> clean.
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>>
>>
>
>
>
Received on Friday, 29 July 2016 12:49:12 UTC
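
The point made in the thread, that the same links can yield a very different gain depending on which dataset is the completed one, can be illustrated with a small sketch. This is not the definition of :importingForPropertyPercentage from the DQV document; the function name, the toy measure (share of labels that are new to the subject dataset after importing from the object dataset), and the identifiers local:c1 and rich:c1 are all assumptions made up for illustration.

```python
def imported_label_share(links, subject_labels, object_labels):
    """Toy, hypothetical measure of an 'oriented' gain.

    links: list of (subject_concept, object_concept) pairs, read in the
           direction subjectsTarget -> objectsTarget.
    Returns the fraction of labels, over all linked subject concepts,
    that would be new after pulling in the object concept's labels.
    """
    gained = 0
    total = 0
    for subject, obj in links:
        existing = subject_labels.get(subject, set())
        incoming = object_labels.get(obj, set())
        gained += len(incoming - existing)   # labels the subject did not have
        total += len(existing | incoming)    # labels after the import
    return gained / total if total else 0.0


# Made-up data: a small monolingual vocabulary and a richer multilingual one.
local_vocab = {"local:c1": {"chair"}}
rich_vocab = {"rich:c1": {"chair", "chairs", "sedia", "chaise"}}

# Linkset from the local vocabulary (subject) to the rich one (object).
links = [("local:c1", "rich:c1")]
# The reciprocal linkset, obtained by inverting each link.
reciprocal = [(obj, subject) for subject, obj in links]

gain_local = imported_label_share(links, local_vocab, rich_vocab)
gain_rich = imported_label_share(reciprocal, rich_vocab, local_vocab)
print(gain_local, gain_rich)
```

With this toy data the local vocabulary gains three of its four resulting labels, while the richer vocabulary gains nothing, which is why the direction of the linkset has to be stated for the measurement to be interpretable.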