Re: bugs in 5.10 Express the quality of a linkset

Hi Riccardo,

The gain is 'oriented' so anything that will clarify that it is so will be beneficial.

I think this can be done in two ways, depending on how much 'hardcoding' of the direction in the metric you prefer.


1. Adding the statements void:subjectsTarget and void:objectsTarget on the Linkset, and declaring that the Metric :importingForPropertyPercentage can only be computed when these statements are present on the Linkset. What is measured should then be the completion of the dataset in void:subjectsTarget using the data from the dataset in void:objectsTarget.
Note that you may also want to reflect the same sort of 'hardcoding' of the direction of completion on the :completenessGain Dimension and the :complementationGain Category. A dimension that gathers metrics computed in different directions may be confusing.


2. Leaving the void:target statements on the Linkset as they are. But then the Metric (and the Measurement) needs two parameters: one specifying the completed dataset, and one specifying the completing dataset.
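
To make the two options concrete, here is a rough Turtle sketch, not a proposal for the final wording of the example. The example.org namespace, the measurement identifiers, :myUndirectedLinkset, the parameters :completedDataset and :completingDataset, and the dqv:value numbers are all made up for illustration; :onProperty and :onLanguage are assumed to be the two parameters the example already uses.

```turtle
@prefix :     <http://example.org/> .          # hypothetical namespace
@prefix dqv:  <http://www.w3.org/ns/dqv#> .
@prefix qb:   <http://purl.org/linked-data/cube#> .
@prefix void: <http://rdfs.org/ns/void#> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

# --- Option 1: the direction is carried by the Linkset itself ---
:myLinkset
    a dcat:Dataset, void:Linkset ;
    void:linkPredicate skos:exactMatch ;
    void:subjectsTarget :myDataset1 ;   # the completed dataset
    void:objectsTarget  :myDataset2 .   # the completing dataset

:measurement_option1                    # hypothetical identifier
    a dqv:QualityMeasurement ;
    dqv:computedOn :myLinkset ;
    dqv:isMeasurementOf :importingForPropertyPercentage ;
    :onProperty skos:altLabel ;         # assumed existing parameter
    :onLanguage "it" ;                  # assumed existing parameter
    dqv:value "0.5"^^xsd:double .       # placeholder value

# --- Option 2: the direction is carried by two extra parameters ---
:completedDataset  a qb:DimensionProperty ; rdfs:range dcat:Dataset .   # hypothetical
:completingDataset a qb:DimensionProperty ; rdfs:range dcat:Dataset .   # hypothetical

:myUndirectedLinkset                    # hypothetical, keeps plain void:target
    a dcat:Dataset, void:Linkset ;
    void:linkPredicate skos:exactMatch ;
    void:target :myDataset1 , :myDataset2 .

:measurement_option2                    # hypothetical identifier
    a dqv:QualityMeasurement ;
    dqv:computedOn :myUndirectedLinkset ;
    dqv:isMeasurementOf :importingForPropertyPercentage ;
    :onProperty skos:altLabel ;
    :onLanguage "it" ;
    :completedDataset  :myDataset1 ;
    :completingDataset :myDataset2 ;
    dqv:value "0.5"^^xsd:double .       # placeholder value
```

With option 1 the measurement keeps its two parameters and the reader has to look at the Linkset to find the direction; with option 2 every measurement carries four parameters.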

#2 is more elegant, and it avoids the theoretical hesitation on the dimension and the category.
But it adds two parameters, which makes the example much more complex (this example is already about two parameters).
So I'd rather go for #1.

Note that in any case, the following sentence will have to be made sharper by mentioning 'subject' and 'object':
"It quantifies the information gain when adding the preferred labels or the alternative labels of the concepts from a linked dataset to the descriptions of the concepts from the other dataset, which these concepts have been matched with a skos:exactMatch statement from the linkset."

Note also that we can avoid some of the theoretical thinking on the Dimension and the Category by removing the :complementationGain Category. I think it's not crucial to the example, and its name is not very clear.

Finally, since having different measurements on different days is not core to the example, I'd suggest removing these extra measurements. The example is already quite complex. What do you think?

cheers,

Antoine

On 26/07/16 16:27, Riccardo Albertoni wrote:
> Hi Antoine,
>
>
> On 25 July 2016 at 15:24, Antoine Isaac <aisaac@few.vu.nl> wrote:
>> Dear Riccardo, Vladimir,
>>
>> I'm looking again at the DQV after the updates on the linkset section,
>> triggered by Vladimir's comment.
>> And I'm quite puzzled. To me there was a key difference between say,
>> measurement_exactMatchAltLabelItDataset1 and
>> measurement_exactMatchAltLabelItDataset2.
>> What I understood is that the same linkset can indeed lead to quite different
>> 'completion gain' depending on which dataset the gain is evaluated on.
>>
>> To take a concrete example that will be familiar to Vladimir: say a linkset
>> aligns one local, monolingual vocabulary with Getty's Art and Architecture
>> Thesaurus, which has several languages and can have several labels for one
>> concept in one language.
>> If we try to pull the labels of one vocabulary into the other vocabulary,
>> then it's likely that such 'pulling' will complement more the local
>> vocabulary than Getty, as Getty was originally richer.
>>
>> Trying to say that the measurements are done at different dates doesn't really
>> represent the fundamental distinction.
>
> In the example, some of the measurements done at
> different dates return different values...
> Which implies some changes have occurred.
>
>> Now, maybe the measurement should indicate clearly which dataset is
>> the 'completed one' on which the gain is measured, and which is the 'completing
>> one'.
>
> You are right, I just realized that we were reading the example
> differently. I was taking for granted that dataset1 and dataset2
> were respectively the subject and the object of the linkset, and that
> is not necessarily the case.
> In our importing metric, the complemented dataset is the
> "void:subjectsTarget", whereas the completing one is the
> "void:objectsTarget". So we had probably better specify which
> dataset is the subject and which is the object of the linkset.
>
> That can be easily done, by replacing
>
> :myLinkset
>      a dcat:Dataset, void:Linkset ;
>      dcterms:title "A Linkset between My dataset 1 and My dataset 2";
>      void:linkPredicate skos:exactMatch ;
>      void:target :myDataset1 ;
>      void:target :myDataset2
>      .
>
> With
>
> :myLinkset
>      a dcat:Dataset, void:Linkset ;
>      dcterms:title "A Linkset from My dataset 1 to My dataset 2";
>      void:linkPredicate skos:exactMatch ;
>      void:subjectsTarget :myDataset1 ;
>      void:objectsTarget :myDataset2
> .
> If this is ok for you, I can change it.
>
>
>
> If you want to see the impact of myDataset1 on myDataset2, you should
> assess the importing of myLinkset's reciprocal,
> :MyLinkset2, which is the linkset we obtain by inverting myLinkset,
> defined as
>
> :MyLinkset2
>      a dcat:Dataset, void:Linkset ;
>      dcterms:title "A Linkset from My dataset 2 to My dataset 1";
>      void:linkPredicate skos:exactMatch ;
>      void:subjectsTarget :myDataset2 ;
>      void:objectsTarget :myDataset1
>   .
>
> A side comment: In the LOD, MyLinkset2 and myLinkset are very often
> managed by distinct publishers, and in reality these linksets
> might not be reciprocal. That is why I think it is better to
> treat linksets as "oriented" and the two linksets as distinct. That is
> also coherent with the definition of linkset provided by VoID.
>
>>
>> In any case, I'm tempted to put back the 'dataset1' and 'dataset2' into the
>> identifiers of the measurement.
>>
>> What do you think?
>
> After my comments, are you still tempted to put back the dataset1 and
> dataset2? If yes, I would rather suggest introducing MyLinkset2 in
> the example, just to make it clearer that the linksets are oriented.
> Though I am not sure that level of complexity is worth it.
>
> Best,
> Riccardo
>
>
>
>>
>> Best,
>>
>> Antoine
>>
>>
>> On 14/07/16 21:28, Riccardo Albertoni wrote:
>>>
>>> Dear Vladimir,
>>> Thanks for your feedback.
>>>
>>> On 6 July 2016 at 17:40, Vladimir Alexiev <vladimir.alexiev@ontotext.com>
>>> wrote:
>>>>
>>>> Bugs in example https://www.w3.org/TR/vocab-dqv/#ExpressQualLinkset
>>>> 5.10 Express the quality of a linkset:
>>>>
>>>> - uses property dqv:hasObservation, apparently inverse of
>>>> dqv:isMeasurementOf.
>>>>     However, no such property is defined in dqv.ttl.
>>>
>>>
>>> If you take a look at the in-progress version
>>> http://w3c.github.io/dwbp/vocab-dqg.html, you will notice that there is
>>> no dqv:hasObservation in the document anymore. We corrected
>>> this some time ago ;)
>>>
>>>>
>>>> - there is no difference whatsoever between
>>>> measurement_exactMatchAltLabelItDataset1 and
>>>> measurement_exactMatchAltLabelItDataset2,
>>>>     respectively measurement_exactMatchAltLabelEnDataset1 and
>>>> measurement_exactMatchAltLabelEnDataset2
>>>>     and measurement_exactMatchPrefLabelItDataset1 and
>>>> measurement_exactMatchprefLabelItDataset2.
>>>>     They both refer to :myLinkset, not to the one or another linked
>>>> datasets.
>>>
>>>
>>> The pairs you have mentioned are meant to be repeated measurements
>>> of the quality of the same linkset. Actually, in the "in progress"
>>> version, we have added dcterms:date, which makes that a little
>>> clearer. I have also added a sentence to point this out.
>>> I acknowledge that the name measurement_exactMatchAltLabelItDataset
>>> is quite confusing, as the measurements are about linksets and not
>>> the datasets, so I have removed the "dataset" part.
>>>
>>>
>>>>
>>>> - (minor) measurement_exactMatchprefLabelItDataset2   should use
>>>> capitalized
>>>> "Pref"
>>>
>>> Done! Thanks.
>>>
>>>>
>>>> - defines this twice:
>>>>       qb:component [ qb:measure dqv:value;];
>>>>
>>> I think it isn't doubled in the "in progress" version.
>>>
>>> Cheers,
>>> Riccardo
>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>
>
>

Received on Tuesday, 26 July 2016 17:48:57 UTC