Re: bugs in 5.10 Express the quality of a linkset from Riccardo Albertoni on 2016-07-27 (public-dwbp-wg@w3.org from July 2016)

From: Riccardo Albertoni <albertoni@ge.imati.cnr.it>
Date: Wed, 27 Jul 2016 18:55:21 +0200
To: Antoine Isaac <aisaac@few.vu.nl>
Cc: public-dwbp-comments <public-dwbp-comments@w3.org>, Vladimir Alexiev <vladimir.alexiev@ontotext.com>, Public DWBP WG <public-dwbp-wg@w3.org>
Message-ID: <CAOHhXmSevwJqkDFWB_emOv5QqBitiTnrQa05Ba7OZBW+Y9SQ3A@mail.gmail.com>
Hi Antoine,
I revised the example following "way 1"; see my last push.
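
As a rough sketch of the shape that "way 1" gives the example (not the exact text of the push): the linkset declares its direction, and a measurement computed on it is then read as the gain for the void:subjectsTarget dataset. The base namespace for ":", the measurement name, and the value below are illustrative assumptions.

```turtle
@prefix :        <http://example.org/> .          # assumed base namespace
@prefix dcat:    <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix dqv:     <http://www.w3.org/ns/dqv#> .
@prefix skos:    <http://www.w3.org/2004/02/skos/core#> .
@prefix void:    <http://rdfs.org/ns/void#> .
@prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .

# The linkset is oriented: myDataset1 is completed with data from myDataset2.
:myLinkset
    a dcat:Dataset, void:Linkset ;
    dcterms:title "A Linkset from My dataset 1 to My dataset 2" ;
    void:linkPredicate skos:exactMatch ;
    void:subjectsTarget :myDataset1 ;
    void:objectsTarget :myDataset2 .

# Metric, dimension and category as named in this thread.
:importingForPropertyPercentage
    a dqv:Metric ;
    dqv:inDimension :completenessGain .

:completenessGain
    a dqv:Dimension ;
    dqv:inCategory :complementationGain .

:complementationGain
    a dqv:Category .

# Hypothetical measurement: the name and the value are placeholders,
# not results taken from the actual example.
:measurement_exactMatchAltLabelIt
    a dqv:QualityMeasurement ;
    dqv:computedOn :myLinkset ;
    dqv:isMeasurementOf :importingForPropertyPercentage ;
    dqv:value "0.5"^^xsd:double .
```

With the direction fixed on the linkset itself, the measurement needs no extra parameter to say which dataset is the completed one.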


On 26 July 2016 at 19:48, Antoine Isaac <aisaac@few.vu.nl> wrote:
> Hi Riccardo,
>
> The gain is 'oriented' so anything that will clarify that it is so will be
> beneficial.
>
> I think this can be done in two ways, depending on which level of
> 'hardcoding' of the direction in the metric you prefer.
>
>
> 1. Adding the statements void:subjectsTarget and void:objectsTarget on the
> Linkset, and declaring that computing the Metric
> :importingForPropertyPercentage requires these statements to be present
> on the Linkset. What is measured should then be the completion of
> the dataset in void:subjectsTarget using the data from the dataset in
> void:objectsTarget.
> Note that you may also want to reflect the same sort of 'hardcoding' of the
> direction of completion on the :completenessGain Dimension and the
> :complementationGain Category. A dimension that gathers metrics that are
> computed in different directions may be confusing.
>
>
> 2. Leaving the void:target statements on the Linkset as they are. But then
> the Metric (and the Measurement) needs to have two parameters - one for
> specifying the completed dataset, and one for the completing one.
>
> #2 is more elegant, and it avoids the theoretical hesitation on the
> dimension and the category.
> But it adds two parameters, which makes the example much more complex (this
> example is already about two parameters).
> So I'd rather go for #1.
>
> Note that in any case, the following sentence will have to be made sharper
> by mentioning 'subject' and 'object':
> "It quantifies the information gain when adding the preferred labels or the
> alternative labels of the concepts from a linked dataset to the descriptions
> of the concepts from the other dataset, which these concepts have been
> matched with a skos:exactMatch statement from the linkset."

I've tried to make this sentence sharper ;)

>
> Note also that we can avoid some of the theoretical thinking on the
> Dimension and the Category by removing the :complementationGain Category. I
> think it's not crucial to the example, and its name is not very clear.
>

I am not sure that deleting it would leave us with a clearer example, so I
have left it.

> Finally, as the fact of having different measurements on different days is
> not core to the example, I'd suggest to remove these extra measurements. The
> example is quite complex, already. What do you think?

OK, I have deleted the repeated measurements.
Cheers,
 Riccardo

>
> cheers,
>
> Antoine
>
>
> On 26/07/16 16:27, Riccardo Albertoni wrote:
>>
>> Hi Antoine,
>>
>>
>> On 25 July 2016 at 15:24, Antoine Isaac <aisaac@few.vu.nl> wrote:
>>>
>>> Dear Riccardo, Vladimir,
>>>
>>> I'm looking again at the DQV after the updates on the linkset section,
>>> triggered by Vladimir's comment.
>>> And I'm quite puzzled. To me there was a key difference between say,
>>> measurement_exactMatchAltLabelItDataset1 and
>>> measurement_exactMatchAltLabelItDataset2.
>>> What I understood is that the same linkset can indeed lead to quite
>>> different 'completion gain' depending on which dataset the gain is
>>> evaluated on.
>>>
>>> To take a concrete example that will be familiar to Vladimir: say a
>>> linkset
>>> aligns one local, monolingual vocabulary with Getty's Art and
>>> Architecture
>>> Thesaurus, which has several languages and can have several labels for
>>> one
>>> concept in one language.
>>> If we try to pull the labels of one vocabulary into the other vocabulary,
>>> then it's likely that such 'pulling' will complement the local
>>> vocabulary more than Getty, as Getty was originally richer.
>>>
>>> Saying that the measurements are done at different dates doesn't really
>>> represent the fundamental distinction.
>>
>>
>> In the example, some of the measurements done at different dates
>> return different values, which implies that some changes have occurred.
>>
>>> Now, maybe the measurement should indicate clearly which dataset is
>>> the 'completed one', on which the gain is measured, and which is the
>>> 'completing one'.
>>
>>
>> You are right, I just realized that we were reading the example
>> differently. I was taking for granted that dataset1 and dataset2
>> were respectively the subject and the object of the linkset, and that
>> is not necessarily the case.
>> In our importing metric, the complemented dataset is the
>> "void:subjectsTarget", whereas the completing one is the
>> "void:objectsTarget". So we had probably better specify which
>> dataset is the subject and which is the object of the linkset.
>>
>> That can be easily done, by replacing
>>
>> :myLinkset
>>      a dcat:Dataset, void:Linkset ;
>>      dcterms:title "A Linkset between My dataset 1 and My dataset 2";
>>      void:linkPredicate skos:exactMatch ;
>>      void:target :myDataset1 ;
>>      void:target :myDataset2
>>      .
>>
>> With
>>
>> :myLinkset
>>      a dcat:Dataset, void:Linkset ;
>>      dcterms:title "A Linkset from My dataset 1 to My dataset 2";
>>      void:linkPredicate skos:exactMatch ;
>>      void:subjectsTarget :myDataset1 ;
>>      void:objectsTarget :myDataset2
>> .
>> If this is ok for you, I can change it.
>>
>>
>>
>> If you want to see the impact of myDataset1 on myDataset2, you should
>> assess the importing of myLinkset's reciprocal, :MyLinkset2, which is
>> the linkset we obtain by inverting myLinkset, defined as
>>
>> :MyLinkset2
>>      a dcat:Dataset, void:Linkset ;
>>      dcterms:title "A Linkset from My dataset 2 to My dataset 1";
>>      void:linkPredicate skos:exactMatch ;
>>      void:subjectsTarget :myDataset2 ;
>>      void:objectsTarget :myDataset1
>>   .
>>
>> A side comment: in the LOD, MyLinkset2 and myLinkset are very often
>> managed by distinct publishers, and in reality these linksets
>> might not be reciprocal. That is why I think it is better to
>> treat linksets as "oriented" and the two linksets as distinct. That is
>> also consistent with the definition of linkset provided by VoID.
>>
>>>
>>> In any case, I'm tempted to put back the 'dataset1' and 'dataset2' into
>>> the
>>> identifiers of the measurement.
>>>
>>> What do you think?
>>
>>
>> After my comments, are you still tempted to put back the dataset1 and
>> dataset2? If yes, I would rather suggest introducing MyLinkset2 in
>> the example, just to make it clearer that the linksets are oriented.
>> Though I am not sure that level of complexity is worth it.
>>
>> Best,
>> Riccardo
>>
>>
>>
>>>
>>> Best,
>>>
>>> Antoine
>>>
>>>
>>> On 14/07/16 21:28, Riccardo Albertoni wrote:
>>>>
>>>>
>>>> Dear Vladimir,
>>>> Thanks for your feedback.
>>>>
>>>> On 6 July 2016 at 17:40, Vladimir Alexiev
>>>> <vladimir.alexiev@ontotext.com>
>>>> wrote:
>>>>>
>>>>>
>>>>> Bugs in example https://www.w3.org/TR/vocab-dqv/#ExpressQualLinkset
>>>>> 5.10 Express the quality of a linkset:
>>>>>
>>>>> - uses property dqv:hasObservation, apparently inverse of
>>>>> dqv:isMeasurementOf.
>>>>>     However, no such property is defined in dqv.ttl.
>>>>
>>>>
>>>>
>>>> If you take a look at the in-progress version
>>>> http://w3c.github.io/dwbp/vocab-dqg.html, you will notice that there is
>>>> no dqv:hasObservation included in the document anymore. We corrected
>>>> this some time ago ;)
>>>>
>>>>>
>>>>> - there is no difference whatsoever between
>>>>> measurement_exactMatchAltLabelItDataset1 and
>>>>> measurement_exactMatchAltLabelItDataset2,
>>>>>     respectively measurement_exactMatchAltLabelEnDataset1 and
>>>>> measurement_exactMatchAltLabelEnDataset2
>>>>>     and measurement_exactMatchPrefLabelItDataset1 and
>>>>> measurement_exactMatchprefLabelItDataset2.
>>>>>     They both refer to :myLinkset, not to one or the other of the linked
>>>>> datasets.
>>>>
>>>>
>>>>
>>>> The pairs you have mentioned are meant to be repeated measurements
>>>> of the quality of the same linkset. Actually, in the "in progress"
>>>> version, we have added dcterms:date, which makes that a little
>>>> clearer. I have also added a sentence to point this out.
>>>> I acknowledge that the name measurement_exactMatchAltLabelItDataset
>>>> is quite confusing, as the measurements are about linksets and not
>>>> about the datasets, so I have removed the "dataset" part.
>>>>
>>>>
>>>>>
>>>>> - (minor) measurement_exactMatchprefLabelItDataset2   should use
>>>>> capitalized
>>>>> "Pref"
>>>>
>>>>
>>>> Done! Thanks.
>>>>
>>>>>
>>>>> - defines this twice:
>>>>>       qb:component [ qb:measure dqv:value;];
>>>>>
>>>> I think it isn't doubled in the "in progress" version.
>>>>
>>>> Cheers,
>>>> Riccardo
>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>>
>



-- 
----------------------------------------------------------------------------
Riccardo Albertoni
Istituto per la Matematica Applicata e Tecnologie Informatiche "Enrico Magenes"
Consiglio Nazionale delle Ricerche
via de Marini 6 - 16149 GENOVA - ITALIA
tel. +39-010-6475624 - fax +39-010-6475660
e-mail: Riccardo.Albertoni@ge.imati.cnr.it
Skype: callto://riccardoalbertoni/
LinkedIn: http://www.linkedin.com/in/riccardoalbertoni
www: http://www.imati.cnr.it/
http://purl.oclc.org/NET/riccardoAlbertoni
FOAF:http://purl.oclc.org/NET/RiccardoAlbertoni/foaf
Received on Wednesday, 27 July 2016 16:55:54 UTC