Re: Use of "has" or "is" in DPV's properties from Pat McBennett on 2022-03-31 (public-dpvcg@w3.org from March 2022)

From: Pat McBennett <patm@inrupt.com>
Date: Thu, 31 Mar 2022 15:40:15 +0100
To: "Harshvardhan J. Pandit" <me@harshp.com>
Cc: Beatriz Esteves <besteves@delicias.dia.fi.upm.es>, public-dpvcg@w3.org
Message-ID: <CABgQ8mLfLqF6ZRxxq4m7xwP5OQQxj6WE-4gYujL5dKd3hOb01w@mail.gmail.com>
Hiya Harsh,

So yeah, 100%. I really think we're both on exactly the same page here -
i.e., both just trying to determine what the current consensus might be
across the broader Semantic Web community in relation to the use of term
name prefixes.

*>>> @Pat - would you be willing to do this? *
Yeah, absolutely. By "*the semantic web mailing list*", I assume you mean
this: https://lists.w3.org/Archives/Public/semantic-web/ ?

About Rob's concern (i.e., "...*that some languages do not have the
upper/lower case characters we use in English/Western languages*"), can
anyone provide a couple of simple examples, as I'm not sure I understand
(at least not in the context of DPV, where the lingua franca (for term
names) has already been agreed to be English (and we are talking here about
the names of vocab terms, right? - since as Harsh says, the values for
`rdfs:label` or `rdfs:comment` or whatever predicates *associated* with
these vocab terms can provide whatever values they want in whatever local,
non-English/Western languages they want, right?)). Anyway, I'm sure a
couple of simple examples might help highlight what I'm probably missing
here...

On Harsh's point (i.e., "*...one would have to 'create' the label to
distinguish between a label for Class and a Property with the same name
i.e. class would be 'Concept' and property would have to be 'has Concept'*").
I agree here, although I would propose having the labels (as in
`rdfs:label`, right?), in this example, as "Concept" for the property and
"Class of Concept" for the Class.

Now, that's just a convention that I've been using personally, as *I do*
think it's important for labels to be as clear and as unambiguous as
possible (and therefore clearly differentiating between a Class's label and
its associating-Property's label). But (somewhat worryingly) it's not a
convention I've ever noticed adopted anywhere else (which makes me fear
that I'm perhaps missing something important). I'll come back to the
example of DCAT below though - as that's an interesting vocabulary for sure!

*>>> However, should there be consistency between multi-lingual labels*
I don't follow what you mean here. For me, I have no trouble simply
translating any labels as appropriate, which could be very independent and
different (and therefore may appear 'inconsistent' perhaps), but that's
fine (which is why I think I may not be following what you mean), e.g.:

    ex:Concept a rdfs:Class ;
      rdfs:label "Class of Concept"@en ;
      rdfs:label "Clase de Concepto"@es ;
      rdfs:label "All the yokes (in Dublin English, everyting's a 'yoke' (and 'th' is pronounced 't'!))"@en-dublin .

So can you expand a little on what you mean by "*consistency between
multi-lingual labels*"...?

*>> DCAT by way of example.*
Yeah, I love DCAT as an example vocab. But in fact on close inspection,
there appear to be quite a number of inconsistencies and issues with some
of their term names, and their choices for `rdfs:label` values.

So the major change I'd suggest making to DCAT (relevant to this discussion
anyway) is my point above about providing 'better' (i.e., more useful,
helpful, and unambiguous) labels for their Classes and Properties, for
instance:

  dcat:Catalog a rdfs:Class ;
    rdfs:label "Class of Catalogs"@en .

  dcat:catalog a rdf:Property ;
    rdfs:label "Catalog"@en .


At first I thought that the label values for both the Class `dcat:Catalog`
and the Property `dcat:catalog` were both "Catalog". But in fact, the
English label for `dcat:Catalog` is `Catalog` (both with a capital 'C'),
and the English label for the `dcat:catalog` property is `catalog` (both
with a lowercase 'c').

Personally I don't get that at all - I don't see how that's useful or
helpful to anyone but a pure SemWeb person who knows the convention of
Class names starting with a capital letter, and Property names with
lowercase letters. So I wonder if that was a deliberate choice, or just an
oversight, or perhaps just the result of automated processing (I don't
think it's from automated processing, as other examples show further
inconsistencies, which I'll come to below).

Or perhaps it's their interpretation of the semantics of `rdfs:label`,
along the lines of Schema.org...

...this literally came up 3 years ago when I was discussing labels with Dan
Brickley and others involved with Schema.org. They expressly interpret the
`rdfs:label` predicate as "*...effectively defining a short name for a
term's defining URI*" (see public mail
<https://lists.w3.org/Archives/Public/public-schemaorg/2019Apr/0010.html>
here).

This is *not* how I've always interpreted `rdfs:label`, which is "*As the
creator of this property in this vocabulary, I 'suggest' this string as a
useful very short description of this property to help humans understand
what the concept behind it is (and should not necessarily have anything to
do with its URI)*". I use my interpretation (in the vocabs I create) to
then reuse those `rdfs:label` values in my user interfaces, as the actual
text label in front of text entry fields for example (and I always try and
provide multi-lingual `rdfs:label` values too, so that those user
interfaces can more easily become multi-lingual (kinda for free, since the
UI is actually driven directly from vocabs)). In the very same vein, I use
the `rdfs:comment` values as the contents of popup UI tooltips when the
user hovers their mouse over the text entry labels and textboxes (and I
provide multi-lingual values for them too, of course).
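
To make that UI usage concrete, here's a rough sketch (in Python) of picking
a label for a form field by language, with a fallback. The `ex:` terms, the
label table, and the `ui_label` helper are all hypothetical, and a real UI
would read these values from the vocab itself rather than from a
hand-written dict:

```python
# Hypothetical label data, as it might be read from a vocab's rdfs:label values.
LABELS = {
    "ex:concept": {"en": "Concept", "es": "Concepto"},
    "ex:Concept": {"en": "Class of Concept", "es": "Clase de Concepto"},
}


def ui_label(term, lang, labels=LABELS, fallback="en"):
    """Return the label for `term` in `lang`, falling back to `fallback`,
    and finally to the term name itself if no label exists at all."""
    by_lang = labels.get(term, {})
    return by_lang.get(lang) or by_lang.get(fallback) or term


print(ui_label("ex:Concept", "es"))
print(ui_label("ex:concept", "fr"))
```

The same lookup would work for `rdfs:comment` values used as tooltips; only
the table changes.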

But it seems that DCAT don't have that Schema.org interpretation of
`rdfs:label` at all, as shown by `dcat:qualifiedRelation` which has the
English label of "qualified relation", which in my view doesn't actually
align with any of the above interpretations - i.e., it's clearly not just
the local component of the term's IRI (as that would be
"qualifiedRelation"), but it's also not great for humans either (for which
I'd suggest "Qualified relation" would be best). So it seems (with this
example at least) that they've tried to capture in the label value both the
'Class-or-Property-ness' of the term (by using the case of the label's
first letter), and then also inserted a space (to make it more
human-friendly). But personally, I don't think that's a good choice.
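
As an illustration of the kind of label I'd prefer, here's a small sketch
(Python again) that derives a sentence-case label from a term's local name.
The `humanize` function is my own hypothetical helper, not something DCAT or
any tooling provides, and I'd only treat its output as a starting point for
a hand-written label:

```python
import re


def humanize(local_name):
    """Turn a camelCase or PascalCase local name into a sentence-case label."""
    # Insert a space at each boundary between a lowercase letter or digit
    # and the uppercase letter that follows it.
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", local_name)
    # Sentence case: capitalise the first letter, lowercase the rest.
    return spaced[:1].upper() + spaced[1:].lower()


print(humanize("qualifiedRelation"))
```

Note that lowercasing everything after the first letter would mangle
acronyms, which is one more reason labels deserve a human's attention.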

(And then there's `dcat:Resource` - the English label for that is actually
"Catalogued resource", which just seems to be a bug :) (i.e., it should
just be "Resource" (which would also then avoid debates on the spelling of
Catalogued vs Cataloged :) !), or else the term's IRI should be
`dcat:CataloguedResource`)).

(And then there's `dcat:hadRole`. The English label for that is actually
"hadRole" (with no space!), so do their English label values use spaces
between words or not? And why is it 'had' instead of 'has' - was that an
oversight, or was it deliberate?)

(And then there's the casing inconsistency between the labels for
`dcat:DataService` (English label is "Data service", lowercase 's') and
`dcat:CatalogRecord` (whose English label is "Catalog Record", uppercase
'R'))


So Harsh, on your points:
  "[DCAT] either have (i) exact same label for classes and concepts;"
  Well, no, not the '*exact same*' labels at all (e.g., even in the case of
"Catalog" and "catalog").

 "or (ii) do not have the same language labels across classes and
properties. "
  I don't follow what you mean here - they consistently '*do not* have the
same language labels across classes and properties', i.e., they differ by
just the case of the first letter (in the 2 cases of 'dcat:Catalog' and
'dcat:catalog', and 'dcat:Distribution' and 'dcat:distribution'), or they
differ more broadly in words (in the case of 'CatalogRecord' and 'record').
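
For what it's worth, pairs of labels that differ only by case are easy to
find mechanically. Here's a minimal sketch, assuming the labels have already
been pulled out into a simple mapping (the values below are just the ones
quoted in this mail, copied by hand, and `case_only_clashes` is a
hypothetical helper):

```python
from collections import defaultdict

# Labels as quoted in this mail (illustrative only, not read from the DCAT Turtle).
DCAT_LABELS = {
    "dcat:Catalog": "Catalog",
    "dcat:catalog": "catalog",
    "dcat:CatalogRecord": "Catalog Record",
    "dcat:DataService": "Data service",
}


def case_only_clashes(labels):
    """Group terms whose labels are identical once case is ignored."""
    groups = defaultdict(list)
    for term, label in labels.items():
        groups[label.casefold()].append(term)
    # Keep only the labels shared by more than one term.
    return {key: sorted(terms) for key, terms in groups.items() if len(terms) > 1}


print(case_only_clashes(DCAT_LABELS))
```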

But perhaps all these DCAT issues are actually being resolved in the v3
(proposed) you mentioned. I see the HTML of the v3 spec (here
<https://www.w3.org/TR/vocab-dcat-3/>) - but how do I see the Turtle for
this new version (since the namespace IRI is still
http://www.w3.org/ns/dcat#, which right now only gives me back the v2
Turtle, right?!)

Great discussion though - thanks!

Cheers,

Pat.


*Pat McBennett*, Technical Architect

Contact  | patm@inrupt.com

Connect | WebID <http://pmcb55.inrupt.net/profile/card#me>, GitHub
<https://github.com/pmcb55>

Explore  | www.inrupt.com




On Mon, Mar 28, 2022 at 9:30 PM Harshvardhan J. Pandit <me@harshp.com>
wrote:

> Hi Pat, All.
> IMO we can take this discussion as a concrete proposal based on Pat's
> (well articulated) argument to have consistency in naming by removing
> property name prefixes i.e. hasDataSubject becomes dataSubject.
>
> However, we're not necessarily a group of semantic-web experts in terms
> of focus. Perhaps it would be better to have this discussion (also) on
> the semantic web mailing list and report consensus or salient points
> back here - if any? This way, we can take advantage of a wider community
> who have authored specifications using both styles and have surely at
> some point already discussed this.
>
> @Pat - would you be willing to do this? (IMHO you can edit and forward
> your existing email - it makes a good argument)
>
> (note: this does not exclude people discussing this here)
>
> ---------------------------------------
>
> My personal opinion:
> I like the consistency aspect of naming. I don't like drastic big
> changes, but if there is a strong argument that this improves DPV before
> it gets to v1 later this year, then I'm for it.
>
> My only concern against non-prefixed naming is the labelling in
> languages that don't have prefixes or capital letters (as Rob mentioned
> earlier). In these cases, one would have to 'create' the label to
> distinguish between a label for Class and a Property with the same name
> i.e. class would be 'Concept' and property would have to be 'has
> Concept'. To me this is fine - the IRIs would be in English, the labels
> can be structured any which way for aesthetic, accuracy, or correctness
> - they don't have to be equivalent to IRIs. However, should there be
> consistency between multi-lingual labels - I don't know.
>
> I checked DCAT v2 (standard) and v3 (proposed) - they have non-prefixed
> IRIs and multi-lingual labels - and they either have (i) exact same
> label for classes and concepts; or (ii) do not have the same language
> labels across classes and properties. See
> https://github.com/w3c/dxwg/blob/gh-pages/dcat/rdf/dcat2.ttl
>
> Given that DCAT is quite well known and actively discussed/developed, I
> think it's a good indication of my concern not being important enough.
> But I wouldn't consider myself as enough of a semantic-web expert to
> make decisions on this alone.
>
> ---------------------------------------
>
> P.S. I remembered that these prefix-based notations are present partly
> because DPV started with the concepts/properties from SPECIAL
> vocabularies https://specialprivacy.ercim.eu/langs/usage-policy You can
> see the properties there are of the form `hasPurpose` and `hasStorage`.
>
> On 28/03/2022 15:13, Pat McBennett wrote:
>
> --
> ---
> Harshvardhan J. Pandit, Ph.D
> Research Fellow
> ADAPT Centre, Trinity College Dublin
> https://harshp.com/
>

Received on Thursday, 31 March 2022 14:41:40 UTC