- From: Harshvardhan J. Pandit <me@harshp.com>
- Date: Thu, 31 Mar 2022 16:38:28 +0100
- To: Pat McBennett <patm@inrupt.com>
- Cc: Beatriz Esteves <besteves@delicias.dia.fi.upm.es>, public-dpvcg@w3.org
Hi. Replies to selected parts are inline. For Pat's full email, see
https://lists.w3.org/Archives/Public/public-dpvcg/2022Mar/0027.html

On 31/03/2022 15:40, Pat McBennett wrote:
>
> */>>> @Pat - would you be willing to do this? /*
> Yeah, absolutely. By "/the semantic web mailing list/", I assume you
> mean this: https://lists.w3.org/Archives/Public/semantic-web/ ?

Yes.

>
> About Rob's concern (i.e., ".../that some languages do not have the
> upper/lower case characters we use in English/Western languages/"), can
> anyone provide a couple of simple examples, as I'm not sure I understand
> (at least not in the context of DPV, where the lingua franca (for term
> names) has already been agreed to be English (and we are talking here
> about the names of vocab terms, right? - since as Harsh says, the values
> for `rdfs:label` or `rdfs:comment` or whatever predicates
> /associated/ with these vocab terms can provide whatever values they
> want in whatever local, non-English/Western languages they want,
> right?)). Anyway, I'm sure a couple of simple examples might help
> highlight what I'm probably missing here...

AFAIK - Languages that don't have capital letters: Sanskrit or Tamil
language families, Japanese, Mandarin, Korean, Arabic (in general),
Hebrew, Semitic language families.

>
> On Harsh's point (i.e., "/...one would have to 'create' the label to
> distinguish between a label for Class and a Property with the same name
> i.e. class would be 'Concept' and property would have to be 'has
> Concept'/"). I agree here, although I would propose having the labels
> (as in `rdfs:label`, right?), in this example, as "Concept" for the
> property and "Class of Concept" for the Class.

Yes, label as in any annotation adding a name, e.g. rdfs:label or
skos:prefLabel or dct:title or foaf:name.
Though your example is not correct for what I was saying.
The label for a class representing a concept would be 'concept' and not
'class of concept', and that for the property would be 'has concept' in
cases where there are no capital letters and there is a need to
distinguish between the labels of a class and a property.
For IRIs, by convention, we use English (or rather the Latin or Roman
script) and HTTP, both of which support capitals - so this issue doesn't
apply.

>
> */>>> However, should there be consistency between multi-lingual labels/*
> I don't follow what you mean here. For me, I have no trouble simply
> translating any labels as appropriate, which could be very independent
> and different (and therefore may appear 'inconsistent' perhaps), but
> that's fine (which is why I think I may not be following what you mean),
> e.g.:
>
> ex.Concept a rdfs:Class ;
>   rdfs:label "Class of Concept"@en ;
>   rdfs:label "Clase de Concepto"@es ;
>   rdfs:label "All the yokes (in Dublin English, everyting's a 'yoke' (and 'th' is pronounced 't'!))"@en-dublin .
>
> So can you expand a little on what you mean by "/*consistency* between
> multi-lingual labels/"...?

Try it in a language which doesn't support capitals. The label of a
class is "Concept" (we write Class of Concept only if we're building an
ontological representation, otherwise just the title), and that of its
associated property is "concept" (differentiated using capitals). In
languages without this distinction, the following are what Class and
Property labels look like:
Hindi: अवधारणा, अवधारणा
Japanese: 概念, 概念

So if such languages wish to distinguish between class and property
labels, they must add a prefix or suffix which won't be present in
languages that differentiate using capitals. For example, in Japanese,
the equivalent of "uses concept" as a different label than "Concept" is
"コンセプトを使用する" (as per machine translation), but the English label
would still be just "concept" unless everyone applies "uses concept" as
the label text in translation.
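
As a rough illustration of the Hindi and Japanese example, here is a minimal Python sketch. It assumes labels are plain Unicode strings; the helper names `has_case` and `labels_differ_only_by_case` are hypothetical and not part of DPV or DCAT tooling.

```python
def has_case(text: str) -> bool:
    """True when the text contains at least one cased character."""
    return text.upper() != text.lower()


def labels_differ_only_by_case(class_label: str, property_label: str) -> bool:
    """True when the labels are distinct strings that become identical
    once case is ignored, i.e. capitalisation alone tells them apart."""
    return (
        class_label != property_label
        and class_label.casefold() == property_label.casefold()
    )


# (class label, property label) pairs taken from the examples above.
examples = {
    "en": ("Concept", "concept"),
    "hi": ("अवधारणा", "अवधारणा"),
    "ja": ("概念", "概念"),
}

for lang, (class_label, property_label) in examples.items():
    print(
        lang,
        "script has case:", has_case(class_label),
        "| distinguishable by case:",
        labels_differ_only_by_case(class_label, property_label),
    )
```

Only the English pair is distinguishable by capitalisation; for the Hindi and Japanese pairs the two labels are the same string, so any distinction would have to come from added words.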
This is what I mean by having consistency across languages for a label.

>
> *>> DCAT by way of example.*
> Yeah, I love DCAT as an example vocab. But in fact on close inspection,
> there appear to be quite a number of inconsistencies and issues with
> some of their terms names, and their choices for `rdfs:label` values.
>
> So the major change I'd suggest making to DCAT (relevant to this
> discussion anyway) is my point above about providing 'better' (i.e.,
> more useful, helpful, and unambiguous) labels for their Classes and
> Properties, for instance:
>
> dcat:Catalog a rdfs:Class ;
>   rdfs:label "Class of Catalogs"@en .
>
> dcat:catalog a rdf:Property ;
>   rdfs:label "Catalog"@en .

Depends on what you're modelling. If you're representing an ontology
that models just classes, this will look fine. But if you're using them
to model data, this is not a good representation. For example, if I want
to model that you're using data for Marketing, I'm not going to label
that concept "Class of Marketing", but just "Marketing". Similarly, in
DCAT, the label is "Catalog" rather than "Class of Catalog". This
reflects how these concepts are used in the real world in terms of
labels.

>
> At first I thought that the label values for both the Class
> `dcat:Catalog` and the Property `dcat:catalog` where both "Catalog". But
> in fact, the English label for `dcat:Catalog` is `Catalog` (both capital
> 'C'), and the English label for the `dcat:catalog` property is `catalog`
> (both lowercase 'c'))
>
> So Harsh, on your points:
> "[DCAT] either have (i) exact same label for classes and concepts;"
> Well, no, not the '/exact same/' labels at all (e.g., even in the
> case of "Catalog" and "catalog").
>
> "or (ii) do not have the same language labels across classes and
> properties."
> I don't follow what you mean here - they consistently '/do not/ have
> the same language labels across classes and properties', i.e., they
> differ by just the case of the first letter (in the 2 cases of
> 'dcat:Catalog' and 'dcat:catalog', and 'dcat:Distribution' and
> 'dcat:distribution'), or they differ more broadly in words (in the case
> of 'CatalogRecord' and 'record').

In the context of your proposal, the issue was the prefix before
property names. In DCAT, there is no such prefix; the IRI and label for
class and property are the same, i.e. 'catalog' - just differentiated by
capitals.
Now if you look in the DCAT file for labels in languages that don't have
capitals, you will find either the EXACT same label for both class and
property, or no label for that language. Here is an example:

dcat:Catalog has the following language labels which dcat:catalog does
not have - Arabic, Japanese (don't support capitals)
dcat:Dataset has the following language labels which dcat:dataset also
has, where they are EXACTLY the SAME, and the language does not have
capitals: "قائمة بيانات"@ar , "データセット"@ja

>
> But perhaps all these DCAT issues are actually being resolved in the v3
> (proposed) you mentioned. I see the HTML of the v3 spec (here
> <https://www.w3.org/TR/vocab-dcat-3/>) - but how do I see the Turtle for
> this new version (since the namespace IRI is still
> http://www.w3.org/ns/dcat#, which right now
> only gives me back the v2 Turtle, right?!)

https://github.com/w3c/dxwg/blob/e89e7a5f313cc30b7c4504c1ad9bbadd01e88609/dcat/rdf/dcat3.ttl
Same labels as v2.

Now after all this, let's also consider why we have labels, and how we
use them.

1) Documentation e.g. on a website for that concept - here the
information is almost always accompanied by the type of that concept
e.g. "Concept", is a Class, has definition ...
   - So the same label for a property and a class is not a problem
because there is context that provides differentiation
2) Creating a textual representation of a triple i.e. <subject label>
<property label> <object label>
   - Here an example could be of the form: "Catalog" "dataset" "Dataset"
- which seems confusing.
   - But properties are usually used with instances that have their own
labels, so a better example is: "This Catalog" "dataset" "That Dataset"
- which isn't how we speak, but better than before
   - If the purpose is to generate such text, the prefixes are great,
i.e. "This Catalog" "has dataset" "That Dataset"
   - But if you're used to looking at this in JSON or whatever code
form, it's a structure, so the prefix isn't needed, i.e. { Catalog
--dataset--> Dataset } seems comprehensible even if I haven't used any
specific language here because we know implicitly what graph this is
representing.

So this is, I think, why the conventions ended up the way they did. If
there is evidence that one or the other is considered best practice by
the community, then we have the option of adopting that best practice.
Otherwise the additional effort of redrafting all properties, breaking
compatibility without any benefit, and potentially someone suggesting
the opposite in the future are reasons not to do this.

Regards,
--
---
Harshvardhan J. Pandit, Ph.D
Research Fellow
ADAPT Centre, Trinity College Dublin
https://harshp.com/
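
The second use of labels described in the message (building a textual representation of a triple from subject, property, and object labels) can be sketched in Python as follows. This is only an illustration under stated assumptions: the `ex:` instance identifiers, the label tables, and the `render` helper are hypothetical, and "has dataset" is the prefixed variant under discussion, not a label that DCAT defines.

```python
from typing import Dict, Tuple

Triple = Tuple[str, str, str]

# Hypothetical label tables for illustration only.
PLAIN_LABELS: Dict[str, str] = {
    "ex:thisCatalog": "This Catalog",
    "dcat:dataset": "dataset",
    "ex:thatDataset": "That Dataset",
}
# Same labels, but with the prefixed property label from the discussion.
PREFIXED_LABELS: Dict[str, str] = {**PLAIN_LABELS, "dcat:dataset": "has dataset"}


def render(triple: Triple, labels: Dict[str, str]) -> str:
    """Join the labels of subject, property, and object into one string,
    falling back to the raw identifier when a term has no label."""
    return " ".join(labels.get(term, term) for term in triple)


triple: Triple = ("ex:thisCatalog", "dcat:dataset", "ex:thatDataset")
print(render(triple, PLAIN_LABELS))
print(render(triple, PREFIXED_LABELS))
```

The graph structure is identical in both calls; only the property label changes, which is why the prefix matters mainly when the goal is generated text.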
Received on Thursday, 31 March 2022 15:38:45 UTC