Re: The dcterm/schema.org issue: a proposal to move forward from Andy Seaborne on 2014-10-09 (public-csv-wg@w3.org from October 2014)

From: Andy Seaborne <andy@apache.org>
Date: Thu, 09 Oct 2014 12:36:22 +0100
To: Ivan Herman <ivan@w3.org>
CC: Jeni Tennison <jeni@jenitennison.com>, Dan Brickley <danbri@google.com>, W3C CSV on the Web Working Group <public-csv-wg@w3.org>
Message-ID: <54367336.3080405@apache.org>
On 09/10/14 12:17, Ivan Herman wrote:
>
> On 09 Oct 2014, at 13:08 , Andy Seaborne <andy@apache.org> wrote:
>
>> On 09/10/14 11:08, Ivan Herman wrote:
>>>
>>> On 09 Oct 2014, at 11:19 , Andy Seaborne <andy@apache.org> wrote:
>>>
>>>> I agree that defining yet more is worrisome.
>>>>
>>>> I don't see this as RDF (or rather Linked Data) specific.
>>>>
>>>> If the spec lists a number of terms to use, we are saying what they mean just by choosing names.  I was assuming there would be some text per term, not just a name.
>>>
>>> Correct.
>>>
>>>>
>>>> The question of how a term relates to schema.org or Dublin Core is there regardless of format.  Linked Data highlights this because properties have names and definitions but the issue is there regardless.
>>>
>>> If I put myself into a JSON user's skin, I am perfectly content
>>> getting the terms as defined in the metadata spec in my generated
>>> JSON. As a JSON user I do not necessarily care about the existence of
>>> other terminologies like DC or schema as long as I am consistent with
>>> myself.
>>>
>>> As a RDF user it is different: I always place myself on the Web, so
>>> to say, because I always think in terms of URI-s; it is a different,
>>> say, attitude. So the relationship to other, existing vocabularies
>>> becomes more critical.
>>>
>>> So, in your opinion, what should happen if somebody puts up a
>>> metadata with those agreed terms but without specifying what the
>>> relationship to a specific vocabulary is (ie, without adding a
>>> @context to the metadata)? What should one generate into RDF, and
>>> what should one generate into JSON?
>>
>> Let's look at what the specs would be saying. Do you agree that in your example the term means what the metadata spec says it means, and only that?
>>
>
> I think, if I understand well what you say, yes:-)
>
>> And doesn't that mean the spec has now defined a new set of terms! There are N+1 sets of terms in use in data.  They are related but can be used on their own.
>>
>
> Yes, we have to define our own terms. But we are running around in circles: I would not want to see us picking one specific vocabulary out there over another one; there is simply no consensus in the community on what vocabulary to use. Hence the problem we are in, and hence my proposal to choose a core set that we define ourselves, with a mapping (see below).

"All problems in computer science can be solved by another level of 
indirection"

I can understand not wanting to pick terms, but I also feel that the 
web is about encouraging reuse and creating associations.

We need to make a definite case for not using other term sets.  I'm not 
clear what the case is: a number of people have made points, but they 
would need to become a group agreement, i.e. some text (maybe not in the 
doc but, say, a WG resolution to test whether we have agreement here).

>
>> I'm unsure how to deal with that but it does not seem ideal.
>
> No it is not. I wish the community would have converged to one set of terms already, but that is not the case. I simply do not see any viable alternative.
>
>>
>> One possibility (not advocacy) is to say "this term SHOULD be compatible for use as schema.org Z and for use as Dublin Core Y", i.e. mention and encourage the relationship but not mandate it.
>>
>
> Another, and slightly more formal, way of saying that was that the document should include, as part of an informative appendix, a JSON-LD @context object for a mapping of the terms to DC and another @context object for mapping against schema.org. The user may choose which one to take, or to map against a third one. I am still not sure what you propose we do if the user does not say anything.

I don't think a proposal is needed or possible. The metadata spec has 
defined the terms.  The consequence is that the user is using our 
(N+1)th set of terms.  Adding an appendix with the @context objects is a 
good idea.  Even if not normative, it puts the relationship in the mind 
of the reader.
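
To make the appendix idea concrete, here is a minimal sketch in Python 
(an illustration, not proposed spec text).  It treats a JSON-LD @context 
as a plain term-to-URI dict and applies the rule from Ivan's earlier 
message quoted further down: use the URI the @context assigns, otherwise 
fall back to http://www.w3.org/ns/csvw#.  The function name, the 
"fallback" switch and the choice of schema.org "name" as the counterpart 
of 'title' are my assumptions, made only for illustration.

```python
# Hypothetical term-to-URI mappings, in the spirit of two informative
# JSON-LD @context objects (one for Dublin Core, one for schema.org).
DC_CONTEXT = {"title": "http://purl.org/dc/terms/title"}
SCHEMA_CONTEXT = {"title": "http://schema.org/name"}  # assumed mapping, illustration only

# Fallback namespace mentioned in the quoted thread.
CSVW_NS = "http://www.w3.org/ns/csvw#"


def predicate_uri(term, context=None, fallback=True):
    """Pick the RDF predicate URI for a core metadata term.

    If the supplied @context assigns a URI to the term, use it.
    Otherwise either fall back to the csvw namespace (fallback=True)
    or return None so the caller can skip the term (fallback=False).
    """
    if context and term in context:
        return context[term]
    return CSVW_NS + term if fallback else None


if __name__ == "__main__":
    print(predicate_uri("title", DC_CONTEXT))
    print(predicate_uri("title", SCHEMA_CONTEXT))
    print(predicate_uri("title"))
    print(predicate_uri("title", fallback=False))
```

With fallback=False this behaves like the stricter variant Ivan floated 
(terms without a URI in the @context are simply skipped); a JSON-only 
consumer never needs to call it at all.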

>
> Ivan
>
>>
>> In your example, what had you imagined the JSON user would do with the data (after reading the metadata spec :-)?

Ivan - I'm interested in your thoughts here.

BTW when you say "user" do you mean the producer or the consumer of the 
converted CSV-data+metadata?

>>
>> 	Andy
>>
>>> Ivan
>>>
>>>
>>>>
>>>> 	Andy
>>>>
>>>> On 09/10/14 09:41, Ivan Herman wrote:
>>>>>
>>>>> On 09 Oct 2014, at 10:33 , Jeni Tennison <jeni@jenitennison.com> wrote:
>>>>>
>>>>>> I feel very uncomfortable about (a) defining yet another namespace for metadata terms and (b) creating yet another sub-list/profile of metadata terms, when in both cases there are existing standards that we could reference instead.
>>>>>>
>>>>>
>>>>> The problem is, of course, that we have more than one (and more than two:-) to choose from... But I share your discomfort.
>>>>>
>>>>> We *could* introduce a rule into the RDF conversion (the only place where this 'namespace' issue occurs) that those core terms should be mapped onto RDF if and only if the metadata @context clearly assigns URI-s to them. Otherwise they are simply skipped. I would be pretty fine with that. For a purely JSON, non-RDF usage the issue is not relevant...
>>>>>
>>>>>> If we’re going to have some core terms then we should have some very clear criteria for choosing them, such as their utility in the display or validation or conversion of CSV files.
>>>>>
>>>>> +1. Based on our discussion on the call, Rufus will put into the document the core set (essentially a subset of the terms that are already called out in the metadata document) and we can then take it from there. I am in favour of keeping the core set very small, and rely on people using qualified terms for anything else they want.
>>>>>
>>>>> Ivan
>>>>>
>>>>>>
>>>>>> Jeni
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Ivan Herman <ivan@w3.org>
>>>>>> Reply: Ivan Herman <ivan@w3.org>
>>>>>> Date: 8 October 2014 at 11:16:15
>>>>>> To: Andy Seaborne <andy@apache.org>
>>>>>> Cc: Dan Brickley <danbri@google.com>, W3C CSV on the Web Working Group <public-csv-wg@w3.org>
>>>>>> Subject:  Re: The dcterm/schema.org issue: a proposal to move forward
>>>>>>
>>>>>>>
>>>>>>> On 08 Oct 2014, at 12:04 , Andy Seaborne wrote:
>>>>>>>
>>>>>>>> On 08/10/14 10:30, Dan Brickley wrote:
>>>>>>>>> On 8 October 2014 10:16, Andy Seaborne wrote:
>>>>>>>>>> On 04/10/14 08:06, Ivan Herman wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> 1. We define a small set of core properties that we consider to be
>>>>>>>>>>>> essential in the metadata. "We define" means that we specify the terms to be
>>>>>>>>>>>> used in the metadata specification as well as their data types and intended
>>>>>>>>>>>> meaning
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> This makes sense though I do have one small question:
>>>>>>>>>>
>>>>>>>>>> By "we define" do you include giving it a w3c-csv:xyz URI then define
>>>>>>>>>> skos:/rdfs:/owl: mappings to other vocabularies? Or, if not, in what way is
>>>>>>>>>> it different to defining a property or class?
>>>>>>>>>
>>>>>>>>> That (creating an actual vocabulary definition) sounds the simplest
>>>>>>>>> way of making sure we're precise. However we might not want to be more
>>>>>>>>> precise than the mass-deployment vocabularies we're basing it on, and
>>>>>>>>> both DC and schema.org are pretty flexible. And of course it is
>>>>>>>>> comically close to http://xkcd.com/927/ ...
>>>>>>>>
>>>>>>>> Sure but a broadly worded definition isn't trying to be completely prescriptive.
>>>>>>>>
>>>>>>>> The definition is going to be quite broad, so only "precise" in the sense of being a definition
>>>>>>>> at all. "it's a title" - we don't constrain what a 'title' is.
>>>>>>>>
>>>>>>>> This, and Ivan's message, are just about whether the same broad definition is given
>>>>>>>> a URI name or not. Having "http://w3/csv#" and the list in the original message seem no
>>>>>>>> more than "data on the web" to me.
>>>>>>>>
>>>>>>>
>>>>>>> In fact, that is true, something like that is probably a way to go. When generating an RDF
>>>>>>> for, say, 'title'
>>>>>>>
>>>>>>> - if there is a @context that assigns a URI to 'title', use that as a predicate URI
>>>>>>> - otherwise use http://www.w3.org/ns/csvw#title
>>>>>>>
>>>>>>> ie, there is a URI, you are correct. Except that we do not, normatively, define any kind
>>>>>>> of skos or owl or whatever equivalence to dc:title or schema:title; users can do that
>>>>>>> if they wish.
>>>>>>>
>>>>>>> Ivan
>>>>>>>
>>>>>>>
>>>>>>>> Andy
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Dan
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>
>
>
> ----
> Ivan Herman, W3C
> Digital Publishing Activity Lead
> Home: http://www.w3.org/People/Ivan/
> mobile: +31-641044153
> GPG: 0x343F1A3D
> WebID: http://www.ivan-herman.net/foaf#me
>
>
>
>
>
Received on Thursday, 9 October 2014 11:36:53 UTC