- From: John P. McCrae <jmccrae@cit-ec.uni-bielefeld.de>
- Date: Wed, 28 Jan 2015 11:28:23 +0100
- To: Penny Labropoulou <penny@ilsp.gr>
- Cc: public-ld4lt@w3.org
- Message-ID: <CAC5njqoq8tZqUJAUfOEfE9axOOnm4mpEc6DS=xYHjqySQq2crQ@mail.gmail.com>
On Tue, Jan 27, 2015 at 10:03 PM, Penny Labropoulou <penny@ilsp.gr> wrote:

> Hi John and all.
>
> Thanks for the quick work!
>
> Below are a few comments/replies in between the lines.
>
> > 1) Some names have been shortened, e.g.,
> > 'ConformanceToBestStandardsAndPractices' -> 'StandardsBestPractices',
> > should we accept such names or stay true to MetaShare?
>
> I think we should decide this on a case-by-case basis; although some names
> are long, they are self-explanatory. In general, at ld4lt we have changed
> some names (e.g. resource to language resource) when it was agreed that the
> new label is better.

Hmm... change for the sake of change is difficult, particularly when it affects only a small part of the vocabulary; that creates gotchas.

> > 2) A lot of MetaShare names have (unnecessarily) the words 'Info', 'Type'
> > or 'InfoType', we could eliminate these.
>
> All “info” elements are in fact component names: in accordance with the CMDI
> principles, elements (and other components) are grouped into semantically
> coherent components. For instance, the identificationInfo groups together
> elements that are used for the identification of a resource, such as the
> resourceId, a url used as landing page, the resourceName and shortName, the
> description etc. If I have understood correctly, this structure is not
> needed/not a good practice for RDF and this is why they have been
> eliminated already at the IULA/UPF mapping.
>
> “type” elements are used in MetaShare for components that can be re-used:
> e.g. persons can be licensors, contact points, resource creators etc., but
> in all cases they are encoded using the personInfoType, which groups
> together given name, surname, communication information etc. Again, I think
> this is not mapped in RDF as such, if I understand correctly.

Yeah, that is my feeling too. I would like to shorten the names; however, it seems hard to do this consistently, as it would create clashes, e.g., ActualUse/ActualUseInfo, DocumentType/DocumentInfo.

> > 3) IULA have split the AnnotationType class into 5 subclasses
> > (DiscourseAnnotation, etc.)
>
> That’s an improvement from the original model and I suggest we stick to it.
>
> > 4) There are many properties suggested by IULA or in the 'DISTRIBUTION'
> > model that have no correspondence in the MetaShare data... we should
> > discuss these on a case-by-case basis, right?
>
> We have already discussed with Victor the distribution and licensing
> module and have come up with a proposal re-introducing some of the original
> MetaShare elements that were not mapped in the IULA/UPF version and using
> the odrl (mainly) and cc vocabularies; the general ideas are to be found
> at
> https://www.w3.org/community/ld4lt/wiki/Metashare_vocabulary_for_licenses
> and https://www.w3.org/community/ld4lt/wiki/Examples and the mappings
> were documented in the previous googlesheet. I will add these to the new
> googlesheet by next week.

I incorporated all the functional (non-documentary) information from the distribution model already... or at least I tried, let me know if I missed anything.

> > 5) The Prev. Google Doc proposed mapping to both SWRC and BIBO, do we
> > need to do BIBO as well (SWRC seems sufficient)?
> >
> > 6) I added the license modelling that LingHub does in ODRL, could one of
> > our ODRL experts look at it and fix the last one?
>
> Please, see also the two wikis on licensing, especially the examples. And
> as discussed, together with Victor we will provide a file with the RDF
> representations in odrl of the licenses used in MetaShare (of course, only
> of those that have not already been RDFized).

This refers to "R4 To neatly represent conditions of use"... but I couldn't find the structured definitions of conditions of use, so I wrote my own in the sheet titled "License Modelling".

> > 7) Some property values, especially *resource types*, such as *ontology*
> > or *corpus* were created as classes in the Google Doc, shall we confirm
> > this usage pattern?
>
> This needs some more thinking, checking the various cases. Is there a list
> of these?

These seem to be, approximately, the individuals of the classes 'ResourceType' and 'LexicalConceptualResourceType'. Here are the lists for reference:

In Prev. Google Doc: BabelNet*, ComputationalLexicon, Corpus, CorpusAudio*, CorpusCollection*, CorpusImage*, CorpusText*, CorpusTextNgram*, CorpusTextNumerical*, CorpusVideo*, Framenet, LexicalConceptualResource, Lexicon, MachineReadableDictionary, Ontology, TerminologicalResource, Thesaurus, ToolService*, WordList, WordNet

From Metashare: computationalLexicon, framenet, lexicon, machineReadableDictionary, ontology, other*, terminologicalResource, thesaurus, wordList, wordnet, corpus, languageDescription*, lexicalConceptualResource

*Unique elements

> > 8) *See attached diagram.* There is a big difference in granularity
> > between the XSD and IULA-UPF's ontology. For example, there are 4 tags
> > between the resource and its actual usage in the XML, e.g.,
> >
> > <resourceInfo> ...
> >   <usageInfo> ...
> >     <actualUsageInfo> ....
> >       <useNLPspecific>parsing</useNLPspecific> ....
> >
> > whereas in the IULA model this is considerably simplified to
> >
> > :resource a ms:Resource ;
> >   ms:actualUse ms:parsing
> >
> > This would be great, but it also loses information, for example, the IULA
> > schema associates the *availability* with the *Resource*. However, the
> > XSD schema associates an *availability* with each *Distribution*
> > (download file). In fact, there are resources that have different
> > availability for different downloads (e.g., BabelNet), so there is
> > information loss here. Thus, LingHub is very conservative and sticks to the
> > XSD, e.g.,
> >
> > :resource a ms:ResourceInfo ;
> >   ms:usageInfo [
> >     ms:actualUsageInfo [
> >       ms:useNLPspecific ms:parsing ] ]
> >
> > What shall we recommend here?
>
> Again, discuss on a case-by-case basis. For instance, for availability, we
> have re-introduced the distribution element, as otherwise we lose in
> semantics. For other cases, I think we should see them more closely. The
> grouping into components made sense in XSD because it brought together
> elements. I will have to look at them more closely and explain for each
> case why this grouping was meant, so that we can decide if this should also
> remain in the RDF mapping. Is there an easy way of spotting these cases?

OK, we should discuss this in a telco.

> A final question: how will we add the comments/decisions from the previous
> googlesheet to the current one? As said, I can do this for the
> distribution/licensing module elements but for the rest?

Add any comments you want (possibly copied from previous doc). Apart from that, I would like to keep the sheet itself clean until the next ld4lt telco at least.

Regards,
John

> Best,
> Penny
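
To make the two options in point 8 concrete, here is a rough Python sketch that reads the nested XML fragment quoted above and prints both renderings. It is only an illustration under assumptions: the XML is the simplified fragment from this mail (no namespaces, elided content dropped), the function names are made up for this example, and the FLAT lookup table is hypothetical; it is not how LingHub or the IULA/UPF mapping is actually implemented.

```python
import xml.etree.ElementTree as ET

# Simplified fragment from point 8; the real MetaShare XSD has namespaces
# and many more elements (assumption: those are left out here).
XML = """
<resourceInfo>
  <usageInfo>
    <actualUsageInfo>
      <useNLPspecific>parsing</useNLPspecific>
    </actualUsageInfo>
  </usageInfo>
</resourceInfo>
"""

# Hypothetical lookup from a full XSD path to one flattened property name.
FLAT = {
    ("resourceInfo", "usageInfo", "actualUsageInfo", "useNLPspecific"): "actualUse",
}


def leaf_paths(elem, prefix=()):
    """Yield (tuple_of_tags, text) for every leaf element under elem."""
    path = prefix + (elem.tag,)
    children = list(elem)
    if not children:
        yield path, (elem.text or "").strip()
    for child in children:
        yield from leaf_paths(child, path)


def conservative(path, value):
    """Keep every XSD level, nesting blank nodes one per intermediate tag."""
    root, first, *rest = path
    obj = f"ms:{value}"
    for prop in reversed(rest):
        obj = f"[ ms:{prop} {obj} ]"
    return f":resource a ms:{root[:1].upper()}{root[1:]} ; ms:{first} {obj} ."


def simplified(path, value):
    """Collapse the whole path into one property taken from FLAT."""
    return f":resource a ms:Resource ; ms:{FLAT[path]} ms:{value} ."


for path, value in leaf_paths(ET.fromstring(XML.strip())):
    print(conservative(path, value))
    print(simplified(path, value))
```

The simplified form needs a hand-written entry in FLAT for every path, which is where the case-by-case decisions would end up; the conservative form needs no such table but reproduces the full XSD nesting.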
Received on Wednesday, 28 January 2015 10:28:52 UTC