Re: UDEF Representation in RDF

Hi -

Thanks, Arnold! I've added comments to your comments. We'll postpone the vote until the discussion is complete.


Regards,
Chris
++++

Chris Harding
c.harding@opengroup.org



On 4 Jun 2012, at 14:42, Overeem, Arnold van wrote:

> I have added my initial feedback in the wiki: https://wiki.opengroup.org/si-wiki/doku.php?id=regsw:rdf-syntax_discussion
> 
> 
> Vriendelijke groet / Regards / Gruß / С уважением / Terveisin / εγκάρδια / Saludos
> 
> Arnold van Overeem
> Global Architect
> Tel: (+31) (0) 6 54 774715     Grip 91073 (weblog)
> 
> 
>> -----Original Message-----
>> From: Overeem, Arnold van
>> Sent: Monday, June 04, 2012 2:27 PM
>> To: 'Chris Harding'; All UDEF Interested Parties; Simon Spero;
>> public-esw-thes@w3.org; richard.parent; Alan Doniger
>> Subject: RE: UDEF Representation in RDF
>> 
>> Hi Chris,
>> 
>> I'll add my feedback in the wiki
>> 
>> Vriendelijke groet / Regards / Gruß / С уважением / Terveisin /
>> εγκάρδια / Saludos
>> 
>> Arnold van Overeem
>> Global Architect
>> Tel: (+31) (0) 6 54 774715     Grip 91073 (weblog)
>> 
>> 
>>> -----Original Message-----
>>> From: Chris Harding [mailto:c.harding@opengroup.org]
>>> Sent: Wednesday, May 23, 2012 11:43 AM
>>> To: All UDEF Interested Parties; Simon Spero; public-esw-thes@w3.org;
>>> richard.parent; Alan Doniger
>>> Subject: Re: UDEF Representation in RDF
>>> 
>>> Hi -
>>> 
>>> There hasn't been anything new on this discussion for a while, so it's
>>> time to draw some conclusions. Unless anyone objects, here's what I
>>> propose we do.
>>> 
>>> 1. The core UDEF, and each associated UDEF vocabulary, shall have a
>>> base representation in RDF, plus other representations that will be
>>> mechanically derivable from the base representation and available for
>>> the convenience of applications that use particular representation
>>> formats. The derived representations shall include RDFS Class, SKOS,
>>> XML, and HTML representations.
>>> 
>>> 2. Each node of each UDEF vocabulary shall have its own URI. This shall
>>> consist of a base URI that identifies the vocabulary, the # character,
>>> and a fragment that identifies the node. For an object class node, the
>>> fragment shall be the string "UDEF-" followed by the UDEF object class
>>> id. For a property node, the fragment shall be the _ character followed
>>> by the UDEF property id.
>>> 
>>> 3. The base URI for the core UDEF vocabulary shall be
>>> http://www.opengroup.org/udefinfo/rdf/udef  This means that the URIs
>>> for existing nodes of the UDEF vocabulary - see
>>> http://www.opengroup.org/udefinfo/dl/rdf/ - will not change.
>>> 
>>> 4. The base representation shall include, for each node, for each
>>> language in which that node is defined, a statement assigning to the
>>> node an rdfs:label for that language consisting of the UDEF label for
>>> the node in that language. (E.g. "Weather.Natural.Environment" for UDEF
>>> object class node a.a.4 in the English language.) There may also be a
>>> statement assigning to the node an rdfs:comment for that language
>>> consisting of the UDEF description for the node in that language.
>>> 
>>> 5. An object property has_Connected_Node will be defined. For any node
>>> of the core UDEF that has a connected node in another UDEF vocabulary,
>>> the base representation of the core UDEF shall include a statement with
>>> the URI of the core vocabulary node as subject, has_Connected_Node as
>>> predicate, and the URI of the connected node as object, and the base
>>> representation of the other vocabulary shall include an inverse
>>> statement.
>>> 
>>> 6. An HTTP GET request to the base URI shall return a set of RDF
>>> statements about the vocabulary. (The precise nature of these
>>> statements is to be determined.)
>>> 
>>> 7. The RDFS Class representation of a vocabulary shall consist of the
>>> base representation plus a set of RDF statements that define each node
>>> as an RDFS class and relate each node to its parent by the
>>> rdfs:subClassOf property. (This is the representation currently used;
>>> see http://www.opengroup.org/udefinfo/dl/rdf/ )
>>> 
>>> 8. The SKOS representation of a vocabulary shall consist of the base
>>> representation plus a set of statements that define each node as a SKOS
>>> concept and relate each node to its parent by the skos:broader
>>> property. The representation shall also include statements defining the
>>> vocabulary itself as a SKOS concept scheme, the nodes as concepts of
>>> that scheme, the root nodes as top concepts of that scheme, and the
>>> labels of each node as SKOS Preferred Lexical Labels in their
>>> respective languages. Where a node in the core UDEF has a connected
>>> node in another vocabulary, the representations of the core UDEF and
>>> the other vocabulary shall include SKOS broader and narrower statements
>>> relating the two nodes, with the core UDEF node as the narrower
>>> concept.
>>> 
>>> 9. The XML representation of a vocabulary shall be as at present - see
>>> http://www.opengroup.org/udef/dl/dlxml.htm - with the addition of a
>>> "has_Connected_Node" tag to show connections to nodes of the core UDEF.
>>> The precise format of this is to be determined.
>>> 
>>> 10. The HTML representation of a vocabulary shall be as at present -
>>> see http://www.opengroup.org/udefinfo/htm/en_defs.htm - but with the
>>> addition of a connector symbol (to be determined) to the list item for
>>> each node that has a connected node in another vocabulary, and with a
>>> hyperlink from that symbol to the connected node in an HTML
>>> representation of that vocabulary.
>>> 
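The URI and label rules in items 2 to 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not an agreed implementation: the function names are hypothetical, and the label statement is written as an N-Triples line simply because that form is compact. The base URI and the a.a.4 example are taken from the proposal; 4.35.8 is a property id that appears in the discussion quoted further down.

```python
# Base URI for the core UDEF vocabulary, as given in item 3.
CORE_BASE_URI = "http://www.opengroup.org/udefinfo/rdf/udef"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def object_class_uri(object_class_id, base_uri=CORE_BASE_URI):
    """Item 2: base URI, '#', then 'UDEF-' followed by the object class id."""
    return f"{base_uri}#UDEF-{object_class_id}"


def property_uri(property_id, base_uri=CORE_BASE_URI):
    """Item 2: base URI, '#', then '_' followed by the property id."""
    return f"{base_uri}#_{property_id}"


def label_statement(node_uri, label, lang):
    """Item 4: one rdfs:label statement per node per language (N-Triples)."""
    # Escape backslashes and double quotes so the literal stays well-formed.
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{node_uri}> <{RDFS_LABEL}> "{escaped}"@{lang} .'


if __name__ == "__main__":
    weather = object_class_uri("a.a.4")
    print(weather)
    print(property_uri("4.35.8"))
    print(label_statement(weather, "Weather.Natural.Environment", "en"))
```

Passing a different base_uri gives the URIs for another UDEF vocabulary under the same rules.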
>>> Regards,
>>> Chris
>>> ++++
>>> 
>>> Chris Harding
>>> c.harding@opengroup.org
>>> 
>>> 
>>> 
>>> On 9 May 2012, at 15:09, Chris Harding wrote:
>>> 
>>>> Hello, Simon -
>>>> 
>>>> Thanks for your response - and for taking the trouble to look more
>>>> than superficially at the UDEF in order to formulate your response. I
>>>> have responded to the points that you raise below.
>>>> 
>>>> Regards,
>>>> Chris
>>>> ++++
>>>> 
>>>> Chris Harding
>>>> c.harding@opengroup.org
>>>> 
>>>> 
>>>> 
>>>> On 8 May 2012, at 21:33, Simon Spero wrote:
>>>> 
>>>>> On Tue, May 8, 2012 at 5:04 AM, Chris Harding
>>>>> <c.harding@opengroup.org> wrote:
>>>>> 
>>>>> We are looking at an approach that would define UDEF object classes
>>>>> and UDEF properties as SKOS concepts, and use SKOS narrower/broader to
>>>>> capture the relation between a parent object class and its children,
>>>>> and between a parent property and its children.
>>>>> 
>>>>> The initial questions that we have are:
>>>>>  - Does this look like a sensible approach?
>>>>>  - Should we make the whole of the UDEF a single SKOS concept scheme,
>>>>>    or would it be better to have separate concept schemes for object
>>>>>    classes and properties?
>>>>> 
>>>>> I am not entirely sure that using SKOS would necessarily be the most
>>>>> appropriate way of increasing the semantic richness for what seem to
>>>>> be UDEF's target applications. Here are some considerations that
>>>>> might help you decide whether a purely SKOS-based approach is ideal
>>>>> for your needs, or whether RDF(S) + OWL might be a better approach.
>>>>> 
>>>>>  • SKOS was originally developed for representing knowledge
>>>>>    organizing systems, not knowledge representation systems; that is,
>>>>>    it was designed to represent the relationship between ideas, not
>>>>>    between the things that those ideas are about. A good test to see
>>>>>    if SKOS is right for your application is to consider whether in
>>>>>    your application there is ever any difference between something
>>>>>    being a kind of something else, and something being a part of
>>>>>    something else.
>>>> The UDEF is an index of fields in forms, columns in databases, etc.
>>>> Each field (or column) corresponds to a concept, and what is entered
>>>> into a field (or a cell in a column) provides information about a thing
>>>> that realises the concept. So it is perhaps not exactly about ideas,
>>>> but it is more about ideas than it is about things.
>>>> 
>>>>>  • SKOS does not have a standardized way of expressing sequences of
>>>>>    concepts, of generating concatenated notations, or of expressing
>>>>>    restrictions on which types of concepts may follow which other
>>>>>    concepts. This essentially forces UDEF to be a fully enumerated
>>>>>    system, which may not be ideal.
>>>> Yes. A UDEF identifier is a concatenation of an object class
>>>> identifier and a property identifier. In theory, it is this object
>>>> class/property combination that corresponds best to a SKOS concept. In
>>>> practice, it is not feasible to enumerate all of the valid object
>>>> class/property combinations.
>>>> 
>>>>>  • The example used on the UDEF CONOP page involves a sample
>>>>>    interaction with the DLA (Defense Logistics Agency). That (and the
>>>>>    fact that the overview page is titled "Concept of Operation" :)
>>>>>    suggests that interworking with DoD and other government agencies
>>>>>    is an important consideration. DoD semantic interoperability and
>>>>>    federation work is using OWL/RDF as a basis - see these slides by
>>>>>    Dennis Wisnosky (DoD BMA CTO & Chief Architect) from last week's
>>>>>    DoD Enterprise Architecture conference.
>>>> The UDEF is applicable to all areas. It was originally conceived
>>>> within the defence procurement community, which is why there are so
>>>> many defence-related terms in the current version. The proportion of
>>>> defence-related terms has decreased as other areas have come into
>>>> consideration, and will continue to decrease.
>>>> 
>>>> But, IMHO, the ability to express UDEF definitions in RDF and OWL is
>>>> crucial, regardless of the area of application. In my understanding,
>>>> SKOS is not an alternative to this, but a way of defining the mapping
>>>> to RDF that could have additional benefits (a) in enabling us to use
>>>> tools designed for SKOS to work with the UDEF definitions, and (b)
>>>> potentially in the longer term enabling us to link the UDEF with other
>>>> SKOS vocabularies.
>>>> 
>>>>>  • Earlier work at USAF in the EVT under SAF-US(M) revealed that
>>>>>    even RDF(S) + OWL was insufficient to capture all useful
>>>>>    information for most Communities of Interest; the weaker semantics
>>>>>    of SKOS would presumably be able to capture even less information.
>>>>>    (Interestingly, the acronym EVT stood for "Enterprise Vocabulary
>>>>>    Team"; the V was obsolete even before the team was stood up.)
>>>> The UDEF does not set out to capture all information. Its limited
>>>> (but useful) objective is to provide an index for data fields, as
>>>> described above.
>>>> 
>>>>>  • Where a natural language term has different meanings in different
>>>>>    CoIs, it is a very bad idea to try to force the subject matter
>>>>>    experts in one or both areas to use a different term. Doing so
>>>>>    degrades the performance of subject matter experts to a level
>>>>>    close to that of novices.
>>>> Agreed.
>>>>> Taking a look at some of the UDEF definitions seems to suggest that
>>>>> the current definitions do not include much information that could be
>>>>> considered essential for interoperability.
>>>>> 
>>>>> For example, in the "identifier" sub tree, we find the following
>>>>> terminal node.
>>>>> 4.35.8 Air-Force.Assigned.Identifier
>>>>> 
>>>>>        1.4.35.8 United-States.Air-Force.Assigned.Identifier
>>>>> 
>>>>> Given the number of different USAF identifiers, this is somewhat
>>>>> problematic, and would seem to be based on an unnamed specific use
>>>>> case.
>>>>> 
>>>> See below for more on this example.
>>>>> In terms of mapping to RDF(S)/OWL/SKOS, these properties would seem
>>>>> to imply a hierarchy - rdfs:subPropertyOf in RDFS, SubDataPropertyOf
>>>>> in OWL, skos:broader in SKOS.
>>>>> 
>>>> Yes, I think this is how it should be interpreted.
>>>>> In terms of creating interoperable systems, it is hard to
>>>>> understand what the semantics of these properties would be.
>>>>> 
>>>>> What would it mean to have a data field tagged "4.35.8"?
>>>> This is only half of a UDEF tag. A field would never be tagged
>>>> 4.35.8. A field could for example be tagged a.o.1_4.35.8
>>>> (Military.Aircraft.Asset_Air-Force.Assigned.Identifier) in the case of
>>>> an identification number of a military aircraft, or c.j.5_4.35.8
>>>> (Military.Officer.Person_Air-Force.Assigned.Identifier) in the case of
>>>> an identification number for its pilot.
>>>> 
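To make the "half a tag" point concrete, here is a minimal Python sketch. It assumes only what is described above, namely that a full tag is an object class id, the _ character, and a property id; the names UdefTag, parse_tag and same_field are hypothetical and not part of the UDEF.

```python
from typing import NamedTuple


class UdefTag(NamedTuple):
    """A full UDEF tag: object class id plus property id."""
    object_class_id: str
    property_id: str


def parse_tag(tag: str) -> UdefTag:
    """Split a full tag such as 'a.o.1_4.35.8' at the underscore.

    A bare property id such as '4.35.8' is rejected, because on its own
    it is only half of a tag.
    """
    object_class_id, separator, property_id = tag.partition("_")
    if not separator or not object_class_id or not property_id:
        raise ValueError(f"not a full UDEF tag: {tag!r}")
    return UdefTag(object_class_id, property_id)


def same_field(tag_a: str, tag_b: str) -> bool:
    """Compare fields on their full tags, not on the property half alone."""
    return parse_tag(tag_a) == parse_tag(tag_b)


if __name__ == "__main__":
    aircraft_id = "a.o.1_4.35.8"  # Military.Aircraft.Asset_Air-Force.Assigned.Identifier
    pilot_id = "c.j.5_4.35.8"     # Military.Officer.Person_Air-Force.Assigned.Identifier
    print(parse_tag(aircraft_id))
    print(same_field(aircraft_id, pilot_id))
```

The two example tags share the property half 4.35.8 but differ in the object class half, so they are treated as different fields.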
>>>>> Could it be meaningfully compared to another data field tagged
>>>>> "4.35.8"?
>>>> It is meaningful to compare the full tags of fields, either to
>>>> distinguish different fields, or to detect that fields are the same. In
>>>> the example above, it is easy to envisage records that have "Id" fields
>>>> that might refer to the aircraft or to the pilot. The UDEF makes it
>>>> easier to put the right data in the right field.
>>>> 
>>>>> Could you join two records from different sources using this field?
>>>> With extreme caution. The UDEF is in concept infinitely extensible,
>>>> and currently very incomplete. There is not currently, for example, a
>>>> tag for "Canadian.Air-Force.Assigned.Identifier" or for
>>>> "Australian.Air-Force.Assigned.Identifier", so an id for a Canadian
>>>> aircraft would likely be tagged in the same way as an id for an
>>>> Australian aircraft. Equating similarly-tagged fields in a join of
>>>> Australian and Canadian records would probably not give you a
>>>> meaningful result, unless the Australian and Canadian air forces happen
>>>> to have agreed on a uniform identification scheme for aircraft. Even
>>>> United-States.Air-Force.Assigned.Identifier might not work in a join -
>>>> I wouldn't be at all surprised if the US Air Force has more than one
>>>> identification scheme for aircraft.
>>>> 
>>>>> Could you join two records, one identified by a "1.4.35.8", the
>>>>> other by "4.35.8"?
>>>> Only if you were sure that the two sources were using non-overlapping
>>>> identification schemes - and the UDEF would not give you this
>>>> assurance.
>>>> 
>>>>> 
>>>>> Is the relationship between the fields strictly one of aboutness: is
>>>>> everything that is in some way about a
>>>>> United-States.Air-Force.Assigned.Identifier also in some way about an
>>>>> Air-Force.Assigned.Identifier?
>>>>> 
>>>> A Military.Aircraft.Asset_United-States.Air-Force.Assigned.Identifier
>>>> is a Military.Aircraft.Asset_Air-Force.Assigned.Identifier. It is also
>>>> an Aircraft.Asset_United-States.Air-Force.Assigned.Identifier, an
>>>> Asset_Identifier, and all the combinations in between.
>>>> 
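The "combinations in between" can be enumerated mechanically. The Python sketch below assumes, as the names in this thread suggest, that a more general object class or property name is obtained by dropping leading dot-separated qualifiers; that is a reading of the examples here rather than a stated UDEF rule, and the code does not check whether each combination actually exists as a node. The function names are hypothetical.

```python
from itertools import product


def suffixes(dotted_name):
    """All names obtained by dropping zero or more leading qualifiers."""
    parts = dotted_name.split(".")
    return [".".join(parts[i:]) for i in range(len(parts))]


def generalizations(tag_name):
    """Every object class / property combination implied by a full tag name.

    The first entry is the tag name itself; the last is the most general
    combination.
    """
    object_name, property_name = tag_name.split("_", 1)
    return [
        f"{obj}_{prop}"
        for obj, prop in product(suffixes(object_name), suffixes(property_name))
    ]


if __name__ == "__main__":
    name = "Military.Aircraft.Asset_United-States.Air-Force.Assigned.Identifier"
    for combination in generalizations(name):
        print(combination)
```

For the example name this yields three object class names times four property names, including Military.Aircraft.Asset_Air-Force.Assigned.Identifier and Asset_Identifier.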
>>>> I don't believe we should think in terms of joins in the classical
>>>> sense, but rather about deductions to be made from the information
>>>> contained in different records, that might produce new records
>>>> containing derived information. (This is a bit like the difference
>>>> between SPARQL and SQL.) UDEF tagging can make a meaningful
>>>> contribution to such deductions, even if the tags cannot be used to
>>>> identify fields on which to join records.
>>>> 
>>>>> It may well be that many of the unique labels that UDEF has created
>>>>> can be usefully refactored into their semantic components, and that
>>>>> by doing so it becomes easier to create federated systems of systems
>>>>> that work at the enterprise scale and beyond.
>>>> A large part of the value of the UDEF is that its hierarchical
>>>> structure imposes a discipline on the arbitrary combination of
>>>> components. This makes it easier to use as a practical tool, though it
>>>> reduces its power of expression. We do however very much appreciate the
>>>> need to federate vocabularies developed by different communities of
>>>> experts - and this is a good reason why we should look at SKOS.
>>>> 
>>>> 
>>>>> 
>>>>> Simon
>>>>> 
>>>> 
>>> 
> 

Received on Wednesday, 6 June 2012 10:40:50 UTC