Re: Best Practices - Semantic Tagging

(We're moving out of the realm of annotations, but I'll keep it on the
list just for the archive)

On Tue, Mar 5, 2013 at 1:29 PM, Tim Cook <tim@mlhim.org> wrote:

>> Hm.. that sounds more like provenance than tagging or identifying.
> I think of provenance more as ownership or a history of ownership.

I would include "How it came to exist, and why" as part of provenance.

> Whereas here we are attempting to set a permanent reference for deeper
> explanation of the term/concept.

OK, well that is a bit different, I agree.

> I think the example you gave is *VERY* attractive: using an xml:ID on
> each complexType and then referencing all of them in one metadata
> section of the CCD.  I am producing a tool to just build these

Just be careful about the spelling: xml:id in the xs:complexType, and
rdf:ID in the rdf:Description. Now don't ask me why they could not
have the same casing..


> complexTypes outside of a CCD, since this is the approach to
> reproducing the models from SQL databases and dictionaries, i.e.
> https://wiki.nci.nih.gov/display/caDSR/CDE+Browser for
> interoperability with and between legacy systems.
>
> These complexType stubs, or "Pluggable Complex Types" as we have
> started to call them, are not really valid schemas.  They are just a
> text file named after the complexType name
> (ct-f6c5ea6e-6458-4799-874d-7f3d365d260d.pct).  I can put the
> information that will go into the CCD metadata section in this file
> as:
> ..
> I can then add the functionality to the CCD editor that will extract
> these and put them into the CCD rdf:RDF metadata section.  So
> everything ends up being much nicer and neater, all in one place.

I'm not quite sure about this as it got a bit too specific.. here is my guess:

You keep all the RDF that links complex types to purl.bioontology.org
in a single source. You then automatically embed the relevant
statements inside the top-level xs:annotation/xs:appinfo of the schema
itself where those complex types are used (not inside the complexType).
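
To make that guess concrete, here is a rough Python sketch of such an
embedding step. It is only my reading of the workflow, not the CCD
editor's actual behaviour: the LINKS table, the tiny SCHEMA string and
the embed_links() helper are all made up for illustration, and the
predicate defaults to rdfs:isDefinedBy only because that is what the
quoted example uses.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
XML = "http://www.w3.org/XML/1998/namespace"

# Use readable prefixes when serialising.
for prefix, uri in (("xs", XS), ("rdf", RDF), ("rdfs", RDFS)):
    ET.register_namespace(prefix, uri)

# Hypothetical single source: complexType id -> ontology term URI.
LINKS = {
    "ct-f6c5ea6e-6458-4799-874d-7f3d365d260d":
        "http://purl.bioontology.org/ontology/SNOMEDCT/365761000",
    "ct-not-used-in-this-schema": "http://example.org/unused",
}

# Hypothetical minimal schema that uses one of the complex types.
SCHEMA = f"""<xs:schema xmlns:xs="{XS}">
  <xs:complexType name="ct-f6c5ea6e-6458-4799-874d-7f3d365d260d"
                  xml:id="ct-f6c5ea6e-6458-4799-874d-7f3d365d260d"/>
</xs:schema>"""


def embed_links(schema_xml, links, predicate=f"{{{RDFS}}}isDefinedBy"):
    """Add a top-level xs:annotation/xs:appinfo/rdf:RDF block that
    describes only the complex types present in this schema."""
    root = ET.fromstring(schema_xml)
    # Note the casing: xml:id on the complexType ...
    used = {ct.get(f"{{{XML}}}id") for ct in root.iter(f"{{{XS}}}complexType")}
    annotation = ET.Element(f"{{{XS}}}annotation")
    appinfo = ET.SubElement(annotation, f"{{{XS}}}appinfo")
    rdf = ET.SubElement(appinfo, f"{{{RDF}}}RDF")
    for ct_id, target in links.items():
        if ct_id in used:
            # ... but rdf:ID on the rdf:Description.
            desc = ET.SubElement(
                rdf, f"{{{RDF}}}Description", {f"{{{RDF}}}ID": ct_id})
            ET.SubElement(desc, predicate, {f"{{{RDF}}}resource": target})
    # xs:annotation goes first, as a direct child of xs:schema.
    root.insert(0, annotation)
    return root


root = embed_links(SCHEMA, LINKS)
print(ET.tostring(root, encoding="unicode"))
```

The unused entry in LINKS is skipped, so each schema only carries the
statements for the complex types it actually contains.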


You'll have to keep your tongue straight on the identifiers if you go
for this, as the local #ids resolve against the URI of the particular
schema they are included in, so the same id becomes a different URI in
each schema. Other than that it sounds fine.
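
A small Python illustration of the identifier point, assuming two
made-up schema locations under example.org (the real CCD URIs will of
course differ) and no xml:base override: the same rdf:ID resolves to a
different absolute URI in each schema.

```python
from urllib.parse import urljoin

CT_ID = "ct-f6c5ea6e-6458-4799-874d-7f3d365d260d"

# Hypothetical locations of two schemas that embed the same complex type.
SCHEMA_A = "http://example.org/ccd/a.xsd"
SCHEMA_B = "http://example.org/ccd/b.xsd"


def absolute_id(schema_uri, local_id):
    """Resolve rdf:ID="local_id" the way an RDF parser would:
    as the fragment "#local_id" against the document's base URI."""
    return urljoin(schema_uri, "#" + local_id)


uri_a = absolute_id(SCHEMA_A, CT_ID)
uri_b = absolute_id(SCHEMA_B, CT_ID)

print(uri_a)
print(uri_b)
print("same resource?", uri_a == uri_b)
```

So statements about the complex type in schema A say nothing about the
copy in schema B unless you link the two explicitly.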


>      <rdf:Description rdf:ID="ct-f6c5ea6e-6458-4799-874d-7f3d365d260d">
>           <rdfs:isDefinedBy
> rdf:resource="http://purl.bioontology.org/ontology/SNOMEDCT/365761000"/>
>          < and any other references the modeller wants to create>
>    </rdf:Description>
>

So with rdfs:isDefinedBy I would expect to find something about
#ct-f6c5ea6e-6458-4799-874d-7f3d365d260d at
http://purl.bioontology.org/ontology/SNOMEDCT/365761000 - but there is
nothing.

Are you sure rdfs:isDefinedBy is not too strong? I know it's not a
requirement of rdfs:isDefinedBy, but common usage, at least for
vocabularies, is that the term would appear at the other end. I would
recommend checking this on the public-lod list.


It seems more like you want to use a SKOS relation like
skos:closeMatch [1] - I believe you want to say that you mean
the same concept as
<http://purl.bioontology.org/ontology/SNOMEDCT/365761000> but that
your #ct-f6c5ea6e-6458-4799-874d-7f3d365d260d is somewhat differently
shaped (it's a complex type).
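
As a sketch of what that would look like, here is a few lines of Python
that write out the same rdf:Description with skos:closeMatch in place
of rdfs:isDefinedBy. The close_match() helper is just something I made
up for illustration; only the namespaces and the two identifiers come
from the thread.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"

ET.register_namespace("rdf", RDF)
ET.register_namespace("skos", SKOS)


def close_match(local_id, concept_uri):
    """Build an rdf:RDF element stating that #local_id is a
    skos:closeMatch of the given concept URI."""
    rdf = ET.Element(f"{{{RDF}}}RDF")
    desc = ET.SubElement(
        rdf, f"{{{RDF}}}Description", {f"{{{RDF}}}ID": local_id})
    ET.SubElement(
        desc, f"{{{SKOS}}}closeMatch", {f"{{{RDF}}}resource": concept_uri})
    return rdf


rdf = close_match(
    "ct-f6c5ea6e-6458-4799-874d-7f3d365d260d",
    "http://purl.bioontology.org/ontology/SNOMEDCT/365761000",
)
print(ET.tostring(rdf, encoding="unicode"))
```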



As for OA annotations, I think the oa:identifying motivation with an
oa:SemanticTag body I detailed before should be appropriate. Views
from the group?



[1] http://www.w3.org/TR/skos-primer/#secmapping


-- 
Stian Soiland-Reyes, myGrid team
School of Computer Science
The University of Manchester

Received on Tuesday, 5 March 2013 14:53:17 UTC