Re: schema.org Markup for DITA XML-based Technical Documentation

One use case is enriching technical documentation content with identifiers for named entities. These identifiers can link to general data sets or to specific ones, e.g. data sets provided by the company producing the documentation.

I have explored this with others and produced a demo showing the process with DocBook and other XML vocabularies. I will present a DITA demo at this year's tcworld conference in autumn.
See the demo here:
http://fsasaki.github.io/stuff/feisgiltt2016/

 
> On 22.06.2016 at 18:47, John Walker <john.walker@semaku.com> wrote:
> 
> Hi Keith
> 
> Given that linked data and DITA are two subjects close to my heart, I would be happy to spend time on this subject both to work out ideas and (in due course) to implement things.
> 
> A first thought is whether a mapping could be generic or would depend on the use case at hand. In the latter case, how could one define mappings from a DITA specialisation to concepts from an ontology (schema.org or otherwise) that could be passed into a generic processor?
> 
> Alternatively, it *should* be perfectly possible to use RDFa directly in the source DITA XML and pass this through into HTML+RDFa output, but I have not seen any deployments of this approach in the wild.

In the demo, choose approach 2, "embed linked data via structured markup". It uses microdata attributes rather than RDFa, but the difference is only one of syntax. See the XLIFF example below.

<xliff xmlns="urn:oasis:names:tc:xliff:document:2.0" version="2.0" srcLang="en" trgLang="fr">
 <file id="f1">
  <unit id="u1">
   <segment>
   <source>We very much welcome you in the city of <mrk vocab="http://schema.org/" typeof="Place" property="name" resource="http://dbpedia.org/resource/Prague">Prague</mrk>, a home of XML!</source>
   </segment>
  </unit>
 </file>
</xliff>
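
As a rough illustration of how a downstream tool could pick up such annotations, here is a minimal Python sketch that lists the annotated entities in the XLIFF above. The function name extract_entities and the dictionary keys are made up for this example. The sketch only reads the attribute values; it is not a conforming RDFa processor and is not how the demo itself works.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:2.0"

# The XLIFF example from this mail, kept inline so the sketch is self-contained.
XLIFF_DOC = """<xliff xmlns="urn:oasis:names:tc:xliff:document:2.0" version="2.0" srcLang="en" trgLang="fr">
 <file id="f1">
  <unit id="u1">
   <segment>
   <source>We very much welcome you in the city of <mrk vocab="http://schema.org/" typeof="Place" property="name" resource="http://dbpedia.org/resource/Prague">Prague</mrk>, a home of XML!</source>
   </segment>
  </unit>
 </file>
</xliff>"""


def extract_entities(xliff_text):
    """Return one dict per <mrk> element that carries a resource attribute."""
    root = ET.fromstring(xliff_text)
    entities = []
    for mrk in root.iter(f"{{{XLIFF_NS}}}mrk"):
        resource = mrk.get("resource")
        if resource is None:
            continue  # plain marker without an entity identifier
        entities.append({
            "text": "".join(mrk.itertext()),
            # Expand the type against the vocabulary, e.g. http://schema.org/ + Place.
            "type": mrk.get("vocab", "") + mrk.get("typeof", ""),
            "property": mrk.get("property"),
            "resource": resource,
        })
    return entities


for entity in extract_entities(XLIFF_DOC):
    print(entity["text"], "->", entity["resource"], "(" + entity["type"] + ")")
```

The same loop would work for any XML vocabulary that carries these attributes; only the namespace and element name would change.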

> 
> Just thinking out loud but perhaps also feasible to embed RDF/XML into the DITA XML source (similar to how MathML and SVG can be embedded).

See approach 4 in the demo. It embeds Turtle rather than RDF/XML, but switching to RDF/XML would not be a big change.
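
To give an idea of how small the difference between the two serializations is, here is a minimal Python sketch that writes the same two statements once as Turtle and once as RDF/XML. It assumes, purely for illustration, that the intended statements are "the DBpedia resource for Prague is a schema.org Place named Prague". The function names are hypothetical, and the demo may well produce its output differently.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SCHEMA_NS = "http://schema.org/"


def to_turtle(subject, type_name, name):
    """Serialize 'subject is a schema:<type_name> with schema:name <name>' as Turtle."""
    # Escape backslashes and double quotes so the literal stays well-formed.
    literal = name.replace("\\", "\\\\").replace('"', '\\"')
    return (
        f"@prefix schema: <{SCHEMA_NS}> .\n\n"
        f"<{subject}> a schema:{type_name} ;\n"
        f'    schema:name "{literal}" .\n'
    )


def to_rdfxml(subject, type_name, name):
    """Serialize the same two statements as RDF/XML (typed node element)."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("schema", SCHEMA_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    node = ET.SubElement(
        root, f"{{{SCHEMA_NS}}}{type_name}", {f"{{{RDF_NS}}}about": subject}
    )
    ET.SubElement(node, f"{{{SCHEMA_NS}}}name").text = name
    return ET.tostring(root, encoding="unicode")


prague = "http://dbpedia.org/resource/Prague"
print(to_turtle(prague, "Place", "Prague"))
print(to_rdfxml(prague, "Place", "Prague"))
```

Either string could then be embedded in the XML source in its own element, which is why the choice between them matters mostly for tooling and validation.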

> This is possible in SVG for example.
> 
> Otherwise perhaps an approach like embedding JSON-LD in HTML using the script tag with appropriate MIME type might work.

See approach 5 for a JSON-LD example, where the information is stored as a Web Annotation.
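
For readers who have not seen the Web Annotation model, here is a minimal Python sketch that builds such an annotation for the Prague mention and prints it as JSON-LD. The annotation id, the target document URL (both under example.com) and the function name are placeholders, and the exact shape used in the demo may differ.

```python
import json

ANNO_CONTEXT = "http://www.w3.org/ns/anno.jsonld"


def make_entity_annotation(anno_id, source_url, exact, prefix, suffix, entity_iri):
    """Build a Web Annotation that identifies a text span with an entity IRI."""
    return {
        "@context": ANNO_CONTEXT,
        "id": anno_id,
        "type": "Annotation",
        "motivation": "identifying",
        # The body is the linked-data identifier of the named entity.
        "body": entity_iri,
        # The target points into the document without changing its markup.
        "target": {
            "source": source_url,
            "selector": {
                "type": "TextQuoteSelector",
                "exact": exact,
                "prefix": prefix,
                "suffix": suffix,
            },
        },
    }


annotation = make_entity_annotation(
    anno_id="http://example.com/anno/1",      # placeholder
    source_url="http://example.com/doc.xlf",  # placeholder
    exact="Prague",
    prefix="in the city of ",
    suffix=", a home of XML!",
    entity_iri="http://dbpedia.org/resource/Prague",
)
print(json.dumps(annotation, indent=2))
```

Because the annotation lives outside the source file, this style leaves the original XML untouched, which is relevant when validation must not break.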

> 
> First step would be to define a few concrete use cases.

All use cases in the demo above are related to SEO. The different approaches are offered because they affect existing XML workflows differently, e.g. they may or may not break validation. That topic will be explored further in a new Community Group (CG), see
https://www.w3.org/community/rax/


Best,

Felix

> 
> 
> Regards
> John 
> 
> 
> -------- Original message --------
> From: Keith Schengili-Roberts <keith.roberts@ixiasoft.com>
> Date: 22/06/2016 16:23 (GMT+01:00)
> To: John Walker <john.walker@semaku.com>, Martynas Jusevičius <martynas@graphity.org>
> Cc: public-schemaorg@w3.org, Colin Maudry <colin@maudry.com>
> Subject: Re: schema.org Markup for DITA XML-based Technical Documentation
> 
> I'll be honest and say that I can't give you a straight answer as to which of those options I would go for as I am still researching what is feasible in terms of a bridge between DITA and Schema.org. I was not previously aware of the TechArticle class that you mention, and have added that to my list of things to review.
> 
> Offhand I would say a combination of #1 and #3. Though not designed with SEO in mind, Colin Maudry's DITA OT plugin that produces RDF seems to me to be a natural stepping stone to producing content in RDFa that Schema.org could parse, though asking for Schema.org-aware descriptors to be built into the DITA-OT is also a possibility.
> 
> At the moment there is no effective bridge between DITA-based content and Schema.org, and I really just want to get the ball rolling... (and educate myself as to what is required in the process).
> 
> Cheers!
> 
> -
> 
> Keith Schengili-Roberts
> DITA Information Architect / DITA Specialist
>  
> IXIASOFT 
> 825 Querbes, Suite 200, Montréal, Québec, Canada, H2V 3X1
> tel +1 514 279-4942 / toll free +1 877 279-4942
> robertsk@ixiasoft.com / www.ixiasoft.com
> 
> From: John Walker <john.walker@semaku.com>
> Sent: Wednesday, June 22, 2016 9:04:34 AM
> To: Keith Schengili-Roberts; Martynas Jusevičius
> Cc: public-schemaorg@w3.org; Colin Maudry
> Subject: RE: schema.org Markup for DITA XML-based Technical Documentation
>  
> Hi Keith,
>  
> Could you elaborate on what kind of (meta)data you would want to expose and the sort of use cases you would want to support?
>  
> For example is it to:
> 1. annotate the HTML output (in which case schema.org already has quite broad coverage)
> 2. give some insights into the ‘underlying’ DITA resources (maps, topics, references between them, etc.) to, for example, better analyze re-use and other metrics
> 3. improve SEO by describing the subject matter of the content (for example what product or subject the content is about)
>  
> An existing class such as http://schema.org/TechArticle might already map well to certain DITA concepts.
> Alternatively, is there some way to classify/type DITA content according to some external classification scheme (more specific than SubjectScheme in that it should assert the rdf:type of the content resource)?
>  
> Regards,
>  
> John Walker
> Principal Consultant & co-founder
> Semaku B.V.
> SFJ 4.009, Torenallee 20, 5617 BC Eindhoven
> Mobile: +31 6 475 22030
> Email: john.walker@semaku.com
> Skype: jaw111
> Web: http://semaku.com/
>  
> KvK: 58031405
> BTW: NL852842156B01
> IBAN: NL94 INGB 0008 3219 95
>  
> From: Keith Schengili-Roberts [mailto:keith.roberts@ixiasoft.com]
> Sent: Wednesday, June 22, 2016 2:25 PM
> To: Martynas Jusevičius <martynas@graphity.org>
> Cc: public-schemaorg@w3.org; Colin Maudry <colin@maudry.com>
> Subject: Re: schema.org Markup for DITA XML-based Technical Documentation
>  
> Thanks for mentioning that. I have been in contact with Colin Maudry about this already, and I can see how it might be a stepping stone towards getting Schema.org-readable data from DITA.
>  
> I am still doing research into the feasibility of the whole thing, so I am not clear as to what you mean by your "shoehorn" comment.
>  
> Cheers! 
>  
> From: Martynas Jusevičius <martynas@graphity.org>
> Sent: Tuesday, June 21, 2016 5:18:36 PM
> To: Keith Schengili-Roberts
> Cc: public-schemaorg@w3.org; Colin Maudry
> Subject: Re: schema.org Markup for DITA XML-based Technical Documentation
>  
> If you want to use DITA in RDF, there is this effort by Colin Maudry: http://colin.maudry.com/dita-rdf/#concept/welcome.html
>  
> If you for some reason want to shoehorn it into schema.org specifically, then it sounds like a bad idea.
>  
> On Fri, Jun 17, 2016 at 8:46 PM, Keith Schengili-Roberts <keith.roberts@ixiasoft.com> wrote:
> Hello there:
>  
> I am wondering if there's the possibility of coming up with a Schema.org format for content produced using the DITA XML structured format? It is primarily (but not exclusively) used by technical writing departments to produce content. It is estimated to be used by somewhere between 5% and 10% of all technical writing groups, mainly at medium-sized to large firms. The standard is open, and is managed by OASIS (https://www.oasis-open.org/committees/dita/).
>  
> DITA is topic based, with the latest standard (DITA 1.3) having six topic types: a generic "topic" type, then more specific concept, task, reference, glossary and troubleshooting types. Best practice suggests that each topic come with a short description, so it is possible to easily identify the type of topic and what it describes.
>  
> XHTML output from DITA currently uses Dublin Core descriptive metadata, but it could just as easily use something that Schema.org could recognize, likely using either the RDFa or Microdata formats.
>  
> Is there interest in helping devise a bridge between DITA-based output and something that Schema.org could use? I am happy to be an expert on the DITA end of things if there is someone willing to help guide me through the process as to what's needed on the Schema.org end of things.
>  
> Cheers!

Received on Wednesday, 22 June 2016 19:54:40 UTC