RE: Transforming RDF into XML (alternative using ShEx)

Hi Felix, all,

I must admit I meet no difficulty going from XML to RDF. I'll be happy to share files, and we should have a way of sharing such examples on GitHub or elsewhere. I would personally be interested to actually "see" what problems others might have. Maybe I missed something :--)

Best, Jean-Pierre
________________________________
From: Felix Sasaki [fsasaki@w3.org]
Sent: 09 July 2016 11:52
To: Jose Emilio Labra Gayo
Cc: public-rax@w3.org
Subject: Re: Transforming RDF into XML (alternative using ShEx)

Thanks, Jose. From the call yesterday we came up with a rough classification of how to structure our discussion:

1) functionality needed: convert XML to RDF, or the other way round
2) way to implement it: with standard technologies (e.g. XSLT on the XML side) or with enhancements
3) enhancements being an ad hoc program (see Quentin's approach described in a mail yesterday: "The approach is supplemented by a library of Java functions that can be called from XSLTs to process the graph.")
4) enhancements being kind of "standard" or at least in a declarative form (see, in reply to Quentin, Christoph describing https://github.com/EIS-Bonn/krextor, which provides an abstraction layer)

I would see ShEx falling under 4).

The reason why people do not do 2) is described by Christoph and Quentin:

[
As such, transformation between RDF and XML using XSLTs is not optimal and XSLTs need to be tailored to different flavors of RDF/XML (which is costly). One solution that we have used is to transform the RDF input (in any supported syntax) to be converted into a XML canonical form prior to applying XSLTs.

Indeed many manually-crafted XSLTs suffer from this.
]

Now, if people want to stay with 2), they can follow some potential best practices, e.g. how to store RDF output so that it can be processed with standard XML tools. See the mail from Martynas:

[
We use however a canonical "flat" RDF/XML layout, in which <rdf:Description>s are not nested (default output by Jena writer). By limiting RDF/XML to such layout, and using the key() function to lookup descriptions etc., the transformation becomes quite manageable.
]
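To make the "flat layout plus key() lookup" idea concrete, here is a minimal sketch in Python rather than XSLT. It assumes a flat RDF/XML input in which no rdf:Description is nested; the sample data, the FOAF properties and the function names are hypothetical and only illustrate the lookup pattern, with a dictionary playing the role that xsl:key would play in a stylesheet.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical flat RDF/XML: every rdf:Description is a direct child of rdf:RDF.
FLAT_RDF_XML = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:foaf="{FOAF}">
  <rdf:Description rdf:about="http://example.org/alice">
    <foaf:name>Alice</foaf:name>
    <foaf:knows rdf:resource="http://example.org/bob"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/bob">
    <foaf:name>Bob</foaf:name>
  </rdf:Description>
</rdf:RDF>
"""


def index_descriptions(root):
    """Map each subject IRI to its rdf:Description elements (the role of xsl:key)."""
    index = {}
    for desc in root.findall(f"{{{RDF}}}Description"):
        index.setdefault(desc.get(f"{{{RDF}}}about"), []).append(desc)
    return index


def names_of_known(root, subject):
    """Follow foaf:knows links from `subject` and look up each target's foaf:name."""
    index = index_descriptions(root)
    names = []
    for desc in index.get(subject, []):
        for knows in desc.findall(f"{{{FOAF}}}knows"):
            target = knows.get(f"{{{RDF}}}resource")
            for target_desc in index.get(target, []):
                names.extend(n.text for n in target_desc.findall(f"{{{FOAF}}}name"))
    return names


root = ET.fromstring(FLAT_RDF_XML)
print(names_of_known(root, "http://example.org/alice"))
```

Because the layout is flat, a single pass builds the index and every link can be resolved by IRI; with nested descriptions the same lookup would need to search the whole tree.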

In the list 1-4 above, the actual functionality is *currently* mixed, but we could better classify the discussion if we start with sub tasks such as:

1.1 Going from RDF > XML
1.2 Going from XML > RDF
1.3 Doing round tripping
1.4 Embedding RDF in XML (already feasible with technologies like RDFa or JSON-LD)
1.5 Embedding XML in RDF
…

Just an idea how to structure the discussion - comments welcome.

Best,

Felix


On 09.07.2016 at 08:26, Jose Emilio Labra Gayo <jelabra@gmail.com> wrote:

I could not attend yesterday's call because I was travelling and was not sure at what time I would arrive...

I was looking at the minutes and saw that there is interest in transforming RDF to XML. I wanted to say that one alternative could be Shape Expressions (ShEx).

Although the primary goal of ShEx is to describe and validate RDF data, it can also be extended with semantic actions, which are similar to the semantic actions of parsing grammars. In that way, it can be used to transform the validated RDF into XML or other formats such as JSON.

In [1] we describe ShEx and we include an example of how ShEx can be used to generate XML from RDF.

For more information about ShEx you can visit: http://shex.io/

Also, if you are interested in the subject, we recently gave a tutorial about RDF validation at ESWC. The slides are available online at: http://weso.github.io/RDFValidation_ESWC16/

[1] Shape Expressions: An RDF validation and transformation language, Eric Prud'hommeaux <http://www.w3.org/People/Eric/>, Jose Emilio Labra Gayo <http://www.di.uniovi.es/%7Elabra>, Harold Solbrig <http://www.mayo.edu/research/faculty/solbrig-harold-ralph/bio-20022700>, 10th International Conference on Semantic Systems <http://www.semantics.cc/>, Sept. 2014, Leipzig, Germany. PDF: <http://labra.github.io/ShExcala/papers/semantics2014.pdf> (slides: <http://www.slideshare.net/jelabra/semantics-2014>)

Best regards, Jose Labra

On Fri, Jul 8, 2016 at 3:09 PM, Felix Sasaki <fsasaki@w3.org> wrote:
See
https://www.w3.org/2016/07/08-rax-minutes.html
and below as text. I created this wiki page

https://github.com/rax-w3c/draft-material/wiki

for starting editing. But I have trouble adding people to the group - can somebody help? Florent?

- Felix

   [1]W3C

      [1] http://www.w3.org/

                               - DRAFT -

                                 RAX CG

08 Jul 2016

   See also: [2]IRC log

      [2] http://www.w3.org/2016/07/08-rax-irc

Attendees

   Present
          felix, christian, gerard, john, timea, christoph,
          quentin

   Regrets
          rob, phil

   Chair
          christian

   Scribe
          fsasaki

Contents

     * [3]Topics
         1. [4]new members
         2. [5]action items
         3. [6]input from quentin et al.
         4. [7]mapping XML of dictionary data to RDF
         5. [8]other sections & topics
         6. [9]next call in two weeks from now
     * [10]Summary of Action Items
     * [11]Summary of Resolutions
     __________________________________________________________

   <scribe> chair: christian

   <scribe> scribe: fsasaki

   [12]https://lists.w3.org/Archives/Public/public-rax/2016Jul/0004.html

     [12] https://lists.w3.org/Archives/Public/public-rax/2016Jul/0004.html

new members

   no new members, skipped

action items

   github link for rax cg is [13]https://github.com/rax-w3c

     [13] https://github.com/rax-w3c

input from quentin et al.

   see
   [14]https://lists.w3.org/Archives/Public/public-rax/2016Jul/0002.html

     [14] https://lists.w3.org/Archives/Public/public-rax/2016Jul/0002.html

   "One solution that we have used is to transform the RDF input
   (in any supported syntax) to be converted into a XML canonical
   form prior to applying XSLTs. "

   <clange> help

   processing RDF data in XML / xslt workflows

   christian: also covered by one XML prague scenario

   <quentin> canonical format needs to be optimized for performance

   christoph: similar solution is here
   [15]https://lists.w3.org/Archives/Public/public-rax/2016Jul/0007.html
   ... has RDF abstraction layer, this is independent of RDF
   output serialization
   ... related: what RDF abstract serialization to use

     [15] https://lists.w3.org/Archives/Public/public-rax/2016Jul/0007.html

   collection of state of the art canonical representations of RDF
   in XML

   quentin: you should always optimize the canonical format that
   you are processing

   <clange> there is, e.g., RXR (Regular XML RDF), there is TriX,
   and there are more

   quentin: XSLT is generally not enough, one needs libraries on
   top of XSLT

   christian: "processing RDF data in XML / xslt workflows" -
   important topic, has several solutions, we need to collect info
   on what works well, what does not etc.

   <clange> sources for the canonical formats mentioned above: RXR
   ([16]https://www.dajobe.org/papers/xmleurope2004/), TriX
   ([17]https://www.w3.org/2004/03/trix/,
   [18]https://en.wikipedia.org/wiki/TriX_(syntax) )

     [16] https://www.dajobe.org/papers/xmleurope2004/
     [17] https://www.w3.org/2004/03/trix/
     [18] https://en.wikipedia.org/wiki/TriX_(syntax)
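As an illustration of why a canonical triple-oriented format helps XSLT workflows, the following Python sketch writes a list of triples as a TriX-style document. The element names and namespace follow the TriX proposal as I understand it; the `to_trix` helper, its input convention and the sample triple are assumptions made for this example, and language tags, datatypes and blank nodes are deliberately left out.

```python
import xml.etree.ElementTree as ET

TRIX_NS = "http://www.w3.org/2004/03/trix/trix-1/"


def to_trix(triples):
    """Serialize (subject, predicate, object) tuples as a TriX-style document.

    Subjects and predicates are IRI strings; each object is a
    ("uri", value) or ("literal", value) pair (a convention of this sketch).
    """
    ET.register_namespace("", TRIX_NS)  # emit TriX as the default namespace
    root = ET.Element(f"{{{TRIX_NS}}}TriX")
    graph = ET.SubElement(root, f"{{{TRIX_NS}}}graph")
    for s, p, (kind, value) in triples:
        triple = ET.SubElement(graph, f"{{{TRIX_NS}}}triple")
        ET.SubElement(triple, f"{{{TRIX_NS}}}uri").text = s
        ET.SubElement(triple, f"{{{TRIX_NS}}}uri").text = p
        tag = "uri" if kind == "uri" else "plainLiteral"
        ET.SubElement(triple, f"{{{TRIX_NS}}}{tag}").text = value
    return ET.tostring(root, encoding="unicode")


doc = to_trix([
    ("http://example.org/alice", "http://xmlns.com/foaf/0.1/name", ("literal", "Alice")),
])
print(doc)
```

Every triple has the same three-child shape, so a stylesheet only needs one template per pattern instead of one per RDF/XML flavor.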

   <clange> AFAIK there is also a newer approach based on the
   SPARQL XML results format

   [19]https://github.com/rax-w3c

     [19] https://github.com/rax-w3c

   <clange> @fsasaki my GitHub account is "clange"

   <TimeaT> my GitHub is theRealImy

   <quentin> @fsasaki my GitHub is qhreul

   <scribe> ACTION: felix to add people to rax-cg github [recorded
   in [20]http://www.w3.org/2016/07/08-rax-minutes.html#action01]

     [20] http://www.w3.org/2016/07/08-rax-minutes.html#action01

   quentin: happy to do the description. won't be able to go
   through all material that christoph provided

   christoph: happy to help

   <scribe> ACTION: quentin to start section on RDF data in XML /
   xslt workflows topic. christoph to add material [recorded in
   [21]http://www.w3.org/2016/07/08-rax-minutes.html#action02]

     [21] http://www.w3.org/2016/07/08-rax-minutes.html#action02

   TimeaT: previous problems, solutions were sent directly per
   mail. there was also a template, not sure how to send input

   Christian: use the template, send to the list

   <clange> BTW my colleagues will write up their two use cases
   (AutomationML Industry 4.0 and GATE NLP) by the next telco.

mapping XML of dictionary data to RDF

   timea: dictionary company stores data in XML format, for
   further publishing. They wanted to bring XML into RDF, to be
   able to enrich dictionaries
   ... they can be multi- or monolingual, because language is
   complex
   ... need a way to convert XML to RDF
   ... for working with dictionaries you have to use lemon /
   ontolex, which may be extended
   ... solved currently with XSLT
   ... writing the XSLT proved to be very long and tedious
   ... not sure if we can achieve a general XSLT
   ... next step is how to connect words between dictionaries
   ... we reached very good point with XSLT, just wondering if
   there is a better way
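For readers who have not seen such a mapping, here is a small Python sketch of the XML to RDF direction for dictionary data. The XML structure, the base IRI and the function name are hypothetical (real publisher schemas are far richer), and the use of ontolex:LexicalEntry, ontolex:canonicalForm and ontolex:writtenRep is only one plausible way to model a headword; it is not the mapping discussed on the call.

```python
import xml.etree.ElementTree as ET

ONTOLEX = "http://www.w3.org/ns/lemon/ontolex#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.org/dict/"  # hypothetical base IRI

# Hypothetical dictionary XML, reduced to an id and a headword per entry.
DICT_XML = """
<dictionary lang="en">
  <entry id="e1"><headword>cat</headword></entry>
  <entry id="e2"><headword>dog</headword></entry>
</dictionary>
"""


def entries_to_ntriples(xml_text):
    """Return N-Triples lines for each <entry>; literal escaping is omitted for brevity."""
    root = ET.fromstring(xml_text)
    lang = root.get("lang")
    lines = []
    for entry in root.findall("entry"):
        subject = f"<{BASE}{entry.get('id')}>"
        form = f"<{BASE}{entry.get('id')}#form>"
        headword = entry.findtext("headword")
        lines.append(f"{subject} <{RDF_TYPE}> <{ONTOLEX}LexicalEntry> .")
        lines.append(f"{subject} <{ONTOLEX}canonicalForm> {form} .")
        lines.append(f'{form} <{ONTOLEX}writtenRep> "{headword}"@{lang} .')
    return lines


for line in entries_to_ntriples(DICT_XML):
    print(line)
```

Even this toy mapping needs three triples per entry, which hints at why a hand-written XSLT for a full dictionary schema becomes long.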

   felix: so no integration of RDF in XML needed?

   timea: would have been desired, but dropped because XML mapping
   took a long time
   ... roundtripping is difficult
   ... not sure if it is needed

   christoph: for making it easier to create XML / RDF
   translation, there are solutions
   ... I have developed something, felix has one
   ... so this should be a part: write a classification of
   existing libraries
   ... e.g. libraries that are based on XSLT. You start with that
   and it becomes hard to maintain, then people add an abstraction
   layer
   ... adding link to my implementation here

   <quentin> for RDF -> XML, an approach could be to use SPARQL
   and XML templates

   <quentin> ... essentially creating registered queries

   <clange> [22]https://github.com/EIS-Bonn/krextor is a library
   of XSLT templates and functions for XML to RDF

     [22] https://github.com/EIS-Bonn/krextor

   timea: we used "unified views", an ETL tool that reads the XML
   and writes all data into virtuoso
   ... the actual mapping is independent of unified views

   <clange> @Timea see
   [23]http://ceur-ws.org/Vol-449/ShortPaper2.pdf for an old,
   high-level overview

     [23] http://ceur-ws.org/Vol-449/ShortPaper2.pdf

   [24]http://fsasaki.github.io/stuff/feisgiltt2016/

     [24] http://fsasaki.github.io/stuff/feisgiltt2016/

   felix: currently purely XSLT based solution, without an
   abstraction layer

   timea: classification of existing libraries, describe their
   drawbacks etc.

   christoph: would contribute to such a section

   <quentin> depending on the complexity of the transformation,
   you can use XPATH and templates to populate the RDF

   <scribe> ACTION: timea to start section on existing tools &
   their classification, benefits and drawbacks - others to
   contribute [recorded in
   [25]http://www.w3.org/2016/07/08-rax-minutes.html#action03]

     [25] http://www.w3.org/2016/07/08-rax-minutes.html#action03

   quentin: you can also just use XPath and java @@@ and then are
   able to keep mappings
   ... you can do similar things from RDF > XML perspective, using
   SPARQL templates
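The "query plus template" idea for RDF > XML can be sketched without any RDF library. In the Python example below, a simple predicate match over in-memory triples stands in for a registered SPARQL query, and a string template stands in for the XML template; the data, the `match` and `render` helpers and the output vocabulary are all hypothetical, and a real setup would run actual SPARQL against a store.

```python
from string import Template
from xml.sax.saxutils import escape

NAME = "http://xmlns.com/foaf/0.1/name"

# Hypothetical input graph as (subject, predicate, object) tuples.
TRIPLES = [
    ("http://example.org/alice", NAME, "Alice"),
    ("http://example.org/bob", NAME, "Bob & Co"),
]

# XML template with one placeholder per query variable.
PERSON_TEMPLATE = Template('<person id="$s"><name>$o</name></person>')


def match(triples, predicate):
    """Stand-in for a registered query: bind s and o for a fixed predicate."""
    return [{"s": s, "o": o} for s, p, o in triples if p == predicate]


def render(triples):
    """Fill the template once per result row, escaping values for XML."""
    rows = match(triples, NAME)
    body = "".join(
        PERSON_TEMPLATE.substitute(
            s=escape(row["s"], {'"': "&quot;"}),  # attribute value
            o=escape(row["o"]),                   # element content
        )
        for row in rows
    )
    return f"<people>{body}</people>"


print(render(TRIPLES))
```

Keeping the query and the template as separate, named artifacts is what lets the mappings be maintained independently of the code that applies them.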

   timea: that sounds similar to how the XSLT is built; it has a
   lot of templates
   ... we transformed the XML in a simple RDF, and then converted
   RDF into the model needed
   ... ontolex / lemon + extension

   christoph: I have pointers to further systems, have seen a
   related paper
   ... will add a link here
   ... on roundtripping, let's not forget the XSPARQL
   specification
   ... it combines XQuery and SPARQL

   <clange> SPARQL templates for RDF to XML have actually been
   implemented as a proper language:
   [26]https://hal.inria.fr/hal-01150623/document. XSPARQL
   ([27]http://xsparql.deri.org/) does a similar job but it's IMHO
   more suited for queries over XML or RDF or both at the same
   time, rather than transforming whole RDF graphs or whole XML
   documents.

     [26] https://hal.inria.fr/hal-01150623/document
     [27] http://xsparql.deri.org/

   christoph: it is better for query than complete
   transformation, but very interesting approaches

   <quentin> there is an overlap between the two use cases in
   terms of transformation from one format to another

   felix: asking john about MarkLogic support for RDF and how it
   fits into the picture

other sections & topics

   nothing at the moment

next call in two weeks from now

   same time

   john: we are working on tech. to store RDF in XML
   ... also to make it easier to work with XML described by RDF
   graph
   ... interested in seeing problems that people have and finding
   solutions

   adjourned

Summary of Action Items

   [NEW] ACTION: felix to add people to rax-cg github [recorded in
   [28]http://www.w3.org/2016/07/08-rax-minutes.html#action01]
   [NEW] ACTION: quentin to start section on RDF data in XML /
   xslt workflows topic. christoph to add material [recorded in
   [29]http://www.w3.org/2016/07/08-rax-minutes.html#action02]
   [NEW] ACTION: timea to start section on existing tools & their
   classification, benefits and drawbacks - others to contribute
   [recorded in
   [30]http://www.w3.org/2016/07/08-rax-minutes.html#action03]

     [28] http://www.w3.org/2016/07/08-rax-minutes.html#action01
     [29] http://www.w3.org/2016/07/08-rax-minutes.html#action02
     [30] http://www.w3.org/2016/07/08-rax-minutes.html#action03

Summary of Resolutions

   [End of minutes]
     __________________________________________________________


    Minutes formatted by David Booth's [31]scribe.perl version
    1.144 ([32]CVS log)
    $Date: 2016/07/08 12:48:48 $
     __________________________________________________________

     [31] http://dev.w3.org/cvsweb/~checkout~/2002/scribe/scribedoc.htm
     [32] http://dev.w3.org/cvsweb/2002/scribe/

--
-- Jose Labra


Received on Saturday, 9 July 2016 16:36:59 UTC