RE: Java, RDF, XML ? from Waleed Abdulla on 2005-07-08 (public-rule-workshop-discuss@w3.org from July 2005)

From: Waleed Abdulla <Waleed_Abdulla@xrules.org>
Date: Fri, 8 Jul 2005 01:00:57 -0700
To: "'Sandro Hawke'" <sandro@w3.org>
Cc: "'Anthony Finkelstein'" <anthony@systemwire.com>, <public-rule-workshop-discuss@w3.org>
Message-ID: <001901c58393$2e331ce0$6501a8c0@waleedxp2>
> From: Sandro Hawke [mailto:sandro@w3.org]
>
> Let me work through the mapping a bit here, and see if our options
> become more clear.
> 
> == 1.  Use Programming-Language Objects as Fundamental Data
> ....
> 
> == 2.  Use RDF Triples as Fundamental Data
> .....
> 
> == 3.  Use XML Documents as Fundamental Data
> .....
> ====
> So that's my quick analysis to get us started.  What issues have I
> missed?


Very interesting analysis. Let's see if we can take this a step further. As
you mentioned, Java + Jena + DOM doesn't strike me as a suitable basis for a
standard either. So, for the sake of this discussion, and to keep things
simple, let's drop this option for now.

This leaves us with two Fundamental Data Formats: XML and RDF. So, what are
our options? I'll list three:

    1. Choose RDF and build a rules language for it. XML folks 
       will have to map their data to RDF to be able to use the 
       rules language.

    2. Choose XML and build a rules language for it. RDF folks 
       will have to map their data to XML.

    3. Build two rules languages: one for XML and one for RDF.


Now, let's see if we can find arguments for or against these options. 

    a. A made-up statistic shows that at least 60% of the 
       workshop attendees were interested in RDF rules. 
       Of course, the workshop attendees are not a large 
       enough sample of the overall user base, but it's 
       still something to consider nonetheless.

    b. Between the two formats, the vast majority of data 
       exchanged on the Web today is in XML, not RDF. 

    c. The mapping issue. We always talk about how XML 
       data can be mapped to RDF and vice versa. It's 
       true, mapping can be done. But, the real question 
       is: Is it practical, and will people be willing 
       to do it? To be clear, there are two types of 
       mapping:
          c1. Real mapping; in which I convert my UBL 
              PurchaseOrder from XML format to RDF using 
              XSLT or some other method; apply my business 
              rules; and then convert the results back 
              to XML. Or the other way around (RDF-XML-RDF).

          c2. On the fly mapping (couldn't think of a 
              better name). Here, we're not really doing 
              a data conversion, but instead, allowing RDF 
              technologies to access XML nodes as if they 
              were RDF resources. Or, allowing XPath and 
              other XML tools to access RDF resources as if 
              they were XML nodes. You mentioned treehugger 
              and TriX as possible options. 
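
To make the c1/c2 distinction concrete, here is a minimal Python sketch. It
is only an illustration under stated assumptions: the "xml:name", "xml:text"
and "xml:child" predicate names are made up for this example, the element
names are not real UBL, and child order is ignored (a fuller mapping would
use an RDF list for element content). It is not how treehugger, TriX, or CWM
work.

```python
import xml.etree.ElementTree as ET
from itertools import count


def xml_to_triples(root):
    """c1, 'real' mapping: walk the whole tree once and materialize triples."""
    ids = count(1)
    triples = []

    def visit(elem):
        node = f"_:n{next(ids)}"  # blank-node style identifier (hypothetical)
        triples.append((node, "xml:name", elem.tag))
        for attr, value in elem.attrib.items():
            triples.append((node, attr, value))
        text = (elem.text or "").strip()
        if text:
            triples.append((node, "xml:text", text))
        for child in elem:
            triples.append((node, "xml:child", visit(child)))
        return node

    visit(root)
    return triples


class XmlTripleView:
    """c2, 'on the fly' mapping: answer property lookups against the live
    XML tree without converting the document first."""

    def __init__(self, root):
        self.root = root

    def objects(self, elem, predicate):
        """Yield the values of `predicate` for the resource `elem`."""
        if predicate == "xml:name":
            yield elem.tag
        elif predicate == "xml:text":
            text = (elem.text or "").strip()
            if text:
                yield text
        elif predicate == "xml:child":
            yield from elem
        elif predicate in elem.attrib:
            yield elem.attrib[predicate]
```

With c1 the rules engine only ever sees the triples list, and the results
have to be converted back afterwards. With c2 the XML tree stays the only
copy of the data, and the view translates each lookup as it is made.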


What does this tell us? My own personal feeling leans towards option 3. I
feel that we do need two languages:

1. A rules language for RDF to do inferencing and all 
   those other cool things. And, 

2. A rules language for XML to complement XML Schema 
   validations, add more expressive power to XForms 
   formulas, and allow Java/C# rule engine users to 
   write rules against serialized forms of Java 
   and C# objects.
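
As a rough illustration of the second kind of rule, here is a sketch in
Python of an "if condition then error" check applied directly to an XML
document. Everything in it is an assumption made for the example: the Rule
class, the check function, and the Order/Line/Quantity/UnitPrice/Total
element names are hypothetical and do not come from any existing rules
language or from UBL. It shows the kind of cross-field constraint that a
rule can state in one place.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from decimal import Decimal
from typing import Callable


@dataclass
class Rule:
    name: str                               # label used in error messages
    select: str                             # ElementTree path picking the nodes to test
    violated: Callable[[ET.Element], bool]  # the "if" condition
    message: str                            # the "then error" part


def line_total_mismatch(line):
    """True when Total differs from Quantity * UnitPrice on this line."""
    quantity = Decimal(line.findtext("Quantity"))
    unit_price = Decimal(line.findtext("UnitPrice"))
    total = Decimal(line.findtext("Total"))
    return quantity * unit_price != total


RULES = [
    Rule("line-total", ".//Line", line_total_mismatch,
         "Total must equal Quantity * UnitPrice"),
]


def check(root, rules):
    """Apply every rule to every node it selects; return the error messages."""
    errors = []
    for rule in rules:
        for node in root.findall(rule.select):
            if rule.violated(node):
                errors.append(f"{rule.name}: {rule.message}")
    return errors
```

The rule itself is the single entry in RULES; the traversal and reporting
live in check, so adding a rule does not mean touching any transformation
code.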


With that said, and despite my belief that there is a need for an XML rules
language, I couldn't help but be very excited about N3 for its elegance and
simplicity. So, I'm also testing option 1, and working on an extension to
CWM to test the possibility, and practicality, of expressing XML rules in N3
to apply them to XML documents without real mapping (using on the fly
mapping). I'll try to post my findings once I have something working. 


>(2) Relationship to XSLT: how is XSLT not the rule language people
>    are looking for?   How would this one be different?

I think the answer is all about simplicity. For example, why do we need
Java, C#, Python, etc. if we can do all kinds of coding in assembly
language? It has to be easy for people to understand and work with, and XSLT
is not an easy way to express business rules. I often hear from people who
try it but give up quickly because their XSLT documents become too complex
too quickly. The business rules essentially get lost among a lot of
transformation code. 

Regards,
Waleed



> -----Original Message-----
> From: Sandro Hawke [mailto:sandro@w3.org]
> Sent: Thursday, July 07, 2005 8:46 AM
> To: Waleed Abdulla
> Cc: 'Anthony Finkelstein'; public-rule-workshop-discuss@w3.org
> Subject: Java, RDF, XML ?
> 
> 
> Waleed Abdulla writes:
> > Another dimension to divide rules by is the type of data they work with.
> > I see three here:
> >
> >     * Rules that work with Java or C# objects (the kind common in some
> >       of the popular rules engines). These can be inference, validation,
> >       or execution rules.
> >
> >     * Rules that work with RDF, such as N3 in CWM. These are typically
> >       inference rules, but they can also be used for validation (think
> >       of validation as inferring whether a set of data is valid by
> >       some criteria). I haven't seen examples of RDF rules used for
> >       execution, but I can't think of a reason why they can't be used
> >       to trigger events that an application intercepts to execute tasks.
> >
> >     * Rules that work with XML documents. Again, these can be used to
> >       validate XML documents, infer values of nodes from values of
> >       other nodes, or trigger execution when certain conditions apply.
> 
> This is an interesting issue, which I didn't notice being discussed at
> the workshop.  I suspect a lot of people were thinking either "of
> course it'll be in [Java/RDF/XML]" (for whichever form they are used
> to), or like me they thought "let's not open this can of worms yet!".
> But I agree it needs to be addressed.  (I'd probably add SQL data to
> the above list, but I'll leave it out for now, for simplicity.)
> 
> As with the three types of functionality in the previous mail [1], I
> think these data forms mostly map to each other; they present very
> different views to the world, but their basic expressive power is the
> same, or nearly the same.  That said, Java expressions, SPARQL, and
> XPath feel about as different as three languages possibly could!
> 
> Let me work through the mapping a bit here, and see if our options
> become more clear.
> 
> == 1.  Use Programming-Language Objects as Fundamental Data
> 
>   It's easy enough to see how to access RDF or XML data through Java
>   expressions: look at the relevant APIs.
> 
>   The difficulties here are:
> 
>    (1) Do we use Java, C#, Python, C++, ECMA Script, ... or what?
>        Picking one doesn't seem feasible, and supporting more than one
>        means we don't have much of a standard.
> 
>    (2) Even if we could pick a programming language, we'd also have to
>        pick an API, right?   I don't know the relevant markets well
>        enough to know if there are de facto standards or not.
> 
>   I guess the strawman here would be Java + Jena + DOM.  That doesn't
>   strike me as a suitable basis for a standard, but we can talk about it
>   more if anyone really likes it.
> 
> == 2.  Use RDF Triples as Fundamental Data
> 
>   Data held in programming-language objects can be presented as RDF
>   triples.  The mapping is usually quite natural -- a data member of
>   an instance is pretty much the same as a value for a property of a
>   resource.  Side-effect-free member functions can appear as
>   predicates (properties) as well, although some approach to doing
>   n-ary predicates in RDF is required if the function takes any
>   parameters.  Functions/methods with side-effects need to be handled
>   as actions, which I think are a different topic.
> 
>   XML documents can be presented in RDF fairly naturally too.  An XML
>   element is a resource with some properties (XML attributes) and some
>   content -- an RDF list of more XML elements (as above, recursively)
>   and text elements (literals in RDF).
> 
>   The downsides here are:
> 
>     (1) N-ary predicates and lists need syntactic sugar, at least, to
>         be practical in RDF.   N3's [] and () might do the trick.
> 
>     (2) This is a fairly novel approach.
> 
> == 3.  Use XML Documents as Fundamental Data
> 
>   Java, etc, can already serialize their data as XML, and of course
>   RDF has an XML serialization.  So one could use XPath to access Java
>   or RDF data via an XML infoset.
> 
>   Downsides:
> 
>     (1) Getting at RDF triples via XPath from the RDF/XML syntax is
>         extremely difficult.   This could be addressed with either an
>         XPath extension (eg "treehugger") or a new XML syntax for RDF
>         (eg "TriX").   Or maybe one could extend XPath with rules in
>         the rule language, so one could write an RDF/XML parser in the
>         rule language and then use it.  (Rereading this, that sounds
>         kind of scary, but it's also kind of exciting.)
> 
>     (2) Relationship to XSLT: how is XSLT not the rule language people
>         are looking for?   How would this one be different?
> 
> 
> ====
> 
> So that's my quick analysis to get us started.  What issues have I
> missed?
> 
> 
> > > Each area has its own typical language: "if condition then condition",
> > > "if [not] condition then error", and "if condition then action", but
> > > they obviously have a lot in common, too.
> > >
> > > Do you see a big problem with establishing a standard language of "if
> > > condition then condition/error/action" and defining conformant
> > > implementations in terms of the above functions?
> >
> > I think the real challenge is more about data and expression language.
> > Basically, "what data do our rules work with" and "what language the
> > rules are expressed in". As you suggested, all rules can be expressed as
> > "If condition then condition/error/action", but are we going to write
> > the condition in Java, XML, XPath, RDF, or something else? If we just
> > define a skeleton for If .. Then and leave the language of expression
> > un-standardized, then we haven't really done much.
> >
> >     Obviously, the answer depends on the type of data we're targeting.
> > If our data is Java objects, then XPath is probably not a good choice.
> > If our data is RDF, then some sort of RDF extension might be the right
> > choice. Whether one language can be written to express conditions and
> > actions for all of our Java objects, RDF, and XML is still not clear to
> > me. Any ideas about what this language might look like would be a great
> > contribution (and I'm not talking about a rules language, just an
> > expression language to express the conditions and actions). If we get
> > that, we're halfway done.
> 
> It's not clear to me whether this question needs to be settled
> pre-chartering, or if it can be handled by the Working Group.  It does
> seem like it would be helpful to give it some discussion now, though.
> 
>       -- sandro
> 
> 
> [1] http://lists.w3.org/Archives/Public/public-rule-workshop-discuss/2005Jul/0007
Received on Friday, 8 July 2005 08:01:13 UTC