Re: PROV-ISSUE-648: Can schema be made a bit more jaxb friendly? [XML Serialization]

Hi Hook,

Thanks for looking into this.  I would like to test out the proposed solution and I will provide feedback by the EOD.

Luc, would the proposed solution resolve this issue?

--Stephan

On Mar 28, 2013, at 4:40 AM, "Hua, Hook (388C)" <hook.hua@jpl.nasa.gov> wrote:

> Hi Luc and Stephan,
> 
> Somehow, with jaxb-ri-2.2.6, the removal of xsd:any still generates
> JAXBElements.
> 
> Hopefully we may not need to modify the xsd:any support nor use customized
> bindings mapping for JAXB. In looking into it further, I believe I have
> found a more upstream cause and a potentially cleaner solution.
> 
> Given that we have the following unfriendly XML binding mapping:
> 
> ------------------------------------------------------
> <xs:element name="document" type="prov:Document" />
> 
> <xs:complexType name="Document">
>  <xs:sequence maxOccurs="unbounded">
>    <xs:group ref="prov:documentElements" minOccurs="0"/>
>    <xs:element name="bundleContent" type="prov:BundleConstructor"
> minOccurs="0"/>
>    <xs:any namespace="##other" processContents="lax" minOccurs="0" />
>  </xs:sequence>
>  </xs:complexType>
> 
> ---------
> 
> 
> public class Document {
>  @XmlElementRefs({
>  @XmlElementRef(name = "wasRevisionOf", namespace =
> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required = false),
>  @XmlElementRef(name = "activity", namespace =
> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required = false),
>  @XmlElementRef(name = "collection", namespace =
> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required = false),
>  @XmlElementRef(name = "bundle", namespace =
> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required = false),
>  @XmlElementRef(name = "wasQuotedFrom", namespace =
> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required = false),
>  @XmlElementRef(name = "wasInvalidatedBy", namespace =
> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required = false),
>  ...
>  })
>  @XmlAnyElement(lax = true)
>  protected List<Object> entityAndActivityAndWasGeneratedBy;
>  ...
> ------------------------------------------------------
> 
> It appears that to help retain a round-trip marshalling/unmarshalling of
> our prov:Document, the unbounded sequence of its elements (including
> prov:BundleConstructor) must be uniquely distinguished by JAXB. The
> repeating sequences are treated as a List<Object> of generic JAXBElements,
> where the JAXBElement's QName is used to distinguish elements with
> different names. So the culprit may be the unbounded sequence.
> 
> <xs:sequence maxOccurs="unbounded">
>  <xs:group ref="prov:documentElements" minOccurs="0"/>
>  <xs:element name="bundleContent" type="prov:BundleConstructor"
> minOccurs="0"/>
>  <xs:any namespace="##other" processContents="lax" minOccurs="0" />
>  </xs:sequence>
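
A minimal sketch of the ordering point above, in plain Java with no JAXB dependency. The names `InterleavingSketch` and `Named` and the sample values are hypothetical, made up for illustration; `Named` only stands in for the role JAXBElement plays and is not what xjc emits. It shows that a single name-tagged list keeps interleaved document order, while one list per element name does not.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.namespace.QName;

class InterleavingSketch {
    static final String PROV_NS = "http://www.w3.org/ns/prov#";

    // Hypothetical stand-in for JAXBElement<T>: a value tagged with its element name.
    static final class Named {
        final QName name;
        final String value;
        Named(String localName, String value) {
            this.name = new QName(PROV_NS, localName);
            this.value = value;
        }
    }

    // One heterogeneous list, similar in spirit to the generated List<Object>:
    // items stay in the order they appeared in the document.
    static List<Named> mixed() {
        List<Named> items = new ArrayList<>();
        items.add(new Named("entity", "e1"));
        items.add(new Named("activity", "a1"));
        items.add(new Named("entity", "e2"));
        return items;
    }

    // Values in document order, read straight off the single list.
    static List<String> documentOrder(List<Named> items) {
        List<String> out = new ArrayList<>();
        for (Named item : items) out.add(item.value);
        return out;
    }

    // Same values split into one list per element name and written back
    // list by list: the original interleaving cannot be recovered.
    static List<String> groupedOrder(List<Named> items) {
        List<String> entities = new ArrayList<>();
        List<String> activities = new ArrayList<>();
        for (Named item : items) {
            if ("entity".equals(item.name.getLocalPart())) entities.add(item.value);
            else activities.add(item.value);
        }
        List<String> out = new ArrayList<>(entities);
        out.addAll(activities);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(documentOrder(mixed())); // [e1, a1, e2]
        System.out.println(groupedOrder(mixed()));  // [e1, e2, a1]
    }
}
```
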
> 
> 
> 
> 
> What if we move the unbounded occurrence into a wrapper complex type and
> keep the sequence singular? Below, I've introduced a "prov:DocumentBundle"
> wrapper complex type to which the unbounded occurrence is applied. Then,
> in "prov:DocumentBundle", we maintain the same subelements as before, but as
> one occurrence of the sequence. Running it through JAXB now generates the
> cleaner prov-typed List elements. No customized bindings for JAXB needed.
> No removal of xsd:any needed.
> 
> 
> ------------------------------------------------------
> <xs:element name="document" type="prov:Document" />
> 
>  <xs:complexType name="DocumentBundle">
>  <xs:sequence>
>    <xs:group ref="prov:documentElements" minOccurs="0"/>
>    <xs:element name="bundleContent" type="prov:BundleConstructor"
> minOccurs="0"/>
>    <xs:any namespace="##other" processContents="lax" minOccurs="0" />
>  </xs:sequence>
>  </xs:complexType>
> 
>  <xs:complexType name="Document">
>  <xs:sequence>
>    <xs:element name="documentBundle" type="prov:DocumentBundle"
> minOccurs="0" maxOccurs="unbounded"/>
>  </xs:sequence>
>  </xs:complexType>
> 
> ---------
> 
> public class Document {
>  protected List<DocumentBundle> documentBundle;
>  ...
> 
> public class DocumentBundle {
>  protected List<Entity> entity;
>  protected List<Activity> activity;
>  protected List<Generation> wasGeneratedBy;
>  protected List<Usage> used;
>  protected List<Communication> wasInformedBy;
>  protected List<Start> wasStartedBy;
>  protected List<End> wasEndedBy;
>  protected List<Invalidation> wasInvalidatedBy;
>  ...
> ------------------------------------------------------
> 
> We could also rename the wrapper "prov:DocumentBundle" to something else
> to reduce possible confusion with prov:Bundle and prov:BundleConstructor.
> 
> 
> 
> I think we need to understand that this approach introduces another
> indirection artifact in the PROV-XML encoding. Would this be an acceptable
> compromise approach around the JAXBElement issue?
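
To make the indirection concrete, here is a minimal sketch using only the JDK's javax.xml.validation API. The schema is a simplified, hypothetical stand-in for the wrapper proposal: the `urn:example:prov` namespace, the string-typed `entity` and `activity` elements, and the class name `WrapperEncodingSketch` are all assumptions, and it is not the real prov-core.xsd. It shows that instance documents would need the extra `documentBundle` level, and that interleaved statements would have to be spread over several wrappers.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class WrapperEncodingSketch {
    // Simplified stand-in for the wrapper schema (assumption, not prov-core.xsd).
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
      + " xmlns:p='urn:example:prov' targetNamespace='urn:example:prov'"
      + " elementFormDefault='qualified'>"
      + "<xs:element name='document' type='p:Document'/>"
      + "<xs:complexType name='DocumentBundle'><xs:sequence>"
      + "<xs:element name='entity' type='xs:string' minOccurs='0' maxOccurs='unbounded'/>"
      + "<xs:element name='activity' type='xs:string' minOccurs='0' maxOccurs='unbounded'/>"
      + "</xs:sequence></xs:complexType>"
      + "<xs:complexType name='Document'><xs:sequence>"
      + "<xs:element name='documentBundle' type='p:DocumentBundle'"
      + " minOccurs='0' maxOccurs='unbounded'/>"
      + "</xs:sequence></xs:complexType>"
      + "</xs:schema>";

    // Old-style encoding: statements sit directly under <document>.
    static final String FLAT =
        "<p:document xmlns:p='urn:example:prov'>"
      + "<p:entity>e1</p:entity></p:document>";

    // New-style encoding: e1, a1, e2 in that order needs two wrappers.
    static final String WRAPPED =
        "<p:document xmlns:p='urn:example:prov'>"
      + "<p:documentBundle><p:entity>e1</p:entity><p:activity>a1</p:activity></p:documentBundle>"
      + "<p:documentBundle><p:entity>e2</p:entity></p:documentBundle>"
      + "</p:document>";

    // Returns true when the instance validates against the sketch schema.
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("flat:    " + isValid(FLAT));
        System.out.println("wrapped: " + isValid(WRAPPED));
    }
}
```
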
> 
> --Hook
> 
> 
> 
> 
> 
> 
> 
> 
> On 3/21/13 8:49 AM, "Stephan Zednik" <zednis@rpi.edu> wrote:
> 
>> 
>> 
>> On Mar 21, 2013, at 5:26 AM, Luc Moreau <l.moreau@ecs.soton.ac.uk> wrote:
>> 
>>> Hi Hook,
>>> 
>>> Thanks for this analysis.
>>> 
>>> In this specific instance, I think that it is the element
>>> <xs:any namespace="##other" processContents="lax" minOccurs="0" />
>>> occurring inside
>>> <xs:sequence maxOccurs="unbounded">
>>> that causes these jaxb elements to be generated.
>>> 
>>> If you were to remove xsd:any there, jaxbElements would no longer be
>>> generated.
>>> 
>>> While we want to allow the possibility of elements from other schemas,
>>> do we really want to allow them anywhere inside a document/bundle?
>> 
>> We want to provide for elements from other schemas but I don't think we
>> formally identified what areas we intend to allow non-prov elements in
>> before we added this functionality to the schema.
>> 
>> What if we made a FAQ entry about OXM mappings with PROV-XML and created
>> a customized schema or bindings file specifically for JAXB code
>> generation?  This would allow us to work on it asynchronously from the
>> document and past the note publication. It would also allow us to
>> introduce JAXB-specific solutions that I do not think make sense in the
>> official schema or note.
>> 
>> --Stephan
>> 
>>> 
>>> Luc
>>> 
>>> 
>>> On 03/21/2013 11:09 AM, Hua, Hook (388C) wrote:
>>>> Hi Luc,
>>>> 
>>>> I'm using jaxb-ri-2.2.6 against our latest prov*.xsd and seeing
>>>> slightly
>>>> different bindings with JAXBElement:
>>>> 
>>>> public class Document {
>>>>    @XmlElementRefs({
>>>>        @XmlElementRef(name = "hadPrimarySource", namespace =
>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required =
>>>> false),
>>>>        @XmlElementRef(name = "agent", namespace =
>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required =
>>>> false),
>>>>        @XmlElementRef(name = "activity", namespace =
>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required =
>>>> false),
>>>>        @XmlElementRef(name = "organization", namespace =
>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required =
>>>> false),
>>>>        @XmlElementRef(name = "softwareAgent", namespace =
>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required =
>>>> false),
>>>> ....
>>>> 
>>>> 
>>>> 
>>>> Some findings:
>>>> 
>>>> 
>>>> (1) JAXB's generation of JAXBElement<T> classes seems to be a wrapper
>>>> approach to preserve sufficient information in the schema for
>>>> round-trip
>>>> marshaling & unmarshalling of values in XML instances. More
>>>> specifically,
>>>> it wraps the data with a QName and a nillable flag [1].
>>>> 
>>>> It appears that a frequent cause of JAXB producing JAXBElement<T>
>>>> is
>>>> its attempt to preserve elements with both minOccurs=0 and
>>>> nillable=true.
>>>> JAXB needs to distinguish between the two cases where:
>>>> 
>>>>  a. element missing, minOccurs=0, then jaxbElement==null
>>>>  b. element present, xsi:nil=true, then jaxbElement.isNil()==true
>>>> 
>>>> It would not be possible to distinguish between these two states if the
>>>> bindings were the raw types.
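
A minimal sketch of that distinction in plain Java, without any JAXB dependency. `Wrapped` is a hypothetical stand-in for JAXBElement<T>, reduced to a value plus a nil flag, and `NilVersusAbsentSketch` is likewise an illustrative name. With the wrapper, a null reference and a nil element are different states; with the raw type, both collapse to null.

```java
class NilVersusAbsentSketch {
    // Hypothetical stand-in for JAXBElement<T>: a value plus an xsi:nil flag.
    static final class Wrapped<T> {
        private final T value;
        private final boolean nil;
        Wrapped(T value, boolean nil) {
            this.value = value;
            this.nil = nil;
        }
        boolean isNil() { return nil; }
        T getValue() { return value; }
    }

    // With a wrapper, three states can be told apart.
    static String describe(Wrapped<String> element) {
        if (element == null) return "absent";   // case a: element missing
        if (element.isNil()) return "nil";      // case b: present with xsi:nil=true
        return "value=" + element.getValue();
    }

    // With the raw type, "missing" and "nil" both arrive as null.
    static String describeRaw(String raw) {
        return raw == null ? "absent-or-nil" : "value=" + raw;
    }

    public static void main(String[] args) {
        System.out.println(describe(null));
        System.out.println(describe(new Wrapped<String>(null, true)));
        System.out.println(describe(new Wrapped<String>("x", false)));
        System.out.println(describeRaw(null));
    }
}
```
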
>>>> 
>>>> 
>>>> 
>>>> (2) It would be possible to customize the JAXB bindings [2] to ignore
>>>> the
>>>> full round-trip requirement. The "generateElementProperty=false"
>>>> customization option "can be used to generate an alternate developer
>>>> friendly but lossy binding" [3].
>>>> 
>>>> I tried variations of a "bindings.xjb" customization file:
>>>> 
>>>> <jaxb:bindings version="2.1"
>>>>  xmlns:jaxb="http://java.sun.com/xml/ns/jaxb"
>>>>  xmlns:xjc="http://java.sun.com/xml/ns/jaxb/xjc"
>>>>  xmlns:xs="http://www.w3.org/2001/XMLSchema">
>>>>  <jaxb:bindings schemaLocation="prov-core.xsd"
>>>>    node="//xs:complexType[@name='Document']">
>>>>    <jaxb:globalBindings generateElementProperty="false" />
>>>>  </jaxb:bindings>
>>>> </jaxb:bindings>
>>>> 
>>>> $ xjc.sh -d BINDINGS -b bindings.xjb prov.xsd
>>>> 
>>>> But none truly eliminated the JAXBElement<T> from the bindings.
>>>> 
>>>> 
>>>> 
>>>> (3) Nowhere in our prov-core.xsd do we define minOccurs=0 in
>>>> conjunction
>>>> with nillable=true. In my attempts with JAXB, I'm seeing JAXBElements
>>>> appearing in the bindings for the (a) Document class and (b)
>>>> BundleConstructor class. Both types leverage the prov:documentElements
>>>> grouping.
>>>> 
>>>>  <xs:element name="document" type="prov:Document" />
>>>> <xs:complexType name="Document">
>>>>  <xs:sequence maxOccurs="unbounded">
>>>>    <xs:group ref="prov:documentElements" minOccurs="0"/>
>>>>    <xs:element name="bundleContent" type="prov:BundleConstructor"
>>>> minOccurs="0"/>
>>>>    <xs:any namespace="##other" processContents="lax" minOccurs="0" />
>>>>  </xs:sequence>
>>>>  </xs:complexType>
>>>> 
>>>> It's unclear if there is some nillable-like effect that triggers JAXB
>>>> to
>>>> generate the JAXBElements.
>>>> 
>>>> 
>>>> 
>>>> (4) On the upside, JAXB does provide an ObjectFactory class as part of
>>>> the
>>>> generated bindings that defines creational factory methods to generate
>>>> the
>>>> JAXBElement instances. For example:
>>>> 
>>>>  public JAXBElement<Usage> createUsed(Usage value)
>>>> 
>>>> Still, I agree that it is not as clean.
>>>> 
>>>> 
>>>> --Hook
>>>> 
>>>> 
>>>> [1] http://docs.oracle.com/javaee/5/api/javax/xml/bind/JAXBElement.html
>>>> [2]
>>>> http://docs.oracle.com/cd/E17802_01/webservices/webservices/docs/1.5/tutorial/doc/JAXBUsing4.html#wp148515
>>>> [3]
>>>> http://docs.oracle.com/cd/E17802_01/webservices/webservices/reference/tutorials/wsit/doc/DataBinding5.html
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On 3/8/13 4:20 AM, "Provenance Working Group Issue Tracker"
>>>> <sysbot+tracker@w3.org> wrote:
>>>> 
>>>>> PROV-ISSUE-648: Can schema be made a bit more jaxb friendly? [XML
>>>>> Serialization]
>>>>> 
>>>>> http://www.w3.org/2011/prov/track/issues/648
>>>>> 
>>>>> Raised by: Luc Moreau
>>>>> On product: XML Serialization
>>>>> 
>>>>> 
>>>>> Hi
>>>>> 
>>>>> I have ported the ProvToolbox and the ProvValidator to the new XML
>>>>> schema.
>>>>> I just wanted to report on my experience with the schema and JAXB.
>>>>> Obviously, others may have better experience with JAXB and may be able
>>>>> to help on some of the issues I encountered.
>>>>> 
>>>>> Everything worked fine, except:
>>>>> - <xs:element ref="prov:internalElement" abstract="true"/>
>>>>> - extensibility <xs:any namespace="##other"/> in Document and Bundle
>>>>> 
>>>>> 
>>>>> These two constructs, while processable by JAXB, are not
>>>>> JAXB-friendly.
>>>>> 
>>>>> Indeed, JAXB compiles the schema into a list containing all possible
>>>>> statements.
>>>>> 
>>>>>   protected List<Object> entityAndActivityAndWasGeneratedBy;
>>>>> 
>>>>> However, the presence of an abstract element and an <any/> element
>>>>> results in the content of that list being of type:
>>>>> 
>>>>> 
>>>>>   @XmlElementRefs({
>>>>>       @XmlElementRef(name = "used", namespace =
>>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class),
>>>>>       @XmlElementRef(name = "wasAssociatedWith", namespace =
>>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class),
>>>>>       @XmlElementRef(name = "person", namespace =
>>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class),
>>>>>       @XmlElementRef(name = "entity", namespace =
>>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class),
>>>>>       @XmlElementRef(name = "wasInfluencedBy", namespace =
>>>>> "http://www.w3.org/ns/prov#"
>>>>> ....
>>>>>   })
>>>>> 
>>>>>   @XmlAnyElement(lax = true)
>>>>>   protected List<Object> entityAndActivityAndWasGeneratedBy;
>>>>> 
>>>>> where all data structures are wrapped up in this unpleasant
>>>>> JAXBElement.
>>>>> 
>>>>> Without these features, we get a much more natural mapping:
>>>>>   @XmlElements({
>>>>>       @XmlElement(name = "entity", namespace =
>>>>> "http://www.w3.org/ns/prov#", type = Entity.class),
>>>>>       @XmlElement(name = "activity", namespace =
>>>>> "http://www.w3.org/ns/prov#", type = Activity.class),
>>>>>       @XmlElement(name = "wasGeneratedBy", namespace =
>>>>> "http://www.w3.org/ns/prov#", type = WasGeneratedBy.class),
>>>>>       @XmlElement(name = "used", namespace =
>>>>> "http://www.w3.org/ns/prov#", type = Used.class),
>>>>>       @XmlElement(name = "wasInformedBy", namespace =
>>>>> "http://www.w3.org/ns/prov#", type = WasInformedBy.class),
>>>>>   ...
>>>>> })
>>>>> 
>>>>> So, how did I solve the problem?  I inserted the extension schemas
>>>>> into the schema file, and hence got rid of the abstract element.  I am
>>>>> ok with this. We could possibly provide a utility to do that
>>>>> transformation.
>>>>> 
>>>>> For the extensibility, I used a different definition. It happens to
>>>>> parse prov-xml compliant xml. When serializing, it  puts all
>>>>> extensibility elements at the end.  This is not a satisfactory
>>>>> solution, and is likely to be dependent on the jaxb implementation
>>>>> (though I am not entirely sure).
>>>>> 
>>>>> 
>>>>>  <xs:complexType name="Document">
>>>>>    <xs:sequence>
>>>>>      <xs:choice maxOccurs="unbounded">
>>>>>        <xs:group ref="prov:documentElements"/>
>>>>>        <xs:element name="bundleContent" type="prov:NamedBundle"/>
>>>>>      </xs:choice>
>>>>>      <xs:any namespace="##other" processContents="lax" minOccurs="0"
>>>>> maxOccurs="unbounded"/>
>>>>>    </xs:sequence>
>>>>>  </xs:complexType>
>>>>> 
>>>>> Can something be done to make the XML schema a bit more jaxb friendly,
>>>>> while still keeping the same flexibility?  Thoughts welcome.
>>>>> 
>>>>> Cheers,
>>>>> Luc
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> 
>>> 
>>> -- 
>>> Professor Luc Moreau
>>> Electronics and Computer Science   tel:   +44 23 8059 4487
>>> University of Southampton          fax:   +44 23 8059 2865
>>> Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
>>> United Kingdom                     http://www.ecs.soton.ac.uk/~lavm
>>> 
>>> 
>>> 
>> 
>> 
> 
> 

Received on Thursday, 28 March 2013 15:05:02 UTC