Re: PROV-ISSUE-648: Can schema be made a bit more jaxb friendly? [XML Serialization]

Hi Luc and Stephan,

Somehow, with jaxb-ri-2.2.6, JAXBElements are still generated even after
xsd:any is removed.

Hopefully we need neither to modify the xsd:any support nor to use a
customized bindings mapping for JAXB. Looking into it further, I believe I
have found a more upstream cause and a potentially cleaner solution.

Currently we have the following JAXB-unfriendly schema definition and the
binding that JAXB generates from it:

------------------------------------------------------
<xs:element name="document" type="prov:Document" />

<xs:complexType name="Document">
  <xs:sequence maxOccurs="unbounded">
    <xs:group ref="prov:documentElements" minOccurs="0"/>
    <xs:element name="bundleContent" type="prov:BundleConstructor" minOccurs="0"/>
    <xs:any namespace="##other" processContents="lax" minOccurs="0"/>
  </xs:sequence>
</xs:complexType>

---------


public class Document {
  @XmlElementRefs({
  @XmlElementRef(name = "wasRevisionOf", namespace =
"http://www.w3.org/ns/prov#", type = JAXBElement.class, required = false),
  @XmlElementRef(name = "activity", namespace =
"http://www.w3.org/ns/prov#", type = JAXBElement.class, required = false),
  @XmlElementRef(name = "collection", namespace =
"http://www.w3.org/ns/prov#", type = JAXBElement.class, required = false),
  @XmlElementRef(name = "bundle", namespace =
"http://www.w3.org/ns/prov#", type = JAXBElement.class, required = false),
  @XmlElementRef(name = "wasQuotedFrom", namespace =
"http://www.w3.org/ns/prov#", type = JAXBElement.class, required = false),
  @XmlElementRef(name = "wasInvalidatedBy", namespace =
"http://www.w3.org/ns/prov#", type = JAXBElement.class, required = false),
  ...
  })
  @XmlAnyElement(lax = true)
  protected List<Object> entityAndActivityAndWasGeneratedBy;
  ...
------------------------------------------------------

It appears that, to support round-trip marshalling/unmarshalling of our
prov:Document, JAXB must be able to uniquely distinguish the elements
(including prov:BundleConstructor) in the unbounded sequence. Because the
sequence repeats, its elements are bound to a single List<Object> of generic
JAXBElements, where each JAXBElement's QName distinguishes elements with
different names. So the culprit may be the unbounded sequence.

<xs:sequence maxOccurs="unbounded">
  <xs:group ref="prov:documentElements" minOccurs="0"/>
  <xs:element name="bundleContent" type="prov:BundleConstructor" minOccurs="0"/>
  <xs:any namespace="##other" processContents="lax" minOccurs="0"/>
</xs:sequence>
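
To illustrate what a List<Object> of QName-tagged wrappers means for client
code, here is a minimal sketch of picking elements out of such a mixed list
by name. It is an illustration under assumptions, not the generated code:
WrappedElement is a hypothetical stand-in for javax.xml.bind.JAXBElement (so
the sketch compiles without a JAXB dependency), MixedContentDemo and
valuesNamed are names made up for this example, and plain strings stand in
for the generated PROV types.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.namespace.QName;

// Hypothetical stand-in for javax.xml.bind.JAXBElement<T>: a value tagged
// with the QName of the element it was read from.
final class WrappedElement<T> {
    private final QName name;
    private final T value;

    WrappedElement(QName name, T value) {
        this.name = name;
        this.value = value;
    }

    QName getName() { return name; }
    T getValue() { return value; }
}

class MixedContentDemo {
    static final String PROV_NS = "http://www.w3.org/ns/prov#";

    // Returns the values of all wrapped elements whose QName matches the
    // given local name in the PROV namespace. Every consumer of the mixed
    // list needs this kind of instanceof check plus QName comparison.
    static List<Object> valuesNamed(List<Object> mixed, String localName) {
        QName wanted = new QName(PROV_NS, localName);
        List<Object> out = new ArrayList<>();
        for (Object item : mixed) {
            if (item instanceof WrappedElement) {
                WrappedElement<?> element = (WrappedElement<?>) item;
                if (wanted.equals(element.getName())) {
                    out.add(element.getValue());
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Object> mixed = new ArrayList<>();
        mixed.add(new WrappedElement<>(new QName(PROV_NS, "entity"), "e1"));
        mixed.add(new WrappedElement<>(new QName(PROV_NS, "activity"), "a1"));
        mixed.add(new WrappedElement<>(new QName(PROV_NS, "entity"), "e2"));
        System.out.println(valuesNamed(mixed, "entity")); // prints [e1, e2]
    }
}
```

With the real bindings the same pattern applies to JAXBElement.getName() and
JAXBElement.getValue(), plus a cast to the expected PROV type.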




What if we move the unbounded occurrence onto a wrapper element and keep
the sequence itself singular? Below, I've introduced a "prov:DocumentBundle"
complex type that keeps the same subelements as before, but as a single
occurrence of the sequence. "prov:Document" then contains a
"documentBundle" element of that type, and the unbounded occurrence is
applied to that element instead. Running this through JAXB now generates
the cleaner prov-typed List elements. No customized JAXB bindings are
needed, and xsd:any does not have to be removed.


------------------------------------------------------
<xs:element name="document" type="prov:Document" />

<xs:complexType name="DocumentBundle">
  <xs:sequence>
    <xs:group ref="prov:documentElements" minOccurs="0"/>
    <xs:element name="bundleContent" type="prov:BundleConstructor" minOccurs="0"/>
    <xs:any namespace="##other" processContents="lax" minOccurs="0"/>
  </xs:sequence>
</xs:complexType>

<xs:complexType name="Document">
  <xs:sequence>
    <xs:element name="documentBundle" type="prov:DocumentBundle"
                minOccurs="0" maxOccurs="unbounded"/>
  </xs:sequence>
</xs:complexType>

---------

public class Document {
  protected List<DocumentBundle> documentBundle;
  ...

public class DocumentBundle {
  protected List<Entity> entity;
  protected List<Activity> activity;
  protected List<Generation> wasGeneratedBy;
  protected List<Usage> used;
  protected List<Communication> wasInformedBy;
  protected List<Start> wasStartedBy;
  protected List<End> wasEndedBy;
  protected List<Invalidation> wasInvalidatedBy;
  ...
------------------------------------------------------
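
As a rough sketch of how client code could consume such typed lists, the
example below walks the bundles and collects entity identifiers without any
casts or QName checks. The nested classes are hypothetical stand-ins for what
xjc would generate (the real classes carry JAXB annotations and many more
properties), TypedAccessDemo and entityIds are made-up names, and since
documentBundle is declared with maxOccurs="unbounded" the sketch assumes it
is exposed as a list.

```java
import java.util.ArrayList;
import java.util.List;

class TypedAccessDemo {
    // Hypothetical stand-in for the generated Entity class.
    static class Entity {
        final String id;
        Entity(String id) { this.id = id; }
    }

    // Hypothetical stand-in for the generated DocumentBundle class.
    static class DocumentBundle {
        protected List<Entity> entity;

        // Lazily created live list, mirroring the getter shape that xjc
        // emits for repeated elements.
        List<Entity> getEntity() {
            if (entity == null) {
                entity = new ArrayList<>();
            }
            return entity;
        }
    }

    // Hypothetical stand-in for the generated Document class.
    static class Document {
        protected List<DocumentBundle> documentBundle;

        List<DocumentBundle> getDocumentBundle() {
            if (documentBundle == null) {
                documentBundle = new ArrayList<>();
            }
            return documentBundle;
        }
    }

    // Collects the ids of all entities across all bundles of a document.
    static List<String> entityIds(Document doc) {
        List<String> ids = new ArrayList<>();
        for (DocumentBundle bundle : doc.getDocumentBundle()) {
            for (Entity entity : bundle.getEntity()) {
                ids.add(entity.id);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        DocumentBundle bundle = new DocumentBundle();
        bundle.getEntity().add(new Entity("ex:e1"));
        bundle.getEntity().add(new Entity("ex:e2"));
        Document doc = new Document();
        doc.getDocumentBundle().add(bundle);
        System.out.println(entityIds(doc)); // prints [ex:e1, ex:e2]
    }
}
```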

We could also rename the wrapper "prov:DocumentBundle" to something else to
reduce possible confusion with prov:Bundle and prov:BundleConstructor.



We should keep in mind that this approach introduces an extra level of
indirection in the PROV-XML encoding. Would this be an acceptable compromise
for working around the JAXBElement issue?
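
To make the extra level concrete, here is a small sketch that parses two
hand-written instance documents with the JDK's DOM parser and lists the
element children of the root. The sample XML, the ex: namespace, and the
names WrapperIndirectionDemo and rootChildren are assumptions made for this
illustration; it also assumes the wrapper element would be serialized in the
prov namespace as prov:documentBundle.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class WrapperIndirectionDemo {
    static final String NS_DECLS =
        " xmlns:prov='http://www.w3.org/ns/prov#' xmlns:ex='http://example.org/'";

    // Current encoding: statements are direct children of prov:document.
    static final String FLAT =
        "<prov:document" + NS_DECLS + ">"
      + "<prov:entity prov:id='ex:e1'/>"
      + "<prov:activity prov:id='ex:a1'/>"
      + "</prov:document>";

    // Encoding under the wrapper proposal (assumed element name and namespace).
    static final String WRAPPED =
        "<prov:document" + NS_DECLS + ">"
      + "<prov:documentBundle>"
      + "<prov:entity prov:id='ex:e1'/>"
      + "<prov:activity prov:id='ex:a1'/>"
      + "</prov:documentBundle>"
      + "</prov:document>";

    // Returns the local names of the element children of the root element.
    static List<String> rootChildren(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            List<String> names = new ArrayList<>();
            NodeList children = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    names.add(child.getLocalName());
                }
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rootChildren(FLAT));    // prints [entity, activity]
        System.out.println(rootChildren(WRAPPED)); // prints [documentBundle]
    }
}
```

Existing PROV-XML instances in the flat form would therefore not validate
against the wrapped schema without being rewritten.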

--Hook





 


On 3/21/13 8:49 AM, "Stephan Zednik" <zednis@rpi.edu> wrote:

>
>
>On Mar 21, 2013, at 5:26 AM, Luc Moreau <l.moreau@ecs.soton.ac.uk> wrote:
>
>> Hi Hook,
>> 
>> Thanks for this analysis.
>> 
>> In this specific instance, I think that it is the element
>>  <xs:any namespace="##other" processContents="lax" minOccurs="0" />
>> occurring inside
>>  <xs:sequence maxOccurs="unbounded">
>> that causes these jaxb elements to be generated.
>> 
>> If you were to remove xsd:any there, jaxbElements would no longer be
>>generated.
>> 
>> While we want to allow the possibility of elements from other schemas,
>>do we
>> really want to allow them anywhere inside a document/bundle?
>
>We want to provide for elements from other schemas but I don't think we
>formally identified what areas we intend to allow non-prov elements in
>before we added this functionality to the schema.
>
>What if we made a FAQ entry about OXM mappings with PROV-XML and created
>a customized schema or bindings file specifically for JAXB code
>generation?  This would allow us to work on it asynchronously from the
>document and past the note publication; it would also allow us to
>introduce JAXB-specific solutions that I do not think make sense in the
>official schema or note.
>
>--Stephan
>
>> 
>> Luc
>> 
>> 
>> On 03/21/2013 11:09 AM, Hua, Hook (388C) wrote:
>>> Hi Luc,
>>> 
>>> I'm using jaxb-ri-2.2.6 against our latest prov*.xsd and seeing
>>>slightly
>>> different bindings with JAXBElement:
>>> 
>>> public class Document {
>>>     @XmlElementRefs({
>>>         @XmlElementRef(name = "hadPrimarySource", namespace =
>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required =
>>>false),
>>>         @XmlElementRef(name = "agent", namespace =
>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required =
>>>false),
>>>         @XmlElementRef(name = "activity", namespace =
>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required =
>>>false),
>>>         @XmlElementRef(name = "organization", namespace =
>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required =
>>>false),
>>>         @XmlElementRef(name = "softwareAgent", namespace =
>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required =
>>>false),
>>> ....
>>> 
>>> 
>>> 
>>> Some findings:
>>> 
>>> 
>>> (1) JAXB's generation of JAXBElement<T> classes seems to be a wrapper
>>> approach to preserve sufficient information in the schema for
>>>round-trip
>>> marshaling & unmarshalling of values in XML instances. More
>>>specifically,
>>> it wraps the data with a QName and a nillable flag [1].
>>> 
>>> It appears that a frequent cause of JAXB producing JAXBElement<T>
>>>is
>>> its attempt to preserve elements with both minOccurs=0 and
>>>nillable=true.
>>> JAXB needs to distinguish between the two cases where:
>>> 
>>>   a. element missing, minOccurs=0, then jaxbElement==null
>>>   b. element present, xsi:nil=true, then jaxbElement.isNil()==true
>>> 
>>> It would not be possible to distinguish between these two states if the
>>> bindings were the raw types.
>>> 
>>> 
>>> 
>>> (2) It would be possible to customize the JAXB bindings [2] to ignore
>>>the
>>> full round-trip requirement. The "generateElementProperty=false"
>>> customization option "can be used to generate an alternate developer
>>> friendly but lossy binding" [3].
>>> 
>>> I tried variations of a "bindings.xjb" customization file:
>>> 
>>> <jaxb:bindings version="2.1"
>>>   xmlns:jaxb="http://java.sun.com/xml/ns/jaxb"
>>>   xmlns:xjc="http://java.sun.com/xml/ns/jaxb/xjc"
>>>   xmlns:xs="http://www.w3.org/2001/XMLSchema">
>>>   <jaxb:bindings schemaLocation="prov-core.xsd"
>>>     node="//xs:complexType[@name='Document']">
>>>     <jaxb:globalBindings generateElementProperty="false" />
>>>   </jaxb:bindings>
>>> </jaxb:bindings>
>>> 
>>> $ xjc.sh -d BINDINGS -b bindings.xjb prov.xsd
>>> 
>>> But none truly eliminated the JAXBElement<T> from the bindings.
>>> 
>>> 
>>> 
>>> (3) Nowhere in our prov-core.xsd do we define minOccurs=0 in
>>>conjunction
>>> with nillable=true. In my attempts with JAXB, I'm seeing JAXBElements
>>> appearing in the bindings for the (a) Document class and (b)
>>> BundleConstructor class. Both types leverage the prov:documentElements
>>> grouping.
>>> 
>>>   <xs:element name="document" type="prov:Document" />
>>> <xs:complexType name="Document">
>>>   <xs:sequence maxOccurs="unbounded">
>>>     <xs:group ref="prov:documentElements" minOccurs="0"/>
>>>     <xs:element name="bundleContent" type="prov:BundleConstructor"
>>> minOccurs="0"/>
>>>     <xs:any namespace="##other" processContents="lax" minOccurs="0" />
>>>   </xs:sequence>
>>>   </xs:complexType>
>>> 
>>> It's unclear if there is some nillable-like effect that triggers JAXB
>>>to
>>> generate the JAXBElements.
>>> 
>>> 
>>> 
>>> (4) On the upside, JAXB does provide an ObjectFactory class as part of
>>>the
>>> generated bindings that define creational factory methods to generate
>>>the
>>> JAXBElement instances. For example:
>>> 
>>>   public JAXBElement<Usage> createUsed(Usage value)
>>> 
>>> Still, I agree that it is not as clean.
>>> 
>>> 
>>> --Hook
>>> 
>>> 
>>> [1] http://docs.oracle.com/javaee/5/api/javax/xml/bind/JAXBElement.html
>>> [2] http://docs.oracle.com/cd/E17802_01/webservices/webservices/docs/1.5/tutorial/doc/JAXBUsing4.html#wp148515
>>> [3] http://docs.oracle.com/cd/E17802_01/webservices/webservices/reference/tutorials/wsit/doc/DataBinding5.html
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On 3/8/13 4:20 AM, "Provenance Working Group Issue Tracker"
>>> <sysbot+tracker@w3.org> wrote:
>>> 
>>>> PROV-ISSUE-648: Can schema be made a bit more jaxb friendly? [XML
>>>> Serialization]
>>>> 
>>>> http://www.w3.org/2011/prov/track/issues/648
>>>> 
>>>> Raised by: Luc Moreau
>>>> On product: XML Serialization
>>>> 
>>>> 
>>>> Hi
>>>> 
>>>> I have ported the ProvToolbox and the ProvValidator to the new XML
>>>>schema.
>>>> I just wanted to report on my experience with the schema and JAXB.
>>>> Obviously, others may have better experience with JAXB and may be able
>>>> to help on some of the issues I encountered.
>>>> 
>>>> Everything worked fine, except:
>>>> - <xs:element ref="prov:internalElement" abstract="true"/>
>>>> - extensibility <xs:any namespace="##other"/> in Document and Bundle
>>>> 
>>>> 
>>>> These two constructs, while processable by JAXB, are not
>>>>JAXB-friendly.
>>>> 
>>>> Indeed, JAXB compiles the schema in a list containing all possible
>>>> statements.
>>>> 
>>>>    protected List<Object> entityAndActivityAndWasGeneratedBy;
>>>> 
>>>> However, the presence of an abstract element and an <any/> element
>>>> results in the content of that list being of type:
>>>> 
>>>> 
>>>>    @XmlElementRefs({
>>>>        @XmlElementRef(name = "used", namespace =
>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class),
>>>>        @XmlElementRef(name = "wasAssociatedWith", namespace =
>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class),
>>>>        @XmlElementRef(name = "person", namespace =
>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class),
>>>>        @XmlElementRef(name = "entity", namespace =
>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class),
>>>>        @XmlElementRef(name = "wasInfluencedBy", namespace =
>>>> "http://www.w3.org/ns/prov#"
>>>> ....
>>>>    })
>>>> 
>>>>    @XmlAnyElement(lax = true)
>>>>    protected List<Object> entityAndActivityAndWasGeneratedBy;
>>>> 
>>>> where all data structures are wrapped up in this unpleasant
>>>>JAXBElement.
>>>> 
>>>> Without these features, we get a much more natural mapping:
>>>>    @XmlElements({
>>>>        @XmlElement(name = "entity", namespace =
>>>> "http://www.w3.org/ns/prov#", type = Entity.class),
>>>>        @XmlElement(name = "activity", namespace =
>>>> "http://www.w3.org/ns/prov#", type = Activity.class),
>>>>        @XmlElement(name = "wasGeneratedBy", namespace =
>>>> "http://www.w3.org/ns/prov#", type = WasGeneratedBy.class),
>>>>        @XmlElement(name = "used", namespace =
>>>> "http://www.w3.org/ns/prov#", type = Used.class),
>>>>        @XmlElement(name = "wasInformedBy", namespace =
>>>> "http://www.w3.org/ns/prov#", type = WasInformedBy.class),
>>>>    ...
>>>> })
>>>> 
>>>> So, how did I solve the problem?  I inserted the extension schemas
>>>> into the schema file, and hence got rid of the abstract element.  I am
>>>> ok with this. We could possibly provide a utility to do that
>>>> transformation.
>>>> 
>>>> For the extensibility, I used a different definition. It happens to
>>>> parse prov-xml compliant xml. When serializing, it  puts all
>>>> extensibility elements at the end.  This is not a satisfactory
>>>> solution, and is likely to be dependent on the jaxb implementation
>>>> (though I am not entirely sure).
>>>> 
>>>> 
>>>>   <xs:complexType name="Document">
>>>>     <xs:sequence>
>>>>       <xs:choice maxOccurs="unbounded">
>>>>         <xs:group ref="prov:documentElements"/>
>>>>         <xs:element name="bundleContent" type="prov:NamedBundle"/>
>>>>       </xs:choice>
>>>>       <xs:any namespace="##other" processContents="lax" minOccurs="0"
>>>> maxOccurs="unbounded"/>
>>>>     </xs:sequence>
>>>>   </xs:complexType>
>>>> 
>>>> Can something be done to make the XML schema a bit more jaxb friendly,
>>>> while still keeping the same flexibility?  Thoughts welcome.
>>>> 
>>>> Cheers,
>>>> Luc
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>> 
>>> 
>> 
>> -- 
>> Professor Luc Moreau
>> Electronics and Computer Science   tel:   +44 23 8059 4487
>> University of Southampton          fax:   +44 23 8059 2865
>> Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
>> United Kingdom                     http://www.ecs.soton.ac.uk/~lavm
>> 
>> 
>> 
>
>

Received on Thursday, 28 March 2013 10:44:36 UTC