Re: Model card ML v1 no brainer, no errors found

Paola, Peter, et al., the JSON v. XML debate is not one that needs to be had. If information is shared in ANY well-structured and semantically well-specified format (even PDF, for example), it should be relatively easy to transform it into any other such format.
The real issue is that people seem bound and determined to "creatively" reinvent the wheel when existing data standards may already be reasonably well fit for purpose. In the case of the StratML standard, there are now more than 5.8K examples, including 11 proposed plans specifically relating to the purposes of the AIKR CG. I'd love to see someone render all 5.8K of them in JSON format as well and demonstrate the benefits of having done so.
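To illustrate how mechanical such a transformation can be, here is a minimal sketch in Python using only the standard library. The sample document and its element names are hypothetical and do not reproduce the actual StratML schema; the sketch also ignores attributes, namespaces, and mixed content, which a real converter would have to handle.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical plan-like document; element names are illustrative only
# and do not reproduce the actual StratML schema.
SAMPLE_XML = """
<Plan>
  <Name>Example plan</Name>
  <Goal>
    <Name>Share model metadata</Name>
    <Objective>Publish a schema</Objective>
    <Objective>Publish a converter</Objective>
  </Goal>
</Plan>
"""


def element_to_dict(elem):
    """Recursively convert an element into plain dicts, lists and strings."""
    children = list(elem)
    if not children:
        # Leaf element: keep its text content only.
        return (elem.text or "").strip()
    result = {}
    for child in children:
        value = element_to_dict(child)
        if child.tag in result:
            # Repeated tag: collect the values into a list.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


def xml_to_json(xml_text):
    """Return a JSON string whose top-level key is the root element's tag."""
    root = ET.fromstring(xml_text.strip())
    return json.dumps({root.tag: element_to_dict(root)}, indent=2)


print(xml_to_json(SAMPLE_XML))
```

Repeated child elements (the two Objective entries here) become a JSON array, and everything else becomes nested objects or strings. Whether the JSON rendering is more useful than the XML original is a separate question that the conversion itself does not answer.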
Here's what Gemini (formerly Bard) has to say about JSON Schema versus XML Schema.
ChatGPT diplomatically concludes:

Ultimately, the choice between XML Schema and JSON Schema depends on factors such as the specific requirements of the application or system, existing technology stack, compatibility with other systems, and developer preferences. Both schemas have their own strengths and are suitable for different use cases.

See also the additional details provided by ChatGPT.
Owen Ambur
https://www.linkedin.com/in/owenambur/
 

    On Saturday, March 2, 2024 at 08:19:18 PM EST, Paola Di Maio <paoladimaio10@gmail.com> wrote:  
 
 Peter, thanks a lot for offering guidance.
I know the answers could be useful.

I'll try to keep my demands on your time to a minimum.

When it comes to schema requirements, guidelines vary greatly and
their usefulness depends on the use cases. I do not think we are
short of them.

There are good materials everywhere, from W3Schools to other places.
Here is a good example:
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC543829/

I'll try to compile a list of resources, as it would be helpful to
have a nice and easy tutorial
that we can contribute to the universe, if it adds to or simplifies existing resources.
I'll share my shortlist of tutorials and the rehash of the draft asap for review.

P


Please point us to your preferred/recommended tutorials we can go through
(how to make an XML schema for your model), if different from/better than
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC543829/

and perhaps pointers to XML-to-JSON transformers (okay, I am using
ML terminology here; say, converters?) and maybe tutorials.

Either a collection of existing tutorials, or a new tutorial (with
shortcuts etc) could be useful
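
As a possible starting point for such a tutorial, here is a minimal sketch of what an XML Schema for a model card might look like, wrapped in a short Python script that uses only the standard library. The element names (modelCard, modelName, license, intendedUse) are hypothetical and not taken from any published model card specification. The script only checks that the schema text is well-formed XML and lists the elements it declares; validating a document against the schema would need a schema-aware validator, which the Python standard library does not include.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema: the element names below are illustrative only and
# are not taken from any published model card specification.
MODEL_CARD_XSD = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="modelCard">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="modelName" type="xsd:string"/>
        <xsd:element name="license" type="xsd:string"/>
        <xsd:element name="intendedUse" type="xsd:string" minOccurs="0"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
"""


def declared_elements(xsd_text):
    """Return (name, type) for every xsd:element declaration, in document order."""
    root = ET.fromstring(xsd_text)  # raises ParseError if not well-formed
    rows = []
    for el in root.iter(f"{{{XSD_NS}}}element"):
        # The type is None when the element carries an inline complexType.
        rows.append((el.get("name"), el.get("type")))
    return rows


for name, type_ in declared_elements(MODEL_CARD_XSD):
    print(name, type_ or "(inline complex type)")
```

The schema describes the shape that any model card document must have (which elements, in which order, of which type, and which are optional), rather than the content of one particular card.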


On Sat, Mar 2, 2024 at 8:45 PM Peter Rivett
<pete.rivett@federatedknowledge.com> wrote:
>
> I'm not saying XML is completely dead but even the NeuroML folks are going to JSON https://github.com/NeuroML/NeuroMLlite
>
>
> Going from your XML document representation to an XSD involves a lot more than stripping out tags. You need elements such as xsd:element, xsd:complexType and of course xsd:schema. It also involves understanding and encoding the constraints.
>
> Converting XSD to JSON Schema is possible; I have a license for a tool that can do that. It can also do other useful things; see https://www.oxygenxml.com/json_converter.html
>
> Frankly I think a JSON Schema would be more useful for the community, so it would be better to start with authoring that, and then autogenerate the XSD for the minority that might want it. I'm surprised no one's created a JSON Schema already.
>
> I'm new to model cards but can see they could have a lot of value - if they get/have wide usage I could help make them an official standard through OMG (and after that, ISO) - I'm already working on a Semantic Data Products spec (which builds on DCAT).
>
> LMK how you'd like to proceed, I don't have a lot of time but could help with some automation, checking,  and pointing out blind alleys.
>
> Pete
>
> ________________________________
> From: Paola Di Maio <paoladimaio10@gmail.com>
> Sent: Saturday, March 2, 2024 3:43:18 AM
> To: Peter Rivett <pete.rivett@federatedknowledge.com>
> Cc: W3C AIKR CG <public-aikr@w3.org>
> Subject: Re: Model card ML v1 no brainer, no errors found
>
> Just for your reference, Peter: people are still using XML to model
> scientific domains;
> https://github.com/NeuroML/NeuroML2/blob/master/Schemas/NeuroML2/NeuroML_v2.3.xsd
> is XML, isn't it?
> I don't know how things end up being that way.
> I'll work toward an XSD and JSON version of the file once it is in good shape.
> Shall ping.
>
> On Sat, Mar 2, 2024 at 12:10 PM Paola Di Maio <paoladimaio10@gmail.com> wrote:
> >
> > Thank you Peter, this is the feedback I was looking for.
> > So:
> > 1. First flesh the schema out (shall I just take out the labels that
> > refer to the model card instance and create schema type labels, more
> > generalized?). I think there are a bunch of lines that don't need to be
> > there, so I'll try to figure it out, but maybe you can guide me there.
> > 2. Then transform to JSON with your help (never done that, but there is
> > always a first time).
> > Does that sound right?
> >
> > On Sat, Mar 2, 2024 at 9:16 AM Peter Rivett
> > <pete.rivett@federatedknowledge.com> wrote:
> > >
> > > Things have changed since I last published an xml schema
> > >
> > > Sad to say (as an XML and XSLT whizz myself), things have changed since XML itself was felt useful.
> > > Most people are now all about JSON and JSON Schema. See, for example https://blog.axway.com/learning-center/apis/api-management/why-json-won-over-xml
> > >
> > > I see Model Card seems to use YAML for its metadata https://huggingface.co/docs/hub/en/model-cards. And that's readily transformable to-from JSON with no loss. And can be validated with JSON Schema https://json-schema-everywhere.github.io/yaml
> > >
> > > BTW this file https://huggingface.co/STARBORN/modelcardML_V1/blob/main/MODELCARDML_V1.xml is not an XML Schema at all but what seems to be the HTML of the web page saved as a document in Open Office XML format. So you did convert the web page to XML but not an XML schema that could be used to validate Model Cards.
> > >
> > > Regards,
> > > Pete
> > >
> > >
> > > Pete Rivett (pete.rivett@federatedknowledge.com)
> > > Federated Knowledge, LLC (LEI 98450013F6D4AFE18E67)
> > > tel: +1-701-566-9534
> > > Schedule a meeting at https://calendly.com/rivettp
> > >
> > > ________________________________
> > > From: Paola Di Maio <paola.dimaio@gmail.com>
> > > Sent: Friday, March 1, 2024 6:27 PM
> > > To: W3C AIKR CG <public-aikr@w3.org>; Leigh Dodds <leigh.dodds@talis.com>; francois.remy@ugent.be <francois.remy@ugent.be>
> > > Subject: Model card ML v1 no brainer, no errors found
> > >
> > > Things have changed since I last published an xml schema, using foaf
> > > generator (cc Leigh Dodds!)
> > >
> > > > Things have become faster and easier: I generated an XML schema for a model card
> > > https://huggingface.co/STARBORN/modelcardML_V1/tree/main
> > >
> > > following the process described below
> > > > 1. found an annotated model card, scraped the HTML
> > > https://huggingface.co/docs/hub/en/model-card-annotated
> > >
> > > 2. converted to xml using a cool tool (vertopal, thank you)
> > >
> > > > 3. validated using another cool tool (vertopal) and an XML validator
> > > No errors found
> > >
> > > > Questions: is this it? Is it useful? Not useful? (either way, I may
> > > have a paper) <g>
> > >
> > > > Can it be improved?
> > > > Are there any redundant elements, or can it be modelled more meaningfully?
> > >

  

Received on Sunday, 3 March 2024 18:02:06 UTC