Re: [ISSUE-2] Re: Strawman microdata proposal

The RDF Web Applications Working Group has just announced that three of their documents are Candidate Recommendations: RDFa Core 1.1, RDFa Lite 1.1 and XHTML+RDFa 1.1.

Fabien Gandon, a "metadata literate", introduces Jeni Tennison's HTML Data Guide:

> Microformats, RDFa and microdata all enable consumers to extract data
> from HTML pages. This data may be embedded within enhanced search
> engine results, exposed to users through browser extensions,
> aggregated across websites or used by scripts running within those
> HTML pages.
> This guide aims to help publishers and consumers of HTML data use it
> well. With several syntaxes and vocabularies to choose from, it
> provides guidance about how to decide which meets the publisher's or
> consumer's needs. It discusses when it is necessary to mix syntaxes
> and vocabularies and how to publish and consume data that uses
> multiple formats. It describes how to create vocabularies that can be
> used in multiple syntaxes and general best practices about the
> publication and consumption of HTML data.
> http://www.w3.org/TR/html-data-guide/
> --
> fabien, inria, @fabien_gandon, http://fabien.info
I recommend having a look at the "4.2 Designing Vocabularies" sub-section:
http://www.w3.org/TR/html-data-guide/#designing-vocabularies

I'm a newcomer here and I have only just discovered ITS. I believe that:

- porting ITS to the Semantic Web formalisms would be the best practice for representing hierarchical concepts and thus for enabling simple reasoning (if we wish to do so),
- using RDFa in HTML5 would leverage the interoperability between existing and future W3C standards (correct validation, easy data consumption by browser extensions, enhanced search engine results goo.gl/aCb2P, etc.),
- of course, RDFa makes documents much more verbose than simple XML attributes...

Let's consider this simple ITS-annotated example:

    <body xmlns:its="http://www.w3.org/2005/11/its">
      <span its:translate="no" its:term="yes" its:locNote="foo">bar</span>
    </body>

The rewriting of this example would depend on the model chosen for ITS 2.0. In the following, I illustrate this with two different models of ITS. For each model I give two syntaxes, one using RDFa 1.1 and one using Microdata, that both render the output "bar" and carry equivalent annotations.

Example of model 1: its:translate is a literal of datatype xsd:boolean, and what is described is an instance of the its:Term class (i.e. is of type its:Term).

RDFa 1.1 syntax:

    <body prefix="xsd: http://www.w3.org/2001/XMLSchema# its: http://www.w3.org/20XX/XX/its#">
      <span typeof="its:Term" property="its:value">
        <meta property="its:translate" content="false" datatype="xsd:boolean" />
        <meta property="its:locNote" content="foo" />bar</span>
    </body>

--> This will produce the following triples, expressed in Turtle syntax:

    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix its: <http://www.w3.org/20XX/XX/its#> .
    @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

    <> rdf:type its:Term ;
       its:translate "false"^^xsd:boolean ;
       its:locNote "foo" ;
       its:value "bar" .

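As a rough illustration of model 1 (a sketch only, not part of any ITS specification), the mapping from the local ITS attributes of the original example to the Turtle above could be written in Python as follows. The names `its_to_turtle_model1` and `turtle_string` are hypothetical, the `its:` namespace keeps the `20XX/XX` placeholder used above, and the `rdf:` prefix is declared so that the output parses as Turtle.

```python
# Sketch: map one element's local ITS attributes to Turtle under "model 1".
# Assumptions: its:translate becomes an xsd:boolean literal, its:term="yes"
# becomes rdf:type its:Term, and the element text becomes its:value.

PREFIXES = (
    "@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .\n"
    "@prefix its: <http://www.w3.org/20XX/XX/its#> .\n"  # placeholder namespace
    "@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .\n"
)


def turtle_string(value: str) -> str:
    """Quote a Python string as a Turtle string literal."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def its_to_turtle_model1(attrs: dict, text: str) -> str:
    """Return a Turtle document describing one ITS-annotated element."""
    statements = []
    if attrs.get("its:term") == "yes":
        statements.append("rdf:type its:Term")
    if "its:translate" in attrs:
        # ITS uses "yes"/"no"; model 1 stores an xsd:boolean instead.
        flag = "true" if attrs["its:translate"] == "yes" else "false"
        statements.append(f'its:translate "{flag}"^^xsd:boolean')
    if "its:locNote" in attrs:
        statements.append("its:locNote " + turtle_string(attrs["its:locNote"]))
    statements.append("its:value " + turtle_string(text))
    return PREFIXES + "<> " + " ;\n   ".join(statements) + " .\n"


if __name__ == "__main__":
    example = {"its:translate": "no", "its:term": "yes", "its:locNote": "foo"}
    print(its_to_turtle_model1(example, "bar"))
```

Under model 2 the same function would emit `rdf:type its:NoTranslate` instead of the boolean literal; that variant is left out here to keep the sketch short.
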
Microdata syntax:

    <body>
      <span itemscope itemtype="http://www.w3.org/2005/11/its#Term">
        <meta itemprop="http://www.w3.org/2005/11/its#translate" content="false"/>
        <meta itemprop="http://www.w3.org/2005/11/its#locNote" content="foo"/>
        <span itemprop="http://www.w3.org/2005/11/its#value">bar</span>
      </span>
    </body>

--> This will produce the following JSON:

    {
      "items": [
        {
          "type": [ "http://www.w3.org/2005/11/its#Term" ],
          "properties": {
            "http://www.w3.org/2005/11/its#translate": [ "false" ],
            "http://www.w3.org/2005/11/its#locNote": [ "foo" ],
            "http://www.w3.org/2005/11/its#value": [ "bar" ]
          }
        }
      ]
    }

Example of model 2: what is described is an instance of both the its:Term and the its:NoTranslate classes (as these are unary relations, using classes may be the best modeling choice). This leads to a shorter syntax.

RDFa 1.1 syntax:

    <body prefix="xsd: http://www.w3.org/2001/XMLSchema# its: http://www.w3.org/20XX/XX/its#">
      <p typeof="its:Term its:NoTranslate" property="its:value">
        <meta property="its:locNote" content="foo" />
        bar
      </p>
    </body>

--> This will produce the following triples, expressed in Turtle syntax:

    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix its: <http://www.w3.org/20XX/XX/its#> .

    <> rdf:type its:Term ;
       rdf:type its:NoTranslate ;
       its:locNote "foo" ;
       its:value "bar" .

Microdata syntax:

    <body>
      <span itemscope itemtype="http://www.w3.org/2005/11/its#Term http://www.w3.org/2005/11/its#NoTranslate">
        <meta itemprop="http://www.w3.org/2005/11/its#locNote" content="foo"/>
        <span itemprop="http://www.w3.org/2005/11/its#value">bar</span>
      </span>
    </body>

--> This will produce the following JSON:

    {
      "items": [
        {
          "type": [
            "http://www.w3.org/2005/11/its#Term",
            "http://www.w3.org/2005/11/its#NoTranslate"
          ],
          "properties": {
            "http://www.w3.org/2005/11/its#locNote": [ "foo" ],
            "http://www.w3.org/2005/11/its#value": [ "bar" ]
          }
        }
      ]
    }

Maxime Lefrançois
Ph.D. Student, INRIA - WIMMICS Team
http://maxime-lefrancois.info
@Max_Lefrancois

----- Original message -----
> From: "Phil Ritchie" < philr@vistatec.ie >
> To: "Jirka Kosek" < jirka@kosek.cz >
> Cc: "Felix Sasaki" < fsasaki@w3.org >,
> public-multilingualweb-lt@w3.org
> Sent: Tuesday, 20 March 2012 10:16:21
> Subject: Re: [ISSUE-2] Re: Strawman microdata proposal
> I'm by no means an expert here, but here are my thoughts:
> • Attributes may not allow us to describe hierarchical concepts.
> • XML syntaxes would provide a good parsable structure but may become
> very verbose.
> I'm not sure I have any strong preference but I suspect the more
> "metadata literate" among us will.
> Phil.
> -----Jirka Kosek < jirka@kosek.cz > wrote: -----
> To: Felix Sasaki < fsasaki@w3.org >
> From: Jirka Kosek < jirka@kosek.cz >
> Date: 03/20/2012 08:47AM
> Cc: public-multilingualweb-lt@w3.org
> Subject: Re: [ISSUE-2] Re: Strawman microdata proposal
> On 19.3.2012 9:32, Felix Sasaki wrote:
> >> Any comments welcomed.
> Well, I have investigated more and talked to other people. For now I
> see 5 ways to express ITS:
> 1) Use pure XML syntax suitable for XML and XHTML content
> <p its:locNote="...">...</p>
> 2) Use microdata in HTML5 as proposed in previous email
> 3) Use RDFa in HTML5 on which Tadej is working. I'm looking forward to
> seeing the outcome, but I think that the output will be even more
> baroque than microdata, as the connection to the source element will
> have to be expressed as an additional triple.
> 4) Use custom attributes in HTML5 prefixed with its-, eg.:
> <p its-locnote="...">...</p>
> This is actually sort of allowed in HTML5 spec (see
> http://dev.w3.org/html5/spec/infrastructure.html#extensibility ):
> "When vendor-neutral extensions to this specification are needed,
> either this specification can be updated accordingly, or an extension
> specification can be written that overrides the requirements in this
> specification. When someone applying this specification to their
> activities decides that they will recognize the requirements of such
> an extension specification, it becomes an applicable specification."
> Such attributes will cause no trouble in Web browsers, but the page
> will raise errors in validators. We can create our own "applicable
> specification" for HTML5+ITS and then create our own validator.
> 5) Use data-* attributes in HTML5 like:
> <p data-its-locnote="...">...</p>
> This is valid in HTML5, but non-conforming as data-* attributes are
> currently reserved for application private use only (see
> http://dev.w3.org/html5/spec/global-attributes.html#attr-data ):
> "Custom data attributes are intended to store custom data private to
> the page or application, for which there are no more appropriate
> attributes or elements.
> These attributes are not intended for use by software that is
> independent of the site that uses the attributes."
> For ITS in HTML5 I think that option 4) is the best, while option 5) is
> also quite good.
> What I think we should do now is raise a bug against the HTML5 spec
> and ask for either allowing arbitrary prefix-* attributes or lifting
> the existing "private use only" clause from data-* attributes.
> If there are no objections to this approach, I'm going to raise the
> respective HTML5 bug.
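
To make options 4) and 5) concrete, here is a minimal sketch of how a consumer might read ITS metadata from either attribute style, using only Python's standard `html.parser` module. The class `ItsAttributeCollector` and the function `collect_its` are hypothetical names for illustration; neither attribute convention is defined by any specification at this point, so the prefixes are assumptions taken from the two examples above.

```python
from html.parser import HTMLParser


class ItsAttributeCollector(HTMLParser):
    """Collect ITS metadata from its-* (option 4) and data-its-* (option 5)."""

    # The longer prefix is listed first; each attribute matches at most one.
    PREFIXES = ("data-its-", "its-")

    def __init__(self):
        super().__init__()
        self.found = []  # list of (tag name, {ITS name: value}) pairs

    def handle_starttag(self, tag, attrs):
        its = {}
        # html.parser lowercases attribute names before this method is called.
        for name, value in attrs:
            for prefix in self.PREFIXES:
                if name.startswith(prefix):
                    its[name[len(prefix):]] = value
                    break
        if its:
            self.found.append((tag, its))


def collect_its(html: str):
    """Return the ITS attributes found on each element of an HTML string."""
    collector = ItsAttributeCollector()
    collector.feed(html)
    collector.close()
    return collector.found


if __name__ == "__main__":
    page = '<p its-locnote="a">x</p><p data-its-locnote="b">y</p>'
    print(collect_its(page))
    # prints [('p', {'locnote': 'a'}), ('p', {'locnote': 'b'})]
```

Both styles reduce to the same name/value pairs on the consumer side, so the choice between them is mostly a question of validation and conformance rather than of processing cost.
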
> Jirka
> --
> ------------------------------------------------------------------
> Jirka Kosek e-mail: jirka@kosek.cz http://xmlguru.cz
> ------------------------------------------------------------------
> Professional XML consulting and training services
> DocBook customization, custom XSLT/XSL-FO document processing
> ------------------------------------------------------------------
> OASIS DocBook TC member, W3C Invited Expert, ISO JTC1/SC34 member
> ------------------------------------------------------------------

Received on Tuesday, 20 March 2012 16:20:17 UTC