
Guidance for Publishers

From: Jeni Tennison <jeni@jenitennison.com>
Date: Tue, 1 Nov 2011 08:48:10 +0000
Message-Id: <8B9FD436-80D3-4196-8133-EEA8C248823E@jenitennison.com>
To: HTML Data Task Force WG <public-html-data-tf@w3.org>

Based on the discussions that we've had over the past month, I've started to flesh out guidance for publishers in the wiki [1], which for ease of reference I've copied below.

Please read it, make any obvious editorial changes yourself, and raise any issues or points for discussion here.




You are likely to find that the markup within your pages is simpler and easier to maintain if you only use one format (syntax and vocabulary) within each page. To decide which to use, your first consideration has to be which consumers will read the data within your web pages, and which formats they support. These may include:

* scripting libraries
* browsers and browser plug-ins
* general-purpose search engines
* vertical or domain-specific search engines
* data reusers with whom you have agreements

Your second consideration may be the current state of the tooling to support a particular format. For example:

;Are you able to publish using HTML5?
:If you are using a content-management system that doesn't support adding new attributes such as <code>@itemprop</code> (microdata) or <code>@typeof</code> (RDFa), or if your publishing guidelines require validity against an older version of HTML or XHTML, then you will be limited to microformats, which rely only on existing attributes such as <code>@class</code> and <code>@rel</code>.
;Are there development tools available?
:Because HTML data is not visible in the rendered web page, it can be hard to tell whether it has been written correctly. Consumers should provide validators that enable you to check that your data has been correctly detected and interpreted, but you may also want to consider tool support for generating the HTML data.

Once you have considered both your target consumers and the tooling support that is available, you will be in one of four situations:

# '''with a single choice of format''' in which case you are good to go
# '''unable to publish HTML data that your target consumers understand''' in which case you either have to lobby those consumers to add support for the format(s) you can publish in, or consider changing your toolset so that you can publish in something they understand
# '''still with a choice between a number of formats''' in which case you will want to pick one (see below)
# '''having to use multiple formats at the same time to provide data to all your target consumers''' in which case you will need to mix formats within your pages (see below)
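
As a rough illustration only, the four situations can be worked out mechanically from two inputs: the formats each target consumer supports and the formats your toolset can publish. The sketch below is not part of the guidance; the names <code>Situation</code> and <code>classify</code> are hypothetical, and it assumes that a single consumer you cannot reach at all is enough to put you in the second situation.

```python
from enum import Enum


class Situation(Enum):
    SINGLE_CHOICE = 1     # exactly one format works for everyone
    NO_USABLE_FORMAT = 2  # some consumer cannot be reached with the current toolset
    SEVERAL_CHOICES = 3   # more than one format works for everyone
    MIXED_FORMATS = 4     # every consumer is reachable, but not with one shared format


def classify(consumer_formats, publishable):
    """Return (situation, candidate formats).

    consumer_formats: dict mapping a consumer name to the set of formats it reads.
    publishable: set of formats the publisher's toolset can produce.
    Both are illustrative inputs; a "format" can be any hashable label,
    for example a (syntax, vocabulary) tuple.
    """
    if not consumer_formats:
        raise ValueError("at least one target consumer is required")

    # Formats that each consumer reads AND that the toolset can publish.
    usable = {name: formats & publishable
              for name, formats in consumer_formats.items()}

    # Assumption: any unreachable consumer means situation 2.
    if any(not formats for formats in usable.values()):
        return Situation.NO_USABLE_FORMAT, set()

    common = set.intersection(*usable.values())
    if len(common) == 1:
        return Situation.SINGLE_CHOICE, common
    if len(common) > 1:
        return Situation.SEVERAL_CHOICES, common

    # No shared format: report every usable format as a candidate for mixing.
    return Situation.MIXED_FORMATS, set.union(*usable.values())


if __name__ == "__main__":
    consumers = {
        "search engine": {"microdata", "RDFa"},
        "browser plug-in": {"microformats", "RDFa"},
    }
    print(classify(consumers, {"microdata", "RDFa", "microformats"}))
```

In the fourth situation the sketch simply returns every usable format; finding the smallest set of formats that covers all consumers is a separate problem that it does not attempt.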

=== Choosing a Publishing Format ===

This section addresses a situation where all the consumers that you as a publisher want to target recognise a set of formats (each with a particular syntax and vocabulary), your toolset supports publishing in all of them, and you need to make a choice about which to use.

==== Syntax Considerations ====

The different syntaxes -- microformats, microdata and RDFa -- have different capabilities which may inform your choice.

;Structured HTML values
:Under appropriate conditions, RDFa and microformats will use the markup within the content of an element as a property value; in microdata, values never retain markup. If property values within your page contain markup (for example <code>description</code>s containing emphasised text, multiple paragraphs, tables and so on), you may want to use RDFa or microformats to ensure that structure is available to consumers of your pages.
;Language support
:Microformats and RDFa use the language of the HTML elements in the page (from the <code>lang</code> attribute) to indicate the language of relevant values. In microdata, the vocabulary has to provide a separate mechanism to indicate a language (pending resolution of [http://www.w3.org/Bugs/Public/show_bug.cgi?id=14470 bug 14470]). If you have multi-lingual information in your pages, you may find it easier to use microformats or RDFa than microdata.
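
To make the point about structured HTML values concrete, the sketch below contrasts the two kinds of value a consumer might extract from the same element: its text content only (what a microdata consumer would see) and its content with the inner markup retained (what RDFa or microformats can provide under the appropriate conditions). It is a simplified illustration built on Python's standard <code>html.parser</code>, not a conforming parser for any of the three syntaxes; <code>ValueExtractor</code>, <code>extract_value</code> and the sample page are hypothetical.

```python
from html.parser import HTMLParser


class ValueExtractor(HTMLParser):
    """Collect the text and the inner markup of the first element that
    carries a given attribute. Simplification: void elements such as
    <br> and character references are not handled."""

    def __init__(self, attr_name):
        super().__init__(convert_charrefs=True)
        self.attr_name = attr_name
        self.depth = 0          # 0 means we are outside the target element
        self.done = False       # True once the target element has closed
        self.text_parts = []    # text only, markup dropped
        self.markup_parts = []  # text plus nested tags

    def handle_starttag(self, tag, attrs):
        if self.depth:
            # Nested element inside the target: keep its tag in the markup value.
            self.depth += 1
            self.markup_parts.append(self.get_starttag_text())
        elif not self.done and any(name == self.attr_name for name, _ in attrs):
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1
            if self.depth:
                self.markup_parts.append(f"</{tag}>")
            else:
                self.done = True

    def handle_data(self, data):
        if self.depth:
            self.text_parts.append(data)
            self.markup_parts.append(data)


def extract_value(html, attr_name):
    """Return (text-only value, value with inner markup retained)."""
    parser = ValueExtractor(attr_name)
    parser.feed(html)
    parser.close()
    return "".join(parser.text_parts), "".join(parser.markup_parts)


SAMPLE = '<div itemscope><p itemprop="description">A <em>very</em> good book.</p></div>'

if __name__ == "__main__":
    text, markup = extract_value(SAMPLE, "itemprop")
    print(text)    # A very good book.
    print(markup)  # A <em>very</em> good book.
```

The emphasis on "very" survives only in the second value, which is the structure a publisher would lose by choosing a syntax that keeps text content alone.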

TODO: Other guidelines?

==== Vocabulary Considerations ====

Vocabularies and syntaxes are closely tied together, especially in the case of microformats. Aspects of a vocabulary to bear in mind are:

* How closely does it match with the information that you have?
* How much support does it have? Are there tools for validating and viewing it? Is there good documentation?
* How stable is it? Who has control to make changes to it? How frequently might those changes be made?
* Are other consumers likely to adopt it in the future?

==== Usability Considerations ====

The usability of a particular format is likely to depend on your existing expertise and the match between the structure and content of your web pages and the required structure and content of the format. The best thing to do is to try using the format to mark up an example page from your site.

TODO: Example?

=== Publishing in Multiple Formats ===

TODO: further guidance on publishing in multiple formats

[1] http://www.w3.org/wiki/Choosing_an_HTML_Data_Format#Publishers
Jeni Tennison
Received on Tuesday, 1 November 2011 08:48:37 GMT
