Re: Guidance for Publishers

Looking good; the text reads well at a glance. I noticed that any23.org was listed only in the online parsers list for RDFa, although it also parses microdata, so I added it to the microdata list as well. Hope that was okay.

Question: Regarding the section on "Are there development tools available?", some of the tools listed as "online parsers", such as RDF Distiller (Ruby), the Python RDFa Distiller(s), and any23.org's parser (Java), are also available as software that can be incorporated into publishers' test tools. Could that be mentioned in this section?

j




Jayson Lorenzen
Senior Software Engineer
____________________________ 
B  U  S  I  N  E  S  S       W  I  R  E 
A Berkshire Hathaway Company
 
+1.415.986.4422, ext. 766 
+1.415.956.2609 (fax) 
www.BusinessWire.com
 
Business Wire/San Francisco 
44 Montgomery St. 39th Floor
San Francisco, CA 94104



>>> 
From: 	Jeni Tennison <jeni@jenitennison.com>
To:	HTML Data Task Force WG <public-html-data-tf@w3.org>
Date: 	11/1/2011 1:49 AM
Subject: 	Guidance for Publishers

Hi,

Based on the discussions that we've had over the past month, I've started to flesh out guidance for publishers in the wiki [1], which for ease of reference I've copied below.

Please could you read, make any obvious editorial changes yourself and raise any issues or points for discussion here.

Thanks,

Jeni

---

You are likely to find that the markup within your pages is simpler and easier to maintain if you only use one format (syntax and vocabulary) within each page. To decide which to use, your first consideration has to be which consumers will read the data within your web pages, and which formats they support. These may include:

* scripting libraries
* browsers and browser plug-ins
* general-purpose search engines
* vertical or domain-specific search engines
* data reusers with whom you have agreements

Your second consideration may be the current state of the tooling to support a particular format. For example:

;Are you able to publish using HTML5?
:If you are using a content-management system that doesn't support adding new attributes such as <code>@itemprop</code> or <code>@typeof</code>, or if your publishing guidelines require validity against an older version of HTML or XHTML, then you will be constrained to using microformats.
;Are there development tools available?
:Because HTML data is not visible within a rendered web page, it can be hard to tell whether it has been written correctly. Consumers should provide validators that enable you to check that your data has been correctly detected and interpreted, but you may also want to consider tool support for generating the HTML data.
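
As a rough illustration of the kind of check a publisher could run locally, the sketch below reports which syntaxes appear to be present in a page. It uses only the Python standard library and is a heuristic, not a validator: the attribute and class-name sets are a small illustrative subset chosen for this example, and the names <code>SyntaxDetector</code> and <code>detect_syntaxes</code> are hypothetical.

```python
from html.parser import HTMLParser

# Illustrative subsets only (an assumption of this sketch, not a complete list).
MICRODATA_ATTRS = {"itemscope", "itemtype", "itemprop"}
RDFA_ATTRS = {"typeof", "property", "about", "vocab", "prefix"}
MICROFORMAT_CLASSES = {"vcard", "vevent", "hreview"}


class SyntaxDetector(HTMLParser):
    """Records which HTML data syntaxes seem to be used in a page."""

    def __init__(self):
        super().__init__()
        self.found = set()

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases attribute names; valueless attributes
        # such as itemscope arrive with a value of None.
        names = {name for name, _ in attrs}
        if names & MICRODATA_ATTRS:
            self.found.add("microdata")
        if names & RDFA_ATTRS:
            self.found.add("rdfa")
        classes = set((dict(attrs).get("class") or "").split())
        if classes & MICROFORMAT_CLASSES:
            self.found.add("microformats")


def detect_syntaxes(html):
    """Return the set of syntax names detected in the given HTML string."""
    detector = SyntaxDetector()
    detector.feed(html)
    detector.close()
    return detector.found


page = (
    '<div itemscope itemtype="http://schema.org/Person">'
    '<span itemprop="name">Example Person</span></div>'
)
print(detect_syntaxes(page))
```

A check like this only tells you which syntaxes were detected; it says nothing about whether a given consumer will interpret the data as intended, which is what the consumers' own validators are for.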

Once you have considered both your target consumers and the tooling support that is available, you will be in one of four situations:

# '''with a single choice of format''', in which case you are good to go
# '''unable to publish HTML data that your target consumers understand''', in which case you either have to lobby those consumers to add support for the format(s) you can publish in, or consider changing your toolset so that you can publish in something they understand
# '''still with a choice between a number of formats''', in which case you will want to pick one (see below)
# '''having to use multiple formats at the same time to provide data to all your target consumers''', in which case you will need to mix formats within your pages (see below)
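
The four situations can be read as a comparison between the formats each consumer understands and the formats your toolset can publish. The sketch below is one possible way to express that, under stated assumptions: formats are plain strings, the function name <code>publishing_situation</code> and the consumer names are hypothetical, and a single unreachable consumer is treated as the "unable to publish" case.

```python
def publishing_situation(consumer_formats, publishable_formats):
    """Classify a publisher's position.

    consumer_formats: dict mapping a consumer name to the set of formats
        it understands.
    publishable_formats: set of formats your toolset can produce.
    Returns a (situation, formats) tuple.
    """
    if not consumer_formats:
        raise ValueError("at least one target consumer is required")

    # For each consumer, the formats it understands that you can publish.
    usable = {
        name: formats & publishable_formats
        for name, formats in consumer_formats.items()
    }

    # Assumption: if any consumer cannot be reached at all, report situation 2.
    if any(not formats for formats in usable.values()):
        return "unable to publish", set()

    # Formats that every consumer understands and you can publish.
    common = set.intersection(*usable.values())
    if len(common) == 1:
        return "single choice", common
    if len(common) > 1:
        return "choice between formats", common

    # No single format reaches everyone, so several must be mixed.
    return "multiple formats needed", set().union(*usable.values())


consumers = {
    "search engine": {"microdata"},
    "browser plug-in": {"microdata", "rdfa"},
}
print(publishing_situation(consumers, {"microdata", "rdfa"}))
```

In the "multiple formats needed" case the sketch returns every usable format; in practice you would want the smallest combination that covers all consumers.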

=== Choosing a Publishing Format ===

This section addresses a situation where all the consumers that you as a publisher want to target recognise a set of formats (each with a particular syntax and vocabulary), your toolset supports publishing in all of them, and you need to make a choice about which to use.

==== Syntax Considerations ====

The different syntaxes -- microformats, microdata and RDFa -- have different capabilities which may inform your choice.

;Structured HTML values
:Under appropriate conditions, RDFa and microformats will use markup within the content of an element to provide a property value; in microdata values never retain markup. If property values within your page contain markup (for example <code>description</code>s containing emphasised text, multiple paragraphs, tables and so on), you may want to use RDFa or microformats to ensure that structure is available to consumers of your pages.
;Language support
:Microformats and RDFa use the language of the HTML elements in the page (from the <code>lang</code> attribute) to indicate the language of relevant values. In microdata, the vocabulary has to provide a separate mechanism to indicate a language (pending resolution of [http://www.w3.org/Bugs/Public/show_bug.cgi?id=14470 bug 14470]). If you have multi-lingual information in your pages, you may find it easier to use microformats or RDFa than microdata.
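
To make the two differences above concrete, here is a minimal sketch using Python's standard library. It assumes a small, well-formed fragment invented for this example; the helper names are hypothetical, and no real microdata, microformats or RDFa parser is involved. <code>text_value</code> approximates a value with markup stripped (as in microdata), <code>markup_value</code> approximates a value that retains its markup (as RDFa or microformats may under appropriate conditions), and <code>effective_languages</code> shows how a language can be inherited from the nearest <code>lang</code> attribute.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; it must be well-formed XML for this sketch to work.
FRAGMENT = (
    '<div lang="en">'
    '<p class="description">Fast <em>and</em> light.</p>'
    '<span class="title" lang="fr">Bonjour</span>'
    '</div>'
)


def text_value(element):
    """Concatenate all text inside the element, dropping any markup."""
    return "".join(element.itertext())


def markup_value(element):
    """Return the element's content with child markup retained."""
    parts = [element.text or ""]
    # ET.tostring serialises each child together with its tail text.
    parts.extend(ET.tostring(child, encoding="unicode") for child in element)
    return "".join(parts)


def effective_languages(element, inherited=None):
    """Yield (element, language), inheriting lang from the nearest ancestor.

    Simplification: only the plain lang attribute is considered, not xml:lang.
    """
    lang = element.get("lang", inherited)
    yield element, lang
    for child in element:
        yield from effective_languages(child, lang)


root = ET.fromstring(FRAGMENT)
description = root.find("p")
print(text_value(description))
print(markup_value(description))
for element, lang in effective_languages(root):
    print(element.tag, lang)
```

The first two lines printed differ only in whether the <code>em</code> element survives, which is the information a consumer loses when values never retain markup.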

TODO: Other guidelines?

==== Vocabulary Considerations ====

Vocabularies and syntaxes are closely tied together, especially in the case of microformats. Aspects of a vocabulary to bear in mind are:

* How closely does it match with the information that you have?
* How much support does it have? Are there tools for validating and viewing it? Is there good documentation?
* How stable is it? Who has control to make changes to it? How frequently might those changes be made?
* Are other consumers likely to adopt it in the future?

==== Usability Considerations ====

The usability of a particular format is likely to depend on your existing expertise and the match between the structure and content of your web pages and the required structure and content of the format. The best thing to do is to try using the format to mark up an example page from your site.

TODO: Example?

=== Publishing in Multiple Formats ===

TODO: further guidance on publishing in multiple formats

[1] http://www.w3.org/wiki/Choosing_an_HTML_Data_Format#Publishers
-- 
Jeni Tennison
http://www.jenitennison.com






Received on Tuesday, 1 November 2011 15:11:31 UTC