Re: "scheme" attribute of META element from Julian Reschke on 2009-06-06 (public-html-comments@w3.org from June 2009)

From: Julian Reschke <julian.reschke@gmx.de>
Date: Sat, 06 Jun 2009 09:40:39 +0200
To: Ian Hickson <ian@hixie.ch>
CC: "Chabot, Elliot" <Elliot.Chabot@mail.house.gov>, public-html-comments@w3.org
Message-ID: <4A2A1D77.2040509@gmx.de>

Ian Hickson wrote:
>> I can't speak for Elliot, but the Web repository connector inside SAP 
>> Netweaver's Knowledge Management has supported RFC2731-style encoded 
>> metadata (as shown above) for many years now.
> 
> Could you elaborate on how this tool consumes this data? Any information 
> you may have would be very useful. Could you walk us through an example of 
> how this information gets used? How do the various schemes affect the 
> handling of the metadata? Have you found particular processing is needed 
> to process invalid values? Is the tool's input limited to files generated 
> by one organisation, or does it process input from arbitrary Web sites?

I think I did that already once many months ago.

Anyway.

In the SAP KM system, everything is designed around the concept of 
resources, which essentially consist of a binary content stream, generic 
metadata (MIME type, encoding, whatnot), Access Control information 
(ACLs), versioning information (checked-in/out, version history...), and 
custom metadata.

Most metadata lives in name/value pairs, where the name is an XML type 
name (nsuri + localname), and the value can be numbers, strings, XML, 
... (and lists of them).
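
As a rough illustration of that model (not the actual SAP KM API; the class
and variable names below are hypothetical, and the Dublin Core namespace is
only an example namespace URI), a property map keyed by namespace URI plus
local name could be sketched like this:

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class PropertyName:
    """Hypothetical property name: namespace URI plus local name."""
    nsuri: str
    localname: str


# Values may be numbers, strings (including serialized XML), or lists of them.
Scalar = Union[int, float, str]
PropertyValue = Union[Scalar, List[Scalar]]

DC_NS = "http://purl.org/dc/elements/1.1/"  # example namespace only

# Custom metadata of one resource, as name/value pairs.
properties: Dict[PropertyName, PropertyValue] = {
    PropertyName(DC_NS, "date"): "2009-06-06",
    PropertyName(DC_NS, "subject"): ["metadata", "HTML"],
}
```

Because the key is the (nsuri, localname) pair, two properties with the same
local name in different namespaces do not collide.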

SAP KM resources expose a generic API, which is used by the UI, protocol 
handlers (HTTP/WebDAV, ICE, web services...), and internal services 
(search, collaboration, ...).

The implementation of resources varies; they can be based on file 
shares, database tables, remote content management systems, remote 
WebDAV servers, LDAP, ... and also generic HTTP servers.

The latter are usually used to pull in read-only information that should 
be exposed to the internal search system (SAP TREX). The code that 
implements these resources extracts metadata from well-known HTML 
elements (title, keywords, ...), using configurable filters, and through 
the use of RFC 2731 formatted meta elements.
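
To make the RFC 2731 part concrete, here is a minimal sketch of that kind of
extraction. It is not the SAP code: the class and function names are
hypothetical, and matching the prefix case-insensitively is my assumption.
It reads link elements with rel="schema.PREFIX" to learn which namespace a
prefix stands for, then maps meta elements named "PREFIX.element" to
(nsuri, localname) pairs:

```python
from html.parser import HTMLParser


class Rfc2731Extractor(HTMLParser):
    """Collects RFC 2731 style link/meta declarations from an HTML document."""

    def __init__(self):
        super().__init__()
        self.schemas = {}  # lowercased prefix -> namespace URI
        self.raw = []      # (meta name, content) pairs, resolved later

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link":
            rel = a.get("rel") or ""
            href = a.get("href")
            # <link rel="schema.DC" href="..."> binds prefix "DC" to a namespace.
            if rel.lower().startswith("schema.") and href:
                self.schemas[rel[len("schema."):].lower()] = href
        elif tag == "meta":
            name, content = a.get("name"), a.get("content")
            # Only prefixed names such as "DC.date" are candidates.
            if name and content is not None and "." in name:
                self.raw.append((name, content))

    def properties(self):
        """Return {(nsuri, localname): [values]} for metas with a declared prefix."""
        result = {}
        for name, content in self.raw:
            prefix, localname = name.split(".", 1)
            nsuri = self.schemas.get(prefix.lower())
            if nsuri is None:
                continue  # prefix never declared via <link>; skip it
            result.setdefault((nsuri, localname), []).append(content)
        return result


def extract_properties(html):
    """Hypothetical helper: parse one HTML string and return its properties."""
    parser = Rfc2731Extractor()
    parser.feed(html)
    parser.close()
    return parser.properties()


SAMPLE = """<html><head>
<title>Example</title>
<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">
<meta name="DC.date" content="2009-06-06">
<meta name="DC.creator" content="A. Author">
<meta name="keywords" content="x, y">
<meta name="XX.foo" content="ignored">
</head></html>"""

extracted = extract_properties(SAMPLE)
```

Resolution is deferred until the whole document has been read, so a link
declaration that appears after the meta elements using its prefix still
applies. Unprefixed names such as "keywords" and undeclared prefixes are
left out of the result.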

How this information is used in detail depends on the consumers using 
the KM API, which is hard to predict. Some use cases are decorations in 
the UI based on additional properties, or support in custom searches.

One specific reason RFC 2731 support was added was that several 
companies wanted to expose additional properties in their HTML 
documents (such as additional document-related dates), and have them be 
accessible through the services mentioned above.

Back to your question:

 > how this information gets used? How do the various schemes affect the
 > handling of the metadata? Have you found particular processing is needed

Schemes aren't used (as I said in a later mail), but link/meta and the 
RFC 2731 style encoding of prefixes are.

 > to process invalid values? Is the tool's input limited to files generated
 > by one organisation, or does it process input from arbitrary Web sites?

The tool works for generic web resources.

BR, Julian

Received on Saturday, 6 June 2009 07:41:23 UTC