document types in retrieving "information about me"

At 05:53 AM 2001-10-20 , Jim Ley wrote:
>e793c3@y0r1d9>
>Subject: Re: Where does the EARL go?
>Date: Sat, 20 Oct 2001 10:53:23 +0100
>MIME-Version: 1.0
>Content-Type: text/plain;
> charset="iso-8859-1"
>Content-Transfer-Encoding: 7bit
>X-Priority: 3
>X-MSMail-Priority: Normal
>X-Mailer: Microsoft Outlook Express 6.00.2600.0000
>X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000
>
>Sean:
>> EARL is an application of RDF.
>
>Hence my problem with the technical solution coming before the human, as
>you've described EARL it doesn't fit the job, because there's no way for
>me to use an EARL report on a url.
>
><pretend EARL doesn't exist>
>It would be useful to have a machine readable format for reporting on the
>accessibility of URI's.
>
>To use a reporting language, I need to be able to identify that it is a
>report and hand it off to my processor to provide in human readable, or to
>make judgements about links for me.
>

It is not necessary to narrow its topic to "it is a report" or have a
dedicated
"report" processor.  It is necessary to be able, before processing it, to
recognize that it belongs in some category of "information about X" that you
can decide you are interested in, and to represent it in a format for which
you
have a processor.

So we need YetA_Meta protocol for knowing about the topic of the report [link
role?] and a properties vocabulary that will let us say things in queries like
I want to know...

- format orthodoxy
- accessibility compliance
...about
- this page
- regions in the Web containing this page/object_in_page

>In HTML the use of the LINK element seems to be appropriate for linking to
>the report, there are certainly no others within the current
>recommendations.  The LINK though is poorly mentioned in the standard:
>
>"Authors may use the following recognised link types, listed here with
>their conventional interpretations. - -"
><http://www.w3.org/TR/REC-html40/types.html#type-links>http://www.w3.org/T
R/REC-html40/types.html#type-links
>
>and then proceeds to list them which are in no way conventional, or what
>they might actually mean.  However it would seem sensible to add an
>"accessibility report" for this to be useful, it could be done in multiple
>formats, then you could write (generate from the machine readable) an HTML
>version of the report in addition to the machine-readable. e.g.
>
><LINK rel="accessibility-report" href="/report.html" type="text/html">
><LINK rel="accessibility-report" href="/report.arl"
>type="text/x-accessibility-reporting-language">

There are alternatives, such as putting the executive summary of the report
inline in the HEAD in html:meta syntax with an
html:meta.name=accessibility-details content=<uri-reference> reference to
[optionally an RDDL indexed collection of] resource material which gives a
more
complete description including definitions.

The META syntax is limiting in having only one name-value pair at a time.  The
ideal thing here is something on the order of 

<META KEYWORDS="<list>" HREF="<uri-reference>">

This would indicate that for more meta-information in any of the concern areas
mentioned in the list of keywords, it would be suggested to process the
resource indicated in the HREFed resource.  The ability to do tuple-space
references to topical domains like this is of generic interest -- hence cc to
wai-tech-comments.

What one perhaps wants is the same effect in a LINK.

<LINK ROLE="meta" HREF="resource" A-PROPOS-DE="<keywords list>">

The availability of processors for languages which are somewhat self-defining,
that is to say which include definitions for the terms they use, is a hot
issue.  Relevant here, and controversial in the community.

And there are other alternatives.  But LINK is a strong candidate and despite
its want of use in the HTML community it has enough support in the format
definition community so that X-Link exists as a generic utility in XML, so we
can use it widely with at least assurance that processors should understand
the
syntax.

>The strength of the alternative versions being the user doesn't require an
>interpreter of our machine readable format, and could still learn about
>the report/get contact details.  Or also for allowing future development
>where more than one reporting language can be used in the same HTML
>document.

Yes.  I would suggest that the schema make it clear that multiple LINKs with a
common ROLE are permitted and to be processed as a bag with selection based on
the other properties articulated in the individual LINKs.  Given that
processing the LINKed resources is not required under any circumstances, it is
not necessary to have a language-defined default choice or selection
algorith. 
The algorithm may be a function of the processor.  But this needs to be
considered in terms of pros and cons.

>This does though require a mime-type - as a generic RDF or text/plain or
>text/xml causes future problems, if we use text/plain then I would expect
>a user to be able to understand it as is. if we use a generic RDF, what do
>we do with future reporting languages, or even evolutions of the current,
>there's no way of identifying it, so we'd be limited to evolving the
>current language and then with constraints.  

If I understand RDF right, the problem that you envision for RDF is not a
problem.

You are assuming that semantics and syntax are one-to-one.  That is the point
where the present architecture is broken.

The ability to process RDF includes the capability to discover semantic
subclasses or specialization within the RDF being processed, unlike the
processing of HTML.  

The conventions of reporting accessibility, and the ability to interoperably
report on other domains of questions besides accessibility, is built into the
processing capabilities of the RDF core capabilities (undefined for now --
that's an RDF Core issue).  So you don't need a new type to upgrade the EARL
schema if the EARL is linearised as text/rdf or info/rdf.

>Of course you could use an
>alternative method in the LINK element to say the type/version of the
>reporting language, but that would need an extension to the LINK element,
>unlikely and unwarranted.
>

The proposal that I am working on is that 

a) the element that cites the report indicates why you would be interested in
what it says.  So the "accessibility evaluation results" semantic is
somehow in
the LINK mode or other syntactic slots in the citation expression.

b) there is a type available before parsing is performed which allows you at
minimum to parse the resource without breaking it.  Citation syntaxes may make
this type indication required and/or binding; the tradition of having this be
loose is not required in general.  In particular, schema citations are an area
where a binding type indication may be appropriate.

c) the resource itself is allowed to internally, after the syntax has been
processed, identify more information about constraints that support precise
interpretation of what is said there.

The idea that the HTTP indicated Content-Type property defines all the type
knowledge about the content forever and aye is rejected.  Self-defining
content
types are allowed, so long as the processing that can be performed safely on
the basis of indication b) is well defined and indeed safe.

So long as the syntactic processing can be completed in accordance with the
[super-] type given ahead of time [in LINK or HTTP Content-Type: header or
otherwise] pursuant to b) above; versioning can be handled at the c) level.  

>Then there's resources which aren't HTML (or other resource where linking
>is possible.), how do we provide a mechanism whereby the user can get at
>these reports.   

There is always the access path where you go into the information domain and
start asking.

This is the native assumption of the RDF design, where I agree with you that
this is inadequate.

The Web documents themselves must self-document to some degree.

>There are 2 possible solutions I can see, one is an HTTP
>request of the same url with an accept header asking for an accessibility
>report of that resource.  

There is alread a meta request provided in the HEAD request in HTTP.  So long
as the URL is HTTP based, all you need is to add properties to the HEAD
response, which fits with defining a generic more-about relationship which can
be used in X-Link 'role' in XHTML META and in HTTP headers.

The point is that the header should be legal on all HTTP requests and by
content accessibility guidelines required on HEAD requests.

>This would require a mime-type, and can't be a
>generic (as then you couldn't create accessibility reports for any of the
>generics.)  The other has less support in the existing framework, other
>than by general agreement, this is the robots.txt file, which wouldn't
>need its own mime-type, but I struggle to see how this could really be
>effectively suggested on sites with radically different reports, how would
>you do the association within the reporting language, and the report would
>quickly get very large.

The basic flaw in your argument is that you assume there has to be a MIME type
for what the report is _about_.  Actually, all there has to be a MIME type for
is a format which defines the syntax that the report conforms to so you can
parse it; and which defines enough bootstrapping semantics so that one can
learn details of what it is about on the fly as you interpret the report.

The starting classification of what this meta-information is about in terms of
topics is given outside the type and content of the resource in the reference
to it.

><EARL exists again>
>
>So how does EARL fit in the above framework?  Well it doesn't if we can't
>give it a mime-type, other than the last one, which seems to suffer other
>problems with the current EARL framework.
>

If you will look at processing the report in two stages; a syntactic stage
which is by rote per type and strips the linearisation of the infoset, and a
semantic stage which is interpretive of the infoset, I think that there is
more
room to share MIME types with other categories of information and communicate
common categories of information within different MIME types.

This is a lot of stuff, and it is only a working estimate trying to bracket
the
range within which a solution could be found.  Your questions and comments
would be most welcome.

Al
>Jim.
>  

Received on Saturday, 20 October 2001 10:32:57 UTC