Re: COMURI

Dear Tomas,

Interesting document!

Here are some remarks per section:

* 3.3 Official data server
The example says that it is best to avoid doing what data.gov or data.gov.uk
did... This is somewhat surprising and could be better argued for. I don't
think the length of the URI matters here as an argument, especially
considering that a site using http://foo.example.com will probably still
want to use the path "data" somewhere and may end up with
"http://foo.example.com/data", which is the same size as
"http://data.example.com/foo". The question of what goes into one server and
what goes into another also deserves discussion, as the choice of a URI
pattern can be totally dissociated from the actual storage/hosting of the
data behind it. E.g. use "{register}.data.gov.nl" and then delegate each
register to a specific machine, as suggested in
http://www.pilod.nl/w/images/a/aa/D1-2013-09-19_Towards_a_NL_URI_Strategy.pdf

* 4. URI patterns
Why is it necessary to define such a list of naming patterns? The list reads
as "that, or that, or anything else you want" anyway ;-)

* 5. URI variants
Besides naming the variants using extensions, the document could also
mention using different paths. This is, for instance, what DBpedia and
other sites using Pubby as a front-end do with the resource/page/data
scheme. It would be good to let BP adopters follow this approach if they
like it.
Also, nothing is said about content negotiation in this part. It could be
recommended that this gets implemented in order to link all the variants to
each other. Another good thing to recommend is to augment the HTTP
responses with links in the header (again, as Pubby does).
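
To make the two suggestions above concrete, here is a minimal sketch in
Python of how a server could pick a variant from the Accept header and
advertise the other variants in a Link header. The host example.org, the
resource name "palma" and the two-entry variant table are assumptions made
up for illustration, and the Accept parsing is deliberately simplified (it
ignores wildcards such as "text/*" and specificity rules); it is not how
Pubby itself is implemented.

```python
# Hypothetical variant table: media type -> path prefix, loosely following
# the resource/page/data layout mentioned above.
VARIANTS = {
    "text/html": "page",
    "application/rdf+xml": "data",
}


def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, best first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((media_type, q))
    # sorted() is stable, so equal q-values keep their header order
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def choose_variant(accept_header, name, base="http://example.org"):
    """Pick the URI to redirect to when /resource/<name> is requested."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type in VARIANTS:
            return f"{base}/{VARIANTS[media_type]}/{name}"
        if media_type == "*/*":
            break  # the client accepts anything: use the default below
    return f"{base}/page/{name}"  # assumed default: the HTML page


def link_header(name, base="http://example.org"):
    """Build a Link header value that advertises every variant."""
    return ", ".join(
        f'<{base}/{prefix}/{name}>; rel="alternate"; type="{media_type}"'
        for media_type, prefix in VARIANTS.items()
    )


print(choose_variant("application/rdf+xml;q=0.9, text/html;q=0.5", "palma"))
print(link_header("palma"))
```

With this layout a client only needs to know the /resource/ URI; the
variants are reachable either through negotiation or by following the
advertised links.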

* 6.2.1 http
I don't think this will fly: "For the URI metadata request, empty query can
be used". Asking for a resource without any parameters is like asking for
the resource directly. That is, "GET http://example.org/test" and "GET
http://example.org/test?" are the same query (or maybe not? Please correct
me there if needed). Considering this, it would not be possible to
differentiate between asking for a description of a resource and asking for
its metadata.
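
As a quick check of the point above, the sketch below parses both forms
with Python's standard urllib.parse. This is only one parser, and it says
nothing about how a given HTTP server treats the two requests, but it shows
that an application relying on the parsed query string would see the same
thing in both cases.

```python
from urllib.parse import urlsplit

plain = urlsplit("http://example.org/test")
empty_query = urlsplit("http://example.org/test?")

# Both parses report an empty query string, so code that only looks at
# the parsed components cannot tell the two requests apart.
print(repr(plain.query), repr(empty_query.query))
print(plain.path == empty_query.path)
```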

* 6.3 Comuri authority
"Fourth level domains and beyond should be avoided as it makes URIs too
long" hmm... what if I use "ship-dgt-foo.ec.europa.eu" instead of
"ship.dgt-foo.ec.europa.eu" then? It is a long but perfectly acceptable
third level domain ;-) It's not really the number of subdomains that
contributes most to the length of a URI, as each one only requires an extra
".". It's what's in the namespace name that matters most. What about
recommending a maximum length for the FQDN, and maybe grounding this maximum
length in some cognitive limit? E.g. people in general cannot easily
remember a string of symbols longer than 15 characters (just picking a
random number here).
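
The comparison above can be checked with a few lines of Python. The helper
names and the limit of 15 characters are only illustrative (the number is
the arbitrary one picked above, not a recommendation).

```python
MAX_FQDN_LENGTH = 15  # arbitrary illustrative limit taken from the text


def describe(fqdn):
    """Return (number of dot-separated labels, total length) for a name."""
    return len(fqdn.split(".")), len(fqdn)


def within_limit(fqdn, limit=MAX_FQDN_LENGTH):
    """Check a name against a maximum total length, ignoring label count."""
    return len(fqdn) <= limit


for name in ("ship-dgt-foo.ec.europa.eu", "ship.dgt-foo.ec.europa.eu"):
    labels, length = describe(name)
    print(name, labels, length, within_limit(name))
```

Both names have the same total length and differ only in the number of
labels, so a rule based on the number of levels does not constrain the
length of the URI.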

* 6.4 Comuri path
Should there be a sister document "COMIRI" that will let users use IRIs?
This would be useful for those who will use RDF 1.1 and/or OWL 2.
Or, considering that IRIs are a generalisation of URIs, the present document
could already be adjusted (and renamed?) to enable their usage.

* 6.4.2 Without dot extensions
"The path must not contain unnecessary dot extensions such as php." could
be extended to say "The path must not contain unnecessary dot extensions
such as php, jsp, asp or cgi" to cover a few more examples and let users
know that it is specifically the extensions coming from server-side
rendering engines that are not allowed. Otherwise, this recommendation goes
directly against saying variants should be indicated using ".html", ".pdf"
etc. Maybe it would be a good idea to merge 6.4.2 and 6.4.3 to better
explain the restriction.
This leaves an ambiguity, by the way: what do I do if my resource is a PHP
script displayed as a resource?
Lastly, the example "http://example.com/foo.language.format" indicates that
language always comes first and format second, which is a problem if one
wants to specify only a format. Taking example 10 and looking at
"http://example.com/palma.es" and "http://example.com/palma.xml", it is
unclear why "xml" would not be a language like "es" is.
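
The ambiguity can be illustrated with a small Python sketch. It compares a
purely positional reading of "foo.language.format" with a reading that
relies on two registries. The registries below are hypothetical and only
contain the handful of values needed for the example; the document would
have to say where such lists come from.

```python
from urllib.parse import urlsplit

LANGUAGES = {"de", "en", "es"}     # hypothetical registry of language codes
FORMATS = {"html", "pdf", "xml"}   # hypothetical registry of format codes


def extensions(uri):
    """Return the dot extensions of the last path segment, in order."""
    last_segment = urlsplit(uri).path.rsplit("/", 1)[-1]
    return last_segment.split(".")[1:]


def positional(uri):
    """Read the extensions strictly as name.language.format."""
    exts = extensions(uri)
    return {
        "language": exts[0] if len(exts) > 0 else None,
        "format": exts[1] if len(exts) > 1 else None,
    }


def by_registry(uri):
    """Classify each extension by looking it up in the registries."""
    result = {"language": None, "format": None}
    for ext in extensions(uri):
        if ext in LANGUAGES:
            result["language"] = ext
        elif ext in FORMATS:
            result["format"] = ext
    return result


for uri in ("http://example.com/palma.es", "http://example.com/palma.xml"):
    print(uri, positional(uri), by_registry(uri))
```

Read positionally, "xml" ends up in the language slot; only the registry
lookup tells the two cases apart, which is why the rule needs to be spelled
out.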

* 6.4.4 Dot character in archival
Typo "or two transform" -> "or to transform"
I don't get this... If we look at what the web archive is doing, they just
append the URI of the archived resource to the resolver, e.g.
"http://web.archive.org/web/20010201000000*/http://www.google.fr", without
renaming it. Why would it be best to change every dot into a dash?

* 6.5 Comuri query
"This mechanism does not exist in servers and it has to be implemented" ->
Does this mean we need an update of the HTTP spec?
How would this section apply to the "file" scheme? Is it just not
supported?

* 7. URI metadata
"URI metadata is the metadata associated to the resource, such as the
Dublin Core. " -> Dublin Core is a standard that can be used to describe
the metadata; it is not the metadata of the URI.

* 7.2 URI metadata structure
Typo "for appropriate for the" -> "appropriate for the"

* 7.4 XHTML-ID
What is the motivation for proposing a new extension to XHTML instead of
just using HTML5 and microdata? Using "itemid" instead of "id" is not a
big stretch.

* 8. Ultrapersistent URI
I love this section title ^_^

* 8.2 Data archival
Could you add references for the classification of archival services, that
is, "Online archival sites", "Offline archival" and "Pack"?

* 8.5.1 Online data
"If other techniques are used, URI should take priority, For example, if
the appropriate header field request the German variant of the resource and
the URI request the Spanish variant, the server should send the Spanish
variant. " -> I think this is actually up to HTTP, or its implementers, to
decide. COMURI is just using it and cannot make any assumptions about
its behaviour.

* 8.6.1 Static data
In this section the example for "Static data" shows what was called
"Offline data" in section 8.5.2 just a few lines before.
As I already indicated earlier in some mails, I really don't think
restricting the usage of numbers to indicate versions is a workable
solution. The way versions are indicated should follow a specific pattern,
which is not just saying "whatever number is found at the end of the
resource identifier is the version". There is already a strict
specification for languages and formats, so why not just extend it with
versions? For instance, "http://example.com/foo.language.version.format" or
"http://example.com/foo.version.language.format"? Not that I would find
this a really good solution either, but at least it would be consistent
with the rest of the specification.
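
As a rough illustration of the first pattern, here is a minimal Python
sketch of a parser for "foo.language.version.format". The concrete syntax
is an assumption made for the example only (two lowercase letters for the
language, digits for the version, lowercase letters or digits for the
format); the pattern itself does not fix any of this.

```python
import re
from urllib.parse import urlsplit

# Assumed syntax for illustration: name.language.version.format
PATTERN = re.compile(
    r"(?P<name>[^./]+)"
    r"\.(?P<language>[a-z]{2})"
    r"\.(?P<version>[0-9]+)"
    r"\.(?P<format>[a-z0-9]+)"
)


def parse(uri):
    """Split the last path segment into its parts, or return None."""
    last_segment = urlsplit(uri).path.rsplit("/", 1)[-1]
    match = PATTERN.fullmatch(last_segment)
    if match is None:
        return None
    parts = match.groupdict()
    parts["version"] = int(parts["version"])
    return parts


print(parse("http://example.com/foo.es.3.xml"))
print(parse("http://example.com/1234"))
```

With an explicit slot for the version, a number that merely happens to sit
at the end of an identifier is no longer read as a version.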

* 9.1 Language neutral URI
"http://example.com/1234" -> version 1234 of the default resource for
"example.com"?

* 9.2 Language identification in URIs
Indicating the language as part of the domain name does not seem to be
consistent with the rest of the document.

Cheers,

Christophe

On 18 September 2014 15:48, Manuel.CARRASCO-BENITEZ@ec.europa.eu <
Manuel.CARRASCO-BENITEZ@ec.europa.eu> wrote:

> Dear WG members,
>
> Please could you comment on
>   Compact Uniform Resource Identifier (COMURI)
>   http://dragoman.org/comuri
>   mirror -  https://joinup.ec.europa.eu/site/med/dragoman/comuri
>
> It is nearly completed and as per the calendar, the First Public Working
> Draft is planned by the 30 Sep
>   http://www.w3.org/2013/meeting/dwbp/2014-08-22#URI_construction
>
> The language in the final version will be corrected by a proof-reader.
>
> Regards
> Tomas
>
>
>


-- 
Researcher
+31(0)6 14576494
christophe.gueret@dans.knaw.nl

*Data Archiving and Networked Services (DANS)*

DANS promotes sustainable access to digital research data. See
www.dans.knaw.nl for more information. DANS is an institute of KNAW and
NWO.


Please note: as of 1 January we have a new address:

DANS | Anna van Saksenlaan 51 | 2593 HW Den Haag | Postbus 93067 | 2509 AB
Den Haag | +31 70 349 44 50 | info@dans.knaw.nl |
www.dans.knaw.nl


*Let's build a World Wide Semantic Web!*
http://worldwidesemanticweb.org/

*e-Humanities Group (KNAW)*
http://www.ehumanities.nl/

Received on Thursday, 25 September 2014 12:59:59 UTC