RE: Compact Uniform Resource Identifier (CURI) from Manuel.CARRASCO-BENITEZ@ec.europa.eu on 2014-08-29 (public-dwbp-wg@w3.org from August 2014)

From: <Manuel.CARRASCO-BENITEZ@ec.europa.eu>
Date: Fri, 29 Aug 2014 08:37:03 +0000
To: <leigh@ldodds.com>
CC: <public-dwbp-wg@w3.org>
Message-ID: <39DB516E46C0E842A2CFFF1BBB7412F15F81C441@S-DC-ESTF01-J.net1.cec.eu.int>
# Dear Leigh and Mark,

Some initial comments based on a brief review. I see Mark Harrison has
already noted the name clash with CURIE.

# Name
If there is concern about the name, it can be changed to COMURI. I was aware of CURIE, but it might be better to avoid a Java/JavaScript-style confusion :-)

Based on the working group charter I had assumed that the working
group would be publishing guidance on creating and maintaining
persistent URIs and URI schemes that covered similar topics to the UK
and Australian guidance [1, 2] and RFC 7320 [3]. The URI Design and
Ownership draft is also relevant [4] (particularly with respect to
some of my comments below).

Is that material to be covered elsewhere? It would be really useful to
see some of the existing deployed guidance generalised into reusable
best practices relevant to data on the web.

# Persistent URI
This aspect is deliberately not covered, though as you rightly point out, it should be covered by the group.

However this document profiles existing standards (HTTP, URIs, etc) by
limiting, e.g. length/format of domain names, legal characters in
URIs, etc. In practice, these might be useful things to do and
something to consider when designing a URI scheme, but I'm not sure
it's correct to mandate a single approach complete with conformance
requirements.

# Design choices
The design choices are deliberate: one has to balance the area covered against the cost. Standards are about making choices and mandating them; users who want something else can choose another standard.

Some specific questions:

* Why is URI length the overriding design characteristic?

# URI length
As declared, the intention is to have "human and machine" readable URIs: at the beginning of the web they were called "napkin transportable URIs".

* Why are longer domain names "bad"?

# Domain
The domain is part of the authority component of the URI, so a longer domain makes the overall URI longer.

* Why must the path only have a single component? Plenty of existing,
stable URI schemes have longer paths

# Path segments
One path segment is recommended; longer paths are allowed, particularly for the “file” scheme. Again, the path is a component of the URI, so a longer path makes the overall URI longer.
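
As a rough illustration of the two answers above (this is not taken from the CURI draft), the following Python sketch splits a URI into its components and reports how much the authority and the path each contribute to the total length. The function name `component_lengths` and the sample URIs are assumptions made for this example only.

```python
from urllib.parse import urlsplit


def component_lengths(uri: str) -> dict:
    """Return the length of the URI, of its authority and path, and the segment count."""
    parts = urlsplit(uri)
    # Non-empty path segments, e.g. "/a/b/c" -> ["a", "b", "c"]
    segments = [s for s in parts.path.split("/") if s]
    return {
        "total": len(uri),
        "authority": len(parts.netloc),
        "path": len(parts.path),
        "segments": len(segments),
    }


# Hypothetical URIs: a short domain with one path segment,
# and a longer domain with three path segments.
short = component_lengths("http://example.com/foo")
longer = component_lengths("http://data.example.com/id/dataset/foo")
```

Comparing `short` and `longer` shows that every extra character in the domain or the path adds directly to the total.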

* Why must language codes be given using a dotted extension?

# Language as extension
This is common practice; for example, it has been implemented in Apache for over 15 years. 
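
A minimal sketch of how a dotted language extension could be read from a URI, assuming the pattern name.language.format in the last path segment. The function name `split_language_extension` and the rule "language is present only when there are two or more extensions" are assumptions for this example; they are not how Apache or the CURI draft define it.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def split_language_extension(uri: str):
    """Split the last path segment into (name, language, format).

    Assumed pattern: name[.language].format. With a single extension,
    that extension is treated as the format and the language is None.
    """
    filename = PurePosixPath(urlsplit(uri).path).name
    name, *extensions = filename.split(".")
    language = extensions[0] if len(extensions) >= 2 else None
    fmt = extensions[-1] if extensions else None
    return name, language, fmt


print(split_language_extension("http://example.com/foo.fr.pdf"))
```

With this rule, `foo.fr.pdf` yields the name `foo`, the language `fr` and the format `pdf`, while `foo.pdf` yields no language.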

* Why must URIs only contain ASCII characters? RDF 1.1 was recently
updated to use IRIs, which have a larger repertoire of characters

# IRI
IRIs are allowed, though it is recommended to stick to the original URI character set. One should avoid unnecessary complications and most of the time the original character set is sufficient.
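
For illustration, a small sketch of checking whether an identifier stays within the original URI character set. It assumes "original URI character set" means the unreserved and reserved characters of RFC 3986 plus "%" for percent-encoded octets; the function name `uses_uri_characters_only` is hypothetical, and the check covers the character repertoire only, not the full URI grammar.

```python
import re

# RFC 3986 unreserved and reserved characters, plus "%" for percent-encoding.
_URI_CHARS = re.compile(r"[A-Za-z0-9\-._~:/?#\[\]@!$&'()*+,;=%]*")


def uses_uri_characters_only(identifier: str) -> bool:
    """Return True if every character belongs to the RFC 3986 URI repertoire."""
    return _URI_CHARS.fullmatch(identifier) is not None


# An IRI with a non-ASCII character fails; its percent-encoded form passes.
iri_ok = uses_uri_characters_only("http://example.com/caf\u00e9")
uri_ok = uses_uri_characters_only("http://example.com/caf%C3%A9")
```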

* Why use specific reserved parameters, rather than existing
mechanisms, e.g. HEAD, OPTIONS, HTTP headers to communicate metadata?

# Header fields
The objective is to make some of the functionality of the header fields available in the URI; the two are integrated:
  “Curi query takes precedence over parameters in the server or the HTTP header fields.”

Though I have a preference for the header fields, end users really demand these facilities in the URI. For example:

 http://example.com/foo.fr.pdf        # end users want this facility
 http://example.com/foo               # the user has to install a browser add-on and manipulate the header fields (observed in a recent implementation)
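
The precedence rule quoted above could be sketched roughly as follows: a value given in the URI wins over the HTTP header field, which in turn wins over the server setting. The function name `resolve_language`, its parameters and the default "en" are assumptions for this example. Taking the first tag listed in Accept-Language is a deliberate simplification; a real implementation would honour the q-values.

```python
def resolve_language(uri_language, accept_language, server_default="en"):
    """Pick the response language: URI first, then header field, then server default."""
    if uri_language:
        return uri_language
    if accept_language:
        # Simplification: take the first listed tag and ignore q-values.
        first = accept_language.split(",")[0].split(";")[0].strip()
        if first:
            return first
    return server_default


# The URI asks for French while the header field prefers German: the URI wins.
chosen = resolve_language("fr", "de, en;q=0.8")
```

When the URI carries no language, the header field is used, and the server default applies only when both are absent.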

* Why the recommendation to use file URIs for data to be published to
the web? I don't think the distinction between static/dynamic data
belongs here anyway.

# File URI
Reality: lots of data is offline and one cannot even assume the facilities provided by HTTP; hence we have to address it. Static/dynamic data is only mentioned to explain the relation to the “http” and “file” schemes; anything beyond this is very much out of scope.

As it stands this specification looks like it would declare rather a
lot of existing well-maintained and well-designed URI schemes as
"invalid".

# Legacy
The group is about best practices, which implies looking at current practices and picking from different sources. Whatever our future best practices turn out to be, many (if not most) will not be following them at first; hopefully they will be followed in the future. But this is not an essential component of the Web such as HTTP, where breaking existing practice would be totally unacceptable.

# Conclusion
The rationale of the proposal is to address URIs by looking at current practices and requirements, while staying very aware of the cost of each functionality. It is not about my preferences; it is about the demands of the end users (eaters) and not just of the people creating the standards (cooks).

I prefer header fields, I am very aware of IRIs and their dangers (see the acknowledgements), and it would be wonderful if all the data were online, but …

# Thanks
Thanks for taking the time to comment
Regards
Tomas



Cheers,

L.

[1]. https://www.gov.uk/government/publications/designing-uri-sets-for-the-uk-public-sector
[2]. https://github.com/AGLDWG/TR/wiki/URI-Guidelines-for-publishing-linked-datasets-on-data.gov.au-v0.1
[3]. http://www.rfc-editor.org/rfc/rfc7320.txt
[4]. http://tools.ietf.org/html/draft-ietf-appsawg-uri-get-off-my-lawn-05

On Thu, Aug 28, 2014 at 3:44 PM,  <Manuel.CARRASCO-BENITEZ@ec.europa.eu> wrote:
> Dear all,
>
> I changed the name of the draft from
>   Old   : Best Practice for Web Data URI (DAURI)
>   New : Compact Uniform Resource Identifier (CURI)
>
> It is at
>   http://dragoman.org/curi
>
> A copy for the people having problems reading dragoman at
>   https://joinup.ec.europa.eu/site/med/dragoman/curi
>
> CURI must be considered a new version of DAURI. The name change is to better reflect the "compact" (term copied from RFC3986) aspect over the "data" aspect, though the data requirements must be fully supported. I will continue to work on it, and the objective is to have the First Public Working Draft by 30 September 2014. I will change our wiki pages and load it to Github when appropriate.
>
> Regards
> Tomas
>
>



--
Leigh Dodds
Freelance Technologist
Open Data, Linked Data Geek
t: @ldodds
w: ldodds.com
e: leigh@ldodds.com
Received on Friday, 29 August 2014 08:37:34 UTC