Summary of planned glossary changes (Re: web resource and terminology)

From: Ivan Herman <ivan@w3.org>
Date: Tue, 29 Sep 2015 11:30:35 +0200
Cc: Bill Kasdorf <bkasdorf@apexcovantage.com>, Leonard Rosenthol <lrosenth@adobe.com>, Matt Garrish <matt.garrish@bell.net>, Markus Gylling <markus.gylling@gmail.com>, Brady Duga <duga@google.com>
Message-Id: <87CE1311-F42E-4F14-8508-BDB23B53B3F0@w3.org>
To: W3C Digital Publishing IG <public-digipub-ig@w3.org>
This is partially an answer to Matt, thereby (hopefully) closing the separate thread, plus a summary of yesterday's discussions.


> On 28 Sep 2015, at 19:50 , Matt Garrish <matt.garrish@bell.net> wrote:
> 
> Sorry, my head's in a cloud, but on looking at the definitions again if you
> add the definition of content that alone would address most of my concerns
> (if "collated" goes).

I would propose to do that.

Combined with the results of the discussions on yesterday's IG call, here are the changes I plan to make to the glossary (both on the wiki page and in the PWD document):

- Rename (Portable) Web Document to (Portable) Web Publication overall (and close the relevant issue on github)
- Add the definition of 'Content', taking over (possibly adjusting) the WCAG definition
- Add a separate bullet item to Web Publication emphasizing the fact that the set has its own URI
- Change 'collated' to 'aggregated'
- Change the online/offline distinction to protocol vs. file access, making it clear that this distinction has a value for the way user agents operate (and that is why we have to distinguish between the two states).
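
To make the planned definitions a bit more concrete, here is a rough sketch of how they could be modelled; it is an illustration only, not proposed glossary text. The names `WebResource`, `WebPublication` and `access_mode`, as well as the example URIs, are hypothetical. Treating the `file:` scheme as "file access" and every other scheme as "protocol access" is my assumption about how a user agent might draw the line.

```python
from dataclasses import dataclass, field
from typing import List
from urllib.parse import urlparse


@dataclass(frozen=True)
class WebResource:
    """A digital resource that is uniquely addressed by a URI."""
    uri: str


@dataclass
class WebPublication:
    """An aggregated set of Web Resources; the set itself has its own URI."""
    uri: str
    resources: List[WebResource] = field(default_factory=list)


def access_mode(uri: str) -> str:
    """Classify an absolute URI as 'file' or 'protocol' access by its scheme.

    Assumption: only the file: scheme counts as file access.
    """
    scheme = urlparse(uri).scheme.lower()
    if not scheme:
        # A relative reference must be resolved against a base URI first.
        raise ValueError("expected an absolute URI with a scheme")
    return "file" if scheme == "file" else "protocol"
```

For example, `access_mode("https://example.org/pub/")` falls into the protocol case, while `access_mode("file:///tmp/pub/index.html")` falls into the file case; the same publication could be reached either way, which is why the two states need to be distinguished.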

Anything I missed?

As for the PWD (now PWP) document itself (ie, the replacement of the EPUB+WEB document) I propose

- (of course) update the definitions of PWP along these lines
- *Add* the definitions of the states, too (with the change above)

Actually, I would consider making the PWP document the "authoritative" one and, instead of having the glossary items in duplicate, adding a pointer on the wiki page to the PWP editor's draft. It is never good to have the same things repeated…

Unless somebody is loudly objecting to these, I will do the changes soon, probably tomorrow.

On practical terms, I would prefer if we could use the github issue list with specific issues; it helps keep the document evolving. Of course, if you do not have a github id (or you do not want to have one) then email works as well…

Thanks to all,

Ivan



> It would give context to "web resource" as a
> content-carrying resource and not just anything. Assuming that's what you
> want web resource to mean now.
> 
> Matt
> 
> -----Original Message-----
> From: Matt Garrish [mailto:matt.garrish@bell.net]
> Sent: September 28, 2015 11:43 AM
> To: 'Ivan Herman' <ivan@w3.org>
> Cc: 'Bill Kasdorf' <bkasdorf@apexcovantage.com>; 'Leonard Rosenthol'
> <lrosenth@adobe.com>; 'W3C Digital Publishing IG' <public-digipub-ig@w3.org>
> Subject: RE: web resource and terminology
> 
> Rather than pick at what I think are ambiguous words, here’s a quick rundown
> of changes I’d suggest:
> 
> Don't use "content" in the web resource definition because content means
> something else in the context of a document you read. You already say it is
> a digital resource, so state that it can be accessed and leave it at that:
> 
> A Web Resource is a digital resource that can be uniquely addressed by a
> Uniform Resource Identifier (URI) [URI], and _that_ can be accessed through
> standard protocols like HTTP, FTP, the File Protocol, etc.
> 
> Include the wcag definition perhaps tweaked to:
> 
> content (Web content)
> information and sensory experience to be communicated to the user by means
> of a user agent, including all _web resources_ that define the
> content's structure, presentation, and interactions
> 
> Then move essential content and functionality under this definition where
> they're more tightly bound.
> 
> State what a web document represents (single document/publication), not only
> that it is a web resource. You start the definition saying it is a web
> resource, so why end it saying it's to be considered as one?
> 
> Sorry if I'm being pedantic about these definitions, but I think they make
> the concepts harder to understand than they need to be.
> 
> Anyway, I'm coming down with yet another cold, so if I go silent for a while
> I'm not just ignoring people.
> 
> Matt
> 
> ________________________________________
> From: Ivan Herman [mailto:ivan@w3.org]
> Sent: September-28-15 10:13
> To: Matt Garrish
> Cc: Bill Kasdorf; Leonard Rosenthol; W3C Digital Publishing IG
> Subject: Re: web resource and terminology
> 
> 
> On 28 Sep 2015, at 16:01 , Matt Garrish <matt.garrish@bell.net> wrote:
> 
> Getting rid of "collate" is a useful step toward clarity, but I'm not
> suggesting dropping resources for content. What I'm saying is that you're
> already using content without any clear definition of what you mean when you
> use it, and that's equally confusing.
> 
> I misunderstood you. I really thought you wanted to remove resources in
> favour of content
> 
> Just to be clear, would it be o.k. with you and others if I
> 
> - copied the WCAG definition for content into the definition, ie:
> 
> • content (Web content): information and sensory experience to be
> communicated to the user by means of a user agent, including code or markup
> that defines the content's structure, presentation, and interactions
> - changed collate to aggregate
> 
> I am fine making these changes, unless somebody stops me:-)
> 
> Ivan
> 
> 
> 
> To run through your definitions again, from web resource:
> 
>> and whose content can be accessed
> 
> What does content mean here? A style sheet has content that can be accessed
> by any protocol. It's like you're trying to scope the RDF meaning of web
> resource here without stating why this even matters. There's a difference
> between the content of a file and the content that gets consumed by a user.
> WCAG recognizes this, but you took two sub-definitions and omitted stating
> what content is. It leaves me having to read between the lines.
> 
>> Essential Content of a Web Resource: if removed, would fundamentally
>> change the information or functionality of the content.
> 
> Here content becomes "essential", but the only "content" mentioned so far is
> the data of the resource. Isn't all the data of a single resource
> fundamental? Why would any user agent be removing bytes of data? This
> statement makes no sense unless I go off on my own tangent and assume that
> you don't really mean the data of the resource anymore but (perhaps) other
> resources that are referenced by the resource (e.g., images, audio, video,
> etc.).
> 
> I'd ask in that case why essential content isn't defined under web document,
> since the impact is on the document, whether or not it affects a particular
> resource. If you remove certain essential resources, I can follow that you
> break the fundamental information/functionality expressed by the document.
> 
> Anyway, I'm running out of steam. Most casual readers I suppose skip right
> over terminology, anyway, and read whatever meaning they want into documents
> from their titles and loose skimming of the content...
> 
> Matt
> 
> From: Ivan Herman [mailto:ivan@w3.org]
> Sent: September 28, 2015 7:40 AM
> To: Matt Garrish <matt.garrish@bell.net>
> Cc: Bill Kasdorf <bkasdorf@apexcovantage.com>; Leonard Rosenthol
> <lrosenth@adobe.com>; W3C Digital Publishing IG <public-digipub-ig@w3.org>
> Subject: Re: web resource and terminology
> 
> (A common response to the thread, not only to this mail.)
> 
> - I must admit I do not have the same feeling about "resource" v.a.v.
> "content". I guess everyone comes with a different baggage that influences
> our reactions. For me (and I think it was Deborah who brought this into the
> discussion) the term 'resource' is very generic and I was primarily
> influenced by the term as used in RDF[1], although we intentionally
> restricted the RDF term to Web resources (in RDF, conceptually, I can also
> be considered as a resource:-).
> 
> Also, to be awfully pedantic: the "content" of a resource is not the same as
> the resource itself. If I remove some content from a resource, it is still
> the same resource, though with a different content. Ie, I do not think
> relying exclusively on the concept of 'content' would cut it either.
> 
> - I accept the criticism on "collation". I must admit I did not realize it
> has the concept of ordering in it but I obviously yield to my anglo-saxon
> colleagues (and the Merriam Webster entry:-).
> 
> Trying to retrace the history in the thread[2], the way we got to this term
> (and not only use 'set') is, primarily, because we wanted to differentiate
> between a random set of resources bound together and something with a clear
> intention of expressing something. The term 'curated' did come up, but there
> was a sense that the term has a jargon meaning in museums or libraries, ie,
> we should avoid using it. "Collated" came into the picture, expressing the
> intentionality. Another term that did come up during the discussion is
> "aggregated"; maybe that term is better than "collated". I just checked in
> Merriam Webster, and this term does not suggest ordering, so I am happy to
> change that if people agree.
> 
> Thanks
> 
> Ivan
> 
> 
> 
> [1] http://www.w3.org/TR/rdf11-concepts/#resources-and-statements
> [2] http://j.mp/1O8eB6g
> 
> On 28 Sep 2015, at 01:45 , Matt Garrish <matt.garrish@bell.net> wrote:
> 
> I just hate nuances, and web document and html document are often used
> interchangeably without consideration that a web document isn't restricted
> to being an html document. It's clear that html documents aren't the only
> content-carrying resources allowed, but outside an audience well-versed in
> web terminology I expect the difference will get lost. I did a quick search
> and after an initial wikipedia entry that got it right, every use equated
> web document with html page.
> 
> But I get there is also awkwardness when you do, in fact, only want to
> represent a single html document.
> 
> Go with Portable Web Content and no one wins... ;)
> 
> Matt
> 
> From: Bill Kasdorf [mailto:bkasdorf@apexcovantage.com]
> Sent: September 27, 2015 7:11 PM
> To: Matt Garrish <matt.garrish@bell.net>; 'Leonard Rosenthol'
> <lrosenth@adobe.com>; 'W3C Digital Publishing IG' <public-digipub-ig@w3.org>
> Subject: RE: web resource and terminology
> 
> On "curation," I wasn't actually recommending it, I was just speculating
> that perhaps that was what was meant rather than "collation." I agree, a
> more neutral term, something like "assemble" or "collect" or their noun
> forms might be best. "Assemble" has the connotation of a bunch of stuff
> intended to work together, whereas "collect" really just connotes "gather
> together."
> 
> I like the direction you're going with the definition, but I still have a
> problem calling it a Web Document instead of a Web Publication. I have a
> hard time thinking of a big complex collection of resources as a document,
> but I don't have a hard time thinking of a simple standalone document as a
> publication.
> 
> --Bill K
> 
> From: Matt Garrish [mailto:matt.garrish@bell.net]
> Sent: Sunday, September 27, 2015 7:04 PM
> To: Bill Kasdorf; 'Leonard Rosenthol'; 'W3C Digital Publishing IG'
> Subject: RE: web resource and terminology
> 
> I agree that's better than collation, but curation is still odd. Do you
> curate your epub file? Do you curate a web page to make a portable
> representation of it?
> 
> Using "curation" also suggests strong ties with digital curation, and, while
> that activity might use this portable format as part of the larger
> process of curation, it seems like unnecessary baggage to saddle the
> definition with.
> 
> Is how the resources came to be collected together of any importance
> compared to what they're intended to represent? That point is currently hard
> to discern, but why not something like "A Web Document is a set of
> interrelated Web Resources that is intended to be considered as a single
> document or publication."?
> 
> Matt
> 
> From: Bill Kasdorf [mailto:bkasdorf@apexcovantage.com]
> Sent: September 27, 2015 4:53 PM
> To: Leonard Rosenthol <lrosenth@adobe.com>; Matt Garrish
> <matt.garrish@bell.net>; 'W3C Digital Publishing IG'
> <public-digipub-ig@w3.org>
> Subject: RE: web resource and terminology
> 
> Actually, three of the four non-religious definitions of "collate" in
> Merriam Webster are about arranging in a proper order, and people in
> publishing almost always associate it with ordering. So although you're
> technically correct that it doesn't always mean ordering, most of the time
> it does.
> 
> My guess was that possibly "curation" was meant, not "collation," which has
> more of a sense of a purposeful gathering together.
> 
> 
> 
> From: Leonard Rosenthol [mailto:lrosenth@adobe.com]
> Sent: Sunday, September 27, 2015 1:55 PM
> To: Matt Garrish; 'W3C Digital Publishing IG'
> Subject: Re: web resource and terminology
> 
> Matt – let me see if I can help.  (and, anyone else, feel free to correct
> me)
> 
> You are correct that a style sheet and a script (or a font) are as much
> resources as HTML is.  That is as it should be, because in the context of a
> web document, they aren’t necessarily different.  There is no reliance on a
> “primary resource” (as there is with EPUB, for example).
> 
> Essential content is what would be displayed to the user and/or machine
> processor – depending on the context.   So it might be displaying text, or a
> .csv of spreadsheet data or … But it’s not a font, for example, that
> wouldn’t (necessarily) change the content itself (granted there are
> exceptions to that rule as well, but…)
> 
> Collation is simply a grouping – it has nothing to do with ordering.
> 
> I don’t recall if “web content” was suggested or not, but from your
> description, I don’t think it fits our model (or at least mine).  There are
> things that fit into a PWD that are neither “web content” nor “rendering
> resource” - for example, my .csv in the previous example.  But that is a
> perfectly valid web resource.  I think web resource is a more generic form
> of both – and maybe we could define it that way, if necessary. (though I
> don’t see the necessity right now).
> 
> I think the single page vs. multiple page – or the general problem of
> “sectioning” a web document hasn’t yet been raised.
> 
> Leonard
> 
> From: Matt Garrish
> Date: Sunday, September 27, 2015 at 8:52 AM
> To: 'W3C Digital Publishing IG'
> Subject: web resource and terminology
> Resent-From: <public-digipub-ig@w3.org>
> Resent-Date: Sunday, September 27, 2015 at 8:52 AM
> 
> I've been trying to read through the terminology and find there's a
> confusing reliance on "web resource" to mean both the content of the
> document/publication and the resources needed to render the document.
> 
> The definition of web resource seems reasonable enough, in that anything
> that can be referenced by a URI is a resource. By that definition, an HTML
> document is a web resource, but so is a style sheet, script, etc. Stating
> that the content of the resource can be retrieved by a protocol doesn't mean
> that a resource has content in the readable content of the document sense
> (e.g., a style sheet's "content" is all the rules defined in it).
> 
> The two sub-bullets then start to make an unstated distinction between types
> of web resources, however, as an html document will have "essential
> content", but a style sheet or script wouldn't appear to.
> 
> The confusion grows in the web document definition, as now web resources are
> "collated." Is it really the case that fonts, scripts, etc. are combined
> into a specific ordering? I didn't follow the entire email chain,
> unfortunately, but I do recall seeing this in relation to an ordering of the
> content in the web document. Collation makes sense in that context, as it is
> analogous to the epub spine.
> 
> And finally, web resource reappears in its more general sense in the third
> bullet, but here suggesting "essentiality" of certain resources but not
> others (I take from the discussions this has to do with not every resource
> impacting the overall readability).
> 
> Long story short, was consideration given to including a definition of "web
> content" (as also exists in WCAG) to disambiguate these many uses of "web
> resource" for both content and rendering resources? Essential web content
> and functionality is clearer than stated now for resources. A web document
> as a collation of web content is also clearer, and it being a web resource
> is less confusing. Portability would depend on the ability to present the
> content, even if some rendering resources aren't available.
> 
> Anyway, just wanted to share that thought I had while reading. The
> definitions are very nuanced right now without the context of the email
> discussions.
> 
> And as a side note, if "web document" is the ultimate choice for this then
> it might be good to bump up in importance that web document != html document
> from the last sub-bullet of the web document definition. I expect the terms
> are read as synonymous by many people, in which case having a web document
> made up of resources makes it sound like you're defining portability only
> for single pages.
> 
> Matt
> 
> 
> ----
> Ivan Herman, W3C
> Digital Publishing Lead
> Home: http://www.w3.org/People/Ivan/
> mobile: +31-641044153
> ORCID ID: http://orcid.org/0000-0003-0782-2704


----
Ivan Herman, W3C
Digital Publishing Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
ORCID ID: http://orcid.org/0000-0003-0782-2704





Received on Tuesday, 29 September 2015 09:30:58 UTC
