- From: Ivan Herman <ivan@w3.org>
- Date: Tue, 29 Sep 2015 11:30:35 +0200
- To: W3C Digital Publishing IG <public-digipub-ig@w3.org>
- Cc: Bill Kasdorf <bkasdorf@apexcovantage.com>, Leonard Rosenthol <lrosenth@adobe.com>, Matt Garrish <matt.garrish@bell.net>, Markus Gylling <markus.gylling@gmail.com>, Brady Duga <duga@google.com>
- Message-Id: <87CE1311-F42E-4F14-8508-BDB23B53B3F0@w3.org>
This is partially an answer to Matt, thereby closing (hopefully) the separate thread, plus a summary of yesterday's discussions.

> On 28 Sep 2015, at 19:50 , Matt Garrish <matt.garrish@bell.net> wrote:
>
> Sorry, my head's in a cloud, but on looking at the definitions again if you
> add the definition of content that alone would address most of my concerns
> (if "collated" goes).

I would propose to do that. Combined with the results of the discussions on yesterday's IG call, here are the changes I plan to make to the glossary (both on the wiki page and in the PWD document):

- Rename (Portable) Web Document to (Portable) Web Publication overall (and close the relevant issue on github)
- Add the definition of 'Content', taking over (possibly adjusting) the WCAG definition
- Add a separate bullet item to Web Publication emphasizing the fact that the set has its own URI
- Change 'collated' to 'aggregated'
- Change the online/offline to protocol vs. file access, making it clear that this distinction has a value for the way user agents operate (and that is why we have to distinguish between the two states)

Anything I missed?

As for the PWD (now PWP) document itself (ie, the replacement of the EPUB+WEB document) I propose to:

- (of course) update the definitions of PWP along these lines
- *Add* the definitions of the states, too (with the change above)

Actually, I would consider making the PWP document the "authoritative" one and, instead of having glossary items in double, adding a pointer to the wiki page to the PWP editor's draft. It is never good to have the same things repeated…

Unless somebody loudly objects to these, I will make the changes soon, probably tomorrow.

In practical terms, I would prefer if we could use the github issue list with specific issues; it helps keep the document evolving.
Of course, if you do not have a github id (or you do not want to have one) then email works as well…

Thanks to all,

Ivan

> It would give context to "web resource" as a
> content-carrying resource and not just anything. Assuming that's what you
> want web resource to mean now.
>
> Matt
>
> -----Original Message-----
> From: Matt Garrish [mailto:matt.garrish@bell.net]
> Sent: September 28, 2015 11:43 AM
> To: 'Ivan Herman' <ivan@w3.org>
> Cc: 'Bill Kasdorf' <bkasdorf@apexcovantage.com>; 'Leonard Rosenthol'
> <lrosenth@adobe.com>; 'W3C Digital Publishing IG' <public-digipub-ig@w3.org>
> Subject: RE: web resource and terminology
>
> Rather than pick at what I think are ambiguous words, here’s a quick rundown
> of changes I’d suggest:
>
> Don't use "content" in the web resource definition because content means
> something else in the context of a document you read. You already say it is
> a digital resource, so state that it can be accessed and leave it at that:
>
> A Web Resource is a digital resource that can be uniquely addressed by a
> Uniform Resource Identifier (URI) [URI], and _that_ can be accessed through
> standard protocols like HTTP, FTP, the File Protocol, etc.
>
> Include the wcag definition perhaps tweaked to:
>
> content (Web content)
> information and sensory experience to be communicated to the user by means
> of a user agent, including all _web resources_ that define the
> content's structure, presentation, and interactions
>
> Then move essential content and functionality under this definition where
> they're more tightly bound.
>
> State what a web document represents (single document/publication), not only
> that it is a web resource. You start the definition saying it is a web
> resource, so why end it saying it's to be considered as one?
>
> Sorry if I'm being pedantic about these definitions, but I think they make
> the concepts harder to understand than they need to be.
>
> Anyway, I'm coming down with yet another cold, so if I go silent for a while
> I'm not just ignoring people.
>
> Matt
>
> ________________________________________
> From: Ivan Herman [mailto:ivan@w3.org]
> Sent: September-28-15 10:13
> To: Matt Garrish
> Cc: Bill Kasdorf; Leonard Rosenthol; W3C Digital Publishing IG
> Subject: Re: web resource and terminology
>
> On 28 Sep 2015, at 16:01 , Matt Garrish <matt.garrish@bell.net> wrote:
>
> Getting rid of "collate" is a useful step toward clarity, but I'm not
> suggesting dropping resources for content. What I'm saying is that you're
> already using content without any clear definition of what you mean when you
> use it, and that's equally confusing.
>
> I misunderstood you. I really thought you wanted to remove resources in
> favour of content.
>
> Just to be clear, would it be o.k. with you and others if I
>
> - copied the WCAG definition for content into the definition, ie:
>
>   • content (Web content): information and sensory experience to be
>     communicated to the user by means of a user agent, including code or markup
>     that defines the content's structure, presentation, and interactions
> - changed collate to aggregate
>
> I am fine with making these changes, unless somebody stops me:-)
>
> Ivan
>
> To run through your definitions again, from web resource:
>
>> and whose content can be accessed
>
> What does content mean here? A style sheet has content that can be accessed
> by any protocol. It's like you're trying to scope the RDF meaning of web
> resource here without stating why this even matters. There's a difference
> between the content of a file and the content that gets consumed by a user.
> WCAG recognizes this, but you took two sub-definitions and omitted stating
> what content is. It leaves me having to read between the lines.
>
>> Essential Content of a Web Resource: if removed, would fundamentally
>> change the information or functionality of the content.
>
> Here content becomes "essential", but the only "content" mentioned so far is
> the data of the resource. Isn't all the data of a single resource
> fundamental? Why would any user agent be removing bytes of data? This
> statement makes no sense unless I go off on my own tangent and assume that
> you don't really mean the data of the resource anymore but (perhaps) other
> resources that are referenced by the resource (e.g., images, audio, video,
> etc.).
>
> I'd ask in that case why essential content isn't defined under web document,
> since the impact is on the document, whether or not it affects a particular
> resource. If you remove certain essential resources, I can follow that you
> break the fundamental information/functionality expressed by the document.
>
> Anyway, I'm running out of steam. Most casual readers I suppose skip right
> over terminology, anyway, and read whatever meaning they want into documents
> from their titles and loose skimming of the content...
>
> Matt
>
> From: Ivan Herman [mailto:ivan@w3.org]
> Sent: September 28, 2015 7:40 AM
> To: Matt Garrish <matt.garrish@bell.net>
> Cc: Bill Kasdorf <bkasdorf@apexcovantage.com>; Leonard Rosenthol
> <lrosenth@adobe.com>; W3C Digital Publishing IG <public-digipub-ig@w3.org>
> Subject: Re: web resource and terminology
>
> (A common response to the thread, not only to this mail.)
>
> - I must admit I do not have the same feeling about "resource" vis-à-vis
> "content". I guess everyone comes with different baggage that influences
> our reactions. For me (and I think it was Deborah who brought this into the
> discussion) the term 'resource' is very generic and I was primarily
> influenced by the term as used in RDF[1], although we intentionally
> restricted the RDF term to Web resources (in RDF, conceptually, I can also
> be considered as a resource:-).
>
> Also, to be awfully pedantic: the "content" of a resource is not the same as
> the resource itself.
> If I remove some content from a resource, it is still
> the same resource, though with a different content. Ie, I do not think
> relying exclusively on the concept of 'content' would cut it either.
>
> - I accept the criticism on "collation". I must admit I did not realize it
> has the concept of ordering in it but I obviously yield to my anglo-saxon
> colleagues (and the Merriam Webster entry:-).
>
> Trying to retrace the history in the thread[2], the way we got to this term
> (and not only use 'set') is, primarily, because we wanted to differentiate
> between a random set of resources bound together and something with a clear
> intention of expressing something. The term 'curated' did come up, but there
> was a sense that the term has a jargon meaning in museums or libraries, ie,
> we should avoid using it. "Collated" came into the picture, expressing the
> intentionality. Another term that did come up during the discussion is
> "aggregated"; maybe that term is better than "collated". I just checked in
> Merriam Webster, and this term does not suggest ordering, so I am happy to
> change that if people agree.
>
> Thanks
>
> Ivan
>
> [1] http://www.w3.org/TR/rdf11-concepts/#resources-and-statements
> [2] http://j.mp/1O8eB6g
>
> On 28 Sep 2015, at 01:45 , Matt Garrish <matt.garrish@bell.net> wrote:
>
> I just hate nuances, and web document and html document are often used
> interchangeably without consideration that a web document isn't restricted
> to being an html document. It's clear that html documents aren't the only
> content-carrying resources allowed, but outside an audience well-versed in
> web terminology I expect the difference will get lost. I did a quick search
> and after an initial wikipedia entry that got it right, every use equated
> web document with html page.
>
> But I get there is also awkwardness when you do, in fact, only want to
> represent a single html document.
>
> Go with Portable Web Content and no one wins...
> ;)
>
> Matt
>
> From: Bill Kasdorf [mailto:bkasdorf@apexcovantage.com]
> Sent: September 27, 2015 7:11 PM
> To: Matt Garrish <matt.garrish@bell.net>; 'Leonard Rosenthol'
> <lrosenth@adobe.com>; 'W3C Digital Publishing IG' <public-digipub-ig@w3.org>
> Subject: RE: web resource and terminology
>
> On "curation," I wasn't actually recommending it, I was just speculating
> that perhaps that was what was meant rather than "collation." I agree, a
> more neutral term, something like "assemble" or "collect" or their noun
> forms might be best. "Assemble" has the connotation of a bunch of stuff
> intended to work together, whereas "collect" really just connotes "gather
> together."
>
> I like the direction you're going with the definition, but I still have a
> problem calling it a Web Document instead of a Web Publication. I have a
> hard time thinking of a big complex collection of resources as a document,
> but I don't have a hard time thinking of a simple standalone document as a
> publication.
>
> --Bill K
>
> From: Matt Garrish [mailto:matt.garrish@bell.net]
> Sent: Sunday, September 27, 2015 7:04 PM
> To: Bill Kasdorf; 'Leonard Rosenthol'; 'W3C Digital Publishing IG'
> Subject: RE: web resource and terminology
>
> I agree that's better than collation, but curation is still odd. Do you
> curate your epub file? Do you curate a web page to make a portable
> representation of it?
>
> Using "curation" also suggests strong ties with digital curation, and, while
> that activity might use this portable format as part of the larger
> process of curation, it seems like unnecessary baggage to saddle the
> definition with.
>
> Is how the resources came to be collected together of any importance
> compared to what they're intended to represent? That point is currently hard
> to discern, but why not something like "A Web Document is a set of
> interrelated Web Resources that is intended to be considered as a single
> document or publication."?
>
> Matt
>
> From: Bill Kasdorf [mailto:bkasdorf@apexcovantage.com]
> Sent: September 27, 2015 4:53 PM
> To: Leonard Rosenthol <lrosenth@adobe.com>; Matt Garrish
> <matt.garrish@bell.net>; 'W3C Digital Publishing IG'
> <public-digipub-ig@w3.org>
> Subject: RE: web resource and terminology
>
> Actually, three of the four non-religious definitions of "collate" in
> Merriam Webster are about arranging in a proper order, and people in
> publishing almost always associate it with ordering. So although you're
> technically correct that it doesn't always mean ordering, most of the time
> it does.
>
> My guess was that possibly "curation" was meant, not "collation," which has
> more of a sense of a purposeful gathering together.
>
> From: Leonard Rosenthol [mailto:lrosenth@adobe.com]
> Sent: Sunday, September 27, 2015 1:55 PM
> To: Matt Garrish; 'W3C Digital Publishing IG'
> Subject: Re: web resource and terminology
>
> Matt – let me see if I can help. (and, anyone else, feel free to correct
> me)
>
> You are correct that a style sheet and a script (or a font) are as much
> resources as HTML is. That is as it should be, because in the context of a
> web document, they aren’t necessarily different. There is no reliance on a
> “primary resource” (as there is with EPUB, for example).
>
> Essential content is what would be displayed to the user and/or machine
> processor – depending on the context. So it might be displaying text, or a
> .csv of spreadsheet data or … But it’s not a font, for example, that
> wouldn’t (necessarily) change the content itself (granted there are
> exceptions to that rule as well, but…)
>
> Collation is simply a grouping – it has nothing to do with ordering.
>
> I don’t recall if “web content” was suggested or not, but from your
> description, I don’t think it fits our model (or at least mine). There are
> things that fit into a PWD that are neither “web content” nor “rendering
> resource” - for example, my .csv in the previous example.
> But that is a
> perfectly valid web resource. I think web resource is a more generic form
> of both – and maybe we could define it that way, if necessary. (though I
> don’t see the necessity right now).
>
> I think the single page vs. multiple page – or the general problem of
> “sectioning” a web document – hasn’t yet been raised.
>
> Leonard
>
> From: Matt Garrish
> Date: Sunday, September 27, 2015 at 8:52 AM
> To: 'W3C Digital Publishing IG'
> Subject: web resource and terminology
> Resent-From: <public-digipub-ig@w3.org>
> Resent-Date: Sunday, September 27, 2015 at 8:52 AM
>
> I've been trying to read through the terminology and find there's a
> confusing reliance on "web resource" to mean both the content of the
> document/publication and the resources needed to render the document.
>
> The definition of web resource seems reasonable enough, in that anything
> that can be referenced by a URI is a resource. By that definition, an HTML
> document is a web resource, but so is a style sheet, script, etc. Stating
> that the content of the resource can be retrieved by a protocol doesn't mean
> that a resource has content in the readable content of the document sense
> (e.g., a style sheet's "content" is all the rules defined in it).
>
> The two sub-bullets then start to make an unstated distinction between types
> of web resources, however, as an html document will have "essential
> content", but a style sheet or script wouldn't appear to.
>
> The confusion grows in the web document definition, as now web resources are
> "collated." Is it really the case that fonts, scripts, etc. are combined
> into a specific ordering? I didn't follow the entire email chain,
> unfortunately, but I do recall seeing this in relation to an ordering of the
> content in the web document. Collation makes sense in that context, as it is
> analogous to the epub spine.
>
> And finally, web resource reappears in its more general sense in the third
> bullet, but here suggesting "essentiality" of certain resources but not
> others (I take from the discussions this has to do with not every resource
> impacting the overall readability).
>
> Long story short, was consideration given to including a definition of "web
> content" (as also exists in WCAG) to disambiguate these many uses of "web
> resource" for both content and rendering resources? Essential web content
> and functionality is clearer than stated now for resources. A web document
> as a collation of web content is also clearer, and it being a web resource
> is less confusing. Portability would depend on the ability to present the
> content, even if some rendering resources aren't available.
>
> Anyway, just wanted to share that thought I had while reading. The
> definitions are very nuanced right now without the context of the email
> discussions.
>
> And as a side note, if "web document" is the ultimate choice for this then
> it might be good to bump up in importance that web document != html document
> from the last sub-bullet of the web document definition. I expect the terms
> are read as synonymous by many people, in which case having a web document
> made up of resources makes it sound like you're defining portability only
> for single pages.
>
> Matt

----
Ivan Herman, W3C
Digital Publishing Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
ORCID ID: http://orcid.org/0000-0003-0782-2704
Received on Tuesday, 29 September 2015 09:30:58 UTC