- From: Nick Ruffilo <nickruffilo@gmail.com>
- Date: Wed, 27 Jan 2016 10:18:05 -0500
- To: Craig Francis <craig.francis@gmail.com>
- Cc: Leonard Rosenthol <lrosenth@adobe.com>, W3C Digital Publishing IG <public-digipub-ig@w3.org>, Ivan Herman <ivan@w3.org>
- Message-ID: <CA+Dds5_K_xuyG085xci=kYENQ42RfGQuB0wbknvs6zDB5bwYAg@mail.gmail.com>
It's possible that PWP will not solve the use case of legal documents being static - and I believe that's OK. We cannot expect to have one document format to rule them all (MY PRECIOUS!)

Also - it's probably worth a discussion on the list whether we need to have something that works immediately or not. My feeling is that that is too restrictive.

-Nick

On Wed, Jan 27, 2016 at 6:25 AM, Craig Francis <craig.francis@gmail.com> wrote:

> On 26 Jan 2016, at 21:51, Leonard Rosenthol <lrosenth@adobe.com> wrote:
>
> PDF seems like a much better alternative. (NOTE: a PDF can be 100% identically accessible to HTML – it just happens that authoring accessible HTML is easier than accessible PDF, but that’s a tool issue not a format issue)
>
> Hi Leonard,
>
> In regards to accessibility, PDF can be marked up to help assistive devices like screen readers, but that markup is rarely (if ever) used.
>
> Actually, I've found too many PDF documents where the authoring tool has placed every character individually, but that can happen in any file format (unfortunately more so with PDF, probably because its authoring tools only focus on the visual output).
>
> And there are other accessibility problems with PDFs, such as the inability to change the font size (simply zooming in is not enough), the font cannot be changed (e.g. applying the OpenDyslexic <http://opendyslexic.org/get-it-free/> font), the font/background colours cannot be changed (e.g. colour contrast), and they cannot be re-formatted for different screen sizes (I think this affects anyone who has tried to read an A4 document on a 4" screen, needing to use horizontal and vertical scrolling)... and in most cases you can't even copy/paste the text content (try doing it when the document is formatted with 2 columns of text).
>
> Anyway, accessibility in PDFs aside...
>
> Because the PWP model seems to work with the assumption of a central URL, and new versions can be pulled down, it will not work for legal documents.
>
> So I agree with you Leonard, PDF seems to be a better alternative to PWP in these situations.
>
> All I'm proposing is a much simpler (but probably similar) file format to PWP, using Open Web Technology (not so much the Platform), nearly all of which is already available in web browsers.
>
> And I believe this provides a much better alternative to PDFs... the only downsides are that CSS cannot currently do the "pixel perfect" recreation of the document (like PDFs kind of do), and that there are many existing programs that already work with PDFs (this includes converting to a format that printers can understand).
>
> This also applies to the other document formats such as MHTML and DOCX (MS Office), which introduce security problems, and considerable development complexity.
>
> Craig
>
> On 26 Jan 2016, at 21:51, Leonard Rosenthol <lrosenth@adobe.com> wrote:
>
> > I do feel that there is a need for a document format, as per my understanding of PWP, that has the ability to be updated (e.g. for publications).
> > But that is different to files that need to remain as atomic units, that remain isolated from everything else.
>
> There is no requirement that a PWP needs to be updatable – that’s just one use case where it could. At the same time, there are also clear use cases (such as your own) where the document/publication is “atomic” or “unique” and would never be modified. And these criteria are also separate from others such as self-containment.
>
> Thanks for the info below – but I don’t see any advantage for HTML-based publications in those workflows. You wouldn’t be leveraging anything specific to the Open Web Platform and its ecosystem. PDF seems like a much better alternative.
> (NOTE: a PDF can be 100% identically accessible to HTML – it just happens that authoring accessible HTML is easier than accessible PDF, but that’s a tool issue not a format issue)
>
> Leonard
>
> From: Craig Francis <craig.francis@gmail.com>
> Date: Tuesday, January 26, 2016 at 9:53 AM
> To: Leonard Rosenthol <lrosenth@adobe.com>
> Cc: W3C Digital Publishing IG <public-digipub-ig@w3.org>, Ivan Herman <ivan@w3.org>, Nick Ruffilo <nickruffilo@gmail.com>
> Subject: Re: Proposal: PDF alternative using HTML (ZIP/GZIP)
>
> On 26 Jan 2016, at 12:47, Leonard Rosenthol <lrosenth@adobe.com> wrote:
>
> PWP is designed to cover all of those use cases, as there are many uses for publishing content – as seen in the myriad of industries that have adopted PDF.
>
> Hi Leonard,
>
> You are probably right, and I'm just thinking about it from a programmer's point of view (one who has to send reports).
>
> I do feel that there is a need for a document format, as per my understanding of PWP, that has the ability to be updated (e.g. for publications).
>
> But that is different to files that need to remain as atomic units, that remain isolated from everything else.
>
> We also need to think about how these files are consumed. For example, if I send you an ePub file today, you will probably want to open and save it in an e-reader with other books. Whereas if the email contained a PDF file, it would be opened/read, but ultimately closed and not saved (where the email can be archived if it needs to be read again later).
>
> I might be going into too many specifics, but I have a few examples below if you're interested.
>
> Craig
>
> I work for a company that assesses students with disabilities who are going to university.
>
> In the UK we have a couple of organisations, such as Student Finance England (SFE), who provide funding to those students, so they can then get the equipment or support they need.
>
> So the company I work for meets and does assessments for each student, gets quotes from suppliers, and makes recommendations as to what each student should have (e.g. a laptop, and note taking lessons).
>
> The report the assessor writes is currently sent to SFE as a PDF file, which introduces a few accessibility issues.
>
> Ideally I would instead create a HTML file, package that into a ZIP (to include some extra resources), and send it to SFE.
>
> But they will not open a HTML file due to the security implications (nor would any student who we send it to, assuming they know that the HTML file attachment can be opened in a web browser).
>
> Then, because SFE are so worried about the students' private information, they actually use PGP (the zip kind) and I believe they open the PDF report on a computer that has extremely limited access to the internet (as in, can only send and receive email).
>
> So when PWP does become available, I doubt they will accept it, especially if they know that the report could be updated/changed in any way.
>
> SFE then send out a DSA2 file (which authorises the supplier to dispatch the items), and the supplier in turn raises an invoice for SFE to pay... neither of these (currently PDF) documents can be editable from a technical or legal point of view.
>
> Another example is the Terms and Conditions we send to the student. While this is a "living document" that is changed over time, the copy the student receives must remain the same for them.
>
> Or when we send some statistics to SFE for the number/type of assessments that were completed, even if we later find out that the type of one assessment was wrong, and is technically incorrect, that file still needs to record what was sent (plus a follow up report to show the corrected statistics).
>
> Then, with a couple of my other clients, there are still contracts that need to be signed, or invoices that are issued.
>
> All of these better fit the HTML+ZIP proposal, which needs a very strict sandbox.
>
> Whereas PWP better suits:
>
> - A writer publishing a fictional story, which might contain typos to be corrected.
>
> - A newspaper which includes corrections, as more information is discovered.
>
> - An academic writing a paper, where the document can be referred to by others by a URL.
>
> - An educational book that needs to be kept up to date with the latest information, and distributed from a central server.
>
> And as Nick has just pointed out, maybe these documents could have their own cookie store / local storage, allowing the document to record your notes and answers.
>
> On 26 Jan 2016, at 12:47, Leonard Rosenthol <lrosenth@adobe.com> wrote:
>
> PWP is designed to cover all of those use cases, as there are many uses for publishing content – as seen in the myriad of industries that have adopted PDF.
>
> Leonard
>
> From: Craig Francis <craig.francis@gmail.com>
> Date: Tuesday, January 26, 2016 at 7:42 AM
> To: Leonard Rosenthol <lrosenth@adobe.com>
> Subject: Re: Proposal: PDF alternative using HTML (ZIP/GZIP)
>
> Thanks for the clarification Leonard,
>
> I can certainly see the use cases for JavaScript, and I'm glad to see you are considering them.
>
> Personally I would like to suggest not relying on warnings to the user (as they don't really understand what they mean), but I like that you are also considering restricting the JavaScript.
>
> Otherwise I think the proposed HTML+ZIP and PWP documents are similar (e.g. using HTML+CSS), but do have slight differences:
>
> PWP: Documents are kept up to date, where (temporary) offline copies can be made.
>
> PWP: Published from a central location, so references to it can be made (like saying book X from author Y).
>
> HTML+ZIP: Copies of the document can be created, but once those copies are made, they remain as their own entity (typically for archival purposes).
>
> HTML+ZIP: Seen as read-only content (in as much as any computer document is read-only), representing a document or data at that point in time.
>
> Craig
>
> On 22 Jan 2016, at 19:42, Leonard Rosenthol <lrosenth@adobe.com> wrote:
>
> Nick – you should be careful to separate the file format from the reader. You do it well for PWP and RS, but forgot for PDF.
>
> Yes, a PDF file can contain JavaScript, which is documented (according to the spec) to run at specific times during the load and viewing of a PDF. This is exactly like what JS can do with HTML, which is then what would happen when packaged in a PWP. Certain subsets of PDF restrict the presence of scripts entirely or in limited uses – just as EPUB currently does as an example of a PWP.
>
> However, there are ZERO requirements (or even recommendations) in the PDF standard about a “conforming reader” (the PDF term for a Reading System/RS) providing any type of warnings about the presence (or lack thereof) of JavaScript. So any such UI that might exist in your PDF conforming reader of choice is that application’s decision. Other conforming readers can/do things differently vis-a-vis JavaScript – including some (such as Apple’s Preview) that completely ignore it.
>
> As for JS in PWP – I think it’s much too early to make any specific statements about that. We know that some forms of PWP (such as EPUB x.x) might choose to restrict the JS, just as it does today – but that’s a specific case not the general one. Same with sandboxing, I don’t see that as a PWP requirement but it might well exist for certain specific cases and implementations.
>
> Leonard
>
> From: Nick Ruffilo <nickruffilo@gmail.com>
> Date: Friday, January 22, 2016 at 12:58 PM
> To: Craig Francis <craig.francis@gmail.com>
> Cc: Leonard Rosenthol <lrosenth@adobe.com>, Ivan Herman <ivan@w3.org>, W3C Digital Publishing IG <public-digipub-ig@w3.org>
> Subject: Re: Proposal: PDF alternative using HTML (ZIP/GZIP)
>
> Craig,
>
> Let's nail down exactly why the PWP wouldn't work for that situation. Currently PDF does allow you some "scripting" but before it runs, the user is prompted: "this PDF has scripting, do you wish to turn it on" Would something like that (the choice of the reading system) suffice?
>
> Additionally, it is my understanding that the HTML and JavaScript would be in a sandbox environment, and have limited access (if any) to manipulate external files. It would be the reading system's responsibility to feed any data that the PWP would require externally. So the security issues then lie outside of the PWP itself, and more in the reading system - something that PWP could possibly address as a note to implementors...
>
> As a note - pretty much any MS Office file can have scripting in it, and can actually manipulate files on the filesystem (there are viruses written in Word and Excel). Because of this, Microsoft warns you before you run a script in these formats. This hasn't stopped business in any way (or IT) from trusting the storage and download of such files.
>
> My understanding is that even though the contents are HTML - this is not to be thought of as the "open web" but a package format that uses all of the open web technology.
>
> -Nick
>
> On Fri, Jan 22, 2016 at 12:12 PM, Craig Francis <craig.francis@gmail.com> wrote:
>
>> Hi Nick,
>>
>> Yes, I certainly like the ideas behind PWP, and I'm glad to see this is happening.
>>
>> I just don't think it works for the original proposal, which is an alternative to PDFs, having all the benefits of HTML, but still remaining read-only files that can be emailed, and IT Departments can trust being on their computers (ref the security restrictions that can be applied).
>>
>> Craig
>>
>> On 21 Jan 2016, at 14:16, Nick Ruffilo <nickruffilo@gmail.com> wrote:
>>
>> Craig,
>>
>> To your point of PWP being a format that has an interaction with a server - I don't disagree, but I think that's only 1 of the two main use cases for PWP. One of those cases is to be able to be a quality container for ebooks. Ebooks are expected to be read in an offline mode on devices that may not have any connectivity to the internet. In these cases, online is simply not an option - therefore the PWP must work in a 100% offline mode. The content creator ultimately has the choice to build their PWP the way they see fit.
>>
>> I imagine a significant majority of PWPs created will be "offline" assuming that popular word processors adopt it as a format. Mainly because of the business case you brought up - an employee generating an offline-mode file for sharing and archival purposes. But, there will be many use cases where an updateable, benefiting-from-access-to-the-internet document format is superior.
>>
>> -Nick
>>
>> On Thu, Jan 21, 2016 at 7:02 AM, Craig Francis <craig.francis@gmail.com> wrote:
>>
>>> Hi Nick,
>>>
>>> I'm glad to see that you're not trying to dilute PWP with too many use cases.
>>>
>>> With your comment about exporting it as a HTML file, and emailing that, this is where the problems currently lie, and why I'm making this proposal.
>>>
>>> I'm not sure which mailing lists you are subscribed to, but in summary, a HTML file on its own is a big security problem, and it's difficult to include resources (in terms of development time/tooling)...
>>> for more info, please see:
>>>
>>> https://lists.w3.org/Archives/Public/public-webappsec/2016Jan/0090.html
>>>
>>> https://lists.w3.org/Archives/Public/public-webappsec/2016Jan/0089.html
>>>
>>> In regards to PWP, I feel that it is a good idea, and definitely has its use cases.
>>>
>>> But I suspect that whatever file format PWP comes to be known as will be seen as something that has an interaction with a server, and allows for the document to be updated.
>>>
>>> That definitely has its uses, but as with PDFs, there are cases where it's good to know that the file sent cannot change, or communicate with an external server for any reason (instead it's seen as being locked down, in a read only state, via a sandbox that the browser provides).
>>>
>>> So where you see PWP being a more versatile format than PDF, that is good, but I believe we also need a second branch which takes some of the strengths of PDF, and uses existing technology to fix some of its problems (which I hope my previous emails explain, but I am happy to discuss if not).
>>>
>>> Craig
>>>
>>> On 19 Jan 2016, at 14:39, Nick Ruffilo <nickruffilo@gmail.com> wrote:
>>>
>>> Craig,
>>>
>>> These are great questions, and I hope I can address some of them. First off - PWP - like any potential document format - is not aimed at solving all possible use cases, nor should it. That said, we also realize that there is potentially a gap in what software capabilities are today and what might be needed for a high-quality PWP to function as smoothly as a PDF would today.
>>>
>>> To speak to your specific case - the PDF sales report. Using today's technology, you could export that sales report as an HTML file, attach that, and open that in your browser. It can be archived, the local copy can only be changed by the user, etc. What is not yet native in most browsers is the ability to have a package of HTML files.
>>>
>>> For the case of a completely offline file - something more static - PWP completely allows for that, as long as the package is created referencing static files that can be grabbed when making the offline package. That is completely within scope and a use case that has been considered. PWP does go one step further and let you have files that reference external resources. This would let you keep data charts up-to-date, make quick updates to color schemes, or pretty much anything else you may want to update. This is a feature - and optional.
>>>
>>> From my perspective - the goal for PWP is to create a package format that makes sense for the future. PDF has specific use cases where it is amazing - it has had many years to be adopted and honed. Outside of those use cases, PWP hopes to cover many things that PDF does not do. That doesn't mean that PDF will be useless, as I imagine businesses will be exporting sales reports in PDF for the next 10 years (the same way people are still using CSV when there is XLSX format...) But I believe that PWP aims to be a more versatile format than PDF, which is its differentiation.
>>>
>>> -Nick
>>>
>>> On Tue, Jan 19, 2016 at 7:29 AM, Craig Francis <craig.francis@gmail.com> wrote:
>>>
>>>> On 18 Jan 2016, at 20:42, Leonard Rosenthol <lrosenth@adobe.com> wrote:
>>>>
>>>> > Actually, Ivan is pointing out that an active work project - called PWP
>>>>
>>>> Hi Leonard,
>>>>
>>>> And yes, good point, I completely mixed up EPUB3 and PWP (Portable Web Publication):
>>>>
>>>> http://www.w3.org/TR/pwp
>>>>
>>>> I've just read through the PWP Working Draft, and have some notes below.
>>>>
>>>> In summary, I think it's a good idea, but I'm not sure it really focuses on the same problem (but please let me know if I've misunderstood).
>>>>
>>>> Craig
>>>>
>>>> Just to set the tone, people like to receive PDFs for documents (e.g.
>>>> sales reports) because they can be treated as an atomic document, that isn't really editable (unlike an email), and can be saved for archival purposes (with no reliance on a website to be available to view it).
>>>>
>>>> Another example is someone who sees a webpage with some useful content, and they want a copy of that content on their local computer (aka "Save Web Page as"), so that they don't need to rely on an internet connection, on the website remaining available (or being able to find the page again), or on the content of that page not changing.
>>>>
>>>> Now there are definitely some similarities to the problems we are trying to address, with the main focus for me being the archive format:
>>>>
>>>> https://www.w3.org/TR/pwp/#package
>>>>
>>>> But this seems to be a very general spec, with options to have the content unpackaged and delivered over the internet (rather than just a single file):
>>>>
>>>> https://www.w3.org/TR/pwp/#state_definition
>>>>
>>>> In contrast, the spec seems to not really focus on being a file that can be passed around/archived (e.g. emailing a PDF), but instead a central resource which allows for copies of the document to be downloaded.
>>>>
>>>> https://www.w3.org/TR/pwp/#identification
>>>>
>>>> This is useful if you want to have a central location for a document that is kept up to date, but not so good if the primary purpose is really to have a copy that is created at one point in time, where the person who receives a copy will know that it will stay as-is (read only).
>>>>
>>>> This setup seems to be confirmed in the security section:
>>>>
>>>> https://www.w3.org/TR/pwp/#security-models
>>>>
>>>> So if I was to send a report to a manager with sales figures, they will want to open it on their mobile phone (a quick read before bedtime, I assume), then later save it to their desktop computer so they can compare it later to the next month's report.
>>>>
>>>> So when the Working Draft mentions things like JavaScript Service Workers:
>>>>
>>>> https://www.w3.org/TR/pwp/#arch
>>>>
>>>> And the concept of these documents having the ability to do things (presumably allowing the content to change, perform tracking, etc), I don't think it's fundamentally the right approach to this problem.
>>>>
>>>> But don't get me wrong, Portable Web Publications would be very good for Publications... I just don't think many businesses use PDF attachments in that way.
>>>>
>>>> :-)
>>>>
>>>> > On 18 Jan 2016, at 20:42, Leonard Rosenthol <lrosenth@adobe.com> wrote:
>>>> >
>>>> > Actually, Ivan is pointing out that an active work project - called PWP (Portable Web Publication) - to address the need for having a better way to publish content using web technologies both in a packaged and unpackaged form.
>>>> >
>>>> > A solution that aligns with EPUB (but would not be EPUB 3.x as we know it today) is certainly something being seriously considered by various folks as part of this work.
>>>> >
>>>> > Leonard
>>>> >
>>>> > On 1/18/16, 12:26 PM, "Craig Francis" <craig.francis@gmail.com> wrote:
>>>> >
>>>> >> On 18 Jan 2016, at 17:13, Leonard Rosenthol <lrosenth@adobe.com> wrote:
>>>> >>> So that a user browsing PDFs on the web doesn’t need anything extra.
>>>> >>
>>>> >> I think Ivan is suggesting that EPUB3 might do the same.
>>>> >>
>>>> >> I'm still not 100% convinced how well it will work (as this does depend heavily on the OS, and browsers).
>>>> >>
>>>> >> But in both cases (EPUB3, or using a ZIP to wrap up the HTML document+assets) most of the building blocks are already in place.
>>>> >>
>>>> >> Craig
>>>> >>
>>>> >>> On 18 Jan 2016, at 17:13, Leonard Rosenthol <lrosenth@adobe.com> wrote:
>>>> >>>
>>>> >>> While a PDF file does need a “reader”, it should be pointed out that EVERY MAJOR browser (Safari, Chrome, Edge, Firefox) includes PDF viewing natively. So that a user browsing PDFs on the web doesn’t need anything extra.
>>>> >>>
>>>> >>> Leonard
>>>> >>>
>>>> >>> On 1/18/16, 11:43 AM, "Craig Francis" <craig.francis@gmail.com> wrote:
>>>> >>>
>>>> >>>> On 18 Jan 2016, at 16:13, Ivan Herman <ivan@w3.org> wrote:
>>>> >>>>
>>>> >>>>> Yeah. That will take time. On MacOS (starting from, I believe, Mavericks) the system comes with an epub reader, so files of this kind are automatically opened much like PDF files. Yes, it is an ebook reader on the OS, but that is not much different than using a PDF reader.
>>>> >>>>>
>>>> >>>>> To be incorporated into browsers is a big step (and would be a big step forward) which will need additional spec work. We are kept busy:-)
>>>> >>>>
>>>> >>>> Good to know, and good point about PDF files needing a reader.
>>>> >>>>
>>>> >>>> If I could push the format in any way (more so how the software works), I would like to be able to send a document that is opened, read, and closed without it being imported into some kind of library.
>>>> >>>>
>>>> >>>> Maybe some ability for email clients to open the file for a "quick look" (as per the OSX term), then optionally import.
>>>> >>>>
>>>> >>>> But I realise this is going away from the idea of using this format primarily for books.
>>>> >>>>
>>>> >>>> Anyway, thanks for the heads up.
>>>> >>>>
>>>> >>>> Craig
>>>> >>>>
>>>> >>>>> On 18 Jan 2016, at 16:13, Ivan Herman <ivan@w3.org> wrote:
>>>> >>>>>
>>>> >>>>>> On 18 Jan 2016, at 16:58, Craig Francis <craig.francis@gmail.com> wrote:
>>>> >>>>>>
>>>> >>>>>> Hi Ivan,
>>>> >>>>>>
>>>> >>>>>> Just to follow up on this, I've been reading the spec at:
>>>> >>>>>>
>>>> >>>>>> http://www.idpf.org/epub/30/spec/epub30-overview.html
>>>> >>>>>>
>>>> >>>>>> And it does seem pretty much what I'm after.
>>>> >>>>>>
>>>> >>>>>> I'm not sure I like the extra meta files, but maybe they are useful (e.g. the possibility of containing multiple HTML documents, one for each language).
>>>> >>>>>
>>>> >>>>> For example. A book may also consist of many chapters each in their individual files and the order is not clear. Etc.
>>>> >>>>>
>>>> >>>>>> So really the only remaining problem is getting email clients, browsers, OSes to be able to open these files quickly/easily... rather than just automatically importing the file into an ebook reader.
>>>> >>>>>
>>>> >>>>> Yeah. That will take time. On MacOS (starting from, I believe, Mavericks) the system comes with an epub reader, so files of this kind are automatically opened much like PDF files. Yes, it is an ebook reader on the OS, but that is not much different than using a PDF reader.
>>>> >>>>>
>>>> >>>>> To be incorporated into browsers is a big step (and would be a big step forward) which will need additional spec work.
>>>> >>>>> We are kept busy:-)
>>>> >>>>>
>>>> >>>>> Cheers
>>>> >>>>>
>>>> >>>>> Ivan
>>>> >>>>>
>>>> >>>>>> Craig
>>>> >>>>>>
>>>> >>>>>>> On 14 Jan 2016, at 11:17, Ivan Herman <ivan@w3.org> wrote:
>>>> >>>>>>>
>>>> >>>>>>>> On 14 Jan 2016, at 12:05, Craig Francis <craig@craigfrancis.co.uk> wrote:
>>>> >>>>>>>>
>>>> >>>>>>>> Thanks Ivan,
>>>> >>>>>>>>
>>>> >>>>>>>> You are right, I normally focus more on the security side of things.
>>>> >>>>>>>>
>>>> >>>>>>>> But out of interest, EPUB3, is that likely to get the same integration as how PDFs work at the moment?
>>>> >>>>>>>>
>>>> >>>>>>>> As in, you can email someone an EPUB3 file, and the recipient can click/tap on it to quickly view in their email client?
>>>> >>>>>>>>
>>>> >>>>>>>> Or simply have the web browser open it, rather than needing a dedicated EPUB3 reader?
>>>> >>>>>>>
>>>> >>>>>>> In theory, all this is possible but the infrastructure is not as widespread as for PDF. E.g., you need extensions for Firefox to open an epub directly.
>>>> >>>>>>>
>>>> >>>>>>>> So far I've really only considered EPUB as more of a format for books (which is probably my lack of understanding of the format), so I've never really thought of its use for reports, leaflets, etc (i.e. things that PDFs tend to be used for).
>>>> >>>>>>>
>>>> >>>>>>> EPUB is perfectly capable of handling that out of the box.
>>>> >>>>>>>
>>>> >>>>>>> Ivan
>>>> >>>>>>>
>>>> >>>>>>>> In the mean time I'll have a read up on the PWP group.
>>>> >>>>>>>>
>>>> >>>>>>>> Craig
>>>> >>>>>>>>
>>>> >>>>>>>>> On 14 Jan 2016, at 10:52, Ivan Herman <ivan@w3.org> wrote:
>>>> >>>>>>>>>
>>>> >>>>>>>>> Craig,
>>>> >>>>>>>>>
>>>> >>>>>>>>> thanks for your note.
>>>> >>>>>>>>> Two comments:
>>>> >>>>>>>>>
>>>> >>>>>>>>> - The format EPUB3, defined by IDPF, already does much of what you describe. On a very high level, it takes a (slightly constrained) Web site and puts it into, essentially, a zip file. For many applications, this is a worthy replacement for PDF. Note that almost all the electronic books you buy today are in EPUB3 or its predecessor...
>>>> >>>>>>>>>
>>>> >>>>>>>>> - The DPUB IG also looks further down the line on a stronger integration of digital publishing and the OWP:
>>>> >>>>>>>>>
>>>> >>>>>>>>> http://www.w3.org/TR/pwp
>>>> >>>>>>>>>
>>>> >>>>>>>>> which may lead to significant changes in the future.
>>>> >>>>>>>>>
>>>> >>>>>>>>> Bottom line: this evolution is already happening!
>>>> >>>>>>>>>
>>>> >>>>>>>>> I understand you come more from the security area; there may be security issues with EPUB3 or PWP which we do not fully appreciate, so any comment is welcome of course!
>>>> >>>>>>>>>
>>>> >>>>>>>>> Cheers
>>>> >>>>>>>>>
>>>> >>>>>>>>> Ivan
>>>> >>>>>>>>>
>>>> >>>>>>>>>> On 14 Jan 2016, at 11:34, Craig Francis <craig@craigfrancis.co.uk> wrote:
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> Hi,
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> Recently I've been thinking of some of the problems with PDFs, which are useful for creating a document that can be archived, emailed, printed, etc.
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> HTML has solutions for many of PDF's problems though, for example structured text (accessibility), ability to change layout depending on screen size (no need for small screen devices to zoom into a fixed A4 layout), can change font size, better indexing support (searching for documents), etc.
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> Unfortunately you can't just email a HTML document to someone, as this causes a range of security problems, and including resources can be difficult (you can inline them, or use MHTML, but these are tricky to create).
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> So I was wondering if we could take the approach that Microsoft Word did with the docx format, Java with JAR, PHP with PHAR, etc...
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> Have a new file format, associated with the browser, which is just a ZIP/GZIP file that contains an index.html file, and everything else needed for the document.
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> Then from a security point of view, it can be locked down to its own little box, so no access to other files on the file system, probably no access to cookies/localstorage, no ability to connect to another host.
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> And from the user's point of view, the document could be protected with a password (a feature that ZIP/GZIP provides already, and the browser can prompt for when opening).
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> So would this help with the security aspects of emailing HTML files to people (e.g. reports), and be better than PDFs?
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> Craig
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> ---
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> https://lists.w3.org/Archives/Public/public-webappsec/2016Jan/0063.html
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> https://code.google.com/p/chromium/issues/detail?id=575677
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> https://bugzilla.mozilla.org/show_bug.cgi?id=1237990
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> https://wpdev.uservoice.com/forums/257854-microsoft-edge-developer/suggestions/11443002-webpage-zip-as-alternative-to-pdf
>>>> >>>>>>>>>
>>>> >>>>>>>>> ----
>>>> >>>>>>>>> Ivan Herman, W3C
>>>> >>>>>>>>> Digital Publishing Lead
>>>> >>>>>>>>> Home: http://www.w3.org/People/Ivan/
>>>> >>>>>>>>> mobile: +31-641044153
>>>> >>>>>>>>> ORCID ID: http://orcid.org/0000-0003-0782-2704
>>>
>>> --
>>> - Nick Ruffilo
>>> @NickRuffilo
>>> Aer.io <http://aer.io/> an *INGRAM* company

--
- Nick Ruffilo
@NickRuffilo
Aer.io an *INGRAM* company
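
For readers following the thread, the packaging half of the original proposal (a single ZIP file holding an index.html plus everything else the document needs) can be sketched in a few lines. This is only an illustration and was not posted to the list: the function names build_html_package and list_package are hypothetical, and the path check is an assumption about what a strict package format might require. The browser-side sandbox discussed in the thread is not modelled, and Python's standard zipfile module cannot create password-protected archives, so that part of the proposal is left out.

```python
import io
import zipfile


def build_html_package(index_html: str, resources: dict[str, bytes]) -> bytes:
    """Pack index.html plus its resources into one in-memory ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # The entry point always sits at the archive root under a fixed name.
        archive.writestr("index.html", index_html)
        for name, data in resources.items():
            parts = name.split("/")
            # Assumption: refuse absolute paths and ".." segments so that
            # every resource stays inside the package.
            if name.startswith("/") or ".." in parts:
                raise ValueError(f"unsafe resource path: {name!r}")
            archive.writestr(name, data)
    return buffer.getvalue()


def list_package(package: bytes) -> list[str]:
    """Return the entry names stored in a package, in the order written."""
    with zipfile.ZipFile(io.BytesIO(package)) as archive:
        return archive.namelist()
```

A report generator could call build_html_package with the rendered HTML and a mapping such as {"style.css": css_bytes}, then attach the returned bytes to an email as a single file.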
Received on Wednesday, 27 January 2016 15:18:41 UTC