Sorry publishers, EPUB is dying

[Ironic Note: While this site is responsive, the Google Trends graph above, which shows that relative search volume for “EPUB” has fallen back to summer 2012 levels, may not display correctly on a phone or similar-sized device.]

As it became clear to the IDPF that publishers weren’t going to give up on a crafted, beautiful, printable page, they added support for fixed-layout pages to EPUB.

Publishers had said they wanted a single deliverable format that did it all – reflow AND fixed-layout AND be capable of driving a press (after all, print isn’t dying).

Adding fixed-layout support in EPUB allowed proponents to go back to publishers with the best-possible pitch. Or so they thought.

The hype is proving… excessive.

When EPUB was just a “web page in a zip file” it wasn’t the single deliverable format the publishers wanted. Ever since EPUB proponents started to pitch EPUB’s fixed-layout option they’ve been losing. Fixed-layout vs. reflowable is very much an either-or proposition for EPUB, a fact that directly contradicts the pitch that EPUB can meet publishers’ desire for a single deliverable format while confounding users trained to expect that “EPUB means it can reflow”.

There’s a worse problem that blocks fixed-layout EPUB from government-funded publishers. Fixed-layout EPUB is far less accessible than PDF, undermining one of EPUB’s key benefits and a key selling-point for the format.

With EPUB, as with HTML, accessibility is all within one page. As such, paragraphs, tables and lists that span two pages, to take a few simple examples, are simply inaccessible in a fixed-layout EPUB file.

Developers and technical product managers who want to learn about how PDF technology addresses accessibility in fixed-layout content, and many other aspects of PDF, should consider the PDF Technical Conference, October 19-20, in San Jose.

Postscript

(added to the above following my exchange with Bill in the comments)

Yes, there are innocuous explanations for EPUB’s Google Trend. Indeed, maybe fewer are using the search term relative to searches overall because EPUB is so generally accepted it’s like Kleenex – that’s certainly possible. For those who feel I’m grossly misinterpreting the data, consider the following graph:


7 comments

  1. September 29, 2015 at 19:00

    Duff, your information about EPUB and fixed layout is incorrect and out of date. Multiple-Rendition Publications was recently overwhelmingly approved by the IDPF membership as a final specification (http://www.idpf.org/epub/renditions/multiple/). So fixed and reflow is no longer an either/or proposition: it is now possible to create EPUB publications that contain both types of content, so a reading system can choose the appropriate rendition (such as using the reflowable one for aural rendition or on a small screen).

    And more importantly, the EPUB specifications recommend that fixed layout be used only for content that is naturally pre-paginated (comics and manga, highly illustrated children’s books, etc.). A paragraph spanning more than one page is a clue that someone has very likely used fixed layout for content that should have been made available as reflowable, not just for accessibility but also to better enable consumption on mobile devices of various sizes. And PDF band-aided accessibility, after the fact, on top of a print-stream-based content structure, which just doesn’t work in practice. While theoretically PDF files can be made accessible, including page-spanning paragraphs, in practice only a small minority are. Even PDF files created by Adobe’s own products don’t always support accessibility. The right answer for accessibility, mobile device support, and semantic analysis is reflowable content, something PDF simply can’t support due to its fundamentally pre-paginated architecture.

    Regarding popularity of EPUB, as you may be aware, it is common for technologies that become widely adopted to get lower scores on Google Trends, not because they are “dying” but simply because they have become non-controversial. HTML5, for example, is also in no danger of “dying” but its score has fallen in a similar manner to EPUB since 2012 (see: http://www.google.com/trends/explore?hl=en-US#q=HTML5&cmpt=q).

    And publisher support for EPUB has actually grown dramatically since 2012, and the major transition to EPUB 3, based on HTML5 and other modern Web Standards, is now essentially complete. Many large publishers are now sending a single EPUB 3 file to all distribution channels (including Amazon), and many publishing tools (including Adobe InDesign) natively support EPUB 3 (in the case of Adobe InDesign, both fixed-layout and reflowable content). Apple iBooks Author notably also added EPUB 3 export support this summer. On the reading systems front, Adobe has also joined the party, recently moving to EPUB 3 in their eBook solutions, Adobe Digital Editions and Adobe Reader Mobile SDK.

    Last but not least, anyone seriously involved in the accessibility community knows that EPUB is at the center of efforts to promote universal availability of “born accessible” digital content and PDF is falling off the radar just as it has for eBooks.

    PDF and EPUB don’t truly compete, at least not at what they’re best at. PDF is a fantastic format for replicating paper digitally: that’s what it was designed for over twenty years ago, and it still does that job really well. EPUB is a portable document format designed to represent structured content that can be adapted to different types of presentation (including aural). EPUB can represent fixed-layout content, so anything that can be done in PDF can, in principle, be represented in EPUB (the converse is not true). But if you have only fixed-layout content, especially if it’s headed to a printer sooner or later, PDF may well be your best choice (it’s certainly the most popular choice for pure fixed-layout content). But if you want to make content that’s accessible on many different sizes of screens and by people who have print disabilities, EPUB is likely your best choice. In practice you just can’t make a highly accessible silk purse out of a pre-formatted sow’s ear.

  2. September 30, 2015 at 12:43

    Thanks for your thoughtful reply, Bill. I have a few responses, but rather than clutter with quoting, I’ll just take your points in order.

    It seems my information was correct but out of date, since approval of 1.0 of EPUB’s Multiple Renditions specification was in August, just over 1 month ago. I was unaware of it, which is my error for sure. Congratulations! I look forward to seeing the authoring tools. I sure hope it supports switching from reflow to fixed-layout on-the-fly, without loss of context (a realistic use-case, and a pet peeve of mine).

    In any case, I look forward with interest to seeing how publishers take up this new Multiple Renditions specification. Would it be snarky of me to point out that PDF has been, in a key sense, multiple-rendition capable since 2001? Granted, only a few developers took advantage, and granted, they failed to do it well, but the specification existed…

    It’s interesting that you describe the use-case for fixed-layout in terms of “pre-pagination”. I think of it more as the case where the physical relationship (left, right, above, below, adjacent, etc.) between elements of content is part of the author’s intent.

    Regarding devices of various sizes: the increasing size and resolution of such devices is (IMHO) making fixed-layout more attractive (again). Leaving aside novels and the like, the more common use-case for reflow is emerging as a convenience-mode for the smallest of devices. Even many novels, however, include at least some content (images, tables) which suffers from deployment in a reflowable manner.

    I worry that fixed-layout is just as “tacked on” to EPUB as tags are “tacked on” to PDF.

    Yes, PDF added accessibility to a page-description model, but PDF accessibility isn’t “theoretical” – it’s real (and in practice) in millions of files worldwide, from bank statements to court documents to Medicare benefit summaries to product brochures. PDF does indeed support reflowable content via the tagged PDF model. You can get that reflow experience by exporting a well-tagged PDF to HTML – which is essentially what assistive technology does.

    Regarding popularity: controversy does not drive Google Trends results; search term density does. It’s certainly possible that searches for EPUB are declining relative to searches overall because EPUB has become a commodity technology. But then… PDF is certainly a commodity, and take a look at its trend! (see https://www.google.com/trends/explore#q=PDF&cmpt=q&tz=Etc%2FGMT%2B4).

    Regarding accessibility: I don’t see any general move towards making EPUB the standard for accessible documents in general. That would have to include the (very very large) world of fixed-layout documents, and fixed-layout EPUB isn’t accessible, as previously noted. Instead, we see more toolmakers offering support for PDF/UA and more banks offering tagged PDF statements.

    I concur that PDF and EPUB don’t really compete in the sense that EPUB can’t really do fixed-layout like PDF and certainly, can’t do it AND make the exact same content accessible. EPUB has tremendous value as a publisher’s format for distribution to reflow-required environments such as phones and Kindles, but few will ever mistake EPUB as a general purpose format for digital content.

  3. October 1, 2015 at 11:22

    Duff,

    We can have more conversation about EPUB at your PDF Technical Conference next month. Thanks very much for the invitation to be part of the conversation among all of us who work with documents.

    Meanwhile I did want to note for the record that you are incorrect in claiming that fixed-layout EPUB is inherently inaccessible. A given file may not be accessible, but then again a given PDF may not be accessible either. If an EPUB file uses SVG for fixed-layout it can be at least as accessible as PDF (since SVG is basically the same infoset as PDF page contents and was designed to support accessibility, see e.g. http://www.w3.org/TR/SVG-access/). If an EPUB file uses HTML/CSS for fixed layout it is likely to be more accessible, with larger blocks of text already in natural reading order, with layout being separate from the content, and since developers have lots of practical experience creating accessible HTML for websites. If either a PDF or EPUB file uses bitmap page images without a text overlay then of course neither will be accessible.

    I would challenge you (or anyone else) to take a random sample of 100 of the millions of PDF files you refer to, even omitting those that have purely image-based pages, and determine what % of them have accessibility data structures that include inter-page paragraph spanning. My guess, informed by my own experiences working on Acrobat and PDF at Adobe, is that less than 10% of a random sample will have basic accessibility data structures and that less than 1% will include the inter-page paragraph spanning which you complain that EPUB had until recently no way to represent (it does now thanks to multiple renditions and guided navigation). And if you then look at these 100 files I would predict 90 of them would naturally be reflowable if represented in EPUB, which means they will be highly accessible, and of the 10 that would make sense to be fixed-layout in EPUB probably 9 of these would be more accessible than their PDF equivalents (thanks to using HTML/CSS). So you will have in PDF 90 inaccessible files, 9 partially accessible and 1 fully accessible and in EPUB 90 fully accessible, 9 partially accessible and 1 inaccessible, with the partially accessible in general more so than their PDF equivalents. Of course these are just guesstimates; I would welcome a more systematic analysis. And at the end of the day, formats are just a means to an end: the point is to deliver on accessibility wherever possible, and whether content is in PDF, EPUB or XYZ format is secondary.

  4. October 2, 2015 at 13:38

    Hi Bill,

    As a format, fixed-layout EPUB is inherently inaccessible simply and plainly because it cannot accommodate content that spans >1 page; case closed. Sure, some documents are very simple, or only one page long. But an enormous proportion of fixed-layout documents include at least some content that spans pages, even if it’s only the table of contents. Authors and publishers to whom accessibility matters won’t accept such an immense structural limitation to their layouts, so for fixed-layout EPUB, accessibility is a kill-shot. Sorry.

    On the other hand, PDF’s accessibility model readily accommodates page-spanning content and has done so since 2001, even if many vendors haven’t as yet supported it (many have yet to support tagged PDF at all).

    It’s entirely true, as you say, that PDF documents may not be accessible at all, most notably if the file isn’t tagged, or if it’s tagged poorly. There’s no question that almost any reflowable EPUB file is far more accessible than an untagged PDF. Nonetheless, tagged PDF remains the only fixed-layout model that can meaningfully claim to be accessible, as noted above. The fact that most PDF files aren’t tagged or are poorly tagged is no statement about PDF in general any more than the fact that many websites don’t bother with alt text for images is a statement about HTML. The specifications are there – the software can change.

    To address your question: the (sadly undisclosable) data I’m aware of (about 18 months old) indicates that about 15% of PDF files in current use at that time included basic accessibility structures.

    Now that Bank of America, Capital One and other major institutions are delivering tagged PDF statements as a matter of routine, and since the US Access Board appears poised to name PDF/UA as a means of conformance to their accessibility standards, I would say that recognition of accessible PDF is doing pretty well.

    From any given authoring software, the same data that’s needed to make a good EPUB file can also make a well-tagged PDF. Reflowable EPUB is great; no question, but fixed-layout EPUB cuts against the value-proposition for the format. PDF makes it (relatively) easy to ignore page-breaks and discontinuity for reuse (i.e., accessibility) purposes, and fixed-layout EPUB doesn’t allow it even in theory. Until now, it seems, on paper.

    I look forward to the very first-ever “EPUB Multiple Renditions” files, and the software to view ‘em. ☺ So far, EPUB reader implementations I’m aware of appear to have taken little advantage of many other rich features of the format; maybe this one will be different.

    Lastly, I wanted to offer another interesting (to me) graph from Google Trends… “convert EPUB to PDF” vs. “convert PDF to EPUB”.

    https://www.google.com/trends/explore#q=%22convert%20epub%20to%20pdf%22%2C%20%22convert%20pdf%20to%20epub%22&date=1%2F2009%2073m&cmpt=q&tz=Etc%2FGMT%2B4

  5. October 13, 2015 at 07:10

    Sorry, but Google search trends aren’t an indication of anything in the real world.

    By the same criteria you use, we can safely state that Duff Johnson is dying: http://www.google.com/trends/explore#q=duff%20johnson

    Although you seem to be enjoying a dead cat bounce of late…

  6. October 13, 2015 at 11:48

    Your query is wrong. To test my name in Google Trends you would have to use a phrase search (“Duff Johnson”) instead of a keyword search (Duff Johnson).

    Why would I take your point of view seriously when you don’t appear to know how to use Google in the first place?

  7. October 15, 2015 at 04:10

    Good point. They wanted to imitate pdf in epub, with horrific results.

