“If document semantics require a descending sequence of headers, such a sequence shall proceed in strict numerical order and shall not skip an intervening heading level.”
From ISO 14289 (PDF/UA) 7.4.2
The PWS article, PDF/UA and Skipped Headings, critiques the above-quoted provision in PDF/UA. The article presents the position that it was wrong for PDF/UA to exclude skipping heading levels from the PDF accessibility standard, and includes five arguments and five examples to make its case.
The Basic Point
The PWS article says: “… in some circumstances, skipping a heading level can actually lead to greater clarity and intelligibility, and hence better accessibility,” and that “…indiscriminately following the no-skipped-headings rule can create problems that would not otherwise exist.” The problem is so severe, according to the article, that following PDF/UA risks a dichotomy: “…make it accessible or make it PDF/UA conforming.”
Let’s take a closer look.
To keep this to a semi-reasonable length I’m not going to reiterate the arguments. If you’re interested in following along, I suggest you open section 1 of Ted’s piece in a separate browser window.
Section 1: Arguments
You are in the minority
This isn’t an argument about the subject at hand, so we won’t stop long here. The Committees that wrote PDF/UA never received a single formal comment proposing that we soften section 7.4.2, nor did (to my recollection) any of the participating experts make the PWS article’s suggestion that skipping levels could improve clarity or that failing to do so could itself reduce accessibility.
Headings as navigation or decoration
In the quotes selected in the PWS article I’m simply claiming that, in terms of PDF, AT users cannot rely on headings to deliver the document’s structure in a world (our world) where headings are commonly derived from styling instead of being dictated by the document’s structure. This is the point I was making; no dichotomy (false or otherwise) was intended.
Skipped headings are random headings
I understood this argument better once I’d read through the examples given in the PWS article, where I found heading levels assigned to specific semantic roles such as “Chapter”, “Case Study”, “Environmental Issues”, and so on.
I agree that this usage isn’t random, but nor is it explicit; it’s only discernable by inference. This usage of headings cannot be programmatically determined; there’s no way in the PDF language to specify “…in this document, H4 is reserved for the title of Case Study sections.” The user would have to figure this out by scanning through the headings, or by learning of the assignment via an author’s note.
PDF is different
The PWS article begins this argument with a review of how headings are regarded in the HTML world, and in WCAG 2.0. While it’s true that WCAG is ambiguous about heading levels, WCAG 2.0 isn’t the accessibility standard for PDF any more than it’s the accessibility standard for PowerPoint.
Certainly, many WCAG 2.0 success criteria may be neatly applied to PDF, but the criteria for navigation are not among them. WCAG 2.0 discusses navigation entirely in terms of links; an HTML-centric viewpoint.
Quite unlike HTML, links in PDF (a) are rare, (b) can’t provide navigation to specific content (such as a heading) and (c) aren’t even part of the document! For AT users reading PDF files, links are a very poor substitute for heading-based navigation.
This is (part of) why technology-specific standards such as PDF/UA are essential to reliable accessibility, but I digress.
Likewise, Eric Meyer’s much overtaxed post (it’s the only 3rd party argument offered by WCAG 2.0 to permit skipped headings), as I pointed out in an earlier article, represents Eric’s “gut feelings,” not a developed position. I honestly don’t believe that Eric intended to be making normative claims outside of web-pages here, and his post is clearly restricted to a discussion about HTML in any event.
The next point in the PWS article is that heading levels must mean the same thing across technologies; that surely the meaning of headings doesn’t change if identical content is produced in HTML and PDF.
No argument from me; if a document’s structure is difficult to navigate in HTML it will probably also be difficult to navigate in PDF.
Even so, the PWS article seems to miss the point. While PDF and HTML may be used for identical content, that’s rarely the case (that’s why we have the two technologies in the first place). Standards for PDF accessibility have to accommodate the commonplace and very different use-cases for PDF that almost never occur in HTML (such as long or deeply structured documents). Given the deep technical differences between HTML and PDF it’s natural that technical standards for each would differ.
How PDF files may be navigated using Assistive Technology
The PWS article challenges the claim that headings are critical for document structuring purposes in PDF as compared to HTML due to the distinctive realities of navigating PDF documents. The PWS article suggests there are many ways to navigate a PDF, including bookmarks, tables of contents and page numbering. Let’s look at each of these suggestions, because the point is revealing.
- Bookmarks. An optional feature in PDF, bookmarks can be used like a table of contents and are helpful to many users. While it’s possible to connect bookmarks to heading elements in the document’s text, PDF creation software requires headings to do so. Accordingly, bookmarks aren’t a meaningful alternative to headings for structure navigation purposes.
- Table of contents. Authors may create a PDF with a linked table of contents, but the link can only deliver the reader to the top of the page, not to the target content. There’s no analog for HTML’s anchor concept in PDF; no way for a link to target, say, a heading that starts half-way down a page. For AT users, links in PDF take you to the top of the target page, and that’s it.
- Page Numbering. I’m not sure if the PWS article is referring to the printed page number on a PDF page, the page’s label as displayed in the user interface of many PDF viewers, or the physical number of pages in the PDF file. In PDF, page numbers themselves are not a navigational mechanism; they are simply labels. The use of page numbers for navigational purposes occurs when the page itself is the navigation target, as in “everyone, please turn to page 42.” This is also the one case in which a navigation option that delivers the user to the top of the physical page, as opposed to the target content, would be appropriate.
Unlike HTML, where links can do the job just as well as headings, in PDF, heading tags are the only reliable means of conveying document structure to AT users.
PDF is distinctive (2)
Unlike HTML, PDF’s role is that of electronic paper. Its value proposition derives from its reliability and consistency.
It’s true, as the PWS article says, that PDF/UA zeroed in on the need to ensure the only accessible document structure-based navigation mechanism in PDF was reliably useful on longer and complex documents. Such reliability, while (arguably) less important on shorter or less complex documents, was judged by the PDF/UA committee as unlikely to hurt. But that wasn’t the only consideration.
PDF is electronic paper; reliability and consistency are key to the format’s purpose. Historically, PDF files have seemed anything but consistent to AT users. PDF/UA’s main value is to provide consistency.
The committee that wrote PDF/UA considered the idea that strictly descending heading levels might not be critical to AT navigation on documents of under 5 pages, for instance, or those with fewer than 10 headings, but such distinctions would be arbitrary. Some sort of clear-cut rules were required. Given the significance of headings to navigation in PDF, our priority was to assure that headings, whatever other purpose they might serve, nonetheless reflected the organization of the document regardless of length or complexity. Writing distinct rules for long vs. short or complex vs. simple documents was a minefield the ISO committee quite rightly avoided.
The effect on AT users
The PWS article cites an AT user convinced by the argument in favor of the skipped heading given in example 1. But could that user have determined the significance of the skipped heading without the author’s explanation? There are many possibilities, all of which are perfectly plausible within the model suggested by PWS:
- The H3 content is “off the main narrative” (the meaning in Example 1)
- The H3 content is “minor in importance compared to the H2s”
- The H3 content is “assigned a specific meaning other than ‘section heading'”
- The heading skips because the designer uses headings when they are thinking about style
- The heading skips because the author is ignorant of accessibility and just wasn’t thinking about it
- The heading skips because it’s a free country, ok?
There’s no way to indicate any of these extra “meanings” in PDF. An AT user encountering example 1 would have to guess from one of the above (or other) explanations for the skipped heading.
NOTE: Heading tags mean something very specific in documents that conform to PDF/UA. They provide an AT-navigable map to the document’s section headings, pure and simple. If an author uses heading levels to indicate importance rather than structure (or if they mix both uses together), there’s no way to communicate that information reliably to the AT user.
Finally, the PWS article returns to the main bugbear: an objection to my use of the term “logical” to describe heading structures that descend levels in numerical order, as required by PDF/UA. Rather than take on this point now, let’s see how it fares once we address the examples provided in the PWS article.
The article author concludes this section by asserting:
“…it is, of course, possible to have a hierarchical relationship between two members of a set (such as H1 and H4) without the need for the other members of the set (such as H2 and H3) to be present.”
To emphasize the obviousness of the point, as he sees it, an analogy is provided:
“By way of illustration, in the Army, if a General, a Major, a Sergeant and a Private are in a room together and the Major and the Sergeant leave the room, does the hierarchical relationship between the General and the Private leave with them? No, of course it doesn’t.”
Rooms and ranks are not a good analogy for dependent sections and subsections, as we’ll see.
While accepting that ISO 32000-1 states that headings are only for use in specifying the document’s structure, the PWS article highlights that ISO 32000 itself doesn’t prohibit skipping heading levels. That’s true, but ISO 32000 is not an accessibility specification; it’s the PDF specification. That’s why we have ISO 14289 (PDF/UA) – to specify the correct use of ISO 32000 (PDF) for accessibility purposes.
Section 2: Use Cases
So far the PWS article has attacked the logic underpinning PDF/UA’s prohibition on skipping heading levels. Since the article concedes that PDF/UA’s rule is consistent with common best-practice, there’s an awareness that arguing in favor of skipping heading levels won’t be very interesting unless it can be shown that skipping heading levels can improve accessibility.
In the PWS article’s second section are offered five use-cases that (it is claimed) demonstrate the logical and accessible use of skipped headings. You may want two browser windows open side-by-side if you want to follow the discussion while referencing each example in turn.
The PWS article says that “[Tagging the H? as H2] would be sending out the message that they are structurally the equivalent of each chapter, which they clearly are not.” The H? should be an H3, according to PWS. Skipping from H1 to H3 means, it is claimed, “[this is] peripheral content, not part of the primary narrative.”
The article’s author wants to force H2 to be semantically equivalent to “chapter,” but that’s not how document hierarchy works.
If it were true that tagging the Author, Organization and Contact headings with H2 would cause users to confuse those headings with chapter headings, then by the same token, using H3 would simply cause confusion with H3 headings elsewhere in the document.
Let’s look at some options for tagging this example in conformance with PDF/UA:
Option 1: As the PWS article says, AT users lose the ability to include these sections when they jump from heading to heading. It’s up to the author to decide whether these sections warrant heading tags, or should be grouped within some other heading, or what. Neither PDF/UA nor best-practice requires that each instance of large text gets a heading tag.
Option 2: The headings are appropriately structured in terms of the text provided, but now the document’s structure may appear clunky.
Option 3: The H? represents an additional element not present in the example. It’s an option the article’s author acknowledges, but dislikes on stylistic grounds.
Option 4: Since PDF/UA requires the document’s title be provided in the file’s metadata, the standard permits the document’s title to be tagged as a P. Let’s face it: titles aren’t really part of the document’s structure anyhow.
PWS: Given the plethora of ways to handle this situation and still conform to PDF/UA, it’s not clear what value the PWS way adds. Moreover, the meaning ascribed to this use of H3 in this case isn’t programmatically ascertainable. Users would have to guess.
The Meaning of Skipping
The PWS article asks the question: “How does going from H4 to H2 (which is legal in PDF/UA) describe anything?”
The answer’s simple, and it’s got nothing to do with conflating chapter or other role assignments with document structural elements. Moving from H4 to H2 indicates the end of content denoted by the previous H2. Period.
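As a minimal sketch of the rule quoted from 7.4.2 at the top of this article, assume a document's headings have already been pulled out as a flat list of integer levels in reading order. The function name `find_skipped_levels` and that list representation are my own illustration, not anything defined by PDF/UA or by a real PDF library, and the sketch deliberately says nothing about which level a document must start at. Only a descent that jumps more than one level is flagged; climbing back from H4 to H2 passes untouched.

```python
def find_skipped_levels(levels):
    """Return (index, previous_level, level) for each descent that skips a level.

    `levels` is a list of heading levels in reading order, e.g. [1, 2, 3].
    This is an illustrative check, not a PDF/UA validator.
    """
    skips = []
    for i in range(1, len(levels)):
        prev, cur = levels[i - 1], levels[i]
        # Going deeper by more than one level skips an intervening heading.
        if cur > prev + 1:
            skips.append((i, prev, cur))
    return skips


# H1, H2, H3, H4, then back to H2: the return to H2 just ends the earlier H2's content.
print(find_skipped_levels([1, 2, 3, 4, 2, 3]))  # []
# H1 to H3 skips H2; H2 to H5 skips H3 and H4.
print(find_skipped_levels([1, 3, 2, 5]))  # [(1, 1, 3), (3, 2, 5)]
```

Note that the check needs nothing but the numbers. No knowledge of what an author "meant" by a level is required, or available.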
Example 2: Secondary Content in Sidebars
In this example it is suggested that defenseless little H5 be assigned to “Environmental Issues”. The PWS article implies that AT users will benefit because they’ll be able to focus on that subject by choosing to skip from one H5 to another H5 heading.
To follow PDF/UA and use H3 instead of H5 on the conclusion page, the PWS author feels, “…[implies] that the environmental issues section on the Conclusions page is somehow structurally different to all of the other environmental issues sections in the document. It isn’t, so marking it up as such is misleading.”
The article author’s wrong about this: the implication of H3 following H2 is simply that one is a part of the other. No other inference is available from the raw fact of heading level. As such, skipping from H2 to H5 merely indicates that H3 and H4 content is somehow missing.
When “Environmental Issues” is the only H3 for the Conclusion’s H2 that’s all it means: the conclusion has a single subsection. Period. That’s fully consistent with the document’s structure.
Headings don’t have semantic assignments or classes in PDF. There’s no vehicle for conveying a given meaning (such as “H5 = Environmental Issues”) to the user, programmatically or otherwise. One option, I suppose, would be to include a statement at the beginning of the document explaining that “H2 has been reserved for chapters, H5 for Environmental Issues,” etc., but that wouldn’t be programmatically ascertainable either; it would grossly limit document design and would be hideously clunky besides.
In this example PWS would assign H4 to “Case Studies.” Once again, there’s no indication of how this assignment would be conveyed to assistive technology.
To illustrate some of the problems with this approach I’ve provided a pseudo headings-tree that develops example 3 a little further to demonstrate how burdening heading levels with semantic functions other than simply “heading level” can easily result in semantic chaos.
In this example probably all would agree that “Hx” must be an H4. Per example 1 it might be asked if that content was “peripheral,” and if it was, it might be suggested that someone may want to use H5 instead, clearing the way for the preferred assignment of H4=Case Study. Probably, though, the article author would allow that it’s fair to use an H4 here.
The first item I’ve actually marked as H4 is the first Case Study, per the PWS recommendation. All’s well (so far) with PDF/UA because this case study occurs within an H3.
Case Study 2, however, occurs directly following an H2. Regardless, since it’s a case study, PWS wants an H4. Now we have a skip from H2 (chapter) to H4 (case study). On the one hand, Case Studies are being consistently treated as H4; that’s nice. On the other hand, per the point made in example 1 regarding the “meaning” of skipped heading levels, maybe we’ve now implied (by attempting to tag all case studies “consistently” at H4) that this case study is somehow “out of the main narrative.” Oops.
The next H4 after Case Study 2 is a (proper) “document sectional” H4 rather than a “Case Study” H4. I doubt if there’s an objection to this, as it would be rather inflexible to disallow documents with H4 subsections simply because that heading had been intended for a specific meaning. I’m actually unsure, however, of how PWS would deal with this situation; perhaps by assigning Case Study to H5 instead? But what if we need that level for document organization as well? Also, if we use H5 for case study, won’t it look even more “peripheral” if it follows an H2?
You get the idea.
Case Studies 3 and 4 are subsections of that document sectional H4. They’re marked as H? because I really don’t know what to do with them at all using the approach suggested in the PWS article. How can we tag these case studies as H4 when they are not only case studies, but also logical subsections of an H4 tag? If I follow an H4 with another H4 then I’ve stayed on the same level in the document’s structure. If I mark the case studies as H5 they’re now logically subordinate to their parent H4, which is structurally correct, but now those headings are no longer assigned to the heading level that PWS wants to use for “Case Studies.” Argh!
Should we expect the end-user to distinguish the document structure uses of H4 from (in this example) “case study” uses of H4? Since there’s no way to indicate (in a standardized fashion, or otherwise) what additional meaning has been ascribed to H4 in any given instance, this approach really isn’t suitable for accessibility purposes.
Example 4: A brief history of Formula One
This time the article author wants to assign H4 to “Drivers,” but has otherwise added nothing to the previous examples.
Once again, we are assured that following PDF/UA’s heading level requirements will create confusion. Once again, the skipped heading (from H2 to H4, at the top of his example) makes it clear that PWS’s suggested model is itself confusing.
Recall that in Example 1 we were told that skipping H2 to H4 implies “out of the main narrative.” If so, why doesn’t it imply the same thing in Example 4? If it doesn’t, how exactly is that made clear to the AT user?
Of course, in example 4 the H4 following the H2 is clearly in the main narrative – it just offends the article’s author to give it an H3 heading because he’d prefer to reserve h3’s style for “decades” and the default h4 style for “drivers.” Really, another style should be added for headings about drivers. This way H3 could easily be employed below H2 and the text may be styled to match the other “Drivers” headings.
Example 5: annual report
This example is unclear because the document is so inherently confusing.
The article’s author claims the subsection headings “do the same job” throughout the document; in this case H3 is assigned to “topic”. One page includes a skip from H2 to H4 because, in this scenario, the H2 contains only a single topic.
Once again there’s the problem of an implied straying from the narrative, but what other problems does this model create?
If a user were surfing the H3 tags in this document looking for “topics”, as this scenario would have them do, they would entirely miss the contents of “Section 2: Statement of Internal Control” because there are no H3 tags in there, only H4.
The assumption the PWS article’s author wants AT users to make, that “Section 2: Statement of Internal Control” is a single-topic section, isn’t indicated in any way. There’s no information for or against this idea provided by the skipped heading, and that’s the point.
You can’t have it both ways: If we want the reader to think of H3 as “the topic level” then we can’t blame readers for missing the contents of an entire H2 section because we decided (but failed to communicate) that the lack of an H3 should be taken to mean that the enclosing H2 is a “single topic” section.
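The H3-surfing problem can be sketched in a few lines. The heading list below is a made-up stand-in for Example 5: only “Section 2: Statement of Internal Control” comes from the example, while every other title and the helper name `sections_without_level` are hypothetical.

```python
def sections_without_level(headings, section_level, wanted_level):
    """List titles of sections that contain no heading at `wanted_level`.

    `headings` is a list of (level, title) pairs in reading order.
    """
    missing = []
    current = None   # title of the section we are currently inside
    seen = False     # whether that section has a heading at wanted_level
    for level, title in headings:
        if level <= section_level:
            # The previous section (if any) ends here; record it if it had no match.
            if current is not None and not seen:
                missing.append(current)
            current = title if level == section_level else None
            seen = False
        elif level == wanted_level:
            seen = True
    if current is not None and not seen:
        missing.append(current)
    return missing


report = [
    (1, "Annual Report"),                             # hypothetical
    (2, "Section 1: Overview"),                       # hypothetical
    (3, "Topic A"),                                   # hypothetical
    (4, "Detail A.1"),                                # hypothetical
    (2, "Section 2: Statement of Internal Control"),  # from Example 5
    (4, "Detail B.1"),                                # hypothetical, H3 skipped
    (2, "Section 3: Accounts"),                       # hypothetical
    (3, "Topic C"),                                   # hypothetical
]
print(sections_without_level(report, section_level=2, wanted_level=3))
# ['Section 2: Statement of Internal Control']
```

A reader stepping from H3 to H3 lands in Sections 1 and 3 and passes straight over Section 2, with nothing in the tags to say that the omission was deliberate.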
In the PDF world consistency is key. Consistency doesn’t mean “In this document, H4 means ‘Case Study.’” If it did, things would get confusing for users who think (quite naturally) that H4 means “a subsection of H3” or “the parent of an H5.”
In PDF tags, the entire meaning of H4 is “a subsection of the parent H3’s content.” Nothing is made “clear” by skipping heading levels – the practice simply devalues the utility of headings in describing the document’s organization.
PDF/UA promises that an H4 will be a subsection of an H3. Once the document returns to an H3, the user has the right to expect that the document’s moved onto a new section of content. That’s it. The heading level has no meaning beyond its relationship to its parent.
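To make the parent relationship concrete, here is a rough sketch of how a flat run of heading tags can be turned into a nested outline. The `(level, title)` pairs, the names `build_outline` and `parent_of`, and the sample titles are all hypothetical; real assistive technology works from the PDF structure tree, not from a Python list.

```python
def build_outline(headings):
    """Nest (level, title) pairs into a tree using a stack of open sections."""
    root = {"level": 0, "title": "document", "children": []}
    stack = [root]
    for level, title in headings:
        # A heading at level N closes every open section at level N or deeper.
        while stack[-1]["level"] >= level:
            stack.pop()
        node = {"level": level, "title": title, "children": []}
        stack[-1]["children"].append(node)
        stack.append(node)
    return root


def parent_of(node, title, parent=None):
    """Return the title of the section that directly contains `title`."""
    if node["title"] == title:
        return parent
    for child in node["children"]:
        found = parent_of(child, title, node["title"])
        if found is not None:
            return found
    return None


outline = build_outline([
    (2, "Chapter A"),          # hypothetical titles throughout
    (3, "Section A.1"),
    (4, "Subsection A.1.1"),
    (3, "Section A.2"),
])
print(parent_of(outline, "Subsection A.1.1"))  # Section A.1
print(parent_of(outline, "Section A.2"))       # Chapter A
```

Feed the same function an H2 followed directly by an H4 and the H4 simply lands under the H2. Nothing in the result records why a level is missing, which is all a heading-based tool has to go on.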
The PWS article consistently confuses, in each example, document structure with text styling.
But don’t take my word for it…
Simply review the discussions on this issue that crop up from time to time on WebAIM’s list. Here’s a representative thread, including several comments from users who are blind or low-vision discussing heading levels.
Where I’d like to suggest we go from here
As discussed above, there’s currently no provision in PDF for assigning document-specific roles to heading levels (or any tag). I agree that it’s an interesting idea, and worth discussing, perhaps pursuing.
To anyone who feels that a purpose-built system for role-based navigation is highly desirable and would add to accessibility in PDF: please take the time to participate in the relevant discussions in the ISO 32000-2 and ISO 14289 email listservs and meetings on part 2 of their respective standards. These documents are under development now; this is precisely the time to do so.
It would be more productive than 6,000 word artillery duels over something that’s already baked and out of the oven.