Lists: Contrasting HTML with PDF

In early 2014 a discussion on the WCAG listserv revealed that while implementation of Definition Lists (HTML 4) and Description Lists (HTML 5) is fairly widespread (including in WCAG 2.0), accessibility support for <dl> tags (as of Q1, 2014) may not be all that one might hope.

This article contrasts HTML’s list tags with PDF’s standard structure elements for lists. I show that PDF’s generic mechanism for providing accessibility for PDF content accommodates list semantics in a wider variety of scenarios.

The manner in which the following pseudo-code is presented is discussed in a note at the end of the article.

Lists in HTML

HTML 4.01 and 5 provide three types of lists:

Ordered List

     <ol>               Label value is sequential based on the type attribute
          <li>          List item content (equivalent to <LBody> in PDF)

Unordered List

     <ul>               List items use the same label based on the type attribute
          <li>          List item content (equivalent to <LBody> in PDF)

Definition List (HTML 4) / Description List (HTML 5)

     <dl>               List (equivalent to <L> in PDF)
          <dt>          List item label (equivalent to <Lbl> in PDF)
          <dd>          List item content (equivalent to <LBody> in PDF)

Lists in PDF

PDF uses a single set of standard structure types for every kind of list. These may be represented as follows:

     <L>                        Generic list
          <LI>                  Generic list item
               <Lbl>            Generic “list item label” container
               <LBody>          Generic “list item body” container

An optional ListNumbering attribute for <L> tags allows authors to specify types of labels. This attribute’s default value is “none” in which case each list item’s label (the content contained by the <Lbl> element) defines itself via appropriate mechanisms – Unicode for text glyphs, alternate text or actual text for images, etc.

Beyond HTML Lists

Screen shot of the tags in the dl list in the attached PDF.

Since the structure exists independently of the content, PDF’s list tagging model accommodates lists HTML cannot. For example, PDF lists accommodate different types of labels within a single list, can associate symbols with descriptions, or associate text labels with  content in diagrams and maps.

Irrespective of any design considerations, any two pieces of content in a PDF may be associated within a list item as list-item label and list-item body. These items could be spread across a page, or on different pages, with item sequence and list-nesting determined by the sequence of <L> and <LI> tags.

PDF lists are extremely flexible, unambiguous, and even better, every type of list is accessibility-supported in PDF today, and has been for years (under Windows).

This PDF file demonstrates dl/dt/dd tags delivered in an accessibility-supported manner via PDF. The dl/dt/dd tags (PDF accommodates custom structure elements) are role-mapped to PDF standard structure types (L/LI/Lbl/LBody) for export to HTML and use by assistive technology (AT).

Why authors need generic list structures

It’s the nature of the HTML (and thus EPUB) beast that list item labels are typographically associated with list item content. Some content, however, requires more flexibility. This is especially true in Science, Technical, Engineering and Mathematics (STEM) publishing and for providing accessible solutions for graphically rich or illustrative content generally.

Two general thoughts occur:

  • HTML does not define the scope of all possible document content; it simply defines itself.
  • PDF, however, can encompass pretty much any final-form “hardcopy” content, and provides the freedom to tag that content in any way that provides complete semantic representation to consuming software.

The ordered / unordered / description list paradigm in HTML implies that list types are semantically distinct. It turns out that’s not really true. In reality a list is a list; it’s the list’s labels that contain the semantics.

Happily, PDF provides an accessible (and accessibility-supported) means of representing lists using a single model that provides total flexibility for document authors.

When it’s got to look and work in a certain way, you can trust PDF.


The above representations of pseudo-code are executed with styled text rather than <dl> tags, even though definition lists might have been preferable from an accessibility point of view. Why did I do this? Four reasons (and four nested sub-reasons).

  1. As of 2014-02-13, WordPress does not natively support <dl> tags, nor is there a current plugin providing such support. I’ll submit that’s an interesting fact right there.
  2. Sure, I can easily introduce plain HTML to overcome reason 1, but Chrome’s default representation of <dl> places the definition below and indented from the term being defined, which is inappropriate for psuedo-code.
  3. Doubtless, I could have overcome reason 2 with a CSS hack to force the <dt> and <dd> content to render on the same line and with the spacing I wanted. But:
    1. I’m no CSS genius. I’m closer to a CSS cretin.
    2. It might take a CSS genius to do it right (ie, consistently) on all browsers, platforms, mobile, etc, etc.
    3. It’s not clear that <dl> is accessibility-supported at this time
    4. I’m providing an accessibility-supported version in this PDF anyway
  4. It’s more fun to leave the exploration of the (various) ironies of this situation to the reader.

Some relevant thoughts on HTML’s <dl> tag and the limitations thereof may be found here: