PDF Structure and Tagging

What are PDF tags?

PDF tags are structural elements that define the logical organization and meaning of content within a PDF document. Similar to HTML tags, PDF tags create a hierarchical structure that assistive technologies can understand and navigate. Tags identify different types of content such as headings, paragraphs, lists, tables, and images.

Without proper tagging, a PDF appears to assistive technologies as an unstructured sequence of text and images, making it difficult or impossible for users with disabilities to understand the content's organization and meaning.

PDF tag structure

PDF tags are organized in a tree-like hierarchy, starting with a root element and branching into increasingly specific content elements. This structure mirrors the logical organization of the document:

  • Document root: The top-level container for all tagged content
  • Grouping elements: Major divisions of the document, such as parts and sections
  • Content elements: Specific content types like headings, paragraphs, and lists
  • Inline elements: Text formatting and inline objects within content

This hierarchical structure allows assistive technologies to provide meaningful navigation options, such as jumping between headings or skipping to specific content sections.
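
The hierarchy described above can be sketched as a small tree in Python. This is a minimal illustration, not how a PDF stores its tags internally: the Tag class, the walk and headings helpers, and the sample document are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Tag:
    """One node in a simplified, hypothetical tag tree."""
    name: str                      # tag type, e.g. "H1" or "P"
    text: str = ""                 # text content, if any
    children: list["Tag"] = field(default_factory=list)


def walk(tag, depth=0):
    """Yield (depth, tag) pairs depth-first, which is the logical reading order."""
    yield depth, tag
    for child in tag.children:
        yield from walk(child, depth + 1)


def headings(root):
    """Collect the H1-H6 tags that could serve as jump targets for navigation."""
    return [(t.name, t.text) for _, t in walk(root)
            if len(t.name) == 2 and t.name[0] == "H" and t.name[1] in "123456"]


# Sample document (assumed content, for illustration only).
doc = Tag("Document", children=[
    Tag("H1", "Annual report"),
    Tag("Sect", children=[
        Tag("H2", "Summary"),
        Tag("P", "Revenue grew this year."),
    ]),
    Tag("Sect", children=[
        Tag("H2", "Details"),
        Tag("L", children=[Tag("LI", "First item"), Tag("LI", "Second item")]),
    ]),
])

# Print the tree with indentation that reflects nesting.
for depth, tag in walk(doc):
    print("  " * depth + tag.name)

print(headings(doc))
```

Walking the tree depth-first gives the reading order, and filtering it for heading tags gives the kind of list a screen reader might use for heading navigation.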

Common PDF tags

PDF documents use a standardized set of tags to identify different content types:

Structural tags

  • Document: Root element containing all content
  • Part: Large divisions of a document
  • Sect: Generic container for sections
  • Div: Generic block-level container

Heading tags

  • H1-H6: Hierarchical headings from the top level (H1) to the lowest level (H6)

Content tags

  • P: Paragraph text
  • L: List container
  • LI: List item
  • Table: Table container
  • TR: Table row
  • TH: Table header cell
  • TD: Table data cell
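
List and table tags only make sense when they are nested correctly, for example LI inside L and TH or TD inside TR. The sketch below checks that nesting on a hypothetical tree of (name, children) tuples. The ALLOWED_PARENTS rules are a deliberate simplification covering only the tags listed above; the full PDF specification permits additional containers.

```python
# Simplified parent rules: an assumption for illustration, not the full PDF specification.
ALLOWED_PARENTS = {
    "LI": {"L"},
    "TR": {"Table"},
    "TH": {"TR"},
    "TD": {"TR"},
}


def nesting_errors(tree, parent=None):
    """Return a message for every tag whose parent is not in its allowed set.

    `tree` is a (name, children) tuple, a hypothetical stand-in for a real tag tree.
    """
    name, children = tree
    errors = []
    allowed = ALLOWED_PARENTS.get(name)
    if allowed is not None and parent not in allowed:
        errors.append(f"{name} inside {parent}, expected one of {sorted(allowed)}")
    for child in children:
        errors.extend(nesting_errors(child, name))
    return errors


good = ("Table", [("TR", [("TH", []), ("TD", [])])])
bad = ("Document", [("TD", []), ("L", [("LI", []), ("P", [])])])

print(nesting_errors(good))  # no problems found
print(nesting_errors(bad))   # the stray TD is reported
```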

Inline tags

  • Span: Generic inline container
  • Link: Hyperlink
  • Figure: Image or graphic (can also appear at block level)

Creating tagged PDFs

There are several methods for creating tagged PDFs:

From source applications

  • Microsoft Word: Use proper heading styles and structure, then export to PDF with accessibility options enabled
  • Adobe InDesign: Apply paragraph and character styles, then export with tagged PDF options
  • HTML to PDF: Well-structured HTML with semantic markup can translate into good PDF tags, provided the conversion tool supports tagged output
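
To illustrate why semantic HTML converts well, the sketch below maps HTML elements to the PDF tags listed earlier and prints the resulting outline. The HTML_TO_PDF_TAG table and the TagOutline class are hypothetical and only show one plausible correspondence; real converters make their own mapping choices.

```python
from html.parser import HTMLParser

# Illustrative mapping only (an assumption); real converters may differ.
HTML_TO_PDF_TAG = {
    "h1": "H1", "h2": "H2", "h3": "H3", "h4": "H4", "h5": "H5", "h6": "H6",
    "p": "P", "ul": "L", "ol": "L", "li": "LI",
    "table": "Table", "tr": "TR", "th": "TH", "td": "TD",
    "a": "Link", "img": "Figure", "span": "Span", "div": "Div", "section": "Sect",
}


class TagOutline(HTMLParser):
    """Record the PDF tag each mapped HTML element would become, with its nesting depth."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.outline = []

    def handle_starttag(self, tag, attrs):
        if tag in HTML_TO_PDF_TAG:
            self.outline.append((self.depth, HTML_TO_PDF_TAG[tag]))
            if tag != "img":  # img is a void element with no end tag
                self.depth += 1

    def handle_endtag(self, tag):
        if tag in HTML_TO_PDF_TAG and tag != "img":
            self.depth -= 1


parser = TagOutline()
parser.feed("<section><h1>Title</h1><p>Intro with a <a href='#'>link</a>.</p>"
            "<ul><li>One</li><li>Two</li></ul></section>")

# Print the would-be tag tree, indented by nesting depth.
for depth, name in parser.outline:
    print("  " * depth + name)
```

Markup built from generic div and span elements alone would yield only Div and Span tags here, which is why heading, list, and table elements in the source matter.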

Using Adobe Acrobat

  • Auto-tagging: Acrobat can attempt to automatically identify and tag content structure
  • Manual tagging: Use the Tags panel to manually create and organize tag structure
  • Tag editing: Modify existing tags to improve accessibility

Best practice: Create tags from well-structured source documents rather than relying solely on auto-tagging, which may miss important structural relationships.

Checking tag structure

After tagging, verify that the tag tree matches the document's logical structure and reading order:

Using Adobe Acrobat

  • Tags panel: View and navigate the tag tree structure
  • Content panel: See how content is organized within tags
  • Accessibility Checker: Automatically identify tag-related accessibility issues
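
As an example of the kind of tag-related issue an automated check can look for, the sketch below flags headings that skip a level, such as an H3 directly after an H1. The function name and the input format (a list of heading tag names in reading order) are assumptions for illustration; this is not a description of how Acrobat's checker is implemented.

```python
def skipped_heading_levels(heading_tags):
    """Return (previous, current) pairs where a heading jumps down more than one level."""
    problems = []
    previous = 0  # treat the start of the document as level 0
    for tag in heading_tags:
        level = int(tag[1:])  # "H2" -> 2
        if level > previous + 1:
            problems.append((f"H{previous}" if previous else "start", tag))
        previous = level
    return problems


print(skipped_heading_levels(["H1", "H2", "H3", "H2"]))  # well-nested headings
print(skipped_heading_levels(["H1", "H3", "H2", "H4"]))  # two skipped levels
```

Moving back up the hierarchy (H3 followed by H2) is not flagged; only jumps that skip a deeper level are reported.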

Using screen readers

  • Navigation commands: Test heading navigation and other structural features
  • Content reading: Verify that content reads in logical order
  • Element lists: Check that headings, links, and other elements are properly identified

Regular testing with assistive technologies helps ensure that tag structure provides the intended user experience.
