3.4.  Bookmarks and Named Destinations

Overview

Bookmarks are essential for quick navigation in large PDF documents. A book loses much of its value when its chapters cannot be reached through the table of contents. Use the following tests to ensure that bookmarks are generated correctly.

<!-- Tags for tests on bookmarks: -->

<hasNumberOfBookmarks />
<hasBookmarks />
  
<hasBookmark withLabel=".."        (One of these attributes ...
             withLinkToName=".."   ...
             withLinkToPage=".."   ...
             withLinkToURI=".."    ...
             withoutDeadLink=".."  ... has to be used)
/>

<hasBookmark withLabel=".."        (Only these two attributes ...
             linkingToPage=".."     ... can be used together.)    
/>

<!-- Nested tags of <hasBookmarks />: -->
<hasBookmarks>
  <matchingXPath />   (optional)
  <matchingXML   />   (optional)
</hasBookmarks>

We can see bookmarks as starting points and named destinations as the landing points. Named destinations can be used by bookmarks and also by HTML links. So you can jump from a website directly to a specific location within a PDF document.

For named destinations, the following tags are available:

<!-- Tags to check named destinations: -->

<hasNamedDestination />

<!-- Nested tags: -->
<hasNamedDestination>
  <withName />          (optional)
</hasNamedDestination>

Named Destinations

The names of named destinations can be tested easily:

<testcase name="hasNamedDestination_WithName">
  <assertThat testDocument="namedDestination/manyNamedDestinations.pdf">
    <hasNamedDestination>
      <withName>Seventies</withName>
      <withName>Eighties</withName>
      <withName>1999</withName>
      <withName>2000</withName>
    </hasNamedDestination>
  </assertThat>
</testcase>
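
To illustrate why these names matter outside the PDF, here is a minimal XHTML sketch of a web link that jumps to one of the destinations tested above. It assumes that the document is published at the hypothetical address https://example.com/manyNamedDestinations.pdf and that the reader's PDF viewer honors the "nameddest" open parameter. Neither assumption comes from PDFUnit itself.

<!-- Hypothetical web page fragment; the host name is an assumption. -->
<a href="https://example.com/manyNamedDestinations.pdf#nameddest=Seventies">
  Jump to the Seventies
</a>

The destination name becomes part of the URL, so it is subject to the rules for URLs.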

Because a name also has to work in external links, it must not contain spaces. For example, if a LibreOffice document has a bookmark labeled "First Bookmark" (which contains a space), LibreOffice creates a destination named "First2520Bookmark" when exporting the document to PDF. A test has to use the escaped value:

<!-- 
  When exporting to PDF, LibreOffice converts every space in a
  bookmark label into "2520" in the named destination.
-->
<testcase name="hasNamedDestination_CreatedWithLibreOffice">
  <assertThat testDocument="namedDestination/problem_convert-bookmarks-to-pdf.pdf">
    <hasNamedDestination>
      <withName>First2520Bookmark</withName>
    </hasNamedDestination>
  </assertThat>
</testcase>

In the name, "2520" stands for "%20" ("25" is the hexadecimal code of the "%" character), and "%20" is the URL encoding of a space.

Existence of Bookmarks

It is easy to verify the existence of bookmarks:

<testcase name="hasBookmarks">
  <assertThat testDocument="bookmarks/diverseContentOnMultiplePages.pdf">
    <hasBookmarks /> 
  </assertThat>
</testcase>

Number of Bookmarks

After testing whether a document contains bookmarks at all, it is worth verifying the number of bookmarks:

<testcase name="hasNumberOfBookmarks">
  <assertThat testDocument="bookmarks/manyBookmarks.pdf">
    <hasNumberOfBookmarks>19</hasNumberOfBookmarks>
  </assertThat>
</testcase>

Label of a Bookmark

An important property of a bookmark is its label. That is what the reader sees. So you should test that an expected bookmark has the expected label:

<testcase name="hasBookmark_withLabel">
  <assertThat testDocument="bookmarks/diverseContentOnMultiplePages.pdf">
    <hasBookmark withLabel="Content on page 3." /> 
  </assertThat>
</testcase>

Destinations of Bookmarks

Bookmarks can have different kinds of destinations. The tag <hasBookmark /> provides a suitable attribute for each kind.

Does a particular bookmark point to the expected page number:

<testcase name="hasBookmark_WithLabelLinkingToPage">
  <assertThat testDocument="bookmarks/diverseContentOnMultiplePages.pdf">
    <hasBookmark withLabel="Content on first page." linkingToPage="1"/>
  </assertThat>
</testcase>

The attribute linkingToPage=".." can only be used together with the attribute withLabel="..". In such a test the given label has to point to the expected page number.

Is there any bookmark pointing to an expected page number:

<testcase name="hasBookmark_WithLinkToPage">
  <assertThat testDocument="bookmarks/diverseContentOnMultiplePages.pdf">
    <hasBookmark withLinkToPage="1" />
  </assertThat>
</testcase>

Does a bookmark exist which points to an expected named destination:

  
<testcase name="hasBookmark_WithLinkToName">
  <assertThat testDocument="bookmarks/twoBookmarkToSameDestination.pdf">
    <hasBookmark withLinkToName="Destination on Page 1" />
  </assertThat>
</testcase>

Is there a bookmark pointing to a URI:

<testcase name="hasBookmark_WithLinkToURI">
  <assertThat testDocument="bookmarks/bookmarkWithURLAction.pdf">
    <hasBookmark withLinkToURI="http://www.wikipedia.org/" />
  </assertThat>
</testcase>

And finally, we can check that no bookmark has a dead link:

<!--
  Looking for dead internal links (GOTO) of any bookmark.
  A 'dead link' means that a bookmark is not pointing to a page.
-->
<testcase name="hasBookmark_WithoutDeadLink">
  <assertThat testDocument="bookmarks/diverseContentOnMultiplePages.pdf">
    <hasBookmark withoutDeadLink="YES" />
  </assertThat>
</testcase>

PDFUnit does not access websites, so it cannot tell whether a URI is reachable. A dead link is therefore a bookmark that does not point to a page or to any other destination.

Check Bookmarks with XML/XPath

The next tests all use an XML structure which is created with the utility program ExtractBookmarks.

The bookmarks of a PDF document can be compared with an existing XML file. Each bookmark in the PDF must match an element in the XML file.

<!-- 
  When comparing PDF parts against any XML, 
  whitespace and comments are ignored. 
-->
<testcase name="hasBookmarks_MatchingXML_AsFileName">
  <assertThat testDocument="bookmarks/bookmarksWithPdfOutline.pdf">
    <hasBookmarks>
      <matchingXML file="bookmarks/bookmarksWithPdfOutline.xml"/>
    </hasBookmarks>
  </assertThat>
</testcase>

Bookmark information can also be verified using individual XPath expressions:

<testcase name="hasBookmarks_MatchingXPath_MultipleInvocation_version1">
  <assertThat testDocument="bookmarks/bookmarksWithPdfOutline.pdf">
    <hasBookmarks>
      <matchingXPath expr="count(//Title) = 5" />
      <matchingXPath expr="count(//Title[count(ancestor::*) > 2] ) = 0" />
    </hasBookmarks>
  </assertThat>
</testcase>
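
To make the two XPath expressions above easier to read, here is a sketch of an outline file against which both would evaluate to true. The root element name "Bookmark" and the nesting of <Title> elements are assumptions for illustration. Only the element name Title is taken from the expressions, and the actual output of ExtractBookmarks may differ.

<!-- Hypothetical outline; only the element name Title comes from the tests above. -->
<Bookmark>
  <Title>Chapter 1
    <Title>Section 1.1</Title>
    <Title>Section 1.2</Title>
  </Title>
  <Title>Chapter 2
    <Title>Section 2.1</Title>
  </Title>
</Bookmark>

The sketch contains five <Title> elements, so count(//Title) = 5 holds. Each nested title has two ancestor elements (the enclosing <Title> and <Bookmark>), so no title has more than two ancestors and the second expression holds as well. A third nesting level would make it fail.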