3.29.  Tagged Documents

Overview

The PDF standard ISO 32000-1:2008 states in section 14.8.1: "A Tagged PDF document shall also contain a mark information dictionary (see Table 321) with a value of true for the Marked entry." (Cited from: http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/PDF32000_2008.pdf.)

Following this requirement, PDFUnit looks in a PDF document for a dictionary named /MarkInfo. If that dictionary contains the key /Marked with the value true, PDFUnit identifies the document as tagged.
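
The detection rule described above can be sketched in plain Java. This is only an illustration of the logic, not PDFUnit's actual implementation: it assumes the document catalog has already been parsed into nested maps, and the class TaggedCheckSketch and its method isTagged are hypothetical names introduced for this example.

```java
import java.util.HashMap;
import java.util.Map;

public class TaggedCheckSketch {

  // Hypothetical helper, not part of PDFUnit: returns true when the catalog
  // holds a /MarkInfo dictionary whose /Marked entry is the boolean true.
  public static boolean isTagged(Map<String, Object> catalog) {
    Object markInfo = catalog.get("MarkInfo");
    if (!(markInfo instanceof Map)) {
      return false; // no mark information dictionary present
    }
    Object marked = ((Map<?, ?>) markInfo).get("Marked");
    return Boolean.TRUE.equals(marked); // a missing or false entry means "not tagged"
  }

  public static void main(String[] args) {
    // Assumed in-memory model of a catalog with /MarkInfo << /Marked true >>
    Map<String, Object> markInfo = new HashMap<>();
    markInfo.put("Marked", Boolean.TRUE);

    Map<String, Object> catalog = new HashMap<>();
    catalog.put("MarkInfo", markInfo);

    System.out.println(isTagged(catalog));         // catalog with /Marked true
    System.out.println(isTagged(new HashMap<>())); // catalog without /MarkInfo
  }
}
```

Note that a /MarkInfo dictionary alone is not sufficient in this sketch: the /Marked entry has to be present and true, which mirrors the two-step check described in the text.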

The following test methods are available:

// Simple tests:
.isTagged()

// Tag value tests:
.isTagged().with(..) 
.isTagged().with(..).andValue(..)

Examples

The simplest test checks whether a document is tagged.

@Test
public void isTagged() throws Exception {
  String filename = "documentUnderTest.pdf";

  AssertThat.document(filename)
            .isTagged() 
  ;
}

Further tests verify the existence of a particular tag.

@Test
public void isTagged_WithKey() throws Exception {
  String filename = "documentUnderTest.pdf";
  String tagName = "LetterspaceFlags";

  AssertThat.document(filename)
            .isTagged() 
            .with(tagName)
  ;
}

Finally, you can verify the values of tags:

@Test
public void isTagged_WithKeyAndValue_MultipleInvocation() throws Exception {
  String filename = "documentUnderTest.pdf";
  
  AssertThat.document(filename)
            .isTagged()
            .with("Marked").andValue("true") 
            .with("LetterspaceFlags").andValue("0") 
  ;
}