13.3.  Comparing Text

The expected text and the actual text on a PDF page can be compared using the following tags:

<!-- Comparing text: -->

<startingWith />
<containing />                           1
<matchingComplete />                     2
<matchingRegex />
<endingWith />

<notStartingWith />
<notContaining />                        3
<notMatchingRegex />
<notEndingWith />

<containing       whitespaces=".." />    4
<matchingComplete whitespaces=".." />    5
<notContaining    whitespaces=".." />    6 

1 2 3

Tests without the whitespaces attribute normalize whitespace: leading and trailing whitespace is removed, and every sequence of whitespace characters within the text is reduced to a single space.
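
As a minimal sketch of this default behavior, the following test omits the whitespaces attribute. The test case name and the expected text are hypothetical placeholders, not content taken from the sample document. The sketch also assumes that normalization is applied to the expected text as well as to the page text, so that the irregular spacing and the line break inside the tag would not affect the comparison.

```xml
<!-- Sketch: default whitespace handling (no whitespaces attribute). -->
<!-- The expected text is a hypothetical placeholder.                -->
<testcase name="hasText_Containing_DefaultWhitespaces">
  <assertThat testDocument="content/diverseContentOnMultiplePages.pdf">
    <hasText on="FIRST_PAGE">
      <containing>  some    expected
        text  </containing>
    </hasText>
  </assertThat>
</testcase>
```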

4 5 6

Use the optional attribute whitespaces=".." to control how whitespace is handled. The attribute accepts the constants KEEP, NORMALIZE, and IGNORE, which are explained separately in section 13.4: “Whitespace Processing”.
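
The following sketch shows where the attribute is placed, using the same test structure as the other examples in this section. The test case name and the expected text are hypothetical; the exact effect of each constant is described in section 13.4 and is not restated here.

```xml
<!-- Sketch: setting the whitespaces attribute explicitly. -->
<!-- The expected text is a hypothetical placeholder.      -->
<testcase name="hasText_Containing_IgnoreWhitespaces">
  <assertThat testDocument="content/diverseContentOnMultiplePages.pdf">
    <hasText on="FIRST_PAGE">
      <containing whitespaces="IGNORE">some expected text</containing>
    </hasText>
  </assertThat>
</testcase>
```

Replacing IGNORE with KEEP or NORMALIZE selects one of the other two modes.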

Comparisons using regular expressions follow the syntax and rules of the class java.util.regex.Pattern:

<!-- Using a regular expression to compare page content: -->
<testcase name="hasText_MatchingRegex">
  <assertThat testDocument="content/diverseContentOnMultiplePages.pdf">
    <hasText on="FIRST_PAGE">
      <matchingRegex>.*[Cc]ontent.*</matchingRegex>
    </hasText>
  </assertThat>
</testcase>
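
The negated tags from the list above should be usable in the same position. The following is a sketch only: the test case name and the pattern are hypothetical, and it assumes that notMatchingRegex takes its pattern as element content in the same way as matchingRegex.

```xml
<!-- Sketch: negated regular-expression comparison.              -->
<!-- The pattern is a hypothetical placeholder; the test is meant -->
<!-- to pass when the page text does not match it.               -->
<testcase name="hasText_NotMatchingRegex">
  <assertThat testDocument="content/diverseContentOnMultiplePages.pdf">
    <hasText on="FIRST_PAGE">
      <notMatchingRegex>.*[Uu]nexpected.*</notMatchingRegex>
    </hasText>
  </assertThat>
</testcase>
```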