PDF Technique 06

Using table elements for table markup in PDF Documents

An accessible version of a PDF needs to present the same information as the visual representation. For tables, this means that the number of rows and columns must be presented in addition to the content the table contains. The layout of an accessible table is quite simple, especially for anyone who is familiar with HTML tables. However, writing generic code to handle all PDF tables can be quite complicated (at least when using iText). I think the best approach for addressing the goals of this technique is to provide some basic code that serves as a starting point for any table. I’ll also discuss some of the issues that make tables complicated.

First of all, let’s take a look at how to tag a table. Here is the structure:

<table>
    <tr>
        <td>cell:0,0</td> <td>cell:1,0</td> <td>cell:2,0</td> ... <td>cell:m,0</td>
    </tr>

    <tr>
        <td>cell:0,1</td> <td>cell:1,1</td> <td>cell:2,1</td> ... <td>cell:m,1</td>
    </tr>

    <tr>
        <td>cell:0,2</td> <td>cell:1,2</td> <td>cell:2,2</td> ... <td>cell:m,2</td>
    </tr>

    ...

    <tr>
        <td>cell:0,n</td> <td>cell:1,n</td> <td>cell:2,n</td> ... <td>cell:m,n</td>
    </tr>
</table>

A screen reader must encounter this exact structure or it will not be able to interpret the table properly. We already know we can’t use the Document.add(…) API to simply add a PdfPTable to an iText document. However, we still want to use the PdfPTable data type. The only alternative is to use the PdfPTable.writeSelectedRows(…) methods. Unfortunately, we cannot even write out an entire row at once because we need to add the accessibility tags for each cell. Hence, we need to keep track of the x and y offsets ourselves and write each cell one at a time. At this point, I should give due credit to the person who came up with this idea first (as far as I know).
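To make the offset bookkeeping concrete, here is a minimal sketch in plain Java (no iText involved). The class name CellOffsets and its two methods are hypothetical helpers invented for illustration. It also assumes the row heights are known up front, which is a simplification: the AccessibleTable code later in this post takes the next y position from the value returned by writeSelectedRows(…) instead.

```java
import java.util.Arrays;

/** Hypothetical helper (not part of iText): computes where each cell starts. */
final class CellOffsets {

    /** Left edge of every column, walking right from upperLeftX. */
    static float[] columnLefts(float upperLeftX, float[] columnWidths) {
        float[] lefts = new float[columnWidths.length];
        float x = upperLeftX;
        for (int i = 0; i < columnWidths.length; i++) {
            lefts[i] = x;
            x += columnWidths[i];
        }
        return lefts;
    }

    /**
     * Top edge of every row. PDF y coordinates grow upward, so each
     * following row starts lower: we subtract the previous row's height.
     * Assumes the row heights are already known.
     */
    static float[] rowTops(float upperLeftY, float[] rowHeights) {
        float[] tops = new float[rowHeights.length];
        float y = upperLeftY;
        for (int i = 0; i < rowHeights.length; i++) {
            tops[i] = y;
            y -= rowHeights[i];
        }
        return tops;
    }

    public static void main(String[] args) {
        // prints [100.0, 150.0, 200.0, 250.0]
        System.out.println(Arrays.toString(
                columnLefts(100f, new float[] {50, 50, 50, 200})));
        // prints [692.0, 672.0, 652.0]
        System.out.println(Arrays.toString(
                rowTops(692f, new float[] {20, 20, 30})));
    }
}
```

Each (column, row) pair then maps to one x and one y, which is exactly the pair of coordinates a per-cell write needs.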

Fortunately, it is possible to write some fairly generic code for creating accessible tables. Admittedly, I thought otherwise before writing up this technique. Specifically, I could not get images to be read properly when nested in a table. I had been placing the com.itextpdf.text.Image directly inside a <TD> element. This would cause my screen reader to say “graphic mc ref”, which doesn’t have much meaning and is not the intended text. However, I have since discovered that if you wrap the Image inside another element so it is not directly nested within the <TD>, it works beautifully.

Page spill-over is the one remaining complication that I am aware of. In the simplest example of a table, we know the exact height and width and can position it easily on a single page. However, a PDF report could be multiple pages. How do we handle spill-over? Do we allow the table to be split? Do we try to fit it all on one page if possible? Does the table require its headers and footers to be re-printed on each new page? These are problems best solved by the developer who is designing the PDF. Furthermore, solving this problem is not directly related to accessibility. If a significant number of people contact me to request a generic solution, I’ll put one together.
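As a rough illustration of one possible policy (and only one), the sketch below groups rows into pages without ever splitting a row and without repeating headers or footers. It is plain Java, independent of iText. The class RowPaginator, its paginate method, and the idea of passing pre-measured row heights are all assumptions made for this example, not something the classes in this post provide.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: decides which rows go on which page. */
final class RowPaginator {

    /**
     * Groups consecutive row indexes into pages so that the rows on each
     * page fit within availableHeight. A row is never split. A single row
     * taller than availableHeight still gets a page of its own (and will
     * overflow it); handling that case is left to the caller.
     */
    static List<List<Integer>> paginate(float[] rowHeights, float availableHeight) {
        List<List<Integer>> pages = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        float used = 0f;
        for (int row = 0; row < rowHeights.length; row++) {
            float height = rowHeights[row];
            // Start a new page when this row would not fit on the current one.
            if (!current.isEmpty() && used + height > availableHeight) {
                pages.add(current);
                current = new ArrayList<>();
                used = 0f;
            }
            current.add(row);
            used += height;
        }
        if (!current.isEmpty()) {
            pages.add(current);
        }
        return pages;
    }

    public static void main(String[] args) {
        // Five rows of height 30 with 70 units available per page.
        // prints [[0, 1], [2, 3], [4]]
        System.out.println(paginate(new float[] {30, 30, 30, 30, 30}, 70f));
    }
}
```

A design that repeats header rows would need to reserve their height on every page before applying the same grouping.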

Here is a generic accessible PdfPCell implementation:

package ca.discotek.itext.table;

import com.itextpdf.text.Image;
import com.itextpdf.text.Phrase;
import com.itextpdf.text.pdf.PdfPCell;
import com.itextpdf.text.pdf.PdfPTable;

public class AccessibleCell extends PdfPCell {

  public final String alternativeText;

  /**
   * For use with addElement(...)
   */
  public AccessibleCell(String alternativeText) {
    this.alternativeText = alternativeText;
  }

  public AccessibleCell(Image image, String alternativeText) {
    super(image);
    if (alternativeText == null)
      throw new IllegalArgumentException("An image must have accessible text.");
    this.alternativeText = alternativeText;
  }

  public AccessibleCell(Image image, boolean fit, String alternativeText) {
    super(image, fit);
    if (alternativeText == null)
      throw new IllegalArgumentException("An image must have accessible text.");
    this.alternativeText = alternativeText;
  }

  public AccessibleCell(PdfPCell cell, String alternativeText) {
    super(cell);
    this.alternativeText = alternativeText;
  }

  /**
   * Nested PdfPTables not allowed. 
   */
  public AccessibleCell(PdfPTable table) {
    throw new UnsupportedOperationException("Nested tables are not allowed.");
  }

  /**
   * Nested PdfPTables not allowed. 
   */
  public AccessibleCell(PdfPTable table, PdfPCell style) {
    throw new UnsupportedOperationException("Nested tables are not allowed.");
  }

  public AccessibleCell(Phrase phrase, String alternativeText) {
    super(phrase);
    this.alternativeText = alternativeText;
  }

  public AccessibleCell(String text, String alternativeText) {
    this(new Phrase(text), alternativeText);
  }
}

And here is a generic accessible PdfPTable implementation:

package ca.discotek.itext.table;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import com.itextpdf.text.DocumentException;
import com.itextpdf.text.Image;
import com.itextpdf.text.Phrase;
import com.itextpdf.text.pdf.PdfContentByte;
import com.itextpdf.text.pdf.PdfDictionary;
import com.itextpdf.text.pdf.PdfName;
import com.itextpdf.text.pdf.PdfPCell;
import com.itextpdf.text.pdf.PdfPTable;
import com.itextpdf.text.pdf.PdfString;
import com.itextpdf.text.pdf.PdfStructureElement;

public class AccessibleTable extends PdfPTable {

  final float columnWidths[];

  List<AccessibleCell> cellList;

  public AccessibleTable(float columnWidths[]) throws DocumentException {
    super(columnWidths.length);
    float total = 0;
    for (int i = 0; i < columnWidths.length; i++)
      total += columnWidths[i];
    setTotalWidth(total);
    setWidths(columnWidths);
    this.columnWidths = columnWidths;

    cellList = new ArrayList<AccessibleCell>();
  }

  public void addCell(Image image, String alternativeText) {
    addCell(new AccessibleCell(image, alternativeText));
  }

  public void addCell(PdfPCell cell, String alternativeText) {
    addCell(new AccessibleCell(cell, alternativeText));
  }

  public void addCell(PdfPTable table) {
    throw new UnsupportedOperationException("Nested tables are not accessible.");
  }

  public void addCell(Phrase phrase, String alternativeText) {
    addCell(new AccessibleCell(phrase, alternativeText));
  }

  public void addCell(String text, String alternativeText) {
    addCell(new AccessibleCell(text, alternativeText));
  }

  public void addCell(AccessibleCell cell) {
    cellList.add(cell);
    super.addCell(cell);
  }

  public void writeTable(float upperLeftX, float upperLeftY, PdfContentByte cb,
      PdfStructureElement parentElement) throws DocumentException, IOException {

    PdfStructureElement tableElement = new PdfStructureElement(parentElement,
        PdfName.TABLE);

    float yPos = upperLeftY;

    String alternativeText;
    int rowCount = cellList.size() / columnWidths.length;
    AccessibleCell cell;
    for (int row = 0; row < rowCount; row++) {
      float xPos = upperLeftX;

      PdfStructureElement tr = new PdfStructureElement(tableElement, PdfName.TABLEROW);

      for (int column = 0; column < columnWidths.length; column++) {
        cell = cellList.get(row * columnWidths.length + column);
        alternativeText = cell.alternativeText;

        /*
         * If the given alternative text is unspecified, just use the actual
         * text for the accessible text... which means we don't have to write
         * out an ALT span.
         */
        if (alternativeText == null) {
          if ((column + 1) == columnWidths.length)
            yPos = writeCell(cb, new PdfStructureElement(tr, PdfName.TD), row,
                column, xPos, yPos);
          else {
            writeCell(cb, new PdfStructureElement(tr, PdfName.TD), row, column,
                xPos, yPos);
            xPos += columnWidths[column];
          }
        } else {
          if ((column + 1) == columnWidths.length)
            yPos = writeCell(cb, new PdfStructureElement(tr, PdfName.TD), row,
                column, xPos, yPos, alternativeText);
          else {
            writeCell(cb, new PdfStructureElement(tr, PdfName.TD), row, column,
                xPos, yPos, alternativeText);
            xPos += columnWidths[column];
          }
        }
      }
    }
  }

  float writeCell(PdfContentByte cb, PdfStructureElement element, int row,
      int column, float x, float y) throws DocumentException, IOException {

    PdfStructureElement divElement = new PdfStructureElement(element, PdfName.DIV);

    cb.beginMarkedContentSequence(divElement);

    float yOffset = writeSelectedRows(column, column + 1, row, row + 1, x, y, cb);
    cb.endMarkedContentSequence();
    return yOffset;
  }

  float writeCell(PdfContentByte cb, PdfStructureElement element, int row,
      int column, float x, float y, String alternativeText)
      throws DocumentException, IOException {

    PdfStructureElement divElement = new PdfStructureElement(element, PdfName.DIV);

    cb.beginMarkedContentSequence(divElement);

    PdfDictionary dict = new PdfDictionary();
    dict.put(PdfName.ALT, new PdfString(alternativeText));
    cb.beginMarkedContentSequence(PdfName.SPAN, dict, true);

    float yOffset = writeSelectedRows(column, column + 1, row, row + 1, x, y, cb);
    cb.endMarkedContentSequence();
    cb.endMarkedContentSequence();
    return yOffset;
  }

}
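One detail of writeTable above deserves a note: the row count comes from the integer division cellList.size() / columnWidths.length, so any cells left over in an incomplete final row are simply not visited by the loop. A caller may prefer to fail loudly instead. The following is a minimal sketch of such a check in plain Java; the class RowCount and the method completeRows are hypothetical names, not part of iText or of the classes above.

```java
/** Hypothetical helper: validates that the cells form only complete rows. */
final class RowCount {

    /**
     * Returns the number of rows, or throws if the cell count is not an
     * exact multiple of the column count (i.e. the last row is incomplete).
     */
    static int completeRows(int cellCount, int columnCount) {
        if (columnCount <= 0) {
            throw new IllegalArgumentException("columnCount must be positive");
        }
        int leftover = cellCount % columnCount;
        if (leftover != 0) {
            throw new IllegalStateException("Last row is incomplete: "
                    + leftover + " of " + columnCount + " cells added.");
        }
        return cellCount / columnCount;
    }

    public static void main(String[] args) {
        // 20 cells in a 4-column table; prints 5
        System.out.println(completeRows(20, 4));
    }
}
```

In AccessibleTable, a check like this could run at the top of writeTable with cellList.size() and columnWidths.length as arguments.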

Lastly, here is an example of how you might use the above classes:

package ca.discotek.itext.guide;

import java.awt.FontMetrics;
import java.awt.Graphics2D;
import java.awt.geom.Rectangle2D;
import java.awt.image.BufferedImage;
import java.io.*;
import java.net.URL;

import ca.discotek.itext.DocumentStructure;
import ca.discotek.itext.DocumentStructure.PageStructure;
import ca.discotek.itext.table.AccessibleCell;
import ca.discotek.itext.table.AccessibleTable;

import com.itextpdf.text.*;
import com.itextpdf.text.pdf.*;

public class Pdf06Tables {

  static final int IMAGE_BORDER_SIZE = 2;

  static final int DEFAULT_IMAGE_TYPE = BufferedImage.TYPE_INT_RGB;
  static final BufferedImage DEFAULT_IMAGE = new BufferedImage(1,1, DEFAULT_IMAGE_TYPE);
  static final Graphics2D DEFAULT_GRAPHICS = (Graphics2D) DEFAULT_IMAGE.getGraphics();
  static final FontMetrics DEFAULT_FONT_METRICS = DEFAULT_GRAPHICS.getFontMetrics();

  static Rectangle2D getBounds(String text) {
    return DEFAULT_FONT_METRICS.getStringBounds(text, DEFAULT_GRAPHICS);
  }

  static class TextImage extends BufferedImage {
    public TextImage(String text, Rectangle2D bounds) {
      super( (int) bounds.getWidth() + 2 * IMAGE_BORDER_SIZE, 
             (int) bounds.getHeight()  + 2 * IMAGE_BORDER_SIZE, 
             DEFAULT_IMAGE_TYPE);
      getGraphics().drawString(text, IMAGE_BORDER_SIZE, (int) (bounds.getHeight() + IMAGE_BORDER_SIZE) );
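      // drawRect covers width + 1 by height + 1 pixels, so subtract 1 to keep
      // the right and bottom edges of the border inside the image.
      getGraphics().drawRect(0, 0, getWidth() - 1, getHeight() - 1);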
    }
  }

  public static void main(String[] args) {
    try {
      Document document = new Document(PageSize.LETTER);
      PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream(
          "pdf-output/Pdf06Tables.pdf"));

      Rectangle rect = document.getPageSize();

      writer.setTagged();

      document.open();

      PdfContentByte cb = writer.getDirectContent();
      PdfStructureTreeRoot rootElement = writer.getStructureTreeRoot();

      DocumentStructure documentStructure = new DocumentStructure(rootElement, "en-us");
      PageStructure pageStructure = documentStructure.getPage(0);

      AccessibleTable table = new AccessibleTable(new float[]{50, 50, 50, 200});

      int numberOfRows = 5;
      AccessibleCell cell;
      String text;
      String alternativeText;
      int column;
      for (int row=0; row<numberOfRows; row++) {
        column = 0;
        // example of String value constructor
        text = "cell: " + row + ", " + column;
        alternativeText = "You are reading row " + row + " column " + column++;
        cell = new AccessibleCell(text, alternativeText);
        table.addCell(cell);

        // example of Phrase value constructor
        text = "cell: " + row + ", " + column;
        alternativeText = "You are reading row " + row + " column " + column++;
        cell = new AccessibleCell(new Phrase(text), alternativeText);
        table.addCell(cell);

        // example of Image value constructor
        text = "cell: " + row + ", " + column;
        alternativeText = "You are reading row " + row + " column " + column++;
        Rectangle2D bounds = getBounds(text);
        Image image = Image.getInstance(new TextImage(text, bounds), null);
        cell = new AccessibleCell(image, alternativeText);
        table.addCell(cell);

        // example of addElement style constructor
        text = "cell: " + row + ", " + column;
        alternativeText = 
            "You are reading content that doesn't make sense but demonstrates addElement functionality for row " + row + " column " + column++;
        cell = new AccessibleCell(alternativeText);
        cell.addElement(new Phrase(text));
        bounds = getBounds(text);
        image = Image.getInstance(new TextImage(text, bounds), null);
        image.scalePercent(100);
        cell.addElement(image);
        cell.addElement(new Paragraph(text + ", " + text + ", " + text));
        table.addCell(cell);
      }

      table.writeTable(rect.getLeft() + 100f, rect.getTop() - 100f, cb, pageStructure.bodyElement);

      document.close();
    } 
    catch (Exception e) {
      e.printStackTrace();
    }
  }

}
