How to Work with the COS Object Model in Java

Access the Document Catalog

The PDF catalog is the root of the document’s object tree and is represented as a COSDictionary. Access it via Document.getCatalog() to inspect or modify document-level properties:

try (Document doc = new Document("input.pdf")) {
    COSDictionary catalog = doc.getCatalog();
    // Inspect catalog entries
}

Work with COSDictionary

COSDictionary is the fundamental key-value container in the PDF COS model. Use COSName.of() to create name keys and store values of any COS type:

COSDictionary dict = new COSDictionary();
dict.set(COSName.of("Title"), new COSString("My Document"));
COSString title = (COSString) dict.get(COSName.of("Title"));
System.out.println(title.getString());

Work with COSArray

COSArray is the ordered sequence type in the COS model. Elements are zero-indexed and can hold any mix of COS object types. Retrieve values and cast to the appropriate COS type:

COSArray array = new COSArray();
array.add(COSInteger.valueOf(100));
array.add(COSInteger.valueOf(200));
array.add(COSInteger.valueOf(300));
System.out.println("Array length: " + array.size());
int first = ((COSInteger) array.get(0)).intValue(); // 100
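Because an array can mix object types, it usually pays to check each element's type before casting. The sketch below illustrates that pattern with plain Java collections only: List<Object> stands in for COSArray, and String, Integer, and Double stand in for name, integer, and real objects. The class and method names are hypothetical and not part of the library.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in sketch: List<Object> plays the role of COSArray here.
// None of these names belong to the library's API.
class MixedArraySketch {

    // Checks the runtime type of one element before treating it as that type.
    static String describe(Object element) {
        if (element == null) {
            return "null";
        }
        if (element instanceof Integer) {
            return "integer " + element;
        }
        if (element instanceof Double) {
            return "real " + element;
        }
        if (element instanceof String) {
            return "name /" + element;
        }
        return "other " + element.getClass().getSimpleName();
    }

    // Models a mixed array such as [ /XYZ 0 792 1.5 ] (illustrative values).
    static List<Object> sampleArray() {
        List<Object> array = new ArrayList<>();
        array.add("XYZ");
        array.add(0);
        array.add(792);
        array.add(1.5);
        return array;
    }

    public static void main(String[] args) {
        for (Object element : sampleArray()) {
            System.out.println(describe(element));
        }
    }
}
```

With the real COS types, the same shape applies: test the element with instanceof, then cast to the matching COS class.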

Access the Trailer Dictionary

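The PDF trailer dictionary holds references to the document catalog (/Root) and, when present, the document information dictionary (/Info) and the encryption dictionary (/Encrypt). The byte offset of the last cross-reference section is not a trailer entry; it follows the startxref keyword at the end of the file. Access the trailer via Document.getTrailer():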

try (Document doc = new Document("input.pdf")) {
    COSDictionary trailer = doc.getTrailer();
    // /Root is always present; /Info and /Encrypt are optional
}
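To show what these entries look like in the file itself, here is a minimal sketch that reads a classic (non-stream) trailer as text. It uses only java.util.regex, not the library. The sample tail, its object numbers, and its offset are made up for illustration, and the class and method names are hypothetical. Real files should be read through the library, since regular expressions do not handle cross-reference streams or incremental updates.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Text-level illustration only; not how the library parses trailers.
class TrailerSketch {

    // Illustrative end of a PDF file; all numbers are invented.
    static final String SAMPLE_TAIL =
            "trailer\n"
            + "<< /Size 22 /Root 1 0 R /Info 2 0 R >>\n"
            + "startxref\n"
            + "18799\n"
            + "%%EOF\n";

    // Returns the object number of the indirect reference stored under
    // the given key (for example "Root"), or -1 if the key is absent.
    static int referencedObject(String tail, String key) {
        Pattern p = Pattern.compile("/" + Pattern.quote(key) + "\\s+(\\d+)\\s+(\\d+)\\s+R");
        Matcher m = p.matcher(tail);
        return m.find() ? Integer.parseInt(m.group(1)) : -1;
    }

    // Returns the byte offset written after the startxref keyword, or -1.
    static long startXref(String tail) {
        Matcher m = Pattern.compile("startxref\\s+(\\d+)").matcher(tail);
        return m.find() ? Long.parseLong(m.group(1)) : -1L;
    }

    public static void main(String[] args) {
        System.out.println("Root object: " + referencedObject(SAMPLE_TAIL, "Root"));
        System.out.println("Info object: " + referencedObject(SAMPLE_TAIL, "Info"));
        System.out.println("Encrypt object: " + referencedObject(SAMPLE_TAIL, "Encrypt"));
        System.out.println("xref offset: " + startXref(SAMPLE_TAIL));
    }
}
```

In the sample, /Encrypt is missing, so the lookup returns -1. This mirrors the null check you would need before using an optional trailer entry.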

Number Trees

The COS layer exposes the PDF number tree structure, which maps integer keys to values and is used, for example, as the parent tree (/ParentTree) for structure elements. Use COSDictionary and COSArray to traverse it: intermediate nodes list their children in a /Kids array, and leaf nodes hold alternating integer keys and values in a /Nums array. This is the lowest-level API for working with the logical structure of a tagged PDF document.
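The traversal can be sketched with plain Java collections. In this sketch, Map<String, Object> stands in for COSDictionary and List<Object> stands in for COSArray. The class, the helper methods, and the sample values are hypothetical and are not the library's API. It assumes the standard node layout: /Kids on intermediate nodes, /Nums on leaf nodes, and /Limits giving the least and greatest key of a subtree.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-in sketch: Map plays the role of COSDictionary, List of COSArray.
class NumberTreeSketch {

    // Looks up an integer key, descending through /Kids until a /Nums leaf.
    static Object lookup(Map<String, Object> node, int key) {
        Object nums = node.get("Nums");
        if (nums instanceof List) {
            // Leaf node: /Nums holds key, value, key, value, ...
            List<?> pairs = (List<?>) nums;
            for (int i = 0; i + 1 < pairs.size(); i += 2) {
                if (((Integer) pairs.get(i)).intValue() == key) {
                    return pairs.get(i + 1);
                }
            }
            return null;
        }
        Object kids = node.get("Kids");
        if (kids instanceof List) {
            for (Object kid : (List<?>) kids) {
                @SuppressWarnings("unchecked")
                Map<String, Object> child = (Map<String, Object>) kid;
                // /Limits is [least, greatest]; skip subtrees that cannot match.
                List<?> limits = (List<?>) child.get("Limits");
                if (limits != null) {
                    int least = (Integer) limits.get(0);
                    int greatest = (Integer) limits.get(1);
                    if (key < least || key > greatest) {
                        continue;
                    }
                }
                Object found = lookup(child, key);
                if (found != null) {
                    return found;
                }
            }
        }
        return null;
    }

    // Builds a leaf node from alternating keys and values, sorted by key.
    static Map<String, Object> leaf(Object... pairs) {
        Map<String, Object> node = new HashMap<>();
        List<Object> nums = new ArrayList<>(Arrays.asList(pairs));
        node.put("Nums", nums);
        List<Object> limits = new ArrayList<>();
        limits.add(pairs[0]);
        limits.add(pairs[pairs.length - 2]);
        node.put("Limits", limits);
        return node;
    }

    // Root with two leaf children; the string values are placeholders.
    static Map<String, Object> sampleTree() {
        Map<String, Object> root = new HashMap<>();
        List<Object> kids = new ArrayList<>();
        kids.add(leaf(0, "entry for key 0", 1, "entry for key 1"));
        kids.add(leaf(5, "entry for key 5", 9, "entry for key 9"));
        root.put("Kids", kids);
        return root;
    }

    public static void main(String[] args) {
        Map<String, Object> root = sampleTree();
        System.out.println(lookup(root, 5));
        System.out.println(lookup(root, 3));
    }
}
```

A key that falls outside every child's /Limits range, such as 3 in the sample tree, yields null without visiting any leaf.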
