How to Work with the COS Object Model in Java
Access the Document Catalog
The PDF catalog is the root of the document’s object tree and is represented as a
COSDictionary. Access it via Document.getCatalog() to inspect or modify document-level
properties:
try (Document doc = new Document("input.pdf")) {
    COSDictionary catalog = doc.getCatalog();
    // Inspect catalog entries
}

Work with COSDictionary
COSDictionary is the fundamental key-value container in the PDF COS model. Use
COSName.of() to create name keys and store values of any COS type:
COSDictionary dict = new COSDictionary();
dict.set(COSName.of("Title"), new COSString("My Document"));
COSString title = (COSString) dict.get(COSName.of("Title"));
System.out.println(title.getString());

Work with COSArray
COSArray is the ordered sequence type in the COS model. Elements are zero-indexed
and can hold any mix of COS object types. Retrieve values and cast to the appropriate
COS type:
COSArray array = new COSArray();
array.add(COSInteger.valueOf(100));
array.add(COSInteger.valueOf(200));
array.add(COSInteger.valueOf(300));
System.out.println("Array length: " + array.size());
int first = ((COSInteger) array.get(0)).intValue(); // 100

Access the Trailer Dictionary
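The PDF trailer dictionary holds references to the document catalog (/Root) and the
document information dictionary (/Info), plus entries such as /Size and, for encrypted
files, /Encrypt. The byte offset of the cross-reference table is not a dictionary entry;
it follows the startxref keyword after the trailer: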
try (Document doc = new Document("input.pdf")) {
    COSDictionary trailer = doc.getTrailer();
    // Trailer contains /Root, /Info, /Encrypt entries
}

Named Numbers Tree
The COS layer exposes the PDF number tree structure, which maps integer keys to
objects. One example is the parent tree (/ParentTree) used to find the structure
elements that content items belong to. Use COSDictionary and COSArray to traverse
the Nums arrays in the number tree nodes. This is the lowest-level API for working
with the logical structure of a tagged PDF document.
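
The traversal described above can be sketched roughly as follows. This guide does not show the library calls for walking tree nodes, so the sketch uses plain java.util.Map and java.util.List as stand-ins for COSDictionary and COSArray; the class NumberTreeSketch and its lookup and sampleTree methods are hypothetical names for illustration only. The Nums, Kids, and Limits keys are the number tree entries defined by the PDF specification. Treat it as a minimal outline of the lookup logic, not as the library's own implementation.

```java
import java.util.List;
import java.util.Map;

public class NumberTreeSketch {

    // Looks up an integer key in a number tree node; returns null if the key is absent.
    // Map stands in for COSDictionary and List for COSArray (assumption, see lead-in).
    static Object lookup(Map<String, Object> node, int key) {
        // Leaf node (or single-node root): Nums holds alternating key/value entries.
        Object nums = node.get("Nums");
        if (nums instanceof List) {
            List<?> pairs = (List<?>) nums;
            for (int i = 0; i + 1 < pairs.size(); i += 2) {
                if (((Number) pairs.get(i)).intValue() == key) {
                    return pairs.get(i + 1);
                }
            }
            return null;
        }
        // Intermediate node: descend into the kid whose Limits range covers the key.
        Object kids = node.get("Kids");
        if (kids instanceof List) {
            for (Object child : (List<?>) kids) {
                @SuppressWarnings("unchecked")
                Map<String, Object> kid = (Map<String, Object>) child;
                List<?> limits = (List<?>) kid.get("Limits");
                int low = ((Number) limits.get(0)).intValue();
                int high = ((Number) limits.get(1)).intValue();
                if (key >= low && key <= high) {
                    return lookup(kid, key);
                }
            }
        }
        return null;
    }

    // Builds a small two-leaf tree with made-up values for demonstration.
    static Map<String, Object> sampleTree() {
        Map<String, Object> leafA = Map.of(
                "Limits", List.of(0, 3),
                "Nums", List.of(0, "page-0", 3, "page-3"));
        Map<String, Object> leafB = Map.of(
                "Limits", List.of(10, 12),
                "Nums", List.of(10, "page-10", 12, "page-12"));
        return Map.of("Kids", List.of(leafA, leafB));
    }

    public static void main(String[] args) {
        Map<String, Object> root = sampleTree();
        System.out.println(lookup(root, 12));
        System.out.println(lookup(root, 5));
    }
}
```

With the real COS types, the same shape would presumably apply: read the Nums or Kids entry from each node dictionary, cast it to the array type, and compare integer keys, checking the result type before each cast.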