How to Use PDF Debugger to Check the Code of a PDF File

Apple has stopped direct support for PostScript files in macOS Sonoma, but you can still look inside PDF files to see what they contain using the PDF debugger.

PDF is a common document format on the Internet whose roots go back to PostScript, which Adobe Systems invented in the early 1980s. At the time, laser printers were just coming of age: the Apple LaserWriter and the Macintosh Plus together became one of the world's first commercial desktop publishing systems.

PostScript – origin of PDF

PostScript is a language that describes how a page should be laid out on the screen or on paper. Although PostScript was originally used in the ROM of laser printers, it was later used in computers created by Steve Jobs' second company, NeXT Inc.

The NeXTStep operating system (later called OpenStep) overcame early screen limitations by using Display PostScript to display text, shapes, and images on the screen.

Although Adobe's original PDF file standard was not technically pure PostScript, it was built on top of PostScript. In version 1.3, published in 2000, Adobe added support for the PostScript Language Level 3 image model.

It also supported the original and now defunct Adobe Type 1 font standard, which we'll cover in a future article.

PDF and .ps files

A few years later, Adobe introduced Portable Document Format, or PDF, which became a document and Internet standard. PDF was originally Adobe's own format, but it was standardized as ISO 32000 in 2008.

The standard was revised again in 2020.

PDF almost never saw the light of day because most of Adobe's management at the time saw no demand for it, and PostScript was still the dominant page description language in the worlds of graphic design, desktop publishing, and printing.

You can also embed forms, digital signatures, 3D objects, videos, and a variety of other content in PDF. PDF files can be encrypted and password-protected. Separately, Adobe recently announced that it will no longer support the original PostScript font format, Type 1 fonts.

When you open a PDF file on a modern computer, the application uses operating system code or libraries to read the PDF file's instructions. It translates commands into native drawing routines for display in the OS.
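To make that translation step concrete, here is a toy sketch in Python. It handles only five of PDF's path operators (m, l, re, S, and f), which follow their operands in the page's content stream. The drawing routine names such as move_to and add_rect are invented for illustration and are not any real OS API, and real content streams are usually compressed and use many more operators.

```python
def interpret(content: str) -> list:
    """Turn a few PDF path operators into a list of named drawing calls."""
    # Operator -> (hypothetical drawing routine name, number of operands).
    ops = {
        "m": ("move_to", 2),      # begin a new subpath at x, y
        "l": ("line_to", 2),      # straight line to x, y
        "re": ("add_rect", 4),    # rectangle: x, y, width, height
        "S": ("stroke_path", 0),  # stroke the current path
        "f": ("fill_path", 0),    # fill the current path
    }
    stack, calls = [], []
    for token in content.split():
        if token in ops:
            name, arity = ops[token]
            # Operands come before the operator, so pop them off the stack.
            args = stack[len(stack) - arity:]
            del stack[len(stack) - arity:]
            calls.append((name, tuple(args)))
        else:
            try:
                stack.append(float(token))
            except ValueError:
                # Operator this sketch does not support: drop its operands.
                stack.clear()
    return calls


# A line from (10, 10) to (100, 10), then a filled 50 x 30 rectangle.
for name, args in interpret("10 10 m 100 10 l S 20 20 50 30 re f"):
    print(name, args)
```

A real renderer would call into the platform's graphics library at each step instead of collecting tuples, but the operand-then-operator pattern is the same.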

For macOS and iOS, these are the Quartz framework, which contains an API for processing PDF files, and the Core Graphics framework, which provides graphics drawing contexts for displaying PDFs. Apple has split the original functionality of Quartz, so Core Graphics handles most primitives and drawing contexts, while Quartz takes care of images, PDF operations, and Quick Look preview functions.

Preview, Printing and Viewing

Apple's Preview app and the system print function formerly could open, display, and print PostScript directly from .ps files containing PostScript code, but this support was discontinued in macOS 14 Sonoma. Preview has supported PDF files for decades.

You can still view the raw PostScript content of .ps files on Mac by simply dragging them into the TextEdit application. They will open as text files and you can read the PostScript directly.

Although most modern laser printers no longer contain PostScript interpreters in their ROMs, some consumer-grade laser printers contain PostScript emulators, such as Brother's BR-Script, which can receive, decode, and print .ps files using the printer's own rendering.

You can look inside a PDF file and see its raw contents using a Mac hex editor utility such as Hex Fiend or HexEdit. Hexadecimal editors are designed to display the code and content of binary files, but they can be used to view any type of file content as long as you know the file format.

View PDF content in Hex Fiend.

But for many files, including PDF files, the raw data may be encoded or stored in a way that is not human readable. For this reason, in order to understand what you are looking at in hex editors, you need to know the internal structure of the file format.
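If you don't have a hex editor handy, a few lines of Python give a similar view. This is a minimal sketch: hex_dump is a helper name made up for this example, and the sample bytes imitate the first two lines of a typical PDF rather than coming from a real file.

```python
def hex_dump(data: bytes, width: int = 16) -> str:
    """Format bytes the way a hex editor shows them: offset, hex, ASCII."""
    pad = width * 3 - 1  # width of a full row of hex pairs with spaces
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Show printable ASCII as-is and everything else as a dot.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{pad}}  {text_part}")
    return "\n".join(lines)


# Sample bytes: a PDF header line followed by a comment of binary bytes.
print(hex_dump(b"%PDF-1.7\n%\xe2\xe3\xcf\xd3\n"))
```

To inspect a real file, read it in binary mode and pass in the first few hundred bytes, for example `hex_dump(open("example.pdf", "rb").read(256))`, where example.pdf is a placeholder name.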

PDF files usually start with a header beginning with “%PDF-” followed by the version number (for example, “%PDF-1.7”) and end with the marker “%%EOF”.
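A quick way to test a suspect file for these start and end markers is sketched below in Python. The function name check_pdf_markers and the 1024-byte search windows are assumptions made for this example. They are a lenient choice, not something the article's tools are known to use.

```python
def check_pdf_markers(data: bytes) -> dict:
    """Report whether the usual PDF start and end markers are present."""
    # The header normally sits at byte 0; search a small window in case
    # the file has a little leading junk (window size is an assumption).
    header_pos = data[:1024].find(b"%PDF-")
    version = None
    if header_pos != -1:
        # The version follows the marker, e.g. b"%PDF-1.7" -> "1.7".
        raw = data[header_pos + 5:header_pos + 8]
        version = raw.decode("ascii", "replace")
    # The end marker should appear near the end of the file.
    has_eof = b"%%EOF" in data[-1024:]
    return {"header_offset": header_pos, "version": version, "has_eof": has_eof}


# A stub with both markers, and the same stub cut off before the end.
good = b"%PDF-1.4\n1 0 obj\n<< >>\nendobj\n%%EOF\n"
print(check_pdf_markers(good))
print(check_pdf_markers(good[:20]))
```

A missing end marker often means the file was truncated during download or copying, though passing both checks does not prove the rest of the file is intact.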

Using the PDF Debugger

Most PDF files have a hierarchical tree structure. Some nodes in the tree consist of child nodes that further describe the parent nodes, while other nodes (leaf nodes) contain only file information such as page count, type, length, creator information, and other information.
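The tree can be made visible with a rough Python sketch that scans for numbered objects and follows the references between them, starting from the root named in the file's trailer. Everything here is illustrative: the helper names are invented, the sample bytes are a hand-written fragment rather than a complete PDF that a viewer could open, and a plain regular-expression scan like this will miss objects stored in compressed object streams and can be fooled by binary data.

```python
import re

OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)endobj", re.DOTALL)
TYPE_RE = re.compile(rb"/Type\s*/(\w+)")
ROOT_RE = re.compile(rb"/Root\s+(\d+)\s+\d+\s+R")
REF_RE = re.compile(rb"(\d+)\s+\d+\s+R\b")


def scan_objects(data: bytes) -> dict:
    """Map object number -> (type name or None, referenced object numbers)."""
    objects = {}
    for match in OBJ_RE.finditer(data):
        body = match.group(3)
        type_match = TYPE_RE.search(body)
        type_name = type_match.group(1).decode() if type_match else None
        refs = [int(num) for num in REF_RE.findall(body)]
        objects[int(match.group(1))] = (type_name, refs)
    return objects


def tree_lines(objects: dict, num: int, depth: int = 0, seen=None) -> list:
    """Walk references from object `num`, indenting children under parents."""
    seen = set() if seen is None else seen
    if num in seen or num not in objects:
        return []  # skip back-references (such as /Parent) and missing objects
    seen.add(num)
    type_name, refs = objects[num]
    lines = ["  " * depth + f"object {num}: {type_name or 'untyped'}"]
    for ref in refs:
        lines.extend(tree_lines(objects, ref, depth + 1, seen))
    return lines


# Hand-written fragment: a catalog, a page list, and one page.
SAMPLE = (
    b"%PDF-1.4\n"
    b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"
    b"2 0 obj\n<< /Type /Pages /Kids [3 0 R] /Count 1 >>\nendobj\n"
    b"3 0 obj\n<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] >>\nendobj\n"
    b"trailer\n<< /Root 1 0 R /Size 4 >>\n"
    b"%%EOF\n"
)

root = int(ROOT_RE.search(SAMPLE).group(1))
print("\n".join(tree_lines(scan_objects(SAMPLE), root)))
```

The page object points back at its parent, which is why the walk keeps a set of objects it has already visited.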

PDF files can become corrupted and contain invalid tree data, rendering them unreadable in most cases. If you think you have a corrupted PDF or just want to view the PDF tree information, there is now an easy way.

PDF Debugger, a simple web tool from Ukrainian developer Yevhenii Hyzyla, allows you to do just that. It's easy to use: just drag any PDF file from the Finder on your Mac into the drag area on the page, and it will read and display the PDF file's tree information.

Drag a PDF file onto the PDF Debugger page to display its tree information.

The PDF Debugger does not display all the contents of a PDF file, but you can still use a hex editor or other application to read the rest of the raw data.

Other utilities can convert PDF files to .ps files so you can read PostScript directly.

The PDF Debugger is a quick and easy way to check the underlying structure of PDF files.

Hyzyla also has a wrapper library for the Node.js JavaScript runtime on his GitHub page, which uses Google's high-performance PDFium library compiled to WebAssembly.
