The Winning Font in Court Opinions

Michael Lissner

At CourtListener, we're developing a new system to convert scanned court documents to text. As part of that work, we've analyzed more than 1,000 court opinions to determine which fonts courts are using.

Now that we have this information, our next step is to create training data for our OCR system so that it specializes in these fonts. For now, we've attached a spreadsheet with our findings and a script that others can use to extract font metadata from PDFs.

Unsurprisingly, the top font — drumroll please — is Times New Roman.
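For readers curious how a font tally like this might be computed, here is a minimal sketch. It is not the attached script. It assumes each PDF's fonts have already been listed with a tool such as Poppler's pdffonts, whose output has a two-line header followed by one row per font, and it counts each font once per document. The function names and the sample listing are hypothetical, made up for illustration.

```python
import re
from collections import Counter

# Subset fonts carry a tag of six uppercase letters and a plus sign,
# e.g. "ABCDEF+TimesNewRomanPSMT". Strip it so variants group together.
SUBSET_PREFIX = re.compile(r"^[A-Z]{6}\+")


def parse_font_listing(text):
    """Return the set of font names in one document's font listing.

    Assumes pdffonts-style output: a column header, a dashed rule,
    then one row per font with the font name in the first column.
    """
    fonts = set()
    for line in text.splitlines()[2:]:  # skip the two header lines
        parts = line.split()
        if parts:
            fonts.add(SUBSET_PREFIX.sub("", parts[0]))
    return fonts


def tally_fonts(listings):
    """Count how many documents use each font (one vote per document)."""
    counts = Counter()
    for listing in listings:
        counts.update(parse_font_listing(listing))
    return counts


# Hypothetical listing for a single document, for illustration only.
SAMPLE = """\
name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
ABCDEF+TimesNewRomanPSMT             TrueType          WinAnsi          yes yes no       8  0
GHIJKL+TimesNewRomanPS-BoldMT        TrueType          WinAnsi          yes yes no      10  0
"""

if __name__ == "__main__":
    # Pretend two documents produced the same listing.
    for font, count in tally_fonts([SAMPLE, SAMPLE]).most_common():
        print(font, count)
```

Counting once per document rather than once per occurrence keeps a single long opinion from dominating the totals. Whether the attached spreadsheet was built this way is not stated here.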

Attachments

extract_font_metadata_from_files.py_.txt

font-analysis.ods

© 2023 Free Law Project. Content licensed under a Creative Commons BY-ND 4.0 International license, except where indicated.