Seven Deadly Sins of PDFs

Portable Document Format (better known as PDF) is an incredibly popular file format, and for good reasons. Authors want to control the look of their documents, embed fonts, and make them easy to send and print. But there are other reasons for the popularity:

  • Easy to send to clients, customers, and employees
  • “Locked” in that users can’t change the text (don’t want customers changing the price in an SOW)
  • Easy to write in Word, then convert (no special skill set needed)
  • Printable
  • Design formatting and fonts enable a consistent experience across platforms
  • Great for forms!

So PDF was popularized by organizations that wanted their users to print, not change, information like product features and benefits. The addition of forms (users could fill out and save information, then send it back to the author) extended that use.

PDFs have become so popular, they’re almost a religion.

PDFs are good and useful and wonderful. They are a solution to several challenges. We do not hate PDFs. However…

The Dark Side of PDFs

There are some pretty significant limitations to PDF in the world of policies, procedures, manuals, guides, and playbooks. Their ease of use has led human resources, compliance officers, franchise companies, and countless other organizations to succumb to the lure of a PDF in their pursuit of employee self-service. 

Just because MS Word can build a table to display forecasts and accounting items doesn’t mean it’s better for that purpose than Excel. Just because Photoshop can place text next to or on top of an image doesn’t make it OK to use that image on a web site.

There are at least seven major reasons why PDFs are less effective than HTML for long-form documentation like manuals and playbooks. Individual policies that have different audiences, or require version control, are also better suited to a more flexible format.

7 Deadly Sins of PDFs

We’ll list them, then we’ll explore them:

  1. PDFs are – by nature of their use – bad at version control
  2. Searching PDFs is difficult at best, misleading at worst
  3. PDFs behave like images (the layout is fixed), and they aren’t mobile-friendly
  4. PDFs do not scale well to multiple audiences
  5. Content re-use – a major consideration for those managing policy and process – is not possible with PDFs
  6. If accessibility is important to you, your clients, and/or your lawyers, you can do better than PDFs
  7. PDFs are heavier, slower, and require more storage than HTML

PDFs Bad for Version Control

One of the great selling points of the portable format is that you can make a PDF, then email it to a user or post it to your intranet for them to view (and download). Users download, ‘save as’, then use their hard drives as their reference point. When you go to version 2, you need to tell all of those people, and they need to delete the old file and save the new one.

And you need to do this for every single version of every single PDF you make. If you’re on an intranet/filesharing system, make sure to archive/delete the old and promote the new (see the section about searchability). 

Strong governance, small user base, and some technology can mitigate this problem, but other formats have proven better for long-form documentation.

PDFs Have Poor Searchability

Finding content in a forest of PDFs is difficult at best. Combined with the version control issue above, it is, for me, the deciding factor against using PDFs as the sole (or even main) output of manuals, guides, playbooks, or even stand-alone policies. There are two searchability challenges.

First, finding information on a specific topic within a directory of dozens of files is not always accurate. The search is keyword based, not topic based, so you’ll get every instance of each keyword, whether it’s relevant or not. Searching for “dress code”, for example, will yield every policy and manual with those keywords, even if the only mention is “All team members must follow the company dress code…”.

And if you have multiple versions of the dress code, or one dress code policy for front-line workers and another one for merit-based employees in an office…

The second search problem is that the results might give you the correct file, but “dress code” might be just one part of a 250-page Office Policy Manual. Once you get the right document, you have to search again for all instances of “dress code”, tabbing through all 32 of them before reaching what you’re looking for.

Depending on your intranet or fileshare, meta tags and good, consistent content operations can make this somewhat easier. There are far better ways to search for information.
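
To make the two search problems concrete, here is a minimal Python sketch of a keyword-based search. The file names and text are invented for illustration, and a real intranet or fileshare search will behave differently in the details; the point is only that a phrase match returns every file that mentions the phrase, then leaves you to hunt through the matches inside each one.

```python
# Hypothetical mini-corpus: file names and text are invented for illustration.
documents = {
    "dress-code-policy.pdf": (
        "Dress code policy. The dress code for office staff is business casual."
    ),
    "office-policy-manual.pdf": (
        "All team members must follow the company dress code. "
        "Parking passes are issued by HR."
    ),
    "onboarding-guide.pdf": (
        "On your first day, review the dress code with your manager."
    ),
    "expense-policy.pdf": "Submit receipts within 30 days.",
}


def keyword_search(query, docs):
    """Return {file name: occurrence count} for every file containing the phrase."""
    query = query.lower()
    hits = {}
    for name, text in docs.items():
        count = text.lower().count(query)
        if count:
            hits[name] = count
    return hits


results = keyword_search("dress code", documents)
for name, count in results.items():
    # Every file that mentions the phrase comes back, relevant or not.
    print(f"{name}: {count} match(es)")
```

Three of the four files come back, although only one of them is actually the dress code policy. A topic-based search would need extra information, such as the meta tags mentioned above, to tell them apart.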

PDFs Are Not Mobile-Friendly

A PDF has a fixed layout, much like an image, which is why you get a consistent look across viewing devices. Whether you use a browser or a PDF reader, you’ll see the same thing, unless your users are on a tablet or a phone. In that case, all of your text is shrunk down to fit the screen, causing the dreaded “pinch and zoom”, followed by the even more dreaded “scroll left and right in order to read”.

Web sites should not use PDFs in place of a web page, though it’s sometimes tempting to do so to make printing easier. If the primary use case is for users to download for local use, fine. Otherwise it’s a poor customer experience that will decrease conversion and effectiveness of your site.

PDFs also limit users who look up your information in the field on a tablet or mobile device.

Multiple Audiences, Multiple Documents

Remember that PDFs are fixed-layout files that behave like images. Now let’s use an example. A manager in Human Resources for an international company creates and publishes the Official Company Holiday Schedule each year. The company is international and provides 24/7/365 services to its customers, so there’s a Corporate Headquarters audience in addition to the front-line union members who have negotiated additional holidays.

This could get out of hand quickly:

  • Master holiday schedule with all official holidays
  • Corporate versions (one for U.S., one for Canada, one for the German subsidiary, each with different holidays)
  • Translated versions (English, French, German)
  • Front-line, union versions (one for each country)
  • Translated versions (x3)
  • Did you publish Official Holiday Schedule – HQ – US – English 2020 ver1.3 to the intranet?
  • Did you remove Official Holiday Schedule – HQ – US – English 2019 verFINAL from the list? If not, see items #1 (version control) and #2 (searchability)
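
A quick sketch shows how fast the list above multiplies. It assumes two audiences, three countries, and three languages, loosely based on the example; the file name pattern is made up for illustration and is not a recommended naming scheme.

```python
from itertools import product

# Assumed dimensions, loosely based on the holiday schedule example.
audiences = ["HQ", "Front-line"]
countries = ["US", "Canada", "Germany"]
languages = ["English", "French", "German"]
year = 2020

# One PDF per audience/country/language combination.
file_names = [
    f"Official Holiday Schedule - {audience} - {country} - {language} {year}.pdf"
    for audience, country, language in product(audiences, countries, languages)
]

print(len(file_names))  # 2 * 3 * 3 = 18
for name in file_names[:3]:
    print(name)
```

That is 18 PDFs to publish, plus the master schedule, before a single revision. Every correction or new year repeats the whole set.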

Ask any content operations manager and they’ll tell you: once it’s out there, changing information can be a never-ending process. Managing content at its source is critical – PDFs alone won’t do this.

Cat playing whack-a-mole

Live look at content managers finding all of the places where sales@ email address is used across manuals.

Limited Content Re-Use

Content re-use is the compound interest of the content world. Those who understand it will have efficient content operations; those who don’t are doomed to the costs, inefficiencies, and inaccuracies of content chaos.

Another example, this one from the franchise world. A franchise operations manual usually includes the rights and responsibilities with regard to trademarks and logos. This information is re-used in a Marketing Field Guide, an expanded guide that gives best practices and ideas from other units. A store development or grand opening playbook might also include this information as it applies to vendors and set-up. A simple “logo use approval process” is used in at least three different places.

It’s also in the training manual. And in the learning management system. Any change has to be made at the source, and if Word is your source, you cut and paste the change into every document, then create a new PDF for each.

This is not fun. It also opens you up to inconsistent and conflicting information. Hiring more people is the only way we know to overcome this without changing your source.
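
For contrast, here is a minimal sketch of what managing the snippet at its source could look like, using only Python's standard library. The snippet wording, document titles, and the `build_all` helper are all hypothetical; real content systems do this with their own tooling. The idea is that the “logo use approval process” is written once and pulled into every document that needs it.

```python
from string import Template

# Single source: the shared snippet is written in exactly one place.
# (The wording here is invented for illustration.)
snippets = {
    "logo_approval": "Submit logo artwork to Brand for approval before use.",
}

# Hypothetical documents that reference the snippet by name.
templates = {
    "Operations Manual": Template("Trademarks and logos\n$logo_approval"),
    "Marketing Field Guide": Template("Best practices\n$logo_approval"),
    "Grand Opening Playbook": Template("Vendors and set-up\n$logo_approval"),
}


def build_all(templates, snippets):
    """Render every document from the current snippet text."""
    return {title: tpl.substitute(snippets) for title, tpl in templates.items()}


outputs = build_all(templates, snippets)

# One edit at the source, then rebuild: every document picks up the change.
snippets["logo_approval"] = "Submit logo artwork to Brand at least 10 days before use."
outputs = build_all(templates, snippets)

for title, text in outputs.items():
    print(f"--- {title} ---\n{text}\n")
```

The rendered text could still be exported to PDF for printing; the difference is that the PDF is an output of the source rather than the place where edits are made.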

PDFs Don’t (Easily) Comply With Web Accessibility Standards

It’s not that PDFs can’t be accessible; it’s that there are a lot of challenges in making them that way. Most of those challenges come from their main use case: design-heavy documents, often with words placed over images, which screen readers cannot read.

If your docs are required to be Section 508 compliant, or to conform to the W3C’s accessibility guidelines (WCAG), you might have to be creative with your design, or change your output to something more easily maintained and searchable, like HTML.

PDFs Are 3x Heavier Than HTML

For web sites, especially ecommerce sites, speed is critical. Even a small increase in load time leads to a reduced conversion rate (fewer sales).

If you’re concerned about sustainability – a growing consideration for both external web sites and internal intranets/filesharing – PDFs contribute more to the increased need for storage capacity, which requires more energy to maintain. The amount of data stored has grown nearly tenfold in the last 5 years.

But it isn’t just file size that makes the PDF footprint larger; it’s how they’re used. When a new version of a policy or document is created, the usual convention is to create a new version number. There are reasons for this, but it’s common for there to be versions 1.0, 1.1, 2.1-2.5, and -2020-v.3.1-FINAL.pdf.

The result: version problems, user confusion, inability to search, and storage capacity issues.

WOW! You Really Hate PDFs!

I don’t, I really don’t. But when your only tool is a hammer then every problem is a nail. If you write long-form content (manuals, playbooks, guides), if you have multiple channels (web, print, app), or if you have several audiences (country-, state-, or user-specific versions) then PDFs create more challenges than they solve.

PDF is an output; the content should be managed outside of that format. Knowing your content types, frequency of change, and number of audiences goes into determining the best way to source your content, and there are better ways of delivering answers to your users than a PDF.
