The Problem With PDF

The one accessibility area I don’t like and avoid working on is PDF files. Frankly, I find the process people have to go through to make a PDF accessible far too complex and far too antiquated.

It is well known in the accessibility field that PDF files can be made accessible using a multi-step process. However, that process makes a lot of assumptions about what the person creating the file knows. People in the access tech space say all the time that if you start with a good Word doc and then save it as a PDF, you have done much of the work. Well, this statement already has a few problems. For example, what is a good, well-structured Word document? And what if you don’t have access to Microsoft Word and use another word processor instead? Also, it’s not really that easy: you still need to check a few things in your PDF software just to be sure.

Creating a well-structured Word doc

Let’s start by breaking down the idea of a well-structured Word document. What is that and how do you create it?
When you have content that you want to make into a PDF, the first thing to do is generate the content. Type or dictate the text into an MS Word file and take a few steps to make sure the end PDF is accessible. These steps are as follows.

  1. Use proper styles to format your document.
    Do not use colors and/or font changes to show a heading; instead create actual headings using the built-in Heading 1 through Heading 6 styles.
    It turns out that doing this is actually very easy, if you even know what I am talking about. If you’re a keyboard-only user, just highlight the text you want to be a heading and press Ctrl+Alt+1 through Ctrl+Alt+3 to apply Heading 1 through Heading 3; by default Word only assigns shortcuts to the first three levels, so pick deeper levels from the styles list. If you are mouse dependent, the styles are found in the Styles gallery on the Home tab: just highlight the text and click the proper heading in the styles list.
  2. If you have tables be sure to mark up the headers on the table. You do this in the table properties.
    This of course assumes that the author knows what table headers are and how to mark them up. I find that a lot of people do not understand how to create a well-marked-up table, and I have heard people say that the headers on a table sometimes do not convert to the right thing when the PDF is generated. I may do another post about tables later.
  3. Make sure to use proper list styles and then make ordered or unordered lists as needed.
    But again, what does that mean and how do you create these lists? Word does make some of this process easy. If you start a list item by typing the number one followed by a period and a space, Word will automatically create a new list item each time you hit Return. Don’t forget to stop numbering the list by hitting Enter twice when you’re done.
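
If you are comfortable with a little scripting, you can get a rough sense of whether a Word file actually uses styles before you convert it. The sketch below is only an illustration, not a Microsoft tool. It relies on the fact that a .docx file is a zip archive containing word/document.xml, and it assumes English style IDs such as Heading1, which may differ in other language versions of Word. The function name count_structure is made up for this example. It counts paragraphs styled as headings and paragraphs with list numbering applied directly to them; it cannot tell you whether the headings are the right ones, and it will miss lists whose numbering comes only from a style.

```python
import xml.etree.ElementTree as ET
import zipfile

# Namespace used by WordprocessingML elements inside word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def count_structure(docx_file):
    """Count heading-styled and list-numbered paragraphs in a .docx file.

    docx_file may be a path or a file-like object. This is a rough
    heuristic: it assumes English style IDs that start with "Heading".
    """
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    counts = {"headings": 0, "list_items": 0, "paragraphs": 0}
    for para in root.iter(f"{{{W_NS}}}p"):
        counts["paragraphs"] += 1

        # A styled paragraph carries <w:pPr><w:pStyle w:val="Heading1"/>.
        style = para.find(f"{{{W_NS}}}pPr/{{{W_NS}}}pStyle")
        style_id = style.get(f"{{{W_NS}}}val", "") if style is not None else ""
        if style_id.lower().startswith("heading"):
            counts["headings"] += 1

        # A real list item carries <w:pPr><w:numPr> numbering properties.
        if para.find(f"{{{W_NS}}}pPr/{{{W_NS}}}numPr") is not None:
            counts["list_items"] += 1
    return counts
```

Calling count_structure on a file path (for example a hypothetical report.docx) returns a dictionary of counts. A long document that reports zero headings is a strong hint that someone used bold and font size instead of styles.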

You can even get Word to automatically create a table of contents and a lot of other good formatting if you actually know what you want to do. However, I have found that far too many people do not realize that what they are writing is a list or a heading. Often I see all of the style elements being imitated visually by people who have no idea how to make Word create them, and so often I see someone bolding something that should be styled as a heading because they don’t know what headings are.

I was once in a technical writing course that told the students to use font size and bold to visually create headings. When I asked why not use styles, the instructor had no idea what I was talking about. The whole idea of the class was to create better documents, so the fact that the instructor did not know anything about styles concerned me.

Microsoft can help you here too. You can use the built-in accessibility checker in any of the Office 365 products to help make sure what you are creating is accessible. These accessibility checkers walk you through the process of making sure your documents are accessible, so be sure to use them and create the best Word file you can.

Too Many Preconceptions

I am so fed up with people who teach about accessible PDFs telling people “It’s easy.” I really do not think it is easy, and the people who are tasked with creating the PDF may not have the prerequisite knowledge to even get this far. Most people do not even understand that not all PDFs are accessible, let alone that they might need to take PDF remediation classes.

The Next Step in the PDF Journey

Let’s now talk about what happens once you do have this strange thing called a well-structured Word file. So now what? The people who say it’s easy just say save the file as a PDF. Well, what does save mean anyway? This step is where many PDFs go to die. So many PDFs are created in Word and then the user prints the file as a PDF and voilà, all that work you did to apply styles and so on just disappears. Yes, most of you, my readers, know that print to PDF is not the same as save as PDF, but many people do NOT know this. They see print to PDF and think that is the right choice, and voilà, the file is now just an image of text. It may be obvious to some people, but please save as a PDF; don’t print to PDF.

Microsoft has done a lot over the years to help with this, and I think even printing to PDF is now likely to create a more accessible file, but do not count on it. After all, Microsoft is not in the PDF business, and ultimately it’s not their fault when the output file type is so horrible.

Now it’s time to create a tagged PDF

OK, so you have now done everything properly, and you have a perfectly accessible and usable document. If only it could end here. However, you were told it has to be a PDF, so here we go. Once the file is a PDF, you need to open it in a PDF creation application like Acrobat Pro and go through all the elements that Microsoft created for you when it converted the styles to PDF tags. But what is a tag anyway, and what happens if tags are wrong or missing? Well, my experience is that at least a few tags are wrong and/or missing even after someone has done the work in Acrobat. If the document is somewhat complicated, good luck, because you will need to check the reading order and modify it if it’s wrong. And oh, hey, Acrobat also has an accessibility checker that can walk you through this process, which for some reason always takes more time than it does in Word. However, it is my experience that something always gets missed. Here are only a few of the things that I have found when reviewing an “accessible PDF”:

  • Words and or lines run together for the screenreader.
  • The text appears in the PDF but the screenreader can’t read it.
  • The text is read in a strange order, including text merged from more than one column.
  • Tables are not presented as tables and therefore the relationship of the data in a table is not easy to interpret.
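
A very rough way to see whether a PDF carries any tags at all is to look for the markers a tagged PDF normally contains. The sketch below is an assumption-laden heuristic, not a replacement for Acrobat’s checker or a test with a real screenreader. It searches the raw bytes of the file for a /StructTreeRoot entry and a /Marked true flag, which the PDF specification uses to signal a tagged file. Many PDFs store these entries inside compressed object streams, so a “no” from this check can be wrong, and a “yes” says nothing about whether the tags are correct. The function name looks_tagged is made up for this example.

```python
import re


def looks_tagged(pdf_bytes):
    """Return True if the raw PDF bytes contain the usual tagged-PDF markers.

    Heuristic only: entries hidden in compressed object streams are not
    seen, so False does not prove the file is untagged.
    """
    # The structure tree root holds the tag hierarchy of a tagged PDF.
    has_struct_tree = b"/StructTreeRoot" in pdf_bytes
    # The MarkInfo dictionary sets /Marked true for tagged documents.
    is_marked = re.search(rb"/Marked\s+true", pdf_bytes) is not None
    return has_struct_tree and is_marked
```

To try it on a file, read the file in binary mode, for example with open(path, "rb"), and pass the resulting bytes to looks_tagged. Treat the answer as a first hint only; the problems in the list above can all show up in a file that passes this check.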

There is much more that can go wrong, but these are some of the most common problems I see. But, you ask, why is this a problem? Why can’t you use other tools to fix these problems yourself? You’re a technologist, aren’t you? And hey, nowadays screenreaders have built-in optical character recognition to help with some of this stuff. Well, yes, but a lot of people may not be able to afford the training needed to learn how to use these features, and/or do not know that they can get training on using these tools. After all, knowing that something exists is sometimes the answer to the problem, and many newly blinded people may not even know about the built-in tools that could help them. I would argue that they should not have to pay to learn this stuff. After all, most sighted readers don’t need to take extra steps like converting a document and saving it in another format to read these files, so why should blind people need to take all these extra steps to read the same file? Here is a scenario I had myself.

Just one of thousands of Inaccessible PDFs

Today most medical practitioners ask you to use an app for everything from post-visit notes to a referral to another doctor. I just got such a document, and here is what happened.

I went to my online app and found the referral document. When I opened the message about the referral, I swiped through the screen and only found the title of the file and a save button. I found out later that the text of the file was on screen, but the screenreader was not able to even focus on the text and therefore could not read it. OK, so I took the next logical step and clicked the save button. The app then gave me a warning that saving the file might expose my health data to bad actors, since it would no longer be in the secure medical program. I said OK and then tried to open the file on my phone, but still got nothing. I then tried to share the file with one of my AI image tools, but still no text. I then moved the file off my phone to my PC, and again the file appeared to have no text. Now I was getting pretty upset. Where was the text telling me what my referral was and how to use it? I tried quite a few tools, like my screenreader’s built-in optical character recognition and my other image recognition apps, but nothing.

This example is not a one-off; this kind of stuff happens to me all the time. It is very frustrating and even demoralizing, because I start to wonder if it’s just me not knowing how to read this dumb file. I did show the file to two people in the end, and it turns out that the file was empty. Somehow the medical platform did not save the text when it saved the file, so no one could have read this file. But it was too late. The damage was done, and this whole thing just got me very upset.

My ask and a rant

I get very frustrated with people who try to tell me I am wrong and that PDFs are not the problem, because frankly the ones saying this are always people who make a living teaching PDF remediation and/or creating PDFs for others. They do not live with the inaccessible documents I have in my everyday life. It’s hard to always have to fight to read something, and I am telling you we need to solve this problem now. Again, it’s unfair that when I face a problem it’s my job to find a fix, but hey, if I don’t do it then the problem just keeps growing. I do hate PDFs and do not want to spend much more of my time helping fix them, but if I can do a little to make sure the problems are fixed, I will, because my blind friends in this world deserve better.

What do you think? Should I keep pushing to remove PDFs? Or should we work to fix this broken standard and make it something that works out of the box for everyone?

By Lucy Greco

Lucy is a technology enthusiast who is passionate about getting people with disabilities the best access to the same technology as their able-bodied peers.