Is My Document a Valid PDF? | PSPDFKit (2024)

PDF documents are a reliable way to represent and preserve information with high fidelity. And here on our blog, we’ve talked at length about some of the most complicated parts of the PDF spec that PSPDFKit allows you to take advantage of. In this entry, I thought I’d go back to basics and outline one of the first things we do when loading your document to display it on your users’ devices: check if the file in question is even a PDF document we can parse.

What Is a PDF?

We can answer this question from two points of view: a technical one, and a conceptual one. From the technical perspective, a PDF is a file format with a special syntax that can be read from and written to using a special kind of software. Conceptually, a PDF is a digital representation of some kind of data that’s important enough that we’d like its fidelity to be preserved when we move it from one container to another.

I believe it’s important to draw this distinction, because when asking if a PDF is valid, we need to also ask which perspective we’re interested in. After all, there’s nothing preventing us from having valid PDF syntax written in a faulty file. And in such a case, the logical part of the PDF is valid, while the file itself is not.

How Can a PDF Become Invalid?

Even in a really “simple” PDF, if the right measures are not taken when writing to it, the result can be a broken file from which data cannot be recovered.

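Here are some of the more common ways in which a PDF can be deemed invalid:

  • No Pages: A PDF is not valid if it does not contain information about pages that should be displayed.
  • Encryption: A PDF is considered invalid if it is encrypted, but it becomes valid when decrypted.
  • Missing Header: A file that lacks the %PDF- directive in its header can't be relied on to contain PDF syntax.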

The important thing to note is that the official PDF specification does not define explicit checks that software can use to decide whether a PDF is invalid. In the first section, Scope, it states:

This standard does not specify the following:

  • specific processes for converting paper or electronic documents to the PDF format;
  • specific technical design, user interface or implementation or operational details of rendering;
  • specific physical methods of storing these documents such as media and storage conditions;
  • methods for validating the conformance of PDF files or readers;
  • required computer hardware and/or operating system.

This leaves a gaping hole for PDF software vendors, and it requires that they use their best judgement to determine in which instances a PDF file can be considered invalid. In our case, within the context of PSPDFKit, we also deem a PDF invalid if it is encrypted, because you effectively can’t interact with it until it is unlocked.

Things can get even more complicated if we take other file format standards related to PDF into consideration. One example is PDF/A, another ISO standard that specializes in the archiving and long-term preservation of electronic documents.

PDF/A comprises a set of very specific rules for how data needs to be laid out to accomplish its goal. That adds a whole new level of complexity to determining whether or not a PDF/A file is valid. As a result, there are even specialized resources, such as the Isartor Test Suite and veraPDF, that provide tests that can be used as a starting point for creating validation software for this specific format.

Using PSPDFKit to Validate Documents

At PSPDFKit, we take a rather pragmatic approach to checking if we can work with a file as a PDF or not. Internally, PSPDFKit performs a series of checks to determine if a PDF is valid:

  1. Is this even a PDF? — We look for the %PDF- directive in the file header. If this is missing, we abort any subsequent operations, as we can’t rely on the file to contain PDF syntax.

  2. Is the file large enough to be a valid PDF? — We check the total file size to see if it is larger than the size of the header (%PDF) and the end-of-file marker (%%EOF) added together. If this test fails, the file is automatically deemed invalid.

  3. Do we have an end-of-file marker at all? — We’ll try to load the last 1,024 bytes of the file to look for an %%EOF marker. Not having an %%EOF marker makes the file an invalid PDF.

  4. Does the file contain more PDF syntax after %%EOF? — If this is the case, then we’re dealing with a malformed file, and trying to perform any other operations with it would be a waste of resources, so we say this case is also grounds to deem a PDF invalid.
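
The four checks above can be sketched in Swift roughly as follows. This is a minimal illustration under stated assumptions, not PSPDFKit's internal code: the names PDFPrecheckFailure and precheckPDF are hypothetical, the whole file is assumed to be in memory as Data, the header is assumed to start at the very first byte, and "more PDF syntax after %%EOF" is approximated as any non-whitespace byte following the last marker.

```swift
import Foundation

/// Reasons the structural pre-check can fail (hypothetical names).
enum PDFPrecheckFailure: Equatable {
    case missingHeader
    case tooSmall
    case missingEOFMarker
    case contentAfterEOF
}

let pdfHeader = Data("%PDF-".utf8)
let eofMarker = Data("%%EOF".utf8)

/// Returns nil when all four checks pass, otherwise the first failure.
func precheckPDF(_ data: Data) -> PDFPrecheckFailure? {
    // 1. Is this even a PDF? Assumption: the header sits at byte 0.
    guard data.starts(with: pdfHeader) else { return .missingHeader }

    // 2. Is the file larger than the header and end-of-file marker combined?
    guard data.count > pdfHeader.count + eofMarker.count else { return .tooSmall }

    // 3. Look for the last %%EOF marker within the final 1,024 bytes.
    let tailStart = max(data.startIndex, data.endIndex - 1024)
    guard let marker = data.range(of: eofMarker,
                                  options: .backwards,
                                  in: tailStart..<data.endIndex) else {
        return .missingEOFMarker
    }

    // 4. Assumption: anything other than whitespace after the marker
    //    counts as trailing content.
    let whitespace: Set<UInt8> = [0x00, 0x09, 0x0A, 0x0C, 0x0D, 0x20]
    let trailing = data[marker.upperBound..<data.endIndex]
    guard trailing.allSatisfy({ whitespace.contains($0) }) else {
        return .contentAfterEOF
    }
    return nil
}
```

Calling precheckPDF on the contents of a file returns nil when every check passes. Returning the first failure, rather than a plain Bool, keeps the reason available for logging or for an error message.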

From an end-user perspective, it’s really easy to see if a PDF is valid or not: if it is, you’ll see it displayed onscreen. If it’s not, you’ll see an error message instead.

If you’d like to do a manual check on a document before even attempting to present it, you can do so as follows:

let url = // Document URL
let document = PSPDFDocument(url: url)
// Check if the document is valid before continuing.
guard document.isValid else {
    // Perform appropriate cleanup actions.
    return
}

NSURL *url = // Document URL
PSPDFDocument *document = [[PSPDFDocument alloc] initWithURL:url];
// Check if the document is valid before continuing.
if (!document.isValid) {
    // Perform appropriate cleanup actions.
    return;
}

Calling PSPDFDocument.isValid will lazily load the document. If the document is valid and we were able to parse it correctly, then the document’s pages will be available to us.
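
To illustrate what lazy loading means here, the sketch below models the general pattern with a hypothetical LazyDocument class. It is not how PSPDFKit is implemented: the loader closure, the string-based pages, and the loadCount counter are all assumptions made for the example. Parsing is deferred until isValid or pages is first read, and the result is cached afterwards.

```swift
import Foundation

/// Hypothetical stand-in for a document type; not PSPDFKit's implementation.
final class LazyDocument {
    private let loader: () -> [String]?

    /// Number of times the loader has run (for demonstration only).
    private(set) var loadCount = 0

    init(loader: @escaping () -> [String]?) {
        self.loader = loader
    }

    /// Parsed pages, computed on first access and cached afterwards.
    /// A nil result models a file that could not be parsed.
    private lazy var parsedPages: [String]? = {
        self.loadCount += 1
        return self.loader()
    }()

    /// Reading this property triggers loading if it has not happened yet.
    var isValid: Bool { parsedPages != nil }

    /// Pages are available only when the document parsed correctly.
    var pages: [String] { parsedPages ?? [] }
}
```

Creating a LazyDocument does no work up front. The first read of isValid runs the loader once, and later reads of isValid or pages reuse the cached result.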

Conclusion

As we saw in this post, there are multiple aspects to consider when determining whether or not a PDF is valid. Given the broad field of applications the PDF format has, it can be very difficult to come to an agreement on what exactly constitutes a “valid” PDF.

At PSPDFKit, we interpret the PDF specification as closely as we can to make sure we can deliver the reliability that our customers expect from us. Nevertheless, as with many aspects of dealing with PDF technologies, this is an ongoing effort and we’ll always be looking to improve the ways in which we can provide the best experience possible.
