We distinguish 4 Types of PDF Statements
Bank statements you receive in PDF format may be one of the following PDF types.
Text-based PDFs
Most PDF statements that you download from a bank are text-based .pdfs. The text for each transaction is selectable, and you can copy and paste it into a document or spreadsheet. These statements may also contain images, such as logos, check images, and advertisements.
You can process text-based .pdfs with any of the PDF Convert products.
Image-based and scanned PDFs
Image-based .pdfs usually come from scanned paper statements and are often called “scanned .pdfs”. Scanning is like taking a picture: it creates one large image for each page in the statement. A scanned .pdf may look the same as a text-based .pdf, but you can only select a region of the page. You cannot select lines of text, so you cannot copy and paste content into another document. In addition, a small percentage of banks issue image-based .pdfs directly, where each page is composed of one or more images.
You can process image-based and scanned .pdfs with Convert+ or the PDF+ AddOn.
Searchable PDFs
Searchable .pdfs combine an image-based .pdf with a layer containing text characters. They are frequently created by running OCR (optical character recognition) on a scanned image and saving the recognized text. The OCR can be done by the scanner software or by another program. You can select text in a searchable .pdf and copy and paste it into another document. If the OCR was completely accurate, a searchable .pdf can be processed just like a text-based .pdf statement. However, because the text characters sit ‘under’ the image, the only way to validate their accuracy is to copy and paste them into another document and check them carefully. In the example below, the amount was incorrectly recognized, so the transaction would not be valid.
You can process searchable .pdfs with PDF Convert+ products, either as text-based or as image-based .pdfs, depending on the accuracy of the text.
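Once text has been pasted out of a searchable .pdf, a rough automated check can help point the manual review at likely OCR mistakes. The following is only a sketch under assumptions: the amount format (optional sign, optional "$", comma thousands separators, two decimals), the function name suspicious_amounts, and the sample values are all hypothetical and are not part of any PDF Convert product. It catches malformed amounts such as a letter in place of a digit, but it cannot detect one digit misread as another, so a careful visual check is still needed.

```python
import re

# Assumed amount format: optional sign, optional "$", digits with optional
# comma thousands separators, and exactly two decimal places ("-1,234.56").
AMOUNT_PATTERN = re.compile(r"[-+]?\$?(\d{1,3}(,\d{3})+|\d+)\.\d{2}")


def suspicious_amounts(amounts):
    """Return the amount strings that do not look like well-formed money values."""
    return [a for a in amounts if not AMOUNT_PATTERN.fullmatch(a.strip())]


# Hypothetical values pasted from a searchable .pdf; two of them contain
# typical OCR confusions (letter "S" for a digit, letter "l" for "1").
pasted = ["1,234.56", "45.00", "1,2S4.56", "l9.99", "-300.10"]
flagged = suspicious_amounts(pasted)
print(flagged)
```

If many amounts are flagged, that suggests the text layer is unreliable and the statement is better treated as image-based.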
Encrypted PDFs
Encrypted .pdfs are text-based .pdfs, but when you copy and paste text from them into another document, you get what looks like gibberish rather than the text you expect. This is because the text has been internally encrypted. Very few banks create this type of .pdf. Because the encryption is applied to the actual text characters in the statement, it is different from password protection and cannot be removed.
Encrypted .pdfs can be converted successfully by processing them as image-based .pdfs with Convert+ products or with the PDF+ AddOn.
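The four types described above can be told apart from two observations about a page: the text that can be extracted or copied from it, and whether the page is covered by a large image. The sketch below illustrates that decision logic only. It assumes you already have those two inputs from whatever PDF tool you use; the function names, the set of expected characters, and the 0.85 threshold are hypothetical choices for illustration, not how the PDF Convert products work.

```python
import string

# Characters expected in ordinary English-language statement text (an assumption).
EXPECTED_CHARS = set(
    string.ascii_letters + string.digits + string.whitespace + ".,-/$()#:*&'%+"
)


def looks_like_gibberish(text, threshold=0.85):
    """Return True when too few characters belong to the expected set."""
    if not text:
        return False
    expected = sum(1 for ch in text if ch in EXPECTED_CHARS)
    return expected / len(text) < threshold


def classify_page(text, has_full_page_image):
    """Map two observations about a page to one of the four statement types."""
    text = text.strip()
    if not text:
        return "image-based"   # nothing selectable: scanned or image-only page
    if looks_like_gibberish(text):
        return "encrypted"     # selectable text that pastes as gibberish
    if has_full_page_image:
        return "searchable"    # page image with a text layer underneath
    return "text-based"


print(classify_page("05/01 DEPOSIT 1,250.00", False))
print(classify_page("05/01 DEPOSIT 1,250.00", True))
print(classify_page("", True))
```

A character-ratio test is a crude stand-in for the gibberish check. Scrambled text that happens to use ordinary letters and digits would pass it, so a real check would likely need to look for recognizable dates, amounts, or words as well.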