Extract Invoice Data from a PDF (Easiest Way in 2024)

In this article, we will show you how to automate PDF invoice extraction using the Lido app. Simply follow the steps below!

How to Extract Invoice Data from a PDF

To extract invoice data from a PDF efficiently, we'll use Lido, a tool designed to streamline and automate repetitive tasks in spreadsheets. You can register for a free account at Lido here: https://www.lido.app/go/signup.

Method 1: PDF Importer Tool

Here's how to extract invoice data from a PDF using Lido's PDF importer tool: 

Step 1: Create a New Spreadsheet

Log into your Lido account, navigate to the Files page, and click "New file" to start a new spreadsheet. This will help you organize and analyze the data extracted from your PDF invoices.

Step 2: Use the PDF Importer Tool

In your new spreadsheet, go to the "File" menu at the top. Select "Import from PDF" from the dropdown menu to convert the PDF data of your invoices into a structured spreadsheet format.

Step 3: Upload Your PDF Invoice

Click on "Click to Upload" in the file importer tool and either select the PDF invoice from your computer or drag and drop the file into the uploader.

Step 4: Select and Extract Your Invoice Data

Once the PDF is uploaded, an interface will appear. Use it to select the specific area of the invoice you want to extract data from.

Adjust the selection box by dragging the blue corners to cover all necessary parts of the invoice, then click "Extract data" to start the extraction process.

Step 5: Review and Insert Extracted Data

Check that the data in the new window is complete and correctly extracted.

If the selected area contains only plain text, each line is placed in its own cell. Tabular data is formatted as tables. If the area contains both, only the tables are extracted and the plain text is ignored.
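To make the rule above concrete, here is a minimal Python sketch of how a selected region might be turned into cells. It is only an illustration of the behavior described in this step, not Lido's actual code: the function name layout_extracted and the idea of stacking text lines down a single column are assumptions.

```python
from typing import List

Grid = List[List[str]]  # rows of cells


def layout_extracted(text_lines: List[str], tables: List[Grid]) -> Grid:
    """Hypothetical model of how a selected region becomes spreadsheet cells."""
    if tables:
        # Tables take priority: plain text in the same selection is ignored.
        grid: Grid = []
        for table in tables:
            grid.extend([list(row) for row in table])
        return grid
    # Text only: each line gets its own cell (assumed: one cell per row).
    return [[line] for line in text_lines]


text = ["Invoice #1001", "Due: 2024-05-01"]
table = [["Item", "Qty", "Price"], ["Widget", "2", "9.50"]]

print(layout_extracted(text, []))       # text-only selection
print(layout_extracted(text, [table]))  # mixed selection: only the table survives
```

If your selection mixes a header block with a line-item table, this is why the header text goes missing. Select the text and the table in two separate passes instead.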

After verifying the data, click "Insert at active cell" to add it to your spreadsheet. If more data needs to be extracted, click "Back" to adjust and repeat the extraction process.

Once inserted, the extracted data appears in your spreadsheet at the active cell.
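Before clicking "Insert at active cell", it helps to know which cells the data will occupy so that you don't overwrite existing content. The Python sketch below works this out under the assumption that the data fills to the right and downward from the active cell. The helper names (col_to_index, index_to_col, place_at) are hypothetical and are not part of Lido.

```python
import re
from typing import Dict, List


def col_to_index(col: str) -> int:
    """Convert a column label such as 'A' or 'AA' to a 1-based index."""
    n = 0
    for ch in col.upper():
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n


def index_to_col(n: int) -> str:
    """Convert a 1-based column index back to a label."""
    letters = ""
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def place_at(active_cell: str, grid: List[List[str]]) -> Dict[str, str]:
    """Map each extracted value to the cell it would land in (assumed fill order)."""
    match = re.fullmatch(r"([A-Za-z]+)([1-9][0-9]*)", active_cell)
    if match is None:
        raise ValueError(f"not a cell reference: {active_cell!r}")
    start_col = col_to_index(match.group(1))
    start_row = int(match.group(2))
    cells: Dict[str, str] = {}
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            cells[f"{index_to_col(start_col + c)}{start_row + r}"] = value
    return cells


extracted = [["Item", "Qty"], ["Widget", "2"]]
print(place_at("B2", extracted))
```

With the active cell at B2, a 2 x 2 table would occupy B2 through C3, so make sure that range is empty before inserting.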

Method 2: Using the EXTRACTTABLESFROMPDF Formula

Below, we explain how to use Lido's custom formula, EXTRACTTABLESFROMPDF, to extract tabular invoice data from a PDF file: 

Step 1: Upload Your PDF Invoice to Google Drive

Sign in to Google Drive and upload the PDF invoice you want to extract data from by clicking on "New" then "File upload."

Step 2: Create a New Spreadsheet in Lido

Go to the Files page in your Lido account and click "New file" at the top right to create a new spreadsheet.

Step 3: Insert a New Worksheet

Next to the default sheet, click the plus (+) icon at the top left corner to add a new worksheet. The formula goes in this new worksheet, and the extracted data is written to Sheet1.

Step 4: Input the EXTRACTTABLESFROMPDF Formula

In the new worksheet, go to cell A1 and type "=EXTRACTTABLESFROMPDF(".

Step 5: Link Your Google Drive to Lido

Click "Add Credential" and then "Connect to Google Drive" to link your Google Drive with Lido. Just follow the on-screen steps and click on "Continue" to complete the setup and grant Lido access to your Google Drive.

Step 6: Open the File Dialog

Type a comma to move to the next argument of the formula, then click "Select a file" to open the file selector.

Step 7: Select the Uploaded PDF Invoice

Navigate through Google Drive and select the PDF invoice you uploaded. This will link your PDF to the formula.

Step 8: Complete the Formula

Finish the formula by typing ",Sheet1!B2)" to specify that the extracted data should start at cell B2 in Sheet1. Press ENTER to apply the formula.
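Putting Steps 4 through 8 together, the finished formula has three arguments: the Google Drive credential, the selected file, and the destination cell. The Python sketch below only assembles that text so you can see its overall shape. The argument order is inferred from the steps above, and the placeholder values <credential> and <selected file> stand in for whatever Lido inserts when you add the credential and pick the file. The function build_formula is hypothetical.

```python
import re


def build_formula(credential: str, pdf_file: str, destination: str) -> str:
    """Assemble the formula text; the destination must look like Sheet1!B2."""
    if not re.fullmatch(r"[^!]+![A-Za-z]+[1-9][0-9]*", destination):
        raise ValueError(f"expected a reference like Sheet1!B2, got {destination!r}")
    return f"=EXTRACTTABLESFROMPDF({credential},{pdf_file},{destination})"


# Placeholders only: Lido fills in the first two arguments through its own dialogs.
print(build_formula("<credential>", "<selected file>", "Sheet1!B2"))
# prints: =EXTRACTTABLESFROMPDF(<credential>,<selected file>,Sheet1!B2)
```

The destination includes the sheet name because the formula sits in the new worksheet while the output goes to Sheet1.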

Step 9: Run the Formula for Extraction

Right-click cell A1 and choose "Run action" from the context menu. This runs the formula and extracts the table data from your PDF invoice into the spreadsheet.

Step 10: Verify the Extracted Invoice Data

Go to Sheet1 to check the extracted data. Confirm that the tables have been extracted accurately and represent the invoice content correctly. Note that only tabular data will be extracted.
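If you check many invoices, part of this review can be scripted on a copy of the extracted rows. The Python sketch below is a generic sanity check, not a Lido feature. It assumes you have the rows available as lists of strings, and the function name find_problems is hypothetical. It flags rows with a different number of columns than the first row, as well as empty cells, which are common signs of a misread table.

```python
from typing import List


def find_problems(rows: List[List[str]]) -> List[str]:
    """Return human-readable warnings about an extracted table."""
    if not rows:
        return ["no rows were extracted"]
    problems: List[str] = []
    width = len(rows[0])  # treat the first row as the header
    for i, row in enumerate(rows, start=1):
        if len(row) != width:
            problems.append(f"row {i}: expected {width} columns, found {len(row)}")
        for j, cell in enumerate(row, start=1):
            if not cell.strip():
                problems.append(f"row {i}, column {j}: empty cell")
    return problems


rows = [
    ["Item", "Qty", "Price"],
    ["Widget", "2", ""],   # missing price
    ["Gadget", "1"],       # missing column
]
for problem in find_problems(rows):
    print(problem)
```

A clean result only means the table has a consistent shape. You still need to compare the values against the original invoice.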

We hope that you now know how to extract invoice data from a PDF.