In this article, we will show you how to extract hyperlinks from a PDF. Simply follow the steps below!
To extract links from a PDF, we will use Lido, a spreadsheet app designed to simplify and automate tasks. You can create a free account using this link: https://www.lido.app/go/signup.
Here's how to extract links from your PDF using Lido's custom formula, IMPORTPDF:
Log into your Google Drive account and upload the PDF by clicking on "New" and then "File upload." This step ensures that Lido can access your file online.
Log into your Lido account and navigate to the Files page. Click "New file" to create a new spreadsheet for organizing the data extracted from your PDF.
In your Lido spreadsheet, click the plus (+) icon to add a new worksheet.
In cell A1 of the new worksheet, type "=IMPORTPDF(" (without the quotation marks).
Click "Add Credential" and follow the instructions to link the Google account where you uploaded the PDF. This gives Lido permission to read the file, so complete every step of the prompt before continuing.
After linking your Google account, type a comma to move to the next argument of the formula. Click "Select a file" to browse your Google Drive files.
Navigate through your Google Drive and select the PDF you uploaded. This will link the chosen PDF directly to the formula in your spreadsheet.
Finish the formula by typing ",Sheet1!B2)" to specify where the extracted data should be placed, starting at cell B2 in Sheet1. Press ENTER to finalize and apply the formula.
Right-click on cell A1 and select "Run action" from the context menu. This will run the IMPORTPDF formula and start extracting data from your PDF.
Switch to Sheet1 and review the extracted data, which starts at cell B2. Check that the text from your PDF appears in the cells as expected.
Select the range of cells containing the extracted text in your Lido spreadsheet. Right-click and choose "Copy" or use the keyboard shortcut (Ctrl+C or Cmd+C).
Open your Google Sheets document. Select the cell where you want to start pasting the data. Right-click and choose "Paste" or use the keyboard shortcut (Ctrl+V or Cmd+V).
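After pasting the text into Google Sheets, you can find the hyperlinks by matching a URL pattern against every cell in the sheet. Start by adding a new worksheet: click the plus (+) icon at the bottom-left corner.
In cell A1 of the new worksheet (Sheet2), enter the following formula (without the outer quotation marks): "=UNIQUE(FILTER(FLATTEN(Sheet1!A:Z), REGEXMATCH(FLATTEN(Sheet1!A:Z), "(https?://[^\s]+)")))". This assumes the pasted data is in columns A to Z of Sheet1. FLATTEN turns the range into a single column, FILTER with REGEXMATCH keeps only the cells that contain text starting with "http://" or "https://", and UNIQUE removes duplicates. Note that the result is the full text of each matching cell, so a cell that holds a sentence with a link in it is listed in its entirety.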
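To see what the URL pattern in the final step matches, here is a minimal Python sketch that applies the same regular expression to a list of cell values. It is an illustration only, not part of Lido or Google Sheets: the function name unique_urls and the sample cells are hypothetical. Unlike the spreadsheet formula, it returns only the URL text rather than the whole cell.

```python
import re

# Same pattern as in the spreadsheet formula: "http://" or "https://"
# followed by one or more non-whitespace characters.
URL_PATTERN = re.compile(r"https?://[^\s]+")


def unique_urls(cells):
    """Return the URLs found in the given cell values, without duplicates,
    in the order they first appear. (Hypothetical helper for illustration.)"""
    seen = {}
    for cell in cells:
        # Skip empty or numeric cells; only text can contain a URL.
        if not isinstance(cell, str):
            continue
        for url in URL_PATTERN.findall(cell):
            seen.setdefault(url, None)  # dict keys preserve insertion order
    return list(seen)


if __name__ == "__main__":
    # Made-up sample data standing in for the cells pasted into Sheet1.
    sample_cells = [
        "Sign up at https://www.lido.app/go/signup today",
        "No link in this cell",
        42,
        "https://www.lido.app/go/signup and http://example.com/a",
    ]
    for url in unique_urls(sample_cells):
        print(url)
```

Because the pattern stops only at whitespace, a URL that ends a sentence will keep its trailing punctuation (for example, a final period). If that matters for your PDF, trim those characters after extraction.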
We hope you now know how to extract all links from a PDF.