Extract Content from PDF (OCR)
With the Text To Table Converter Add-On, you can quickly extract text elements, tables, and even mathematical formulas (as LaTeX) directly from PDF files into your Google Docs™, Google Slides™, and Google Sheets™ documents.
You can process up to three PDF files at once, and the extracted content is inserted directly into your currently open Google Workspace™ document.
- Open the Extract Content from PDF tool: Navigate through the Google Workspace™ menu: Extensions > Text To Table Converter > 🪄 Extract Content from PDF.
- Grant File Access (If Prompted, Google Sheets™ & Slides™ only): If you are using Google Sheets™ or Google Slides™, a dialog may appear requesting permission for the Add-On to access the currently active file. Please confirm to proceed.
- Using the Extract Content from PDF Tool: The tool’s interface will guide you through the extraction:
  - Add Files: You can add up to 3 PDF files. Choose files from your Google Drive™ or upload them directly from your device.
  - Select Page: For each uploaded PDF, specify the page number from which you want to extract content.
  - Choose Content & Formatting:
    - Extract Text & Formulas: Check this to extract all non-tabular text. Mathematical equations are automatically detected and converted into standard LaTeX notation (e.g., $$E=mc^2$$).
    - Extract Tables: Check this to extract all tables found on the page.
    - Preserve Basic Formatting: When enabled, this option retains simple styles like bold, italics, headings, and lists. When disabled, all content is formatted as plain text.
  - Extract: Once you’ve configured your files and options, click the ‘Extract’ button. The Add-On will process your PDFs, and the extracted content will be inserted into your document.
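
As an illustration of the LaTeX output described under Extract Text & Formulas, a page containing the quadratic formula might be inserted in a form like the following. This is a hypothetical example, not output captured from the Add-On, and the exact spacing and delimiters it produces may differ.

```latex
$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$
```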