This post is for analysts who (1) would rather not copy and paste balance sheets and income statements from PDF files when they start building a model, and (2) would like to pull the principal topics out of wordy sections.

Nothing beats practical examples for me, so the rest of the post applies various Python packages to PNB’s publicly available annual reports.

Part 1. Extracting financials from PDF to CSV files

Step 1: Save all the relevant annual reports in the same folder as the Jupyter notebook.

In my case, I saved five different annual reports ahead of time.
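Before going further, it can help to confirm that the notebook actually sees the saved reports. The snippet below is a minimal sketch that uses only the standard library. The helper name `list_annual_reports` is my own for illustration, and it assumes the reports sit directly in the notebook's working directory with a `.pdf` extension.

```python
from pathlib import Path


def list_annual_reports(folder="."):
    """Return the PDF files found directly in `folder`, sorted by file name.

    `folder` defaults to the current working directory, which is assumed
    to be the folder that holds the Jupyter notebook.
    """
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"  # matches .pdf and .PDF
    )


reports = list_annual_reports()
print(f"Found {len(reports)} PDF file(s)")
for report in reports:
    print(report.name)
```

If the count does not match the number of reports you saved, the notebook is probably running from a different folder than you expect.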

