PDF and other document-wrangling tools on PythonAnywhere

So you need to read or write PDFs? Or maybe convert some HTML to PDF, or the other way round? Or read Word documents and extract text from them? Hopefully PythonAnywhere already has a tool preinstalled which can help.

This isn't a comprehensive guide, but here are a few pointers:

Python packages in our Batteries Included list

Try searching for "pdf" on the PythonAnywhere "Batteries Included" list of Python modules.

A couple of tips:

  • "pdftk" will only work in consoles, websites, always-on and scheduled tasks -- not from Jupyter notebooks or over SSH.

  • We also have "weasyprint" installed, which can render HTML and CSS to PDF.
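As a rough illustration of the HTML-to-PDF route, here is a minimal sketch using WeasyPrint's `HTML(...).write_pdf(...)` call. The function name `html_to_pdf` and the output filename are made up for this example, and the sketch assumes the preinstalled WeasyPrint version supports rendering from a string.

```python
def html_to_pdf(html_source, out_path):
    """Render an HTML string to a PDF file and return the output path.

    `html_to_pdf` is a hypothetical helper name, not part of WeasyPrint.
    """
    # Imported inside the function so this module still loads on a
    # machine where WeasyPrint is not installed.
    from weasyprint import HTML

    # HTML(string=...) parses the markup; write_pdf() renders it to disk.
    HTML(string=html_source).write_pdf(out_path)
    return out_path
```

You could try it from a Bash console with something like `html_to_pdf("<h1>Hello</h1><p>From PythonAnywhere</p>", "hello.pdf")` and then check that `hello.pdf` appears in your working directory.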

Preinstalled binaries

Open a Bash console and run:

ls /usr/bin/*pdf*

You should see a whole bunch of potentially useful binaries which you can call out to.
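If you would rather call one of those binaries from Python than from Bash, the standard `subprocess` module is one way to do it. The sketch below is only an example: it assumes `pdftotext` is among the binaries listed (check the `ls` output first), and the helper names `list_pdf_binaries` and `pdf_to_text` are invented for illustration.

```python
import glob
import os
import shutil
import subprocess


def list_pdf_binaries(directory="/usr/bin"):
    """Return the sorted paths in `directory` whose names contain 'pdf'."""
    return sorted(glob.glob(os.path.join(directory, "*pdf*")))


def pdf_to_text(pdf_path):
    """Extract text from a PDF by calling the pdftotext binary.

    Assumes pdftotext is installed; raises FileNotFoundError otherwise.
    """
    if shutil.which("pdftotext") is None:
        raise FileNotFoundError("pdftotext was not found on PATH")
    # A "-" as the output file tells pdftotext to write to stdout.
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Passing the command as a list, rather than as a single shell string, avoids quoting problems with filenames that contain spaces.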

We also have AbiWord installed, and it has some command-line options for converting Word documents and other formats. Check out this article about using AbiWord on the command line, for example.
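A conversion with AbiWord might be scripted along these lines. This is a sketch under assumptions: it relies on AbiWord's `--to=FORMAT` option, which is expected to write the converted file next to the input with the new extension, so it is worth confirming with `abiword --help` on your account. The helper names are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def build_abiword_command(source, target_format="pdf"):
    """Build the argument list for an AbiWord conversion (assumes --to=)."""
    return ["abiword", f"--to={target_format}", str(source)]


def expected_output_path(source, target_format="pdf"):
    """Path where AbiWord is assumed to write the converted file."""
    return Path(source).with_suffix(f".{target_format}")


def convert_document(source, target_format="pdf"):
    """Convert `source` with AbiWord and return the expected output path."""
    if shutil.which("abiword") is None:
        raise FileNotFoundError("abiword was not found on PATH")
    subprocess.run(build_abiword_command(source, target_format), check=True)
    return expected_output_path(source, target_format)
```

For instance, `convert_document("report.docx", "txt")` would run AbiWord on `report.docx` and return the path `report.txt`, assuming the conversion behaves as described above.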

If you find something useful, let us know! support@pythonanywhere.com