PDF documents may contain information that the author wants to remove before publication or before sharing with specific target groups. PDF editors (e.g., Linux/Xournal) and graphics programs (e.g., Linux/GIMP) allow “blacking out” such information, but PDF files with vectorized content keep the hidden information, e.g. text, underneath the black bar, where it can still be selected, copied, pasted into another document, and processed further. I therefore describe the following steps to remove the hidden information: black it out, then convert the file from a vectorized (text-based) to a rasterized (pixel-based, image) format.
- Linux/Xournal > menu > Tools
- > Pen
- > Colour > Black
- > Pen Options > Very thick
- Draw over the passages to hide, then File > Export to PDF
- Linux/Shell/ImageMagick
- Single file: convert -density 150 in.pdf out.pdf
- Batch of files: mogrify -density 150 *.pdf (overwrites the original files in place, so work on copies)
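Because mogrify replaces the input files, a loop around convert can keep the originals. The following is only a minimal sketch under my own assumptions: the function names raster_name and rasterize_all and the "_raster" suffix are made up for illustration, and the density of 150 is taken from the commands above.

```shell
#!/bin/sh
# Hypothetical helper: derive the output name "name_raster.pdf" from "name.pdf".
raster_name() {
    printf '%s_raster.pdf\n' "${1%.pdf}"
}

# Hypothetical helper: rasterize every PDF in the current directory
# and write the result to a new file, leaving the original untouched.
rasterize_all() {
    for f in *.pdf; do
        [ -e "$f" ] || continue                      # no PDF found: the glob stayed literal
        case "$f" in *_raster.pdf) continue ;; esac  # skip outputs of an earlier run
        convert -density 150 "$f" "$(raster_name "$f")"
    done
}
```

Calling rasterize_all in a directory with in.pdf should produce in_raster.pdf next to it. If poppler-utils happens to be installed, pdftotext in_raster.pdf - is one way to check that no selectable text is left in the result.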
Note:
If the ImageMagick commands convert or mogrify fail with a “not authorized” error, the security policy for PDF files in the ImageMagick configuration file has to be adjusted [1].
Mini-Tutorial:
- sudo emacs /etc/ImageMagick-6/policy.xml
- Change policy domain="coder" rights="none" pattern="PDF" to policy domain="coder" rights="read|write" pattern="PDF"
- Apply the same change to other file types, e.g. PS and EPS, as needed.
- Save file (Ctrl-x Ctrl-s)
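The same change can probably be made without an editor. This is a sketch, not a tested recipe for every installation: the function name allow_pdf_coder is made up, and it assumes the policy line is written exactly as quoted above, on one line and with this attribute order.

```shell
#!/bin/sh
# Hypothetical helper: allow reading and writing PDFs in an ImageMagick policy file.
# $1 is the path to policy.xml; sed keeps a backup next to it with the suffix .bak.
allow_pdf_coder() {
    sed -E -i.bak \
        's/(<policy domain="coder" rights=")none(" pattern="PDF")/\1read|write\2/' \
        "$1"
}
```

Since /etc/ImageMagick-6/policy.xml belongs to root, the function has to run in a root shell. Trying it on a copy of the file first and comparing the result with the .bak backup is the safer route.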
Reference:
[1] https://stackoverflow.com/questions/42928765/convertnot-authorized-aaaa-error-constitute-c-readimage-453