Blacking out and removing hidden information in PDFs by conversion to raster format

PDF documents may contain information that an author wants to remove before publishing them or sharing them with specific target groups. PDF editors (e.g., Linux/Xournal) and graphics programs (e.g., Linux/GIMP) allow “blacking out” information, but PDF files with vectorized content keep the hidden information underneath: text, for example, can still be selected, copied into another document, and processed further. I therefore describe the following steps, which remove the hidden information by blacking it out and then converting the file from a vectorized (text-based) to a rasterized (pixel-based, image) format.

  1. Linux/Xournal > menu > Tools
    1. > Pen
    2. > Colour > Black
    3. > Pen Options > Very thick
    4. Draw over the information to be hidden, then File > Export to PDF
  2. Linux/Shell/ImageMagick
    1. Single file: convert -density 150 in.pdf out.pdf
    2. Batch of files: mogrify -density 150 *.pdf (overwrites the original files in place, so work on copies)
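
The batch step can also be sketched as a small shell function that writes the rasterized copies into a separate directory, so the originals stay untouched. This is only a sketch under assumptions: the function name rasterize_pdfs and its arguments (source directory, output directory, optional density) are made up for illustration, and it simply calls convert with the same -density option as above.

```shell
#!/usr/bin/env bash
# Rasterize every PDF in a source directory into a separate output directory.
# Unlike mogrify, this leaves the original files untouched.
# rasterize_pdfs, src_dir, out_dir and density are hypothetical names.
rasterize_pdfs() {
  local src_dir="$1" out_dir="$2" density="${3:-150}"
  local f
  mkdir -p "$out_dir" || return 1
  for f in "$src_dir"/*.pdf; do
    [ -e "$f" ] || continue   # skip if the glob matched no file
    # Same call as the single-file step: convert -density 150 in.pdf out.pdf
    convert -density "$density" "$f" "$out_dir/$(basename "$f")" || return 1
  done
}
```

A call could look like: rasterize_pdfs ./blacked-out ./rasterized 150. Afterwards it is worth opening one output file and checking that the text can no longer be selected.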

Note:
If the ImageMagick commands convert or mogrify give a “not authorized” error, the security policy in the ImageMagick configuration file has to be adjusted [1].
Mini-Tutorial:

  1. sudo emacs /etc/ImageMagick-6/policy.xml
  2. Change <policy domain="coder" rights="none" pattern="PDF" /> to <policy domain="coder" rights="read|write" pattern="PDF" />
  3. Apply the same change to other file types, e.g. PS and EPS, as needed.
  4. Save file (Ctrl-x Ctrl-s)
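
For those who prefer not to edit the file by hand, the same change can be sketched with sed. This is a rough sketch, not a tested recipe for every system: enable_coders is a hypothetical helper name, the policy file path may differ between ImageMagick versions, and the pattern assumes the policy lines are written exactly as rights="none" pattern="PDF".

```shell
#!/usr/bin/env bash
# Switch the given coders from rights="none" to rights="read|write"
# in an ImageMagick policy file. A backup is kept as <file>.bak.
# enable_coders is a hypothetical helper name for this sketch.
enable_coders() {
  local policy_file="$1"; shift
  local coder tmp
  cp "$policy_file" "$policy_file.bak" || return 1
  for coder in "$@"; do
    tmp=$(mktemp) || return 1
    # Only lines with exactly this coder name are changed (PS does not match EPS).
    sed "s/rights=\"none\" pattern=\"$coder\"/rights=\"read|write\" pattern=\"$coder\"/" \
      "$policy_file" > "$tmp" && cat "$tmp" > "$policy_file"
    rm -f "$tmp"
  done
}
```

In a root shell this could be called as: enable_coders /etc/ImageMagick-6/policy.xml PDF PS EPS. Trying it on a copy of the file first is the safer route.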

Reference:
[1] https://stackoverflow.com/questions/42928765/convertnot-authorized-aaaa-error-constitute-c-readimage-453

http://wilmarigl.de
