Blog: Example

Convert PDF to text (OCRmyPDF)

OCRmyPDF relies on Tesseract for its OCR (https://ocrmypdf.readthedocs.io/en/latest/languages.html), so you need Tesseract's language packs (http://tttthis.com/blog/convert-image-to-text-tesseract-ocr) to OCR languages other than English.

"OCRmyPDF that will add a text layer to a scanned PDF making it searchable"

FOSS

sudo apt-get install ocrmypdf

ocrmypdf input.pdf output.pdf

SPANISH (needed for Spanish characters; otherwise characters like ¿ aren't recognized and you can't copy-paste them)

ocrmypdf -l spa input.pdf output-spa.pdf
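
To run the same thing over a whole folder of scans, a small shell loop is enough. This is a minimal sketch: the function name ocr_dir, the ocr/ output folder and the OCR_CMD variable are made up for this example. Combining languages with + (as in eng+spa) is standard OCRmyPDF usage, and each language still needs its Tesseract pack installed (tesseract-ocr-spa on Debian/Ubuntu).

```shell
# OCR every PDF in a directory into an "ocr" subdirectory.
# OCR_CMD is a hypothetical override hook (not part of OCRmyPDF); it defaults to ocrmypdf.
ocr_dir() {
    lang=$1   # Tesseract language code(s), e.g. spa or eng+spa
    dir=$2    # directory containing the scanned PDFs
    mkdir -p "$dir/ocr" || return 1
    for f in "$dir"/*.pdf; do
        [ -e "$f" ] || continue   # skip the literal pattern when nothing matches
        ${OCR_CMD:-ocrmypdf} -l "$lang" "$f" "$dir/ocr/$(basename "$f")"
    done
}
```

Call it as ocr_dir eng+spa . to process the current folder; the originals are left untouched.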


TTTThis

Colorizing black and white photos (GIMP)

  • Make a gradient running from dark blue to light blue
  • Colors > Map > Gradient Map
  • (NOTE: you can also do this via Windows > Dockable Dialogs > Palettes: build a palette of colors running from dark to light, then use Colors > Map > Palette Map, which gives a wider range of colors)
  • (NOTE: or use Colors > Colorize instead)
  • Then right-click the blue layer > Add Layer Mask > Black (full transparency)
  • Now paint white on the mask with any tool (pencil, airbrush, rectangle select and fill) to bring that color in where you want it


Speech to text (and mp3?) (nerd-dictation)

https://news.ycombinator.com/item?id=29972579



Create a photo montage (ImageMagick)

montage -geometry +0+0 -tile 10x *.jpg result.jpg

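This creates a montage of all the JPGs in the folder, 10 images wide. With the row count left out of -tile, montage adds as many rows as it needs. To fix the height too, give both numbers (for example -tile 10x5); images beyond the grid then spill into additional output files.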
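
For reference, the number of rows the montage ends up with is the image count divided by the column count, rounded up. A minimal sketch of that arithmetic (the function name montage_rows is made up here, it is not an ImageMagick command):

```shell
# Rows needed to fit COUNT images into COLS columns (ceiling division).
montage_rows() {
    count=$1
    cols=$2
    echo $(( (count + cols - 1) / cols ))
}

montage_rows 37 10   # prints 4
```

So 37 JPGs at 10 per row give 4 rows, the last one partly empty.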

https://stackoverflow.com/questions/37709879/how-to-generate-a-collage-image-like-shown



Lighten, darken, increase contrast on (text) images, for readability (ImageMagick)

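Lighten or darken those pages, or stretch their contrast

  • Darken (cubes each pixel value): convert output.pdf -function polynomial 1,0,0,0 darkened.pdf
  • Stretch contrast (clips up to 2% of pixels to black and 20% to white): convert output.pdf -contrast-stretch 2%x20% music1C.pdf
  • Render each page as a 600 dpi JPG: convert -density 600 output.pdf output-%02d.jpg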

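(If ImageMagick refuses to read the PDF with a "not authorized" error, temporarily move its security policy file out of the way with mv, then put it back afterward: https://stackoverflow.com/questions/52861946/imagemagick-not-authorized-to-convert-pdf-to-an-image)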

sudo mv /etc/ImageMagick-6/policy.xml /etc/ImageMagick-6/policy.xml.off

When done, you can restore the original with

sudo mv /etc/ImageMagick-6/policy.xml.off /etc/ImageMagick-6/policy.xml

3-step process:

  • convert your_pdf_filename.pdf output-%02d.jpg
  • convert output*.jpg -level 25% final-%02d.jpg
  • convert final*.jpg very_readable.pdf

(adjust the -level value to taste)
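
The three steps can be wrapped into one function so the level is easy to try out. A sketch only: pdf_relevel and the IM_CONVERT variable are names invented here, the temp folder handling is my addition, and -density 300 is this sketch's choice (the steps above use the default resolution).

```shell
# Hypothetical wrapper around the three steps; IM_CONVERT is an override hook
# invented for this sketch and defaults to ImageMagick's convert.
pdf_relevel() {
    in=$1             # source PDF
    out=$2            # resulting PDF
    level=${3:-25%}   # value handed to -level
    work=$(mktemp -d) || return 1
    # 1) PDF pages -> JPGs, 2) re-level each JPG, 3) JPGs -> one PDF
    ${IM_CONVERT:-convert} -density 300 "$in" "$work/page-%03d.jpg" &&
    ${IM_CONVERT:-convert} "$work"/page-*.jpg -level "$level" "$work/final-%03d.jpg" &&
    ${IM_CONVERT:-convert} "$work"/final-*.jpg "$out"
    status=$?
    rm -rf "$work"    # clean up the intermediate JPGs either way
    return $status
}
```

For example, pdf_relevel your_pdf_filename.pdf very_readable.pdf 30% reruns the whole chain with a different level.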

With -threshold you get a pure black-and-white image (something like: convert output*.jpg -normalize -threshold 80% final-%02d.jpg). But I want to keep the grayscale, which is possible with -level: the grays are kept and only shifted darker or lighter.
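
To see the difference in numbers, here is a rough integer model of the two operators on a single gray value. It is not ImageMagick code: the function names are made up, gamma and rounding are ignored, and it assumes the documented behavior that -level with a single value uses the mirrored value as white point (so 25% means 25%,75%).

```shell
# Rough model of -threshold and -level on a gray value given in percent (0-100).
threshold_pct() {   # usage: threshold_pct VALUE CUTOFF
    if [ "$1" -gt "$2" ]; then echo 100; else echo 0; fi
}

level_pct() {       # usage: level_pct VALUE BLACK [WHITE]
    v=$1
    black=$2
    white=${3:-$((100 - black))}   # single-value -level mirrors it for the white point
    out=$(( (v - black) * 100 / (white - black) ))
    if [ "$out" -lt 0 ]; then out=0; fi
    if [ "$out" -gt 100 ]; then out=100; fi
    echo "$out"
}

for v in 10 40 60 90; do
    echo "gray $v% -> threshold 80%: $(threshold_pct "$v" 80)%, level 25%: $(level_pct "$v" 25)%"
done
```

In this model a 60% gray stays a gray (70%) under -level 25%, while -threshold 80% turns it fully black.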
