[Trisquel-users] Re : finding particular pages within PDFs

magicbanana at gmail.com magicbanana at gmail.com
Fri Aug 29 04:25:18 CEST 2014

You could first split the PDFs into individual pages ('pdfjam' can do that)  
and put them in a "pages" directory, enumerate those pages with a Shell  
'for file in pages/*.pdf' loop, test 'if pdftotext "$file" - | grep -i -f  
regexps' (where "regexps" is a file with one regexp per line) and, if the  
test passes, append the file to a Shell variable so that, outside the loop,  
you 'pdfjoin' them all. That is for an "or" query. For an "and" query you need  
to pipe several 'grep's.
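
Here is a rough sketch of the steps above, not a tested recipe. It assumes  
'pdftotext' (from poppler-utils) and 'pdfjoin' (from pdfjam) are installed and  
that "pages" already holds one PDF per page. The function names (page_text,  
matches_any, matches_both, select_pages) are made up for illustration, and a  
Bash array stands in for the plain Shell variable so that file names with  
spaces survive.

```shell
#!/bin/bash

# Print the text of one page; "-" makes pdftotext write to stdout.
page_text() {
    pdftotext "$1" -
}

# "or" query: succeeds if the page matches at least one regexp
# listed (one per line) in the file given as second argument.
matches_any() {
    page_text "$1" | grep -q -i -f "$2"
}

# "and" query with piped greps: succeeds if some line of the page
# matches both regexps. Note that piping only keeps lines matching
# every regexp, so both must occur on the same line.
matches_both() {
    page_text "$1" | grep -i -e "$2" | grep -q -i -e "$3"
}

# Collect the matching pages, then join them outside the loop.
select_pages() {
    local regexps=$1 out=$2 file
    local -a selected=()
    for file in pages/*.pdf; do
        if matches_any "$file" "$regexps"; then
            selected+=("$file")
        fi
    done
    # Only call pdfjoin if at least one page matched.
    if [ "${#selected[@]}" -gt 0 ]; then
        pdfjoin --outfile "$out" "${selected[@]}"
    fi
}
```

After sourcing this, something like 'select_pages regexps selection.pdf' would  
write the matching pages to "selection.pdf" (a hypothetical output name). To  
require regexps anywhere on the page rather than on the same line, you would  
chain separate tests with '&&' instead of piping.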

