[Trisquel-users] finding particular pages within PDFs

legimet.calc at gmail.com
Fri Aug 29 20:56:16 CEST 2014


"for file in pages/*" is a for loop. The shell expands pages/* to the list of
entries in the pages directory and executes the body of the loop once for each
of them, setting the variable file to that path (pages/name) each time.
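
To see the loop on its own, here is a small sketch. The temporary directory and
the two empty page-*.pdf files are made up for the demonstration; in the thread
the directory is pages/.

```shell
# Hypothetical demo directory standing in for pages/.
demo_dir=$(mktemp -d)
touch "$demo_dir/page-1.pdf" "$demo_dir/page-2.pdf"

count=0
for file in "$demo_dir"/*; do
    # The glob is expanded before the loop starts; on each pass
    # $file holds one of the resulting paths.
    echo "visiting: $file"
    count=$((count + 1))
done
echo "visited $count files"

rm -r "$demo_dir"
```

The quotes around "$demo_dir" keep the loop working even if the directory name
contains spaces.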

'if pdftotext "$file" - | grep -i regexps': the 'pdftotext "$file" -' part
outputs the text of the PDF to standard output. However, this is piped to
grep. When you pipe two commands, the standard output of the first command
becomes the standard input of the second. So the text of the file will be
searched for your regexp, and the "if" will check grep's exit status, which is
zero (success) when there was at least one match.
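
The same pattern can be tried without any PDFs. In this sketch printf stands in
for 'pdftotext "$file" -', contains_match is a made-up helper name, and the -q
flag (not in the original command) only stops grep from printing the matching
lines.

```shell
# contains_match is a hypothetical helper for illustration.
# printf stands in for: pdftotext "$file" -
contains_match() {
    # $1 = text to search, $2 = pattern
    # The exit status of a pipeline is that of its last command (grep here),
    # so "if" branches on whether grep found a match.
    if printf '%s\n' "$1" | grep -qi "$2"; then
        echo "match"
    else
        echo "no match"
    fi
}

contains_match "Chapter 3: Free Software" "free software"
contains_match "Chapter 4: Licenses" "free software"
```

The first call prints "match" because -i makes the search case-insensitive; the
second prints "no match".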

You can append the filename to a variable using something like
'foo="$foo $file"'. Note that the name being assigned to takes no leading $.
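
A minimal sketch of that accumulation, with a made-up variable name (matches)
and made-up filenames in place of the real loop over pages/*:

```shell
matches=""
for file in a.pdf b.pdf c.pdf; do
    # Append the current name, separated by a space.
    matches="$matches $file"
done
echo "matched:$matches"
```

The result starts with a space because the variable was empty on the first
pass. A space-separated list like this is only unambiguous when the filenames
themselves contain no spaces.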

