Split PDF by multiple pages using PDFTK?
This PowerShell script will
- use pdftk to get the number of pages
- loop in steps building a range string
- use the range to extract the pages into a new pdf with appended range to the base name (and store in the same folder).
Change the first two vars to fit your environment.
    ## Q:\Test\2017\05\06\Split-Pdf.ps1
    $pdfPath = 'Q:\Test\2017\05\06\'
    $pdfFile = Join-Path $pdfPath "test.pdf"
    $SetsOfPages = 3
    $Match = 'NumberOfPages: (\d+)'
    $NumberOfPages = [regex]::match((pdftk $pdfFile dump_data),$Match).Groups[1].Value
    "{0,2} pages in {1}" -f $NumberOfPages, $pdfFile

    for ($Page=1;$Page -le $NumberOfPages;$Page+=$SetsOfPages){
      $File = Get-Item $pdfFile
      $Range = "{0}-{1}" -f $page,[math]::min($Page+$SetsOfPages-1,$NumberOfPages)
      $OutFile = Join-Path $pdfPath ($File.BaseName+"_$Range.pdf")
      "processing: {0}" -f $OutFile
      pdftk $pdfFile cat $Range output $OutFile
    }
Edited to work with variable sets of pages and to properly handle the overhang.
Edited again: found a much easier way to shorten the last set of pages.
Sample output:

    > .\Split-Pdf.ps1
    10 pages in Q:\Test\2017\05\06\test.pdf
    processing: Q:\Test\2017\05\06\test_1-3.pdf
    processing: Q:\Test\2017\05\06\test_4-6.pdf
    processing: Q:\Test\2017\05\06\test_7-9.pdf
    processing: Q:\Test\2017\05\06\test_10-10.pdf
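The range arithmetic can be tried on its own, without pdftk or a PDF at hand. The following is a minimal bash sketch of the same stepping and clamping logic, not the PowerShell script itself; the function name page_ranges is made up for illustration.

```bash
#!/bin/bash
# Hypothetical helper: print one "first-last" page range per line.
# $1 = total number of pages, $2 = pages per output file
page_ranges() {
  local total=$1 step=$2 first last
  for (( first = 1; first <= total; first += step )); do
    last=$(( first + step - 1 ))
    # Clamp the final range so it never runs past the last page
    (( last > total )) && last=$total
    echo "$first-$last"
  done
}

# Same numbers as the sample run: 10 pages, 3 pages per file
page_ranges 10 3
```

For 10 pages in sets of 3 this prints 1-3, 4-6, 7-9 and 10-10, matching the ranges in the sample output. Each printed range could be passed to pdftk's cat operation.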
You can use sejda-console; it's open source under AGPLv3 and can be downloaded from the project GitHub page.
You can use the splitbyevery command, which "splits a given PDF document every 'n' pages creating documents of 'n' pages each."
In your case the command line will be something like:

    sejda-console splitbyevery -n 2 -f /tmp/input_file.pdf -o /out_dir
You can use the cat keyword to generate files from the desired pages.
    pdftk in.pdf cat 1-2 output out1.pdf
    pdftk in.pdf cat 3-4 output out2.pdf
A bash script can be added in order to be easier to use:
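    #!/bin/bash
    # Total page count of in.pdf, passed as the first argument
    NUMBEROFPAGES=$1
    COUNTER=1
    while [ "$COUNTER" -le "$NUMBEROFPAGES" ]; do
      # End of this two-page range, clamped to the last page
      END=$((COUNTER + 1))
      [ "$END" -gt "$NUMBEROFPAGES" ] && END=$NUMBEROFPAGES
      pdftk in.pdf cat "$COUNTER-$END" output "out$(( (COUNTER + 1) / 2 )).pdf"
      COUNTER=$((COUNTER + 2))
    done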
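The script needs NUMBEROFPAGES to come from somewhere. One option is to read it from pdftk's dump_data output, which contains a NumberOfPages line (the same line the PowerShell answer matches with a regex). A minimal sketch, assuming that line format; the function name page_count_from_dump is hypothetical, and the sample input is made up for illustration rather than copied from a real dump.

```bash
#!/bin/bash
# Hypothetical helper: read `pdftk FILE dump_data` output on stdin and
# print the number from the first "NumberOfPages: N" line.
page_count_from_dump() {
  awk '/^NumberOfPages:/ { print $2; exit }'
}

# Intended use (needs pdftk and an existing in.pdf):
#   NUMBEROFPAGES=$(pdftk in.pdf dump_data | page_count_from_dump)

# Demonstration on made-up input, so it runs without pdftk
printf 'OtherKey: x\nNumberOfPages: 10\n' | page_count_from_dump
```

Keeping the parsing in a function that reads stdin makes it easy to check separately from the pdftk call.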