Merge Existing PDF into new ReportLab PDF via flowables
I just had a similar task in a project. I used ReportLab (open-source version) to generate the PDF files and pyPdf to handle the merge. My requirements were slightly different in that I needed only one page from each attachment, but this should be close enough to give you the general idea.
    from datetime import datetime

    from django.conf import settings
    from pyPdf import PdfFileReader, PdfFileWriter


    def create_merged_pdf(user):
        basepath = settings.MEDIA_ROOT + "/"

        # The following block calls the function that uses ReportLab to generate a PDF
        # (create_cover_sheet is defined elsewhere in the project)
        coversheet_path = basepath + "%s_%s_cover_%s.pdf" % (
            user.first_name, user.last_name, datetime.now().strftime("%f"))
        create_cover_sheet(coversheet_path, user, user.performancereview_set.all())

        # Now use the cover sheet and all of the performance reviews to create a merged PDF
        merged_path = basepath + "%s_%s_merged_%s.pdf" % (
            user.first_name, user.last_name, datetime.now().strftime("%f"))

        # For the merged file result
        output = PdfFileWriter()

        # For each PDF file to add, open it in a PdfFileReader object and add the page to output
        cover_pdf = PdfFileReader(open(coversheet_path, "rb"))
        output.addPage(cover_pdf.getPage(0))

        # Iterate through the attached files and merge. I only needed the first page, YMMV
        for review in user.performancereview_set.all():
            review_pdf = PdfFileReader(open(review.pdf_file.file.name, "rb"))
            output.addPage(review_pdf.getPage(0))  # only first page of attachment

        # Write out the merged file
        outputStream = open(merged_path, "wb")
        output.write(outputStream)
        outputStream.close()
I used the following class to solve my issue. It inserts the PDFs as vector PDF images. It works great because I needed to have a table of contents. The flowable object allowed the built-in TOC functionality to work like a charm.
Is there a matplotlib flowable for ReportLab?
Note: If you have multiple pages in the file, you have to modify the class slightly. The sample class is designed to just read the first page of the PDF.
I know the question is a bit old, but I'd like to provide a new solution using the latest PyPDF2.
You now have access to the PdfFileMerger, which can do exactly what you want: append PDFs to an existing file. You can even merge them in different positions and choose a subset or all the pages!
The official docs are here: https://pythonhosted.org/PyPDF2/PdfFileMerger.html
An example from the code in your question:
    import tempfile

    import PyPDF2
    from django.core.files import File

    # Using a temporary file rather than a buffer in memory is probably better
    temp_base = tempfile.TemporaryFile()
    temp_final = tempfile.TemporaryFile()

    # Create document, add what you want to the story, then build
    doc = SimpleDocTemplate(temp_base, pagesize=letter, ...)
    ...
    doc.build(...)

    # Now, this is the fancy part. Create merger, add extra pages and save
    merger = PyPDF2.PdfFileMerger()
    merger.append(temp_base)
    # Add any extra document, you can choose a subset of pages and add bookmarks
    merger.append(entry.document.file, bookmark='Attachment')
    merger.write(temp_final)

    # Write the final file in the HTTP response
    django_file = File(temp_final)
    resp = HttpResponse(django_file, content_type='application/pdf')
    resp['Content-Disposition'] = 'attachment;filename=logbook.pdf'
    if django_file.size is not None:
        resp['Content-Length'] = django_file.size
    return resp
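On the comment above about preferring a temporary file over an in-memory buffer: the sketch below compares the options using only the standard library. FAKE_PDF_BYTES and write_then_read are hypothetical stand-ins for whatever a PDF library would write; this is an illustration of the trade-off, not part of the original answer. tempfile.SpooledTemporaryFile is one possible middle ground, since it stays in memory until it grows past max_size and then moves to disk.

```python
import io
import tempfile

# Hypothetical stand-in for the bytes a PDF library would write.
FAKE_PDF_BYTES = b"%PDF-1.4\n" + b"0" * 4096


def write_then_read(buffer):
    """Write the payload into `buffer`, rewind it, and read everything back."""
    buffer.write(FAKE_PDF_BYTES)
    buffer.seek(0)  # after writing, the position is at the end; rewind before reading
    return buffer.read()


# In-memory buffer: simple, but the whole document lives in RAM.
with io.BytesIO() as mem:
    from_memory = write_then_read(mem)

# Temporary file: backed by disk and deleted automatically when closed.
with tempfile.TemporaryFile() as tmp:
    from_disk = write_then_read(tmp)

# Spooled file: kept in memory up to max_size bytes, then rolled over to disk.
with tempfile.SpooledTemporaryFile(max_size=1024) as spooled:
    from_spooled = write_then_read(spooled)

print(from_memory == from_disk == from_spooled)  # prints True
```

All three objects expose the same file-like interface, so they should be interchangeable wherever a library accepts an open binary file.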