Merge Existing PDF into new ReportLab PDF via flowables


I just had a similar task in a project. I used reportlab (open source version) to generate the PDF files and pyPdf to do the merge. My requirements were slightly different in that I needed only one page from each attachment, but this should be close enough to give you the general idea.

    from datetime import datetime

    from django.conf import settings
    from pyPdf import PdfFileReader, PdfFileWriter


    def create_merged_pdf(user):
        basepath = settings.MEDIA_ROOT + "/"

        # The following block calls the function that uses reportlab to
        # generate a PDF (create_cover_sheet is not shown here)
        coversheet_path = basepath + "%s_%s_cover_%s.pdf" % (
            user.first_name, user.last_name, datetime.now().strftime("%f"))
        create_cover_sheet(coversheet_path, user, user.performancereview_set.all())

        # Now use the cover sheet and all of the performance reviews to
        # create a merged PDF
        merged_path = basepath + "%s_%s_merged_%s.pdf" % (
            user.first_name, user.last_name, datetime.now().strftime("%f"))

        # For the merged file result
        output = PdfFileWriter()

        # For each PDF file to add, open it in a PdfFileReader object and
        # add the page to output
        cover_pdf = PdfFileReader(open(coversheet_path, "rb"))
        output.addPage(cover_pdf.getPage(0))

        # Iterate through attached files and merge. I only needed the
        # first page, YMMV
        for review in user.performancereview_set.all():
            review_pdf = PdfFileReader(open(review.pdf_file.file.name, "rb"))
            output.addPage(review_pdf.getPage(0))  # only first page of attachment

        # Write out the merged file
        outputStream = open(merged_path, "wb")
        output.write(outputStream)
        outputStream.close()
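If you need every page of an attachment rather than only the first, the loop body could be adapted along these lines. This is a minimal sketch, not part of the original code: `add_all_pages` is a hypothetical helper name, and it assumes only the pyPdf-style methods already used above (`getPage`, `addPage`) plus `getNumPages` on the reader.

```python
def add_all_pages(reader, writer):
    """Copy every page from reader into writer and return the page count.

    Assumes a pyPdf-style interface: reader.getNumPages(),
    reader.getPage(index) and writer.addPage(page).
    """
    page_count = reader.getNumPages()
    for index in range(page_count):
        # Pages are appended in their original order.
        writer.addPage(reader.getPage(index))
    return page_count
```

Inside the `for review in ...` loop, the call `output.addPage(review_pdf.getPage(0))` would then become `add_all_pages(review_pdf, output)`.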


I used the following class to solve my issue. It inserts the PDFs as vector PDF images. It works great because I needed to have a table of contents, and the flowable object allowed the built-in TOC functionality to work like a charm.

See the question "Is there a matplotlib flowable for ReportLab?"

Note: If you have multiple pages in the file, you have to modify the class slightly. The sample class is designed to just read the first page of the PDF.


I know the question is a bit old but I'd like to provide a new solution using the latest PyPDF2.

You now have access to PdfFileMerger, which does exactly what you want: it appends PDFs to an existing file. You can even merge them at different positions and choose a subset or all of the pages!

The official docs are here: https://pythonhosted.org/PyPDF2/PdfFileMerger.html

An example from the code in your question:

    import tempfile

    import PyPDF2
    from django.core.files import File
    from django.http import HttpResponse
    from reportlab.lib.pagesizes import letter
    from reportlab.platypus import SimpleDocTemplate

    # Using a temporary file rather than a buffer in memory is probably better
    temp_base = tempfile.TemporaryFile()
    temp_final = tempfile.TemporaryFile()

    # Create document, add what you want to the story, then build
    doc = SimpleDocTemplate(temp_base, pagesize=letter, ...)
    ...
    doc.build(...)

    # Now, this is the fancy part. Create merger, add extra pages and save
    merger = PyPDF2.PdfFileMerger()
    merger.append(temp_base)
    # Add any extra document, you can choose a subset of pages and add bookmarks
    merger.append(entry.document.file, bookmark='Attachment')
    merger.write(temp_final)

    # Write the final file in the HTTP response
    django_file = File(temp_final)
    resp = HttpResponse(django_file, content_type='application/pdf')
    resp['Content-Disposition'] = 'attachment;filename=logbook.pdf'
    if django_file.size is not None:
        resp['Content-Length'] = django_file.size
    return resp
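The position and page-subset options mentioned in this answer could be used roughly as follows. This is a sketch under assumptions, not a definitive recipe: `assemble_report` and its arguments are hypothetical names, and the `merger` passed in is assumed to behave like `PyPDF2.PdfFileMerger` in the release used above, where `append` and `merge` accept `bookmark` and `pages`.

```python
def assemble_report(merger, report, cover, attachment, attachment_pages=2):
    """Combine three PDFs using a PdfFileMerger-style object.

    merger is assumed to provide the append() and merge() methods of
    PyPDF2.PdfFileMerger; the PDF arguments may be paths or file objects.
    """
    # Start with the generated report.
    merger.append(report)
    # Insert the cover at page position 0, i.e. in front of the report.
    merger.merge(0, cover)
    # Append only the leading pages of the attachment. pages is a
    # (start, stop) tuple; the stop index itself is not included.
    merger.append(attachment, bookmark="Attachment",
                  pages=(0, attachment_pages))
    return merger
```

With the real library you would pass `PyPDF2.PdfFileMerger()` as `merger` and then call `merger.write(temp_final)` as in the example above. Taking the merger as a parameter also lets you exercise the call order with a stand-in object before touching real files.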