
python-pdfkit (wkhtmltopdf) TOC overflow


Code review

I did a quick code review of your XSL (and CSS) files. Even if it doesn’t solve your problem, it helps to reproduce and understand it. Here are my comments:

  • Your XSL has a typo: <! begin LI> is not a valid XML tag. Is it meant to be a comment? If so, it should be written <!-- begin LI -->.

  • I prefer using the concat() XPath function to append characters directly, because re-indenting your code may otherwise introduce extra whitespace.

    So, I replaced:

    <xsl:attribute name="href"><xsl:value-of select="@link"/> . </xsl:attribute>

    with:

    <xsl:attribute name="href">
      <xsl:value-of select="concat(@link, ' . ')"/>
    </xsl:attribute>

  • I added an xsl:if to avoid generating an empty <ul> when there are no child items:

    <xsl:if test="count(outline:item)">
      <ul>
        <xsl:comment>added to prevent self-closing tags in QtXmlPatterns</xsl:comment>
        <xsl:apply-templates select="outline:item"/>
      </ul>
    </xsl:if>

  • I also fixed duplicate or malformed CSS entries. I replaced:

    li {
      border-bottom: 1px dashed rgb(45, 117, 183);
    }
    span {
      float: right;
    }
    li {
      list-style: none;
      margin-top: 30px;
    }
    ul ul {font-size: 80%; padding-top:0px;}
    ul {padding-left: 0em; padding-top:0px;}
    ul ul {padding-left: 1em; padding-top:0px;}
    a {text-decoration:none; color: color:#2d75b7;}

    with:

    span {
      float: right;
    }
    li {
      list-style: none;
      margin-top: 30px;
      border-bottom: 1px dashed rgb(45, 117, 183);
    }
    ul {
      font-size: 70px;
      font-family: arial;
      color: #2d75b7;
    }
    ul ul {
      font-size: 80%;
      padding-left: 1em;
      padding-top: 0px;
    }
    a {
      text-decoration: none;
      color: #2d75b7;
    }

  • If you target XHTML, the <style> tag has a mandatory type attribute. The same applies to the <script> tag.

    <style type="text/css">...</style>
    <script type="text/javascript">...</script>
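
To illustrate the concat() remark above, here is a minimal sketch using only the Python standard library. It does not run an XSLT transformation; it only inspects the literal text node that follows xsl:value-of in the stylesheet. Since that node contains a non-whitespace character (the dot), an XSLT 1.0 processor keeps it whole, surrounding whitespace included. The names COMPACT, REINDENTED and literal_text_after_value_of are hypothetical, introduced for this example only.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"

# The original form: everything on one line.
COMPACT = (
    '<xsl:attribute xmlns:xsl="%s" name="href">'
    '<xsl:value-of select="@link"/> . '
    '</xsl:attribute>' % XSL_NS
)

# The same fragment after an editor re-indents it.
REINDENTED = (
    '<xsl:attribute xmlns:xsl="%s" name="href">\n'
    '  <xsl:value-of select="@link"/> . \n'
    '</xsl:attribute>' % XSL_NS
)


def literal_text_after_value_of(fragment):
    """Return the literal text that follows xsl:value-of in the fragment."""
    attribute = ET.fromstring(fragment)
    value_of = attribute.find("{%s}value-of" % XSL_NS)
    return value_of.tail


print(repr(literal_text_after_value_of(COMPACT)))
print(repr(literal_text_after_value_of(REINDENTED)))
```

The second text node carries an extra newline, which would end up in the href value. With concat(@link, ' . ') the appended characters live inside the XPath expression, so the indentation of the stylesheet no longer matters.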

Reproducing the problem

It was a little hard to reproduce your bug because of the lack of information, so I had to guess.

First, I created a sample TOC file, which looks like this:

outline.xml

<?xml version="1.0" encoding="UTF-8"?>
<outline xmlns="http://wkhtmltopdf.org/outline">
  <item>
    <item title="Lorem ipsum dolor sit amet, consectetur adipiscing elit." page="2"/>
    <item title="Cras at odio ultrices, elementum leo at, facilisis nibh." page="8"/>
    <item title="Vestibulum sed libero bibendum, varius massa vitae, dictum arcu." page="19"/>
    ...
    <item title="Sed semper augue quis enim varius viverra." page="467"/>
  </item>
</outline>

This file contains 70 items so that I can see the page breaks.
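
If you don't want to write 70 entries by hand, a file with the same shape can be generated. The following is a minimal sketch using only the standard library; build_outline, the placeholder titles and the page increment are assumptions made for illustration, not the data I actually used.

```python
import xml.etree.ElementTree as ET

OUTLINE_NS = "http://wkhtmltopdf.org/outline"


def build_outline(n_items=70, pages_per_item=7):
    """Build an outline tree with one top-level item holding n_items children."""
    # Serialize the outline namespace as the default namespace (no prefix).
    ET.register_namespace("", OUTLINE_NS)
    root = ET.Element("{%s}outline" % OUTLINE_NS)
    top = ET.SubElement(root, "{%s}item" % OUTLINE_NS)
    for i in range(n_items):
        ET.SubElement(
            top,
            "{%s}item" % OUTLINE_NS,
            title="Placeholder entry %d" % (i + 1),  # hypothetical title
            page=str(2 + i * pages_per_item),        # hypothetical page number
        )
    return ET.ElementTree(root)


tree = build_outline()
print(len(tree.getroot()[0]))  # number of generated entries
```

To save the result, call tree.write("outline.xml", encoding="UTF-8", xml_declaration=True).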

To build the HTML and PDF, I used your (fixed) XSL file and ran pdfkit:

import io
import os

import pdfkit
from lxml import etree

HERE = os.path.dirname(__file__)


def layout(src_path, dst_path):
    # load the XSL
    xsl_path = os.path.join(HERE, "layout.xsl")
    xsl_tree = etree.parse(xsl_path)

    # load the XML source
    src_tree = etree.parse(src_path)

    # transform (calling the XSLT object applies the stylesheet)
    transformer = etree.XSLT(xsl_tree)
    dst_tree = transformer(src_tree)

    # write the result
    with io.open(dst_path, mode="wb") as f:
        f.write(etree.tostring(dst_tree, encoding="utf-8", method="html"))


if __name__ == '__main__':
    layout(os.path.join(HERE, "outline.xml"), os.path.join(HERE, "outline.html"))
    pdfkit.from_file(os.path.join(HERE, "outline.html"),
                     os.path.join(HERE, "outline.pdf"),
                     options={'page-size': 'A1', 'orientation': 'landscape'})

Note: your page size (A1) looks very large…
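
To put the page size in perspective, here is a small sketch that derives the ISO 216 A-series dimensions by repeatedly halving A0 (841 × 1189 mm) and rounding down. The function name iso_a_size_mm is hypothetical; it is only meant to compare A1 with the more common A4.

```python
def iso_a_size_mm(n):
    """Return (width, height) in millimetres of ISO 216 size A<n>, portrait."""
    width, height = 841, 1189  # A0
    for _ in range(n):
        # Each step halves the long side; the old short side becomes the long one.
        width, height = height // 2, width
    return width, height


a1 = iso_a_size_mm(1)
a4 = iso_a_size_mm(4)
print(a1, a4)
print(round((a1[0] * a1[1]) / (a4[0] * a4[1]), 1))  # area ratio A1 / A4
```

An A1 sheet has roughly eight times the area of an A4 sheet, which is probably why the font size in your CSS is so large.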

Solution

You are right, wkhtmltopdf doesn't take into account the margin in your CSS:

li {
  list-style: none;
  border-bottom: 1px dashed rgb(45, 117, 183);
  margin-top: 30px;  /* <-- not working after page break */
}

This is normal behavior. Consider, for instance, headings (h1, h2, etc.). A heading can have a top margin in order to add white space between a paragraph and the following heading, but if the heading starts a new page, we want to get rid of the margin and have the heading touch the top margin of the page.

For your TOC, there is a solution. You can use padding (instead of margin):

li {
  border-bottom: 1px dashed rgb(45, 117, 183);
  list-style: none;
  padding-top: 30px;
}
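
If you want to compare both variants yourself, here is a minimal sketch that builds a standalone test page with either margin-top or padding-top on the list items. The helpers build_toc_html and render_pdf, as well as the placeholder entries, are hypothetical. Rendering requires the third-party pdfkit package and the wkhtmltopdf binary, so that import is kept inside render_pdf.

```python
def build_toc_html(n_items=70, use_padding=True):
    """Return an HTML page with a long list styled like the TOC items."""
    spacing = "padding-top" if use_padding else "margin-top"
    css = (
        "li {\n"
        "  border-bottom: 1px dashed rgb(45, 117, 183);\n"
        "  list-style: none;\n"
        "  %s: 30px;\n"
        "}\n" % spacing
    )
    items = "\n".join(
        "<li>Placeholder entry %d</li>" % i for i in range(1, n_items + 1)
    )
    return (
        '<!DOCTYPE html>\n<html><head><meta charset="utf-8">\n'
        '<style type="text/css">\n%s</style></head>\n'
        "<body><ul>\n%s\n</ul></body></html>" % (css, items)
    )


def render_pdf(html, pdf_path):
    """Render the HTML string to a PDF (needs pdfkit and wkhtmltopdf)."""
    import pdfkit  # third-party, imported lazily on purpose

    pdfkit.from_string(
        html, pdf_path,
        options={"page-size": "A1", "orientation": "landscape"},
    )
```

Calling render_pdf(build_toc_html(use_padding=False), "margin.pdf") and render_pdf(build_toc_html(use_padding=True), "padding.pdf") should give two files whose first item after a page break can be compared. Increase n_items if the list does not reach a second page.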

Note that the TOC container (the #toc element) has a fixed top margin:

#toc {
  width: 50%;
  margin-top: 150px;
  margin-left: 300px;
}

Since each item now carries 30px of top padding, you can reduce this margin-top to match your needs, for instance:

#toc {
  width: 50%;
  margin-top: 120px;
  margin-left: 300px;
}