Automated PDF reports using ReportLab, z3c.rml and Preppy

By Andreas Schickedanz Jun 12, 2013

Generating reports is part of the daily business for everybody who works in a development or research team or as part of a sales force. You always have to report your current project state, your latest updates and fixes, or the sales and order intake of the last quarter as well as the forecast for the coming quarter. Most of these reports consist of the same graphs, texts and forms and could therefore be generated from a template. In other words, they could be generated completely automatically if the data that changes over time is stored in something like a database.

Imagine you are running a small web shop. To get a better feeling for whether the shop is profitable, or to check the response to a marketing campaign, you have to monitor your business. A report that contains some statistics about the development of your daily and/or monthly sales, as well as an analysis of the targets defined in the forecast, could help you to improve your marketing strategy.

However, I am a computer scientist and do not care much about that stuff, but I am the one to ask how to automate things. And that is what I did to monitor different things, like my Flattr and Wordpress accounts, as well as the monthly power consumption of different electronic devices at home. The range of applications is enormous if you just think about it for a moment.

ReportLab, z3c.rml and Preppy

In order to generate such reports automatically, you can use Python with the reportlab library. ReportLab is an open-source engine for creating PDF documents, written in Python. The basic reportlab engine is free and open source, as is the text pre-processor Preppy, which I introduced in my last post “Templating using Python and Preppy”.

Along with these two packages, ReportLab provides the rlextra package. This extension adds the possibility to generate documents based on templates that are designed with ReportLab’s Report Markup Language (RML), an XML-style language for describing the layout of documents. The advantage of using this XML dialect is that you separate the design from the code. Using Preppy, you can insert dynamic content into the static layout. That’s great, but there is one disadvantage of ReportLab’s rlextra package that should not be neglected: it is a proprietary package.

But, as so often, there is a very good alternative implementation of this package that is maintained under an open-source license. This implementation is called z3c.rml and is maintained by the community around the free and open-source web application server Zope, which is written in Python.

Package installation

To get started, install the needed packages, namely z3c.rml, reportlab and preppy. All of these packages are published on the Python Package Index (PyPI), so you can install them using pip or easy_install.

sudo pip install z3c.rml
sudo pip install preppy
sudo pip install reportlab

And that’s it.
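
If you want to confirm that the three packages can be found before going on, a small check like the following may help. It is only a sketch that uses the standard library; the helper name check_packages is my own and not part of any of the packages.

```python
import importlib.util


def check_packages(names):
    """Return a dict mapping each module name to True if it can be found."""
    found = {}
    for name in names:
        try:
            found[name] = importlib.util.find_spec(name) is not None
        except ImportError:
            # Raised for dotted names (like z3c.rml) when the parent
            # package itself is missing.
            found[name] = False
    return found


if __name__ == "__main__":
    for name, ok in check_packages(["reportlab", "preppy", "z3c.rml"]).items():
        print("{:10s} {}".format(name, "ok" if ok else "missing"))
```

Any package reported as missing can then be installed with the pip commands shown above.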

Using reportlab

Let’s take a look at the abilities of reportlab without using RML or Preppy. Using reportlab you could easily generate *.pdf files by declaring the structure directly in Python. All objects like images, statistics, tables and normal text are defined and filled within your Python script. But just have a look at the code:

import time
from reportlab.lib.enums import TA_CENTER
from reportlab.platypus import SimpleDocTemplate, Paragraph, Spacer, Image
from reportlab.lib.styles import getSampleStyleSheet, ParagraphStyle
from reportlab.lib.units import cm

# Setup the document template ...
doc = SimpleDocTemplate("firstDoc.pdf",
    rightMargin=1.5*cm, leftMargin=1.5*cm, topMargin=1.5*cm, bottomMargin=1.5*cm)

# ... and initialize the content block.
story=[]

# Add your logo to the page head.
story.append(Image('logo.jpg', 2*cm, 2*cm))

# Fetch the document stylesheet ...
styles = getSampleStyleSheet()

# ... and add a centered paragraph style.
styles.add(ParagraphStyle(name='Center', alignment=TA_CENTER))

# Add the document title to the content block.
story.append(Spacer(0.1*cm, 2*cm))
story.append(Paragraph('<font size="16">My first Report</font>', styles["Center"]))
story.append(Spacer(0.1*cm, 0.5*cm))

# Fetch the current date ...
timeStr = '<font size="12">{time}</font>'.format(time = time.ctime())

# ... and append it to the content block followed by some space.
story.append(Paragraph(timeStr, styles["Center"]))
story.append(Spacer(0.1*cm, 1*cm))

# Setup some normal text ...
text = """This is my first PDF report generated with ReportLab. I think it looks really great
for a quick and dirty solution. But this is just a first, quick example. You could create
more complex documents using this library.
"""

# ... and add it to the document.
story.append(Paragraph(text, styles["Normal"]))
story.append(Spacer(0.1*cm, 3*cm))

# And some greetings (<br/> forces a line break inside the paragraph).
story.append(Paragraph("Best regards<br/>Andreas Schickedanz", styles["Normal"]))

# To generate the content and write it to
# the *.pdf file (in this case firstDoc.pdf)
# just call the build method.
doc.build(story)

If you place an image named logo.jpg in the same directory as this script and execute it, you will get something like this. I think this example is pretty self-explanatory, and if you would like to learn more about using reportlab as a standalone solution, just have a look at ReportLab’s documentation. However, personally I do not like to define layouts within code. I prefer to separate the actual template from the dynamic content. This also simplifies the declaration of recurring elements like a headline, footer or watermark. That’s the reason why I use Preppy and RML.

Using RML templates with Preppy

The Report Markup Language (RML) is an XML dialect defined and used by ReportLab in order to separate dynamic from static content. An RML template is subdivided into three parts. The first part defines page templates, the second part defines style sheets that simplify the formatting of elements, and the last part contains the actual content. This structure allows the user to define static parts like a headline or footer that appear repeatedly on different pages.

The following example is a bit more complex and includes some graphical elements like lines and centered strings. Furthermore, two templates are used, one for the title page and one for the content pages. The dynamic content, like the author information and the table content, is another example of an application of Preppy. The process of inserting this content into the template is pretty much the same as in the examples in my last post “Templating using Python and Preppy”. However, just get a first impression:

<?xml version="1.0" encoding="utf-8" standalone="no" ?>
<!DOCTYPE document SYSTEM "rml.dtd">
{{def(date, name, website, email, table)}}

<document>
  <!-- Don't remove any of the following main blocks, -->
  <!-- otherwise the document will not compile. -->
  <template>
    <pageTemplate id="main">
      <pageGraphics>
        <fill color="#3b5b86" />
        <rect x="1.5cm" y="26cm" width="18cm" height="2cm" fill="1" stroke="0" />
        <fill color="#ffffff" />
        <setFont name="Helvetica-Bold" size="18" />
        <drawCenteredString x="10.5cm" y="27cm">Avedo's Report</drawCenteredString>
        <setFont name="Helvetica" size="12" />
        <drawCenteredString x="10.5cm" y="26.5cm">{{website}}</drawCenteredString>

        <fill color="#3b5b86" />
        <setFont name="Helvetica-Bold" size="8" />
        <drawCenteredString x="10.5cm" y="25.5cm">{{date}}</drawCenteredString>

        <lineMode width="0.1" />
        <fill color="#333333" />
        <lines>1.5cm 2cm 19.5cm 2cm</lines>
        <setFont name="Helvetica" size="9" />
        <drawCentredString x="10.5cm" y="1.5cm">- <pageNumber /> -</drawCentredString>
      </pageGraphics>
      <!-- The frame geometry is an assumption; adjust it to your layout. -->
      <frame id="first" x1="1.5cm" y1="2.5cm" width="18cm" height="22.5cm" />
    </pageTemplate>

    <pageTemplate id="contentPage">
      <pageGraphics>
        <lineMode width="0.1" />
        <fill color="#333333" />
        <lines>1.5cm 27cm 19.5cm 27cm</lines>

        <fill color="#333333" />
        <setFont name="Helvetica" size="8" />
        <drawString x="1.5cm" y="27.1cm">Avedo's Report ({{website}})</drawString>
        <drawCenteredString x="18.7cm" y="27.1cm">{{date}}</drawCenteredString>
        <fill color="#ff0000" />
        <circle x="16.5cm" y="27.1cm" radius="0.1cm" />

        <lineMode width="0.1" />
        <fill color="#333333" />
        <lines>1.5cm 2cm 19.5cm 2cm</lines>
        <setFont name="Helvetica" size="9" />
        <drawCentredString x="10.5cm" y="1.5cm">- <pageNumber /> -</drawCentredString>
      </pageGraphics>
      <!-- The frame geometry is an assumption; adjust it to your layout. -->
      <frame id="content" x1="1.5cm" y1="2.5cm" width="18cm" height="24cm" />
    </pageTemplate>
  </template>

  <stylesheet>
    <!-- Contains the style information for the document. -->
    <blockTableStyle id="bornageekTable">
      <blockValign value="TOP" />
      <blockAlignment value="LEFT" />
      <blockTopPadding length="2" />
      <blockBottomPadding length="2" />
      <blockLeftPadding length="3" />
      <blockRightPadding length="3" />

      <lineStyle kind="LINEBELOW" colorName="silver" start="0,1" stop="-1,-2" />
      <lineStyle kind="LINEAFTER" colorName="silver" start="0,1" stop="-2,-1" />
      <blockFont name="Helvetica" size="9" start="0,1" stop="-1,-1" />

      <blockTopPadding length="3" start="0,0" stop="-1,0" />
      <blockBottomPadding length="3" start="0,0" stop="-1,0" />
      <blockLeftPadding length="7" start="0,0" stop="-1,0" />
      <blockRightPadding length="7" start="0,0" stop="-1,0" />
      <blockFont name="Helvetica-Bold" size="11" start="0,0" stop="-1,0" />
      <blockTextColor colorName="white" start="0,0" stop="-1,0" />
      <blockBackground colorName="#3b5b86" start="0,0" stop="-1,0" />
    </blockTableStyle>

    <paraStyle
            name="style.centered"
            fontName="Helvetica"
            fontSize="8"
            alignment="center" />
  </stylesheet>

  <story>
    <!-- Contains all flowable elements of the document. -->
    <!-- They fill up the frames defined in the template section. -->
    <para style="style.centered">
      {{name}}
      <font color="#3b5b86" size="6">{{email}}</font>
    </para>

    <setNextTemplate name="contentPage" />
    <nextFrame />

    <para>
      And here is some more content on a normal page ... In this case it is
      a table that shows the release history of the Ubuntu operating system:
    </para>

    <spacer length="1cm" />

    <blockTable style="bornageekTable">
      {{script}}header = False{{endscript}}
      {{for row in table}}
        {{if header == False:}}
          <tr>
            {{for col in row}}
              <td>{{col.replace("_", " ").title()}}</td>
            {{endfor}}
          </tr>
          {{script}}header = True{{endscript}}
        {{else}}
          <tr>
            {{for col in row}}
              <td>{{col}}</td>
            {{endfor}}
          </tr>
        {{endif}}
      {{endfor}}
    </blockTable>
  </story>
</document>
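
To see what the table loop at the end of the template produces, it can be mirrored in plain Python. This is only a sketch under the assumption that every row becomes a <tr> with one <td> per column, which is what RML's block table expects. The function name table_rows_to_rml, the sample rows and the call to escape() are my own additions and not part of the template.

```python
from xml.sax.saxutils import escape


def table_rows_to_rml(table):
    """Render rows as RML <tr>/<td> markup; the first row is the header."""
    lines = []
    for index, row in enumerate(table):
        lines.append("<tr>")
        for col in row:
            # Header cells get prettified: "code_name" -> "Code Name".
            text = col.replace("_", " ").title() if index == 0 else col
            # Escape &, < and > so the result stays valid XML.
            lines.append("  <td>{}</td>".format(escape(text)))
        lines.append("</tr>")
    return "\n".join(lines)


# Illustrative rows only; the real data comes from the CSV file.
sample = [["version", "code_name"], ["4.10", "Warty Warthog"]]
print(table_rows_to_rml(sample))
```

The header flag of the template is replaced by the row index here, which does the same job: only the first row is treated as the table head.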

I think it is easy to understand how RML works. The only thing you have to understand is the organisation of the document. It starts with the normal XML header (<?xml ... ?>) followed by the doctype (in case of RML documents <!DOCTYPE document SYSTEM "rml.dtd">). The next step is to specify the root node of the document, the <document> block, which contains three other main blocks, namely the <template>, the <stylesheet> and the <story> block.

These three blocks are the containers for your static page content, your style sheets and your dynamic and/or flowable content, respectively.
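
The basic layout of an RML document can be checked with a few lines of standard-library Python. The following is a rough sketch, not part of z3c.rml: the skeleton is a stripped-down stand-in for a real document, and main_blocks is a helper name I made up. It works on plain RML only, not on a template that still contains Preppy markup.

```python
import xml.etree.ElementTree as ET

# Minimal stand-in for an RML document (no graphics, styles or content).
SKELETON = """<document>
  <template>
    <pageTemplate id="main" />
  </template>
  <stylesheet />
  <story />
</document>"""


def main_blocks(rml_text):
    """Return the tag names of the direct children of <document>."""
    root = ET.fromstring(rml_text)
    if root.tag != "document":
        raise ValueError("root element must be <document>")
    return [child.tag for child in root]


print(main_blocks(SKELETON))
```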

You start by defining all templates for your pages within the <template> block. Each of these templates is declared using the <pageTemplate> element, which holds all content that should be displayed on every page using this template. So here you would place your headlines and your footers.

Within the next block, the <stylesheet> block, you can define styles that can be used with the corresponding elements within the <template> and <story> blocks. It is a bit like creating Cascading Style Sheets for your HTML pages.

The last block, the <story> block, contains the actual content, like plain text, tables, illustrations, diagrams or program code. The content placed within this block is automatically inserted into the first template specified within the <template> block. So in the example above the template with the ID “main” is used. In order to switch to another template you just embed <setNextTemplate ... /> with the name attribute set to the corresponding template ID, and the new template will be used on the next page. In the example above I forced a page break, so the following content is directly embedded within the specified template.
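
A typo in the name attribute is easy to make, so it can be worth checking that every template switch points to an existing template ID. The sketch below does this with the standard library on already rendered RML (after Preppy has run). The helper name check_template_switches and the sample document are assumptions of mine for illustration.

```python
import xml.etree.ElementTree as ET

# Stripped-down sample with two page templates and one template switch.
SAMPLE = """<document>
  <template>
    <pageTemplate id="main" />
    <pageTemplate id="contentPage" />
  </template>
  <stylesheet />
  <story>
    <setNextTemplate name="contentPage" />
    <nextFrame />
  </story>
</document>"""


def check_template_switches(rml_text):
    """Return (first_template_id, unknown_names) for an RML document."""
    root = ET.fromstring(rml_text)
    ids = [pt.get("id") for pt in root.iter("pageTemplate")]
    wanted = [el.get("name") for el in root.iter("setNextTemplate")]
    # Names that do not match any declared page template ID.
    unknown = [name for name in wanted if name not in ids]
    return (ids[0] if ids else None), unknown


print(check_template_switches(SAMPLE))
```

The first element of the result is the template the story starts with, the second one lists all switch targets that are not declared anywhere.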

However, that’s all about the basic structure of RML documents. Since I just want to show you the abilities of reportlab, z3c.rml and preppy, I will not explain the above document any further and head over to the Python code:

#!/usr/bin/python
from z3c.rml import rml2pdf
import datetime
import preppy
import csv
import sys

def fetchTable():
  # Initialize the result array ...
  data = []

  # ... and parse the content.
  with open('data.csv', 'r') as csvFile:
    for row in csv.reader(csvFile, delimiter=','):
      # Each row is a list of column strings; copy it into the result.
      data.append(list(row))

  return data

def main(argv):
  # Load the rml template into the preprocessor, ...
  template = preppy.getModule('testDoc.prep')

  # ... fetch the table data ...
  table = fetchTable()

  # ... and do the preprocessing.
  rmlText = template.get(
    datetime.datetime.now().strftime("%Y-%m-%d"), 'Andreas Schickedanz',
    'www.bornageek.com', 'info@bornageek.com', table)

  # Finally generate the *.pdf output ...
  pdf = rml2pdf.parseString(rmlText)

  # ... and save it. PDF data is binary, so open the file in 'wb' mode.
  with open('rmlReport.pdf', 'wb') as pdfFile:
    pdfFile.write(pdf.read())

if __name__ == '__main__':
  main(sys.argv[1:])

An advanced example using ReportLab and RML

The interesting part of this script is the main function. First of all, a new Python module is created from the RML document using Preppy, which is then fed with the dynamic data, like the author information and the table data. The resulting RML document contains all this information and can now be parsed using z3c.rml. This is achieved by calling the parseString() function of the rml2pdf module. The result of this call already holds the content of the PDF file, which just has to be written to the file itself. That’s all, you are done. The PDF file should look like this.
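
Since the script reads the result of parseString() like a file, the last step boils down to copying a binary stream to disk. Here is a minimal sketch of that step, with an io.BytesIO object standing in for the real result so that it runs without any of the PDF packages. The helper name save_stream and the dummy bytes are made up for this illustration.

```python
import io
import os
import tempfile


def save_stream(stream, path):
    """Write a binary file-like object to path and return the byte count."""
    # 'wb' matters: PDF data is binary and must not be treated as text.
    with open(path, "wb") as out:
        return out.write(stream.read())


# Stand-in for the object returned by rml2pdf.parseString(rmlText).
fake_pdf = io.BytesIO(b"%PDF-1.4 demo")
target = os.path.join(tempfile.mkdtemp(), "rmlReport.pdf")
print(save_stream(fake_pdf, target))
```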

The *.csv file I used in the above example can be downloaded here.
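
To give an idea of the shape of the data the template works with, the following sketch parses a tiny CSV snippet in the same way the script does. The column names and the single data row are illustrative assumptions and not the content of the original file; read_table is a hypothetical helper that mirrors fetchTable() but takes an open file handle.

```python
import csv
import io

# Illustrative content only; the original data.csv is not reproduced here.
SAMPLE_CSV = "version,code_name,release_date\n4.10,Warty Warthog,2004-10-20\n"


def read_table(handle):
    """Return all CSV rows as lists of strings, header row first."""
    return [list(row) for row in csv.reader(handle, delimiter=",")]


table = read_table(io.StringIO(SAMPLE_CSV))
print(table[0])  # header row, later prettified by the template
print(table[1])  # first data row
```

The template only relies on the first row being the header, so any CSV file with a header line should work in the same way.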

Conclusion

I think that the combination of ReportLab, the Report Markup Language (RML) and the Preppy text pre-processor is a very powerful solution for automatically generating *.pdf files. As I already mentioned, the range of applications for such automated reports is enormous. You could create personalized form letters, documentation or business reports in just a couple of minutes.

The z3c.rml package, as an alternative to ReportLab’s proprietary rlextra package, has some limits, but it packs enough features to keep up with the competition. It enables the user to separate the template from the dynamic content, so designers and developers can work in parallel.

I hope you enjoyed this short introduction. I would be happy to hear about your experience and to see your results. In one of my next posts I will show you how to use these libraries to generate reports fed with data from a sensor network connected to my Beaglebone Black.

Stay tuned and until next time - happy coding!

Phidelux is a Computer Science MSc. interested in hardware hacking, embedded Linux, compilers, etc.