Generating reports is part of the daily business for everybody who participates in a development or research team or works in a sales force. You always have to report your current project state, your latest updates and fixes, or the sales and order intake of the last quarter as well as the forecast for the coming quarter. Most of these reports consist of the same graphs, texts and forms, and could therefore be generated from something like a template. In other words, they could be generated completely automatically if the data that changes over time is stored in something like a database.
Imagine you are running a small web shop. To get a better feeling for whether the shop is profitable, or to check the response to a marketing campaign, you have to monitor your business. A report that contains some statistics about the evolution of your daily and/or monthly sales, as well as an analysis of the targets defined in the forecast, could help you to improve your marketing strategy.
However, I am a computer scientist and do not care much about that stuff, but I am the one to ask how to automate things. And that is what I did to monitor different things, like my Flattr and WordPress accounts, as well as the monthly power consumption of different electronic devices at home. The range of applications is enormous, if you just think about it for a moment.
ReportLab, z3c.rml and Preppy
In order to automatically generate such reports, you could use Python with something like the reportlab library. ReportLab is an open-source engine for creating PDF documents, written in Python. The basic reportlab engine is free and open-source, as is the text pre-processor Preppy, which I introduced in my last post “Templating using Python and Preppy”.
Along with these two packages, ReportLab provides the rlextra package. This extension adds the possibility to generate documents based on templates that are designed with ReportLab’s Report Markup Language (RML), which is an XML-style language for describing the layout of documents. The advantage of using this XML dialect is that you separate the design from the code. Using Preppy you can insert dynamic content into the static layout. That’s great, but there is one disadvantage in using ReportLab’s rlextra package that should not be neglected: it is a proprietary package.
But, as so often, there is a very good alternative implementation of this package that is maintained under an open-source license. This implementation is called z3c.rml and comes out of the community around the free and open-source web application server Zope, which is written in Python.
To get started you should install the needed packages, namely z3c.rml, reportlab and preppy. All these packages are published on the Python Package Index (PyPI), so you can install them using pip or easy_install, for example with `pip install z3c.rml reportlab preppy`.
And that’s it.
Let’s take a look at the abilities of reportlab without using RML or Preppy. Using reportlab you could easily generate *.pdf files by declaring the structure directly in Python. All objects like images, statistics, tables and normal text are defined and filled within your Python script. But just have a look at the code:
If you place an image named logo.jpg within the same directory as this script and execute it, you will get something like this. I think this example is pretty self-explanatory, and if you would like to learn more about using reportlab as a standalone solution, just have a look at ReportLab’s documentation. However, personally I do not like to define layouts within code. I prefer the separation of the actual template and the dynamic content. This also simplifies the declaration of recurring elements like a headline, footer or watermark. That is why I use Preppy and RML.
Using RML templates with Preppy
The Report Markup Language (RML) is an XML dialect defined and used by ReportLab in order to separate dynamic from static content. An RML template is subdivided into three parts. The first part defines page templates, the second part defines style sheets that simplify the formatting of elements, and the last part contains the actual content. This structure allows the user to define static parts like a headline or footer that appear repeatedly on different pages.
The following example is a bit more complex and therefore includes some graphical elements like lines and centered strings. Furthermore, two templates are used, one for the title page and one for content pages. The dynamic content, like the author information and the table content, is another example of an application of Preppy. The process of inserting this content into the template is pretty much the same as for the examples in my last post “Templating using Python and Preppy”. However, just get a first impression:
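A heavily trimmed sketch of what such a template can look like, held in a Python string so that its structure can be checked with the standard library. The template IDs `title` and `content`, all coordinates and texts, and the `{{author}}` Preppy placeholder are assumptions for illustration only:

```python
import xml.etree.ElementTree as ET

# Illustrative RML: two page templates, one style and a short story.
RML_SKETCH = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document SYSTEM "rml.dtd">
<document filename="report.pdf">
  <template pageSize="(595, 842)">
    <pageTemplate id="title">
      <pageGraphics>
        <setFont name="Helvetica-Bold" size="24"/>
        <drawCentredString x="297" y="600">Monthly Report</drawCentredString>
        <lines>60 580 535 580</lines>
      </pageGraphics>
      <frame id="main" x1="60" y1="60" width="475" height="500"/>
    </pageTemplate>
    <pageTemplate id="content">
      <pageGraphics>
        <setFont name="Helvetica" size="9"/>
        <drawString x="60" y="800">Monthly Report</drawString>
        <lines>60 790 535 790</lines>
      </pageGraphics>
      <frame id="main" x1="60" y1="60" width="475" height="720"/>
    </pageTemplate>
  </template>
  <stylesheet>
    <paraStyle name="body" fontName="Helvetica" fontSize="10" leading="14"/>
  </stylesheet>
  <story>
    <para style="body">Author: {{author}}</para>
    <setNextTemplate name="content"/>
    <nextPage/>
    <para style="body">Content pages use the second template.</para>
  </story>
</document>
"""


def block_names(rml):
    """Return the tag names of the blocks directly below the root node."""
    root = ET.fromstring(rml)
    return [child.tag for child in root]


def template_ids(rml):
    """Return the IDs of all page templates in document order."""
    root = ET.fromstring(rml)
    return [node.get("id") for node in root.iter("pageTemplate")]
```

The two helpers only confirm that the markup is well-formed and show the order of the blocks; they do not validate the document against the RML DTD.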
I think it is easy to understand how RML works. The only thing you have to understand is the organisation of the document. It starts with the normal XML header, followed by the doctype declaration for RML documents. The next step is to specify the root node of the document, the `<document>` block, which contains three other main blocks, namely `<template>`, `<stylesheet>` and `<story>`. These three blocks are the containers for your static page content, your style sheets and your dynamic and/or flowable content.
You start by defining all templates for your pages within the `<template>` block. Each of these templates is declared using the `<pageTemplate>` element, which holds all content that should be displayed on every page using this template. So here you would place your headlines and your footers.
Within the next block, the `<stylesheet>` block, you can define styles that can be used with the corresponding elements within the `<story>` block. It is a bit like creating Cascading Style Sheets for your HTML pages.
The last block, the `<story>` block, contains the actual content, like plain text, tables, illustrations, diagrams or program code. The content placed within this block is automatically inserted into the first template specified within the `<template>` block, so in the example above the first page template is used. In order to switch to another template, you can embed a `<setNextTemplate>` element with the name attribute set to the corresponding template ID, and the new template will be used on the next page. In the example above I forced a page break, so the following content is directly embedded within the specified template.
However, that’s all about the basic structure of RML documents. Since I just want to show you the abilities of reportlab, z3c.rml and preppy, I will not explain the above document any further and head over to the Python code:
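A minimal sketch of such a script, under a few assumptions: the function names, the file names and the dictionary keys `author` and `rows` are hypothetical, the template is assumed to be stored as a Preppy file, and `getModule()`/`getOutput()` are used as described in the Preppy documentation:

```python
import csv


def load_rows(csv_path):
    """Read the CSV file into a list of rows (each row a list of strings)."""
    with open(csv_path, newline="") as handle:
        return list(csv.reader(handle))


def render_pdf(template_path, csv_path, pdf_path, author):
    """Fill the RML template via Preppy and convert the result to a PDF."""
    # Imported here so load_rows() stays usable without the PDF toolchain.
    import preppy
    from z3c.rml import rml2pdf

    # Compile the Preppy template into a Python module.
    template = preppy.getModule(template_path)

    # Fill the placeholders; the keys are hypothetical and must match
    # the names used inside the template.
    rml = template.getOutput({"author": author, "rows": load_rows(csv_path)})

    # parseString() returns a file-like object holding the finished PDF.
    pdf = rml2pdf.parseString(rml)
    with open(pdf_path, "wb") as handle:
        handle.write(pdf.read())
```

A call could look like `render_pdf("report.prep", "data.csv", "report.pdf", author="Jane Doe")`, with all three file names being placeholders.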
The interesting part of this script is the main function. First of all, a new Python module is created from the RML document using Preppy, which is then fed with the dynamic data, like the author information and the table data. The resulting RML document contains all this information and can now be parsed using z3c.rml. This is achieved by calling the parseString() function of the rml2pdf module. The result of this call already holds the content of the PDF file, which just has to be written to the file itself. That’s all, you are done. The PDF file should look like this.
The *.csv file I used in the above example can be downloaded here.
I think that the combination of ReportLab, the Report Markup Language (RML) and the Preppy text pre-processor is a very powerful solution for automatically generating *.pdf files. As I already mentioned, the range of applications for such automated reports is huge. You could create personalized form letters, documentation or business reports in just a couple of minutes.
The z3c.rml package, as an alternative to the proprietary rlextra package of ReportLab, has some limits, but it packs enough features to keep up with the competition. It enables the user to separate the template from the dynamic content, so designers and developers can work in parallel.
I hope you enjoyed this short introduction. I would be happy to hear about your experience and to see your results. In one of my next posts I will show you how to use these libraries to generate reports fed with data from a sensor network connected to my BeagleBone Black.
Stay tuned and until next time - happy coding!