Skip to content

Getting Started

Note

This page illustrates the usage of tablefill with Stata. However, any programming language can produce output to be used by tablefill. See the sample programs section for details.

tablefill allows the user to update text and numbers in LaTeX, LyX, and Markdown. Its main purpose is to aid the user in creating reproducible reports that can be automatically updated.

Installation

To install tablefill, run

pip install git+https://github.com/mcaceresb/tablefill

You can the use tablefill from the command-line via

tablefill -i input1.txt [input2.txt ...] -o output.tex template.tex

Or directly from python via

from tablefill import tablefill

tablefill(
    input    = 'input1.txt input2.txt ...',
    output   = 'output.tex',
    template = 'template.tex'
)

If you do not wish to install tablefill system-wide, you can simply download tablefill.py and place it in your project's folder. The above snippet will import it into python correctly; however, the command-line call is would change to:

python paht/to/tablefill.py -i input1.txt [input2.txt ...] -o output.tex template.tex

Overview

tablefill replaces named placeholders inside LaTeX, LyX, or Markdown documents. While the initial setup is more complex than, say, estout or tabout, tablefill is much more flexible. The workflow is typically as follows:

  1. Create a LaTeX (or LyX or Markdown) document with placeholders that you want filled with numeric (or text) output.

    • Placeholders must be either inside a labeled table or inside commented-out tablefill tags. More on this below.
  2. Create a matrix of values that correspond to the table's placeholders.

    • Values will be read in order from the topmost row, left to right.
  3. Export that matrix to a text file.

    • The matrix is preceded by a label that must match the label in your LaTeX document.
    • Numeric matrices are most common, but tablefill will also take text input.
  4. tablefill replaces the placeholders with the matrix values.

The strength of this workflow is its flexibility. The user can format their documents however they see fit, without imposing any restrictions on where the values will be filled, as long as they are inside a labeled environment. Optionally, the user can create a file with various mappings to allow multiple Stata matrices to be appended as a single LaTeX table, or different portions of a single matrix to be appended to several tables. This is covered in the XML engine section.

Basic Example in LaTeX

Template

First you need to create a file with a table that you want. This can be anything from summary statistics to regression results to paragraphs that refer to a specifics or text which need to be updated. Consider, for instance, template.tex below:

% template.tex
\documentclass{article}
\usepackage{booktabs}
\begin{document}

Tablefill will look for the label \verb'tab:example' inside
\verb'input.txt' and fill the table below:

\begin{table}
  \caption{Table caption (e.g. summary stats)}
  \label{tab:example} % name must match label in input1.txt
  \begin{tabular}{p{4.25cm}ccc}
    \toprule
    Outcomes
    & N
    & Mean
    & (Std.)
    \\\midrule
    Outcomes \#\#\# & \#0,\# & \#1\# & (\#2\#) \\
    Outcomes \#\#\# & \#0,\# & \#1\# & (\#2\#) \\
    Outcomes \#\#\# & \#0,\# & \#1\# & (\#2\#) \\
    Outcomes \#\#\# & \#0,\# & \#1\# & (\#2\#) \\
    \bottomrule
    \multicolumn{4}{p{5cm}}{\footnotesize Footnotes!}
  \end{tabular}
\end{table}

% tablefill:start tab:paragraph
Placeholders do not need to be inside a table. You can also have
placeholders in the text:
\begin{itemize}
    \item $N = \#0,\#$
    \item This is the \#\#\# sample.
\end{itemize}

Note \verb'% tablefill:start tab:paragraph' tells tablefill to start
looking for placeholders using the matrix labeled \verb'tab:paragraph'
in the input file. \verb'% tablefill:end' tells tablefill to stop.

Tablefill provides several placeholders types (more on this below),
but advanced users can use any format allowed by python via \verb'{}':
\#{}\#. Here is another table, using python-style formatting:
% tablefill:end

\begin{table}
  \caption{Table caption (e.g. regression results)}
  \label{tab:anotherExample} % must match label in input.txt
  \begin{tabular}{p{4.25cm}cc}
    \toprule
    Outcomes
    & Coef
    & (SE)
    \\\midrule
    Variable 1 & \#{:.1f}\# & (\#{:.2f}\#)\#*\# \\
    Variable 2 & \#{:.1f}\# & (\#{:.2f}\#)\#*\# \\
    Variable 3 & \#{:.1f}\# & (\#{:.2f}\#)\#*\# \\
             N & \#{:,.0f}\#
    \\\midrule
    \bottomrule
    \multicolumn{3}{p{5cm}}{\footnotesize Footnotes!}
  \end{tabular}
\end{table}
\end{document}

Placeholders

  • Placeholders are of the form ### or \#\#\# (the latter since LaTeX requires the # character to be escaped), with the middle placeholder varying depending on whether you wish to customize the printing format. The following constructs are available:
Placeholder Format Description
### Replace as is; input can be text (all other placeholders require numbers).
#\d+# Round to \d+ digits.
#\d+,# Round to \d+ digits; add thousands comma separator.
#*# Interpret input as p-value and replce with a star corresponding so significance. Detault is * 0.1, **0.05, ***0.01.
#\d+%# Round to \d+ digits; interpret as percentage.
#|#|# Get the absolute value of the number.
#{.*}# Arbitrary python format. Anything that string.format() will accept is allowed. In Python 2.6, you must prepend 0:, that is {0:.+}.
  • Each table contains somewhere in it a label{tab:...} statement.

    • This is required and will be used to identify the input to use to fill in that table.
    • It must match an entry in input.txt (created below) or it must match a name provided in the optional mapping file.
  • There must be at most one table line per code line (hence all the \\, but that also makes the table more readable).

  • The LaTeX environment must be a table environment, not tabular. See the LyX and Markdown sections below for details on how to set up tables in those formats.

Input

In order to fill this template, we need data. Consider this example input file, called input.txt

<tab:paragraph>
    5708
    'tablefill example'
    'python formatting'

<tab:example>
    1     1237.1234      1       2
    2     2234.4         3       2.4
    3     3.345345       2       2.456
    4     2234.4         3       2.4

<tab:anotherExample>
     -1.25         -1.18     0.1447266
     2.756         -0.53     9.964426e-08
      1.13       0.57235     0.02417291
      5708

See below for an example of how to create this file directly from Stata. Note that the matrices do not have to be in the same order they appear in the template. As long as the label matches, tablefill will correctly match it to the template.

Output

To get the filled output, run

tablefill -i input.txt -o filled.tex template.tex

This produces filled.tex:

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% This file was produced by 'tablefill.py'
%   Template file: /home/mauricio/Documents/projects/dev/code/archive/2015/tablefill/docs/usage/01basic/template.tex
%   Input file(s): ['/home/mauricio/Documents/projects/dev/code/archive/2015/tablefill/docs/usage/01basic/input.txt']
% To make changes, edit the input and template files.
% % 

% DO NOT EDIT THIS FILE DIRECTLY.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\documentclass{article}
\usepackage{booktabs}
\begin{document}

Tablefill will look for the label \verb'tab:example' inside
\verb'input.txt' and fill the table below:

\begin{table}
  \caption{Table caption (e.g. summary stats)}
  \label{tab:example} % name must match label in input.txt
  \begin{tabular}{p{4.25cm}ccc}
    \toprule
    Outcomes
    & N
    & Mean
    & (Std.)
    \\\midrule
    Outcomes 1 & 1,237 & 1.0 & (2.00) \\
    Outcomes 2 & 2,234 & 3.0 & (2.40) \\
    Outcomes 3 & 3 & 2.0 & (2.46) \\
    Outcomes 4 & 2,234 & 3.0 & (2.40) \\
    \bottomrule
    \multicolumn{4}{p{5cm}}{\footnotesize Footnotes!}
  \end{tabular}
\end{table}

% tablefill:start tab:paragraph
Placeholders do not need to be inside a table. You can also have
placeholders in the text:
\begin{itemize}
    \item $N = 5,708$
    \item This is the 'tablefill example' sample.
\end{itemize}

Note \verb'% tablefill:start tab:paragraph' tells tablefill to start
looking for placeholders using the matrix labeled \verb'tab:paragraph'
in the input file. \verb'% tablefill:end' tells tablefill to stop.

Tablefill provides several placeholders types (more on this below),
but advanced users can use any format allowed by python via \verb'{}':
'python formatting'. Here is another table, using python-style formatting:
% tablefill:end

\begin{table}
  \caption{Table caption (e.g. regression results)}
  \label{tab:anotherExample} % must match label in input.txt
  \begin{tabular}{p{4.25cm}cc}
    \toprule
    Outcomes
    & Coef
    & (SE)
    \\\midrule
    Variable 1 & -1.2 & (-1.18) \\
    Variable 2 & 2.8 & (-0.53)*** \\
    Variable 3 & 1.1 & (0.57)** \\
             N & 5,708
    \\\midrule
    \bottomrule
    \multicolumn{3}{p{5cm}}{\footnotesize Footnotes!}
  \end{tabular}
\end{table}
\end{document}

Explanation

The following replacements were made:

Placeholder Replacement
\#\#\# replaced as is; mainly used for text input
\#0,\# round to 0 decimal places, add thousands comma-separator
\#1\# round to 1 decimal places
\#2\# round to 2 decimal places
\#*\# Input is a p-value; replaced with significance stars
\#{}\# Passes input to "{}".format() (print as is)
\#{:.1f}\# Passes input to "{:.1f}".format() (round to 1 decimal places)
\#{:,.0f}\# Passes input to "{:,.0f}".format() (thousands comma-separator, round to 0 decimal places)

The way tablefill operates is:

  1. Per line, the program searches for the start of a table or tablefill delimiter. In LaTeX, a table starts with \begin{table}. A tablefill delimiter starts with % tablefill:start.
  2. If found, it searches for a label before the the table ends. In LaTeX \label{tab:(.+)} can appear anywhere before \end{table}. With tablefill delimiters, % tablefill:start tab:label must appear on the same line.
  3. If a label is found, it searches the input files for a label a match.
  4. Find all occurrences of placeholders (note that in LaTeX, # is a special character, so tablefill will match both # and \# as part of a placeholder construct; this is so that templates can be compiled before using tablefill to fill in the values).
  5. Repeat 3 and 4 until reaching \end{table} or % tablefill:end.
  6. Move on to next table: Repeat 1 to 5 until reaching the end of the document.

Exporting Matrices in Stata

We provide code snippets in several programming languages to illustrate the format required by tablefill as input. For this example, we will use a Stata program named saveTable. We keep a copy in this repository, and you can install it from Stata by running:

local gh_repo https://raw.githubusercontent.com/mcaceresb/tablefill
net install tablefill_example, from("`gh_repo'/master/docs/programs'")

Now you should be able to run saveTable from any Stata session. As a simple example, we create a random matrix with four rows and three columns:

matrix x = (1, 1237.1234, 1, 2)     \ ///
           (2, 2234.4,    3, 2.4)   \ ///
           (3, 3.345345,  2, 2.456)
           (4, 2234.4,    3, 2.4)

saveTable using "input.txt", outmatrix(x) tag("<tab:example>")

saveTable arguments

  • using: required, provide the filename to write to.

  • OUTMatrix: required, provide the name of the matrix intended for exporting. The capitalization OUTMatrix means that at this command can be abbreviated up to outmat, that is

saveTable using "matrix.txt", outm(x) tag("<tab:example_matrix>")
  • tag: required, string for tag for outputted matrix. Note that the format is <tab:label>. To append a matrix to the last tag in the file, provide a blank tag.
saveTable using "matrix.txt", outm(x) tag("<tab:example_matrix>")
saveTable using "matrix.txt", outm(y) tag(" ")
<tab:example_matrix>
    entries of x

    entries of y
  • Format: optional, string for numerical format for outputted data. By default the output format is %21.9f

We give the example of a numeric matrix, as this is the most common usage, but as we saw above any text that is appended after the label can be filled by tablefill (entries must be tab-delimited or appear in a separate line). To produce the input.txt file we use earlier, run the following do file:

file open fh using "input.txt", write text append
file write fh "<tab:paragraph>"     _n ///
         _tab "5708"                _n ///
         _tab "'tablefill example'" _n ///
         _tab "'python formatting'" _n
file close fh

matrix x = (1, 1237.1234, 1, 2)     \ ///
           (2, 2234.4,    3, 2.4)   \ ///
           (3, 3.345345,  2, 2.456) \ ///
           (4, 2234.4,    3, 2.4)

matrix y = (-1.25,   -1.18, 0.1447266)    \ ///
           (2.756,   -0.53, 9.964426e-08) \ ///
           (1.13,  0.57235, 0.02417291)   \ ///
           (5708,        ., .)

saveTable using "input.txt", outmatrix(x) f(%12.0g) tag("<tab:example>")
saveTable using "input.txt", outmatrix(y) f(%12.0g) tag("<tab:anotherExample>")

Now you can run

tablefill -i input.txt -o filled.tex template.tex

Basic Example in Markdown

Consider the file template.md

<!-- tablefill:start tab:paragraph -->

Sample paragraph
- $N = #0,#$
- This is the ### sample.

Python-style formatting: #{}#.

<!-- tablefill:end -->

<!-- tablefill:start tab:example -->

| Outcomes     | N    | Mean | (Std.) |
| ------------ | ---- | ---- | ------ |
| Outcomes ### | #0,# | #1,# | (#2,#) |
| Outcomes ### | #0,# | #1,# | (#2,#) |
| Outcomes ### | #0,# | #1,# | (#2,#) |
| Outcomes ### | #0,# | #1,# | (#2,#) |

<!-- tablefill:end -->

`pandoc` will compile raw LaTeX inside markdown documents, so
`tablefill` will also replace placeholders in LaTeX tables inside
markdown files. The replacement rules for LaTeX also apply here.

\begin{table}
  \caption{Table caption (e.g. regression results)}
  \label{tab:anotherExample} % must match label in input.txt
  \begin{tabular}{p{4.25cm}cc}
    \toprule
    Outcomes
    & Coef
    & (SE)
    \\\midrule
    Variable 1 & \#{:.1f}\# & (\#{:.2f}\#)\#*\# \\
    Variable 2 & \#{:.1f}\# & (\#{:.2f}\#)\#*\# \\
    Variable 3 & \#{:.1f}\# & (\#{:.2f}\#)\#*\# \\
             N & \#{:,.0f}\#
    \\\midrule
    \bottomrule
    \multicolumn{3}{p{5cm}}{\footnotesize Footnotes!}
  \end{tabular}
\end{table}
\end{document}

Using the same input.txt file as above, run

tablefill.py -i input.txt -o filled.md template.md

This produces filled.md

<!-- This file was produced by 'tablefill.py'
    Template file: /home/mauricio/Documents/projects/dev/code/archive/2015/tablefill/docs/usage/01basic/template.md
    Input file(s): ['/home/mauricio/Documents/projects/dev/code/archive/2015/tablefill/docs/usage/01basic/input.txt']
To make changes, edit the input and template files.


DO NOT EDIT THIS FILE DIRECTLY.
 -->

<!-- tablefill:start tab:paragraph -->

Sample paragraph
- $N = 5,708$
- This is the 'tablefill example' sample.

Python-style formatting: 'python formatting'.

<!-- tablefill:end -->

<!-- tablefill:start tab:example -->

| Outcomes     | N    | Mean | (Std.) |
| ------------ | ---- | ---- | ------ |
| Outcomes 1 | 1,237 | 1.0 | (2.00) |
| Outcomes 2 | 2,234 | 3.0 | (2.40) |
| Outcomes 3 | 3 | 2.0 | (2.46) |
| Outcomes 4 | 2,234 | 3.0 | (2.40) |

<!-- tablefill:end -->

`pandoc` will compile raw LaTeX inside markdown documents, so
`tablefill` will also replace placeholders in LaTeX tables inside
markdown files. The replacement rules for LaTeX also apply here.

\begin{table}
  \caption{Table caption (e.g. regression results)}
  \label{tab:anotherExample} % must match label in input.txt
  \begin{tabular}{p{4.25cm}cc}
    \toprule
    Outcomes
    & Coef
    & (SE)
    \\\midrule
    Variable 1 & -1.2 & (-1.18) \\
    Variable 2 & 2.8 & (-0.53)*** \\
    Variable 3 & 1.1 & (0.57)** \\
             N & 5,708
    \\\midrule
    \bottomrule
    \multicolumn{3}{p{5cm}}{\footnotesize Footnotes!}
  \end{tabular}
\end{table}
\end{document}

Basic Example in LyX

Warning

This section is under construction

Comparison with other methods

There are good reasons to use tablefill, and good reasons not to.

For example, if exporting your results quickly is more important than the format in which they are presented, then the additional complexity of tablefill is probably not worth the hassle. I think estout can be very helpful for these scenarios, but it does impose a specific process to create the matrix of values that underlies the table displayed in your document. It is very good if you just want to export your results, but when you want to format them, add notes around them, or customize them in any way, all this must be done from Stata, rather than LaTeX, LyX, or Markdown.

Stata is mainly made for data analysis, not text fomatting. The main idea behind tablefill is that it allows the user to do all of the formatting in LaTeX, which is what LaTeX is built for, and all the analysis in Stata. To get the data back and forth between the two, minimal structure is required (labeled output that is tab-delimited). The key idea is flexibility: tablefill allows the user to put placeholders anywhere as long as they are inside a labeled environment. You can update titles, notes, entire paragraphs, and so on.

tablefill is also great if you want to plan out your document before generating all of your results. You can create and compile a document and see exactly how it will look before adding in results (you will just see placeholders instead of your data).