Effortless provenance tracking in Python

613 commits | Last update: March 22, 2018

What recipy can do for you

  • Keep track of what code you ran to generate results (e.g., graphs or data)
  • Add a single statement to enable provenance tracking in your Python script
  • Search your runs using a command line interface or GUI
  • Customize provenance tracking for each project

Imagine the situation: You’ve written some wonderful Python code which produces a beautiful graph as an output. You save that graph, naturally enough, as graph.png. You run the code a couple of times, each time making minor modifications. You come back to it a week, a month, or a year later. Do you remember how you created that graph? Which input data did you use? Which version of your code? Frustratingly, the answer will often be ‘no’. Of course, you then waste lots of time trying to work out how you created it, or even give up and never use it in that journal paper that will win you a Nobel Prize…

recipy (from recipe and python) is a Python module that will save you from this situation! (Although it can’t guarantee that your paper will win a Nobel Prize!) Add a single line of code to the top of your Python files, and recipy will log each run of your code to a database, keeping track of the input files, the output files and the version of your code. You can then query this database to find out how you actually created graph.png.
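To make the idea concrete, here is a minimal sketch of this kind of run logging, using only the Python standard library. It is not recipy's implementation: the function names (open_log, log_run, find_runs), the table layout, the choice of SQLite, and the use of a SHA-256 hash of the source text as the "code version" are all assumptions made for illustration. In recipy itself, the single line you add is `import recipy`, and the logging happens for you.

```python
import hashlib
import sqlite3
import time


def open_log(db_path=":memory:"):
    """Open (or create) the run log. The schema is hypothetical."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS runs ("
        "id INTEGER PRIMARY KEY, script TEXT, code_hash TEXT, "
        "inputs TEXT, outputs TEXT, started REAL)"
    )
    return conn


def log_run(conn, script, source, inputs, outputs):
    """Record one run: the script name, a hash of its source, and its files."""
    # Assumption: a hash of the source text stands in for the code version.
    code_hash = hashlib.sha256(source.encode("utf-8")).hexdigest()
    cur = conn.execute(
        "INSERT INTO runs (script, code_hash, inputs, outputs, started) "
        "VALUES (?, ?, ?, ?, ?)",
        (script, code_hash, "\n".join(inputs), "\n".join(outputs), time.time()),
    )
    conn.commit()
    return cur.lastrowid


def find_runs(conn, output_name):
    """Return every logged run that wrote `output_name`, newest first."""
    rows = conn.execute(
        "SELECT id, script, code_hash, inputs, outputs FROM runs ORDER BY id DESC"
    ).fetchall()
    # Match whole file names, not substrings of them.
    return [row for row in rows if output_name in row[4].split("\n")]


if __name__ == "__main__":
    conn = open_log()
    log_run(conn, "plot.py", "print('v1')", ["data.csv"], ["graph.png"])
    log_run(conn, "plot.py", "print('v2')", ["data_clean.csv"], ["graph.png"])
    for run_id, script, code_hash, inputs, _outputs in find_runs(conn, "graph.png"):
        print(run_id, script, code_hash[:8], inputs)
```

Asking the log for graph.png returns both runs, newest first, each with its own input file and code hash. That is exactly the question from the story above: which data and which version of the code produced this file?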

  • Optimized data handling
  • Inter-operability & linked data
Programming Language
  • Python
License
  • Apache-2.0

Participating organizations


  • Robin Wilson
    University of Southampton
  • Janneke van der Zwaan
    Netherlands eScience Center
Contact person
Janneke van der Zwaan
Netherlands eScience Center