Kernel Tuner

Kernel Tuner simplifies the development of highly optimized, auto-tuned CUDA, OpenCL, and C code. It supports many advanced use cases and offers optimization strategies that speed up the auto-tuning process.

1098 commits | Last update: June 17, 2022

What Kernel Tuner can do for you

  • Allows developers to easily unit test and auto-tune GPU code
  • Generic auto-tuning of user-defined parameters for CUDA, OpenCL, and C kernels
  • Supports more than 20 search optimization methods to speed up tuning
  • Successfully used in 10+ different eScience projects, across various disciplines

Kernel Tuner simplifies the development of efficient GPU programs, or kernels. It does so by making kernels written in C/C++, OpenCL, or CUDA accessible from Python, while taking care of the required synchronization between data kept in host memory and data kept in device memory.

This has a number of advantages. First, it simplifies auto-tuning of the kernel parameters. In fact, Kernel Tuner comes standard with a variety of strategies for efficiently searching the parameter space, leading to greatly improved performance of tuned kernels. Second, it allows for unit testing of GPU code from within Python.
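To make the idea of searching a parameter space concrete, here is a minimal sketch of the simplest strategy, an exhaustive search over every parameter combination. It is not Kernel Tuner's API: the `tune` function, the `fake_benchmark` cost model, and the parameter names `block_size_x` and `block_size_y` are hypothetical, chosen for illustration. A real tuner would compile and time the kernel on the device instead of evaluating a formula.

```python
import itertools


def tune(benchmark, tune_params):
    """Benchmark every combination of tunable parameters and return the best.

    benchmark: callable taking a configuration dict and returning a cost
               (lower is better); here a stand-in for timing a kernel.
    tune_params: dict mapping a parameter name to its list of candidate values.
    """
    names = list(tune_params)
    results = []
    # The Cartesian product of all candidate values is the full search space.
    for values in itertools.product(*(tune_params[name] for name in names)):
        config = dict(zip(names, values))
        results.append((benchmark(config), config))
    best_time, best_config = min(results, key=lambda result: result[0])
    return best_config, best_time, results


def fake_benchmark(config):
    # Hypothetical cost model (an assumption, not a measurement): pretend that
    # 256 threads per block is ideal and that taller blocks cost slightly more.
    threads = config["block_size_x"] * config["block_size_y"]
    return abs(threads - 256) + config["block_size_y"]


tune_params = {
    "block_size_x": [32, 64, 128, 256],
    "block_size_y": [1, 2, 4],
}

best_config, best_time, results = tune(fake_benchmark, tune_params)
print(f"evaluated {len(results)} configurations, best: {best_config}")
```

Exhaustive search grows multiplicatively with every added parameter, which is why smarter search strategies matter once the space gets large.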

Kernel Tuner adds no dependencies to the kernel code and does not require extensive code changes. Kernels tuned by Kernel Tuner also need no changes after tuning to be production ready: tuned kernels can be used as-is from any host programming language.
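One way a kernel can stay unchanged after tuning is to have it read its tunable parameters from compile-time defines, so that only the defines change between configurations. The sketch below assumes that approach; the text above does not describe the mechanism. The helper `with_defines` and the `vector_add` kernel are hypothetical examples, and the code only builds the source string without compiling or running anything.

```python
def with_defines(kernel_source, params):
    """Prepend one '#define name value' line per tuned parameter (hypothetical helper)."""
    defines = "".join(f"#define {name} {value}\n" for name, value in params.items())
    return defines + kernel_source


# Illustrative CUDA kernel: block_size_x is not declared in the kernel itself,
# so it must be supplied at compile time.
kernel_source = """__global__ void vector_add(float *c, const float *a, const float *b, int n) {
    int i = blockIdx.x * block_size_x + threadIdx.x;
    if (i < n) {
        c[i] = a[i] + b[i];
    }
}
"""

# Assumed tuning result, chosen only for illustration.
tuned_source = with_defines(kernel_source, {"block_size_x": 128})
print(tuned_source)
```

Under this assumption, any host language that can pass the same defines to the compiler can reuse the kernel file as-is.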

  • GPU
  • High performance computing
  • Multi-scale & multi model simulations
  • Real time data analysis
  • Optimized data handling
  • Big data
Programming Language
  • Python
  • CUDA
  • OpenCL
License
  • Apache-2.0

Kernel Tuner tutorial at Supercomputing 2021

By Ben van Werkhoven

November 25, 2021

Writing Testable GPU Code

By Ben van Werkhoven

April 12, 2018

2 Conference papers

"With Kernel Tuner, we were able to accelerate our CUDA kernels by a factor of 10 in just a few weeks."
– Chiel van Heerwaarden, Wageningen University & Research