Julia Micro-Benchmarks

These micro-benchmarks, while not comprehensive, do test compiler performance on a range of common code patterns, such as function calls, string parsing, sorting, numerical loops, random number generation, recursion, and array operations.

The benchmark codes are not written for absolute maximal performance (the fastest code to compute recursion_fibonacci(20) is the constant literal 6765). Instead, they are written to test the performance of identical algorithms and code patterns implemented in each language. For example, the Fibonacci benchmarks all use the same (inefficient) doubly-recursive algorithm, and the pi summation benchmarks use the same for-loop. The “algorithm” for matrix multiplication is to call the most obvious built-in/standard random-number and matmul routines (or to call BLAS directly if the language does not provide a high-level matmul), except where a matmul/BLAS call is not possible (such as in JavaScript).
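To make the "same algorithm in every language" idea concrete, here is a minimal Python sketch of the two code patterns named above. It is an illustration, not the benchmark suite's actual source: the function names `fib` and `pi_sum` and the loop counts (`outer=500`, `terms=10000`) are assumptions chosen for the example.

```python
def fib(n):
    """Naive doubly-recursive Fibonacci (deliberately inefficient)."""
    # Each call spawns two more calls, so this stresses function-call overhead.
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def pi_sum(outer=500, terms=10000):
    """Partial sum of 1/k^2 computed with plain nested for-loops.

    The loop counts are assumed values for this sketch. The inner sum is
    recomputed from scratch on every outer pass, so the work is pure
    loop and floating-point arithmetic.
    """
    total = 0.0
    for _ in range(outer):
        total = 0.0
        for k in range(1, terms + 1):
            total += 1.0 / (k * k)
    return total


if __name__ == "__main__":
    print(fib(20))   # 6765, the constant mentioned in the text
    print(pi_sum())  # approaches pi^2 / 6 as `terms` grows
```

A faster Fibonacci (memoized or iterative) would defeat the purpose here: the point of the comparison is how each language handles the same recursive call pattern, not how quickly the number can be obtained.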

The data presented here is generated using this IJulia benchmarks notebook.

| Benchmark | C | Julia | LuaJIT | Go | Fortran | Java | JavaScript | Matlab | Mathematica | Python | R | Octave |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Version | gcc 4.8.5 | 0.6.2 | scilua v1.0.0-b12 | go1.9 | gcc 4.8.5 | 1.8.0_15 | V8 4.5.103.53 | R2018a | 11.1.1 | 3.6.3 | 3.3.1 | 4.0.3 |
| iteration_pi_sum | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.01 | 1.00 | 1.01 | 1.55 | 14.75 | 8.92 | 373.94 |
| recursion_fibonacci | 1.00 | 1.96 | 1.36 | 1.93 | 0.58 | 1.73 | 3.84 | 18.69 | 139.89 | 100.77 | 608.81 | 13127.38 |
| recursion_quicksort | 1.00 | 0.94 | 1.51 | 1.16 | 1.31 | 2.63 | 2.91 | 2.35 | 44.77 | 37.32 | 264.98 | 2559.42 |
| parse_integers | 1.00 | 1.35 | 0.99 | 0.97 | 5.29 | 2.82 | 6.41 | 229.56 | 14.52 | 19.98 | 50.90 | 3030.13 |
| print_to_file | 1.00 | 0.66 | 0.57 | 1.63 | 6.35 | 10.60 | -- | 110.45 | 65.67 | 4.49 | 170.91 | 165.41 |
| matrix_statistics | 1.00 | 1.79 | 1.73 | 6.10 | 1.96 | 4.83 | 11.33 | 8.10 | 7.85 | 17.93 | 20.35 | 48.22 |
| matrix_multiply | 1.00 | 0.98 | 1.11 | 1.25 | 1.27 | 7.99 | 24.71 | 1.16 | 1.20 | 1.18 | 8.74 | 1.21 |
| userfunc_mandelbrot | 1.00 | 0.75 | 1.03 | 0.79 | 0.75 | 1.12 | 1.07 | 10.07 | 19.20 | 132.38 | 333.03 | 6622.45 |

Figure: benchmark times relative to C (smaller is better, C performance = 1.0).
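The normalization behind these numbers is a simple ratio: each language's time for a benchmark divided by the C time for the same benchmark, so C is 1.0 by construction. A minimal Python sketch follows. The helper name `relative_to_baseline` and the raw timings are hypothetical, made up for illustration and not taken from the benchmark data.

```python
def relative_to_baseline(times, baseline="C"):
    """Divide every timing by the baseline's timing for the same benchmark."""
    base = times[baseline]
    return {lang: t / base for lang, t in times.items()}


# Hypothetical raw timings in milliseconds for one benchmark (not real data).
raw_ms = {"C": 2.0, "Julia": 1.5, "Python": 250.0}

relative = relative_to_baseline(raw_ms)
for lang, ratio in relative.items():
    # Values below 1.0 mean the language beat C on this benchmark.
    print(f"{lang:8s} {ratio:8.2f}")
```

Because the ratios are computed per benchmark, a value such as 0.75 in one row and 1.96 in another for the same language are not directly comparable in absolute time; each only says how that language fared against C on that particular task.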

C and Fortran are compiled with gcc 4.8.5, taking the best timing from all optimization levels (-O0 through -O3). C, Fortran, Go, Julia, Lua, Python, and Octave use OpenBLAS v0.2.19 for matrix operations; Mathematica uses Intel(R) MKL. The Python environment is Anaconda Python v3.6.3. The Python implementations of rand_mat_stat and rand_mat_mul (the matrix_statistics and matrix_multiply rows above) use NumPy v1.13.1 and OpenBLAS v0.2.19 functions; the rest are pure Python implementations. Raw benchmark numbers in CSV format are available here.

These micro-benchmark results were obtained on a single core (serial execution) on an Intel(R) Core(TM) i7-3960X 3.30GHz CPU with 64GB of 1600MHz DDR3 RAM, running openSUSE LEAP 42.3 Linux.
