CERN ROOT is a powerful object oriented program in C++ data analysis framework with statistical and fitting methods used by many physicists in fundamental research. It developed and is maintained by the European Organization for Nuclear Research, known as CERN. The CERN ROOT analysis framework was used the analysis and figures presented in the discovery of the Higgs Boson.

I used ROOT in my own graduate research in astroparticle physics given the amount of particle physicist in my community. While I started my graduate research using ROOT in C/C++, I found Python to be a powerful interpreter and transitioned away from C/C++ in my analysis. Luckily, ROOT can now be imported into Python as PyRoot. Python also has the module SciPy (Scientific Python) with fitting and statistical methods for scientific computing. This is an example of how to use PyRoot and comparing it to SciPy by fitting a Gaussian distribution. A link to the raw codes and outputs in this example is in a Jupyter python notebook on my Github profile.

Reviewing the Python syntax in the Jupyter notebook linked above.

Import SciPy and PyRoot

Import the desired python modules, specifically SciPy and PyRoot.

import numpy as np
import scipy as sp
from scipy.optimize import curve_fit
from scipy.stats import chi2
from ROOT import TCanvas, TGraph, TH1F, TF1, gStyle, TH1D

The versions I used in my example are:

Scipy v1.1.0 Numpy v1.15.0
ROOT Version: 6.18/04

Creating a SciPy Histogram

Generate a test distribution using 500 random samples from a gaussian distribution using scipy.random.normal() with μ=1, σ=.1. The histogram was binned using plt.hist().

# Define Python Histogram #
mean, std = 1 , 0.1
data = sp.random.normal(mean,std,500)

# Define Bin Size #
xmin = np.floor(10.*data.min())/10.
xmax = np.ceil(10.*data.max())/10.
nbins = int((xmax-xmin)*100)
# print(xmin, xmax, nbins)

# Create Python Histogram #
hist, bin_edges, patches = plt.hist(data,nbins,(xmin,xmax),color='g',alpha=0.6)
bin_centers = (bin_edges[1:]+bin_edges[:-1])/2.

# Find Non-zero bins in Histogram
nz = hist>0
first_nz = bin_centers[nz][ 0] - 0.005
last_nz  = bin_centers[nz][-1] + 0.005

Creating the histogram in PyRoot

Define the PyRoot histogram using TH1D(). Then using the distribution sampled above and load it the histogram h with SetContent().

root_hist = np.zeros(nbins+2,dtype=float)
root_hist[1:-1] = hist
h = TH1D('h','hist',nbins,bin_edges)
h.SetContent(root_hist)
h.SetTitle("Root Histogram with Fit;X Axis;Y Axis [Counts]")

Fit the Histogram with PyRoot

Fit the PyRoot histogram with Fit() using the ROOT predefined gaus function over the range xmin to xmax. The retrieve the fit function with GetFunction(), retrieve the fit function f using GetParameter(), the fit function parameter error using GetParError(), and the fit statistics with GetNDF(), GetChisquared(), and GetProb().

# Fit histogram with root #
h.Fit('gaus','','',xmin,xmax)

# Get Root Fit and Goodness of Fit Parameters #
f = h.GetFunction('gaus')
const,mu,sigma = f.GetParameter(0), f.GetParameter(1), f.GetParameter(2)
econst,emu,esigma = f.GetParError(0), f.GetParError(1), f.GetParError(2)
ndf,chi2,prob = f.GetNDF(),f.GetChisquare(),f.GetProb()

print(chi2, ndf)
print(chi2/ndf,prob)

which outputs the fit statistics

49.08659294755991 48
1.0226373530741648 0.4293397390693532
 FCN=49.0866 FROM MIGRAD    STATUS=CONVERGED      68 CALLS          69 TOTAL
                     EDM=8.44808e-08    STRATEGY= 1      ERROR MATRIX ACCURATE
  EXT PARAMETER                                   STEP         FIRST   
  NO.   NAME      VALUE            ERROR          SIZE      DERIVATIVE
   1  Constant     1.82687e+01   1.13856e+00   2.97321e-03   2.68351e-05
   2  Mean         1.00457e+00   5.02995e-03   1.73624e-05  -8.12271e-02
   3  Sigma        1.00541e-01   4.61612e-03   4.01582e-05   5.02345e-03

Plot Fitted Histogram in PyRoot

Plotting the PyRoot canvas and histogram and save the canvas as a pdf.

# Draw Fit and Histogram#
h.Draw()
c1.Draw()
c1.SaveAs('root_fit.png')
ROOT fit to the histogram

PyRoot fit in red to the gaussian sampled distribution. The fit statistics from ROOT are in the upper right.

SciPy Fit

Fit the distribution with SciPy using curve_fit() with the below defined gaus().

#SciPy Fit #

# Gaussian Fit Function :
def gaus(x, const, mu, sigma):
    return const* np.exp(-0.5*((x - mu)/sigma)**2)

# Define Range and Fit :
coeff, covar = curve_fit(gaus, bin_centers[nz], hist[nz])

PyRoot vs SciPy

Comparing PyRoot to SciPy and plot both using the retrieved gaussian fit parameters. Calculate the Χ2 value and fit probability manually to confirm the fit statistics outputs of PyRoot.

# Define Fit Curves #
x = bin_centers
root_gaus = (const,mu,sigma)
opti_gaus = coeff

# Calculate chi Squared
f_root = gaus(x,*root_gaus)
f_opti = gaus(x,*opti_gaus)
ch2_root = np.sum( (hist[nz]-f_root[nz])**2/hist[nz])
ch2_opti = np.sum( (hist[nz]-f_opti[nz])**2/hist[nz])

# Calculate Degrees of Freedom with 3 fit parameters
dof = len(x[nz])-3

# Calculate Probablity #
p_root = sp.stats.chi2.sf(ch2_root, dof)
p_opti = sp.stats.chi2.sf(ch2_opti, dof)

print('Hist\tCalc Package\tChi Sq\t Chi2/dof\tprob')
print('Root\tRoot\t{0}\t{1}\t{2}'.format(chi2, chi2/dof, prob))
print('Root\tPython\t{0}\t{1}\t{2}'.format(ch2_root, ch2_root/dof, p_root))
print('SciPy\tPython\t{0}\t{1}\t{2}'.format(ch2_opti, ch2_opti/dof, p_opti))
print('Degrees of Freedom =\t{0}'.format(str(dof)))

which outputs

Hist	Calc Package	Chi Sq	 Chi2/dof	prob
Root	Root	49.08659294755991	1.0226373530741648	0.4293397390693532
Root	Python	49.08659294755992	1.022637353074165	0.42933973906935374
Scipy	Python	54.44279541875839	1.1342249045574664	0.24270327085696833
Degrees of Freedom =	48

PyRoot achieves a better Χ2/ndf and thus a better fit. I hypothesis difference in fits stats from the same data is due the minimization methods of the two modules. SciPy has more examples online with how to use but this example will hopefully allow you to use PyRoot instead as it is more robust.

PyRoot vs SciPy

SciPy in red and PyRoot in black.

Resources

Fit data using PyROOT… and plot the result using matplotlib