HErmes.fitting package module

Provide routines for fitting charge histograms.

(charges, model, startparams=None, rej_outliers=False, nbins=200, silent=False, parameter_text=(('$\\mu_{{SPE}}$& {:4.2e}\\\\', 5), ), use_minuit=False, normalize=True, **kwargs)[source]

Standardized fitting routine.

  • charges (np.ndarray) – Charges obtained in a measurement (no histogram)
  • model – A model to fit to the data
  • startparams (tuple) – initial parameters to model, or None for first guess
Keyword Arguments:
  • rej_outliers (bool) – Remove extreme outliers from data
  • nbins (int) – Number of bins
  • parameter_text (tuple) – will be passed to model.plot_result
  • use_minuit (bool) – use minuit to minimize startparams for best chi2
  • normalize (bool) – normalize data before fitting
  • silent (bool) – silence output

Returns: tuple

(data, m=2)[source]

A simple way to remove extreme outliers from data

  • data (np.ndarray) – data with outliers
  • m (int) – number of standard deviations beyond which data should be discarded
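The outlier cut described above can be sketched as follows. This is a minimal illustration, assuming that "outside" means farther than m standard deviations from the mean; the name reject_outliers_sketch is hypothetical and the actual HErmes implementation may differ.

```python
import statistics


def reject_outliers_sketch(data, m=2):
    """Keep only values closer than m standard deviations to the mean.

    Hypothetical helper for illustration, not the HErmes implementation.
    """
    mean = statistics.fmean(data)
    std = statistics.pstdev(data)  # population standard deviation
    return [value for value in data if abs(value - mean) < m * std]


# One extreme value among otherwise similar measurements
charges = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95, 50.0]
cleaned = reject_outliers_sketch(charges, m=2)
```

Because the outlier itself inflates the mean and the standard deviation, a cut like this only removes truly extreme values.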


HErmes.fitting.functions module

Provide mathematical functions which can be used to create models. The functions must always have the form f(x, *parameters), where the parameters will be fitted and x are the input values.
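As a small illustration of the required form, the following sketch defines a function whose first argument is x and whose remaining arguments are the fit parameters. The function shifted_exponential and its parameters are hypothetical and not part of HErmes.

```python
import math


def shifted_exponential(x, lmbda, offset):
    """Hypothetical model function in the form f(x, *parameters)."""
    return lmbda * math.exp(-lmbda * x) + offset


# The fit parameters can be passed as an unpacked tuple
params = (2.0, 0.5)
value = shifted_exponential(0.0, *params)
```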

HErmes.fitting.functions.calculate_chi_square(data, model_data)[source]

Very simple estimator for goodness-of-fit. Use with care. Non-normalized bin counts are required.

  • data (np.ndarray) – observed data (bincounts)
  • model_data (np.ndarray) – model predictions for each bin
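A minimal sketch of such an estimator, assuming Pearson's definition (squared difference divided by the model prediction, summed over bins). Whether HErmes uses exactly this definition is an assumption, and chi_square_sketch is a hypothetical name.

```python
def chi_square_sketch(data, model_data):
    """Pearson chi-square over bins; hypothetical illustration only."""
    total = 0.0
    for observed, expected in zip(data, model_data):
        if expected > 0:  # skip bins where the model predicts nothing
            total += (observed - expected) ** 2 / expected
    return total


# Raw (non-normalized) bin counts and the model prediction per bin
chi2 = chi_square_sketch([10, 20, 30], [12.0, 18.0, 30.0])
```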


HErmes.fitting.functions.calculate_reduced_chi_square(data, model_data, sigma)[source]

Very simple estimator for goodness-of-fit. Use with care.

  • data (np.ndarray) – observed data
  • model_data (np.ndarray) – model predictions
  • sigma (np.ndarray) – associated errors



Get the sigma of the Gaussian from its peak value. The Gaussian is assumed to be normed.

Parameters:amp (float) –
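For a normed Gaussian the peak value is 1 / (sigma * sqrt(2 * pi)), so sigma follows directly from the peak. The sketch below illustrates this relation; the function names are hypothetical and not taken from HErmes.

```python
import math


def sigma_from_peak(amp):
    """Sigma of a normed Gaussian with peak value amp (hypothetical helper)."""
    return 1.0 / (amp * math.sqrt(2.0 * math.pi))


def normed_gauss(x, mu, sigma):
    """Normed Gaussian, used here only to check the relation."""
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return norm * math.exp(-0.5 * ((x - mu) / sigma) ** 2)


peak = normed_gauss(0.0, 0.0, 0.5)  # peak of a Gaussian with sigma = 0.5
recovered = sigma_from_peak(peak)
```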
HErmes.fitting.functions.exponential(x, lmbda)[source]

An exponential model, e.g. for a decay with coefficient lmbda.

  • x (float) – input
  • lmbda (float) – The exponent of the exponential


HErmes.fitting.functions.fwhm_gauss(x, mu, fwhm, amp)[source]

A Gaussian typically used for energy spectra fits of radiation, where resolutions/linewidths are typically given as full width at half maximum (fwhm).

  • x (float) – input
  • mu (float) – peak position
  • fwhm (float) – full width half maximum
  • amp (float) – amplitude

Returns: function value
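The full width at half maximum and sigma are related by fwhm = 2 * sqrt(2 * ln 2) * sigma. The sketch below uses this conversion and assumes that amp is the peak height; that interpretation of amp, and the name fwhm_gauss_sketch, are assumptions.

```python
import math

# Conversion factor from fwhm to sigma
FWHM_TO_SIGMA = 1.0 / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def fwhm_gauss_sketch(x, mu, fwhm, amp):
    """Gaussian parametrized by fwhm; amp is assumed to be the peak height."""
    sigma = fwhm * FWHM_TO_SIGMA
    return amp * math.exp(-0.5 * ((x - mu) / sigma) ** 2)


peak = fwhm_gauss_sketch(5.0, 5.0, 2.0, 3.0)  # value at the peak position
half = fwhm_gauss_sketch(6.0, 5.0, 2.0, 3.0)  # value at mu + fwhm / 2
```

At a distance of fwhm / 2 from the peak position, the function drops to half of its peak value, which is what the parametrization is meant to express.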


HErmes.fitting.functions.gauss(x, mu, sigma)[source]

Returns a normed gaussian.

  • x (np.ndarray) – x values
  • mu (float) – Gauss mu
  • sigma (float) – Gauss sigma


HErmes.fitting.functions.n_gauss(x, mu, sigma, n)[source]

Returns a normed Gaussian in the case of n == 1. If n > 1, the Gaussian mean is shifted by n and its width is enlarged by a factor of n. The envelope of a sequence of these Gaussians will be an exponential.

  • x (np.ndarray) – x values
  • mu (float) – Gauss mu
  • sigma (float) – Gauss sigma
  • n (int) – > 0, linear coefficient



Create a Pandel function with the defined parameters. The Pandel function is very specific: it is a parametrisation of the delay-time distribution of photons from a source s measured at a receiver r after traversing a large distance (compared to the size of source or receiver) in a homogeneous scattering medium such as ice or water. The version here has a number of fixed parameters optimized for IceCube. This function will generate a Pandel function with a single free parameter, which is the distance between source and receiver.

Parameters:c_ice (float) – group velocity in ice in m/ns
Returns:callable (float, float) -> float
HErmes.fitting.functions.poisson(x, lmbda)[source]

Poisson probability

  • x (int) – measured number of occurrences
  • lmbda (int) – expected number of occurrences
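The Poisson probability of observing k occurrences for an expectation lmbda is lmbda**k * exp(-lmbda) / k!. A minimal standard-library sketch follows; poisson_sketch is a hypothetical name and not the HErmes function.

```python
import math


def poisson_sketch(k, lmbda):
    """Poisson probability of k occurrences given the expectation lmbda."""
    return lmbda ** k * math.exp(-lmbda) / math.factorial(k)


p_zero = poisson_sketch(0, 2.0)  # probability to observe nothing
total = sum(poisson_sketch(k, 2.0) for k in range(50))  # close to 1
```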



The so-called Williams correction can help to correct a chi2 value in case of bins with low statistics (< 5 entries)

HErmes.fitting.model module

Provide a simple, easy to use model for fitting data and especially distributions. The model is capable of having “components”, which can be defined and fitted individually.

class HErmes.fitting.model.Model(func, startparams=None, limits=((-inf, inf), ), errors=(10.0, ), func_norm=1)[source]

Bases: object

Describe data with a prediction. The Model class allows to set a function for data prediction and to fit it to the data by means of a chi2 fit. It is possible to use a collection of functions to describe a complex model, e.g. a Gaussian plus an exponential tail. The individual models can be fitted independently, which results in sum_i n_i degrees of freedom for i models with n_i parameters each, or alternatively they can be coupled and share parameters, which results in sum_i n_i - n_ij degrees of freedom, where n_ij is the number of shared parameters.

add_data(data, data_errs=None, bins=200, create_distribution=False, normalize=False, density=True, xs=None, subtract=None)[source]

Add some data to the model, in preparation for the fit. There are two modes:

  1. Data needs to be histogrammed. In that case, make sure to set ‘bins’ appropriately and set the ‘create_distribution’ flag.
  2. Data does NOT need to be histogrammed. In that case, bins has no meaning. For a meaningful calculation of chi2, the errors of the data points need to be given to data_errs.

Parameters:data (np.array) – input data
Keyword Arguments:
  • data_errs (np.array) – errors of the data for the chi2 calculation (only used when not histogramming)
  • bins (int/np.array) – number of bins or bin array to be passed to the histogramming routine
  • create_distribution (bool) – data requires the creation of a histogram first before fitting
  • subtract (callable) – ?
  • normalize (bool) – normalize the data before adding
  • density (bool) – if normalized, assume the data is a pdf. If False, use bincount for normalization.

Use func to estimate better startparameters for the initialization of the fit.

Parameters:func (callable) – The function func has to have the same number of parameters as there are startparameters.

Reset the model. This basically deletes all components and resets the startparameters.

construct_error_function(startparams, errors, limits, errordef)[source]

Construct the error function together with the necessary parameters for minuit.

  • startparams (tuple) – A set of startparameters. 1 start parameter per function parameter. A good choice of start parameters helps the fit a lot.
  • limits (tuple) – individual limit min/max for each parameter 1 tuple (min/max) per parameter
  • errors (tuple) – One value per parameter, giving a 1-sigma error estimate
  • errordef (float) – The errordef should be 1 for a least-squares fit (which is what this is constructed for) or 0.5 in case of a likelihood fit

Returns: tuple (callable, dict)


“Lock” the model after all components have been added. This will determine a set of startparameters. After this, no other models can be coupled or added any more.


Couple the models by a variable, which means that the variable is not used independently in all model components, but fitted only once. E.g. if there are 2 models with parameters p1, p2, k each and they are coupled by k, the parameters p11, p21, p12, p22, and k will be fitted instead of p11, p21, k1, p12, p22, k2 (the second index numbers the model).

Parameters:coupling_variable – index of the variable in startparams. This must be the index into the respective tuple.
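The bookkeeping behind coupling can be sketched as follows. This is only an illustration of how the list of fitted parameters shrinks when a variable is shared; the function coupled_parameter_names and the naming scheme (parameter name followed by the component number) are assumptions, and the sketch couples by name rather than by index.

```python
def coupled_parameter_names(component_pars, coupled):
    """List the fitted parameters when some names are shared (hypothetical)."""
    names = []
    for number, pars in enumerate(component_pars, start=1):
        for name in pars:
            if name in coupled:
                # a coupled parameter is fitted only once
                if name not in names:
                    names.append(name)
            else:
                # independent parameters get the component number appended
                names.append(f"{name}{number}")
    return names


components = [("p1", "p2", "k"), ("p1", "p2", "k")]
free = coupled_parameter_names(components, coupled={"k"})
uncoupled = coupled_parameter_names(components, coupled=set())
```

Two components with three parameters each give six fitted parameters when uncoupled and five when coupled by k.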

Assign a new set of start parameters obtained by calling the first guess method.

Parameters:data (np.ndarray) – input data, used to evaluate the first guess method.

Get the variable names and coupling references for the individual model components

fit_to_data(silent=False, use_minuit=True, errors=None, limits=None, errordef=1, debug_minuit=False, **kwargs)[source]

Apply this model to data. This will perform the fit with the help of either minuit or scipy.optimize.

  • data (np.ndarray) – the data, unbinned
  • silent (bool) – silence output
  • use_minuit (bool) – use minuit for fitting
  • errors (list) – errors for minuit, see minuit manual
  • limits (list of tuples) – limits for minuit, see minuit manual
  • errordef (float) – typically 1 for a chi2 fit and 0.5 for a llh fit. This class is currently set up as a least-squares fit, so this should not be changed
  • debug_minuit (int) – if True, attach the iminuit instance to the model so that it can be inspected later on. Will raise an error if use_minuit is set to False at the same time
  • **kwargs – will be passed on to scipy.optimize.curve_fit



If a previous fit has been done with debug_minuit set, the iminuit instance can now be accessed.


The number of degrees of freedom of this model. In a least-squares fit, this is the number of data points minus the number of fit parameters.

plot_result(ymin=1000, xmax=8, ylabel='normed bincount', xlabel='Q [C]', fig=None, log=True, figure_factory=None, axes_range='auto', model_alpha=0.3, add_parameter_text=(('$\\mu_{{SPE}}$& {:4.2e}\\\\', 0), ), histostyle='scatter', datacolor='k', modelcolor='r')[source]

Show the fit result, together with the fitted data.

  • ymin (float) – limit the yrange to ymin
  • xmax (float) – limit the xrange to xmax
  • model_alpha (float) – 0 <= x <= 1 the alpha value of the lineplot for the model
  • ylabel (str) – label for yaxis
  • log (bool) – plot in log scale
  • figure_factory (fnc) – Use to generate the figure
  • axes_range (str) – the “field of view” to show
  • fig (pylab.figure) – A figure instance
  • add_parameter_text (tuple) – Display a parameter in the table on the plot ((text, parameter_number), (text, parameter_number),…)
  • datacolor (str) – color for the data points
  • modelcolor (str) – color for the model prediction



Adding a distribution to the model. The distribution shall contain the data we want to model.

Parameters:distr (dashi.histogram) –

Inspect functions and construct a new one which returns the added result: concat_functions(A(x, apars), B(x, bpars)) -> C(x, apars, bpars), where C(x, apars, bpars) returns A(x, apars) + B(x, bpars).

Parameters:fncs (list) – The callables to concat
Returns:tuple (callable, list(pars))
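The idea of concatenating functions can be sketched with the standard library as below. This is not the HErmes implementation: the joint function here takes its parameters as *pars instead of named arguments, and the names concat_functions_sketch, line and const are hypothetical.

```python
import inspect


def concat_functions_sketch(fncs):
    """Build C(x, *pars) = sum of all f(x, ...) and a list of parameter names."""
    # number of fit parameters per function (everything after x)
    n_pars = [len(inspect.signature(f).parameters) - 1 for f in fncs]
    names = []
    for index, f in enumerate(fncs):
        for name in list(inspect.signature(f).parameters)[1:]:
            names.append(f"{name}{index}")  # make names unique per function

    def joint(x, *pars):
        result, start = 0.0, 0
        for f, n in zip(fncs, n_pars):
            result += f(x, *pars[start:start + n])  # hand each f its own slice
            start += n
        return result

    return joint, names


def line(x, slope, offset):
    return slope * x + offset


def const(x, c):
    return c


joint, names = concat_functions_sketch([line, const])
value = joint(2.0, 3.0, 1.0, 10.0)  # line(2, 3, 1) + const(2, 10)
```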
HErmes.fitting.model.construct_efunc(x, data, jointfunc, joint_pars)[source]

Construct a least-squares error function. This function will then be minimized, e.g. with the help of minuit.

  • x (np.ndarray) – The x-values the fit should be evaluated on
  • data (np.ndarray) – The y-values of the data we want to describe
  • jointfunc (callable) – The full data model with all components
  • joint_pars (tuple) – The model parameters
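A least-squares error function of this kind can be sketched as a closure over x and data that takes only the model parameters. The sketch below leaves out the joint_pars argument and any weighting by errors; both simplifications, and the name construct_efunc_sketch, are assumptions made for illustration.

```python
def construct_efunc_sketch(x, data, jointfunc):
    """Return efunc(*pars), the sum of squared residuals (hypothetical helper)."""

    def efunc(*pars):
        return sum((jointfunc(xi, *pars) - yi) ** 2 for xi, yi in zip(x, data))

    return efunc


def line(x, slope, offset):
    return slope * x + offset


# Data generated from slope = 2, offset = 1
efunc = construct_efunc_sketch([0.0, 1.0, 2.0], [1.0, 3.0, 5.0], line)
at_truth = efunc(2.0, 1.0)  # perfect description of the data
off_by_one = efunc(2.0, 0.0)  # every point is off by 1
```

A minimizer such as minuit then varies the arguments of efunc until the returned value is minimal.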



Based on (Glenn Maynard)

Basically recreate the function f independently.

Parameters:f (callable) – the function f will be cloned
HErmes.fitting.model.create_minuit_pardict(fn, startparams, errors, limits, errordef)[source]

Construct a dictionary for minuit fitting. This dictionary contains information for the minuit fitter like startparams or limits.

  • fn (callable) – The function for which
  • startparams (tuple) – A list of start parameters, one per parameter
  • errors (list) –
  • limits (list(tuple)) – A list of (min, max) tuples for each parameter, can be None
  • errordef (float) – The errordef should be 1 for a least-squares fit (which is what this is constructed for) or 0.5 in case of a likelihood fit
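Such a dictionary could be assembled as sketched below. The key names follow the keyword convention of iminuit 1.x (the parameter name for the start value, error_<name> for the step size, limit_<name> for the limits). Whether HErmes builds exactly these keys is an assumption, and minuit_pardict_sketch is a hypothetical name.

```python
import inspect


def minuit_pardict_sketch(fn, startparams, errors, limits, errordef):
    """Collect start values, errors and limits per parameter of fn."""
    names = list(inspect.signature(fn).parameters)
    pardict = {"errordef": errordef}
    for index, name in enumerate(names):
        pardict[name] = startparams[index]
        pardict["error_" + name] = errors[index]
        if limits is not None:
            pardict["limit_" + name] = limits[index]
    return pardict


def efunc(mu, sigma):
    """Toy error function whose arguments are the fit parameters."""
    return (mu - 1.0) ** 2 + (sigma - 2.0) ** 2


pardict = minuit_pardict_sketch(
    efunc, (0.5, 1.5), (0.1, 0.1), ((-10, 10), (0, None)), 1
)
```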


Module contents

Provide an easy-to-use, intuitive way of fitting models with different components to data. The focus is less on statistically sophisticated fitting than on an explorative approach to data investigation. This might help answer questions of the form “How compatible is this data with a Gaussian + Exponential?”. Out of the box, this module provides tools targeted at a least-squares fit; in principle, however, this could be extended to likelihood fits.

Currently the generation of the minimized error function is automatic, and it is generated only for the least-squares case, however this might be expanded in the future.