API

Core Module

NumPy API for xhistogram.

xhistogram.core.histogram(*args, bins=None, range=None, axis=None, weights=None, density=False, block_size='auto')

Histogram applied along specified axis / axes.

Parameters
args : array_like

Input data. The number of input arguments determines the dimensionality of the histogram. For example, two arguments produce a 2D histogram. All args must have the same size.

bins : int, str or numpy array or a list of ints, strs and/or arrays, optional

If a list, there should be one entry for each item in args. The bin specifications are as follows:

  • If int; the number of bins for all arguments in args.

  • If str; the method used to automatically calculate the optimal bin width for all arguments in args, as defined by numpy.histogram_bin_edges.

  • If numpy array; the bin edges for all arguments in args.

  • If a list of ints, strs and/or arrays; the bin specification as above for every argument in args.

When bin edges are specified, all but the last (righthand-most) bin include the left edge and exclude the right edge. The last bin includes both edges.
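The edge convention described above is the same one numpy.histogram documents, so it can be checked with NumPy alone. This is a minimal sketch that assumes xhistogram follows the convention exactly as stated; the sample values are made up for illustration.

```python
import numpy as np

edges = np.array([0.0, 1.0, 2.0, 3.0])
data = np.array([0.0, 1.0, 2.0, 3.0])  # every value sits exactly on an edge

counts, _ = np.histogram(data, bins=edges)

# Bins are [0, 1), [1, 2) and [2, 3]: the value 3.0 falls in the last bin
# because only the last bin includes its right edge.
print(counts.tolist())  # [1, 1, 2]
```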

A TypeError will be raised if args or weights contain dask arrays and bins are not specified explicitly as an array or list of arrays. This is because other bin specifications trigger computation.

range : (float, float) or a list of (float, float), optional

If a list, there should be one entry for each item in args. The range specifications are as follows:

  • If (float, float); the lower and upper range(s) of the bins for all arguments in args. Values outside the range are ignored. The first element of the range must be less than or equal to the second. range affects the automatic bin computation as well. In this case, while bin width is computed to be optimal based on the actual data within range, the bin count will fill the entire range including portions containing no data.

  • If a list of (float, float); the ranges as above for every argument in args.

  • If not provided, range is simply (arg.min(), arg.max()) for each arg.
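The effect of range on an integer bin count can be illustrated with numpy.histogram, which accepts the same kind of (float, float) pair. This is only a sketch with invented data, assuming xhistogram treats out-of-range values the way the text above describes.

```python
import numpy as np

data = np.array([-1.0, 0.5, 1.5, 2.5, 3.5, 10.0])

# Four equal-width bins spanning [0, 4]; -1.0 and 10.0 lie outside and are ignored.
counts, edges = np.histogram(data, bins=4, range=(0.0, 4.0))

print(counts.tolist())  # [1, 1, 1, 1]
print(edges.tolist())   # [0.0, 1.0, 2.0, 3.0, 4.0]
```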

axis : None or int or tuple of ints, optional

Axis or axes along which the histogram is computed. The default is to compute the histogram of the flattened array.

weights : array_like, optional

An array of weights, of the same shape as the input arrays in args. Each input value contributes only its associated weight towards the bin count (instead of 1). If density is True, the weights are normalized, so that the integral of the density over the range remains 1.

density : bool, optional

If False, the result will contain the number of samples in each bin. If True, the result is the value of the probability density function at the bin, normalized such that the integral over the range is 1. Note that the sum of the histogram values will not be equal to 1 unless bins of unity width are chosen; it is not a probability mass function.
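The distinction between a density and a probability mass function is easiest to see with unequal bin widths. The sketch below uses numpy.histogram with invented data and assumes xhistogram normalizes in the same way, as the description above indicates.

```python
import numpy as np

data = np.array([0.5, 0.5, 1.5, 3.0])
edges = np.array([0.0, 1.0, 2.0, 4.0])  # the last bin is twice as wide

density, _ = np.histogram(data, bins=edges, density=True)
widths = np.diff(edges)

# The integral (density times bin width, summed) is 1 ...
print((density * widths).sum())  # 1.0
# ... but the plain sum of the density values is not.
print(density.sum())             # 0.875
```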

block_size : int or 'auto', optional

A parameter which governs the algorithm used to compute the histogram. Using a nonzero value splits the histogram calculation over the non-histogram axes into blocks of size block_size, iterating over them with a loop (numpy inputs) or in parallel (dask inputs). If 'auto', blocks will be determined either by the underlying dask chunks (dask inputs) or an experimental built-in heuristic (numpy inputs).

Returns
hist : array

The values of the histogram.

bin_edges : list of arrays

The bin edges for each input array.

See also

numpy.histogram, numpy.bincount, numpy.searchsorted

Xarray Module

Xarray API for xhistogram.

xhistogram.xarray.histogram(*args, bins=None, range=None, dim=None, weights=None, density=False, block_size='auto', keep_coords=False, bin_dim_suffix='_bin')

Histogram applied along specified dimensions.

Parameters
args : xarray.DataArray objects

Input data. The number of input arguments determines the dimensionality of the histogram. For example, two arguments produce a 2D histogram. All args must be aligned and have the same dimensions.

bins : int, str or numpy array or a list of ints, strs and/or arrays, optional

If a list, there should be one entry for each item in args. The bin specifications are as follows:

  • If int; the number of bins for all arguments in args.

  • If str; the method used to automatically calculate the optimal bin width for all arguments in args, as defined by numpy.histogram_bin_edges.

  • If numpy array; the bin edges for all arguments in args.

  • If a list of ints, strs and/or arrays; the bin specification as above for every argument in args.

When bin edges are specified, all but the last (righthand-most) bin include the left edge and exclude the right edge. The last bin includes both edges.

A TypeError will be raised if args or weights contain dask arrays and bins are not specified explicitly as an array or list of arrays. This is because other bin specifications trigger computation.

range : (float, float) or a list of (float, float), optional

If a list, there should be one entry for each item in args. The range specifications are as follows:

  • If (float, float); the lower and upper range(s) of the bins for all arguments in args. Values outside the range are ignored. The first element of the range must be less than or equal to the second. range affects the automatic bin computation as well. In this case, while bin width is computed to be optimal based on the actual data within range, the bin count will fill the entire range including portions containing no data.

  • If a list of (float, float); the ranges as above for every argument in args.

  • If not provided, range is simply (arg.min(), arg.max()) for each arg.

dim : tuple of strings, optional

Dimensions over which the histogram is computed. The default is to compute the histogram of the flattened array.

weights : array_like, optional

An array of weights, of the same shape as the input arrays in args. Each input value contributes only its associated weight towards the bin count (instead of 1). If density is True, the weights are normalized, so that the integral of the density over the range remains 1. NaNs in the weights input will fill the entire bin with NaNs. If there are NaNs in the weights input, call .fillna(0.) before running histogram().

density : bool, optional

If False, the result will contain the number of samples in each bin. If True, the result is the value of the probability density function at the bin, normalized such that the integral over the range is 1. Note that the sum of the histogram values will not be equal to 1 unless bins of unity width are chosen; it is not a probability mass function.

block_size : int or 'auto', optional

A parameter which governs the algorithm used to compute the histogram. Using a nonzero value splits the histogram calculation over the non-histogram axes into blocks of size block_size, iterating over them with a loop (numpy inputs) or in parallel (dask inputs). If 'auto', blocks will be determined either by the underlying dask chunks (dask inputs) or an experimental built-in heuristic (numpy inputs).

keep_coords : bool, optional

If True, keep all coordinates. Default: False.

bin_dim_suffix : str, optional

Suffix to append to input arg names to define names of output bin dimensions.

Returns
hist : xarray.DataArray

The values of the histogram. For each bin, the midpoint of the bin edges is given along the bin coordinates.