xhistogram: Fast, flexible, label-aware histograms for numpy and xarray

Histograms (a.k.a. “binning”) are much more than just a visualization tool. They are the foundation of a wide range of scientific analyses, including probability distributions (joint or otherwise) and coordinate transformations. Xhistogram makes it easier to calculate flexible, complex histograms with multi-dimensional data. It integrates (optionally) with Dask, in order to scale up to very large datasets, and with Xarray, in order to consume and produce labelled, annotated data structures.

Why a new histogram package?

The main problem with the standard histogram functions in NumPy and Dask is that they automatically act over the entire input array (i.e. they “flatten” the data). Xhistogram allows you to choose which axes / dimensions you want to preserve and which you want to flatten. It also allows you to combine N arbitrary inputs to produce N-dimensional histograms. A good place to start is the Xhistogram Tutorial.
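
To make the flattening behaviour concrete, here is a minimal sketch that uses only NumPy. It is not Xhistogram's implementation or API. It only illustrates what "flattening" means, what preserving one axis looks like when done by hand, and how two inputs combine into a 2-D histogram. The array shapes, bin edges, and variable names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 4 rows, each holding 1000 samples.
data = rng.normal(size=(4, 1000))
bins = np.linspace(-4, 4, 21)  # 21 edges define 20 bins

# np.histogram ignores the array's shape and bins every value together.
flat_counts, _ = np.histogram(data, bins=bins)

# Preserving axis 0 by hand: one histogram per row, stacked into shape (4, 20).
per_row_counts = np.stack([np.histogram(row, bins=bins)[0] for row in data])

# Combining two inputs into one 2-D (joint) histogram.
# Each input is flattened first, so the (4, 1000) structure is lost again.
other = rng.normal(size=(4, 1000))
other_bins = np.linspace(-4, 4, 11)  # 11 edges define 10 bins
joint_counts, _, _ = np.histogram2d(
    data.ravel(), other.ravel(), bins=[bins, other_bins]
)

print(flat_counts.shape)     # one axis: the bins
print(per_row_counts.shape)  # preserved axis first, then the bins
print(joint_counts.shape)    # one axis per input variable
```

The explicit Python loop works for small arrays, but it gets awkward once there are several axes to preserve or several input variables. That is the case described in the paragraph above.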