Generating histograms

This example generates histograms of the raw data. Histograms show how each variable is distributed and can help identify “interesting” variables that might distinguish between different classes of the data.

If the data is labelled into different classes, this information can be passed to the histogram-generating method psynlig.histogram.histograms(). Pass the label of each data point with the parameter class_data, and a mapping from the labels to something more human-readable with the parameter class_names.
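As a small illustration of these two inputs, the sketch below builds a label-to-name mapping in the same way as the script that follows, using dict(enumerate(...)). The target names are written out as a plain list here (scikit-learn provides them as an array), and the short class_data list is made up for illustration.

```python
# The three iris species, in the order scikit-learn numbers them.
target_names = ['setosa', 'versicolor', 'virginica']

# Map each integer label to a human-readable name: {0: 'setosa', ...}
class_names = dict(enumerate(target_names))

# Hypothetical labels for five data points (illustration only).
class_data = [0, 0, 1, 2, 1]

# Look up the readable name for each data point.
readable = [class_names[label] for label in class_data]
print(readable)
```

Each entry in class_data must be a key in class_names, and class_data needs one label per row of the data.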

from matplotlib import pyplot as plt
import pandas as pd
from sklearn.datasets import load_iris
from psynlig import histograms
# Note: matplotlib 3.6 and later renamed this style to 'seaborn-v0_8-talk'.
plt.style.use('seaborn-talk')


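# Load the iris data set into a DataFrame with one column per feature.
data_set = load_iris()
data = pd.DataFrame(data_set['data'], columns=data_set['feature_names'])

# Integer class label for each data point, and a label -> species name mapping.
class_data = data_set['target']
class_names = dict(enumerate(data_set['target_names']))

# Columns to generate histograms for.
variables = ['sepal length (cm)', 'sepal width (cm)',
             'petal length (cm)', 'petal width (cm)']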


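# Settings for the 1D histograms: bar transparency, edge color,
# number of bins, and normalization to a density.
kwargs = {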
    'histogram1d': {
        'alpha': 0.8,
        'edgecolor': 'black',
        'bins': 10,
        'density': True
    },
}


histograms(
    data,
    variables,
    class_names=class_names,
    class_data=class_data,
    **kwargs,
)

plt.show()
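To make the idea of per-class density histograms concrete, here is a minimal pure-Python sketch of the underlying computation: group the values by class label, then bin each group and normalize so that the bar areas sum to 1. This is not psynlig's implementation. The helper names density_histogram and histograms_by_class, the choice of a shared bin range across classes, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def density_histogram(values, bins, lo, hi):
    """Return bin heights normalized so that the bar areas sum to 1.

    Assumes hi > lo and that every value lies in [lo, hi].
    """
    width = (hi - lo) / bins
    counts = [0] * bins
    for value in values:
        # A value equal to hi is placed in the last bin.
        index = min(int((value - lo) / width), bins - 1)
        counts[index] += 1
    total = len(values)
    return [count / (total * width) for count in counts]


def histograms_by_class(values, labels, class_names, bins=10):
    """Group values by label and compute one density histogram per class."""
    # Use one shared range so the bins of different classes line up.
    lo, hi = min(values), max(values)
    grouped = defaultdict(list)
    for value, label in zip(values, labels):
        grouped[class_names[label]].append(value)
    return {
        name: density_histogram(group, bins, lo, hi)
        for name, group in grouped.items()
    }


# Toy data, made up for illustration (not the iris measurements).
values = [1.0, 1.2, 1.4, 4.0, 4.5, 5.0]
labels = [0, 0, 0, 1, 1, 1]
toy_class_names = {0: 'short', 1: 'long'}

result = histograms_by_class(values, labels, toy_class_names, bins=4)
for name, heights in result.items():
    print(name, heights)
```

With a variable that separates the classes well, as in this toy data, the per-class histograms occupy different bins. Heavily overlapping histograms suggest that the variable alone does not distinguish the classes.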
