psynlig.pca module

For generating plots from PCA.

Submodules

psynlig.pca.loadings

A module defining plots for contributions to principal components.

psynlig.pca.loadings._pca_1d_loadings_component(axi, coefficients, xvars, colors, text_settings=None)

Plot the loadings for a single component in a 1D plot.

This plot will show the components on a single line.

Parameters
  • axi (object like matplotlib.axes.Axes) – The plot we will add the loadings to.

  • coefficients (object like numpy.ndarray) – The coefficients we are to show.

  • xvars (list of strings) – Labels for the original variables.

  • colors (list of floats or strings) – The colors used for the different labels.

  • text_settings (dict, optional) – Additional settings for creating the text.

psynlig.pca.loadings._pca_2d_loadings_component(axi, coefficients1, coefficients2, xvars, colors, adjust_labels=False, text_settings=None)

Plot the loadings for two components in a 2D plot.

Parameters
  • axi (object like matplotlib.axes.Axes) – The plot we will add the loadings to.

  • coefficients1 (object like numpy.ndarray) – The coefficients for the first principal component.

  • coefficients2 (object like numpy.ndarray) – The coefficients for the second principal component.

  • xvars (list of strings) – Labels for the original variables.

  • colors (list of floats or strings) – The colors used for the different labels.

  • adjust_labels (boolean, optional) – If this is True, we will try to optimize the position of the labels so that they won't overlap.

  • text_settings (dict, optional) – Additional settings for creating the text.

psynlig.pca.loadings._pca_3d_loadings_component(axi, coefficients1, coefficients2, coefficients3, xvars, colors, text_settings=None)

Show contributions to three principal components.

Parameters
  • axi (object like matplotlib.axes.Axes) – The plot we will add the loadings to.

  • coefficients1 (object like numpy.ndarray) – The coefficients for the first principal component.

  • coefficients2 (object like numpy.ndarray) – The coefficients for the second principal component.

  • coefficients3 (object like numpy.ndarray) – The coefficients for the third principal component.

  • xvars (list of strings) – Labels for the original variables.

  • colors (list of floats or strings) – The colors used for the different labels.

  • text_settings (dict, optional) – Additional settings for creating the text.

psynlig.pca.loadings.pca_1d_loadings(pca, xvars, select_components=None, plot_type='line', cmap=None, text_settings=None)

Plot the loadings from a PCA in a 1D plot.

Parameters
  • pca (object like sklearn.decomposition._pca.PCA) – The results from a PCA analysis.

  • xvars (list of strings) – Labels for the original variables.

  • select_components (set of integers, optional) – This variable can be used to select the principal components we will create plots for. Note that the principal component numbering starts from 1 here (and not 0). If this is not given, all will be plotted.

  • plot_type (string, optional) – Select the kind of plot we will be making. Possible values are:

    • line: For generating a 1D line with contributions.

    • bar: For generating a bar plot of the contributions.

    • bar-square: For generating a bar plot of the squared contributions.

    • bar-absolute: For generating a bar plot of the absolute value of contributions.

  • cmap (string or object like matplotlib.colors.Colormap, optional) – A colormap to use for the components/variables.

  • text_settings (dict, optional) – Additional settings for creating the text.
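
The 1-based numbering used by select_components can be illustrated with a minimal sketch. The helper to_zero_based below is hypothetical and not part of psynlig; it only shows how component numbers given by a user could map to the 0-based indices used for array rows, assuming that an omitted selection means "all components".

```python
def to_zero_based(select_components, n_components):
    """Convert 1-based component numbers to sorted 0-based indices.

    Hypothetical helper for illustration; not part of psynlig.
    """
    if select_components is None:
        # No selection given: use every component.
        return list(range(n_components))
    indices = []
    for number in sorted(select_components):
        if not 1 <= number <= n_components:
            raise ValueError(f"Component {number} is out of range.")
        indices.append(number - 1)
    return indices


print(to_zero_based({1, 3}, 4))  # [0, 2]
print(to_zero_based(None, 3))    # [0, 1, 2]
```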

Returns

psynlig.pca.loadings.pca_2d_loadings(pca, xvars, select_components=None, adjust_labels=False, cmap=None, style='box', text_settings=None)

Show loadings for two principal components.

Parameters
  • pca (object like sklearn.decomposition._pca.PCA) – The results from a PCA analysis.

  • xvars (list of strings) – Labels for the original variables.

  • select_components (set of tuples of integers, optional) – This variable can be used to select the principal components we will create plots for. Note that the principal component numbering starts from 1 here (and not 0). If this is not given, all will be plotted.

  • adjust_labels (boolean, optional) – If this is True, we will try to optimize the position of the labels so that they won't overlap.

  • cmap (string or object like matplotlib.colors.Colormap, optional) – A colormap to use for the components/variables.

  • style (string, optional) – This option changes the styling of the plot. For the style ‘box’, we show the axes as a normal matplotlib figure with inserted lines showing x=0 and y=0. For the style ‘center’, we place the x-axis and y-axis at the origin.

  • text_settings (dict, optional) – Additional settings for creating the text.
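
For 2D plots, select_components holds pairs of component numbers. The sketch below uses a hypothetical helper, component_pairs, to show one way such a selection could be turned into 0-based index pairs. It assumes that "all will be plotted" means every unique pair of components; that is an assumption, not something this page confirms.

```python
from itertools import combinations


def component_pairs(n_components, select_components=None):
    """Return 0-based (i, j) index pairs for 2D plots.

    Hypothetical helper for illustration; not part of psynlig.
    """
    if select_components is None:
        # Assumption: "all" means every unique pair of components.
        return list(combinations(range(n_components), 2))
    # User input is 1-based, so shift each number down by one.
    return [(i - 1, j - 1) for i, j in sorted(select_components)]


print(component_pairs(3))                    # [(0, 1), (0, 2), (1, 2)]
print(component_pairs(4, {(1, 2), (1, 4)}))  # [(0, 1), (0, 3)]
```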

Returns

psynlig.pca.loadings.pca_3d_loadings(pca, xvars, select_components=None, cmap=None, text_settings=None)

Show contributions to three principal components.

Parameters
  • pca (object like sklearn.decomposition._pca.PCA) – The results from a PCA analysis.

  • xvars (list of strings) – Labels for the original variables.

  • select_components (set of tuples of integers, optional) – This variable can be used to select the principal components we will create plots for. Note that the principal component numbering starts from 1 here (and not 0). If this is not given, all will be plotted.

  • cmap (string or object like matplotlib.colors.Colormap, optional) – A colormap to use for the components/variables.

  • text_settings (dict, optional) – Additional settings for creating the text.

Returns

psynlig.pca.loadings.pca_loadings_bar(axi, coefficients, xvars, plot_type='bar')

Plot the loadings for a single component in a bar plot.

Parameters
  • axi (object like matplotlib.axes.Axes) – The plot we will add the loadings to.

  • coefficients (object like numpy.ndarray) – The coefficients we are to show.

  • xvars (list of strings) – Labels for the original variables.

  • plot_type (string, optional) – Selects the type of plot we are making.
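
As a rough sketch of what the bar variants imply, the hypothetical helper below transforms the coefficients before plotting. It assumes that this function accepts the same values as listed for pca_1d_loadings ('bar', 'bar-square' and 'bar-absolute'), and it uses plain Python lists in place of numpy arrays to stay self-contained.

```python
def transform_coefficients(coefficients, plot_type="bar"):
    """Return the values a bar plot would show for the given plot type.

    Hypothetical helper for illustration; not part of psynlig.
    """
    if plot_type == "bar-square":
        return [value ** 2 for value in coefficients]
    if plot_type == "bar-absolute":
        return [abs(value) for value in coefficients]
    # Plain 'bar': show the signed coefficients unchanged.
    return list(coefficients)


coefficients = [0.5, -0.5, 0.25]
print(transform_coefficients(coefficients, "bar-square"))    # [0.25, 0.25, 0.0625]
print(transform_coefficients(coefficients, "bar-absolute"))  # [0.5, 0.5, 0.25]
```

Squaring or taking the absolute value removes the sign, which makes it easier to compare the size of the contributions across variables.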

psynlig.pca.loadings.pca_loadings_map(pca, xvars, val_fmt='{x:.2f}', bubble=False, annotate=True, textcolors=None, plot_style=None, **kwargs)

Show contributions from variables to PCs in a heat map.

Parameters
  • pca (object like sklearn.decomposition._pca.PCA) – The results from a PCA analysis.

  • xvars (list of strings) – The labels for the original variables.

  • val_fmt (string, optional) – The format of the annotations inside the heat map.

  • bubble (boolean, optional) – If True, we will draw bubbles to indicate the size of the given data points.

  • annotate (boolean, optional) – If True, we will annotate the plot with values for the contributions.

  • textcolors (list of strings, optional) – Colors used for the text. The number of colors provided defines a binning for the data values, and values are colored with the corresponding color. If no colors are provided, all are colored black.

  • plot_style (string, optional) – Determines how the coefficients are plotted:

    • absolute: The absolute value of the coefficients will be plotted.

    • squared: The squared value of the coefficients will be plotted.

    Otherwise, the actual value of the coefficients will be used.

  • **kwargs (dict, optional) – Arguments used for drawing the heat map.
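
The default val_fmt of '{x:.2f}' looks like a Python format string with a field named x, the same convention matplotlib uses for its StrMethodFormatter. Assuming the annotations are produced by filling in x with each value, the effect can be previewed without any plotting:

```python
# Assumption: each annotation is val_fmt with the field x filled in.
val_fmt = "{x:.2f}"
values = [0.70710678, -0.5, 0.04]

labels = [val_fmt.format(x=value) for value in values]
print(labels)  # ['0.71', '-0.50', '0.04']
```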

Returns

psynlig.pca.scores

A module defining plots for PCA scores.

psynlig.pca.scores._add_2d_loading_lines(axi, coefficients1, coefficients2, xvars, cmap=None, settings=None)

Add loading lines to a 2D scores plot.

Parameters
  • axi (object like matplotlib.axes.Axes) – The plot to add the loadings to.

  • coefficients1 (object like numpy.ndarray) – The coefficients for the first principal component.

  • coefficients2 (object like numpy.ndarray) – The coefficients for the second principal component.

  • xvars (list of strings) – The labels for the original variables.

  • cmap (string or object like matplotlib.colors.Colormap, optional) – A color map to use for loadings.

  • settings (dict, optional) – Settings for adding the loadings. Possible settings are:

    • adjust_text: If this is set to True, we will attempt to adjust generated text so that they won’t overlap.

    • jiggle_text: If this is set to True, we will attempt to move text around to avoid overlap with other text boxes. This can be used as an alternative to adjust_text.

    • add_legend: If this is True, we add a legend to the plot.

    • add_text: If this is True, we will annotate the plot with labels for the variables.

    • text: Additional settings for the text.

Returns

psynlig.pca.scores._add_loading_line_text(axi, xcoeff, ycoeff, label, color='black', settings=None)

Add a loading line to a plot.

This method is used when creating biplots.

Parameters
  • axi (object like matplotlib.axes.Axes) – The plot to add the loadings to.

  • xcoeff (float) – The loading along the first principal component.

  • ycoeff (float) – The loading along the second principal component.

  • label (string) – The name of the original variable we are plotting the loadings for.

  • color (string, optional) – The color to use for the text and symbol.

  • settings (dict, optional) – Settings for adding the loadings. Possible settings are:

    • add_legend: If this is True, we add a legend to the plot.

    • add_text: If this is True, we will annotate the plot with labels for the variables.

    • text: Additional settings for the text.

Returns

psynlig.pca.scores._adjust_figure_for_legend_outside(fig, axi, legend)

Adjust the plot size in case the legend is placed outside the axis.

Parameters

psynlig.pca.scores._pca_2d_add_loadings(fig, axi, pca, idx1, idx2, xvars=None, cmap_loadings=None, loading_settings=None)

Add loadings to a 2D scatter plot.

Parameters
  • fig (object like matplotlib.figure.Figure) – Existing figure which we will add loadings to.

  • axi (object like matplotlib.axes.Axes) – Existing axis which can be used for adding loadings.

  • pca (object like sklearn.decomposition._pca.PCA) – The results from a PCA analysis.

  • idx1 (integer) – The index to use for the first principal component.

  • idx2 (integer) – The index to use for the second principal component.

  • xvars (list of strings, optional) – Labels for the original variables. If not given, we will generate names like “var1”, “var2”, etc.

  • cmap_loadings (string or object like matplotlib.colors.Colormap, optional) – A color map to use for loadings.

  • loading_settings (dict, optional) – Settings for adding the loadings.
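
The fallback naming described for xvars can be sketched as follows. The helper default_variable_names is hypothetical and not part of psynlig; it assumes the generated names are numbered from 1, as the examples “var1” and “var2” suggest.

```python
def default_variable_names(n_variables):
    """Generate placeholder names 'var1', 'var2', ... for the variables.

    Hypothetical helper for illustration; not part of psynlig.
    """
    return [f"var{i + 1}" for i in range(n_variables)]


print(default_variable_names(3))  # ['var1', 'var2', 'var3']
```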

Returns

psynlig.pca.scores.pca_1d_scores(pca, scores, xvars=None, class_data=None, class_names=None, select_components=None, add_loadings=False, cmap_class=None, cmap_loadings=None, **kwargs)

Plot scores from a PCA model (1D).

Parameters
  • pca (object like sklearn.decomposition._pca.PCA) – The results from a PCA analysis.

  • scores (object like numpy.ndarray) – The scores we are to plot.

  • xvars (list of strings, optional) – Labels for the original variables. If not given, we will just give them names like “var1”, “var2”, etc.

  • class_data (object like pandas.core.series.Series, optional) – Class information for the points (if available).

  • class_names (dict of strings) – A mapping from the class data to labels/names.

  • select_components (set of tuples of integers, optional) – This variable can be used to select the principal components we will create plots for. Note that the principal component numbering starts from 1 here (and not 0). If this is not given, all will be plotted.

  • add_loadings (boolean, optional) – If this is True, we will add loadings to the plot.

  • cmap_class (string or object like matplotlib.colors.Colormap, optional) – A color map to use for classes.

  • cmap_loadings (string or object like matplotlib.colors.Colormap, optional) – A color map to use for loadings.

  • kwargs (dict, optional) – Additional settings for the plotting.
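
To show how class_data and class_names relate, the sketch below groups scores by class label before plotting. The data and class names are made up for illustration, plain lists stand in for the numpy array and pandas Series, and the fallback to the raw class value when a name is missing is an assumption.

```python
from collections import defaultdict

# Made-up illustration data: one score per sample along a single component.
scores = [-1.2, 0.4, 0.9, 2.1, -0.7]
class_data = [0, 1, 1, 2, 0]
class_names = {0: "control", 1: "low dose", 2: "high dose"}

groups = defaultdict(list)
for score, label in zip(scores, class_data):
    # Assumption: fall back to the raw class value when no name is given.
    groups[class_names.get(label, str(label))].append(score)

for name, values in groups.items():
    print(name, values)
```

Each group could then be drawn with its own color from cmap_class and its own legend entry.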

Returns

psynlig.pca.scores.pca_2d_scores(pca, scores, xvars=None, class_data=None, class_names=None, select_components=None, loading_settings=None, savefig=None, cmap_class=None, cmap_loadings=None, **kwargs)

Plot scores from a PCA model along two PCs.

Parameters
  • pca (object like sklearn.decomposition._pca.PCA) – The results from a PCA analysis.

  • scores (object like numpy.ndarray) – The scores we are to plot.

  • xvars (list of strings, optional) – Labels for the original variables. If not given, we will generate names like “var1”, “var2”, etc.

  • class_data (object like pandas.core.series.Series, optional) – Class information for the points (if available).

  • class_names (dict of strings) – A mapping from the class data to labels/names.

  • select_components (set of tuples of integers, optional) – This variable can be used to select the principal components we will create plots for. Note that the principal component numbering starts from 1 here (and not 0). If this is not given, all will be plotted.

  • loading_settings (dict, optional) – Settings for adding the loadings.

  • savefig (string, optional) – If this is given, we will save the figure to a file here. This is included due to potential problems with displaying large legends in an interactive plot.

  • cmap_class (string or object like matplotlib.colors.Colormap, optional) – A color map to use for classes.

  • cmap_loadings (string or object like matplotlib.colors.Colormap, optional) – A color map to use for loadings.

  • kwargs (dict, optional) – Additional settings for the plotting.

Returns

psynlig.pca.scores.pca_3d_scores(pca, scores, class_data=None, class_names=None, select_components=None, cmap_class=None, **kwargs)

Plot scores from a PCA model along three PCs.

Parameters
  • pca (object like sklearn.decomposition._pca.PCA) – The results from a PCA analysis.

  • scores (object like numpy.ndarray) – The scores we are to plot.

  • class_data (object like pandas.core.series.Series, optional) – Class information for the points (if available).

  • class_names (dict of strings) – A mapping from the class data to labels/names.

  • select_components (set of tuples of integers, optional) – This variable can be used to select the principal components we will create plots for. Note that the principal component numbering starts from 1 here (and not 0). If this is not given, all will be plotted.

  • cmap_class (string or object like matplotlib.colors.Colormap, optional) – A color map to use for classes.

  • kwargs (dict, optional) – Additional settings for the plotting.

Returns

psynlig.pca.variance

A module defining plots for PCA variance.

psynlig.pca.variance._create_figure_if_needed(axi, figsize=None)

Create a figure if needed (axi is None).

psynlig.pca.variance.pca_explained_variance(pca, axi=None, figsize=None, **kwargs)

Plot the explained variance as function of PCA components.

Parameters
  • pca (object like sklearn.decomposition._pca.PCA) – The results from a PCA analysis.

  • axi (object like matplotlib.axes.Axes, optional) – If given, the plot will be added to the specified axis. Otherwise, a new axis (and figure) will be created here.

  • figsize (tuple of ints, optional) – A desired size of the figure, if created here.

  • kwargs (dict, optional) – Additional settings for plotting explained variance.
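
A fitted scikit-learn PCA exposes the fraction of variance explained by each component in its explained_variance_ratio_ attribute. Assuming this plot shows the running total of those fractions (an assumption, since this page does not say so), the plotted values can be sketched with made-up ratios:

```python
from itertools import accumulate

# Made-up ratios standing in for pca.explained_variance_ratio_.
ratios = [0.5, 0.25, 0.125]

# Running total: variance explained by the first k components.
cumulative = list(accumulate(ratios))
print(cumulative)  # [0.5, 0.75, 0.875]
```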

Returns

psynlig.pca.variance.pca_explained_variance_bar(pca, axi=None, figsize=None, **kwargs)

Plot the explained variance per principal component.

Parameters
  • pca (object like sklearn.decomposition._pca.PCA) – The results from a PCA analysis.

  • axi (object like matplotlib.axes.Axes, optional) – If given, the plot will be added to the specified axis. Otherwise, a new axis (and figure) will be created here.

  • figsize (tuple of ints, optional) – A desired size of the figure, if created here.

  • kwargs (dict, optional) – Additional settings for plotting explained variance.

Returns

psynlig.pca.variance.pca_explained_variance_pie(pca, axi=None, figsize=None, cmap=None, tol=0.001)

Show the explained variance as function of PCA components in a pie.

Parameters
  • pca (object like sklearn.decomposition._pca.PCA) – The results from a PCA analysis.

  • axi (object like matplotlib.axes.Axes, optional) – If given, the plot will be added to the specified axis. Otherwise, a new axis (and figure) will be created here.

  • figsize (tuple of ints, optional) – A desired size of the figure, if created here.

  • cmap (string or object like matplotlib.colors.Colormap, optional) – The color map to use for generating colors.

  • tol (float, optional) – A tolerance for the missing variance. If the unexplained variance is less than this tolerance, it will not be shown in the pie chart.
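
The role of tol can be sketched with a hypothetical helper, pie_slices, which is not part of psynlig. It adds a slice for the unexplained variance only when that slice is at least tol; the slice labels are invented for the example.

```python
def pie_slices(ratios, tol=0.001):
    """Return (labels, sizes) for a pie chart of explained variance.

    Hypothetical helper for illustration; not part of psynlig.
    """
    labels = [f"PC{i + 1}" for i in range(len(ratios))]
    sizes = list(ratios)
    unexplained = 1.0 - sum(ratios)
    if unexplained >= tol:
        # Only show the remainder when it is large enough to matter.
        labels.append("Not explained")
        sizes.append(unexplained)
    return labels, sizes


print(pie_slices([0.5, 0.25, 0.125]))
# (['PC1', 'PC2', 'PC3', 'Not explained'], [0.5, 0.25, 0.125, 0.125])
print(pie_slices([0.75, 0.25]))
# (['PC1', 'PC2'], [0.75, 0.25])
```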

Returns

psynlig.pca.variance.pca_residual_variance(pca, axi=None, figsize=None, **kwargs)

Plot the residual variance as function of PCA components.

Parameters
  • pca (object like sklearn.decomposition._pca.PCA) – The results from a PCA analysis.

  • axi (object like matplotlib.axes.Axes, optional) – If given, the plot will be added to the specified axis. Otherwise, a new axis (and figure) will be created here.

  • figsize (tuple of ints, optional) – A desired size of the figure, if created here.

  • kwargs (dict, optional) – Additional settings for plotting explained variance.
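
The residual variance is commonly taken as the fraction of variance left unexplained after including the first k components. Assuming that definition applies here, a minimal sketch with made-up ratios (standing in for scikit-learn's explained_variance_ratio_) is:

```python
from itertools import accumulate

# Made-up ratios standing in for pca.explained_variance_ratio_.
ratios = [0.5, 0.25, 0.125]

# Variance left over after including the first k components.
residual = [1.0 - total for total in accumulate(ratios)]
print(residual)  # [0.5, 0.25, 0.125]
```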

Returns

psynlig.pca.variance.pca_scree(pca, axi=None, figsize=None, **kwargs)

Plot the eigenvalues as function of PCA components.

Parameters
  • pca (object like sklearn.decomposition._pca.PCA) – The results from a PCA analysis.

  • axi (object like matplotlib.axes.Axes, optional) – If given, the plot will be added to the specified axis. Otherwise, a new axis (and figure) will be created here.

  • figsize (tuple of ints, optional) – A desired size of the figure, if created here.

  • kwargs (dict, optional) – Additional settings for the plotting.

Returns