transparentai.plots

All plotting functions, organized by submodule.

Common plot functions

transparentai.plots.plots.plot_or_figure(fig, plot=True)[source]
Parameters:
  • fig (matplotlib.figure.Figure) – figure to plot or to return
  • plot (bool (default True)) – Whether to display the figure or return it
Returns: Figure
Return type: matplotlib.figure.Figure
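
A minimal usage sketch (the toy plot is illustrative, not from the library's docs):

>>> import matplotlib.pyplot as plt
>>> from transparentai.plots.plots import plot_or_figure
>>> fig, ax = plt.subplots()
>>> _ = ax.plot([0, 1, 2], [0, 1, 4])
>>> fig = plot_or_figure(fig, plot=False)  # returns the figure instead of showing it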

transparentai.plots.plots.plot_table_score(perf)[source]

Inserts a table of scores into a matplotlib figure.

Parameters: perf (dict) – Dictionary with computed scores

Datasets variable plot functions

transparentai.datasets.variable.variable_plots.plot_datetime_var(ax, arr, color='#3498db', label=None, alpha=1.0)[source]

Plots a line plot on a matplotlib axis.

Parameters:
  • ax (plt.axes.Axes) – axis on which to add the plot
  • arr (array like) – Array of datetime values
  • color (str (default DEFAULT_COLOR)) – color of the plot
  • label (str (default None)) – label of the plot
  • alpha (float (default 1.)) – opacity
Raises:
  • TypeError – arr is not array-like
  • TypeError – arr is not a datetime array
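
A short sketch of how these axis-level helpers are called (the data is illustrative; plot_number_var and plot_object_var below follow the same pattern):

>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> from transparentai.datasets.variable.variable_plots import plot_datetime_var
>>> dates = pd.Series(pd.date_range('2020-01-01', periods=100, freq='D'))
>>> fig, ax = plt.subplots()
>>> plot_datetime_var(ax, dates, label='dates')
>>> plt.show()
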
transparentai.datasets.variable.variable_plots.plot_number_var(ax, arr, color='#3498db', label=None, alpha=1.0)[source]

Plots a histogram on a matplotlib axis.

Parameters:
  • ax (plt.axes.Axes) – axis on which to add the plot
  • arr (array like) – Array of number values
  • color (str (default DEFAULT_COLOR)) – color of the plot
  • label (str (default None)) – label of the plot
  • alpha (float (default 1.)) – opacity
Raises:
  • TypeError – arr is not array-like
  • TypeError – arr is not a number array
transparentai.datasets.variable.variable_plots.plot_object_var(ax, arr, top=10, color='#3498db', label=None, alpha=1.0)[source]

Plots a bar plot on a matplotlib axis.

Parameters:
  • ax (plt.axes.Axes) – axis on which to add the plot
  • arr (array like) – Array of object values
  • top (int (default 10)) – Number of the most frequent values to display
  • color (str (default DEFAULT_COLOR)) – color of the plot
  • label (str (default None)) – label of the plot
  • alpha (float (default 1.)) – opacity
Raises:
  • TypeError – arr is not array-like
  • TypeError – arr is not an object array
transparentai.datasets.variable.variable_plots.plot_table_describe(ax, cell_text)[source]

Inserts a table into a matplotlib figure using an axis.

Parameters:
  • ax (plt.axes.Axes) – axis on which to add the plot
  • cell_text (list(list)) – The texts to place into the table cells.
transparentai.datasets.variable.variable_plots.plot_variable(arr, legend=None, colors=None, xlog=False, ylog=False, **kwargs)[source]

Plots a two-part graph for a given array. The first part is a custom plot that depends on the array dtype; the second part is a table of descriptive statistics.

First plot is:

  • Histogram if dtype is number (using plot_number_var)
  • Line plot if dtype is datetime (using plot_datetime_var)
  • Bar plot if dtype is object (using plot_object_var)

If the legend array is set, the different legend values are automatically plotted separately.

Parameters:
  • arr (array like) – Array of values to plot
  • legend (array like (default None)) – Array of legend values (same length as arr)
  • colors (list (default None)) – Array of colors, used if legend is set
  • xlog (bool (default False)) – Scale the x axis in log scale
  • ylog (bool (default False)) – Scale the y axis in log scale
Raises:
  • TypeError – arr is not array-like
  • TypeError – legend is not array-like
  • ValueError – arr and legend do not have the same length
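
For instance (a sketch; the age and income columns come from the load_adult dataset used elsewhere on this page):

>>> from transparentai.datasets import load_adult
>>> from transparentai.datasets.variable.variable_plots import plot_variable
>>> data = load_adult()
>>> plot_variable(data['age'], legend=data['income'], ylog=True)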

Classification plot functions

transparentai.models.classification.classification_plots.compute_prob_performance(y_true, y_prob, metrics)[source]

Computes performance metrics that require probabilities.

Parameters:
  • y_true (array like) – True labels
  • y_prob (array like) – Predicted probabilities
  • metrics (list) – List of metrics to compute
Returns: Dictionary of the computed metrics that require probabilities. If no metric needs probabilities, returns None.
Return type: dict
Raises: TypeError – metrics must be a list
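
A minimal sketch (the 'roc_auc' metric name comes from the plot_score_function entry below; the value in the comment is indicative):

>>> import numpy as np
>>> from transparentai.models.classification.classification_plots import compute_prob_performance
>>> y_true = np.array([0, 1, 1, 0])
>>> y_prob = np.array([0.2, 0.9, 0.6, 0.3])
>>> compute_prob_performance(y_true, y_prob, metrics=['roc_auc'])  # e.g. {'roc_auc': 1.0}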

transparentai.models.classification.classification_plots.plot_confusion_matrix(confusion_matrix)[source]

Shows the confusion matrix.

Parameters: confusion_matrix (array) – result of the confusion_matrix metric
transparentai.models.classification.classification_plots.plot_performance(y_true, y_pred, y_true_valid=None, y_pred_valid=None, metrics=None, **kwargs)[source]

Plots the performance of a classifier. You can use the metrics of your choice with the metrics argument.

It can compare the train and validation sets.

Parameters:
  • y_true (array like) – True labels
  • y_pred (array like (1D or 2D)) – if 1D, predicted labels; if 2D, probabilities (the output of a predict_proba function)
  • y_true_valid (array like (default None)) – True labels for the validation set
  • y_pred_valid (array like (1D or 2D) (default None)) – if 1D, predicted labels; if 2D, probabilities (the output of a predict_proba function)
  • metrics (list (default None)) – List of metrics to plot
Raises: TypeError – if metrics is set it must be a list
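
For example, reusing the adult classifier pattern from the plot_bias example further down this page (the default metrics are used so no metric names are assumed):

>>> from transparentai.datasets import load_adult
>>> from transparentai.models.classification.classification_plots import plot_performance
>>> from sklearn.ensemble import RandomForestClassifier
>>> data = load_adult()
>>> X, Y = data.drop(columns='income'), data['income'].replace({'>50K': 1, '<=50K': 0})
>>> X = X.select_dtypes('number')
>>> clf = RandomForestClassifier().fit(X, Y)
>>> plot_performance(Y, clf.predict_proba(X))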

transparentai.models.classification.classification_plots.plot_roc_curve(roc_curve, roc_auc)[source]

Shows a ROC curve plot with the roc_auc score in the legend.

Parameters:
  • roc_curve (array) – roc_curve metric result for each class
  • roc_auc (array) – roc_auc metric result for each class
transparentai.models.classification.classification_plots.plot_score_function(perf, perf_prob, metric)[source]

Plots a score using a metric-specific function.

E.g. confusion_matrix or roc_auc

Parameters:
  • perf (dict) – Dictionary with computed scores
  • perf_prob (dict) – Dictionary with computed scores (using probabilities)
  • metric (str) – name of the metric
Raises: ValueError – metric does not have a plot function

transparentai.models.classification.classification_plots.plot_table_score_clf(perf)[source]

Inserts a table of scores into a matplotlib figure for a classifier.

Parameters: perf (dict) – Dictionary with computed scores
transparentai.models.classification.classification_plots.preprocess_scores(y_pred)[source]

Preprocesses y_pred for the plot_performance function.

If y_pred contains probabilities, y_pred becomes the predicted classes and y_prob holds the probabilities; otherwise y_prob is None.

Parameters: y_pred (array like (1D or 2D)) – if 1D, predicted labels; if 2D, probabilities (the output of a predict_proba function)
Returns:
  • np.ndarray – array with predicted labels
  • np.ndarray – array with probabilities if available else None
  • int – number of classes
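
A sketch of the expected behaviour (the values in the comments are inferred from the description above, not verified output):

>>> import numpy as np
>>> from transparentai.models.classification.classification_plots import preprocess_scores
>>> y_prob2d = np.array([[0.8, 0.2], [0.3, 0.7]])  # predict_proba-style output
>>> y_pred, y_prob, n_classes = preprocess_scores(y_prob2d)  # y_pred -> [0, 1], n_classes -> 2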

Regression plot functions

transparentai.models.regression.regression_plots.plot_error_distribution(errors)[source]

Plots the error distribution with standard deviation, mean and median.

The error is calculated by the following formula:

\[error = y - \hat{y}\]
Parameters: errors (array like) – Errors of a regressor
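
For instance (illustrative values):

>>> import numpy as np
>>> from transparentai.models.regression.regression_plots import plot_error_distribution
>>> y_true = np.array([3.0, 5.0, 2.5, 4.2])
>>> y_pred = np.array([2.8, 5.4, 2.3, 4.0])
>>> plot_error_distribution(y_true - y_pred)
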
transparentai.models.regression.regression_plots.plot_performance(y_true, y_pred, y_true_valid=None, y_pred_valid=None, metrics=None, **kwargs)[source]

Plots the performance of a regressor. You can use the metrics of your choice with the metrics argument.

It can compare the train and validation sets.

Parameters:
  • y_true (array like) – True target values
  • y_pred (array like) – Predicted values
  • y_true_valid (array like (default None)) – True target values for the validation set
  • y_pred_valid (array like (default None)) – Predicted values for the validation set
  • metrics (list (default None)) – List of metrics to plot
Raises: TypeError – if metrics is set it must be a list
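
A minimal sketch on a scikit-learn toy dataset, using the default metrics:

>>> from sklearn.datasets import load_diabetes
>>> from sklearn.linear_model import LinearRegression
>>> from transparentai.models.regression.regression_plots import plot_performance
>>> X, y = load_diabetes(return_X_y=True)
>>> regr = LinearRegression().fit(X, y)
>>> plot_performance(y, regr.predict(X))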

Explainer plot functions

transparentai.models.explainers.explainer_plots.plot_global_feature_influence(feat_importance, color='#3498db', **kwargs)[source]

Displays the global feature influence, sorted.

Parameters: feat_importance (pd.Series) – Feature importance with features as index and shap values as values
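
For instance (the feature names and shap values are made up for illustration):

>>> import pandas as pd
>>> from transparentai.models.explainers.explainer_plots import plot_global_feature_influence
>>> feat_importance = pd.Series([0.50, 0.22, 0.08], index=['age', 'education-num', 'hours-per-week'])
>>> plot_global_feature_influence(feat_importance)
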
transparentai.models.explainers.explainer_plots.plot_local_feature_influence(feat_importance, base_value, pred, pred_class=None, **kwargs)[source]

Displays the local feature influence, sorted, for a specific prediction.

Parameters:
  • feat_importance (pd.Series) – Feature importance with features as index and shap values as values
  • base_value (number) – prediction value if no feature is put into the model
  • pred (number) – predicted value
  • pred_class (str (default None)) – name of the predicted class (for classifiers)

Fairness plot functions

transparentai.fairness.fairness_plots.format_priv_text(values, max_char)[source]

Formats privileged (or unprivileged) values into a text string that can be displayed.

Parameters:
  • values (list) – List of privileged or unprivileged values
  • max_char (int) – Maximum number of characters allowed in the returned string
Returns: Formatted string for the given values
Return type: str
Raises: TypeError – values must be a list

transparentai.fairness.fairness_plots.get_protected_attr_values(attr, df, privileged_group, privileged=True)[source]

Retrieves all values given the privileged_group argument.

If privileged is True and privileged_group[attr] is a list, it returns that list; if it is a function, it returns the values of df[attr] for which the function returns True.

If privileged is False and privileged_group[attr] is a list, it returns the values of df[attr] that are not in the list; if it is a function, it returns the values of df[attr] for which the function returns False.

Parameters:
  • attr (str) – Protected attribute, which is a key of the privileged_group dictionary
  • df (pd.DataFrame) – Dataframe to extract the privileged group from.
  • privileged_group (dict) – Dictionary with protected attributes as keys (e.g. age or gender) and, as values, either a list of favorable values (like [‘Male’]) or a function returning a boolean that defines the privileged group
  • privileged (bool (default True)) – Whether to condition this metric on the privileged groups (True) or the unprivileged groups (False).
Returns: List of privileged values of the protected attribute attr if privileged is True, else unprivileged values
Return type: list
Raises: ValueError – attr must be in privileged_group
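
The two accepted forms of privileged_group, illustrated on the adult dataset (the age threshold is only an example):

>>> from transparentai.datasets import load_adult
>>> from transparentai.fairness.fairness_plots import get_protected_attr_values
>>> data = load_adult()
>>> privileged_group = {
...     'gender': ['Male'],          # list of favorable values
...     'age': lambda x: x > 30,     # function defining the privileged group
... }
>>> get_protected_attr_values('gender', data, privileged_group)
['Male']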

transparentai.fairness.fairness_plots.plot_attr_title(ax, attr, df, privileged_group)[source]

Plots the protected attribute title with:

  • The attribute name (e.g. Gender)
  • Privileged and unprivileged values
  • Number of privileged and unprivileged values
Parameters:
  • ax (plt.axes.Axes) – axis on which to add the plot
  • attr (str) – Protected attribute, which is a key of the privileged_group dictionary
  • df (pd.DataFrame) – Dataframe to extract the privileged group from.
  • privileged_group (dict) – Dictionary with protected attributes as keys (e.g. age or gender) and, as values, either a list of favorable values (like [‘Male’]) or a function returning a boolean that defines the privileged group
Raises:
  • ValueError – attr must be in df columns
  • ValueError – attr must be in privileged_group keys
transparentai.fairness.fairness_plots.plot_bias(y_true, y_pred, df, privileged_group, pos_label=1, regr_split=None, with_text=True, **kwargs)[source]

Plots the fairness metrics for the protected attributes referred to in the privileged_group argument.

It uses the 4 fairness functions:

  • statistical_parity_difference
  • disparate_impact
  • equal_opportunity_difference
  • average_odds_difference

You can also use it for a regression problem: set the regr_split argument to convert the problem to a binary classification (use ‘mean’ to split at the average). If the favorable label is above the split value, set the pos_label argument to 1, else to 0.

Example

Using this function for a binary classifier:

>>> from transparentai.datasets import load_adult
>>> from transparentai.fairness.fairness_plots import plot_bias
>>> from sklearn.ensemble import RandomForestClassifier
>>> data = load_adult()
>>> X, Y = data.drop(columns='income'), data['income'].replace({'>50K': 1, '<=50K': 0})
>>> X = X.select_dtypes('number')
>>> clf = RandomForestClassifier().fit(X, Y)
>>> y_pred = clf.predict(X)
>>> privileged_group = {'gender': ['Male']}
>>> plot_bias(Y, y_pred, data, privileged_group, with_text=True)
Parameters:
  • y_true (array like) – True labels
  • y_pred (array like) – Predicted labels
  • df (pd.DataFrame) – Dataframe to extract the privileged group from.
  • privileged_group (dict) – Dictionary with protected attributes as keys (e.g. age or gender) and, as values, either a list of favorable values (like [‘Male’]) or a function returning a boolean that defines the privileged group
  • pos_label (number) – The label of the positive class.
  • regr_split (‘mean’ or number (default None)) – For a regression problem, converts the result to a binary classification using ‘mean’ or a chosen number. Both y_true and y_pred become 0 or 1: 0 if equal to or less than the split value (the average if ‘mean’), 1 if greater. If the favorable label is above the split value, set pos_label=1, else pos_label=0.
  • with_text (bool (default True)) – Whether to display the explanation text for the fairness metrics.
transparentai.fairness.fairness_plots.plot_bias_one_attr(ax, metric, score)[source]

Plots a bias metric score bar with an indication of whether it is considered fair or not.

Parameters:
  • ax (plt.axes.Axes) – axis on which to add the plot
  • metric (str) – The name of the metric
  • score (float) – Score value of the metric
transparentai.fairness.fairness_plots.plot_fairness_text(ax, score, metric)[source]

Plots bias metric explanation text.

The text is retrieved by the fairness_metrics_text() function.

Parameters:
  • ax (plt.axes.Axes) – axis on which to add the plot
  • metric (str) – The name of the metric
  • score (float) – Score value of the metric

Monitoring plot functions

transparentai.monitoring.monitoring_plots.plot_monitoring(y_true, y_pred, timestamp=None, interval='month', metrics=None, classification=False, **kwargs)[source]

Plots model performance over a timestamp array, which represents the date or time of each prediction.

If timestamp or interval is None, it just computes the metrics over all the predictions.

If interval is not None, it can be one of the following: ‘year’, ‘month’, ‘day’ or ‘hour’.

  • ‘year’ : format ‘%Y’
  • ‘month’ : format ‘%Y-%m’
  • ‘day’ : format ‘%Y-%m-%d’
  • ‘hour’ : format ‘%Y-%m-%d-%r’

If it is a classification task and you are using y_pred as probabilities, don’t forget to pass the classification=True argument!

You can use the metrics of your choice; for that, refer to the evaluation metrics documentation.

Parameters:
  • y_true (array like) – True labels
  • y_pred (array like (1D or 2D)) – if 1D, predicted labels; if 2D, probabilities (the output of a predict_proba function)
  • timestamp (array like or None (default None)) – Array of datetimes indicating when each prediction occurred
  • interval (str or None (default 'month')) – interval to format the timestamp with
  • metrics (list (default None)) – List of metrics to compute
  • classification (bool (default False)) – Whether the ML task is a classification or not
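
A minimal sketch with synthetic predictions (illustrative only):

>>> import numpy as np
>>> import pandas as pd
>>> from transparentai.monitoring.monitoring_plots import plot_monitoring
>>> rng = np.random.default_rng(0)
>>> y_true = rng.integers(0, 2, size=90)
>>> y_pred = rng.integers(0, 2, size=90)
>>> timestamp = pd.date_range('2020-01-01', periods=90, freq='D')
>>> plot_monitoring(y_true, y_pred, timestamp=timestamp, interval='month', classification=True)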