Model Analysis

class aethos.model_analysis.model_analysis.ModelAnalysisBase

Bases: aethos.visualizations.visualizations.Visualizations, aethos.stats.stats.Stats


Writes model to a pickle file.


>>> m = Model(df)
>>> m_results = m.LogisticRegression()
>>> m_results.to_pickle()
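
As a rough illustration of what writing a model to a pickle file involves, the sketch below round-trips a stand-in object with Python's standard pickle module. The dictionary used as a "model" and the file name are hypothetical; the file name and location actually chosen by to_pickle are not shown here.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for a trained model: any picklable object works.
model = {"name": "log_reg", "coef": [0.4, -1.2], "intercept": 0.1}

# Hypothetical output location, used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "model.pkl")

# Write the object to disk in binary mode.
with open(path, "wb") as f:
    pickle.dump(model, f)

# Read it back to confirm the round trip preserves the object.
with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored == model)
```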
to_service(project_name: str)

Creates the service files, including requirements.txt and a Dockerfile, in ~/.aethos/projects, along with the folder structure needed to run the model as a microservice.

Parameters:project_name (str) – Name of the project that you want to create.


>>> m = Model(df)
>>> m_results = m.LogisticRegression()
>>> m_results.to_service('your_proj_name')
class aethos.model_analysis.model_analysis.SupervisedModelAnalysis(model, x_train, x_test, y_train, y_test, model_name)

Bases: aethos.model_analysis.model_analysis.ModelAnalysisBase

decision_plot(num_samples=0.6, sample_no=None, highlight_misclassified=False, output_file='', **decisionplot_kwargs)

Visualize model decisions using cumulative SHAP values.

Each colored line in the plot represents the model prediction for a single observation.

Note that plotting too many samples at once can make the plot unintelligible.

When is a decision plot useful:
  • Show a large number of feature effects clearly.
  • Visualize multioutput predictions.
  • Display the cumulative effect of interactions.
  • Explore feature effects for a range of feature values.
  • Identify outliers.
  • Identify typical prediction paths.
  • Compare and contrast predictions for several models.

How to read the plot:
  • The plot is centered on the x-axis at the model’s expected value.
  • All SHAP values are relative to the model’s expected value like a linear model’s effects are relative to the intercept.
  • The y-axis lists the model’s features. By default, the features are ordered by descending importance.
  • The importance is calculated over the observations plotted. This is usually different than the importance ordering for the entire dataset. In addition to feature importance ordering, the decision plot also supports hierarchical cluster feature ordering and user-defined feature ordering.
  • Each observation’s prediction is represented by a colored line.
  • At the top of the plot, each line strikes the x-axis at its corresponding observation’s predicted value. This value determines the color of the line on a spectrum.
  • Moving from the bottom of the plot to the top, SHAP values for each feature are added to the model’s base value. This shows how each feature contributes to the overall prediction.
  • At the bottom of the plot, the observations converge at the model’s expected value.

Parameters:
  • output_file (str) – Output file name including extension (.png, .jpg, etc.) to save image as.
  • num_samples (int, float, or 'all', optional) – Number of samples to display. A value less than 1 is treated as a percentage of the samples; ‘all’ includes all samples, by default 0.6
  • sample_no (int, optional) – Sample number to isolate and analyze, if provided it overrides num_samples, by default None
  • highlight_misclassified (bool, optional) – True to highlight the misclassified results, by default False
  • feature_order (str or None or list or numpy.ndarray) – Any of “importance” (the default), “hclust” (hierarchical clustering), “none”, or a list/array of indices. hclust is useful for finding outliers.
  • feature_display_range (slice or range) – The slice or range of features to plot after ordering features by feature_order. A step of 1 or None will display the features in ascending order. A step of -1 will display the features in descending order. If feature_display_range=None, slice(-1, -21, -1) is used (i.e. show the last 20 features in descending order). If shap_values contains interaction values, the number of features is automatically expanded to include all possible interactions: N(N + 1)/2 where N = shap_values.shape[1].
  • highlight (Any) – Specify which observations to draw in a different line style. All numpy indexing methods are supported. For example, list of integer indices, or a bool array.
  • link (str) – Use “identity” or “logit” to specify the transformation used for the x-axis. The “logit” link transforms log-odds into probabilities.
  • plot_color (str or matplotlib.colors.ColorMap) – Color spectrum used to draw the plot lines. If str, a registered matplotlib color name is assumed.
  • axis_color (str or int) – Color used to draw plot axes.
  • y_demarc_color (str or int) – Color used to draw feature demarcation lines on the y-axis.
  • alpha (float) – Alpha blending value in [0, 1] used to draw plot lines.
  • color_bar (bool) – Whether to draw the color bar.
  • auto_size_plot (bool) – Whether to automatically size the matplotlib plot to fit the number of features displayed. If False, specify the plot size using matplotlib before calling this function.
  • title (str) – Title of the plot.
  • xlim (tuple[float, float]) – The extents of the x-axis (e.g. (-1.0, 1.0)). If not specified, the limits are determined by the maximum/minimum predictions centered around base_value when link=‘identity’. When link=‘logit’, the x-axis extents are (0, 1) centered at 0.5. xlim values are not transformed by the link function. This argument is provided to simplify producing multiple plots on the same scale for comparison.
  • show (bool) – Whether to automatically display the plot.
  • return_objects (bool) – Whether to return a DecisionPlotResult object containing various plotting features. This can be used to generate multiple decision plots using the same feature ordering and scale, by default True.
  • ignore_warnings (bool) – Plotting many data points or too many features at a time may be slow, or may create very large plots. Set this argument to True to override hard-coded limits that prevent plotting large amounts of data.
  • new_base_value (float) – SHAP values are relative to a base value; by default, the expected value of the model’s raw predictions. Use new_base_value to shift the base value to an arbitrary value (e.g. the cutoff point for a binary classification task).
  • legend_labels (list of str) – List of legend labels. If None, legend will not be shown.
  • legend_location (str) – Legend location. Any of “best”, “upper right”, “upper left”, “lower left”, “lower right”, “right”, “center left”, “center right”, “lower center”, “upper center”, “center”.

Returns:A DecisionPlotResult object if return_objects=True (the default); None otherwise.
Return type:DecisionPlotResult or None



>>> # Plot two decision plots using the same feature order and x-axis.
>>> m = model.LogisticRegression()
>>> r = m.decision_plot()
>>> m.decision_plot(sample_no=42, feature_order=r.feature_idx, xlim=r.xlim)
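
To make the "cumulative SHAP values" idea concrete, the sketch below traces the path one observation's line would follow: it starts at the model's expected (base) value and adds one feature's SHAP value at a time. The function name and the numbers are hypothetical and chosen only for illustration; this is not how the plot itself is drawn.

```python
from itertools import accumulate


def decision_path(base_value, shap_values):
    """Return the running totals traced by one line in a decision plot.

    The first entry is the base value (bottom of the plot); the last entry
    is the prediction for the observation (top of the plot).
    """
    return list(accumulate(shap_values, initial=base_value))


# Hypothetical base value and per-feature SHAP values, in plotting order.
path = decision_path(0.25, [0.5, -0.25, 0.125])
print(path)
```

The final entry of the list equals the base value plus the sum of all SHAP values, which is why every line ends at its observation's predicted value.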
dependence_plot(feature: str, interaction='auto', output_file='', **dependenceplot_kwargs)

A dependence plot is a scatter plot that shows the effect a single feature has on the predictions made by the model.

  • Each dot is a single prediction (row) from the dataset.
  • The x-axis is the value of the feature (from the X matrix).
  • The y-axis is the SHAP value for that feature, which represents how much knowing that feature’s value changes the output of the model for that sample’s prediction.
  • The color corresponds to a second feature that may have an interaction effect with the feature we are plotting (by default this second feature is chosen automatically).
  • If an interaction effect is present between this other feature and the feature we are plotting it will show up as a distinct vertical pattern of coloring.

Parameters:
  • feature (str) – Feature whose impact on the model you want to analyze.
  • interaction ("auto", None, int, or string) – The index of the feature used to color the plot. The name of a feature can also be passed as a string. If “auto” then shap.common.approximate_interactions is used to pick what seems to be the strongest interaction (note that to find the true strongest interaction you need to compute the SHAP interaction values).
  • output_file (str) – Output file name including extension (.png, .jpg, etc.) to save image as.
  • x_jitter (float (0 - 1)) – Adds random jitter to feature values. May increase plot readability when feature is discrete.
  • alpha (float) – The transparency of the data points (between 0 and 1). This can be useful to show the density of the data points when using a large dataset.
  • xmin (float or string) – Represents the lower bound of the plot’s x-axis. It can be a string of the format “percentile(float)” to denote that percentile of the feature’s value used on the x-axis.
  • xmax (float or string) – Represents the upper bound of the plot’s x-axis. It can be a string of the format “percentile(float)” to denote that percentile of the feature’s value used on the x-axis.
  • ax (matplotlib Axes object) – Optionally specify an existing matplotlib Axes object, into which the plot will be placed. In this case we do not create a Figure, otherwise we do.
  • cmap (str or matplotlib.colors.ColorMap) – Color spectrum used to draw the plot lines. If str, a registered matplotlib color name is assumed.


>>> m = model.LogisticRegression()
>>> m.dependence_plot('your_feature_name')
force_plot(sample_no=None, misclassified=False, output_file='', **forceplot_kwargs)

Visualize the given SHAP values with an additive force layout

  • sample_no (int, optional) – Sample number to isolate and analyze, by default None
  • misclassified (bool, optional) – True to only show the misclassified results, by default False
  • output_file (str) – Output file name including extension (.png, .jpg, etc.) to save image as.
  • link ("identity" or "logit") – The transformation used when drawing the tick mark labels. Using logit will change log-odds numbers into probabilities.
  • matplotlib (bool) – Whether to use the default Javascript output, or the (less developed) matplotlib output. Using matplotlib can be helpful in scenarios where rendering Javascript/HTML is inconvenient.


>>> m = model.LogisticRegression()
>>> m.force_plot() # The entire test dataset
>>> m.force_plot(sample_no=1, misclassified=True) # Analyze the first misclassified result

Displays a dashboard interpreting your model’s performance, behaviour and individual predictions.

If you have already run other interpret functions, their results are included in the dashboard; otherwise, all of the other interpretable methods are included in the dashboard.


>>> m = model.LogisticRegression()
>>> m.interpret_model()
interpret_model_behavior(method='all', predictions='default', show=True, **interpret_kwargs)

Provides an interpretable summary of your model’s behaviour based on an explainer.

The explainer can be either ‘morris’ or ‘dependence’ (Partial Dependence).

If ‘all’, a dashboard is displayed with both the morris and dependence analyses.

  • method (str, optional) – Explainer type, can either be ‘all’, ‘morris’ or ‘dependence’, by default ‘all’
  • predictions (str, optional) – Prediction type, can either be ‘default’ (.predict) or ‘probability’ if the model can predict probabilities, by default ‘default’
  • show (bool, optional) – False to not display the plot, by default True


>>> m = model.LogisticRegression()
>>> m.interpret_model_behavior()
interpret_model_performance(method='all', predictions='default', show=True, **interpret_kwargs)

Plots an interpretable display of your model based on a performance metric.

Can be either ‘ROC’ or ‘PR’ (precision-recall) for classification problems.

Can be ‘regperf’ for regression problems.

If ‘all’, a dashboard is displayed with the corresponding explainers for the problem type.

ROC: Receiver Operating Characteristic; PR: Precision Recall; regperf: RegressionPerf.

  • method (str) – Performance metric, either ‘all’, ‘roc’ or ‘PR’ for classification, or ‘regperf’ for regression, by default ‘all’
  • predictions (str, optional) – Prediction type, can either be ‘default’ (.predict) or ‘probability’ if the model can predict probabilities, by default ‘default’
  • show (bool, optional) – False to not display the plot, by default True


>>> m = model.LogisticRegression()
>>> m.interpret_model_performance()
interpret_model_predictions(num_samples=0.25, sample_no=None, method='all', predictions='default', show=True, **interpret_kwargs)

Plots an interpretable display that explains individual predictions of your model.

Supported explainers are either ‘lime’ or ‘shap’.

If ‘all’, a dashboard is displayed with both the lime and shap explanations.

  • num_samples (int, float, or 'all', optional) – Number of samples to display. A value less than 1 is treated as a percentage of the samples; ‘all’ includes all samples, by default 0.25
  • sample_no (int, optional) – Sample number to isolate and analyze, if provided it overrides num_samples, by default None
  • method (str, optional) – Explainer type, can either be ‘all’, ‘lime’, or ‘shap’, by default ‘all’
  • predictions (str, optional) – Prediction type, can either be ‘default’ (.predict) or ‘probability’ if the model can predict probabilities, by default ‘default’
  • show (bool, optional) – False to not display the plot, by default True


>>> m = model.LogisticRegression()
>>> m.interpret_model_predictions()

Prints and logs all the features ranked by importance from most to least important.

Returns:Dictionary of features and their corresponding weights
Return type:dict
Raises:AttributeError – If model does not have coefficients to display


>>> m = model.LogisticRegression()
>>> m.model_weights()

Prints the sample numbers of misclassified samples.


>>> m = model.LogisticRegression()
>>> m.shap_get_misclassified_index()
summary_plot(output_file='', **summaryplot_kwargs)

Create a SHAP summary plot, colored by feature values when they are provided.

For a list of all kwargs, please see the SHAP documentation.

  • output_file (str) – Output file name including extension (.png, .jpg, etc.) to save image as.
  • max_display (int) – How many top features to include in the plot (default is 20, or 7 for interaction plots), by default None
  • plot_type ("dot" (default for single output), "bar" (default for multi-output), "violin", or "compact_dot") – What type of summary plot to produce. Note that “compact_dot” is only used for SHAP interaction values.
  • color (str or matplotlib.colors.ColorMap) – Color spectrum used to draw the plot lines. If str, a registered matplotlib color name is assumed.
  • axis_color (str or int) – Color used to draw plot axes.
  • title (str) – Title of the plot.
  • alpha (float) – Alpha blending value in [0, 1] used to draw plot lines.
  • show (bool) – Whether to automatically display the plot.
  • sort (bool) – Whether to sort features by importance, by default True
  • color_bar (bool) – Whether to draw the color bar.
  • auto_size_plot (bool) – Whether to automatically size the matplotlib plot to fit the number of features displayed. If False, specify the plot size using matplotlib before calling this function.
  • layered_violin_max_num_bins (int) – Max number of bins, by default 20
  • **summaryplot_kwargs – Additional keyword arguments for the summary plot; for more info see the SHAP documentation.


>>> m = model.LogisticRegression()
>>> m.summary_plot()
view_tree(tree_num=0, output_file=None, **kwargs)

Plot decision trees.

  • tree_num (int, optional) – For ensemble, boosting, and stacking methods - the tree number to plot, by default 0
  • output_file (str, optional) – Name of the file including extension, by default None


>>> m = model.DecisionTreeClassifier()
>>> m.view_tree()
>>> m = model.XGBoostClassifier()
>>> m.view_tree(2)
class aethos.model_analysis.classification_model_analysis.ClassificationModelAnalysis(model, x_train, x_test, target, model_name)

Bases: aethos.model_analysis.model_analysis.SupervisedModelAnalysis


It measures how many observations, both positive and negative, were correctly classified.

Return type:float


>>> m = model.LogisticRegression()
>>> m.accuracy()

Average precision (AP) summarizes a precision-recall curve as the weighted mean of precisions achieved at each threshold, with the increase in recall from the previous threshold used as the weight.

Returns:Average Precision Score
Return type:float


>>> m = model.LogisticRegression()
>>> m.average_precision()
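
The weighted mean described above can be written as AP = sum over thresholds of (R_n - R_{n-1}) * P_n. The sketch below is a minimal pure-Python version of that formula for binary 0/1 labels; the function name and sample data are hypothetical, and it is meant as an illustration rather than the implementation behind average_precision().

```python
def average_precision(y_true, scores):
    """AP = sum over thresholds of (R_n - R_{n-1}) * P_n for 0/1 labels."""
    total_pos = sum(y_true)
    if total_pos == 0:
        raise ValueError("average precision needs at least one positive label")
    # Rank observations from the highest score to the lowest.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    ap = 0.0
    true_pos = 0
    prev_recall = 0.0
    for i, (score, label) in enumerate(ranked):
        true_pos += label
        # Evaluate once per distinct score, i.e. once per threshold.
        last_of_group = i + 1 == len(ranked) or ranked[i + 1][0] != score
        if last_of_group:
            precision = true_pos / (i + 1)
            recall = true_pos / total_pos
            ap += (recall - prev_recall) * precision
            prev_recall = recall
    return ap


# Hypothetical labels and predicted scores.
ap = average_precision([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(round(ap, 4))
```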

Balanced accuracy is used in binary and multiclass classification problems to deal with imbalanced datasets. It is defined as the average of recall obtained on each class.

The best value is 1 and the worst value is 0 when adjusted=False.

Returns:Balanced accuracy
Return type:float


>>> m = model.LogisticRegression()
>>> m.balanced_accuracy()
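
As a small worked example of "the average of recall obtained on each class", the sketch below computes per-class recall and averages it. The function name and data are hypothetical; it only illustrates the definition.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Average of the recall obtained on each class present in y_true."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        correct[truth] += int(truth == pred)
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical imbalanced data: four samples of class 0, two of class 1.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 1, 0]
score = balanced_accuracy(y_true, y_pred)
print(score)
```

Here plain accuracy would be 5/6, while the balanced score is lower because half of the minority class is missed.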

Compute the Brier score. The smaller the Brier score, the better, hence the naming with “loss”. Across all items in a set of N predictions, the Brier score measures the mean squared difference between (1) the predicted probability assigned to the possible outcomes for item i, and (2) the actual outcome. Therefore, the lower the Brier score is for a set of predictions, the better the predictions are calibrated.

The Brier score is appropriate for binary and categorical outcomes that can be structured as true or false, but is inappropriate for ordinal variables which can take on three or more values (this is because the Brier score assumes that all possible outcomes are equivalently “distant” from one another)

Returns:Brier loss
Return type:float


>>> m = model.LogisticRegression()
>>> m.brier_loss()
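
The mean squared difference described above can be sketched in a few lines for the binary case. The function name and the sample probabilities are hypothetical and serve only to illustrate the definition.

```python
def brier_loss(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    squared_errors = [(p - y) ** 2 for y, p in zip(y_true, y_prob)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical outcomes (0/1) and predicted probabilities of the positive class.
loss = brier_loss([0, 1, 1, 0], [0.1, 0.9, 0.8, 0.3])
print(round(loss, 4))
```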

Prints and logs the classification report.

The classification report displays and logs the information in this format:

              precision    recall  f1-score   support

           1       1.00      0.67      0.80         3
           2       0.00      0.00      0.00         0
           3       0.00      0.00      0.00         0

   micro avg       1.00      0.67      0.80         3
   macro avg       0.33      0.22      0.27         3
weighted avg       1.00      0.67      0.80         3


>>> m = model.LogisticRegression()
>>> m.classification_report()

Cohen’s kappa tells you how much better your model is than a random classifier that predicts based on class frequencies.

This measure is intended to compare labelings by different human annotators, not a classifier versus a ground truth.

The kappa score is a number between -1 and 1. Scores above 0.8 are generally considered good agreement; zero or lower means no agreement (practically random labels).

Returns:Cohen Kappa score.
Return type:float


>>> m = model.LogisticRegression()
>>> m.cohen_kappa()
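
The comparison against a frequency-based random classifier can be written as kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The sketch below is a minimal pure-Python version with hypothetical names and data, intended only to illustrate the formula.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """kappa = (p_o - p_e) / (1 - p_e)."""
    n = len(y_true)
    # Observed agreement: fraction of matching labels.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement from the class frequencies of both labelings.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        return float("nan")  # Undefined when chance agreement is total.
    return (p_o - p_e) / (1 - p_e)


# Hypothetical labels: 4 of 6 predictions agree, chance agreement is 0.5.
kappa = cohen_kappa([1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 1])
print(round(kappa, 4))
```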
confusion_matrix(title=None, normalize=False, hide_counts=False, x_tick_rotation=0, figsize=None, cmap='Blues', title_fontsize='large', text_fontsize='medium', output_file='')

Prints a confusion matrix as a heatmap.

  • title (str) – The text to display at the top of the matrix, by default ‘Confusion Matrix’
  • normalize (bool) – If False, plot the raw numbers; if True, plot the proportions, by default False
  • hide_counts (bool) – If False, display the counts and percentages; if True, hide them, by default False
  • x_tick_rotation (int) – Degrees to rotate the x ticks by, by default 0
  • figsize (tuple(int, int)) – Size of the figure, by default None
  • cmap (str) – Colormap used for the displayed values; any name accepted by plt.get_cmap (e.g. ‘jet’), by default ‘Blues’
  • title_fontsize (str) – Size of the title, by default ‘large’
  • text_fontsize (str) – Size of the text of the rest of the plot, by default ‘medium’
  • output_file (str) – Output file name including extension (.png, .jpg, etc.) to save image as.


>>> m = model.LogisticRegression()
>>> m.confusion_matrix()
>>> m.confusion_matrix(normalize=True)
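
The difference between raw counts and proportions can be sketched as follows. The helper names are hypothetical, and the row-wise normalisation (each true class sums to 1) is an assumption about how the proportions are computed, not something stated above.

```python
def confusion_counts(y_true, y_pred, labels):
    """Count matrix with true classes as rows and predicted classes as columns."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, pred in zip(y_true, y_pred):
        matrix[index[truth]][index[pred]] += 1
    return matrix


def normalize_rows(matrix):
    """Convert counts to per-true-class proportions (assumed normalisation)."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([count / total if total else 0.0 for count in row])
    return normalized


# Hypothetical labels for a two-class problem.
counts = confusion_counts([0, 0, 0, 1], [0, 0, 1, 1], labels=[0, 1])
proportions = normalize_rows(counts)
print(counts)
print(proportions)
```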
cross_validate(cv_type='strat-kfold', score='accuracy', n_splits=5, shuffle=False, **kwargs)

Runs cross validation on a Classification model.

Scoring Metrics:
  • ‘accuracy’
  • ‘balanced_accuracy’
  • ‘average_precision’
  • ‘brier_score_loss’
  • ‘f1’
  • ‘f1_micro’
  • ‘f1_macro’
  • ‘f1_weighted’
  • ‘f1_samples’
  • ‘neg_log_loss’
  • ‘precision’
  • ‘recall’
  • ‘jaccard’
  • ‘roc_auc’
  • ‘roc_auc_ovr’
  • ‘roc_auc_ovo’
  • ‘roc_auc_ovr_weighted’
  • ‘roc_auc_ovo_weighted’

Parameters:
  • cv_type ({kfold, strat-kfold}, optional) – Cross-validation type, by default “strat-kfold”
  • score (str, optional) – Scoring metric, by default “accuracy”
  • n_splits (int, optional) – Number of times to split the data, by default 5
  • shuffle (bool, optional) – True to shuffle the data, by default False
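
To illustrate what a stratified split aims for, the sketch below deals the indices of each class round-robin into n_splits folds so that every fold keeps roughly the original class proportions. It is a simplified, hypothetical helper without shuffling, not the splitter used by cross_validate.

```python
from collections import defaultdict


def stratified_folds(labels, n_splits=5):
    """Assign sample indices to folds while preserving class proportions."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(n_splits)]
    # Deal each class's indices round-robin across the folds.
    for indices in by_class.values():
        for position, idx in enumerate(indices):
            folds[position % n_splits].append(idx)
    return folds


# Hypothetical labels: eight samples of class 0 and four of class 1.
labels = [0] * 8 + [1] * 4
folds = stratified_folds(labels, n_splits=4)
for fold in folds:
    print(sorted(fold), [labels[i] for i in sorted(fold)])
```

Each fold serves once as the validation set while the remaining folds are used for training.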
decision_boundary(x=None, y=None, title='Decisioun Boundary')

Plots a decision boundary for a given model.

If no x or y columns are provided, it defaults to the first 2 columns of your data.

  • x (str, optional) – Column in the dataframe to plot, Feature one, by default None
  • y (str, optional) – Column in the dataframe to plot, Feature two, by default None
  • title (str, optional) – Title of the decision boundary plot, by default “Decisioun Boundary”

The F1 score is the harmonic mean of the precision and recall, where an F1 score reaches its best value at 1 and worst score at 0. The relative contributions of precision and recall to the F1 score are equal. The formula for the F1 score is:

F1 = 2 * (precision * recall) / (precision + recall)

In the multi-class and multi-label case, this is the average of the F1 score of each class with weighting depending on the average parameter.

Returns:F1 Score
Return type:float


>>> m = model.LogisticRegression()
>>> m.f1()
fbeta(beta=0.5, **kwargs)

The F-beta score is the weighted harmonic mean of precision and recall, reaching its optimal value at 1 and its worst value at 0. The beta parameter determines the weight of recall in the combined score. Beta < 1 lends more weight to precision, while beta > 1 favors recall (beta -> 0 considers only precision, beta -> inf only recall).

Parameters:beta (float, optional) – Weight of recall relative to precision in the harmonic mean, by default 0.5
Returns:Fbeta score
Return type:float


>>> m = model.LogisticRegression()
>>> m.fbeta()
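
The effect of beta can be seen with a small worked example built on the standard formula F_beta = (1 + beta^2) * P * R / (beta^2 * P + R). The function name and the counts are hypothetical; with beta=1 the result is the F1 score.

```python
def fbeta_score(tp, fp, fn, beta=0.5):
    """Weighted harmonic mean of precision and recall from raw counts."""
    if tp == 0:
        return 0.0  # Precision and recall are both zero (or undefined).
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    beta_sq = beta ** 2
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


# Hypothetical counts: precision is 0.75 and recall is 0.6.
f_half = fbeta_score(tp=6, fp=2, fn=4, beta=0.5)  # Leans towards precision.
f_one = fbeta_score(tp=6, fp=2, fn=4, beta=1.0)   # Plain F1.
f_two = fbeta_score(tp=6, fp=2, fn=4, beta=2.0)   # Leans towards recall.
print(round(f_half, 4), round(f_one, 4), round(f_two, 4))
```

Because precision is higher than recall in this example, the score falls as beta grows.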

The Hamming loss is the fraction of labels that are incorrectly predicted.

Returns:Hamming loss
Return type:float


>>> m = model.LogisticRegression()
>>> m.hamming_loss()

Computes the average distance between the model and the data using hinge loss, a one-sided metric that considers only prediction errors.

Returns:Hinge loss
Return type:float


>>> m = model.LogisticRegression()
>>> m.hinge_loss()

The Jaccard index, or Jaccard similarity coefficient, defined as the size of the intersection divided by the size of the union of two label sets, is used to compare the set of predicted labels for a sample to the corresponding set of labels in y_true.

Returns:Jaccard Score
Return type:float


>>> m = model.LogisticRegression()
>>> m.jaccard()

Log loss, aka logistic loss or cross-entropy loss.

This is the loss function used in (multinomial) logistic regression and extensions of it such as neural networks, defined as the negative log-likelihood of the true labels given a probabilistic classifier’s predictions.

Returns:Log loss
Return type:float


>>> m = model.LogisticRegression()
>>> m.log_loss()
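
For the binary case, the negative log-likelihood described above reduces to the mean of -(y * log(p) + (1 - y) * log(1 - p)). The sketch below is a minimal illustration with hypothetical names and data; the clipping constant eps is an assumption added to avoid log(0).

```python
import math


def binary_log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood of 0/1 labels given P(label == 1)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # Keep p strictly inside (0, 1).
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


# Hypothetical predictions: confident and correct vs. hesitant and correct.
confident = binary_log_loss([1, 0], [0.9, 0.1])
hesitant = binary_log_loss([1, 0], [0.6, 0.4])
print(round(confident, 4), round(hesitant, 4))
```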

The Matthews correlation coefficient is used in machine learning as a measure of the quality of binary and multiclass classifications. It takes into account true and false positives and negatives and is generally regarded as a balanced measure which can be used even if the classes are of very different sizes. The MCC is in essence a correlation coefficient value between -1 and +1. A coefficient of +1 represents a perfect prediction, 0 an average random prediction and -1 an inverse prediction. The statistic is also known as the phi coefficient.

Returns:Matthews Correlation Coefficient
Return type:float


>>> m = model.LogisticRegression()
>>> m.mathews_corr_coef()
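
For binary problems the coefficient can be computed directly from the four confusion-matrix counts. The sketch below uses the standard formula with a hypothetical function name; returning 0 when the denominator is zero is a convention assumed here.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (tp*tn - fp*fn) / sqrt((tp+fp)(tp+fn)(tn+fp)(tn+fn))."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # Assumed convention for the degenerate case.
    return (tp * tn - fp * fn) / denominator


# Hypothetical counts for a perfect, an inverse and an uninformative classifier.
perfect = matthews_corrcoef(tp=5, tn=5, fp=0, fn=0)
inverse = matthews_corrcoef(tp=0, tn=0, fp=5, fn=5)
uninformative = matthews_corrcoef(tp=5, tn=5, fp=5, fn=5)
print(perfect, inverse, uninformative)
```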

Measures how well your model performed against certain metrics.

For multiclass classification problems, the ‘macro’ average is used.

If project metrics have been specified, those metrics are displayed; otherwise the specified metrics, or all metrics, are displayed.

For more detailed information and parameters, please see the scikit-learn documentation for the corresponding metric.

Supported metrics are:

‘Accuracy’: ‘Measures how many observations, both positive and negative, were correctly classified.’,

‘Balanced Accuracy’: ‘The balanced accuracy in binary and multiclass classification problems to deal with imbalanced datasets. It is defined as the average of recall obtained on each class.’,

‘Average Precision’: ‘Summarizes a precision-recall curve as the weighted mean of precisions achieved at each threshold’,

‘ROC AUC’: ‘Shows how good at ranking predictions your model is. It tells you what is the probability that a randomly chosen positive instance is ranked higher than a randomly chosen negative instance.’,

‘Zero One Loss’: ‘Fraction of misclassifications.’,

‘Precision’: ‘It measures how many observations predicted as positive are positive. Good to use when False Positives are costly.’,

‘Recall’: ‘It measures how many observations out of all positive observations have been classified as positive. Good to use when catching all positive occurrences matters, usually at the cost of more false positives.’,

‘Matthews Correlation Coefficient’: ‘It’s a correlation between predicted classes and ground truth.’,

‘Log Loss’: ‘Difference between ground truth and predicted score for every observation and average those errors over all observations.’,

‘Jaccard’: ‘Defined as the size of the intersection divided by the size of the union of two label sets, is used to compare set of predicted labels for a sample to the corresponding set of true labels.’,

‘Hinge Loss’: ‘Computes the average distance between the model and the data using hinge loss, a one-sided metric that considers only prediction errors.’,

‘Hamming Loss’: ‘The Hamming loss is the fraction of labels that are incorrectly predicted.’,

‘F-Beta’: ‘It’s the harmonic mean between precision and recall, with an emphasis on one or the other. Takes into account both metrics, good for imbalanced problems (spam, fraud, etc.).’,

‘F1’: ‘It’s the harmonic mean between precision and recall. Takes into account both metrics, good for imbalanced problems (spam, fraud, etc.).’,

‘Cohen Kappa’: ‘Cohen Kappa tells you how much better your model is than a random classifier that predicts based on class frequencies. Works well for imbalanced problems.’,

‘Brier Loss’: ‘It is a measure of how far your predictions lie from the true values. Basically, it is a mean square error in the probability space.’

Parameters:metrics (str(s), optional) – Specific type of metrics to view


>>> m = model.LogisticRegression()
>>> m.metrics()
>>> m.metrics('F1', 'F-Beta')

The precision is the ratio tp / (tp + fp) where tp is the number of true positives and fp the number of false positives.

The precision is intuitively the ability of the classifier not to label as positive a sample that is negative.

The best value is 1 and the worst value is 0.

Return type:float


>>> m = model.LogisticRegression()
>>> m.precision()

The recall is the ratio tp / (tp + fn) where tp is the number of true positives and fn the number of false negatives.

The recall is intuitively the ability of the classifier to find all the positive samples.

The best value is 1 and the worst value is 0.

Return type:float


>>> m = model.LogisticRegression()
>>> m.recall()

This metric shows how good your model is at ranking predictions. It gives the probability that a randomly chosen positive instance is ranked higher than a randomly chosen negative instance.

Returns:ROC AUC Score
Return type:float


>>> m = model.LogisticRegression()
>>> m.roc_auc()
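
The ranking interpretation above can be checked directly by comparing every positive instance with every negative one. The sketch below is a brute-force illustration with hypothetical names and data (ties count as half a win); it is not an efficient way to compute the score.

```python
def roc_auc_by_pairs(y_true, scores):
    """Fraction of (positive, negative) pairs where the positive ranks higher."""
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5  # A tie counts as half a win.
    return wins / (len(positives) * len(negatives))


# Hypothetical labels and scores: 3 of the 4 pairs are ranked correctly.
auc = roc_auc_by_pairs([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(auc)
```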
roc_curve(title=True, output_file='')

Plots an ROC curve and displays the ROC statistics (area under the curve).

  • figsize (tuple(int, int), optional) – Figure size, by default (600,450)
  • title (bool) – Whether to display title, by default True
  • output_file (str, optional) – If a name is provided save the plot to an html file, by default ‘’


>>> m = model.LogisticRegression()
>>> m.roc_curve()

If normalize is True, returns the fraction of misclassifications (float); otherwise it returns the number of misclassifications (int).

The best performance is 0.

Returns:Zero one loss
Return type:float


>>> m = model.LogisticRegression()
>>> m.zero_one_loss()
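
The two return forms (fraction versus count) can be sketched as follows. The function and its normalize flag mirror the behaviour described above but are written here as a hypothetical stand-alone helper.

```python
def zero_one_loss(y_true, y_pred, normalize=True):
    """Fraction (normalize=True) or number (normalize=False) of errors."""
    errors = sum(truth != pred for truth, pred in zip(y_true, y_pred))
    return errors / len(y_true) if normalize else errors


# Hypothetical labels with a single misclassification out of four.
y_true = [1, 2, 3, 4]
y_pred = [2, 2, 3, 4]
fraction = zero_one_loss(y_true, y_pred)
count = zero_one_loss(y_true, y_pred, normalize=False)
print(fraction, count)
```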
class aethos.model_analysis.regression_model_analysis.RegressionModelAnalysis(model, x_train, x_test, target, model_name)

Bases: aethos.model_analysis.model_analysis.SupervisedModelAnalysis

cross_validate(cv_type='kfold', score='neg_root_mean_squared_error', n_splits=5, shuffle=False, **kwargs)

Runs cross validation on a Regression model.

Scoring Metrics:
  • ‘explained_variance’
  • ‘max_error’
  • ‘neg_mean_absolute_error’ –> MAE
  • ‘neg_mean_squared_error’ –> MSE
  • ‘neg_mean_squared_log_error’ –> MSLE
  • ‘neg_median_absolute_error’ –> MeAE
  • ‘r2’
  • ‘neg_mean_poisson_deviance’
  • ‘neg_mean_gamma_deviance’

Parameters:
  • cv_type ({kfold, strat-kfold}, optional) – Cross-validation type, by default “kfold”
  • score (str, optional) – Scoring metric, by default “neg_root_mean_squared_error”
  • n_splits (int, optional) – Number of times to split the data, by default 5
  • shuffle (bool, optional) – True to shuffle the data, by default False
explained_variance(multioutput='uniform_average', **kwargs)

Explained variance regression score function

Best possible score is 1.0, lower values are worse.

Parameters:multioutput (string in [‘raw_values’, ‘uniform_average’, ‘variance_weighted’] or array-like of shape (n_outputs)) –

Defines aggregating of multiple output scores. Array-like value defines weights used to average scores.

‘raw_values’ :
Returns a full set of scores in case of multioutput input.
‘uniform_average’ :
Scores of all outputs are averaged with uniform weight.
‘variance_weighted’ :
Scores of all outputs are averaged, weighted by the variances of each individual output.

By default ‘uniform_average’

Returns:Explained Variance
Return type:float


>>> m = model.LinearRegression()
>>> m.explained_variance()

Returns the maximum residual error.

Returns:Max error
Return type:float


>>> m = model.LinearRegression()
>>> m.max_error()

Mean absolute error.

Returns:Mean absolute error.
Return type:float


>>> m = model.LinearRegression()
>>> m.mean_abs_error()

Mean squared error.

Returns:Mean squared error.
Return type:float


>>> m = model.LinearRegression()
>>> m.mean_sq_error()

Mean squared log error.

Returns:Mean squared log error.
Return type:float


>>> m = model.LinearRegression()
>>> m.mean_sq_log_error()
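
A common definition of this metric is the mean of (log(1 + y) - log(1 + y_hat))^2, which penalises relative rather than absolute differences. The sketch below illustrates that definition with hypothetical names and data; it assumes non-negative targets and predictions.

```python
import math


def mean_squared_log_error(y_true, y_pred):
    """Mean of squared differences between log(1 + y) and log(1 + y_hat)."""
    squared = [
        (math.log1p(truth) - math.log1p(pred)) ** 2
        for truth, pred in zip(y_true, y_pred)
    ]
    return sum(squared) / len(squared)


# Hypothetical non-negative targets and predictions.
msle = mean_squared_log_error([3, 5, 2.5, 7], [2.5, 5, 4, 8])
print(round(msle, 4))
```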

Median absolute error.

Returns:Median absolute error.
Return type:float


>>> m = model.LinearRegression()
>>> m.median_abs_error()

Measures how well your model performed against certain metrics.

If project metrics have been specified, those metrics are displayed; otherwise the specified metrics, or all metrics, are displayed.

For more detailed information and parameters, please see the scikit-learn documentation for the corresponding metric.

Supported metrics are:

‘Explained Variance’: ‘Explained variance regression score function. Best possible score is 1.0, lower values are worse.’,

‘Max Error’: ‘Returns the maximum residual error.’,

‘Mean Absolute Error’: ‘Mean of the absolute values of all residuals’,

‘Mean Squared Error’: ‘Mean of the squared residuals’,

‘Root Mean Squared Error’: ‘Square root of the Mean Squared Error’,

‘Mean Squared Log Error’: ‘Mean of the squared differences between the logarithms of the actual and predicted values’,

‘Median Absolute Error’: ‘Median of the absolute values of all residuals’,

‘R2’: ‘R-squared (R2) is a statistical measure that represents the proportion of the variance for a dependent variable that is explained by an independent variable or variables in a regression model.’,

‘SMAPE’: ‘Symmetric mean absolute percentage error. It is an accuracy measure based on percentage (or relative) errors.’

Parameters:metrics (str(s), optional) – Specific type of metrics to view


>>> m = model.LinearRegression()
>>> m.metrics()
>>> m.metrics('SMAPE', 'Root Mean Squared Error')
plot_predicted_actual(output_file='', **scatterplot_kwargs)

Plots the actual data vs. predictions

Parameters:output_file (str, optional) – Output file name, by default “”

R^2 (coefficient of determination) regression score function.

R-squared (R2) is a statistical measure that represents the proportion of the variance for a dependent variable that is explained by an independent variable or variables in a regression model.

The best possible score is 1.0, and the score can be negative because a model can be arbitrarily worse than a constant predictor. A constant model that always predicts the expected value of y, disregarding the input features, gets an R^2 score of 0.0.

Returns:R2 coefficient.
Return type:float


>>> m = model.LinearRegression()
>>> m.r2()
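
The behaviour described above (1.0 at best, 0.0 for a constant mean predictor, negative for worse models) follows from the usual definition R^2 = 1 - SS_res / SS_tot. The sketch below assumes that definition; r2_sketch and the sample values are hypothetical and not part of aethos.

```python
def r2_sketch(y_true, y_pred):
    """Hypothetical helper: 1 - (residual sum of squares / total sum of squares)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot


# Illustrative values only.
y_true = [3.0, -0.5, 2.0, 7.0]
mean_y = sum(y_true) / len(y_true)

r2_good = r2_sketch(y_true, [2.5, 0.0, 2.0, 8.0])        # close to 1.0
r2_constant = r2_sketch(y_true, [mean_y] * len(y_true))  # always predicts the mean
r2_bad = r2_sketch(y_true, [7.0, 7.0, -0.5, -0.5])       # worse than the mean
print(r2_good, r2_constant, r2_bad)
```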

Root mean squared error.

Calculated by taking the square root of the Mean Squared Error.

Returns:Root mean squared error.
Return type:float


>>> m = model.LinearRegression()
>>> m.root_mean_sq_error()

Symmetric mean absolute percentage error.

It is an accuracy measure based on percentage (or relative) errors.

Return type:float


>>> m = model.LinearRegression()
>>> m.smape()
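
SMAPE has several definitions in the literature. The sketch below assumes one common variant, 100 / n * sum(2 * |predicted - actual| / (|actual| + |predicted|)), which ranges from 0 to 200; the variant aethos uses may differ. The name smape_sketch and the sample values are hypothetical.

```python
def smape_sketch(y_true, y_pred):
    """Hypothetical helper for one common SMAPE variant (range 0 to 200)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for t, p in zip(y_true, y_pred):
        denom = abs(t) + abs(p)
        # Assumption: a pair where both values are zero contributes no error.
        if denom != 0:
            total += 2.0 * abs(p - t) / denom
    return 100.0 * total / len(y_true)


# Illustrative values only.
smape_value = smape_sketch([100.0, 200.0], [110.0, 180.0])
print(f"SMAPE: {smape_value:.2f}")
```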
class aethos.model_analysis.unsupervised_model_analysis.UnsupervisedModelAnalysis(model, data, model_name)

Bases: aethos.model_analysis.model_analysis.ModelAnalysisBase

filter_cluster(cluster_no: int)

Filters data by a cluster number for analysis.

Parameters:cluster_no (int) – Cluster number to filter by
Returns:Filtered data or test dataframe
Return type:DataFrame


>>> m = model.KMeans()
>>> m.filter_cluster(1)
plot_clusters(dim=2, reduce='pca', output_file='', **kwargs)

Plots the clusters in either 2D or 3D space, with the points of each cluster shown in a different colour.

For 2d plotting options, see:

For 3d plotting options, see:

Parameters:
  • dim (2 or 3, optional) – Dimension of the plot, either 2 for 2D or 3 for 3D, by default 2
  • reduce (str {'pca', 'tvsd', 'lle', 'tsne'}, optional) – Dimension reduction strategy, by default “pca”
  • output_file (str) – Output file name, including extension (.png, .jpg, etc.), to save the image as.


>>> m = model.KMeans()
>>> m.plot_clusters()
>>> m.plot_clusters(dim=3)
class aethos.model_analysis.text_model_analysis.TextModelAnalysis(model, data, model_name, **kwargs)

Bases: aethos.model_analysis.model_analysis.ModelAnalysisBase


Displays the coherence score of the topic model.

For more info on topic coherence:

Parameters:col_name (str) – Column name that was used as input for the LDA model


>>> m = model.LDA()
>>> m.coherence_score()

Displays the model perplexity of the topic model.

Perplexity is a measurement of how well a probability distribution or probability model predicts a sample. It may be used to compare probability models.

A low perplexity indicates the probability distribution is good at predicting the sample.


>>> m = model.LDA()
>>> m.model_perplexity()
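
To illustrate why lower perplexity is better, the sketch below uses the general definition: the exponential of the negative mean log-probability the model assigns to the observed items. This is not the computation performed by the underlying topic model library; perplexity_sketch and the probabilities are hypothetical.

```python
import math


def perplexity_sketch(probs):
    """Hypothetical helper: exp(-mean(log p)) over the observed items."""
    if not probs or any(p <= 0 for p in probs):
        raise ValueError("probabilities must be non-empty and strictly positive")
    avg_log_prob = sum(math.log(p) for p in probs) / len(probs)
    return math.exp(-avg_log_prob)


# A model that spreads probability evenly over 4 outcomes is "as confused"
# as a fair 4-sided die, so its perplexity is about 4.
uniform_model = perplexity_sketch([0.25] * 8)

# A model that assigns higher probability to what was observed scores lower.
sharper_model = perplexity_sketch([0.5] * 8)
print(uniform_model, sharper_model)
```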
view(original_text, model_output)

View the original text and the model output in a more user-friendly format.

Parameters:
  • original_text (str) – Column name of the original text
  • model_output (str) – Column name of the model text


>>> m = model.LDA()
>>> m.view('original_text_col_name', 'model_output_col_name')
view_topic(topic_num: int, **kwargs)

View a specific topic from the topic model.

Parameters:topic_num (int) – Number of the topic to view
Returns:String representation of topic and probabilities
Return type:str


>>> m = model.LDA()
>>> m.view_topic(1)
view_topics(num_topics=10, **kwargs)

View topics from the topic model.

Parameters:num_topics (int, optional) – Number of topics to view, by default 10
Returns:String representation of topics and probabilities
Return type:str


>>> m = model.LDA()
>>> m.view_topics()

Visualize topics using pyLDAvis.

Parameters:
  • R (int) – The number of terms to display in the bar charts of the visualization. Default is 30. Recommended to be roughly between 10 and 50.
  • lambda_step (float, between 0 and 1) – Determines the interstep distance in the grid of lambda values over which to iterate when computing relevance. Default is 0.01. Recommended to be between 0.01 and 0.1.
  • mds (function or {'tsne', 'mmds'}) – A function that takes topic_term_dists as an input and outputs an n_topics by 2 distance matrix. The output approximates the distance between topics. See js_PCoA() for details on the default function. As a string, it currently accepts pcoa, mmds, or tsne (or their upper-case variants); the latter two require the sklearn package to be installed.
  • n_jobs (int) – The number of cores to use for the computations. The regular joblib conventions are followed, so -1, which is the default, uses all cores.
  • plot_opts (dict, with keys ‘xlab’ and ‘ylab’) – Dictionary of plotting options; currently only used for the axis labels.
  • sort_topics (bool) – Sort topics by topic proportion (percentage of tokens covered). Set to False to keep the original topic order.


>>> m = model.LDA()
>>> m.visualize_topics()