Statistics

statistics.get_top_deviations(scores, metric='mpe', n=5)

Given a matrix in which each row contains scores of how well a segment fits a model, find the indices of the n most deviant segments.

Args:

scores: A 2-D numpy array (NxM) that contains M scores for each of the N segments.

metric: A string that specifies which score to consider.

n: The number of deviant segments to return.

Return:

The indices of the n most deviant segments.
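The selection described above can be sketched as follows. This is an illustration only, not the module's implementation: the column order in METRIC_COLUMNS is a hypothetical mapping borrowed from the measures listed for score, and ranking by absolute value is an assumption that suits signed error metrics such as 'mpe' (r-squared would need the opposite ordering).

```python
import numpy as np

# Hypothetical column layout; the real layout of `scores` is not documented here.
METRIC_COLUMNS = {"r2": 0, "mae": 1, "me": 2, "mape": 3, "mpe": 4}


def top_deviations_sketch(scores, metric="mpe", n=5):
    """Return the indices of the n rows with the largest absolute score."""
    column = np.asarray(scores, dtype=float)[:, METRIC_COLUMNS[metric]]
    # Assumption: "most deviant" means largest magnitude of the chosen metric.
    order = np.argsort(-np.abs(column), kind="stable")
    return order[:n]


# Four segments, five scores each (made-up values for illustration).
example_scores = np.array([
    [0.90, 1.0, 0.2, 5.0, 1.0],
    [0.50, 4.0, -3.0, 30.0, -25.0],
    [0.80, 2.0, 1.0, 12.0, 10.0],
    [0.95, 0.5, 0.1, 2.0, 0.5],
])
top_two = top_deviations_sketch(example_scores, metric="mpe", n=2)
```

With these made-up values, segments 1 and 2 have the largest absolute mean percentage error, so they are the ones selected.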

statistics.mape1(y_true, y_pred)

Computes the Mean Absolute Percentage Error between the two given time series.

Args:

y_true: A numpy array that contains the actual values of the time series.

y_pred: A numpy array that contains the predicted values of the time series.

Return:

Mean Absolute Percentage Error value.
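For reference, a minimal sketch of the usual Mean Absolute Percentage Error formula. It assumes the result is expressed in percent and that y_true contains no zeros; whether mape1 makes the same choices is not stated here, and mape_sketch is a hypothetical name.

```python
import numpy as np


def mape_sketch(y_true, y_pred):
    """Mean of |actual - predicted| / |actual|, scaled to percent."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Assumption: y_true has no zero entries, otherwise this divides by zero.
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0


actual = np.array([100.0, 200.0, 400.0])
predicted = np.array([110.0, 180.0, 400.0])
mape_value = mape_sketch(actual, predicted)  # relative errors: 10%, 10%, 0%
```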

statistics.mpe1(y_true, y_pred)

Computes the Mean Percentage Error between the two given time series.

Args:

y_true: A numpy array that contains the actual values of the time series.

y_pred: A numpy array that contains the predicted values of the time series.

Return:

Mean Percentage Error value.
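Unlike the absolute variant, the Mean Percentage Error keeps the sign of each error, so over- and under-predictions can cancel. The sketch below illustrates this; the sign convention (actual minus predicted), the percent scaling, and the name mpe_sketch are assumptions, not taken from mpe1.

```python
import numpy as np


def mpe_sketch(y_true, y_pred):
    """Mean of (actual - predicted) / actual, scaled to percent."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Assumption: y_true has no zero entries.
    return np.mean((y_true - y_pred) / y_true) * 100.0


actual = np.array([100.0, 200.0, 400.0])
# One 10% over-prediction and one 10% under-prediction cancel each other.
cancelling = mpe_sketch(actual, np.array([110.0, 180.0, 400.0]))
# Consistent over-prediction gives a negative value under this convention.
biased = mpe_sketch(actual, np.array([110.0, 220.0, 400.0]))
```

A value near zero therefore signals little systematic bias, not necessarily a good fit.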

statistics.multi_corr(df, dep_column)

Computes the coefficient of multiple correlation. The input consists of a dataframe and the column corresponding to the dependent variable.

Args:

df: A date/time DataFrame or any other DataFrame.

dep_column: The column corresponding to the dependent variable.

Return:

The coefficient of multiple correlation between the dependent column and the remaining columns.
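One standard way to obtain this coefficient is R = sqrt(c' Rxx^-1 c), where c holds the correlations between each predictor and the dependent variable and Rxx is the correlation matrix among the predictors. The sketch below follows that textbook formula with pandas and numpy; it is not necessarily how multi_corr is implemented, and it assumes all columns are numeric and the predictors are not perfectly collinear.

```python
import numpy as np
import pandas as pd


def multi_corr_sketch(df, dep_column):
    """Coefficient of multiple correlation via the correlation-matrix formula."""
    corr = df.corr()
    predictors = [col for col in df.columns if col != dep_column]
    c = corr.loc[predictors, dep_column].to_numpy()
    rxx = corr.loc[predictors, predictors].to_numpy()
    # Solve Rxx * w = c instead of inverting Rxx explicitly.
    r_squared = c @ np.linalg.solve(rxx, c)
    return float(np.sqrt(r_squared))


# Made-up data in which y is an exact linear combination of x1 and x2.
frame = pd.DataFrame({
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "x2": [2.0, 1.0, 4.0, 3.0, 6.0],
})
frame["y"] = 2.0 * frame["x1"] + 3.0 * frame["x2"]
r_multiple = multi_corr_sketch(frame, "y")
```

Because y is fully determined by the two predictors in this example, the coefficient comes out at essentially 1.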

statistics.score(y_true, y_pred)

Computes a set of values that measure how well a predicted time series matches the actual time series.

Args:

y_true: A numpy array that contains the actual values of the time series.

y_pred: A numpy array that contains the predicted values of the time series.

Return:

A value for each of the following measures: r-squared, mean absolute error, mean error, mean absolute percentage error, and mean percentage error.
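The five measures listed above can be computed together as in the following sketch. The return order, the tuple return type, the sign convention (actual minus predicted), and the percent scaling are all assumptions made for illustration; score_sketch is a hypothetical name and does not describe the module's own code.

```python
import numpy as np


def score_sketch(y_true, y_pred):
    """Return (r2, mae, me, mape, mpe) for two equally long series."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    error = y_true - y_pred

    # r-squared: 1 - residual sum of squares / total sum of squares.
    ss_res = np.sum(error ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot

    mae = np.mean(np.abs(error))                  # mean absolute error
    me = np.mean(error)                           # mean (signed) error
    # Assumption: y_true has no zero entries for the percentage measures.
    mape = np.mean(np.abs(error / y_true)) * 100.0
    mpe = np.mean(error / y_true) * 100.0
    return r2, mae, me, mape, mpe


r2, mae, me, mape, mpe = score_sketch([100.0, 200.0, 400.0],
                                      [110.0, 180.0, 400.0])
```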