Models
Base
- class bsix.models.base.BaseSurvival[source]
Bases: BaseEstimator, ABC
Abstract class for survival analysis models.
- static dinamic_discretise(y, dataset, seed=0, plot=False)[source]
Discretise the data with a piecewise exponential approach and show the result on a Kaplan-Meier plot.
- static generate_simulated_survival_data(number_rows=1000, number_columns=10, censored=0.75, relation=None, seed=0)[source]
Generate simulated survival data.
- static plot_coefficients(coefficients, estimator_name, dataset, seed=None, progression=None)[source]
Plot XAI coefficients for the data (lollipop plot).
- static plot_individual_shap(shap_explainer, identifier_index, index, scaler, estimator_name, dataset, seed=None, progression=None)[source]
Plot SHAP values for an individual instance (horizontal bar plot).
- static plot_shap(shap_explainer, index, scaler, estimator_name, dataset, seed=None, progression=None)[source]
Plot SHAP values for the data (beeswarm plot).
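The static helpers above deal with simulated, censored data and Kaplan-Meier curves. The following is a minimal, standard-library sketch of both ideas, not the bsix implementation: `simulate_survival` and `kaplan_meier` are hypothetical names, and the exponential event times and uniform censoring mechanism are assumptions made only for illustration.

```python
import random


def simulate_survival(n_rows=200, censored=0.75, seed=0):
    """Return (times, events); events[i] is True when the event was observed."""
    rng = random.Random(seed)
    times, events = [], []
    for _ in range(n_rows):
        event_time = rng.expovariate(1.0)  # assumed exponential event times
        if rng.random() < censored:
            # Censored row: the subject leaves the study before the event.
            times.append(event_time * rng.random())
            events.append(False)
        else:
            times.append(event_time)
            events.append(True)
    return times, events


def kaplan_meier(times, events):
    """Return [(t, S(t))] evaluated at each distinct observed event time."""
    curve, surv = [], 1.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if e and u == t)
        surv *= 1.0 - deaths / at_risk
        curve.append((t, surv))
    return curve


times, events = simulate_survival()
curve = kaplan_meier(times, events)
print(curve[:3])
```

Censored rows never trigger a drop in the curve, but they still count in the at-risk set until their own observed time.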
Methodologies
- class bsix.models.AcceleratedFailureTime(type='WeibullAFT', penalizer=0.0, l1_ratio=0.0)[source]
Bases: BaseSurvival
Weibull Accelerated Failure Time model.
- calculate_xai(X, index, scaler, dataset, seed, feature_names, background=False, plot=False)[source]
Calculate XAI values.
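For orientation, a Weibull accelerated failure time model is commonly parametrised as S(t | x) = exp(-(t / λ(x))^ρ) with λ(x) = exp(b0 + b·x). The sketch below evaluates that formula directly; the function name and the coefficient values are hypothetical, and whether AcceleratedFailureTime uses exactly this parametrisation is an assumption.

```python
import math


def weibull_aft_survival(t, x, coefs, intercept, rho):
    """S(t | x) = exp(-(t / lam) ** rho) with lam = exp(intercept + coefs . x)."""
    lam = math.exp(intercept + sum(b * xi for b, xi in zip(coefs, x)))
    return math.exp(-((t / lam) ** rho))


# Hypothetical coefficients, for illustration only.
coefs, intercept, rho = [0.5, -0.3], 1.0, 1.5

baseline = weibull_aft_survival(2.0, [0.0, 0.0], coefs, intercept, rho)
shifted = weibull_aft_survival(2.0, [1.0, 0.0], coefs, intercept, rho)

# A positive coefficient stretches the time scale, so survival at t=2 is higher.
print(baseline, shifted)
```

The "accelerated" part is visible in the formula: raising the first covariate by one unit rescales time by exp(0.5), so the survival probability at t·exp(0.5) for the shifted subject equals the baseline probability at t.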
- class bsix.models.BaseCoxRegression(alpha=0.0, ties='breslow', n_iter=100)[source]
Bases: BaseSurvival
Cox Regression model.
- calculate_xai(X, index, scaler, dataset, seed, feature_names, background=False, plot=False)[source]
Calculate XAI values.
- class bsix.models.BaseCoxRegressionWithTimeVarying(penalizer=0.0, l1_ratio=0.0, formula=None)[source]
Bases: BaseSurvival
Cox Regression with Time-Varying Covariates model.
- calculate_xai(X, index, scaler, dataset, seed, feature_names, background=False, plot=False)[source]
Calculate XAI values.
- class bsix.models.BaseRandomSurvivalForest(seed, n_jobs=-1, n_estimators=100, max_depth=None, min_samples_leaf=3, min_samples_split=6)[source]
Bases: BaseSurvival
Random Survival Forest model.
- calculate_xai(X, index, scaler, dataset, seed, feature_names, background=False, plot=False)[source]
Calculate XAI values.
- class bsix.models.BaseSurvivalTree(seed, max_depth=5, min_samples_split=2, min_samples_leaf=1)[source]
Bases: BaseSurvival
Survival Tree model.
- calculate_xai(X, index, scaler, dataset, seed, feature_names, background=False, plot=False)[source]
Calculate XAI values.
- class bsix.models.CoxRegression(alpha=0.0, ties='breslow', n_iter=100)[source]
Bases: BaseSurvival
Cox Regression model.
- Parameters:
- coef_
Estimated coefficients for the model.
- Type:
array-like, shape (n_features,)
- breslow
Breslow estimator for baseline hazards.
- Type:
BreslowEstimator
- survival_function
Estimated survival function.
- Type:
array-like, shape (n_samples, n_times)
- cumulative_hazard_function
Estimated cumulative hazard function.
- Type:
array-like, shape (n_samples, n_times)
- shap_explainer
SHAP explainer for model interpretability.
- Type:
shap.Explainer
Examples
from bsix.models.metodologies import CoxRegression
model = CoxRegression(alpha=0.1, ties="efron", n_iter=200)
model.fit(X_train, y_train)
- calculate_xai(X, index, scaler, dataset, seed, feature_names, background=False, plot=False)[source]
Calculate XAI values.
- Parameters:
X (array-like, shape (n_samples, n_features)) – Input data.
index (array-like, shape (n_samples,)) – Index for the samples.
scaler (object) – Scaler used for the data.
dataset (str) – Name of the dataset.
seed (int) – Random seed for reproducibility.
background (bool, default = False) – Whether to use background data for SHAP.
plot (bool, default = False) – Whether to plot the XAI values.
- Returns:
shap_explainer (shap.Explainer) – SHAP explainer for model interpretability.
coefficients (dict) – Dictionary of feature coefficients sorted by absolute value.
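The returned coefficients dictionary is described as sorted by absolute value. A minimal sketch of such an ordering follows; `sort_coefficients` is a hypothetical helper, the feature names and values are made up, and descending order is an assumption, since the text does not state the direction.

```python
def sort_coefficients(feature_names, coef):
    """Map feature name -> coefficient, ordered by decreasing |coefficient|."""
    pairs = sorted(zip(feature_names, coef), key=lambda p: abs(p[1]), reverse=True)
    return dict(pairs)  # dicts keep insertion order in Python 3.7+


coefficients = sort_coefficients(["age", "stage", "bmi"], [0.02, -1.10, 0.35])
print(coefficients)  # {'stage': -1.1, 'bmi': 0.35, 'age': 0.02}
```

Sorting on the absolute value keeps strongly negative effects at the top alongside strongly positive ones, which is what a lollipop plot of coefficients usually needs.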
- fit(X, y)[source]
Fit the model to the data.
- Parameters:
X (array-like, shape (n_samples, n_features)) – Training data.
y (structured array-like, shape (n_samples,)) – Target training values (events, times).
- Returns:
self – Fitted estimator.
- Return type:
- predict(X)[source]
Predict risk scores for the given data.
- Parameters:
X (array-like, shape (n_samples, n_features)) – Input data.
- Returns:
risk – Predicted risk scores.
- Return type:
array-like, shape (n_samples,)
- predict_cumulative_hazard_function(X, index, dataset, seed, plot=False)[source]
Predict the cumulative hazard function for the given data.
- Parameters:
- Returns:
cumulative_hazard_function – Predicted cumulative hazard function.
- Return type:
array-like, shape (n_samples, n_times)
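The breslow attribute and the cumulative hazard and survival outputs are linked by two standard formulas: the Breslow estimate of the baseline cumulative hazard, and S(t | x) = exp(-H0(t) · exp(x·β)). The sketch below shows both in plain Python; the function names are hypothetical and this is not the code CoxRegression runs.

```python
import math


def breslow_cumulative_baseline_hazard(times, events, risk_scores):
    """Return [(t, H0(t))]; risk_scores[i] is exp(x_i . beta) for subject i."""
    h0, curve = 0.0, []
    for t in sorted({t for t, e in zip(times, events) if e}):
        deaths = sum(1 for u, e in zip(times, events) if e and u == t)
        # Sum of risk scores over everyone still at risk at time t.
        risk_sum = sum(r for u, r in zip(times, risk_scores) if u >= t)
        h0 += deaths / risk_sum
        curve.append((t, h0))
    return curve


def survival_at(curve, risk_score, t):
    """S(t | x) = exp(-H0(t) * exp(x . beta)), with H0 as a step function."""
    h0 = 0.0
    for u, h in curve:
        if u <= t:
            h0 = h
    return math.exp(-h0 * risk_score)


times = [2, 3, 5, 7]
events = [True, False, True, True]
baseline = breslow_cumulative_baseline_hazard(times, events, [1.0] * 4)
print(baseline)
print(survival_at(baseline, 2.0, 5))
```

With all risk scores equal to 1 the estimate reduces to the Nelson-Aalen estimator, which makes small cases easy to check by hand.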
- class bsix.models.CoxRegressionWithTimeVarying(alpha=0.0, ties='breslow', n_iter=100)[source]
Bases: BaseSurvival
Cox Regression Time-Varying model.
- Parameters:
- coef_
Estimated coefficients for the model.
- Type:
array-like, shape (n_features,)
- breslow
Breslow estimator for baseline hazards.
- Type:
BreslowEstimator
- survival_function
Estimated survival function.
- Type:
array-like, shape (n_samples, n_times)
- cumulative_hazard_function
Estimated cumulative hazard function.
- Type:
array-like, shape (n_samples, n_times)
- shap_explainer
SHAP explainer for model interpretability.
- Type:
shap.Explainer
Examples
from bsix.models.metodologies import CoxRegressionWithTimeVarying
model = CoxRegressionWithTimeVarying(alpha=0.1, ties="efron", n_iter=200)
model.fit(X_train, y_train)
- calculate_xai(X, index, scaler, dataset, seed, feature_names, background=False, plot=False)[source]
Calculate XAI values.
- Parameters:
X (array-like, shape (n_samples, n_features)) – Input data.
index (array-like, shape (n_samples,)) – Index for the samples.
scaler (object) – Scaler used for the data.
dataset (str) – Name of the dataset.
seed (int) – Random seed for reproducibility.
background (bool, default = False) – Whether to use background data for SHAP.
plot (bool, default = False) – Whether to plot the XAI values.
- Returns:
shap_explainer (shap.Explainer) – SHAP explainer for model interpretability.
coefficients (dict) – Dictionary of feature coefficients sorted by absolute value.
- fit(X, y)[source]
Fit the model to the data.
- Parameters:
X (array-like, shape (n_samples, n_features)) – Training data.
y (structured array-like, shape (n_samples,)) – Target training values (event, start times, stop times).
- Returns:
self – Fitted estimator.
- Return type:
- predict(X)[source]
Predict risk scores for the given data.
- Parameters:
X (array-like, shape (n_samples, n_features)) – Input data.
- Returns:
risk – Predicted risk scores.
- Return type:
array-like, shape (n_samples,)
- predict_cumulative_hazard_function(X, index, dataset, seed, plot=False)[source]
Predict the cumulative hazard function for the given data.
- Parameters:
- Returns:
cumulative_hazard_function – Predicted cumulative hazard function.
- Return type:
array-like, shape (n_samples, n_times)
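The time-varying model takes targets made of an event flag plus start and stop times. A common way to obtain such rows is to split each subject's follow-up at the times a covariate was re-measured. The sketch below illustrates that split; `to_start_stop` and the "dose" covariate are hypothetical, and the exact layout bsix expects is not specified here.

```python
def to_start_stop(visits, end_time, event):
    """Split one subject's follow-up into (start, stop, event, covariates) rows.

    visits: [(time, covariates)] sorted by time, first visit at study entry.
    """
    rows = []
    for i, (start, covariates) in enumerate(visits):
        last = i == len(visits) - 1
        stop = end_time if last else visits[i + 1][0]
        # Only the final interval can carry the event.
        rows.append((start, stop, event and last, covariates))
    return rows


rows = to_start_stop(
    [(0, {"dose": 10}), (4, {"dose": 20}), (9, {"dose": 15})],
    end_time=12,
    event=True,
)
for row in rows:
    print(row)
```

Each interval holds the covariate values that were current during it, so the risk set at any time sees up-to-date measurements.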
- class bsix.models.DeepMultiTask(num_inputs, valid_data=None, hidden_layers=None, epochs=500, learn_rate=0.0, lr_decay=0.0, l1_reg=0.0, l2_reg=0.0, cox_reg=0.0, momentum=0.9, activation='relu', dropout=0.0, standardize=True, ties='cox', device=None, validation_frequency=10, patience=500, improvement_threshold=0.99999, patience_increase=25, logger=None, verbose=True, seed=None, coef_likelihood=[1.0])[source]
Bases: BaseSurvival
Deep Multi-Task model.
- Parameters:
num_inputs (int) – Number of input features.
valid_data (dict, default = None) – Validation data in the form of a dictionary with keys “x”, “e”, and “t” for features, events, and times, respectively.
hidden_layers (list of int, default = None) – List specifying the number of units in each hidden layer.
epochs (int, default = 500) – Number of training epochs.
learn_rate (float, default = 0.0) – Learning rate for the optimizer.
lr_decay (float, default = 0.0) – Learning rate decay factor.
l1_reg (float, default = 0.0) – L1 regularization strength.
l2_reg (float, default = 0.0) – L2 regularization strength.
cox_reg (float, default = 0.0) – Coefficient for the Cox loss in the total loss function.
momentum (float, default = 0.9) – Momentum for the optimizer.
activation (str, default = "relu") – Activation function to use in the hidden layers: relu, selu, tanh or sigmoid.
dropout (float, default = 0.0) – Dropout rate for regularization.
standardize (bool, default = True) – Whether to standardize input features.
ties (str, default = "cox") – Method for handling tied event times: "cox" or "breslow".
device (torch.device, default = None) – Device to run the model on (e.g., “cpu” or “cuda”).
validation_frequency (int, default = 10) – Frequency (in epochs) to perform validation.
patience (int, default = 500) – Number of epochs to wait for improvement before early stopping.
improvement_threshold (float, default = 0.99999) – Threshold for considering an improvement in validation loss.
patience_increase (int, default = 25) – Factor by which to increase patience when an improvement is observed.
logger (DeepSurvLogger, default = None) – Logger for tracking training progress.
verbose (bool, default = True) – Whether to print training progress.
seed (int, default = None) – Random seed for reproducibility.
coef_likelihood (list of float, default = [1.0]) – Coefficients for the likelihood loss of each progression in the total loss function.
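The patience, improvement_threshold, patience_increase and validation_frequency parameters suggest a patience-based early-stopping loop. The sketch below shows one common form of that scheme; it is an assumption about the behaviour, not the bsix training loop, and `run_early_stopping` and `plateau_loss` are hypothetical names. The default values are copied from the DeepMultiTask signature.

```python
def run_early_stopping(validation_loss, epochs=5000, patience=500,
                       improvement_threshold=0.99999, patience_increase=25,
                       validation_frequency=10):
    """Return (stop_epoch, best_loss); validation_loss is a callable of the epoch."""
    best = float("inf")
    for epoch in range(epochs):
        if epoch % validation_frequency == 0:
            loss = validation_loss(epoch)
            if loss < best:
                # A large enough improvement buys more training time.
                if loss < best * improvement_threshold:
                    patience = max(patience, epoch * patience_increase)
                best = loss
        if patience <= epoch:
            return epoch, best  # ran out of patience
    return epochs - 1, best


def plateau_loss(epoch):
    """Toy loss that falls linearly and then flattens at 0.5."""
    return max(1.0 - 0.01 * epoch, 0.5)


print(run_early_stopping(plateau_loss, patience=20, patience_increase=2))
```

In this form the patience is an epoch budget that grows with the epoch at which the last meaningful improvement happened, so a model that keeps improving is allowed to train longer.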
- survival_function
Estimated survival function.
- Type:
array-like, shape (n_samples, n_times)
- cumulative_hazard_function
Estimated cumulative hazard function.
- Type:
array-like, shape (n_samples, n_times)
Examples
from bsix.models.metodologies import DeepMultiTask
model = DeepMultiTask(num_inputs=10, hidden_layers=[32,], epochs=200, learn_rate=0.01)
model.fit(X_train, y_train)
- calculate_xai(X, index, scaler, dataset, seed, feature_names, background=False, plot=False)[source]
Calculate XAI values.
- Parameters:
X (array-like, shape (n_samples, n_features)) – Input data.
index (array-like, shape (n_samples,)) – Index for the samples.
scaler (object) – Scaler used for the data.
dataset (str) – Name of the dataset.
seed (int) – Random seed for reproducibility.
background (bool, default = False) – Whether to use background data for SHAP.
plot (bool, default = False) – Whether to plot the XAI values.
- Returns:
shap_explainer – SHAP explainer for model interpretability.
- Return type:
list of shap.Explainer, shape (n_progressions,)
- fit(X_train, y_train, **kwargs)[source]
Fit the model to the data.
- Parameters:
X_train (array-like, shape (n_progressions, n_samples, n_features)) – Training data.
y_train (structured array-like, shape (n_progressions, n_samples,)) – Target training values (events, times).
- Returns:
self – Fitted estimator.
- Return type:
- predict(x)[source]
Predict risk scores for the given data.
- Parameters:
x (array-like, shape (n_progressions, n_samples, n_features)) – Input data.
- Returns:
risk – Predicted risk scores.
- Return type:
array-like, shape (n_progressions, n_samples,)
- predict_cumulative_hazard_function(X, index, dataset, seed, plot=False)[source]
Predict the cumulative hazard function for the given data.
- Parameters:
- Returns:
cumulative_hazard_function – Predicted cumulative hazard functions.
- Return type:
array-like, shape (n_progressions, n_samples, n_times)
- predict_survival_function(X, index, dataset, seed, plot=False)[source]
Predict the survival function for the given data.
- Parameters:
- Returns:
survival_function – Predicted survival functions.
- Return type:
array-like, shape (n_progressions, n_samples, n_times)
- set_fit_request(*, X_train: bool | None | str = '$UNCHANGED$', y_train: bool | None | str = '$UNCHANGED$') DeepMultiTask
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
- Returns:
self – The updated object.
- Return type:
- set_predict_request(*, x: bool | None | str = '$UNCHANGED$') DeepMultiTask
Configure whether metadata should be requested to be passed to the predict method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to predict.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- class bsix.models.DeepMultiTaskMultiLoss(num_inputs, valid_data=None, hidden_layers=None, epochs=500, learn_rate=0.0, lr_decay=0.0, l1_reg=0.0, l2_reg=0.0, cox_reg=0.0, bin_reg=0.0, momentum=0.9, activation='relu', dropout=0.0, standardize=True, ties='cox', device=None, validation_frequency=10, patience=500, improvement_threshold=0.99999, patience_increase=25, logger=None, verbose=True, seed=None, coef_likelihood=[1.0], coef_binary=[1.0])[source]
Bases: BaseSurvival
Deep Multi-Task Multi-Loss model.
- Parameters:
num_inputs (int) – Number of input features.
valid_data (dict, default = None) – Validation data in the form of a dictionary with keys “x”, “e”, and “t” for features, events, and times, respectively.
hidden_layers (list of int, default = None) – List specifying the number of units in each hidden layer.
epochs (int, default = 500) – Number of training epochs.
learn_rate (float, default = 0.0) – Learning rate for the optimizer.
lr_decay (float, default = 0.0) – Learning rate decay factor.
l1_reg (float, default = 0.0) – L1 regularization strength.
l2_reg (float, default = 0.0) – L2 regularization strength.
cox_reg (float, default = 0.0) – Coefficient for the Cox loss in the total loss function.
bin_reg (float, default = 0.0) – Coefficient for the binary loss in the total loss function.
momentum (float, default = 0.9) – Momentum for the optimizer.
activation (str, default = "relu") – Activation function to use in the hidden layers: relu, selu, tanh or sigmoid.
dropout (float, default = 0.0) – Dropout rate for regularization.
standardize (bool, default = True) – Whether to standardize input features.
ties (str, default = "cox") – Method for handling tied event times: "cox" or "breslow".
device (torch.device, default = None) – Device to run the model on (e.g., “cpu” or “cuda”).
validation_frequency (int, default = 10) – Frequency (in epochs) to perform validation.
patience (int, default = 500) – Number of epochs to wait for improvement before early stopping.
improvement_threshold (float, default = 0.99999) – Threshold for considering an improvement in validation loss.
patience_increase (int, default = 25) – Factor by which to increase patience when an improvement is observed.
logger (DeepSurvLogger, default = None) – Logger for tracking training progress.
verbose (bool, default = True) – Whether to print training progress.
seed (int, default = None) – Random seed for reproducibility.
coef_likelihood (list of float, default = [1.0]) – Coefficients for the likelihood loss of each progression in the total loss function.
coef_binary (list of float, default = [1.0]) – Coefficients for the binary loss of each progression in the total loss function.
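The cox_reg, bin_reg, coef_likelihood and coef_binary parameters are all described as coefficients in a total loss. One plausible reading is a weighted sum over progressions, sketched below. How bsix actually combines the terms is not stated here, so the formula and the `total_loss` helper are assumptions made for illustration only.

```python
def total_loss(likelihood_losses, binary_losses, coef_likelihood, coef_binary,
               cox_reg=1.0, bin_reg=1.0):
    """Weighted sum of per-progression losses (assumed form, not the bsix code)."""
    cox_term = sum(c * l for c, l in zip(coef_likelihood, likelihood_losses))
    bin_term = sum(c * b for c, b in zip(coef_binary, binary_losses))
    return cox_reg * cox_term + bin_reg * bin_term


# Two progressions; the second likelihood loss is down-weighted by half.
loss = total_loss(
    likelihood_losses=[2.0, 4.0],
    binary_losses=[1.0, 3.0],
    coef_likelihood=[1.0, 0.5],
    coef_binary=[1.0, 1.0],
    cox_reg=1.0,
    bin_reg=0.5,
)
print(loss)  # 6.0
```

Under this reading, the per-progression lists balance tasks against each other, while cox_reg and bin_reg balance the two loss families.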
- survival_function
Estimated survival function.
- Type:
array-like, shape (n_samples, n_times)
- cumulative_hazard_function
Estimated cumulative hazard function.
- Type:
array-like, shape (n_samples, n_times)
Examples
from bsix.models.metodologies import DeepMultiTaskMultiLoss
model = DeepMultiTaskMultiLoss(num_inputs=10, hidden_layers=[32,], epochs=200, learn_rate=0.01)
model.fit(X_train, y_train)
- calculate_xai(X, index, scaler, dataset, seed, feature_names, background=False, plot=False)[source]
Calculate XAI values.
- Parameters:
X (array-like, shape (n_samples, n_features)) – Input data.
index (array-like, shape (n_samples,)) – Index for the samples.
scaler (object) – Scaler used for the data.
dataset (str) – Name of the dataset.
seed (int) – Random seed for reproducibility.
background (bool, default = False) – Whether to use background data for SHAP.
plot (bool, default = False) – Whether to plot the XAI values.
- Returns:
shap_explainer – SHAP explainer for model interpretability.
- Return type:
list of shap.Explainer, shape (n_progressions,)
- fit(X_train, y_train, **kwargs)[source]
Fit the model to the data.
- Parameters:
X_train (array-like, shape (n_progressions, n_samples, n_features)) – Training data.
y_train (structured array-like, shape (n_progressions, n_samples,)) – Target training values (events, times).
- Returns:
self – Fitted estimator.
- Return type:
- predict(x)[source]
Predict risk scores for the given data.
- Parameters:
x (array-like, shape (n_progressions, n_samples, n_features)) – Input data.
- Returns:
risk – Predicted risk scores.
- Return type:
array-like, shape (n_progressions, n_samples,)
- predict_cumulative_hazard_function(X, index, dataset, seed, plot=False)[source]
Predict the cumulative hazard function for the given data.
- Parameters:
- Returns:
cumulative_hazard_function – Predicted cumulative hazard functions.
- Return type:
array-like, shape (n_progressions, n_samples, n_times)
- predict_survival_function(X, index, dataset, seed, plot=False)[source]
Predict the survival function for the given data.
- Parameters:
- Returns:
survival_function – Predicted survival functions.
- Return type:
array-like, shape (n_progressions, n_samples, n_times)
- set_fit_request(*, X_train: bool | None | str = '$UNCHANGED$', y_train: bool | None | str = '$UNCHANGED$') DeepMultiTaskMultiLoss
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
- Returns:
self – The updated object.
- Return type:
- set_predict_request(*, x: bool | None | str = '$UNCHANGED$') DeepMultiTaskMultiLoss
Configure whether metadata should be requested to be passed to the predict method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to predict.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- set_score_request(*, x: bool | None | str = '$UNCHANGED$') DeepMultiTaskMultiLoss
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- class bsix.models.DeepSurv(num_inputs, valid_data=None, hidden_layers=None, epochs=500, learn_rate=0.0, lr_decay=0.0, l1_reg=0.0, l2_reg=0.0, momentum=0.9, activation='relu', dropout=0.0, standardize=True, ties='cox', device=None, validation_frequency=10, patience=2000, improvement_threshold=0.99999, patience_increase=2, logger=None, verbose=True, seed=None)[source]
Bases: BaseSurvival
Deep Survival model.
- Parameters:
num_inputs (int) – Number of input features.
valid_data (dict, default = None) – Validation data in the form of a dictionary with keys “x”, “e”, and “t” for features, events, and times, respectively.
hidden_layers (list of int, default = None) – List specifying the number of units in each hidden layer.
epochs (int, default = 500) – Number of training epochs.
learn_rate (float, default = 0.0) – Learning rate for the optimizer.
lr_decay (float, default = 0.0) – Learning rate decay factor.
l1_reg (float, default = 0.0) – L1 regularization strength.
l2_reg (float, default = 0.0) – L2 regularization strength.
momentum (float, default = 0.9) – Momentum for the optimizer.
activation (str, default = "relu") – Activation function to use in the hidden layers: relu, selu, tanh or sigmoid.
dropout (float, default = 0.0) – Dropout rate for regularization.
standardize (bool, default = True) – Whether to standardize input features.
ties (str, default = "cox") – Method for handling tied event times: "cox" or "breslow".
device (torch.device, default = None) – Device to run the model on (e.g., “cpu” or “cuda”).
validation_frequency (int, default = 10) – Frequency (in epochs) to perform validation.
patience (int, default = 2000) – Number of epochs to wait for improvement before early stopping.
improvement_threshold (float, default = 0.99999) – Threshold for considering an improvement in validation loss.
patience_increase (int, default = 2) – Factor by which to increase patience when an improvement is observed.
logger (DeepSurvLogger, default = None) – Logger for tracking training progress.
verbose (bool, default = True) – Whether to print training progress.
seed (int, default = None) – Random seed for reproducibility.
- breslow
Breslow estimator for baseline hazards.
- Type:
BreslowEstimator
- survival_function
Estimated survival function.
- Type:
array-like, shape (n_samples, n_times)
- cumulative_hazard_function
Estimated cumulative hazard function.
- Type:
array-like, shape (n_samples, n_times)
- shap_explainer
SHAP explainer for model interpretability.
- Type:
shap.Explainer
Examples
from bsix.models.metodologies import DeepSurv
model = DeepSurv(num_inputs=10, hidden_layers=[32,], epochs=200, learn_rate=0.01)
model.fit(X_train, y_train)
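DeepSurv-style networks are usually trained by minimising the negative log Cox partial likelihood of the network's log-risk outputs. The sketch below computes that quantity in plain Python with a Breslow-style risk set. It is an illustration of the objective under that assumption, not the loss code used by this class, and `neg_log_partial_likelihood` is a hypothetical name.

```python
import math


def neg_log_partial_likelihood(log_risks, times, events):
    """Average negative log partial likelihood; assumes at least one event."""
    total, n_events = 0.0, 0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if not e_i:
            continue  # censored rows only contribute through risk sets
        # Risk set: everyone still under observation at time t_i.
        risk_set = sum(math.exp(h) for h, t in zip(log_risks, times) if t >= t_i)
        total += log_risks[i] - math.log(risk_set)
        n_events += 1
    return -total / n_events


times = [1.0, 2.0, 3.0]
events = [True, True, True]
concordant = neg_log_partial_likelihood([2.0, 1.0, 0.0], times, events)
discordant = neg_log_partial_likelihood([0.0, 1.0, 2.0], times, events)
print(concordant, discordant)
```

The loss is lower when subjects with higher predicted risk fail earlier, which is the ordering the network is pushed towards during training.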
- calculate_xai(X, index, scaler, dataset, seed, feature_names, background=False, plot=False)[source]
Calculate XAI values.
- Parameters:
X (array-like, shape (n_samples, n_features)) – Input data.
index (array-like, shape (n_samples,)) – Index for the samples.
scaler (object) – Scaler used for the data.
dataset (str) – Name of the dataset.
seed (int) – Random seed for reproducibility.
background (bool, default = False) – Whether to use background data for SHAP.
plot (bool, default = False) – Whether to plot the XAI values.
- Returns:
shap_explainer – SHAP explainer for model interpretability.
- Return type:
shap.Explainer
- fit(X_train, y_train, **kwargs)[source]
Fit the model to the data.
- Parameters:
X_train (array-like, shape (n_samples, n_features)) – Training data.
y_train (structured array-like, shape (n_samples,)) – Target training values (events, times).
- Returns:
self – Fitted estimator.
- Return type:
- predict(x)[source]
Predict risk scores for the given data.
- Parameters:
x (array-like, shape (n_samples, n_features)) – Input data.
- Returns:
risk – Predicted risk scores.
- Return type:
array-like, shape (n_samples,)
- predict_cumulative_hazard_function(X, index, dataset, seed, plot=False)[source]
Predict the cumulative hazard function for the given data.
- Parameters:
- Returns:
cumulative_hazard_function – Predicted cumulative hazard function.
- Return type:
array-like, shape (n_samples, n_times)
- predict_survival_function(X, index, dataset, seed, plot=False)[source]
Predict the survival function for the given data.
- Parameters:
- Returns:
survival_function – Predicted survival function.
- Return type:
array-like, shape (n_samples, n_times)
- set_fit_request(*, X_train: bool | None | str = '$UNCHANGED$', y_train: bool | None | str = '$UNCHANGED$') DeepSurv
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
- Returns:
self – The updated object.
- Return type:
- set_predict_request(*, x: bool | None | str = '$UNCHANGED$') DeepSurv
Configure whether metadata should be requested to be passed to the predict method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to predict.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- class bsix.models.DeepTimeVarying(num_inputs, valid_data=None, hidden_layers=None, epochs=500, learn_rate=0.0, lr_decay=0.0, l1_reg=0.0, l2_reg=0.0, momentum=0.9, activation='relu', dropout=0.0, standardize=True, ties='cox', device=None, validation_frequency=10, patience=2000, improvement_threshold=0.99999, patience_increase=2, logger=None, verbose=True, seed=None)[source]
Bases: BaseSurvival
Deep Time-Varying model.
- Parameters:
num_inputs (int) – Number of input features.
valid_data (dict, default = None) – Validation data in the form of a dictionary with keys “x”, “e”, and “t” for features, events, and times, respectively.
hidden_layers (list of int, default = None) – List specifying the number of units in each hidden layer.
epochs (int, default = 500) – Number of training epochs.
learn_rate (float, default = 0.0) – Learning rate for the optimizer.
lr_decay (float, default = 0.0) – Learning rate decay factor.
l1_reg (float, default = 0.0) – L1 regularization strength.
l2_reg (float, default = 0.0) – L2 regularization strength.
momentum (float, default = 0.9) – Momentum for the optimizer.
activation (str, default = "relu") – Activation function to use in the hidden layers: relu, selu, tanh or sigmoid.
dropout (float, default = 0.0) – Dropout rate for regularization.
standardize (bool, default = True) – Whether to standardize input features.
ties (str, default = "cox") – Method for handling tied event times: "cox" or "breslow".
device (torch.device, default = None) – Device to run the model on (e.g., “cpu” or “cuda”).
validation_frequency (int, default = 10) – Frequency (in epochs) to perform validation.
patience (int, default = 2000) – Number of epochs to wait for improvement before early stopping.
improvement_threshold (float, default = 0.99999) – Threshold for considering an improvement in validation loss.
patience_increase (int, default = 2) – Factor by which to increase patience when an improvement is observed.
logger (DeepSurvLogger, default = None) – Logger for tracking training progress.
verbose (bool, default = True) – Whether to print training progress.
seed (int, default = None) – Random seed for reproducibility.
- breslow
Breslow estimator for baseline hazards.
- Type:
BreslowEstimator
- survival_function
Estimated survival function.
- Type:
array-like, shape (n_samples, n_times)
- cumulative_hazard_function
Estimated cumulative hazard function.
- Type:
array-like, shape (n_samples, n_times)
- shap_explainer
SHAP explainer for model interpretability.
- Type:
shap.Explainer
Examples
from bsix.models.metodologies import DeepTimeVarying
model = DeepTimeVarying(num_inputs=10, hidden_layers=[32,], epochs=200, learn_rate=0.01)
model.fit(X_train, y_train)
- calculate_xai(X, index, scaler, dataset, seed, feature_names, background=False, plot=False)[source]
Calculate XAI values.
- Parameters:
X (array-like, shape (n_samples, n_features)) – Input data.
index (array-like, shape (n_samples,)) – Index for the samples.
scaler (object) – Scaler used for the data.
dataset (str) – Name of the dataset.
seed (int) – Random seed for reproducibility.
background (bool, default = False) – Whether to use background data for SHAP.
plot (bool, default = False) – Whether to plot the XAI values.
- Returns:
shap_explainer – SHAP explainer for model interpretability.
- Return type:
shap.Explainer
- fit(X_train, y_train, **kwargs)[source]
Fit the model to the data.
- Parameters:
X_train (array-like, shape (n_samples, n_features)) – Training data.
y_train (structured array-like, shape (n_samples,)) – Target training values (events, start times, stop times).
- Returns:
self – Fitted estimator.
- Return type:
- predict(x)[source]
Predict risk scores for the given data.
- Parameters:
x (array-like, shape (n_samples, n_features)) – Input data.
- Returns:
risk – Predicted risk scores.
- Return type:
array-like, shape (n_samples,)
- predict_cumulative_hazard_function(X, index, dataset, seed, plot=False)[source]
Predict the cumulative hazard function for the given data.
- Parameters:
- Returns:
cumulative_hazard_function – Predicted cumulative hazard function.
- Return type:
array-like, shape (n_samples, n_times)
- predict_survival_function(X, index, dataset, seed, plot=False)[source]
Predict the survival function for the given data.
- Parameters:
- Returns:
survival_function – Predicted survival function.
- Return type:
array-like, shape (n_samples, n_times)
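The breslow attribute and the two prediction methods above are linked by standard identities: the Breslow estimator gives a cumulative baseline hazard H0(t), an individual's cumulative hazard is H0(t) * exp(risk), and the survival function is exp(-H(t)). The sketch below illustrates these relations in plain Python for simple right-censored data. Both function names are hypothetical, and the code does not reproduce the bsix BreslowEstimator or its handling of start-stop intervals.

```python
import math


def breslow_cumulative_baseline_hazard(times, events, risk_scores):
    """Return (event_times, H0) using the Breslow estimator.

    Assumes right-censored data with one row per subject.
    """
    hazards = [math.exp(r) for r in risk_scores]
    event_times = sorted({t for t, e in zip(times, events) if e})
    cumulative, total = [], 0.0
    for t in event_times:
        # Number of events observed exactly at time t.
        deaths = sum(1 for ti, e in zip(times, events) if e and ti == t)
        # Sum of exp(risk) over subjects still under observation at t.
        at_risk = sum(h for ti, h in zip(times, hazards) if ti >= t)
        total += deaths / at_risk
        cumulative.append(total)
    return event_times, cumulative


def survival_curve(baseline_cumulative_hazard, risk_score):
    """S(t | x) = exp(-H0(t) * exp(risk_score))."""
    scale = math.exp(risk_score)
    return [math.exp(-h0 * scale) for h0 in baseline_cumulative_hazard]


times = [2.0, 3.0, 5.0]
events = [True, False, True]
risks = [0.0, 0.0, 0.0]  # equal risk scores for every subject

event_times, h0 = breslow_cumulative_baseline_hazard(times, events, risks)
surv = survival_curve(h0, risk_score=0.0)
print(event_times, h0, surv)
```

A higher risk score scales the cumulative hazard up and therefore pushes the whole survival curve down, which is why the output of predict can be used to rank subjects.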
- set_fit_request(*, X_train: bool | None | str = '$UNCHANGED$', y_train: bool | None | str = '$UNCHANGED$') DeepTimeVarying
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
- Returns:
self – The updated object.
- Return type:
DeepTimeVarying
- set_predict_request(*, x: bool | None | str = '$UNCHANGED$') DeepTimeVarying
Configure whether metadata should be requested to be passed to the predict method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- class bsix.models.RandomSurvForest(seed, n_jobs=-1, n_estimators=100, max_depth=None, min_samples_leaf=3, min_samples_split=6)[source]
Bases: BaseSurvival
Random Survival Forest model.
- Parameters:
seed (int) – Random seed for reproducibility.
n_jobs (int, default =-1) – Number of jobs to run in parallel.
n_estimators (int, default =100) – The number of trees in the forest.
max_depth (int, default =None) – The maximum depth of the tree.
min_samples_leaf (int, default =3) – The minimum number of samples required to be at a leaf node.
min_samples_split (int, default =6) – The minimum number of samples required to split an internal node.
- survival_function
Estimated survival function.
- Type:
array-like, shape (n_samples, n_times)
- cumulative_hazard_function
Estimated cumulative hazard function.
- Type:
array-like, shape (n_samples, n_times)
- shap_explainer
SHAP explainer for model interpretability.
- Type:
shap.Explainer
Examples
from bsix.models.metodologies import RandomSurvForest
model = RandomSurvForest(seed=42, n_estimators=100, max_depth=5)
model.fit(X_train, y_train)
- calculate_xai(X, index, scaler, dataset, seed, feature_names, background=False, plot=False)[source]
Calculate XAI values.
- Parameters:
X (array-like, shape (n_samples, n_features)) – Input data.
index (array-like, shape (n_samples,)) – Index for the samples.
scaler (object) – Scaler used for the data.
dataset (str) – Name of the dataset.
seed (int) – Random seed for reproducibility.
background (bool, default =False) – Whether to use background data for SHAP.
plot (bool, default =False) – Whether to plot the XAI values.
- Returns:
shap_explainer – SHAP explainer for model interpretability.
- Return type:
shap.Explainer
- fit(X, y)[source]
Fit the model to the data.
- Parameters:
X (array-like, shape (n_samples, n_features)) – Training data.
y (structured array-like, shape (n_samples,)) – Target training values (events, times).
- Returns:
self – Fitted estimator.
- Return type:
RandomSurvForest
- predict(X)[source]
Predict risk scores for the given data.
- Parameters:
X (array-like, shape (n_samples, n_features)) – Input data.
- Returns:
risk – Predicted risk scores.
- Return type:
array-like, shape (n_samples,)
- predict_cumulative_hazard_function(X, index, dataset, seed, plot=False)[source]
Predict the cumulative hazard function for the given data.
- Parameters:
- Returns:
cumulative_hazard_function – Predicted cumulative hazard function.
- Return type:
array-like, shape (n_samples, n_times)
- class bsix.models.SurvTree(max_depth=None, min_samples_split=6, min_samples_leaf=3, seed=0)[source]
Bases: BaseSurvival
Survival Tree model.
- calculate_xai(X, index, scaler, dataset, seed, feature_names, background=False, plot=False)[source]
Calculate XAI values.
- Parameters:
X (array-like, shape (n_samples, n_features)) – Input data.
index (array-like, shape (n_samples,)) – Index for the samples.
scaler (object) – Scaler used for the data.
dataset (str) – Name of the dataset.
seed (int) – Random seed for reproducibility.
background (bool, default =False) – Whether to use background data for SHAP.
plot (bool, default =False) – Whether to plot the XAI values.
- Returns:
shap_explainer – SHAP explainer for model interpretability.
- Return type:
shap.Explainer
- fit(X, y)[source]
Fit the model to the data.
- Parameters:
X (array-like, shape (n_samples, n_features)) – Training data.
y (structured array-like, shape (n_samples,)) – Target training values (events, times).
- Returns:
self – Fitted estimator.
- Return type:
SurvTree
- predict(X)[source]
Predict risk scores for the given data.
- Parameters:
X (array-like, shape (n_samples, n_features)) – Input data.
- Returns:
risk – Predicted risk scores.
- Return type:
array-like, shape (n_samples,)
- predict_cumulative_hazard_function(X, index, dataset, seed, plot=False)[source]
Predict the cumulative hazard function for the given data.
- Parameters:
- Returns:
cumulative_hazard_function – Predicted cumulative hazard function.
- Return type:
array-like, shape (n_samples, n_times)