sparsesurv.utils module
Summary
Functions:
Basic CV scoring function based on the scoring function used in [1].
Mean-squared error based CV scoring function.
Subtract predictor values from each other as well as calculate (integrated) Gaussian kernel.
Obtain result of the integration of the Gaussian kernel.
Obtain result of Gaussian kernel.
Subtract predictor values from each other and calculate integrated Gaussian kernel.
Transform input variable into separate time and event arrays.
Obtain survival times, censoring information and eta (e.g. y train) from structured array.
Subtract predictor values from each other and calculate Gaussian kernel.
CV score computation using linear predictors [1].
Apply log-sum-exp trick when calculating the log addition for numerical stability.
Apply log-sum-exp trick when calculating the log difference for numerical stability.
Apply log-sum-exp trick.
Transform time and event variables into one variable.
Transform survival times, censoring information and eta (e.g. y train) into one array.
Verweij and Van Houwelingen CV scoring function [1, 2].
Reference
- inverse_transform_survival(y)
Transform input variable into separate time and event arrays.
- Parameters:
y (np.array) – Structured array containing time and censoring events.
- Returns:
Survival time and event array.
- Return type:
tuple[npt.NDArray[np.float64], npt.NDArray[np.float64]]
- transform_survival(time, event)
Transform time and event variables into one variable.
- Parameters:
time (npt.NDArray[np.float64]) – Survival times.
event (npt.NDArray[np.float64]) – Censoring information.
- Returns:
Structured array containing survival times and right-censored survival information.
- Return type:
np.array
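The round trip between separate arrays and a structured array can be sketched as follows. This is a minimal illustration, not the library's implementation: the helper names `to_structured` and `from_structured`, the field names `"event"` and `"time"`, and the field dtypes are all assumptions made for the example.

```python
import numpy as np


def to_structured(time, event):
    """Pack survival times and event indicators into one structured array."""
    # Field names and dtypes are assumptions made for this sketch.
    y = np.empty(time.shape[0], dtype=[("event", np.bool_), ("time", np.float64)])
    y["event"] = event.astype(np.bool_)
    y["time"] = time
    return y


def from_structured(y):
    """Unpack the structured array into separate time and event arrays."""
    return y["time"].astype(np.float64), y["event"].astype(np.float64)


time = np.array([5.0, 2.0, 9.0])
event = np.array([1.0, 0.0, 1.0])  # 1 = event observed, 0 = right-censored

y = to_structured(time, event)
time_back, event_back = from_structured(y)
```

Packing both quantities into one array lets them travel through interfaces that accept a single target argument.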
- inverse_transform_survival_kd(y)
Obtain survival times, censoring information and eta (e.g. y train) from structured array.
- Parameters:
y (npt.NDArray[np.float64]) – Structured array containing survival times, censoring information and eta.
- Returns:
Survival times, censoring information, eta.
- Return type:
tuple[npt.NDArray[np.float64], npt.NDArray[np.int64], npt.NDArray[np.float64]]
- transform_survival_kd(time, event, eta_hat)
Transform survival times, censoring information and eta (e.g. y train) into one array.
- Parameters:
time (npt.NDArray[np.float64]) – Survival times.
event (npt.NDArray[np.float64]) – Censoring information.
eta_hat (npt.NDArray[np.float64]) – Estimated dependent variable.
- Raises:
NotImplementedError – If the dimension check on the inputs fails.
- Returns:
Structured array containing survival times, censoring information and eta.
- Return type:
npt.NDArray
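A three-field variant of the same idea might look like the sketch below. The helper names `pack_kd` and `unpack_kd`, the field names, and the record layout are hypothetical; the actual array layout used by the library is not stated here. The dtypes follow the documented return types (float64, int64, float64).

```python
import numpy as np

# Hypothetical record layout: one record per sample with three named fields.
KD_DTYPE = [("time", np.float64), ("event", np.int64), ("eta_hat", np.float64)]


def pack_kd(time, event, eta_hat):
    """Combine times, event indicators and eta into one structured array."""
    y = np.empty(time.shape[0], dtype=KD_DTYPE)
    y["time"] = time
    y["event"] = event.astype(np.int64)
    y["eta_hat"] = eta_hat
    return y


def unpack_kd(y):
    """Recover the three separate arrays from the structured array."""
    return y["time"].copy(), y["event"].copy(), y["eta_hat"].copy()


time = np.array([1.0, 4.0, 6.0])
event = np.array([1.0, 0.0, 1.0])
eta_hat = np.array([0.2, -0.3, 1.1])

y = pack_kd(time, event, eta_hat)
time_back, event_back, eta_back = unpack_kd(y)
```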
- logsubstractexp(a, b)
Apply log-sum-exp trick when calculating the log difference for numerical stability.
- logaddexp(a, b)
Apply log-sum-exp trick when calculating the log addition for numerical stability.
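The trick behind both functions is to factor out the larger exponent before calling `exp`, so that the remaining exponent is never positive. The scalar sketch below illustrates the idea with the standard library only; `log_add` and `log_sub` are illustrative names, and the requirement `a > b` for the difference is an assumption of this sketch.

```python
import math


def log_add(a, b):
    """Return log(exp(a) + exp(b)) without overflowing."""
    hi, lo = max(a, b), min(a, b)
    # Factor out the larger term: log(e^hi * (1 + e^(lo - hi))).
    return hi + math.log1p(math.exp(lo - hi))


def log_sub(a, b):
    """Return log(exp(a) - exp(b)) for a > b without overflowing."""
    if a <= b:
        raise ValueError("log_sub requires a > b")
    # Factor out exp(a): log(e^a * (1 - e^(b - a))).
    return a + math.log1p(-math.exp(b - a))


big = 1000.0  # math.exp(1000.0) on its own would raise OverflowError
sum_log = log_add(big, big)          # equals 1000 + log(2)
diff_log = log_sub(big, big - 1.0)   # equals 1000 + log(1 - e^-1)
```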
- numba_logsumexp_stable(a)
Apply log-sum-exp trick.
- Parameters:
a (npt.NDArray[np.float64]) – Input array to which the sum and then the log will be applied.
- Returns:
Result of log-sum-exp trick.
- Return type:
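For a whole array, the same trick subtracts the maximum before exponentiating. The following is a plain NumPy sketch of that technique under the name `logsumexp_stable`, which is illustrative; it is not the Numba-compiled function documented above.

```python
import numpy as np


def logsumexp_stable(a):
    """Return log(sum(exp(a))) computed around the maximum of `a`."""
    a_max = np.max(a)
    # After shifting, every exponent is <= 0, so exp cannot overflow.
    return a_max + np.log(np.sum(np.exp(a - a_max)))


a = np.array([1000.0, 1000.0, 1000.0])
result = logsumexp_stable(a)  # equals 1000 + log(3)
```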
- kernel(a, b, bandwidth)
Subtract predictor values from each other and calculate Gaussian kernel.
- Parameters:
a (npt.NDArray[np.float64]) – First predictor value (hazard prediction).
b (npt.NDArray[np.float64]) – Second predictor value (hazard prediction).
bandwidth (float) – Fixed kernel bandwidth.
- Returns:
Kernel matrix.
- Return type:
npt.NDArray[np.float64]
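A pairwise Gaussian kernel matrix can be sketched with NumPy broadcasting as shown below. The function name `gaussian_kernel_matrix`, the scaling of the difference by the bandwidth, and the use of the standard normal density without a further `1 / bandwidth` factor are assumptions of this example; the library may normalise differently.

```python
import numpy as np


def gaussian_kernel_matrix(a, b, bandwidth):
    """Evaluate the standard normal density on all scaled differences."""
    # Entry (i, j) of u is (a[i] - b[j]) / bandwidth.
    u = (a[:, None] - b[None, :]) / bandwidth
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)


a = np.array([0.0, 1.0])
b = np.array([0.0, 0.5, 2.0])
K = gaussian_kernel_matrix(a, b, bandwidth=0.5)  # shape (2, 3)
```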
- integrated_kernel(a, b, bandwidth)
Subtract predictor values from each other and calculate integrated Gaussian kernel.
- Parameters:
a (npt.NDArray[np.float64]) – First predictor value (hazard prediction).
b (npt.NDArray[np.float64]) – Second predictor value (hazard prediction).
bandwidth (float) – Fixed kernel bandwidth.
- Returns:
Integrated kernel matrix.
- Return type:
npt.NDArray[np.float64]
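Integrating the standard normal density from minus infinity up to a scaled difference gives the standard normal CDF, which can be written with the error function. The sketch below assumes that reading of "integrated Gaussian kernel"; the function name and the scaling by the bandwidth are illustrative choices.

```python
import math

import numpy as np


def integrated_gaussian_kernel_matrix(a, b, bandwidth):
    """Evaluate the standard normal CDF on all scaled differences."""
    u = (a[:, None] - b[None, :]) / bandwidth
    erf = np.vectorize(math.erf)  # element-wise error function
    return 0.5 * (1.0 + erf(u / math.sqrt(2.0)))


a = np.array([0.0, 1.0])
b = np.array([0.0, 0.5, 2.0])
IK = integrated_gaussian_kernel_matrix(a, b, bandwidth=0.5)
```

Every entry lies between 0 and 1, and a zero difference maps to exactly 0.5.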
- difference_kernels(a, b, bandwidth)
Subtract predictor values from each other as well as calculate (integrated) Gaussian kernel.
- Parameters:
a (npt.NDArray[np.float64]) – First predictor value (hazard prediction).
b (npt.NDArray[np.float64]) – Second predictor value (hazard prediction).
bandwidth (float) – Fixed kernel bandwidth.
- Returns:
Predictor difference, kernel matrix, integrated kernel matrix.
- Return type:
Tuple[npt.NDArray[np.float64], npt.NDArray[np.float64], npt.NDArray[np.float64]]
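Computing the difference once and deriving both kernels from it avoids repeating the pairwise subtraction. The sketch below returns all three quantities and then checks numerically that the kernel is the derivative of the integrated kernel. The function name, the returned difference being the bandwidth-scaled one, and the normalisation are assumptions of this example.

```python
import math

import numpy as np


def difference_and_kernels(a, b, bandwidth):
    """Return scaled differences, Gaussian kernel and integrated kernel."""
    diff = (a[:, None] - b[None, :]) / bandwidth
    kernel = np.exp(-0.5 * diff**2) / math.sqrt(2.0 * math.pi)
    integrated = 0.5 * (1.0 + np.vectorize(math.erf)(diff / math.sqrt(2.0)))
    return diff, kernel, integrated


a = np.array([0.3, 1.2])
b = np.array([0.0, 0.7])
diff, kernel, integrated = difference_and_kernels(a, b, bandwidth=1.0)

# Central finite difference of the integrated kernel with respect to `a`.
step = 1e-5
_, _, upper = difference_and_kernels(a + step, b, bandwidth=1.0)
_, _, lower = difference_and_kernels(a - step, b, bandwidth=1.0)
numerical_kernel = (upper - lower) / (2.0 * step)
```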
- basic_cv_fold(test_linear_predictor, test_time, test_event, score_function, test_eta_hat=None, train_linear_predictor=None, train_time=None, train_event=None)
Basic CV scoring function based on the scoring function used in [1].
- Parameters:
test_linear_predictor (np.array) – Linear predictors of a given test fold. X@beta.
test_time (np.array) – Sorted time points of the test fold.
test_event (np.array) – Event indicator of the test fold.
test_eta_hat (np.array) – Predicted linear predictors of a given test fold.
train_linear_predictor (np.array) – Linear predictors of the training fold.
train_time (np.array) – Sorted time points of the training fold.
train_event (np.array) – Event indicator of the training fold.
score_function (Callable) – Scoring function used to compute the negative log-likelihood.
- Returns:
Scalar value of the mean partial log-likelihood for a given test fold.
- Return type:
Notes
All unused parameters kept for overall score function signature compatibility.
References
[1] Dai, Biyue, and Patrick Breheny. “Cross validation approaches for penalized Cox regression.” arXiv preprint arXiv:1905.10432 (2019).
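In the basic approach only the held-out fold enters the score. The sketch below shows this with a simple Cox partial likelihood (Breslow form, no tied times). The function names, the `(linear_predictor, time, event)` signature assumed for `score_function`, and the division by the number of samples are assumptions of this example, not the library's definitions.

```python
import numpy as np


def neg_partial_log_likelihood(linear_predictor, time, event):
    """Negative Cox partial log-likelihood divided by the sample count.

    Assumes `time` is sorted in ascending order and has no ties.
    """
    # The risk set of sample i is every j >= i; a reversed cumulative
    # log-sum-exp yields log(sum_{j >= i} exp(linear_predictor[j])).
    log_risk = np.logaddexp.accumulate(linear_predictor[::-1])[::-1]
    log_lik = np.sum(event * (linear_predictor - log_risk))
    return -log_lik / time.shape[0]


def basic_fold_score(test_linear_predictor, test_time, test_event, score_function):
    """Score a fold using only the quantities of the test fold."""
    return score_function(test_linear_predictor, test_time, test_event)


test_time = np.array([1.0, 2.0, 3.0])
test_event = np.array([1.0, 1.0, 1.0])
test_linear_predictor = np.zeros(3)

score = basic_fold_score(
    test_linear_predictor, test_time, test_event, neg_partial_log_likelihood
)  # equals log(6) / 3 for three events with equal risk
```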
- basic_mse(test_linear_predictor, test_eta_hat, test_time=None, test_event=None, train_linear_predictor=None, train_time=None, train_event=None, score_function=None)
Mean-squared error based CV scoring function.
- Parameters:
test_linear_predictor (np.array) – Linear predictors of a given test fold. X@beta.
test_time (np.array) – Sorted time points of the test fold.
test_event (np.array) – Event indicator of the test fold.
test_eta_hat (np.array) – Predicted linear predictors of a given test fold.
train_linear_predictor (np.array) – Linear predictors of the training fold.
train_time (np.array) – Sorted time points of the training fold.
train_event (np.array) – Event indicator of the training fold.
score_function (Callable) – Scoring function used to compute the negative log-likelihood.
- Returns:
Scalar value of the mean squared error for a given test fold.
- Return type:
Notes
All unused parameters kept for overall score function signature compatibility.
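A mean-squared error score compares the two sets of linear predictors element by element. The sketch below assumes the score is the plain mean of squared differences between `test_linear_predictor` and `test_eta_hat`; the function name `mse_fold_score` is illustrative.

```python
import numpy as np


def mse_fold_score(test_linear_predictor, test_eta_hat):
    """Mean squared difference between the two linear predictors."""
    return float(np.mean((test_linear_predictor - test_eta_hat) ** 2))


test_linear_predictor = np.array([0.5, -1.0, 2.0])
test_eta_hat = np.array([0.0, -1.0, 1.0])

score = mse_fold_score(test_linear_predictor, test_eta_hat)  # (0.25 + 0 + 1) / 3
```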
- vvh_cv_fold(test_linear_predictor, test_time, test_event, train_linear_predictor, train_time, train_event, score_function, test_eta_hat=None)
Verweij and Van Houwelingen CV scoring function [1, 2].
- Parameters:
test_linear_predictor (np.array) – Linear predictors of a given test fold. X@beta.
test_time (np.array) – Sorted time points of the test fold.
test_event (np.array) – Event indicator of the test fold.
test_eta_hat (np.array) – Predicted linear predictors of a given test fold.
train_linear_predictor (np.array) – Linear predictors of the training fold.
train_time (np.array) – Sorted time points of the training fold.
train_event (np.array) – Event indicator of the training fold.
score_function (Callable) – Scoring function used to compute the negative log-likelihood.
- Returns:
Scalar value of the mean partial log-likelihood for a given test fold.
- Return type:
Notes
All unused parameters kept for overall score function signature compatibility.
References
[1] Verweij, Pierre JM, and Hans C. Van Houwelingen. “Cross‐validation in survival analysis.” Statistics in Medicine 12.24 (1993): 2305-2314.
[2] Dai, Biyue, and Patrick Breheny. “Cross validation approaches for penalized Cox regression.” arXiv preprint arXiv:1905.10432 (2019).
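The Verweij and Van Houwelingen approach described in [1, 2] scores a fold as the partial log-likelihood on the full data minus the partial log-likelihood on the training data, both evaluated with coefficients fitted without the held-out fold. The sketch below follows that description; the function names, the unnormalised log-likelihood, and the way train and test folds are pooled and re-sorted are assumptions of this example.

```python
import numpy as np


def partial_log_likelihood(linear_predictor, time, event):
    """Cox partial log-likelihood; `time` must be sorted ascending, no ties."""
    log_risk = np.logaddexp.accumulate(linear_predictor[::-1])[::-1]
    return float(np.sum(event * (linear_predictor - log_risk)))


def vvh_fold_score(test_lp, test_time, test_event,
                   train_lp, train_time, train_event, score_function):
    """Full-data score minus training-data score for one fold."""
    # Pool both folds and restore ascending time order for the risk sets.
    full_lp = np.concatenate([train_lp, test_lp])
    full_time = np.concatenate([train_time, test_time])
    full_event = np.concatenate([train_event, test_event])
    order = np.argsort(full_time, kind="stable")
    full = score_function(full_lp[order], full_time[order], full_event[order])
    train = score_function(train_lp, train_time, train_event)
    return full - train


train_time = np.array([1.0, 3.0])
train_event = np.array([1.0, 1.0])
test_time = np.array([2.0])
test_event = np.array([1.0])

score = vvh_fold_score(
    np.zeros(1), test_time, test_event,
    np.zeros(2), train_time, train_event,
    partial_log_likelihood,
)  # -log(6) - (-log(2)) = -log(3)
```

Because the training folds are larger than the test fold, this difference stays well defined even when the test fold holds few events, which is the motivation given in [2].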
- linear_cv(test_linear_predictor, test_time, test_event, score_function, test_eta_hat=None, train_linear_predictor=None, train_time=None, train_event=None)
CV score computation using linear predictors [1].
- Parameters:
test_linear_predictor (np.array) – Linear predictors of a given test fold. X@beta.
test_time (np.array) – Sorted time points of the test fold.
test_event (np.array) – Event indicator of the test fold.
test_eta_hat (np.array) – Predicted linear predictors of a given test fold.
train_linear_predictor (np.array) – Linear predictors of the training fold.
train_time (np.array) – Sorted time points of the training fold.
train_event (np.array) – Event indicator of the training fold.
score_function (Callable) – Scoring function used to compute the negative log-likelihood.
- Returns:
Scalar value of the mean partial log-likelihood for a given test fold.
- Return type:
Notes
All unused parameters kept for overall score function signature compatibility.
References
[1] Dai, Biyue, and Patrick Breheny. “Cross validation approaches for penalized Cox regression.” arXiv preprint arXiv:1905.10432 (2019).
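In the linear predictor approach of [1], each sample receives a linear predictor from the model fitted without its fold, and the score is evaluated once on the pooled predictors. The sketch below illustrates that pooling step. The design matrix, fold assignment, and per-fold coefficients are made-up illustrative values, and the normalisation of the score is an assumption.

```python
import numpy as np


def neg_partial_log_likelihood(linear_predictor, time, event):
    """Negative Cox partial log-likelihood divided by the sample count."""
    log_risk = np.logaddexp.accumulate(linear_predictor[::-1])[::-1]
    return -np.sum(event * (linear_predictor - log_risk)) / time.shape[0]


X = np.array([[1.0], [0.0], [1.0], [0.0]])
time = np.array([1.0, 2.0, 3.0, 4.0])   # already sorted ascending
event = np.array([1.0, 1.0, 0.0, 1.0])

folds = [np.array([0, 2]), np.array([1, 3])]
# Hypothetical coefficients, each fitted with the matching fold held out.
fold_betas = [np.array([0.5]), np.array([-0.5])]

# Fill in the cross-validated linear predictor fold by fold.
cv_linear_predictor = np.empty(time.shape[0])
for test_idx, beta in zip(folds, fold_betas):
    cv_linear_predictor[test_idx] = X[test_idx] @ beta

# Evaluate the score once on the pooled predictors.
score = neg_partial_log_likelihood(cv_linear_predictor, time, event)
```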