sparsesurv.aft module

Summary

Classes:

AFT

Linear Accelerated Failure Time (AFT) model based on the kernel-smoothed profile likelihood (PL) [Zeng2007].

Reference

class AFT(bandwidth=None, tol=None, options=None)[source]

Bases: SurvivalMixin

Linear Accelerated Failure Time (AFT) model based on the kernel-smoothed profile likelihood (PL) [Zeng2007].

Fits a linear AFT model based on the kernel-smoothed profile likelihood proposed by [Zeng2007]. Optimization uses the trust-ncg implementation in scipy.optimize.minimize together with a BFGS [Fletcher2000] quasi-Newton strategy. Gradients are JIT-compiled with numba (see sparsesurv.gradients).

References

[Zeng2007] Zeng, Donglin, and D. Y. Lin. “Efficient estimation for the accelerated failure time model.” Journal of the American Statistical Association 102.480 (2007): 1387-1396.

[Fletcher2000] Fletcher, Roger. Practical methods of optimization. John Wiley & Sons, 2000.

[Sheather1991] Sheather, Simon J., and Michael C. Jones. “A reliable data-based bandwidth selection method for kernel density estimation.” Journal of the Royal Statistical Society: Series B (Methodological) 53.3 (1991): 683-690.

[Zhong2021] Zhong, Qixian, Jonas W. Mueller, and Jane-Ling Wang. “Deep extended hazard models for survival analysis.” Advances in Neural Information Processing Systems 34 (2021): 15111-15124.

__init__(bandwidth=None, tol=None, options=None)[source]

Constructor.

Parameters:

bandwidth (Optional[float], optional) – Bandwidth used for kernel smoothing the profile likelihood. If left unspecified (i.e., None), the bandwidth is estimated empirically, similar to previous work ([Sheather1991], [Zhong2021]). Defaults to None.
tol (Optional[float], optional) – Tolerance for terminating the trust-ncg algorithm in scipy. Defaults to None.
options (Optional[Dict[str, Union[bool, int, float]]], optional) – Solver-specific configuration options of the trust-ncg solver in scipy. Defaults to None.
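
As a rough illustration of how tol and options reach a trust-ncg solver, the sketch below calls scipy.optimize.minimize with method="trust-ncg" on SciPy's built-in Rosenbrock test function rather than the AFT likelihood. The objective, the starting point, and the option values are assumptions chosen for the example; how AFT forwards these arguments internally is not shown in this reference.

```python
import numpy as np
from scipy.optimize import minimize, rosen, rosen_der, rosen_hess

# Illustrative values only; these are not sparsesurv defaults.
tol = 1e-8
options = {"maxiter": 500, "initial_trust_radius": 1.0}

# Stand-in objective: SciPy's Rosenbrock function with exact derivatives.
x0 = np.array([1.3, 0.7, 0.8, 1.9, 1.2])
result = minimize(
    rosen,
    x0,
    method="trust-ncg",
    jac=rosen_der,
    hess=rosen_hess,
    tol=tol,
    options=options,
)
print(result.success, result.nit)
```

For trust-ncg, SciPy interprets tol as the gradient-norm tolerance gtol unless options already contains a gtol entry.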

init_coefs(X)[source]

Initializes the coefficients of the AFT model at all zeros.

Parameters: X (npt.NDArray[np.float64]) – Training design matrix with n rows and p columns.
Returns: Initialized coefficients with p rows and 2 columns.
Return type: npt.NDArray[np.float64]

fit(X, y, sample_weight=None)[source]

Fits the linear AFT model using the trust-ncg implementation from scipy.

Parameters:

X (npt.NDArray[np.float64]) – Design matrix.
y (npt.NDArray[np.float64]) – Structured array containing right-censored survival information.
sample_weight (npt.NDArray[np.float64], optional) – Sample weights. Currently unused; kept only for sklearn compatibility. Defaults to None.

Return type:

None
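
A minimal sketch of preparing inputs for fit, assuming synthetic data drawn from a linear AFT model (log T = Xβ + ε) with independent right-censoring. The coefficient values, the noise and censoring distributions, and in particular the structured-array field names "event" and "time" are assumptions made for illustration; this reference does not state which field names sparsesurv expects.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 3
X = rng.normal(size=(n, p))
beta = np.array([0.5, -1.0, 0.0])  # hypothetical true coefficients

# Linear AFT model: log T = X @ beta + noise.
log_t = X @ beta + rng.normal(scale=0.5, size=n)
event_time = np.exp(log_t)

# Independent right-censoring: observe the earlier of event and censoring time.
censor_time = rng.exponential(scale=3.0, size=n)
observed_time = np.minimum(event_time, censor_time)
event = event_time <= censor_time

# Structured array; the field names "event" and "time" are an assumption.
y = np.empty(n, dtype=[("event", bool), ("time", np.float64)])
y["event"] = event
y["time"] = observed_time

# With sparsesurv installed, fitting would then look like:
# from sparsesurv.aft import AFT
# model = AFT()
# model.fit(X, y)
```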

predict_cumulative_hazard_function(X, time)[source]

Predict cumulative hazard function for patients in X at times time.

Parameters:

X (npt.NDArray[np.float64]) – Query design matrix with u rows and p columns.
time (npt.NDArray[np.float64]) – Query times of dimension k. Assumed to be unique and ordered.

Raises:

ValueError – If the query times are not unique and sorted in ascending order.

Returns:

Cumulative hazard function for samples 1, …, u at times 1, …, k, returned as an array with u rows and k columns.

Return type:

npt.NDArray[np.float64]
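
The requirement on time and the shape of the result can be sketched as follows. The helper check_query_times is hypothetical and only mirrors the documented ValueError condition; the cumulative hazard values are placeholders, not model output. The last line uses the standard identity S(t | x) = exp(-H(t | x)) to turn cumulative hazards into survival probabilities.

```python
import numpy as np


def check_query_times(time):
    """Hypothetical helper: require unique, ascending query times."""
    time = np.asarray(time, dtype=np.float64)
    if time.ndim != 1 or np.any(np.diff(time) <= 0):
        raise ValueError(
            "Query times must be unique and sorted in ascending order."
        )
    return time


time = check_query_times([0.5, 1.0, 2.0])

# Placeholder cumulative hazards for u = 2 samples at k = 3 query times.
cumulative_hazard = np.array([[0.1, 0.3, 0.7],
                              [0.2, 0.5, 1.1]])

# Survival function from cumulative hazard: S = exp(-H).
survival = np.exp(-cumulative_hazard)
```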

set_fit_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → AFT

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters: sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.
Returns: self – The updated object.
Return type: object