scxpand.lightgbm.run_lightgbm_

Functions

compute_sample_weights(y)

Compute balanced sample weights.

run_lightgbm_inference(model, data_format[, ...])

Run inference using a trained LightGBM model.

run_lightgbm_training(base_save_dir, prm, ...)

Train a LightGBM model for gene expression classification.

scxpand.lightgbm.run_lightgbm_.compute_sample_weights(y)

Compute balanced sample weights.

Return type:

ndarray
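
The entry above does not spell out what "balanced" means. A minimal sketch, assuming the common convention (also used by scikit-learn's "balanced" mode) of weighting each sample by n_samples / (n_classes * class_count), is shown below. The function name balanced_sample_weights is hypothetical and is not the scxpand implementation.

```python
import numpy as np


def balanced_sample_weights(y):
    """Hypothetical stand-in, not the scxpand implementation.

    Assumes weight = n_samples / (n_classes * count_of_sample's_class),
    so every class contributes the same total weight.
    """
    y = np.asarray(y).ravel()
    # inverse maps each sample back to the index of its class in `classes`
    classes, inverse, counts = np.unique(y, return_inverse=True, return_counts=True)
    class_weights = y.shape[0] / (classes.shape[0] * counts)
    return class_weights[inverse]


labels = np.array([0, 0, 0, 1])
weights = balanced_sample_weights(labels)
```

Under this assumption the minority class receives the larger per-sample weight, and the weights sum to the number of samples.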

scxpand.lightgbm.run_lightgbm_.run_lightgbm_inference(model, data_format, adata=None, data_path=None, eval_row_inds=None)

Run inference using a trained LightGBM model.

Parameters:
  • model (BaseEstimator) – Trained LightGBM model

  • data_format (DataFormat) – Data format specification for preprocessing

  • adata (AnnData | None (default: None)) – AnnData object containing gene expression data (alternative to data_path)

  • data_path (str | Path | None (default: None)) – Path to data file (alternative to adata)

  • eval_row_inds (ndarray | None (default: None)) – Indices of rows to evaluate (if None, uses all rows)

Return type:

ndarray

Returns:

Array of prediction probabilities for the positive class
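
To illustrate the row-selection and return-value semantics described above, here is a minimal sketch. It assumes the model follows the scikit-learn predict_proba convention, with the positive class in column 1. StubModel and positive_class_probs are hypothetical names used only for illustration; the preprocessing driven by data_format is not shown.

```python
import numpy as np


class StubModel:
    """Hypothetical stand-in for a fitted scikit-learn style classifier."""

    def predict_proba(self, X):
        # Toy score: logistic function of each row's sum.
        p = 1.0 / (1.0 + np.exp(-X.sum(axis=1)))
        # Column 0: negative class, column 1: positive class.
        return np.column_stack([1.0 - p, p])


def positive_class_probs(model, X, eval_row_inds=None):
    """Return positive-class probabilities, optionally for a subset of rows."""
    if eval_row_inds is not None:
        X = X[eval_row_inds]  # None means "use all rows"
    return model.predict_proba(X)[:, 1]


X = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, -1.0]])
all_probs = positive_class_probs(StubModel(), X)
subset_probs = positive_class_probs(StubModel(), X, eval_row_inds=np.array([0, 2]))
```

The result is a one-dimensional array with one probability per evaluated row, so passing eval_row_inds shortens the output accordingly.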

scxpand.lightgbm.run_lightgbm_.run_lightgbm_training(base_save_dir, prm, data_path, dev_ratio=0.2, trial=None, score_metric='harmonic_avg/AUROC', resume=False)

Train a LightGBM model for gene expression classification.

Parameters:
  • base_save_dir (str | Path) – Directory to save model and results

  • prm (LightGBMParams) – LightGBM parameters

  • data_path (str) – Path to data file

  • dev_ratio (float (default: 0.2)) – Fraction of the data to hold out for validation

  • trial (Trial | None (default: None)) – Optuna trial object for hyperparameter optimization

  • score_metric (str (default: 'harmonic_avg/AUROC')) – Metric to use for scoring

  • resume (bool (default: False)) – Whether to resume from an existing checkpoint (not implemented for LightGBM)

Return type:

dict[str, dict[str, float]]

Returns:

Dictionary containing evaluation results
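
The return type is a nested dictionary, and the default score_metric has the form 'harmonic_avg/AUROC'. The sketch below assumes, without confirmation from the source, that such a string addresses the results as group/metric. The helper lookup_score is hypothetical, and the numbers are placeholders rather than real results.

```python
def lookup_score(results, score_metric="harmonic_avg/AUROC"):
    """Hypothetical helper: read one value from a dict[str, dict[str, float]].

    Assumes score_metric is formatted as "<group>/<metric>".
    """
    group, metric = score_metric.split("/", 1)
    return results[group][metric]


# Placeholder values for illustration only, not output from a real run.
example_results = {"harmonic_avg": {"AUROC": 0.5}}
score = lookup_score(example_results)
```

If the key layout differs in practice, only the split logic in this helper would need to change.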