scxpand.core.evaluation

Core evaluation logic for scXpand models.

This module contains the domain logic for evaluating model predictions and orchestrating the evaluation pipeline. It depends only on utilities and has no dependencies on model-specific training code.

Functions

evaluate_predictions_and_save(y_pred_prob, ...)

Evaluate predictions against ground truth and save results.

scxpand.core.evaluation.evaluate_predictions_and_save(y_pred_prob, obs_df, model_type, save_path, eval_name='dev', score_metric='harmonic_avg/AUROC', trial=None)

Evaluate predictions against ground truth and save results.

This is the main evaluation function. It coordinates:

  1. Extracting ground truth labels from observation data
  2. Saving predictions to a CSV file
  3. Computing and saving evaluation metrics
  4. Logging results
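The four steps can be illustrated with a minimal, standard-library-only sketch. This is not scXpand's implementation: the helper names (`auroc`, `evaluate_and_save_sketch`), the `label` column, the output filenames, and the use of plain AUROC in place of `harmonic_avg/AUROC` are all assumptions made for illustration.

```python
import csv
import json
import logging
import tempfile
from pathlib import Path

logger = logging.getLogger("evaluation_sketch")


def auroc(y_true, y_score):
    """Pairwise AUROC: fraction of (positive, negative) pairs ranked correctly, ties count half."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        return float("nan")  # undefined when only one class is present
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def evaluate_and_save_sketch(y_pred_prob, obs_rows, save_path, eval_name="dev", label_col="label"):
    """Hypothetical stand-in mirroring the four documented steps."""
    # 1. Extract ground truth labels (the column name is an assumption).
    y_true = [int(row[label_col]) for row in obs_rows]

    # 2. Save predictions to CSV, unless saving is skipped with save_path=None.
    save_dir = None
    if save_path is not None:
        save_dir = Path(save_path)
        save_dir.mkdir(parents=True, exist_ok=True)
        with open(save_dir / f"{eval_name}_predictions.csv", "w", newline="") as fh:
            writer = csv.writer(fh)
            writer.writerow(["y_true", "y_pred_prob"])
            writer.writerows(zip(y_true, y_pred_prob))

    # 3. Compute metrics and save them next to the predictions.
    metrics = {"AUROC": auroc(y_true, y_pred_prob)}
    if save_dir is not None:
        (save_dir / f"{eval_name}_metrics.json").write_text(json.dumps(metrics, indent=2))

    # 4. Log results.
    logger.info("%s metrics: %s", eval_name, metrics)
    return metrics


# Toy data: two negative and two positive cells.
obs_rows = [{"label": 0}, {"label": 0}, {"label": 1}, {"label": 1}]
probs = [0.1, 0.4, 0.35, 0.8]
with tempfile.TemporaryDirectory() as tmp:
    results = evaluate_and_save_sketch(probs, obs_rows, tmp)
    saved = sorted(p.name for p in Path(tmp).iterdir())
```

In this toy run, three of the four positive/negative pairs are ranked correctly, so `results["AUROC"]` is 0.75. The real function takes a NumPy array and a pandas DataFrame rather than plain lists, and its file layout may differ.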

Parameters:
  • y_pred_prob (ndarray) – Predicted probabilities from model

  • obs_df (DataFrame) – DataFrame with cell metadata and ground truth labels

  • model_type (ModelType | str) – Type of model used for predictions

  • save_path (Path | None) – Directory to save evaluation results (None to skip saving)

  • eval_name (str (default: 'dev')) – Name for this evaluation (used in filenames)

  • score_metric (str (default: 'harmonic_avg/AUROC')) – Primary metric to optimize/report

  • trial (Trial | None (default: None)) – Optional Optuna trial for hyperparameter optimization

Return type:

dict

Returns:

Dictionary containing evaluation metrics and scores, or an empty dict if evaluation is skipped
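Because the return value may be an empty dict, callers probably want to guard before reading the primary score. The sketch below assumes the score is stored under a top-level key equal to `score_metric`. The documentation does not confirm that layout, and `primary_score` is a hypothetical helper, not part of scXpand.

```python
def primary_score(results, score_metric="harmonic_avg/AUROC"):
    """Return the primary metric, or None when evaluation was skipped.

    Assumes a flat dict keyed by metric name (an unverified assumption).
    """
    if not results:  # empty dict: evaluation was skipped
        return None
    return results.get(score_metric)


skipped = primary_score({})
scored = primary_score({"harmonic_avg/AUROC": 0.9})
```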