scxpand.core.inference

Main public inference API for scXpand models.

This module provides the primary public interface for running inference with any type of scXpand model (local, registry, or URL-based).

Functions

run_inference([data_path, adata, ...])

Main public API for running inference with scXpand models.

scxpand.core.inference.run_inference(data_path=None, adata=None, model_path=None, model_name=None, model_url=None, save_path=None, batch_size=1024, num_workers=4, eval_row_inds=None)

Main public API for running inference with scXpand models.

This is the primary entry point for running inference with any type of scXpand model. It automatically detects the model source and routes to the appropriate inference pipeline. Supports local models, registry models, and external models via URL. Metrics are automatically computed when ground truth labels are available in the data.

Parameters:
  • data_path (str | Path | None (default: None)) – Path to input data file (h5ad format). Alternative to adata.

  • adata (AnnData | None (default: None)) – In-memory AnnData object. Alternative to data_path.

  • model_path (str | Path | None (default: None)) – Path to local trained model directory (for local models).

  • model_name (str | None (default: None)) – Name of pre-trained model from registry (for registry models).

  • model_url (str | None (default: None)) – Direct URL to model ZIP file (for any external model).

  • save_path (str | Path | None (default: None)) – Directory in which to save prediction results. If None, nothing is written to disk and the results are only returned.

  • batch_size (int (default: 1024)) – Batch size for inference.

  • num_workers (int (default: 4)) – Number of workers for data loading.

  • eval_row_inds (default: None) – Indices of the specific cells to evaluate. None evaluates all cells. Supported only for local models.

Return type:

InferenceResults

Returns:

Structured results containing predictions, metrics (if available), and model info.

Raises:
  • ValueError – If model source is not specified or multiple sources are specified.

  • ValueError – If neither data_path nor adata is provided.

  • FileNotFoundError – If specified files do not exist.
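
The argument rules listed under Raises can be sketched as a small pre-flight check. This is only an illustration of the documented behaviour, not scXpand's actual implementation: the helper name check_inference_args and its return value are hypothetical, and the error messages are invented for the example.

```python
from pathlib import Path


def check_inference_args(data_path=None, adata=None, model_path=None,
                         model_name=None, model_url=None):
    """Hypothetical pre-flight check mirroring the documented Raises rules.

    Returns the detected model source: "local", "registry", or "url".
    """
    sources = {"local": model_path, "registry": model_name, "url": model_url}
    given = [kind for kind, value in sources.items() if value is not None]

    # Exactly one model source must be specified.
    if len(given) != 1:
        raise ValueError(
            "Specify exactly one of model_path, model_name, or model_url; "
            f"got {len(given)}."
        )

    # At least one data source is required. The documentation does not say
    # what happens when both are given, so this sketch does not reject that.
    if data_path is None and adata is None:
        raise ValueError("Provide either data_path or adata.")

    # Paths that are supplied must exist on disk.
    if data_path is not None and not Path(data_path).exists():
        raise FileNotFoundError(f"Data file not found: {data_path}")
    if model_path is not None and not Path(model_path).exists():
        raise FileNotFoundError(f"Model directory not found: {model_path}")

    return given[0]
```

Running a check like this before calling run_inference lets a script fail fast with a clear message, for example before loading a large h5ad file into memory.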

Examples

>>> import scxpand
>>> # Local model inference
>>> results = scxpand.run_inference(
...     data_path="my_data.h5ad", model_path="results/mlp"
... )
>>> print(f"Generated {len(results.predictions)} predictions")
>>> if results.has_metrics:
...     print(f"AUROC: {results.get_auroc():.3f}")
>>> # Registry model inference
>>> results = scxpand.run_inference(
...     data_path="my_data.h5ad", model_name="pan_cancer_autoencoder"
... )
>>> if results.has_model_info:
...     print(f"Model type: {results.model_info.model_type}")
>>> # Direct URL inference (seamless model sharing!)
>>> results = scxpand.run_inference(
...     data_path="my_data.h5ad",
...     model_url="https://your-platform.com/model.zip",
... )
>>> # In-memory inference with any model type (no saving)
>>> import scanpy as sc
>>> adata = sc.read_h5ad("my_data.h5ad")
>>> results = scxpand.run_inference(
...     adata=adata, model_name="pan_cancer_autoencoder", save_path=None
... )
>>> # Results are returned but not saved to disk
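
The result attributes used in the examples above (predictions, has_metrics, get_auroc(), has_model_info, model_info.model_type) can be collected into one summary helper. The sketch below is a hypothetical convenience function, not part of scXpand. It assumes only those attributes and is demonstrated on a stand-in object so that it runs without a trained model.

```python
from types import SimpleNamespace


def summarize_results(results):
    """Build a short text summary from an InferenceResults-like object.

    Uses only the attributes shown in the examples above.
    """
    lines = [f"predictions: {len(results.predictions)}"]
    if results.has_metrics:
        # Metrics exist only when ground truth labels were in the data.
        lines.append(f"AUROC: {results.get_auroc():.3f}")
    if results.has_model_info:
        lines.append(f"model type: {results.model_info.model_type}")
    return "\n".join(lines)


# Stand-in object for illustration only; real code would pass the value
# returned by run_inference instead.
fake_results = SimpleNamespace(
    predictions=[0.1, 0.9, 0.4],
    has_metrics=True,
    get_auroc=lambda: 0.875,
    has_model_info=False,
    model_info=None,
)
summary = summarize_results(fake_results)
print(summary)
```

Guarding each optional field with its has_* flag keeps the helper usable for unlabeled data, where no metrics are computed.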