scxpand.pretrained

Pre-trained model management for scXpand.

This module provides functionality to download and manage pre-trained models from Google Drive, including automatic model loading for inference.

scxpand.pretrained.download_pretrained_model(model_name=None, model_url=None, cache_dir=None)

Download a pre-trained model and return the path to the extracted model.

Uses Pooch for caching, hash verification, and extraction. On first download, Pooch computes the SHA256 hash of the file; on subsequent accesses it verifies the cached copy against that hash. When a model is updated (its hash differs), Pooch downloads the new version to a fresh cache directory, so version updates are picked up automatically. Supports both registry models and direct URLs, including DOI URLs.

By default, models are downloaded to a .scxpand_cache directory in the current working directory, which makes downloaded models easy to find and clean up.
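The record-then-verify behaviour described above can be illustrated with a standard-library sketch. This is not Pooch's or scXpand's implementation; the helper names (sha256_of, record_or_verify) and the .sha256 sidecar file are hypothetical, chosen only to show the idea of storing a hash on first access and comparing against it later.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_or_verify(path: Path) -> bool:
    """Record the hash on first access; verify it on later accesses.

    The sidecar name (<file>.sha256) is a hypothetical convention.
    Returns True if the file is new or unchanged, False on a mismatch.
    """
    sidecar = path.with_name(path.name + ".sha256")
    current = sha256_of(path)
    if not sidecar.exists():
        sidecar.write_text(current)  # first access: remember the hash
        return True
    return sidecar.read_text().strip() == current  # later access: compare


with tempfile.TemporaryDirectory() as tmp:
    model_file = Path(tmp) / "model.zip"
    model_file.write_bytes(b"fake model bytes")
    first_check = record_or_verify(model_file)   # records the hash
    second_check = record_or_verify(model_file)  # content unchanged
    model_file.write_bytes(b"tampered bytes")
    third_check = record_or_verify(model_file)   # content no longer matches
```

A failed check is the signal a caching layer would use to discard the cached copy and download again.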

Parameters:
  • model_name (str | None (default: None)) – Name of pre-trained model from registry (alternative to model_url)

  • model_url (str | None (default: None)) – Direct URL to model file (alternative to model_name). Supports HTTP/HTTPS URLs for direct downloads.

  • cache_dir (Path | None (default: None)) – Custom cache directory (uses .scxpand_cache in current dir if None)

Return type:

Path

Returns:

Path to the extracted model directory or file

Raises:

ValueError – If neither model_name nor model_url is provided, or if both are provided
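The exactly-one-of rule for model_name and model_url could be checked roughly as follows. This is a minimal sketch under stated assumptions: resolve_model_source and its return value are hypothetical and are not part of the scXpand API.

```python
from typing import Optional, Tuple


def resolve_model_source(
    model_name: Optional[str] = None,
    model_url: Optional[str] = None,
) -> Tuple[str, str]:
    """Hypothetical helper: require exactly one of model_name / model_url."""
    # Both None or both set: the two "is None" flags are equal, which is an error.
    if (model_name is None) == (model_url is None):
        raise ValueError("Provide exactly one of model_name or model_url")
    if model_name is not None:
        return ("registry", model_name)
    return ("url", model_url)
```

Comparing the two "is None" flags covers both failure cases (neither given, both given) in a single condition.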

Examples

>>> from pathlib import Path
>>> from scxpand.pretrained import download_pretrained_model
>>>
>>> # Registry model (downloads to ./.scxpand_cache/)
>>> model_path = download_pretrained_model(
...     model_name="pan_cancer_autoencoder"
... )
>>>
>>> # Direct URL (downloads to ./.scxpand_cache/)
>>> model_path = download_pretrained_model(
...     model_url="https://your-platform.com/model.zip"
... )
>>>
>>> # Custom cache directory
>>> model_path = download_pretrained_model(
...     model_url="https://figshare.com/ndownloader/files/model.zip",
...     cache_dir=Path("/my/custom/cache"),
... )

scxpand.pretrained.fetch_model_and_run_inference(model_name=None, model_url=None, data_path=None, adata=None, save_path=None, batch_size=None, num_workers=4, eval_row_inds=None)

Internal function for running inference with pre-trained models.

Handles automatic model downloading, loading, and inference in a single call. Works with both file-based and in-memory data. For external use, call scxpand.run_inference() instead.

Parameters:
  • model_name (str | None (default: None)) – Name of pre-trained model from registry (alternative to model_url)

  • model_url (str | None (default: None)) – Direct URL to model ZIP file (alternative to model_name)

  • data_path (str | Path | None (default: None)) – Path to input data file (h5ad format) - alternative to adata

  • adata (AnnData | None (default: None)) – In-memory AnnData object - alternative to data_path

  • save_path (str | Path | None (default: None)) – Directory to save prediction results (optional)

  • batch_size (int | None (default: None)) – Batch size for inference (uses model default if None)

  • num_workers (int (default: 4)) – Number of workers for data loading

  • eval_row_inds (default: None) – Specific cell indices to evaluate (None for all cells)

Returns:

Structured results containing predictions, metrics (if available), and model info

Return type:

InferenceResults

Raises:
  • ValueError – If neither model_name nor model_url is provided, or if neither data_path nor adata is provided

  • FileNotFoundError – If data_path does not exist
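The data-source errors listed above can be sketched as a small pre-flight check. The helper name check_data_source is hypothetical, and preferring adata when both inputs are given is an assumption made for this illustration; the documentation does not state which takes precedence.

```python
from pathlib import Path
from typing import Any, Optional, Union


def check_data_source(
    data_path: Optional[Union[str, Path]] = None,
    adata: Optional[Any] = None,
) -> str:
    """Hypothetical helper: validate the data inputs before inference."""
    if data_path is None and adata is None:
        raise ValueError("Provide either data_path or adata")
    if adata is not None:
        # Assumption: an in-memory object takes precedence over a path.
        return "in-memory"
    if not Path(data_path).exists():
        raise FileNotFoundError(f"Data file not found: {data_path}")
    return "file"
```

Running the checks before any download means a bad path fails fast instead of after a large model transfer.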

Note

This is an internal function. For external use, use scxpand.run_inference() instead.

scxpand.pretrained.get_pretrained_model_info(model_name)

Get information about a pre-trained model.

Parameters:

model_name (str) – Name of the pre-trained model

Return type:

PretrainedModelInfo

Returns:

PretrainedModelInfo object containing model metadata

Raises:

ValueError – If model_name is not found in registry
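A registry lookup that raises ValueError for unknown names might look like the sketch below. Everything here is illustrative: ModelInfoSketch, its fields, the registry contents, and the example URL are hypothetical stand-ins and do not reflect the real PretrainedModelInfo class or the scXpand registry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelInfoSketch:
    """Hypothetical stand-in for PretrainedModelInfo; fields are assumed."""
    name: str
    url: str


# Hypothetical registry with a single placeholder entry.
_REGISTRY = {
    "example_model": ModelInfoSketch(
        name="example_model",
        url="https://example.org/model.zip",
    ),
}


def lookup_model_info(model_name: str) -> ModelInfoSketch:
    """Return registry metadata, or raise ValueError for unknown names."""
    try:
        return _REGISTRY[model_name]
    except KeyError:
        available = ", ".join(sorted(_REGISTRY))
        raise ValueError(
            f"Unknown model {model_name!r}; available: {available}"
        ) from None
```

Listing the available names in the error message helps callers recover from a typo without opening the registry source.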

Modules

download_manager

Download manager for pre-trained models using Pooch.

inference_api

Internal inference API for pre-trained models.

model_registry

Registry for pre-trained models and their metadata.