scxpand.autoencoders.ae_params

Classes

AutoEncoderParams([use_log_transform, ...])

Configuration parameters for autoencoder training and architecture.

class scxpand.autoencoders.ae_params.AutoEncoderParams(use_log_transform=True, n_epochs=10, early_stopping_patience=5, init_learning_rate=5e-05, ridge_lambda=0.01, l1_lambda=0.001, recon_loss_weight=1.0, cls_loss_weight=1.0, cat_loss_weight=1.0, weight_decay=0.001, max_grad_norm=1.0, lr_scheduler_config=<factory>, lr_scheduler_type=LRSchedulerType.REDUCE_LR_ON_PLATEAU, optimizer_type=OptimizerType.ADAMW, adam_betas=(0.9, 0.999), train_batch_size=2048, inference_batch_size=2048, sampler_type=SamplerType.RANDOM, latent_dim=32, encoder_hidden_dims=(64, ), decoder_hidden_dims=(64, ), classifier_hidden_dims=(16, ), dropout_rate=0.1, mask_rate=0.1, noise_std=0.0001, soft_loss_beta=1.0, soft_loss_start_epoch=None, positives_weight=1.0, train_log_interval=5, random_seed=42, model_type='standard', loss_type='mse', aux_categorical_types=<factory>)

Configuration parameters for autoencoder training and architecture.

Contains all hyperparameters needed to configure and train an autoencoder model for T-cell expansion prediction. Includes architecture settings, training parameters, regularization, and optimization settings.

Architecture Parameters:

latent_dim: Dimensionality of the latent embedding space.
encoder_hidden_dims: Hidden layer sizes for the encoder network.
decoder_hidden_dims: Hidden layer sizes for the decoder network.
classifier_hidden_dims: Hidden layer sizes for the classification head.
dropout_rate: Dropout probability for regularization.
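To make the architecture parameters concrete, the sketch below expands the default tuples into per-layer (input, output) sizes. It assumes each network is a plain stack of dense layers (encoder: input to hidden to latent; decoder: latent to hidden to input; classifier: latent to hidden to a single output). The helper `layer_shapes`, the input size `n_genes`, and the single-output classifier are hypothetical and are not taken from the scxpand source.

```python
def layer_shapes(n_inputs, hidden_dims, n_outputs):
    """Return (in_features, out_features) pairs for a stack of dense layers."""
    sizes = [n_inputs, *hidden_dims, n_outputs]
    return list(zip(sizes[:-1], sizes[1:]))


# Hypothetical input dimensionality (e.g. number of genes); not from the docs.
n_genes = 2000

# Defaults documented for AutoEncoderParams.
latent_dim = 32
encoder_hidden_dims = (64,)
decoder_hidden_dims = (64,)
classifier_hidden_dims = (16,)

encoder = layer_shapes(n_genes, encoder_hidden_dims, latent_dim)
decoder = layer_shapes(latent_dim, decoder_hidden_dims, n_genes)
# Assumption: the classification head emits one logit per cell.
classifier = layer_shapes(latent_dim, classifier_hidden_dims, 1)

print(encoder)     # [(2000, 64), (64, 32)]
print(decoder)     # [(32, 64), (64, 2000)]
print(classifier)  # [(32, 16), (16, 1)]
```

Adding entries to a `*_hidden_dims` tuple adds layers to that network; `latent_dim` sets the size of the bottleneck shared by the decoder and the classifier in this sketch.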

Training Parameters:

n_epochs: Maximum number of training epochs.
early_stopping_patience: Epochs to wait for improvement before stopping.
train_batch_size: Batch size for training.
inference_batch_size: Batch size for inference.
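The interaction between `n_epochs` and `early_stopping_patience` can be illustrated with a small simulation. This is a minimal sketch of a common patience rule, assuming the monitored quantity is a validation loss where lower is better; the function `epochs_run` is hypothetical, and the actual stopping criterion in scxpand may differ.

```python
def epochs_run(val_losses, n_epochs=10, early_stopping_patience=5):
    """Return how many epochs run before training stops.

    Training stops at n_epochs, or earlier once the validation loss has
    failed to improve for early_stopping_patience consecutive epochs.
    """
    best = float("inf")
    stale = 0  # consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses[:n_epochs], start=1):
        if loss < best:
            best = loss
            stale = 0
        else:
            stale += 1
            if stale >= early_stopping_patience:
                return epoch
    return min(len(val_losses), n_epochs)


# Loss improves until epoch 2, then stalls for five epochs: stop at epoch 7.
print(epochs_run([1.0, 0.9, 0.95, 0.96, 0.97, 0.98, 0.99, 0.5]))  # 7
```

With the documented defaults (`n_epochs=10`, `early_stopping_patience=5`), a rule like this can only cut a run short if the metric stalls by epoch 5 at the latest.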

Loss and Regularization:

recon_loss_weight: Weight for reconstruction loss component.
cls_loss_weight: Weight for classification loss component.
ridge_lambda: L2 regularization coefficient.
l1_lambda: L1 regularization coefficient for latent vectors.
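One plausible way these four coefficients combine into a single training objective is a weighted sum. The sketch below is an assumption for illustration only: the function `total_loss` is hypothetical, the L2 penalty is assumed to apply to model weights (the documentation does not say what `ridge_lambda` penalizes), and the real scxpand loss may include further terms such as the one scaled by `cat_loss_weight`.

```python
def total_loss(
    recon_loss,
    cls_loss,
    weights,
    latent,
    recon_loss_weight=1.0,
    cls_loss_weight=1.0,
    ridge_lambda=0.01,
    l1_lambda=0.001,
):
    """Weighted sum of loss components plus L2 and L1 penalties (illustrative)."""
    # Assumption: the ridge (L2) penalty is the sum of squared weights.
    l2_penalty = sum(w * w for w in weights)
    # Per the docs, the L1 penalty applies to the latent vector.
    l1_penalty = sum(abs(z) for z in latent)
    return (
        recon_loss_weight * recon_loss
        + cls_loss_weight * cls_loss
        + ridge_lambda * l2_penalty
        + l1_lambda * l1_penalty
    )


# 1.0*0.5 + 1.0*0.2 + 0.01*(1 + 4) + 0.001*(0.5 + 1.5)
loss = total_loss(0.5, 0.2, weights=[1.0, -2.0], latent=[0.5, -1.5])
```

Under this reading, raising `cls_loss_weight` relative to `recon_loss_weight` shifts optimization toward the classification head, while `l1_lambda` encourages sparser latent vectors.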

Example

>>> from scxpand.autoencoders.ae_params import AutoEncoderParams
>>> params = AutoEncoderParams(latent_dim=64, n_epochs=50)
>>> # Customize for your dataset
>>> params.encoder_hidden_dims = (128, 64)
>>> params.init_learning_rate = 1e-4

classmethod get_model_type()

Return the model type identifier for this parameter class.

Return type:

ModelType

get_data_loader_params()

Return type:

DataLoaderParams

get_dataset_params()

Return type:

DataAugmentParams

get_lr_scheduler_params()

Return type:

LRSchedulerParams

get_optimizer_params()

Return type:

OptimizerParams

needs_pi_head()

Return True if the loss type requires pi (zero-inflation) parameter.

Return type:

bool

needs_theta_head()

Return True if the loss type requires theta (dispersion) parameter.

Return type:

bool
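The two predicates above depend on `loss_type`, which accepts `'zinb'`, `'nb'`, or `'mse'`. The sketch below shows the mapping one would expect from the standard definitions of these losses: a negative binomial (NB) likelihood needs a dispersion parameter theta, and a zero-inflated negative binomial (ZINB) additionally needs a zero-inflation probability pi. The function `heads_for` is hypothetical, and this mapping has not been verified against the scxpand implementation.

```python
VALID_LOSS_TYPES = ("zinb", "nb", "mse")


def heads_for(loss_type):
    """Return which extra decoder heads a loss type is assumed to require."""
    if loss_type not in VALID_LOSS_TYPES:
        raise ValueError(f"unknown loss_type: {loss_type!r}")
    return {
        # Dispersion is a parameter of both NB and ZINB likelihoods.
        "theta": loss_type in ("nb", "zinb"),
        # Zero-inflation probability exists only in the ZINB likelihood.
        "pi": loss_type == "zinb",
    }


for name in VALID_LOSS_TYPES:
    print(name, heads_for(name))
```

Under this assumption, the default `loss_type='mse'` would need neither head.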

adam_betas: tuple[float, float] = (0.9, 0.999)
aux_categorical_types: tuple[str, ...]
cat_loss_weight: float = 1.0
classifier_hidden_dims: tuple[int, ...] = (16,)
cls_loss_weight: float = 1.0
decoder_hidden_dims: tuple[int, ...] = (64,)
dropout_rate: float = 0.1
early_stopping_patience: int = 5
encoder_hidden_dims: tuple[int, ...] = (64,)
inference_batch_size: int = 2048
init_learning_rate: float = 5e-05
l1_lambda: float = 0.001
latent_dim: int = 32
loss_type: Literal['zinb', 'nb', 'mse'] = 'mse'
lr_scheduler_config: dict[str, Any] | None
lr_scheduler_type: LRSchedulerType = 'ReduceLROnPlateau'
mask_rate: float = 0.1
max_grad_norm: float = 1.0
model_type: Literal['standard', 'fork'] = 'standard'
n_epochs: int = 10
noise_std: float = 0.0001
optimizer_type: OptimizerType = 'AdamW'
positives_weight: float = 1.0
random_seed: int = 42
recon_loss_weight: float = 1.0
ridge_lambda: float = 0.01
sampler_type: SamplerType = 'random'
soft_loss_beta: float | None = 1.0
soft_loss_start_epoch: int | None = None
train_batch_size: int = 2048
train_log_interval: int = 5
use_log_transform: bool = True
weight_decay: float = 0.001