scxpand.util.general_util#
Functions
| Function | Description |
| --- | --- |
| compute_false_negative_rate() | |
| compute_false_positive_rate() | |
| compute_row_sums() | Compute row sums for any array-like object, returning NumPy array. |
| compute_scaling_factors() | Compute row scaling factors for normalization. |
| convert_enums_to_values() | Recursively convert enum objects to their string values for JSON serialization and logging. |
| copy_array_like() | Create a copy of an array-like object if requested. |
| decisions_to_probabilities() | Convert raw decision_function scores to probability estimates. |
| ensure_numpy_array() | Ensure the input is converted to a NumPy array. |
| flatten_dict() | Flatten a nested dictionary, preserving the hierarchy in the keys. |
| flatten_nested_dict() | Convert a nested dictionary to a flattened dictionary with keys in the format 'key1/key2/...'. |
| floats_to_str() | Convert a numeric float value to a string, with a given precision. |
| format_float() | Format a float number using fixed-point notation unless it is very small (but nonzero). |
| get_device() | Automatically detect and return the best available device for PyTorch. |
| get_elapsed_time_str() | |
| get_last_git_commit_link() | Get the last git commit link from the remote repository. |
| get_new_version_path() | Create a versioned directory path to avoid overwriting existing results. |
| load_and_override_params() | Load parameters from config file or use defaults, then apply overrides. |
| load_params() | Load model parameters from a saved JSON file. |
| log_inference_progress() | Log progress during inference or processing operations. |
| log_nested_metrics() | Log nested metrics with hierarchical display and highlighted score metric. |
| metrics_dict_to_dataframes() | Convert nested metrics dictionary to pandas DataFrames for nice display in notebooks. |
| metrics_dict_to_table() | Convert nested metrics dictionary to a formatted table string using pandas. |
| nested_dict_to_flat_str() | Flatten a nested dictionary into a string with comma-separated values. |
| nested_dict_to_multiline_str() | Recursively returns a string containing a nested dictionary of scalars in a hierarchically indented multi-line format. |
| num2str() | Convert a number to a string, with a fixed number of decimal places. |
| save_json_data() | Save arbitrary dictionary data to a JSON file. |
| save_params() | Save parameters to a json file and save model type from parameter object. |
| time_to_str() | Convert a datetime object to a string. |
| to_np() | Convert tensor to numpy array. |
- scxpand.util.general_util.compute_false_negative_rate(label, prob_out, threshold=0.5)#
- Return type:
- scxpand.util.general_util.compute_false_positive_rate(label, prob_out, threshold=0.5)#
- Return type:
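The two rate functions above carry no description on this page. The sketch below shows the standard definitions only (FNR = FN / (FN + TP), FPR = FP / (FP + TN)) in plain Python. The `sketch_*` names, the `>=` comparison against the threshold, and the 0.0 result for an empty denominator are assumptions made for illustration, not the library's confirmed behavior.

```python
def _confusion_counts(label, prob_out, threshold=0.5):
    """Count TP, FP, TN, FN for binary labels (1 = positive, 0 = negative)."""
    tp = fp = tn = fn = 0
    for y, p in zip(label, prob_out):
        predicted_positive = p >= threshold  # assumption: ties count as positive
        if y == 1 and predicted_positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def sketch_false_negative_rate(label, prob_out, threshold=0.5):
    """Fraction of true positives that were predicted negative."""
    tp, _, _, fn = _confusion_counts(label, prob_out, threshold)
    # Assumption: return 0.0 when there are no positive labels at all.
    return fn / (fn + tp) if (fn + tp) else 0.0


def sketch_false_positive_rate(label, prob_out, threshold=0.5):
    """Fraction of true negatives that were predicted positive."""
    _, fp, tn, _ = _confusion_counts(label, prob_out, threshold)
    # Assumption: return 0.0 when there are no negative labels at all.
    return fp / (fp + tn) if (fp + tn) else 0.0


labels = [1, 1, 0, 0]
probs = [0.9, 0.2, 0.7, 0.1]
fnr = sketch_false_negative_rate(labels, probs)
fpr = sketch_false_positive_rate(labels, probs)
```

With these inputs, one of the two positives is missed and one of the two negatives is flagged, so both rates are 0.5.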
- scxpand.util.general_util.compute_row_sums(X)#
Compute row sums for any array-like object, returning NumPy array.
- scxpand.util.general_util.compute_scaling_factors(row_sums, target_sum, dtype=<class 'numpy.float32'>)#
Compute row scaling factors for normalization.
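To illustrate how row sums and scaling factors typically combine in count normalization, here is a plain-Python sketch. It assumes each factor is `target_sum / row_sum` and that all-zero rows receive a factor of 1.0; neither detail is confirmed by this page, and the library operates on NumPy arrays rather than lists.

```python
def sketch_row_sums(rows):
    """Sum each row of a list-of-lists matrix."""
    return [float(sum(row)) for row in rows]


def sketch_scaling_factors(row_sums, target_sum):
    """Per-row multiplier that brings each row's total to target_sum."""
    # Assumption: empty rows keep a neutral factor to avoid division by zero.
    return [target_sum / s if s != 0 else 1.0 for s in row_sums]


counts = [[1, 3], [0, 0], [2, 2]]
sums = sketch_row_sums(counts)
factors = sketch_scaling_factors(sums, target_sum=10.0)
# Apply each row's factor to every value in that row.
normalized = [[v * f for v in row] for row, f in zip(counts, factors)]
```

After scaling, every non-empty row sums to `target_sum`, while the empty row stays all zeros.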
- scxpand.util.general_util.convert_enums_to_values(obj)#
Recursively convert enum objects to their string values for JSON serialization and logging.
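A minimal sketch of this kind of recursive conversion follows. The `Optimizer` enum and the `sketch_convert_enums` name are hypothetical, and the choices to convert dictionary keys and to turn tuples into lists are assumptions rather than documented behavior.

```python
import json
from enum import Enum


class Optimizer(Enum):  # hypothetical enum, for illustration only
    ADAM = "adam"
    SGD = "sgd"


def sketch_convert_enums(obj):
    """Replace Enum members with their .value, walking dicts and sequences."""
    if isinstance(obj, Enum):
        return obj.value
    if isinstance(obj, dict):
        # Assumption: enum keys are converted as well as enum values.
        return {sketch_convert_enums(k): sketch_convert_enums(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        # Assumption: tuples become lists, which is what JSON would produce anyway.
        return [sketch_convert_enums(v) for v in obj]
    return obj


config = {"optimizer": Optimizer.ADAM, "stages": [Optimizer.SGD, 3]}
converted = sketch_convert_enums(config)
as_json = json.dumps(converted)  # would raise TypeError on the raw config
```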
- scxpand.util.general_util.copy_array_like(x, copy=True)#
Create a copy of an array-like object if requested.
- Parameters:
x – Array-like object (numpy array, torch tensor, sparse matrix)
copy (bool, default: True) – Whether to create a copy
- Returns:
Copy or reference to the original array
- scxpand.util.general_util.decisions_to_probabilities(decisions)#
Convert raw decision_function scores to probability estimates.
If decisions is 1-D (binary classifier) we apply a sigmoid. If it is 2-D (multi-class) we apply a numerically-stable softmax and return the probability of the positive / class-1 column (or column-0 if it is the only one). This mirrors the logic used during training.
- Return type:
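The sigmoid and softmax rules described above can be sketched in plain Python as follows. This is an illustration of the stated logic under the assumption that inputs are lists (1-D) or lists of lists (2-D); the library itself presumably works on arrays, and the `sketch_*` and helper names are hypothetical.

```python
import math


def _sigmoid(z):
    """Numerically stable logistic function."""
    # Branch on the sign so math.exp never receives a large positive argument.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def _softmax(row):
    """Numerically stable softmax: subtract the row maximum before exponentiating."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def sketch_decisions_to_probabilities(decisions):
    """1-D scores -> sigmoid; 2-D scores -> softmax, keeping the class-1 column."""
    if not decisions:
        return []
    if isinstance(decisions[0], (list, tuple)):
        probs = [_softmax(row) for row in decisions]
        # Use column 1 when it exists, otherwise fall back to the only column.
        return [p[1] if len(p) > 1 else p[0] for p in probs]
    return [_sigmoid(z) for z in decisions]
```

Subtracting the row maximum leaves the softmax result unchanged but keeps `math.exp` from overflowing on large scores.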
- scxpand.util.general_util.ensure_numpy_array(x)#
Ensure the input is converted to a NumPy array.
Handles PyTorch tensors, sparse matrices, and other array-like objects.
- Parameters:
x – Array-like object (numpy array, torch tensor, sparse matrix, etc.)
- Return type:
- Returns:
NumPy array
- scxpand.util.general_util.flatten_dict(d, parent_key='', sep='/')#
Flatten a nested dictionary, preserving the hierarchy in the keys.
- scxpand.util.general_util.flatten_nested_dict(nested_dict, parent_key='')#
Convert a nested dictionary to a flattened dictionary with keys in the format ‘key1/key2/…’.
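As an illustration of the 'key1/key2/…' key format, the following sketch flattens a nested dictionary recursively. It is not the library's implementation; the `sketch_flatten` name is hypothetical, and dropping empty nested dictionaries is an assumption.

```python
def sketch_flatten(d, parent_key="", sep="/"):
    """Flatten nested dicts into one level, joining keys with sep."""
    flat = {}
    for key, value in d.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
        if isinstance(value, dict):
            # Recurse, carrying the accumulated key path as the new prefix.
            flat.update(sketch_flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


metrics = {"val": {"auc": 0.91, "per_class": {"a": 0.8}}, "epoch": 3}
flat_metrics = sketch_flatten(metrics)
```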
- scxpand.util.general_util.floats_to_str(a, precision=5)#
Convert a numeric float value to a string, with a given precision.
If the input is a data structure, convert all float elements in it to strings.
- scxpand.util.general_util.format_float(x, precision=4, threshold=0.001)#
Format a float using fixed-point notation unless it is very small (but nonzero), in which case scientific notation is used.
For scientific notation, trailing zeros in the significand and unnecessary zeros in the exponent are removed. For example, 5.000e-5 is formatted as 5e-5.
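The formatting rule above can be sketched as follows. This is an illustrative reimplementation based only on the description: the `sketch_format_float` name is hypothetical, and keeping trailing zeros in the fixed-point branch is an assumption.

```python
def sketch_format_float(x, precision=4, threshold=0.001):
    """Fixed-point for ordinary values, compact scientific notation for tiny ones."""
    if x != 0 and abs(x) < threshold:
        mantissa, exponent = f"{x:.{precision}e}".split("e")
        # Strip trailing zeros (and a dangling decimal point) from the significand.
        mantissa = mantissa.rstrip("0").rstrip(".")
        # int() drops the sign padding and leading zeros of the exponent ("-05" -> -5).
        return f"{mantissa}e{int(exponent)}"
    # Assumption: the fixed-point branch keeps all `precision` decimals.
    return f"{x:.{precision}f}"
```

For instance, `sketch_format_float(5e-5)` gives `'5e-5'`, matching the example in the description, while `sketch_format_float(0.5)` gives `'0.5000'`.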
- scxpand.util.general_util.get_device()#
Automatically detect and return the best available device for PyTorch.
Checks for available hardware acceleration in order of preference: CUDA (NVIDIA) > MPS (Apple Silicon) > XPU (Intel) > CPU.
- Returns:
Device string: ‘cuda’, ‘mps’, ‘xpu’, or ‘cpu’.
Example
>>> device = get_device()
>>> print(f"Using device: {device}")
>>> model = model.to(device)
- scxpand.util.general_util.get_elapsed_time_str(t0, t1)#
- scxpand.util.general_util.get_last_git_commit_link()#
Get the last git commit link from the remote repository.
- Return type:
- scxpand.util.general_util.get_new_version_path(save_path, start_index=1)#
Create a versioned directory path to avoid overwriting existing results.
If the target path already exists and contains files, creates a new versioned directory (e.g., ‘results_v_1’, ‘results_v_2’) to preserve existing data.
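The versioning scheme can be sketched with `pathlib` as below. The sketch only computes the path; whether the library also creates the directory is not stated here. The helper names are hypothetical, and the `_v_<index>` suffix is taken from the example names above.

```python
import tempfile
from pathlib import Path


def _has_files(path):
    """True when path is an existing directory with at least one entry."""
    return path.is_dir() and any(path.iterdir())


def sketch_new_version_path(save_path, start_index=1):
    """Return save_path if it is free, else the first free '<name>_v_<i>' sibling."""
    path = Path(save_path)
    if not _has_files(path):
        return path
    index = start_index
    while True:
        candidate = path.with_name(f"{path.name}_v_{index}")
        if not _has_files(candidate):
            return candidate
        index += 1


with tempfile.TemporaryDirectory() as tmp:
    results = Path(tmp) / "results"
    first = sketch_new_version_path(results).name   # nothing exists yet
    results.mkdir()
    (results / "metrics.json").write_text("{}")
    second = sketch_new_version_path(results).name  # "results" is now occupied
    v1 = results.with_name("results_v_1")
    v1.mkdir()
    (v1 / "metrics.json").write_text("{}")
    third = sketch_new_version_path(results).name   # "results_v_1" is occupied too
```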
- scxpand.util.general_util.load_and_override_params(param_class, config_path=None, logger=None, **kwargs)#
Load parameters from config file or use defaults, then apply overrides.
- Parameters:
- Return type:
TypeVar(T, bound=BaseParams)
- Returns:
The parameter object with overrides applied
- scxpand.util.general_util.load_params(save_path)#
Load model parameters from a saved JSON file.
Loads hyperparameters and configuration from training results directory.
- Parameters:
save_path (Path | str) – Path to directory containing ‘parameters.json’ file.
- Return type:
- Returns:
Dictionary containing all saved parameters.
Example
>>> params = load_params("results/model_001")
>>> print(f"Learning rate: {params['init_learning_rate']}")
- scxpand.util.general_util.log_inference_progress(current_iteration, total_iterations, start_time, log_interval=20, logger_instance=None)#
Log progress during inference or processing operations.
- Parameters:
current_iteration (int) – Current iteration number (0-indexed)
total_iterations (int) – Total number of iterations
start_time (float) – Start time of the process (from time.time())
log_interval (int, default: 20) – Log every N iterations
logger_instance (BoundLogger | None, default: None) – Logger object (default: module logger)
- Return type:
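A rough sketch of interval-based progress reporting is shown below. The message format, the extra `now` parameter (added so the example is deterministic), the choice to always report the final iteration, and the function name are all assumptions for illustration, not the library's behavior.

```python
import time


def sketch_progress_message(current_iteration, total_iterations, start_time,
                            log_interval=20, now=None):
    """Return a progress line every log_interval iterations, else None."""
    done = current_iteration + 1  # iterations are 0-indexed
    # Assumption: also report on the final iteration, whatever the interval.
    if done % log_interval != 0 and done != total_iterations:
        return None
    now = time.time() if now is None else now
    elapsed = now - start_time
    # Estimate remaining time from the average time per completed iteration.
    remaining = elapsed / done * (total_iterations - done)
    percent = 100 * done / total_iterations
    return (f"{done}/{total_iterations} ({percent:.1f}%), "
            f"elapsed {elapsed:.1f}s, remaining ~{remaining:.1f}s")


message = sketch_progress_message(19, 100, start_time=0.0, now=10.0)
```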
- scxpand.util.general_util.log_nested_metrics(metrics, logger_func, prefix='', group='validation', score_metric=None, epoch=None, use_table_format=True)#
Log nested metrics with hierarchical display and highlighted score metric.
- Parameters:
metrics (dict) – Nested dictionary of metrics to log
logger_func – Logger function (e.g., logger.info)
prefix (str, default: '') – Prefix for log messages
group (str, default: 'validation') – Group name for the metrics (e.g., “validation”, “test”)
score_metric (str | None, default: None) – Key of the main score metric to highlight
epoch (int | None, default: None) – Optional epoch number to include in messages
use_table_format (bool, default: True) – If True, display metrics in table format instead of hierarchical
- Return type:
- scxpand.util.general_util.metrics_dict_to_dataframes(metrics, precision=4)#
Convert nested metrics dictionary to pandas DataFrames for nice display in notebooks.
- Parameters:
- Returns:
Tuple of (overall_df, category_df), where:
overall_df: DataFrame with overall metrics (Metric, Value columns)
category_df: DataFrame with category-specific metrics (categories as rows, metrics as columns)
Either DataFrame can be None if no data exists for that category.
- scxpand.util.general_util.metrics_dict_to_table(metrics, title='Metrics', precision=4)#
Convert nested metrics dictionary to a formatted table string using pandas.
- scxpand.util.general_util.nested_dict_to_flat_str(nested_scalars, omit_keys=None)#
Flatten a nested dictionary into a string with comma-separated values.
In case of a float, display it with 4 decimal places.
- Return type:
- scxpand.util.general_util.nested_dict_to_multiline_str(nested_scalars, indent=0, oneline_last_level=True)#
Recursively returns a string containing a nested dictionary of scalars in a hierarchically indented multi-line format.
Float values are displayed with improved formatting.
- Parameters:
- Returns:
The formatted multi-line string.
- Return type:
- scxpand.util.general_util.num2str(v)#
Convert a number to a string, with a fixed number of decimal places.
For floats, if their absolute value is small but nonzero, use scientific notation.
- Return type:
- scxpand.util.general_util.save_json_data(data, save_path)#
Save arbitrary dictionary data to a JSON file.
- scxpand.util.general_util.save_params(params, save_dir)#
Save parameters to a json file and save model type from parameter object.
- Parameters:
params (BaseParams) – Parameter object (must inherit from BaseParams and have get_model_type method)
save_dir (Path | str) – Directory where to save the parameters
- scxpand.util.general_util.time_to_str(t, fmt='%Y-%m-%d %H:%M:%S')#
Convert a datetime object to a string.
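The default `fmt` uses standard `strftime` codes. A minimal sketch, assuming the function simply delegates to `datetime.strftime` (the `sketch_time_to_str` name is hypothetical):

```python
from datetime import datetime


def sketch_time_to_str(t, fmt="%Y-%m-%d %H:%M:%S"):
    """Format a datetime using strftime codes."""
    return t.strftime(fmt)


stamp = sketch_time_to_str(datetime(2024, 1, 2, 3, 4, 5))
short = sketch_time_to_str(datetime(2024, 1, 2, 3, 4, 5), fmt="%H:%M")
```

Here `stamp` is `'2024-01-02 03:04:05'` and `short` is `'03:04'`.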
- scxpand.util.general_util.to_np(x)#
Convert tensor to numpy array.