To open the hyperparameter settings, click Edit underneath the Hyperparameters dropdown menu. A dialog opens where you can set custom hyperparameters.

1. Network Type

This tab defines the deep learning network we would like to use.

Network Selections

Options	Description
RCAN	For denoising and super-resolution. This is also the model used by our Nature Methods paper. https://www.nature.com/articles/s41592-021-01155-x
UNet	For virtual staining and segmentation. See "U-Net: Convolutional Networks for Biomedical Image Segmentation" (arXiv:1505.04597).

Network Shape

Options: 2D or 3D

Description: By default, choose a 2D model for 2D image data and a 3D model for 3D data. You can also train a 2D model on 3D data, in which case the image is processed slice by slice.
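The slice-by-slice behavior can be pictured with a minimal NumPy sketch. This is an illustration only, not Aivia's implementation: the function names and the (depth, height, width) axis order are assumptions, and `identity_model` is a stand-in for a trained 2D network.

```python
import numpy as np

def apply_2d_model_to_volume(volume, model_2d):
    """Run a 2D model on each z-slice of a (depth, height, width) volume."""
    # Each slice is processed independently, then restacked along the depth axis.
    return np.stack([model_2d(plane) for plane in volume], axis=0)

def identity_model(plane):
    # Hypothetical stand-in: any function mapping a 2D array to a 2D array.
    return plane

volume = np.random.rand(16, 256, 256)
output = apply_2d_model_to_volume(volume, identity_model)
print(output.shape)  # (16, 256, 256)
```

Because each slice is handled on its own, a 2D model sees no context from neighboring slices.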

RCAN Specific Parameters

Number of Filters

Default: 32

Description: Number of features (i.e. number of output channels of each convolution layer).

How to use it: Increase it to build a more complex model, reduce it for a smaller model.

Number of Residual Blocks

Default: 3

Description: Number of residual blocks in each residual group.

How to use it: Increase it to build a more complex model, reduce it for a smaller model.

Number of Residual Groups

Default: 3

Description: Number of residual groups.

How to use it: Increase it to build a more complex model, reduce it for a smaller model.

Channel Reduction Factor

Default: 8

Description: Channel reduction factor for the squeeze-and-excitation module.

How to use it: Increase channel reduction factor for better performance.
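As a rough guide to how the RCAN parameters interact, the sketch below computes two derived quantities. It assumes the standard RCAN layout, in which the squeeze-and-excitation bottleneck has (number of filters / channel reduction factor) channels; the function name and dictionary keys are hypothetical, and Aivia's internals may differ.

```python
def rcan_size_summary(num_filters=32, num_residual_blocks=3,
                      num_residual_groups=3, channel_reduction=8):
    """Derive simple size figures from the RCAN hyperparameters."""
    # Every residual group contains the same number of residual blocks.
    total_blocks = num_residual_groups * num_residual_blocks
    # Assumed: the squeeze-and-excitation module squeezes the feature
    # channels down by the reduction factor (at least one channel remains).
    bottleneck = max(num_filters // channel_reduction, 1)
    return {
        "total_residual_blocks": total_blocks,
        "attention_bottleneck_channels": bottleneck,
    }

print(rcan_size_summary())
# {'total_residual_blocks': 9, 'attention_bottleneck_channels': 4}
```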

UNet Specific Parameters

Depth

Default: 4

Description: Depth of UNet architecture (the number of down/up-sampling).

How to use it: Increase it to build a more complex model, reduce it for a smaller model.

Number of Initial Filter

Default: 64

Description: Number of filters in the first convolution layer.

How to use it: Increase it to build a more complex model, reduce it for a smaller model.

Filter Growth Factor

Default: 64

Description: Number of filters added/subtracted when down/up-sampling.

How to use it: Increase it to build a more complex model, reduce it for a smaller model.
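To see how Depth, Number of Initial Filter, and Filter Growth Factor combine, here is a small sketch. It assumes the growth factor is added once per down-sampling step and that the network has Depth + 1 resolution levels (including the bottleneck); the function name is hypothetical and the actual layer layout may differ.

```python
def unet_filters_per_level(depth=4, initial_filters=64, growth=64):
    """List the number of filters at each encoder level, top to bottom."""
    # Assumed: each down-sampling step adds `growth` filters.
    return [initial_filters + level * growth for level in range(depth + 1)]

print(unet_filters_per_level())  # [64, 128, 192, 256, 320]
```

Under these assumptions the decoder mirrors the list in reverse, subtracting the growth factor at each up-sampling step.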

Normalization Type

Default: None

Description: Normalization method applied in the residual block. Currently three methods ("batch", "instance", and "group") are supported. Note that the number of groups for group normalization is hard-coded to 16. No normalization is performed if None is given.

How to use it: Try different normalization methods to see which one works best for your dataset.

Channel Reduction Factor

Default: 8

Description: Channel reduction factor for the squeeze-and-excitation module.

How to use it: Increase channel reduction factor for better performance.

Use Attention Gate

Default: False

Description: If True, attention gates are applied to skip-connection signals.

How to use it: Attention gates let the model learn to focus on target structures of varying shapes and sizes. Train with this option on and off to see which works better for your dataset.

Activation Type at the Last Layer

Default: Sigmoid

Description: Activation function applied to the output.

How to use it: Try different last-layer activation functions to see which one works best for your dataset.

2. Training Parameters

This tab defines some general parameters and how Aivia updates the model weights during training.

Intensity Normalization Method

Options	Description	When to use
None	Use the raw input to train deep learning models	Choose this option if you want to use the original data to train or your input images have been normalized (Note: If the image is 8-bit or 16-bit, the scripts will error out and ask users to choose one of the normalization methods.)
Percentile	Normalizes the image intensity so that the 2nd and 99th percentiles are converted to 0 and 1 respectively.	Generally good for fluorescence images
Divide by Max	Divides the image by its maximum intensity value.	Useful for normalizing segmentation masks
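The two normalization methods in the table can be sketched in NumPy as follows. This is an illustrative approximation under stated assumptions: the function names are hypothetical, and whether Aivia clips values outside the percentile range is not stated in the table, so no clipping is applied here.

```python
import numpy as np

def percentile_normalize(image, low=2.0, high=99.0):
    """Map the `low` and `high` intensity percentiles to 0 and 1."""
    image = image.astype(np.float32)
    p_low, p_high = np.percentile(image, [low, high])
    # Guard against a flat image, where both percentiles are equal.
    return (image - p_low) / max(p_high - p_low, 1e-7)

def divide_by_max(image):
    """Scale the image so that its maximum intensity becomes 1."""
    image = image.astype(np.float32)
    return image / max(float(image.max()), 1e-7)

mask = np.array([0, 255, 255], dtype=np.uint8)
print(divide_by_max(mask))  # [0. 1. 1.]
```

Pixels brighter than the upper percentile end up above 1 in this sketch, and pixels darker than the lower percentile end up below 0.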

Data Augmentation

Options	Description	When to use
None	No augmentation	If you believe you have enough image pair samples
Rotate_and_flip	Randomly rotate and flip data to increase input data variety. Note that when this option is selected, you need to make sure the Block Size width and height are the same.	If you have a limited amount of data, allowing data augmentation generally gives you better results and prevents overfitting.
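A minimal sketch of rotate-and-flip augmentation, assuming rotations in multiples of 90 degrees within the image plane (the function name and the exact set of transforms are assumptions, not Aivia's code). It also shows why the block width and height must match: a 90-degree rotation swaps the two axes.

```python
import numpy as np

def rotate_and_flip_pair(source, target, rng):
    """Apply the same random rotation and flip to an input/target pair."""
    k = int(rng.integers(0, 4))    # number of 90-degree rotations
    flip = rng.random() < 0.5      # whether to mirror horizontally

    def transform(block):
        # Rotate in the (height, width) plane, i.e. the last two axes.
        block = np.rot90(block, k=k, axes=(-2, -1))
        return np.flip(block, axis=-1) if flip else block

    return transform(source), transform(target)

rng = np.random.default_rng(0)
src = np.arange(16.0).reshape(4, 4)
aug_src, aug_tgt = rotate_and_flip_pair(src, src.copy(), rng)

# A non-square block changes shape when rotated by 90 degrees.
print(np.rot90(np.zeros((4, 6))).shape)  # (6, 4)
```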

Block Size

Default: 256, 256, 16 (width, height, depth)

How to adjust: If your GPU is less capable, reduce each default by several pixels until training runs on your computer without out-of-memory errors. Do not make the block size too small; otherwise, the model may not have enough pixels/voxels to pass down the convolutional neural network.
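To get a feel for how small a block becomes inside a UNet, the sketch below assumes that each down-sampling step halves every dimension. This is a common design, but it is an assumption here, and the function name is hypothetical.

```python
def sizes_after_downsampling(block_size, depth=4):
    """Return the block size at each UNet level, assuming halving per level."""
    return [tuple(dim // 2 ** level for dim in block_size)
            for level in range(depth + 1)]

for size in sizes_after_downsampling((256, 256, 16), depth=4):
    print(size)
# (256, 256, 16)
# (128, 128, 8)
# (64, 64, 4)
# (32, 32, 2)
# (16, 16, 1)
```

Under this assumption, the default depth of 16 voxels is already reduced to a single voxel at the deepest level of a depth-4 network, so a smaller depth would leave nothing to down-sample.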

Foreground Patch Selection

Options	Description	When to use
Intensity threshold	If `intensity_threshold > 0`, pixels whose intensities are greater than this threshold are considered foreground.	Set the threshold when your images contain little foreground. Start with a small value such as 0.25.
Area ratio threshold	If `intensity_threshold > 0`, the generator calculates the ratio of foreground pixels in a target patch and rejects the patch if the ratio is smaller than this threshold.	Set the threshold when your images contain little foreground signal. Start with 0.05.
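The patch selection rule described in the table can be sketched as below. The function name and parameter names other than `intensity_threshold` are hypothetical, and the patch is assumed to be already normalized to the 0 to 1 range.

```python
import numpy as np

def keep_patch(target_patch, intensity_threshold=0.25, area_ratio_threshold=0.05):
    """Decide whether a target patch contains enough foreground to train on."""
    if intensity_threshold <= 0:
        # Foreground selection is disabled: every patch is accepted.
        return True
    # Fraction of pixels brighter than the intensity threshold.
    foreground_ratio = np.mean(target_patch > intensity_threshold)
    # Reject the patch if the ratio is smaller than the area ratio threshold.
    return bool(foreground_ratio >= area_ratio_threshold)

empty = np.zeros((10, 10))
sparse = np.zeros((10, 10))
sparse[0, :] = 1.0  # 10% of the pixels are foreground

print(keep_patch(empty))   # False
print(keep_patch(sparse))  # True
```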

Optimizer

The optimizer is used for updating model weights. Generally, Adam is good for all kinds of tasks.

Default: Adam

Options: sgd, rmsprop, adagrad, adadelta, adamax, nadam

Initial Learning Rate

Default: 0.0001

How to adjust: Reduce it if the model is overfitting.

Learning Rate Scheduling Method

Options	Description	When to use
Staircase exponential decay	Drops the learning rate by half every `100` epochs.	Default
Exponential Decay	Exponentially reduce the learning rate on every epoch using the function: learning_rate = learning_rate*0.5^(epoch/100)	If staircase exponential decay does not work for your model
Reduce on Plateau	Reduce learning rate to 0.1*learning_rate when validation loss has stopped improving for more than 10 epochs.	For models that are harder to train.
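The difference between the two decay schedules is easiest to see side by side. The sketch below reads the formula in the table as being applied to the initial learning rate, which is an assumption; the function names are hypothetical.

```python
def staircase_decay(initial_lr, epoch, factor=0.5, step=100):
    """Halve the learning rate once every `step` epochs."""
    return initial_lr * factor ** (epoch // step)

def exponential_decay(initial_lr, epoch, factor=0.5, step=100):
    """Shrink the learning rate smoothly on every epoch."""
    return initial_lr * factor ** (epoch / step)

for epoch in (0, 50, 100, 200):
    print(epoch, staircase_decay(1e-4, epoch), exponential_decay(1e-4, epoch))
```

Both schedules agree at multiples of 100 epochs. Between those points the staircase version holds the rate constant, while the exponential version keeps lowering it.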

Early Stopping

Default: False

How to use: Check this option if you want to stop training when the validation loss has stopped improving for more than 10 epochs.
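The stopping rule can be sketched as a simple patience counter. This is an illustration of the idea, not Aivia's implementation: the function name is hypothetical, and "improving" is taken to mean a strictly lower validation loss than the best seen so far.

```python
def early_stop_epoch(val_losses, patience=10):
    """Return the epoch index at which training would stop, or None."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            # Stop once the loss has failed to improve for more than `patience` epochs.
            if epochs_without_improvement > patience:
                return epoch
    return None

# The best loss occurs at epoch 1, followed by 20 epochs without improvement.
losses = [1.0, 0.5] + [0.6] * 20
print(early_stop_epoch(losses))  # 12
```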

Batch Size

Default: 1

How to adjust: Increase it to speed up training if your GPU has more memory.

Number of Epochs

Default: 300

How to adjust: Reduce it to shorten training time. Increase it if the model still has room to improve and is not overfitting.

Steps Per Epochs

Default: 256

Description: In each epoch, steps*batch_size examples are given to the model to update the weights.

How to adjust: Increase it if you want your models to see more examples per epoch.
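As a quick piece of arithmetic, the number of training examples implied by the defaults on this tab can be worked out as follows (the function name is hypothetical):

```python
def examples_seen(steps_per_epoch=256, batch_size=1, num_epochs=300):
    """Return (examples per epoch, examples over the whole training run)."""
    per_epoch = steps_per_epoch * batch_size
    return per_epoch, per_epoch * num_epochs

print(examples_seen())  # (256, 76800)
```

Doubling either the batch size or the steps per epoch doubles the number of examples the model sees in each epoch.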

Loss Function

The objective function that the optimizer tries to minimize when updating model weights. Usually, the lower the loss, the better the results.

Options	Description	When to use
Mean absolute error	Measures the mean absolute error (MAE) between each element in the Prediction (Pred) and Ground Truth (GT).	The default for Denoising, Super-Resolution, and Virtual Staining
balanced binary cross-entropy	Weighted version of the binary cross-entropy loss for imbalanced data.	Default for Segmentation
Mean squared error	Measures the mean squared error (MSE) between each element in the Prediction (Pred) and Ground Truth (GT).	More sensitive to outliers than mean absolute error.
binary cross-entropy	BCE compares each of the predicted probabilities to the binary Ground Truth.	Good for segmentation, only when the data is balanced
dice loss	Based on the Dice coefficient 2*(Pred ∩ GT) / (Pred + GT), which measures the overlap between Pred and GT.	Also good for imbalanced data
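For reference, the textbook forms of four of these losses can be written in NumPy as below. These are generic definitions, not Aivia's code. The dice loss is written as one minus the Dice coefficient, a common convention that is assumed here, and the balanced binary cross-entropy is omitted because its weighting scheme is not described above.

```python
import numpy as np

def mean_absolute_error(pred, gt):
    return float(np.mean(np.abs(pred - gt)))

def mean_squared_error(pred, gt):
    return float(np.mean((pred - gt) ** 2))

def binary_cross_entropy(pred, gt, eps=1e-7):
    # Clip predictions so that log(0) never occurs.
    pred = np.clip(pred, eps, 1.0 - eps)
    return float(-np.mean(gt * np.log(pred) + (1.0 - gt) * np.log(1.0 - pred)))

def dice_loss(pred, gt, eps=1e-7):
    # Soft intersection: element-wise product of prediction and ground truth.
    intersection = np.sum(pred * gt)
    dice = (2.0 * intersection + eps) / (np.sum(pred) + np.sum(gt) + eps)
    return float(1.0 - dice)

gt = np.array([0.0, 0.0])
pred = np.array([0.0, 2.0])  # one large error (an outlier)
print(mean_absolute_error(pred, gt))  # 1.0
print(mean_squared_error(pred, gt))   # 2.0
```

The last two lines illustrate the outlier sensitivity noted in the table: squaring the single large error doubles its contribution compared with the absolute error.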

Metrics

Options	Description	When to use
PSNR	Computes the peak signal-to-noise ratio between two images. Note that the maximum signal value is assumed to be 1.	Denoising, Super-Resolution, and Virtual Staining
SSIM	Computes the structural similarity index between two images. Note that the maximum signal value is assumed to be 1.	Denoising, Super-Resolution, and Virtual Staining
Accuracy	Correct outputs/Total outputs	Segmentation
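Two of these metrics are short enough to sketch directly. The PSNR below uses a maximum signal value of 1, as noted in the table; the accuracy function binarizes both images at 0.5, which is an assumption made for this illustration. SSIM is left out because it needs a windowed computation that would not fit a short example.

```python
import numpy as np

def psnr(pred, gt, max_value=1.0):
    """Peak signal-to-noise ratio in decibels."""
    mse = np.mean((pred - gt) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))

def accuracy(pred, gt, threshold=0.5):
    """Fraction of pixels classified the same way as the ground truth."""
    return float(np.mean((pred > threshold) == (gt > threshold)))

pred_mask = np.array([0.9, 0.2, 0.7, 0.1])
gt_mask = np.array([1.0, 0.0, 0.0, 0.0])
print(accuracy(pred_mask, gt_mask))  # 0.75
```

Because the maximum signal value is fixed at 1, PSNR values are only meaningful when both images have been normalized to the 0 to 1 range.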

3. Apply Parameters

This tab defines how the trained model is applied to your images.

Intensity Normalization Method

It should be the same as the intensity normalization method in Training parameters.

Block Size

Unless your GPU can process a larger block at a time, use the same block size as in Training Parameters. Do not choose a block size that is smaller than the training block size, because the neural network will not have enough information to pass down.

Block Overlap Size

The overlap sizes between neighboring blocks.
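One way to picture block overlap is to compute where each block starts along a single image axis. The sketch below is a generic tiling scheme under stated assumptions, not Aivia's algorithm: the function name is hypothetical, and the last block is shifted back so that it ends exactly at the image border.

```python
def block_starts(image_size, block_size, overlap):
    """Return the start positions of overlapping blocks along one axis."""
    if block_size >= image_size:
        return [0]  # a single block covers the whole axis
    stride = block_size - overlap
    if stride <= 0:
        raise ValueError("overlap must be smaller than block_size")
    starts = list(range(0, image_size - block_size, stride))
    # Final block is aligned with the end of the image.
    starts.append(image_size - block_size)
    return starts

print(block_starts(600, 256, 32))  # [0, 224, 344]
```

In this example, neighboring blocks share at least 32 pixels, which gives the application step room to blend or discard predictions near block borders.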

Aivia Wiki

Deep Learning Hyperparameters Settings
