Deep Learning Hyperparameters Settings

To open the hyperparameter settings, click Edit beneath the Hyperparameters dropdown menu. A dialog opens where you can set custom hyperparameters.

1. Network Type

This tab defines the deep learning network to use.

Network Selections

RCAN

Description: For denoising and super-resolution. This is also the model used in our Nature Methods paper: https://www.nature.com/articles/s41592-021-01155-x

UNet

Description: For virtual staining and segmentation. Reference: U-Net: Convolutional Networks for Biomedical Image Segmentation (arXiv:1505.04597).

Network Shape

Options: 2D or 3D

Description: Choose a 2D model for 2D image data and a 3D model for 3D data. You can also train a 2D model on 3D data, in which case the image is processed slice by slice.

RCAN Specific Parameters

Number of Filters

Default: 32

Description: Number of features (i.e. number of output channels of each convolution layer).

How to use it: Increase it for a more complex model; reduce it for a smaller model.

Number of Residual Blocks

Default: 3

Description: Number of residual blocks in each residual group.

How to use it: Increase it for a more complex model; reduce it for a smaller model.

Number of Residual Groups

Default: 3

Description: Number of residual groups.

How to use it: Increase it for a more complex model; reduce it for a smaller model.

Channel Reduction Factor

Default: 8

Description: Channel reduction factor for the squeeze-and-excitation module.

How to use it: Increase channel reduction factor for better performance.

UNet Specific Parameters

Depth

Default: 4

Description: Depth of the UNet architecture (the number of down/up-sampling steps).

How to use it: Increase it to build a more complex model; reduce it for a smaller model.

Number of Initial Filter

Default: 64

Description: Number of filters in the first convolution layer.

How to use it: Increase it to build a more complex model; reduce it for a smaller model.

Filter Growth Factor

Default: 64

Description: Number of filters added/subtracted when down/up-sampling.

How to use it: Increase it to build a more complex model; reduce it for a smaller model.
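
As a rough illustration of how Depth, Number of Initial Filter, and Filter Growth Factor interact, the sketch below lists the filter count and block width at each level of the down-sampling path. It assumes that each down-sampling step halves the spatial size and adds the growth factor to the filter count; the function name `unet_level_summary` is hypothetical and this is not Aivia's implementation.

```python
def unet_level_summary(depth=4, initial_filters=64, growth=64, block_size=256):
    """Return (level, filters, size) tuples for the down-sampling path.

    Assumptions (illustrative only): every down-sampling step halves the
    spatial size and adds `growth` filters to the previous level.
    """
    levels = []
    for level in range(depth + 1):
        filters = initial_filters + level * growth
        size = block_size // (2 ** level)
        levels.append((level, filters, size))
    return levels


for level, filters, size in unet_level_summary():
    print(f"level {level}: {filters} filters, {size} px per side")
```

Under these assumptions, a block dimension of 16 shrinks to a single pixel after four down-sampling steps, which is one way to see why very small block sizes can be a problem for deeper models.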

Normalization Type

Default: None

Description: Normalization method applied in the residual block. Currently three methods ("batch", "instance", and "group") are supported. Note that the number of groups for group normalization is hard-coded to 16. No normalization is performed if None is given.

How to use it: Try different normalization methods to see which works best for your dataset.

Channel Reduction Factor

Default: 8

Description: Channel reduction factor for the squeeze-and-excitation module. See

How to use it: Increase channel reduction factor for better performance.

Use Attention Gate

Default: False

Description: If True, attention gates are applied to skip-connection signals.

How to use it: Attention gates let the model learn to focus on target structures of varying shapes and sizes. Train with this option on and off to see which works better for your dataset.

Activation Type at the Last Layer

Default: Sigmoid

Description: Activation function applied to the output.

How to use it: Try different last-layer activation functions to see which works best for your dataset.

2. Training Parameters

This tab defines general parameters and how Aivia updates the model weights during training.

Intensity Normalization Method

None

Description: Use the raw input to train deep learning models.

When to use: Choose this option if you want to train on the original data or if your input images have already been normalized. (Note: If the image is 8-bit or 16-bit, the scripts will raise an error and ask you to choose one of the normalization methods.)

Percentile

Description: Normalizes the image intensity so that the 2nd and 99th percentiles are mapped to 0 and 1, respectively.

When to use: Generally good for fluorescence images.

Divide by Max

Description: Normalizes images by dividing by the maximum intensity value.

When to use: Useful for normalizing segmentation masks.
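
The Percentile and Divide by Max options can be sketched with NumPy as follows. This is a minimal illustration, not Aivia's code: the function names and the small `eps` term are assumptions, and whether values outside the percentile range are clipped is not stated here, so the sketch leaves them unclipped.

```python
import numpy as np


def percentile_normalize(image, low=2.0, high=99.0, eps=1e-8):
    """Map the `low` percentile to 0 and the `high` percentile to 1."""
    image = np.asarray(image, dtype=np.float64)
    p_low, p_high = np.percentile(image, [low, high])
    # eps (an assumption) avoids division by zero on constant images
    return (image - p_low) / (p_high - p_low + eps)


def divide_by_max(image, eps=1e-8):
    """Scale the image by its maximum intensity value."""
    image = np.asarray(image, dtype=np.float64)
    return image / (image.max() + eps)


ramp = np.arange(101)             # intensities 0..100
normalized = percentile_normalize(ramp)
mask = divide_by_max(np.array([0, 255, 255, 0]))   # 8-bit mask scaled to 0..1
```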

 

Data Augmentation

None

Description: No augmentation.

When to use: If you believe you have enough image pair samples.

Rotate_and_flip

Description: Randomly rotates and flips data to increase input data variety. When this option is selected, make sure the Block Size width and height are the same.

When to use: If you have a limited amount of data, data augmentation generally gives better results and helps prevent overfitting.
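
A minimal NumPy sketch of a rotate-and-flip augmentation is shown below. It assumes arrays laid out as (..., height, width) and applies the same random transform to the input and the target so the pair stays aligned; the function name and the exact set of transforms are assumptions, not Aivia's implementation. Because a 90-degree rotation swaps height and width, a non-square block would change shape, which is consistent with the requirement that the Block Size width and height match.

```python
import numpy as np


def rotate_and_flip(source, target, rng):
    """Apply one random 90-degree rotation and optional flip to a pair."""
    k = int(rng.integers(0, 4))        # number of 90-degree rotations
    flip = bool(rng.integers(0, 2))    # whether to mirror left-right
    augmented = []
    for arr in (source, target):
        arr = np.rot90(arr, k=k, axes=(-2, -1))
        if flip:
            arr = np.flip(arr, axis=-1)
        augmented.append(np.ascontiguousarray(arr))
    return augmented[0], augmented[1]


rng = np.random.default_rng(0)
source = np.arange(16).reshape(4, 4)
target = source.copy()
aug_source, aug_target = rotate_and_flip(source, target, rng)
```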

Block Size

Default: 256, 256, 16 (width, height, depth)

How to adjust: If your GPU has limited memory, reduce each dimension gradually until training runs without out-of-memory errors. Do not make the block size too small; otherwise, the model may not have enough pixels/voxels to pass down through the convolutional neural network.

Foreground Patch Selection

Intensity threshold

Description: If intensity_threshold > 0, pixels whose intensities are greater than this threshold are considered foreground.

When to use: Set the threshold when your images contain little foreground. Try starting with a small number such as 0.25.

Area ratio threshold

Description: If intensity_threshold > 0, the generator calculates the ratio of foreground pixels in a target patch and rejects the patch if the ratio is smaller than this threshold.

When to use: Set the threshold when your images contain little foreground signal. Try starting with 0.05.
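
The two thresholds work together, and the acceptance test can be sketched as below. This is an illustration under assumptions: the function name `accept_patch` is hypothetical, and the patch is assumed to be already normalized so that a threshold such as 0.25 is meaningful.

```python
import numpy as np


def accept_patch(target_patch, intensity_threshold=0.25, area_ratio_threshold=0.05):
    """Return True if the patch contains enough foreground to be used."""
    if intensity_threshold <= 0:
        return True  # foreground selection is disabled
    foreground = np.asarray(target_patch) > intensity_threshold
    ratio = foreground.mean()          # fraction of foreground pixels
    return bool(ratio >= area_ratio_threshold)


patch = np.zeros((10, 10))
patch[:1, :] = 1.0                      # 10 of 100 pixels are foreground
print(accept_patch(patch))              # ratio 0.10 is not below 0.05
print(accept_patch(np.zeros((10, 10)))) # no foreground at all
```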

Optimizer

The optimizer updates the model weights. Adam generally works well for most tasks.

Default: Adam

Other options: sgd, rmsprop, adagrad, adadelta, adamax, nadam

Initial Learning Rate

Default: 0.0001

How to adjust: Reduce it if the model overfits.

Learning Rate Scheduling Method

Staircase exponential decay

Description: Drops the learning rate by half every 100 epochs.

When to use: Default.

Exponential Decay

Description: Exponentially reduces the learning rate every epoch using the function: learning_rate = initial_learning_rate * 0.5^(epoch/100)

When to use: If staircase exponential decay does not work for your model.

Reduce on Plateau

Description: Reduces the learning rate to 0.1*learning_rate when the validation loss has not improved for more than 10 epochs.

When to use: For models that are harder to train.
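
The two decay schedules can be compared with a short sketch. It assumes that the exponential formula is applied to the initial learning rate rather than recursively to the current one, and the function names are hypothetical.

```python
def staircase_decay(initial_lr, epoch, drop=0.5, epochs_per_drop=100):
    """Halve the learning rate once every `epochs_per_drop` epochs."""
    return initial_lr * drop ** (epoch // epochs_per_drop)


def exponential_decay(initial_lr, epoch, drop=0.5, epochs_per_drop=100):
    """Decay smoothly so the rate is halved after `epochs_per_drop` epochs."""
    return initial_lr * drop ** (epoch / epochs_per_drop)


for epoch in (0, 50, 100, 250):
    print(epoch, staircase_decay(1e-4, epoch), exponential_decay(1e-4, epoch))
```

Both schedules agree at multiples of 100 epochs; between those points the staircase schedule holds the rate constant while the exponential schedule lowers it a little every epoch.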

Early Stopping

Default: False

How to use: Check this if you want to stop training when the validation loss has not improved for more than 10 epochs.

Batch Size

Default: 1

How to adjust: Increase it to speed up training if your GPU has more memory.

Number of Epochs

Default: 300

How to adjust: Reduce it to shorten training time. Increase it if the model still has room to improve and is not overfitting.

Steps Per Epochs

Default: 256

Description: In each epoch, steps*batch_size examples are given to the model to update the weights.

How to adjust: Increase it if you want your model to see more examples per epoch.
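
As a quick worked example, the snippet below computes how many examples the model sees per epoch and over a full run with the default settings listed on this page. The function name is hypothetical.

```python
def examples_seen(steps_per_epoch=256, batch_size=1, epochs=300):
    """Return (examples per epoch, examples over the whole training run)."""
    per_epoch = steps_per_epoch * batch_size
    return per_epoch, per_epoch * epochs


per_epoch, total = examples_seen()
print(per_epoch, total)  # 256 76800
```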

Loss Function

The objective function that the optimizer tries to minimize when updating model weights. Usually, the lower the loss, the better the results.

Mean absolute error

Description: Measures the mean absolute error (MAE) between each element in the Prediction (Pred) and Ground Truth (GT).

When to use: The default for Denoising, Super-Resolution, and Virtual Staining.

Balanced binary cross-entropy

Description: Weighted version of the binary cross-entropy loss for imbalanced data.

When to use: Default for Segmentation.

Mean squared error

Description: Measures the mean squared error (MSE) between each element in the Prediction (Pred) and Ground Truth (GT).

When to use: More sensitive to outliers than mean absolute error.

Binary cross-entropy

Description: Binary cross-entropy (BCE) compares each predicted probability to the binary Ground Truth.

When to use: Good for segmentation, but only when the data is balanced.

Dice loss

Description: Based on the Dice coefficient, 2*(Pred ∩ GT) / (Pred + GT). The loss is typically defined as 1 minus this coefficient, so that better overlap gives a lower loss.

When to use: Also good for imbalanced data.
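
The unweighted losses can be written in a few lines of NumPy, as sketched below. These are textbook formulations under assumptions, not Aivia's code: the Dice loss is taken as one minus a soft Dice coefficient, the `eps` terms are added for numerical stability, and the balanced binary cross-entropy is omitted because its weighting scheme is not described here.

```python
import numpy as np


def mean_absolute_error(pred, gt):
    pred, gt = np.asarray(pred, float), np.asarray(gt, float)
    return float(np.mean(np.abs(pred - gt)))


def mean_squared_error(pred, gt):
    pred, gt = np.asarray(pred, float), np.asarray(gt, float)
    return float(np.mean((pred - gt) ** 2))


def binary_cross_entropy(pred, gt, eps=1e-7):
    pred, gt = np.asarray(pred, float), np.asarray(gt, float)
    pred = np.clip(pred, eps, 1.0 - eps)   # keep log() finite
    return float(-np.mean(gt * np.log(pred) + (1.0 - gt) * np.log(1.0 - pred)))


def dice_loss(pred, gt, eps=1e-7):
    pred, gt = np.asarray(pred, float), np.asarray(gt, float)
    intersection = np.sum(pred * gt)       # soft version of Pred ∩ GT
    dice = (2.0 * intersection + eps) / (np.sum(pred) + np.sum(gt) + eps)
    return float(1.0 - dice)


pred = [0.9, 0.1, 0.8, 0.0]
gt = [1, 0, 1, 0]
print(mean_absolute_error(pred, gt), mean_squared_error(pred, gt))
print(binary_cross_entropy(pred, gt), dice_loss(pred, gt))
```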

Metrics

PSNR

Description: Computes the peak signal-to-noise ratio between two images. Note that the maximum signal value is assumed to be 1.

When to use: Denoising, Super-Resolution, and Virtual Staining.

SSIM

Description: Computes the structural similarity index between two images. Note that the maximum signal value is assumed to be 1.

When to use: Denoising, Super-Resolution, and Virtual Staining.

Accuracy

Description: Correct outputs / Total outputs.

When to use: Segmentation.
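
PSNR and accuracy are simple enough to sketch directly; SSIM is left out because it involves windowed statistics. The code below uses the standard PSNR definition with a maximum signal value of 1 and assumes a 0.5 threshold for turning predictions into binary labels; the function names and that threshold are assumptions, not Aivia's implementation.

```python
import numpy as np


def psnr(pred, gt, max_value=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, max_value]."""
    pred, gt = np.asarray(pred, float), np.asarray(gt, float)
    mse = np.mean((pred - gt) ** 2)
    if mse == 0:
        return float("inf")            # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))


def accuracy(pred, gt, threshold=0.5):
    """Fraction of pixels whose thresholded prediction matches the label."""
    pred, gt = np.asarray(pred, float), np.asarray(gt, float)
    return float(np.mean((pred >= threshold) == (gt >= threshold)))


clean = np.zeros((8, 8))
noisy = clean + 0.1                    # constant error of 0.1, so MSE = 0.01
print(psnr(noisy, clean))
print(accuracy([0.9, 0.2, 0.6, 0.4], [1, 0, 0, 0]))
```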

3. Apply Parameters

This tab defines how the trained model is applied to your images.

Intensity Normalization Method

Use the same intensity normalization method as in Training Parameters.

Block Size

Unless your GPU can process a larger block at a time, use the same block size as in Training Parameters. Do not choose a block size smaller than the training block size; otherwise, the neural network will not have enough information to pass down.

Block Overlap Size

The size of the overlap between neighboring blocks.
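
To illustrate how block size and overlap determine the tiling along one image axis, the sketch below computes the start position of each block. It is only one plausible tiling scheme: the function name is hypothetical, and how Aivia actually places blocks or blends the overlapping regions is not described here.

```python
def block_starts(length, block, overlap):
    """Start indices of blocks covering an axis of `length` pixels."""
    if overlap >= block:
        raise ValueError("overlap must be smaller than the block size")
    if length <= block:
        return [0]                      # a single block covers the axis
    stride = block - overlap            # distance between block starts
    starts = list(range(0, length - block + 1, stride))
    if starts[-1] != length - block:
        starts.append(length - block)   # final block aligned to the edge
    return starts


print(block_starts(600, 256, 32))   # [0, 224, 344]
print(block_starts(512, 256, 0))    # [0, 256]
```

A larger overlap means more blocks and more computation, but gives neighboring blocks more shared context at their borders.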