Aivia Software
Deep Learning Hyperparameters Settings
To open the hyperparameter settings, click Edit underneath the Hyperparameters dropdown menu. A dialog opens where you can set custom hyperparameters.
1. Network Type
This tab defines the deep learning network we would like to use.
Network Selections
Options | Description |
---|---|
RCAN | For denoising and super-resolution. This is also the model used by our Nature Methods paper. https://www.nature.com/articles/s41592-021-01155-x |
UNet | For virtual staining and segmentation. See "U-Net: Convolutional Networks for Biomedical Image Segmentation" (arXiv:1505.04597). |
Network Shape
Options: 2D or 3D
Description: By default, choose a 2D model for 2D image data and a 3D model for 3D data. You can also train a 2D model on 3D data, in which case the image is processed slice by slice.
RCAN Specific Parameters
Number of Filters
Default: 32
Description: Number of features (i.e. number of output channels of each convolution layer).
How to use it: Increase it for model complexity, reduce it for a smaller model.
Number of Residual Blocks
Default: 3
Description: Number of residual blocks in each residual group.
How to use it: Increase it for model complexity, reduce it for a smaller model.
Number of Residual Groups
Default: 3
Description: Number of residual groups.
How to use it: Increase it for model complexity, reduce it for a smaller model.
Channel Reduction Factor
Default: 8
Description: Channel reduction factor for the squeeze-and-excitation module.
How to use it: Increase channel reduction factor for better performance.
UNet Specific Parameters
Depth
Default: 4
Description: Depth of UNet architecture (the number of down/up-sampling).
How to use it: Increase it to build a more complex model, reduce it for a smaller model.
Number of Initial Filters
Default: 64
Description: Number of filters in the first convolution layer.
How to use it: Increase it to build a more complex model, reduce it for a smaller model.
Filter Growth Factor
Default: 64
Description: Number of filters added/subtracted when down/up-sampling.
How to use it: Increase it to build a more complex model, reduce it for a smaller model.
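The interaction between Depth, Number of Initial Filters, and Filter Growth Factor can be sketched as follows. This is a minimal illustration that assumes the filter count grows linearly by the growth factor at each down-sampling step, as the description above suggests. The function name `unet_filters_per_level` is hypothetical and is not part of Aivia.

```python
def unet_filters_per_level(depth=4, initial_filters=64, growth_factor=64):
    """Return the assumed filter count at each encoder level, top to bottom.

    Assumption: level 0 is the first convolution layer, and every
    down-sampling step adds `growth_factor` filters.
    """
    return [initial_filters + level * growth_factor for level in range(depth + 1)]


# With the documented defaults (Depth 4, 64 initial filters, growth 64):
print(unet_filters_per_level())  # [64, 128, 192, 256, 320]
```

Under this assumption, raising any of the three values increases the number of filters in the deepest layers, which is why each one increases model complexity.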
Normalization Type
Default: None
Description: Normalization method applied in the residual block. Currently three methods ("batch", "instance", and "group") are supported. Note that the number of groups for group normalization is hard-coded to 16. No normalization is performed if None is given.
How to use it: Try different normalization methods to see which works best for your dataset.
Channel Reduction Factor
Default: 8
Description: Channel reduction factor for the squeeze-and-excitation module. See
How to use it: Increase channel reduction factor for better performance.
Use Attention Gate
Default: False
Description: If True, attention gates are applied to skip-connection signals.
How to use it: Attention gates let the model learn automatically to focus on target structures of varying shapes and sizes. Try this option on and off to see which works better for your dataset.
Activation Type at the Last Layer
Default: Sigmoid
Description: Activation function applied to the output.
How to use it: Try different last-layer activation functions to see which works best for your dataset.
2. Training Parameters
This tab defines general parameters and how Aivia updates the model weights during training.
Intensity Normalization Method
Options | Description | When to use |
---|---|---|
None | Use the raw input to train deep learning models | Choose this option if you want to use the original data to train or your input images have been normalized (Note: If the image is 8-bit or 16-bit, the scripts will error out and ask users to choose one of the normalization methods.) |
Percentile | Normalizes the image intensity so that the 2nd and 99th percentiles are converted to 0 and 1 respectively. | Generally good for fluorescence images |
Divide by Max | Divides the image by its maximum intensity value. | Useful for normalizing segmentation masks |
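The two normalization options can be sketched with NumPy as below. This is an illustrative sketch only, not Aivia's implementation: the function names are hypothetical, and whether Aivia clips values outside the percentile range is not stated in this document, so no clipping is applied here.

```python
import numpy as np


def percentile_normalize(image, low=2.0, high=99.0):
    """Map the `low` percentile to 0 and the `high` percentile to 1."""
    image = np.asarray(image, dtype=np.float32)
    p_low, p_high = np.percentile(image, [low, high])
    if p_high <= p_low:
        # Constant image: avoid division by zero.
        return np.zeros_like(image)
    # Values outside the percentile range fall outside [0, 1] (no clipping).
    return (image - p_low) / (p_high - p_low)


def divide_by_max(image):
    """Scale the image so that its maximum value becomes 1."""
    image = np.asarray(image, dtype=np.float32)
    max_value = image.max()
    return image / max_value if max_value > 0 else image
```

For example, `divide_by_max` turns a segmentation mask with values 0 and 255 into a mask with values 0 and 1.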
Data Augmentation
Options | Description | When to use |
---|---|---|
None | No augmentation | If you believe you have enough image pair samples |
Rotate_and_flip | Randomly rotate and flip data to increase input data variety. Note that when this option is selected, you need to make sure the Block Size width and height are the same. | If you have a limited amount of data, allowing data augmentation generally gives you better results and prevents overfitting. |
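A rotate-and-flip augmentation can be sketched as follows, assuming rotations are multiples of 90 degrees (an assumption; the exact transforms Aivia applies are not listed here). The function name `rotate_and_flip` is hypothetical. The sketch also shows why the block width and height must match: a 90-degree rotation swaps the two axes, so a non-square block would change shape.

```python
import numpy as np


def rotate_and_flip(source, target, rng):
    """Apply the same random 90-degree rotation and flip to an image pair."""
    k = int(rng.integers(0, 4))        # number of 90-degree rotations: 0..3
    flip = bool(rng.integers(0, 2))    # whether to mirror horizontally
    # Rotate in the (height, width) plane, i.e. the last two axes.
    source = np.rot90(source, k, axes=(-2, -1))
    target = np.rot90(target, k, axes=(-2, -1))
    if flip:
        source = np.flip(source, axis=-1)
        target = np.flip(target, axis=-1)
    return source, target


rng = np.random.default_rng(0)
src = np.arange(16).reshape(4, 4)
aug_src, aug_tgt = rotate_and_flip(src, src * 2, rng)

# A single rotation of a non-square block swaps its height and width.
print(np.rot90(np.zeros((2, 3))).shape)  # (3, 2)
```

The input and ground-truth blocks receive the identical transform, so the pair stays aligned.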
Block Size
Default: 256, 256, 16 (width, height, depth)
How to adjust: If your GPU has limited memory, reduce each value by several pixels until training runs on your computer without out-of-memory errors. Do not make the block size too small; otherwise the model may not have enough pixels/voxels to pass down the convolutional neural network.
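To see why a very small block can be a problem for UNet, the sketch below estimates the block size that reaches the deepest layer. It assumes that each down-sampling step halves every axis, including depth for 3D models. That is a common UNet design but is an assumption here, and `bottleneck_size` is a hypothetical helper.

```python
def bottleneck_size(block_size=(256, 256, 16), depth=4):
    """Size of a block after `depth` halvings along every axis (assumed behaviour)."""
    factor = 2 ** depth
    return tuple(size // factor for size in block_size)


print(bottleneck_size())             # (16, 16, 1)
print(bottleneck_size((64, 64, 8)))  # (4, 4, 0): the depth axis vanishes
```

Under this assumption, the default block of 256, 256, 16 with a Depth of 4 leaves a single voxel along the depth axis, so reducing the depth dimension further would leave nothing to pass down.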
Foreground Patch Selection
Options | Description | When to use |
---|---|---|
Intensity threshold | If | Set this threshold when your images contain little foreground. Start with a small value such as 0.25. |
Area ratio threshold | If | Set this threshold when your images contain little foreground signal. Start with 0.05. |
Optimizer
The optimizer is used for updating model weights. Generally, Adam is good for all kinds of tasks.
Default: Adam
Options: sgd, rmsprop, adagrad, adadelta, adamax, nadam
Initial Learning Rate
Default: 0.0001
How to adjust: Reduce it if overfitting.
Learning Rate Scheduling Method
Options | Description | When to use |
---|---|---|
Staircase exponential decay | drop the learning rate by half every | Default |
Exponential Decay | Exponentially reduce the learning rate on every epoch using the function: learning_rate = learning_rate*0.5^(epoch/100) | If staircase exponential decay does not work for your model |
Reduce on Plateau | Reduce learning rate to 0.1*learning_rate when validation loss has stopped improving for more than 10 epochs. | For models that are harder to train. |
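Two of the schedules in the table can be sketched in plain Python. The exponential decay below reads the formula as applied to the initial learning rate, and the plateau rule follows the wording "more than 10 epochs" literally. Both are interpretations rather than Aivia's code, and the function names are hypothetical.

```python
def exponential_decay(initial_lr, epoch):
    """Learning rate after `epoch` epochs: initial_lr * 0.5^(epoch / 100)."""
    return initial_lr * 0.5 ** (epoch / 100)


def reduce_on_plateau(val_losses, initial_lr, patience=10, factor=0.1):
    """Replay a validation-loss history and return the resulting learning rate."""
    lr = initial_lr
    best = float("inf")
    wait = 0  # epochs since the last improvement
    for loss in val_losses:
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait > patience:  # no improvement for more than `patience` epochs
                lr *= factor
                wait = 0
    return lr


# With the default initial learning rate, the rate halves every 100 epochs.
for epoch in (0, 100, 200, 300):
    print(epoch, exponential_decay(1e-4, epoch))
```

With this reading, exponential decay reaches one eighth of the initial rate by the default 300th epoch, while Reduce on Plateau changes the rate only when validation loss stalls.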
Early Stopping
Default: False
How to use: Check this if you want to stop training when validation loss has stopped improving for more than 10 epochs.
Batch Size
Default: 1
How to adjust: Increase it to speed up training if your GPU has more memory.
Number of Epochs
Default: 300
How to adjust: Reduce it to shorten training time. Increase it if the model still has room to improve and is not overfitting.
Steps Per Epoch
Default: 256
Description: steps*batch_size examples will be given to the model to update the weights.
How to adjust: Increase it if you want your model to see more examples per epoch.
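The relationship between Batch Size, Steps Per Epoch, and Number of Epochs is simple arithmetic, sketched below with the documented defaults. The helper name `examples_seen` is hypothetical.

```python
def examples_seen(steps_per_epoch=256, batch_size=1, epochs=300):
    """Return (examples per epoch, examples over the whole training run)."""
    per_epoch = steps_per_epoch * batch_size
    return per_epoch, per_epoch * epochs


print(examples_seen())              # (256, 76800)
print(examples_seen(batch_size=4))  # (1024, 307200)
```

Raising the batch size therefore also raises the number of examples per epoch unless Steps Per Epoch is lowered to compensate.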
Loss Function
The goal function that the optimizer tries to minimize when updating model weights. Usually the lower the loss the better the results.
Options | Description | When to use |
---|---|---|
Mean absolute error | Measures the mean absolute error (MAE) between each element in the Prediction (Pred) and Ground Truth (GT). | The default for Denoising, Super-Resolution, and Virtual Staining |
balanced binary cross-entropy | Weighted version of binary cross-entropy loss for imbalanced data. | Default for Segmentation |
Mean squared error | Measures the mean squared error (MSE) between each element in the Prediction (Pred) and Ground Truth (GT). | More sensitive to outliers than mean absolute error. |
binary cross-entropy | BCE compares each predicted probability to the binary Ground Truth. | Good for segmentation, but only when the data is balanced |
dice loss | Based on the Dice coefficient, 2*|Pred ∩ GT| / (|Pred| + |GT|) | Also good for imbalanced data |
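The losses in the table can be sketched with NumPy as follows. These are textbook formulations, not Aivia's code. The Dice loss is written as 1 minus the Dice coefficient, which is a common convention and an assumption here. Balanced binary cross-entropy is omitted because its weighting scheme is not described in this document.

```python
import numpy as np


def mean_absolute_error(pred, gt):
    return float(np.mean(np.abs(pred - gt)))


def mean_squared_error(pred, gt):
    return float(np.mean((pred - gt) ** 2))


def binary_cross_entropy(pred, gt, eps=1e-7):
    # Clip predictions so that log(0) never occurs.
    pred = np.clip(pred, eps, 1.0 - eps)
    return float(-np.mean(gt * np.log(pred) + (1.0 - gt) * np.log(1.0 - pred)))


def dice_loss(pred, gt, eps=1e-7):
    # Soft intersection: element-wise product of prediction and ground truth.
    intersection = np.sum(pred * gt)
    dice = (2.0 * intersection + eps) / (np.sum(pred) + np.sum(gt) + eps)
    return float(1.0 - dice)  # assumed convention: loss = 1 - Dice coefficient


pred = np.array([0.0, 0.5, 1.0])
gt = np.array([0.0, 1.0, 1.0])
print(mean_absolute_error(pred, gt), mean_squared_error(pred, gt))
```

In this example the single error of 0.5 contributes 0.5 to the absolute sum but only 0.25 to the squared sum; errors larger than 1 would be amplified by squaring instead, which is why MSE is more sensitive to outliers.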
Metrics
Options | Description | When to use |
---|---|---|
PSNR | Computes the peak signal-to-noise ratio between two images. Note that the maximum signal value is assumed to be 1. | Denoising, Super-Resolution, and Virtual Staining |
SSIM | Computes the structural similarity index between two images. Note that the maximum signal value is assumed to be 1. | Denoising, Super-Resolution, and Virtual Staining |
Accuracy | Correct outputs/Total outputs | Segmentation |
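PSNR and accuracy can be sketched as below, using the stated assumption that the maximum signal value is 1. The 0.5 threshold used to binarize predictions for accuracy is an assumption of this sketch, and the function names are hypothetical. SSIM is omitted because it requires a windowed computation that does not fit a short example.

```python
import numpy as np


def psnr(pred, gt, max_value=1.0):
    """Peak signal-to-noise ratio in dB, assuming intensities in [0, max_value]."""
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    mse = np.mean((pred - gt) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))


def accuracy(pred, gt, threshold=0.5):
    """Fraction of pixels whose thresholded prediction matches the ground truth."""
    pred_mask = np.asarray(pred) >= threshold
    gt_mask = np.asarray(gt) >= threshold
    return float(np.mean(pred_mask == gt_mask))
```

Because the maximum is assumed to be 1, PSNR and SSIM values are only meaningful when the images have been normalized to that range.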
3. Apply Parameters
This tab defines how the trained model is applied to images.
Intensity Normalization Method
It should be the same as the intensity normalization method in Training parameters.
Block Size
Unless your GPU can process a larger block at a time, use the same block size as in Training Parameters. Do not choose a block size smaller than the training block size; otherwise the neural network will not have enough information to pass down.
Block Overlap Size
The overlap sizes between neighboring blocks.
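How Block Size and Block Overlap Size interact when an image is split into blocks can be sketched along one axis as follows. This is a hypothetical tiling scheme for illustration; how Aivia actually places and stitches blocks is not described here.

```python
def block_starts(image_size, block_size, overlap):
    """Start offsets of blocks covering one image axis with the given overlap."""
    if block_size >= image_size:
        return [0]  # a single block covers the whole axis
    stride = block_size - overlap
    if stride <= 0:
        raise ValueError("overlap must be smaller than block_size")
    starts = list(range(0, image_size - block_size, stride))
    # Place the last block flush with the image edge so nothing is left uncovered.
    starts.append(image_size - block_size)
    return starts


print(block_starts(600, 256, 32))  # [0, 224, 344]
```

In this sketch a larger overlap means a smaller stride and therefore more blocks to process, so a larger overlap increases processing time. The final block may overlap its neighbor by more than the requested amount in order to stay inside the image.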