Parallel Multi-Dimensional LSTM, With Application to Fast Biomedical Volumetric Image Segmentation
Abstract
Convolutional Neural Networks (CNNs) can be shifted across 2D images or 3D videos to segment them. They have a fixed input size and typically perceive only small local contexts of the pixels to be classified as foreground or background. In contrast, Multi-Dimensional Recurrent NNs (MD-RNNs) can perceive the entire spatio-temporal context of each pixel in a few sweeps through all pixels, especially when the RNN is a Long Short-Term Memory (LSTM). Despite these theoretical advantages, previous MD-LSTM variants, unlike CNNs, were hard to parallelize on GPUs. Here we re-arrange the traditional cuboid order of computations in MD-LSTM in a pyramidal fashion. The resulting PyraMiD-LSTM is easy to parallelize, especially for 3D data such as stacks of brain slice images. PyraMiD-LSTM achieved the best known pixel-wise brain image segmentation results on MRBrainS13 (and competitive results on EM-ISBI12).
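The parallelization argument can be illustrated with a small scheduling sketch in 2D. It is not the authors' implementation and contains no LSTM arithmetic; it only counts how many sequential steps a sweep needs under two dependency patterns. The patterns themselves are assumptions made for illustration: in the "cuboid" order each pixel waits for its upper and left neighbours, while in the "pyramidal" order each pixel waits only for three pixels in the previous row. The names `parallel_steps`, `CUBOID_DEPS`, and `PYRAMID_DEPS` are hypothetical.

```python
def parallel_steps(height, width, deps):
    """Return, for each pixel, the earliest step at which it can be computed.

    `deps` lists (row_offset, col_offset) pairs naming the pixels whose state
    must be available first. Offsets must point to pixels that come earlier
    in raster order (an assumption of this sketch).
    """
    step = [[0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            preds = [
                step[i + di][j + dj]
                for di, dj in deps
                if 0 <= i + di < height and 0 <= j + dj < width
            ]
            # A pixel with no in-bounds predecessors can start immediately.
            step[i][j] = max(preds) + 1 if preds else 0
    return step


# Assumed dependency patterns (illustrative, not taken from the abstract):
CUBOID_DEPS = [(-1, 0), (0, -1)]             # upper and left neighbour
PYRAMID_DEPS = [(-1, -1), (-1, 0), (-1, 1)]  # three pixels in the previous row

cuboid = parallel_steps(4, 6, CUBOID_DEPS)
pyramid = parallel_steps(4, 6, PYRAMID_DEPS)

# Number of sequential steps needed to sweep the whole 4 x 6 image.
n_cuboid = max(map(max, cuboid)) + 1
n_pyramid = max(map(max, pyramid)) + 1
print(n_cuboid, n_pyramid)
```

Under these assumptions the cuboid order has to advance along anti-diagonals, so the step count grows with height plus width, whereas the pyramidal order lets every pixel of a row be updated at once, so the step count equals the number of rows. In 3D the same idea would let a whole plane be processed in one step, which is the kind of workload that maps well onto GPUs.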