
DeepSkeleton: Learning Multi-task Scale-associated Deep Side Outputs for Object Skeleton Extraction in Natural Images

Abstract:

In this paper (arXiv:1609.03659) we present a novel fully convolutional network with multiple scale-associated side outputs to address the skeleton detection problem. By observing the relationship between the receptive field sizes of the different layers in the network and the skeleton scales they can capture, we introduce two scale-associated side outputs to each stage of the network. The network is trained by multi-task learning, where one task is skeleton localization, which classifies whether a pixel is a skeleton pixel or not, and the other is skeleton scale prediction, which regresses the scale of each skeleton pixel. Our method achieves promising results on two skeleton extraction datasets and significantly outperforms other competitors.


The Algorithm:

The proposed network architecture for skeleton extraction, converted from the VGG16 net. Our network has 4 stages with Scale-associated Side-Output (SSO) layers connected to the convolutional layers. Each of these SSOs can simultaneously detect the object skeleton and regress the skeleton scales.

The proposed algorithm is inspired by a simple observation: a neuron can only detect skeletons whose scale is smaller than its receptive field. We design the SSO (Scale-associated Side-Output) to detect object skeletons of various scales at different convolution stages. Furthermore, we develop a multi-task learning paradigm to detect the object skeleton and predict the skeleton scale at the same time.

The figure below illustrates the multi-task SSO at stage 2: the left blocks represent the skeleton detection SSO, and the right block represents the scale regression ScalePred-SSO. $a_{jk}^{(i)}$ indicates how likely pixel $j$ belongs to skeleton type $k$ at stage $i$, where the skeleton types are defined according to their scales; $\hat{S}_j^i$ is the predicted skeleton scale of pixel $j$ at stage $i$.
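To make the scale-associated supervision concrete, here is a minimal Python sketch of how such labels could be derived. The receptive-field sizes and function names are illustrative assumptions, not values taken from the paper or the released code: a skeleton pixel is quantized into class k when its scale fits within the receptive field of stage k, and at stage i any pixel whose class exceeds i is relabeled as background, since that stage cannot capture it.

    # Illustrative sketch (receptive-field sizes are placeholders, not from the
    # released code): quantize each skeleton pixel's scale into a stage-associated
    # class, then derive the per-stage supervision.

    RECEPTIVE_FIELDS = [14, 40, 92, 196]   # hypothetical sizes for stages 1..4

    def skeleton_class(scale):
        """Smallest stage whose receptive field covers `scale`, or 0 for background."""
        if scale <= 0:
            return 0                        # not a skeleton pixel
        for k, rf in enumerate(RECEPTIVE_FIELDS, start=1):
            if scale <= rf:
                return k
        return 0                            # scale exceeds every receptive field

    def stage_label(scale, stage):
        """Supervision for side-output `stage` (1-based): class k if detectable there, else 0."""
        k = skeleton_class(scale)
        return k if 0 < k <= stage else 0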


Performance Evaluation:

1. Skeleton detection

a): Qualitative illustration:

We show some detection results on SK-LARGE for several selected images, which show that our method outperforms other competitors by a significant margin.

Illustration of skeleton extraction results on SK-LARGE for several selected images. The ground-truth skeletons are in yellow and the thresholded extraction results are in red. Thresholds were optimized over the whole dataset.

b): Quantitative evaluation:

We evaluate the skeleton detection results with the widely applied F-measure $=\frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$.
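As a quick reference, the snippet below computes this F-measure and picks the dataset-wide optimal threshold from a sampled precision-recall curve; the numeric values are illustrative, not our reported results.

    import numpy as np

    def f_measure(precision, recall, eps=1e-12):
        """Harmonic mean of precision and recall."""
        return 2.0 * precision * recall / (precision + recall + eps)

    # Example: a PR curve sampled at several thresholds (illustrative values only).
    thresholds = np.linspace(0.1, 0.9, 9)
    precision  = np.array([0.95, 0.93, 0.90, 0.87, 0.83, 0.78, 0.72, 0.65, 0.55])
    recall     = np.array([0.40, 0.48, 0.55, 0.62, 0.68, 0.73, 0.78, 0.82, 0.86])

    f = f_measure(precision, recall)
    best = np.argmax(f)
    print("best F = %.3f at threshold %.2f" % (f[best], thresholds[best]))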

Performance comparison on the SK-LARGE dataset. LMSDS is our newest method with the multi-task, multi-side-output network described in the TIP paper; FSDS is the algorithm described in our CVPR paper.

2. Object segmentation:

Once we obtain the detected skeleton and its associated scale, we can easily recover the object segmentation by drawing inscribed circles centered at each skeleton point, with the predicted skeleton scale as the radius. This procedure can be formulated as follows: $$ \text{seg}_i = \begin{cases} 1 & \text{if } \text{distance}(x_i, \text{sk}_i) < \hat{s}_i \\ 0 & \text{otherwise} \end{cases} $$

where $\text{sk}_i$ is the nearest skeleton point to pixel $x_i$, $\text{distance}(x_i, x_j)$ is the Euclidean distance between two points, and $\hat{s}_i$ is the predicted skeleton scale of pixel $x_i$.
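A minimal NumPy/SciPy sketch of this recovery step is given below; it assumes a binary skeleton map and a per-pixel scale map as inputs, and the function and variable names are illustrative rather than taken from the released code.

    import numpy as np
    from scipy import ndimage

    def segmentation_from_skeleton(skeleton, scale):
        """Recover a binary object mask from a skeleton map and per-pixel scales.

        skeleton : (H, W) bool array, True at detected skeleton pixels
        scale    : (H, W) float array, predicted scale at skeleton pixels
        """
        # distance to the nearest skeleton pixel, plus the index of that pixel
        dist, (iy, ix) = ndimage.distance_transform_edt(
            ~skeleton, return_indices=True)
        # predicted scale of the nearest skeleton pixel, for every image pixel
        nearest_scale = scale[iy, ix]
        # seg_i = 1 iff distance(x_i, sk_i) < s_hat_i
        return dist < nearest_scale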

Recover object segmentation from detected skeleton and skeleton scale.

The figure below shows some object segmentation results on the SK-LARGE dataset.

Illustration of object segmentation on SK-LARGE for several selected images.


Code & Data:

We have released the full tool chain for skeleton detection and performance evaluation; you can access it on GitHub.

Code

How to use the code:

  1. Download the SK-LARGE dataset and perform data augmentation; refer to SK-LARGE for detailed steps;
  2. Clone the source code: 'git clone https://github.com/zeakey/DeepSkeleton', and customize your own Makefile.config to build the source. Do not use the official Caffe, because this code contains newly implemented layers that official Caffe does not include. The code is based on an old version of Caffe, so you may run into build problems; we suggest disabling CUDNN due to compatibility issues;
  3. Start your training by running 'DeepSkeleton/examples/DeepSkeleton/solve.py'. Make sure the augmented data is placed in the proper folder; see the data layer configuration in 'DeepSkeleton/examples/DeepSkeleton/train_val.prototxt'. A minimal pycaffe launch sketch is shown below.
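If you prefer to drive training from your own script instead of the provided solve.py, a typical pycaffe launch looks roughly like the sketch below; the solver path, pretrained weights filename, and iteration count are placeholders, so consult solve.py for the exact settings used in the paper.

    import caffe

    caffe.set_device(0)
    caffe.set_mode_gpu()

    # placeholder path: adapt to your local checkout and data layout
    solver = caffe.SGDSolver('examples/DeepSkeleton/solver.prototxt')

    # initialize from ImageNet-pretrained VGG16 weights (placeholder filename)
    solver.net.copy_from('vgg16.caffemodel')

    # run a fixed number of SGD iterations (placeholder count)
    solver.step(40000)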

Data

Skeleton detection datasets:

To the best of our knowledge, the skeleton detection related datasets are listed below:

Our released SK-LARGE dataset contains more than 1000 images annotated with object skeletons as well as edges; check out the SK-LARGE dataset. If you use this dataset in your research, please consider citing our paper.

All the PR-curve data used to plot the PR curves in our paper are available here (md5: 6bf393652023c8d260d44bdc6597e06).

Pretrained model

  1. Pretrained model on the SK-LARGE dataset.
  2. Pretrained model on the WH-SYMMAX dataset.

Citation:

If you use our code or data, please consider citing relevant papers:
@inproceedings{shen2016object,
  title={Object Skeleton Extraction in Natural Images by Fusing Scale-associated Deep Side Outputs},
  author={Shen, Wei and Zhao, Kai and Jiang, Yuan and Wang, Yan and Zhang, Zhijiang and Bai, Xiang},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2016},
  pages={222-230},
  publisher={IEEE},
  howpublished = "\url{http://kaiz.xyz/deepsk}"
}
@article{shen2017deepskeleton,
  title={DeepSkeleton: Learning Multi-task Scale-associated Deep Side Outputs for Object Skeleton Extraction in Natural Images},
  author={Shen, Wei and Zhao, Kai and Jiang, Yuan and Wang, Yan and Bai, Xiang and Yuille, Alan},
  journal={IEEE Transactions on Image Processing},
  volume={26},
  number={11},
  pages={5298-5311},
  year={2017},
  publisher={IEEE},
  howpublished = "\url{http://kaiz.xyz/deepsk}"
}

For any questions, please contact us by email:

wei.shen@t.shu.edu.cn, zhaok1206@gmail.com, kouen93@gmail.com