Stack LSTM for Chinese Image Captioning
Abstract
Image captioning has attracted considerable attention in recent years. However, little work has been done on Chinese image captioning, which has unique cultural characteristics and wording requirements. This paper studies how to generate more accurate Chinese image captions. We propose a novel Chinese image captioning model that uses a pre-trained ResNet50 to extract visual information from the image and a double-layer (stacked) LSTM to predict each Chinese word. Compared with other image captioning algorithms on the AIC-ICC dataset, the proposed method substantially improves evaluation performance, achieving BLEU-4 and CIDEr scores of 39.9 and 121.7, respectively. Qualitative results also show that the model can generate accurate, diverse, and vivid Chinese captions of images.
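The architecture described in the abstract (ResNet50 visual features feeding a two-layer LSTM word predictor) could be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: it assumes PyTorch, precomputed 2048-dim ResNet50 features in place of the full CNN, and hypothetical dimensions (`embed_dim`, `hidden_dim`, vocabulary size) that the abstract does not specify.

```python
import torch
import torch.nn as nn

class StackLSTMCaptioner(nn.Module):
    """Hypothetical sketch: a stacked (two-layer) LSTM caption decoder
    conditioned on an image feature, predicting one Chinese word id per step."""

    def __init__(self, vocab_size, feat_dim=2048, embed_dim=512, hidden_dim=512):
        super().__init__()
        # Project the (assumed precomputed) ResNet50 feature into the embedding space.
        self.feat_proj = nn.Linear(feat_dim, embed_dim)
        # Embeddings for Chinese word tokens.
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Double-layer ("stacked") LSTM, as described in the abstract.
        self.lstm = nn.LSTM(embed_dim, hidden_dim, num_layers=2, batch_first=True)
        # Per-step scores over the Chinese vocabulary.
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, feats, captions):
        # feats: (B, feat_dim) image features; captions: (B, T) token ids.
        img = self.feat_proj(feats).unsqueeze(1)   # (B, 1, E) image as first step
        words = self.embed(captions)               # (B, T, E)
        x = torch.cat([img, words], dim=1)         # (B, T+1, E)
        h, _ = self.lstm(x)                        # (B, T+1, H)
        return self.out(h)                         # (B, T+1, vocab_size)

# Shape check with dummy data.
model = StackLSTMCaptioner(vocab_size=1000)
feats = torch.randn(4, 2048)
caps = torch.randint(0, 1000, (4, 12))
logits = model(feats, caps)
print(tuple(logits.shape))  # (4, 13, 1000)
```

In this sketch the image feature is fed as the first input step of the LSTM, a common captioning design; the paper may condition the decoder differently.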