Semantic Scholar · Open Access · 2015 · 29 citations

genCNN: A Convolutional Architecture for Word Sequence Prediction

Mingxuan Wang, Zhengdong Lu, Hang Li, Wenbin Jiang, Qun Liu

Abstract

We propose a novel convolutional architecture, named $gen$CNN, for word sequence prediction. Different from previous work on neural network-based language modeling and generation (e.g., RNN or LSTM), we choose not to greedily summarize the history of words as a fixed-length vector. Instead, we use a convolutional neural network to predict the next word from a variable-length history of words. Also different from existing feedforward networks for language modeling, our model can effectively fuse the local and global correlations in the word sequence, with a convolution-gating strategy specifically designed for the task. We argue that our model can give an adequate representation of the history, and therefore can naturally exploit both short- and long-range dependencies. Our model is fast, easy to train, and readily parallelized. Our extensive experiments on text generation and $n$-best re-ranking in machine translation show that $gen$CNN outperforms the state of the art by large margins.

Authors (5)

Mingxuan Wang
Zhengdong Lu
Hang Li
Wenbin Jiang
Qun Liu

Citation Format

Wang, M., Lu, Z., Li, H., Jiang, W., Liu, Q. (2015). genCNN: A Convolutional Architecture for Word Sequence Prediction. https://doi.org/10.3115/v1/P15-1151

Quick Access

View at Source: doi.org/10.3115/v1/P15-1151

Journal Information

Year Published: 2015
Language: en
Total Citations: 29
Source Database: Semantic Scholar
DOI: 10.3115/v1/P15-1151
Access: Open Access