Text Summarization
From the column: 自然語言處理論文閱讀筆記 (Notes on Reading NLP Papers)
Title (2017):
Improving Semantic Relevance for Sequence-to-Sequence Learning of Chinese Social Media Text Summarization
Overview:
- Although the generated summaries are literally similar to the source texts, they have low semantic relevance to them.
- We introduce a Semantic Relevance Based neural model to encourage high semantic similarity between texts and summaries.
Introduction:
In this work, our goal is to improve the semantic relevance between source texts and generated summaries for Chinese social media text summarization.
The representation of the source text is produced by an encoder, while that of the summary is computed by a decoder.
During training, the model maximizes the similarity score between these two representations.
Background:
Current Chinese social media text summarization models are based on the encoder-decoder framework.
The encoder-decoder model compresses the source text x into a continuous vector representation with an encoder, and then generates the summary y with a decoder.
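As a compact reference, the standard factorization behind this framework can be written as follows (the notation is assumed here for illustration, not taken from the paper): the encoder builds hidden states over the N source words, and the decoder predicts the M summary words one at a time.

$$
h_i = f_{\mathrm{enc}}(x_i, h_{i-1}), \quad i = 1, \dots, N,
\qquad
p(y \mid x) = \prod_{t=1}^{M} p(y_t \mid y_{<t}, x)
$$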
The attention mechanism is introduced to better capture the context information of the source text. When predicting an output word, the decoder takes the attention vector into account.
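A standard (Bahdanau-style) form of this attention vector, with symbols assumed here for illustration: the previous decoder state $s_{t-1}$ is scored against each encoder state $h_i$, the scores are normalized, and the attention (context) vector $c_t$ is the resulting weighted sum, on which the prediction of $y_t$ is then conditioned.

$$
e_{ti} = a(s_{t-1}, h_i), \qquad
\alpha_{ti} = \frac{\exp(e_{ti})}{\sum_{j=1}^{N} \exp(e_{tj})}, \qquad
c_t = \sum_{i=1}^{N} \alpha_{ti} h_i
$$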
Model:
The model consists of three components: an encoder, a decoder, and a similarity function (a minimal code sketch follows this list).
- The encoder compresses source texts into semantic vectors.
- The decoder generates summaries and produces semantic vectors of the generated summaries.
- Finally, the similarity function evaluates the relevance between the semantic vectors of the source texts and the generated summaries.
- Our training objective is to maximize the similarity score.
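The following is a minimal PyTorch sketch of these three components, assuming a GRU encoder and decoder with teacher forcing. The class name, layer sizes, the 0.5 similarity weight, and the plain (non-attentional) decoder are simplifications for illustration, not the paper's exact architecture.

```python
# Illustrative sketch of the encoder / decoder / similarity-function setup.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SemanticRelevanceSeq2Seq(nn.Module):
    """Encoder-decoder with a semantic similarity term (assumed setup)."""

    def __init__(self, vocab_size, emb_dim=128, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.decoder = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, src, tgt_in):
        # Encoder: compress the source text; the last output is the
        # source semantic vector V_t.
        enc_out, enc_state = self.encoder(self.embed(src))
        v_t = enc_out[:, -1, :]                       # (batch, hidden)

        # Decoder (teacher forcing): generate the summary distribution.
        dec_out, _ = self.decoder(self.embed(tgt_in), enc_state)
        logits = self.out(dec_out)                    # (batch, len, vocab)

        # Summary semantic vector: the last decoder output carries both
        # source and summary information, so V_s = last output - V_t.
        v_s = dec_out[:, -1, :] - v_t

        # Similarity function: cosine similarity between V_s and V_t.
        sim = F.cosine_similarity(v_s, v_t, dim=-1)   # (batch,)
        return logits, sim


# Usage sketch: minimize cross-entropy while maximizing similarity.
model = SemanticRelevanceSeq2Seq(vocab_size=5000)
src = torch.randint(0, 5000, (2, 30))     # toy source batch
tgt = torch.randint(0, 5000, (2, 12))     # toy summary batch
logits, sim = model(src, tgt[:, :-1])
nll = F.cross_entropy(logits.reshape(-1, 5000), tgt[:, 1:].reshape(-1))
loss = nll - 0.5 * sim.mean()             # 0.5 is an assumed weight
```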
Text Representation:
There are several methods to represent a text or a sentence, such as mean pooling over the RNN outputs or taking the last state of the RNN.
We select the last output of the RNN encoder as the semantic vector of the source text.
The last output of the decoder actually contains information of both the source text and the generated summary, so we simply compute the semantic vector of the summary by subtracting the semantic vector of the source text from it.
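In symbols (the notation is assumed here for illustration): writing $h^{\mathrm{enc}}_N$ for the encoder's last output and $h^{\mathrm{dec}}_M$ for the decoder's last output,

$$
V_t = h^{\mathrm{enc}}_N, \qquad V_s = h^{\mathrm{dec}}_M - V_t
$$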
Semantic Relevance:
Here, we use cosine similarity to measure the semantic relevance, which is represented with a dot product and magnitudes.
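With $V_t$ and $V_s$ denoting the semantic vectors of the source text and the summary (symbol names assumed here), this is the standard cosine similarity:

$$
\cos(V_t, V_s) = \frac{V_t \cdot V_s}{\lVert V_t \rVert \, \lVert V_s \rVert}
$$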
The source text and the summary share the same language, so it is reasonable to assume that their semantic vectors are distributed in the same space.
Training:
The objective is to minimize the loss function:
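A natural form of this loss, consistent with maximizing the similarity score above, combines the standard negative log-likelihood of the summary with the negated similarity term; the trade-off coefficient $\lambda$ and the exact formulation are assumptions of these notes:

$$
L = -\log p(y \mid x; \theta) - \lambda \cos(V_t, V_s)
$$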
Experiments: