This is an implementation of the Recurrent Discounted Attention (RDA) unit that extends TensorFlow's RNNCell. RDA builds on the Recurrent Weighted Average (RWA) unit by additionally allowing the past to be discounted.
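To make the idea of discounting the past concrete, here is a minimal sketch in plain Python of a running attention-weighted average with a discount factor. It is an illustration under stated assumptions, not the repository's code: the function name `discounted_weighted_average` and its parameters are hypothetical, values are scalars, and the discount factors are passed in directly, whereas in the actual unit they would be computed by the network at each step.

```python
import math


def discounted_weighted_average(values, scores, discounts):
    """Running attention-style average that can discount the past.

    Hypothetical helper for illustration only (not the repository's API).

    values[t]    : candidate value at step t
    scores[t]    : unnormalised attention score at step t (weight = exp(score))
    discounts[t] : factor in [0, 1]; 1.0 keeps the whole past (the plain
                   weighted-average case), 0.0 forgets it entirely
    """
    numerator = 0.0
    denominator = 0.0
    outputs = []
    for value, score, discount in zip(values, scores, discounts):
        weight = math.exp(score)
        # Shrink the accumulated past, then add the current step.
        numerator = discount * numerator + value * weight
        denominator = discount * denominator + weight
        outputs.append(numerator / denominator)
    return outputs


# With every discount equal to 1.0 this is an ordinary running weighted average.
no_discount = discounted_weighted_average([1.0, 3.0], [0.0, 0.0], [1.0, 1.0])
# A discount of 0.0 at the second step drops the first value completely.
full_discount = discounted_weighted_average([1.0, 3.0], [0.0, 0.0], [1.0, 0.0])
```

Setting all discounts to 1.0 reduces the sketch to a plain running weighted average, which is how the discount can be read as an extension rather than a replacement. A production implementation would also need to guard against overflow in `exp`, which this sketch omits.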
[Figure: Accuracy]

[Figure: Cost]

Paper: "Efficiently applying attention to sequential data with the Recurrent Discounted Attention unit" (1705.08480v1.pdf)
Recurrent Discounted Attention unit (RDA) for TensorFlow