Seminar – March 12, 2019 @ 2 PM – Review of attention mechanisms in deep neural networks

WHEN: March 12, 2019

TIME: 2 PM

WHERE: Garda Room, DISI

Speaker: Prof. Renato De Mori (McGill University)


Recent encoder/decoder neural architectures built on recurrent neural networks (RNNs), including those with Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) elements, have achieved impressive performance. Nevertheless, the inherently sequential nature of RNNs precludes parallelization within training examples. This becomes a problem for long sequences, where memory constraints limit batching across examples, and it makes it difficult to identify useful context features, potentially ignoring the structured information typical of natural language sequences. To alleviate these problems, attention mechanisms have been introduced to model dependencies regardless of their distance in the input or output data. These mechanisms are based on the notion of attention as a non-uniform spatial distribution of relevant features, with an explicit scalar representation of their relative relevance. Attention mechanisms have become almost a de facto standard in tasks such as multi-language translation, image and video captioning, question answering, language modeling, machine comprehension, relation extraction, and others. The talk will review the motivations and essential components of a selection of mechanisms proposed between 2014 and 2018, including self-attention, the transformer, structured attention, multi-scale attention, semantic attention, compositional attention, graph attention, and learn to pay attention.




Host: Prof. Giuseppe Riccardi
