Research interests

My primary research interests lie at the intersection of natural language processing and machine learning. In particular, I am interested in attention-based neural networks for representation learning and structured prediction, with the aim of modeling syntactic and semantic aspects of natural language. Currently, I am investigating deep learning models that represent long text sequences hierarchically and enable knowledge transfer across a large number of languages and output spaces, with potential applications to multilingual text classification and retrieval, language modeling, summarization, machine translation, and zero-shot prediction.
→ Natural Language Processing
     compositionality & semantics, sentiment, multilinguality, classification, retrieval, machine translation, summarization
→ Machine Learning
     representation learning, structured prediction, weakly-supervised learning, attention-based models

Software
  • resdec — a PyTorch implementation of a self-attentive residual decoder for neural machine translation, to be presented at NAACL 2018.
    PDF Code
  • mhan — a Python implementation of multilingual hierarchical attention networks for document classification, presented at IJCNLP 2017.
    PDF Code Demo
  • wmil — a Python implementation of the weighted multiple-instance learning algorithm for aspect-based sentiment analysis, presented at EMNLP 2014.
    PDF Code Demo
  • usent — a Python implementation of a dictionary-based sentiment classification method for subjectivity and polarity detection, presented at CICLing 2013.
    PDF Code
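The hierarchical attention idea behind mhan can be illustrated with a minimal sketch: attention-pool word vectors into sentence vectors, then pool sentence vectors into a document vector. This is NumPy-only illustration, not the actual mhan API; `w_word` and `w_sent` stand in for learned context vectors.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention_pool(H, w):
    # H: (n, d) sequence of hidden states; w: (d,) context vector
    # (hypothetical stand-in for a learned parameter).
    # Score each state against the context vector, then return the
    # attention-weighted average of the states.
    scores = softmax(H @ w)   # (n,) attention weights, sum to 1
    return scores @ H         # (d,) pooled vector

def hierarchical_encode(doc, w_word, w_sent):
    # doc: list of sentences, each an (n_i, d) array of word embeddings.
    # First pool words into sentence vectors, then pool sentences
    # into a single document vector.
    sents = np.stack([attention_pool(S, w_word) for S in doc])  # (m, d)
    return attention_pool(sents, w_sent)                        # (d,)

rng = np.random.default_rng(0)
d = 4
doc = [rng.standard_normal((5, d)), rng.standard_normal((3, d))]
vec = hierarchical_encode(doc, rng.standard_normal(d), rng.standard_normal(d))
print(vec.shape)  # (4,)
```

In a multilingual setting, the same attention and classification layers can be shared across languages while encoders stay language-specific, which is the kind of parameter sharing the package explores.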

Datasets
  • DW — Deutsche Welle Multilingual News Collection (600K documents, 5.5K classes, 8 languages)
  • HATDOC — Human Attention for Document Classification (50K documents, 1.6K sentences, 3 classes)
  • MVSO — Multilingual Visual Sentiment Ontology (7.36M images, 15.6K labels, 12 languages)
  • TED — Lecture Recommendation Dataset with User Comments (1.2K lectures, 100K ratings, 200K comments)