Research
My research aims to develop natural language models that leverage long text sequences, generalize well from a few examples, and are computationally efficient. Such models benefit both core tasks, such as language modeling, and downstream tasks such as classification, translation, and alignment.
→ Natural Language Processing: language modeling, machine translation, document modeling, style transfer, multilinguality, sentiment analysis, and summarization
→ Machine Learning: weakly-supervised learning, attention mechanisms, long sequence modeling, conditional generation, and distance metric learning
Code
- groc – a PyTorch implementation of grounded compositional output embeddings for adaptive language modeling, presented at EMNLP 2020. PDF | Code
- drill – a PyTorch implementation of deep residual output embedding layers for neural language generation, presented at ICML 2019. PDF | Code
- gile – a Keras implementation of generalized input-label embeddings for low-resource and zero-resource text classification, published in TACL in 2019 (see the first sketch below). PDF | Code
- mhan – a Keras implementation of multilingual hierarchical attention networks for document classification, presented at IJCNLP 2017 (see the second sketch below). PDF | Code | Demo
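
The core idea behind input-label embeddings, as in gile, is that documents and label descriptions are projected into a shared space and a document is scored against every label by compatibility in that space, which lets unseen labels be scored at test time from their descriptions alone. Below is a minimal PyTorch sketch of that idea; the class name, dimensions, and layer choices are hypothetical and do not mirror the actual (Keras) implementation.

```python
import torch
import torch.nn as nn

class InputLabelScorer(nn.Module):
    # Hypothetical sketch: documents and labels meet in a joint space,
    # so any label with a textual description can be scored, seen or unseen.
    def __init__(self, doc_dim: int, label_dim: int, joint_dim: int):
        super().__init__()
        self.doc_proj = nn.Linear(doc_dim, joint_dim)      # projects document encodings
        self.label_proj = nn.Linear(label_dim, joint_dim)  # projects label embeddings

    def forward(self, doc_vec, label_vecs):
        # doc_vec: (batch, doc_dim); label_vecs: (num_labels, label_dim)
        d = torch.tanh(self.doc_proj(doc_vec))       # (batch, joint_dim)
        l = torch.tanh(self.label_proj(label_vecs))  # (num_labels, joint_dim)
        return d @ l.t()                             # (batch, num_labels) scores

# Usage with random stand-ins for real encodings:
scorer = InputLabelScorer(doc_dim=512, label_dim=300, joint_dim=256)
docs = torch.randn(4, 512)     # document encodings from any text encoder
labels = torch.randn(10, 300)  # e.g. averaged word embeddings of label names
scores = scorer(docs, labels)  # shape (4, 10); argmax picks the best label
```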
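A hierarchical attention network, as in mhan, encodes words into sentence vectors and sentences into a document vector, attending at both levels. The following is an illustrative PyTorch sketch of that two-level structure, not the mhan code itself (which is in Keras); all names and sizes are assumptions.

```python
import torch
import torch.nn as nn

class Attention(nn.Module):
    # Additive attention pooling over a sequence of hidden states.
    def __init__(self, dim: int):
        super().__init__()
        self.proj = nn.Linear(dim, dim)
        self.context = nn.Parameter(torch.randn(dim))

    def forward(self, h):
        # h: (batch, seq, dim)
        u = torch.tanh(self.proj(h))                # (batch, seq, dim)
        w = torch.softmax(u @ self.context, dim=1)  # (batch, seq) attention weights
        return (w.unsqueeze(-1) * h).sum(dim=1)     # (batch, dim) pooled vector

class HierarchicalAttentionNet(nn.Module):
    # Words -> sentence vectors -> document vector, attention at both levels.
    def __init__(self, vocab_size: int, emb_dim: int, hid: int, num_classes: int):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.word_gru = nn.GRU(emb_dim, hid, batch_first=True, bidirectional=True)
        self.word_attn = Attention(2 * hid)
        self.sent_gru = nn.GRU(2 * hid, hid, batch_first=True, bidirectional=True)
        self.sent_attn = Attention(2 * hid)
        self.out = nn.Linear(2 * hid, num_classes)

    def forward(self, docs):
        # docs: (batch, num_sents, num_words) of token ids
        b, s, w = docs.shape
        x = self.emb(docs.view(b * s, w))        # embed all sentences at once
        h, _ = self.word_gru(x)                  # (b*s, w, 2*hid)
        sents = self.word_attn(h).view(b, s, -1) # (b, s, 2*hid) sentence vectors
        h, _ = self.sent_gru(sents)              # (b, s, 2*hid)
        doc = self.sent_attn(h)                  # (b, 2*hid) document vector
        return self.out(doc)                     # class logits
```

In the multilingual setting, the paper's key design choice is sharing some of these components (encoders and/or attention) across languages; the sketch above shows only the single-language backbone.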
Datasets
- DRPL – A Benchmark for Document Relation Prediction and Localization (873K documents, 656K pairs).
- DW – Deutsche Welle dataset for multilingual news text classification in 8 languages (600K documents, 5.5K classes).
- HATDOC – Human Attention for Document Classification dataset for evaluating attention-based models on aspect-based sentiment classification (50K documents, 1.6K sentences, 3 classes).
- MVSO† – Multilingual Visual Sentiment Ontology with over 15K hierarchically organized visual concepts across 12 languages for sentiment concept detection (7.36M images, 15.6K labels).
- TED – A Lecture Recommendation Dataset with User Ratings and Comments (100K ratings, 200K comments).