Intel® Labs Presents Natural Language Processing Research at ACL 2021

Highlights

  • The 59th annual meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2021) runs from August 1-6, 2021.

  • Intel® Labs presents research on natural language processing (NLP), including more efficient language model fine-tuning and the first end-to-end model for cross-document (CD) coreference resolution from raw text.

This year marks the 59th annual meeting of the Association for Computational Linguistics, held jointly with the 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2021), which runs online from August 1-6, 2021. The conference brings the Association for Computational Linguistics and the Asian Federation of Natural Language Processing together in a single event.

The event will bring together leaders in computational linguistics to discuss computational modeling of and solutions for NLP, a field at the intersection of linguistics, computer science, and artificial intelligence that connects computers and human language. The conference will focus on advances in human language technology for applications such as speech recognition, text-to-speech, and automated voice response systems.

Intel Labs is pleased to present two research papers in language processing. The first describes a method that improves the training efficiency and final performance of language model fine-tuning while reducing its energy footprint. The second introduces the industry’s first end-to-end model for cross-document (CD) coreference resolution from raw text, with competitive or state-of-the-art results for event and entity coreference resolution on gold mentions. The research also shows that this model is simpler and more efficient than recent CD coreference resolution systems, while not using any external resources.

Following are the two papers being presented at the conference:
 

  • Selecting Informative Contexts Improves Language Model Fine-Tuning

    Language model fine-tuning is essential for modern natural language processing, but is computationally expensive and time consuming. Further, the effectiveness of fine-tuning is limited by the inclusion of training examples that negatively affect performance. Here we present a general fine-tuning method that we call information gain filtration for improving the overall training efficiency and final performance of language model fine-tuning. We define the information gain of an example as the improvement on a test metric after training on that example. A secondary learner is then trained to approximate this quantity. 

    During fine-tuning, this learner selects informative examples and skips uninformative ones. We show that our method has consistent improvement across datasets, fine-tuning tasks, and language model architectures. For example, we achieve a median perplexity of 54.0 on a books dataset compared to 57.3 for standard fine-tuning. We present statistical evidence that offers insight into the improvements of our method over standard fine-tuning. The generality of our method leads us to propose a new paradigm for language model fine-tuning — we encourage researchers to release pretrained secondary learners on common corpora to promote efficient and effective fine-tuning, thereby improving the performance and reducing the overall energy footprint of language model fine-tuning.

    The code is available at https://github.com/HuthLab/IGF; a simplified sketch of the filtration loop appears after this list.
     
  • Cross-Document Coreference Resolution over Predicted Mentions

    Coreference resolution has been mostly investigated within a single document scope, showing impressive progress in recent years based on end-to-end models. However, the more challenging task of cross-document (CD) coreference resolution remained relatively under-explored, with the few recent models applied only to gold mentions. Here, we introduce the first end-to-end model for CD coreference resolution from raw text, which extends the prominent model for within-document coreference to the CD setting. Our model achieves competitive or state-of-the-art results for event and entity coreference resolution on gold mentions. More importantly, we set first baseline results, on the standard ECB+ dataset, for CD coreference resolution over predicted mentions. Further, our model is simpler and more efficient than recent CD coreference resolution systems, while not using any external resources.
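
To make the filtration idea in the first paper more concrete, here is a minimal sketch in PyTorch of how information gain filtration can be wired together. It assumes a user-supplied `lm_loss(model, batch)` loss function and a `featurize(example)` feature extractor; the `SecondaryLearner` architecture, learning rate, and skip threshold are illustrative stand-ins rather than the released implementation (see the authors' code at the link above).

```python
# Sketch of information gain filtration (IGF) for language model fine-tuning.
# `lm_loss(model, batch)` and `featurize(example)` are assumed to be supplied
# by the user; architectures and hyperparameters here are illustrative only.
import copy

import torch
import torch.nn as nn


def information_gain(model, example, heldout_batch, lm_loss, lr=5e-5):
    """Improvement in held-out loss after one gradient step on `example`."""
    before = lm_loss(model, heldout_batch).item()
    probe = copy.deepcopy(model)                  # leave the real model untouched
    opt = torch.optim.SGD(probe.parameters(), lr=lr)
    opt.zero_grad()
    lm_loss(probe, example).backward()
    opt.step()
    after = lm_loss(probe, heldout_batch).item()
    return before - after                         # positive = informative example


class SecondaryLearner(nn.Module):
    """Small regressor that predicts information gain from example features."""

    def __init__(self, feature_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feature_dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, features):
        return self.net(features).squeeze(-1)


def filtered_fine_tune(model, examples, featurize, secondary, lm_loss,
                       optimizer, threshold=0.0):
    """Fine-tune `model`, skipping examples predicted to be uninformative."""
    for example in examples:
        with torch.no_grad():
            predicted_gain = secondary(featurize(example)).item()
        if predicted_gain < threshold:
            continue                               # skip uninformative example
        optimizer.zero_grad()
        lm_loss(model, example).backward()
        optimizer.step()
```

In this framing, the relatively expensive information gain measurements are only needed to train the secondary learner; once it is trained, each filtering decision during fine-tuning is a cheap forward pass, which is what makes releasing pretrained secondary learners for common corpora attractive.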
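
For the cross-document paper, the abstract notes that the model extends the prominent within-document coreference model to the CD setting. The sketch below is a generic illustration of the core step such an extension involves: scoring candidate mention pairs, including pairs drawn from different documents, and then clustering them. The pair representation roughly follows the common within-document formulation, while the greedy single-link `cluster_mentions` routine, the layer sizes, and the 0.5 threshold are assumptions for illustration, not the paper's exact architecture.

```python
# Illustrative sketch (not the paper's exact architecture): score mention pairs
# across documents, then cluster them into cross-document coreference chains.
import itertools

import torch
import torch.nn as nn


class PairwiseScorer(nn.Module):
    """Scores whether two mention spans, possibly from different documents, corefer."""

    def __init__(self, span_dim):
        super().__init__()
        self.ffnn = nn.Sequential(
            nn.Linear(span_dim * 3, 128), nn.ReLU(), nn.Linear(128, 1))

    def forward(self, g_i, g_j):
        # Concatenate the two span representations and their elementwise
        # product, roughly following the within-document coreference scorer.
        pair = torch.cat([g_i, g_j, g_i * g_j], dim=-1)
        return torch.sigmoid(self.ffnn(pair)).squeeze(-1)


def cluster_mentions(span_reprs, scorer, threshold=0.5):
    """Greedy single-link clustering over all within- and cross-document pairs."""
    clusters = {i: {i} for i in range(len(span_reprs))}
    for i, j in itertools.combinations(range(len(span_reprs)), 2):
        if scorer(span_reprs[i], span_reprs[j]).item() > threshold:
            merged = clusters[i] | clusters[j]            # link the two chains
            for k in merged:
                clusters[k] = merged
    return {frozenset(c) for c in clusters.values()}      # deduplicate chains
```

In the full end-to-end setting described in the paper, the span representations would come from an encoder over the raw documents and the mention spans themselves would be predicted rather than given as gold annotations; the sketch only shows the pair-scoring and clustering stage.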