Accepted papers
NLP Power! The First Workshop on Efficient Benchmarking in NLP.
Oral presentations:
- Jaihyun Park, Sullam Jeoung: Raison d'être of the benchmark dataset: A survey of current practices of benchmark dataset sharing platforms
- Phyllis Ang, Bhuwan Dhingra, Lisa Wu Wills: Characterizing the Efficiency vs Accuracy Trade-off for Long-Context NLP Models
- Pedro Henrique Luz de Araujo, Benjamin Roth: Checking HateCheck: a cross-functional analysis of behaviour-aware learning for hate speech detection
- Kabir Ahuja, Sandipan Dandapat, Sunayana Sitaram, Monojit Choudhury: Beyond Static Models and Test Sets: Benchmarking the Potential of Pre-trained Models Across Tasks and Languages
- David Harbecke, Yuxuan Chen, Leonhard Hennig, Christoph Alt: Why only Micro-F1? Class Weighting of Measures for Relation Classification
- Derek Tam, Anisha Mascarenhas, Mohit Bansal, Colin Raffel: A Simple and Cheap Multiple-Choice Probe for Evaluating Factuality in Text Generation Models
Posters:
- Usman Naseem, Byoung Chan Lee, Matloob Khushi, Jinman Kim, Adam Dunn: Benchmarking for Public Health Surveillance tasks on Social Media with a Domain-Specific Pretrained Language Model
- Cristobal Eyzaguirre, Felipe del Rio, Vladimir Araujo, Alvaro Soto: DACT-BERT: Differentiable Adaptive Computation Time for an Efficient BERT Inference
- Giuseppe Attanasio, Debora Nozza, Eliana Pastor, Dirk Hovy: Benchmarking Post-Hoc Interpretability Approaches for Transformer-based Misogyny Detection
- Wencong You, Daniel Lowd: Towards Stronger Adversarial Baselines Through Human-AI Collaboration
- Amr Keleg, Matthias Lindemann, Danyang Liu, Wanqiu Long, Bonnie L. Webber: Automatically discarding straplines to improve data quality for abstractive news summarization
- Kathrin Blagec, Georg Dorffner, Milad Moradi, Simon Ott, Matthias Samwald: A global analysis of metrics used for measuring performance in natural language processing
- Federico Bianchi, Debora Nozza, Dirk Hovy: Benchmarking Transformations with Language Invariant Properties
ACL Findings:
- Mousumi Akter, Naman Bansal, Shubhra Kanti Karmaker: Revisiting Automatic Evaluation of Extractive Summarization Task: Can We Do Better than ROUGE? (oral presentation)