Venue: Hyatt Regency Seattle
Proceedings available now
Christopher Potts (Stanford University) (Invited talk)
|Lexical semantics in the time of large language models|
|Abstract: Present-day language models provide contextual representations of words: a given element of the model's vocabulary will generally be represented differently depending on the larger (linguistic and/or non-linguistic) context in which it appears. This is undoubtedly a key feature of the success of these models. What do these contextual representations mean for linguists working on lexical semantics? In this talk, I'll argue, first and foremost, that contextual representations are better aligned with linguists' conception of word meaning than any previous representation scheme in NLP, including symbolic approaches. I will then describe some ways in which linguists can use large pretrained language models to gain new insights into lexical meaning, and some ways in which language model development could be fruitfully informed by findings in linguistics. The overall take-away is that the turn toward contextual representations has created an exciting new space for collaboration between linguists and NLPers.|
|Bio: Christopher Potts is Professor and Chair of Linguistics and Professor (by courtesy) of Computer Science at Stanford, and a faculty member in the Stanford NLP Group and the Stanford AI Lab. His group uses computational methods to explore topics in emotion expression, context-dependent language use, systematicity and compositionality, and model interpretability. This research combines methods from linguistics, cognitive psychology, and computer science, in the service of both scientific discovery and technology development. He is the author of the 2005 book The Logic of Conventional Implicatures as well as numerous scholarly papers in computational and theoretical linguistics.|
|14:30||Collin F. Baker, Michael Ellsworth, Miriam R.L. Petruck (ICSI), & Arthur Lorenzi (Federal University of Juiz de Fora, Brazil)|
|Comparing Distributional and Curated Approaches for Cross-lingual Frame Alignment|
|Abstract: Despite advances in statistical approaches to the modeling of meaning, many questions about the ideal way of exploiting both knowledge-based (e.g., FrameNet, WordNet) and data-based methods (e.g., BERT) remain unresolved. This workshop focuses on these questions with three session papers that run the gamut from highly distributional methods (Lekkas et al.) to highly curated methods (Gamonal), with techniques that use statistical methods to produce structured semantics in between (Lawley and Schubert).
In addition, we report on a small comparison of cross-lingual techniques for frame semantic alignment for one language pair (Spanish and English). None of the distributional techniques consistently aligns the 1-best frame match from English to Spanish; all fail in at least one case. Predicting which techniques will align which frames cross-linguistically is not possible from any known characteristic of the alignment technique or the frames. Although distributional techniques are a rich source of semantic information for many tasks, at present curated, knowledge-based semantics remains the only technique that can consistently align frames across languages.|
|15:30||Andrea Lekkas, Peter Schneider-Kamp (University of Southern Denmark), & Isabelle Augenstein (University of Copenhagen)|
|Multi-sense Language Modelling|
|Abstract: The effectiveness of a language model is influenced by its token representations, which must encode contextual information and handle the same word form having a plurality of meanings (polysemy). Currently, none of the common language modelling architectures explicitly model polysemy. We propose a language model which not only predicts the next word, but also its sense in context. We argue that this higher prediction granularity may be useful for end tasks such as assistive writing, and allow for a more precise linking of language models with knowledge bases. We find that multi-sense language modelling requires architectures that go beyond standard language models, and here propose a localized prediction framework that decomposes the task into a word prediction task followed by a sense prediction task. To aid sense prediction, we utilise a Graph Attention Network, which encodes definitions and example uses of word senses. Overall, we find that multi-sense language modelling is a highly challenging task, and suggest that future work focus on the creation of more annotated training datasets.|
|16:00||Maucha Gamonal (Federal University of Minas Gerais, Brazil)|
|A Descriptive Study of Metaphors and Frames in the Multilingual Shared Annotation Task|
|Abstract: This work assumes that languages are structured by semantic frames, which are schematic representations of concepts. Metaphors, on the other hand, are cognitive projections between domains, which are the result of our interaction in the world, through experiences, expectations and human biology itself. In this work, we use both semantic frames and metaphors in multilingual contrast (Brazilian Portuguese, English and German). The aim is to present a descriptive study of metaphors and frames in the multilingual shared annotation task of Multilingual FrameNet, a task which consisted of using frames from Berkeley FrameNet to annotate parallel corpora. The results show parameters for cross-linguistic annotation considering frames and metaphors.|
|16:30||Lane Lawley & Lenhart Schubert (University of Rochester)|
|Logical Story Representations via FrameNet + Semantic Parsing|
|Abstract: We propose a means of augmenting FrameNet parsers with a formal logic parser to obtain rich semantic representations of events. These schematic representations of the frame events, which we call Episodic Logic (EL) schemas, abstract constants to variables, preserving their types and relationships to other individuals in the same text. Due to the temporal semantics of the chosen logical formalism, all identified schemas in a text are also assigned temporally bound “episodes” and related to one another in time. The semantic role information from the FrameNet frames is also incorporated into the schema’s type constraints. We describe an implementation of this method using a neural FrameNet parser, and discuss the approach’s possible applications to question answering and open-domain event schema learning.|
Broadly speaking, computational linguistics research can be divided into two main streams. The first consists of work that relies primarily on operationalizing prior knowledge about language and its use, such as scripts, plans, and scenarios for virtual assistants and FrameNet (FN) frames (Ruppenhofer et al., 2016), as well as lexical databases such as WordNet (Fellbaum, 1998), VerbNet (Kipper et al., 2000), and PropBank (Palmer et al., 2005), among others. The second seeks to derive knowledge directly from data (text, speech, and increasingly vision) with unsupervised or distantly supervised methods that are distributional and frequency-based, a tradition found in linguistics (Biber et al., 2020), cognitive science (Xu and Xu, 2021), and computational linguistics, most notably in vector embeddings such as BERT (Devlin et al., 2019). The two streams are often complementary: Kuznetsov and Gurevych (2018) combine POS tagging and lemmatization to improve vector embeddings, and Qian et al. (2021) combine syntactic knowledge with neural language models to improve accuracy.
These issues are as pertinent today as they were at the 1994 ACL workshop "The Balancing Act: Combining Symbolic and Statistical Approaches to Language".
Despite great advances in statistical approaches to meaning, many questions remain unresolved. Specifically, what important dimensions of meaning can vectors capture that FrameNet or other human-curated resources cannot, and vice versa? What techniques best recover different dimensions of meaning? The goal of the workshop is to bring together researchers working in each of these two main approaches to address questions such as:
- What are the strengths and limitations of each approach?
- Are there types of knowledge that can be extracted from text/speech by one of them and not the other? Why?
- How well can each represent relations and enable reasoning over text?
- What limits the further progress of each approach?
- In a perfect world, how could the field overcome these limitations? Would combining the two approaches solve all the problems?
- What would overcoming such limitations accomplish for NLP/NLU?
We have gathered papers that explore the differences between knowledge-based approaches (particularly frame-based approaches) and distributional approaches: alignment tools and comparisons of tasks performed with distributional techniques versus FrameNet, papers that focus on one approach or the other as it pertains to obtaining dimensions of meaning, and papers that demonstrate ways to combine both approaches.
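As an illustrative sketch of the 1-best cross-lingual alignment setup compared in the session papers: given a vector per frame in each language, every English frame is mapped to the Spanish frame with the highest cosine similarity. The frame labels below are FrameNet-style but chosen for the example, and the three-dimensional vectors are entirely hypothetical; actual techniques derive frame vectors from, e.g., contextual embeddings of each frame's lexical units.

```python
import math

# Hypothetical frame embeddings: the vectors are invented for illustration.
english_frames = {
    "Motion":       [0.90, 0.10, 0.00],
    "Commerce_buy": [0.10, 0.80, 0.30],
}
spanish_frames = {
    "Movimiento": [0.85, 0.15, 0.05],
    "Compra":     [0.05, 0.75, 0.40],
}

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.hypot(*u) * math.hypot(*v)
    return dot / norms if norms else 0.0

def one_best_alignment(src, tgt):
    """Map each source frame to its highest-cosine target frame."""
    return {name: max(tgt, key=lambda t: cosine(vec, tgt[t]))
            for name, vec in src.items()}

print(one_best_alignment(english_frames, spanish_frames))
```

With hand-picked vectors the 1-best match is of course correct; the finding reported above is precisely that real distributional variants of this procedure each fail on at least one frame, while curated alignments do not.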
- Collin Baker, ICSI
- Michael Ellsworth, ICSI
- Miriam R. L. Petruck, ICSI
- Omri Abend (Hebrew U Jerusalem)
- Gerard de Melo (U Potsdam)
- Katrin Erk (U Texas Austin)
- Annette Frank (U Heidelberg)
- Richard Futrell (U CA Irvine)
- Christopher Potts (Stanford U)
- Michael Roth (U Saarland)
- Nathan Schneider (Georgetown U)
Baker, Collin F. and Arthur Lorenzi. 2020. Exploring Crosslinguistic Frame Alignment. In Proceedings of the International FrameNet Workshop 2020: Towards a Global, Multilingual FrameNet. 77–84. Marseille, France. European Language Resources Association.
Biber, Douglas, Jesse Egbert, and Daniel Keller. 2020. Reconceptualizing register in a continuous situational space. Corpus Linguistics and Linguistic Theory 16.3. 581–616.
Devlin, Jacob, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 4171–4186. Minneapolis, Minnesota. Association for Computational Linguistics.
Ellsworth, Michael, Collin Baker, and Miriam R. L. Petruck. 2021. FrameNet and Typology. In Proceedings of the 3rd Workshop on Computational Typology and Multilingual NLP. 61–66. Association for Computational Linguistics.
Fellbaum, Christiane. 1998. WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press.
Kipper, Karen, Hoa Trang Dang, and Martha Palmer. 2000. Class-Based Construction of a Verb Lexicon. In Proceedings of the 17th National Conference on Artificial Intelligence and the 12th Conference on Innovative Applications of Artificial Intelligence. 691–696. Austin, TX. AAAI Press.
Kuznetsov, Ilia and Iryna Gurevych. 2018. From Text to Lexicon: Bridging the Gap between Word Embeddings and Lexical Resources. In Proceedings of the 27th International Conference on Computational Linguistics. 233–244. Santa Fe, New Mexico, USA. Association for Computational Linguistics.
Palmer, Martha, Paul Kingsbury, and Dan Gildea. 2005. The Proposition Bank: An Annotated Corpus of Semantic Roles. Computational Linguistics 31.1. 71–106.
Qian, Peng, Tahira Naseem, Roger Levy, and Ramón Fernandez Astudillo. 2021. Structural Guidance for Transformer Language Models. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 3735–3745. Online. Association for Computational Linguistics.
Ruppenhofer, Josef, Michael Ellsworth, Miriam R. L. Petruck, Christopher R. Johnson, Collin F. Baker, and Jan Scheffczyk. 2016. FrameNet II: Extended Theory and Practice. Online. Berkeley: ICSI.
Xu, Aotao and Yang Xu. 2021. Chaining and the formation of spatial semantic categories in childhood. In Proceedings of the 43rd Annual Meeting of the Cognitive Science Society.