This is the public repository for our paper: Evaluating Lexical Proficiency in Neural Language Models, C. Ciaccio, A. Miaschi, F. Dell'Orletta (ACL 2025).
The repository contains the resources and code we developed to run the experiments for assessing the lexical proficiency of Italian neural language models. Specifically:
The data folder contains:
- 100-neos.csv → neologism dataset
- 100-nonce.csv → nonce words dataset
- ONLI-NEO → data extracted from the Osservatorio Neologico della Lingua Italiana (ONLI)
- it-dictionary-gz → data extracted from the Italian Wiktionary (Wikizionario)
- (train-test-val)_dataset.csv → the train, test, and validation splits used in our experiments
More resources related to the Wiktionary data format, the parser, and the ONLI scraper can be found in our Italian Wiktionary Parser repository.
The code folder contains "finetuningT5.py", the script we used to fine-tune all T5 models in a text-to-text multi-task learning setup (training + evaluation).
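In a text-to-text multi-task setup, every task is reduced to an (input, target) string pair, typically distinguished by a task prefix, so a single T5 model can be trained on all of them jointly. The sketch below illustrates this idea; the prefixes, field names, and example entry are illustrative assumptions, not the exact format used by finetuningT5.py.

```python
# Sketch: casting lexical tasks as text-to-text pairs for T5-style fine-tuning.
# Task prefixes and field names are illustrative assumptions, not the script's
# actual format.

def make_example(task: str, word: str, definition: str = "", usage: str = ""):
    """Build one (input, target) string pair for a given lexical task."""
    if task == "define":      # word -> definition
        return f"definisci: {word}", definition
    if task == "use":         # word -> usage example in context
        return f"usa in contesto: {word}", usage
    if task == "generate":    # definition -> word
        return f"genera parola: {definition}", word
    raise ValueError(f"unknown task: {task}")

# Mixing pairs from all tasks in one training set yields the multi-task setup:
entry = {"word": "petaloso", "definition": "pieno di petali",
         "usage": "Un fiore petaloso."}
batch = [make_example(t, **entry) for t in ("define", "use", "generate")]
```

Because every task shares the same string-in, string-out interface, the same seq2seq loss and decoding loop cover generation, definition, and contextual usage alike.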
The annotation folder contains the human-annotated novelty and adhesion scores for the nonce-words setting, for each model (in batches of 25).
If you use any of these resources in your work, we kindly ask you to cite our paper:
@inproceedings{ciaccio-etal-2025-evaluating,
title = "Evaluating Lexical Proficiency in Neural Language Models",
author = "Ciaccio, Cristiano and
Miaschi, Alessio and
Dell{'}Orletta, Felice",
editor = "Che, Wanxiang and
Nabende, Joyce and
Shutova, Ekaterina and
Pilehvar, Mohammad Taher",
booktitle = "Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
month = jul,
year = "2025",
address = "Vienna, Austria",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.acl-long.64/",
pages = "1267--1286",
ISBN = "979-8-89176-251-0",
abstract = "We present a novel evaluation framework designed to assess the lexical proficiency and linguistic creativity of Transformer-based Language Models (LMs). We validate the framework by analyzing the performance of a set of LMs of different sizes, in both mono- and multilingual configuration, across tasks involving the generation, definition, and contextual usage of lexicalized words, neologisms, and nonce words. To support these evaluations, we developed a novel dataset of lexical entries for the Italian language, including curated definitions and usage examples sourced from various online platforms. The results highlight the robustness and effectiveness of our framework in evaluating multiple dimensions of LMs' linguistic understanding and offer an insight, through the assessment of their linguistic creativity, on the lexical generalization abilities of LMs."
}