From Esperanto to Volapük: Interpretable Graph Features for Assessing Linguistic Complexity in Constructed Languages

Keywords: complexity metrics, constructed auxiliary languages, graph theory, linguistic complexity, regression models, explainability, interpretability, graph-based models

Abstract

This paper explores the linguistic complexity of constructed auxiliary languages (conlangs) with two primary goals: evaluating their complexity using established linguistic criteria and developing a novel, interpretable method for quantifying language complexity. We examine seven well-known auxiliary languages and compare them to other types of conlangs. Using Johanna Nichols’ complexity metric, we assign a complexity score to each language and compare these scores to those of natural languages. Additionally, we introduce a graph-based method for measuring and explaining language complexity, focusing on how the structure of word co-occurrence networks can be interpreted to reveal underlying linguistic patterns. By training regression models on these networks, we aim to provide linguists with interpretable graph features that not only capture complexity patterns but also offer insights into the linguistic structures that contribute to these patterns. Our findings indicate that while the graph-based approach effectively captures language complexity, its predictions for unseen languages reveal limitations, highlighting areas for further research in both linguistic theory and the development of robust AI models.
