Do Speakers of Different Languages Perceive the World Differently? Quantitative Analysis of Neo-cognitive Paradigms in French and English

Abstract

This study investigates whether statistical analysis of text data can reveal divergent reader reactions to the same content presented in different languages. To address this question, we used Marcel Proust’s “À la recherche du temps perdu,” a vast corpus spanning seven books and millions of words. We compared the original raw French text with the raw English translation by Scott Moncrieff, using three distinct analytical approaches.

Linguistic Analysis: The first part of the study is a stochastic analysis of linguistic elements, encompassing words, syntax, grammar, and punctuation. These details are crucial to understanding how the reader interprets the literary text.

Visual Syntax Mapping: In the second part, we employed a visual syntax mapping (VSM) technique that assigns each word a numerical vector based on its placement and proximity to other words in the text, so that the text can be processed by machine learning models. Cosine similarity was computed between character names and their surrounding words, generating a two-dimensional graph of the referential fictional space.

Reader Reaction Analysis: The final phase evaluated reader reactions to words, based on eye movement and heart rate data, to determine their positive or negative connotations. By computing a value for every word in the text and averaging these values for each sentence, we created a comprehensive map of reader reactions throughout the seven books, showing where and by how much reader reaction differs between the two languages.
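The two computations named in the abstract, cosine similarity between a character name and its surrounding words, and sentence-level averaging of per-word values, can be illustrated with a minimal Python sketch. This is not the study's VSM implementation: the fixed co-occurrence window, the function names (`tokenize`, `context_vector`, `cosine_similarity`, `sentence_means`), and the word scores in the usage note are all assumptions made for illustration only.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; \\w also matches accented letters such as 'À'."""
    return re.findall(r"\w+", text.lower())


def context_vector(tokens, target, window=2):
    """Count the words found within `window` positions of each `target` occurrence.

    Assumption: a plain co-occurrence count stands in for the word vectors;
    the abstract does not specify how the VSM vectors are built.
    """
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == target:
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            counts.update(
                t for j, t in enumerate(tokens[lo:hi], start=lo) if j != i
            )
    return counts


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def sentence_means(text, word_scores):
    """Average the per-word scores in each sentence; words without a score are skipped.

    Returns None for a sentence that contains no scored word.
    """
    means = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        scores = [word_scores[t] for t in tokenize(sentence) if t in word_scores]
        means.append(sum(scores) / len(scores) if scores else None)
    return means
```

For example, `cosine_similarity(context_vector(tokens, "swann"), context_vector(tokens, "odette"))` compares the contexts of two character names in a tokenized passage, and `sentence_means(passage, scores)` yields one averaged value per sentence. The `scores` dictionary here would hold placeholder values; in the study these values come from eye movement and heart rate measurements.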

Presenters

Wright Donald
Professor of French and Arabic, Director of Middle Eastern Studies, Global Languages and Cultures, Hood College, Maryland, United States

Details

Presentation Type

Paper Presentation in a Themed Session

Theme

Communications and Linguistic Studies

Keywords

Linguistics, Machine Learning, Data, Quantitative Analysis, Translation