Formalizing Style in Personal Narratives

Cortal, Finkel

Personal narratives are stories authors construct to make meaning of their experiences. Style, the distinctive way authors use language to express themselves, is fundamental to how these narratives convey subjective experiences. Yet there is a lack of a formal framework for systematically analyzing these stylistic choices. We present a novel approach that formalizes style in personal narratives as patterns in the linguistic choices authors make when communicating subjective experiences. Our framework integrates three domains: functional linguistics establishes language as a system of meaningful choices, computer science provides methods for automatically extracting and analyzing sequential patterns, and these patterns are linked to psychological observations. Using language models, we automatically extract linguistic features such as processes, participants, and circumstances. We apply our framework to hundreds of dream narratives, including a case study on a war veteran with post-traumatic stress disorder. Analysis of his narratives uncovers distinctive patterns, particularly how verbal processes dominate over mental ones, illustrating the relationship between linguistic choices and psychological states.

Basic Information

Paper ID: 2510.08649
Title: Formalizing Style in Personal Narratives
Authors: Gustave Cortal, Alain Finkel (Université Paris-Saclay, CNRS)
Classification: cs.CL (Computational Linguistics), cs.AI
Publication Date: October 13, 2025 (arXiv v2)
Paper Link: https://arxiv.org/abs/2510.08649

Abstract

Personal narratives are stories constructed by authors to make sense of their experiences. Style—the distinctive manner in which authors use language to express themselves—is fundamental to how these narratives convey subjective experience. However, there is a lack of formal frameworks for systematically analyzing these stylistic choices. This paper proposes a novel approach that formalizes style in personal narratives as patterns of linguistic choices made by authors when conveying subjective experience. The framework integrates three domains: functional linguistics establishes language as a system of meaningful choices, computer science provides methods for automatically extracting and analyzing sequence patterns, which are then associated with psychological observations. Using language models, linguistic features such as processes, participants, and circumstances are automatically extracted. The framework is applied to hundreds of dream narratives, including a case study of a veteran with post-traumatic stress disorder. Analysis of the veteran's narratives reveals distinctive patterns, particularly how verbal processes dominate mental processes, illustrating the relationship between linguistic choices and psychological states.

Research Background and Motivation

Problem Definition

Core Problem: Lack of formal frameworks for systematically analyzing stylistic choices in personal narratives. While existing stylistics and stylometry research is abundant, there is a shortage of operational tools to capture how individual thought patterns are manifested in linguistic forms.
Problem Significance:
- Personal narratives are crucial ways humans understand the world and shape identity
- In therapeutic settings, narrative reconstruction can facilitate recovery; formalized frameworks enable more precise identification of linguistic patterns associated with psychological states
- Such frameworks can support targeted interventions and therapeutic applications
Limitations of Existing Approaches:
- Traditional qualitative frameworks (such as Husserlian phenomenology and Hadamard's analysis of cognitive processes) provide rich descriptions but do not offer operational tools for capturing how style is manifested in linguistic forms
- Existing systemic functional linguistics parsers are "experimental, domain-sensitive, and labor-intensive to adapt"
- Lack of automated large-scale analysis methods
Research Motivation: Building on work by Tellier and Finkel (1995), which defines linguistic style as lexical and syntactic patterns for expressing intent, this research develops a sequence-based framework to analyze how personal narratives convey subjective experience.

Core Contributions

Theoretical Contribution: Proposes a sequence-based framework grounded in systemic functional linguistics, defining style as patterns in sequences of linguistic choices
Methodological Innovation: Develops methodology using sequence analysis to automatically identify patterns
Empirical Research: Demonstrates through dream narrative case studies how pattern analysis reveals psychological insights and supports therapeutic applications
Technical Implementation: First attempt to automate systemic functional linguistics analysis using large language models

Methodology Details

Task Definition

Input: Personal narrative text
Output: Sequence patterns of linguistic choices revealing stylistic features of how authors encode subjective experience
Constraints: Based on the transitivity system of systemic functional linguistics (processes, participants, circumstances)

Model Architecture

1. Linguistic Feature Classification System

Based on Halliday's systemic functional linguistics, particularly the transitivity system:

Process Types:

Action: Physical actions and events in the material world
Mental: Internal experiences such as thoughts, perceptions, and emotions
Verbal: Communicative behaviors
State: Existence, possession, or states of being

Participants: Realized through noun phrases
Circumstances: Realized through adverbial groups or prepositional phrases

2. Sequence Representation Framework

Each linguistic feature system is represented as a finite set Σ (alphabet):

Σprocess = {action, mental, verbal, state}

Multiple alphabets are combined through Cartesian product:

Σ = Σprocess × Σtense × Σaspect
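
As a minimal sketch of this product construction, the Python below builds a combined alphabet from per-system alphabets. Only the process alphabet comes from the summary above; the tense and aspect values, and the function name `combined_alphabet`, are illustrative assumptions.

```python
from itertools import product

# Process alphabet as listed above.
SIGMA_PROCESS = ("action", "mental", "verbal", "state")
# Assumed values for illustration only; the summary does not list them.
SIGMA_TENSE = ("past", "present", "future")
SIGMA_ASPECT = ("simple", "progressive", "perfect")


def combined_alphabet(*alphabets):
    """Return the Cartesian product of the given alphabets as a list of tuples."""
    return list(product(*alphabets))


SIGMA = combined_alphabet(SIGMA_PROCESS, SIGMA_TENSE, SIGMA_ASPECT)

# Each clause then maps to one symbol of the combined alphabet,
# for example a past-tense mental process.
clause_symbol = ("mental", "past", "simple")
print(len(SIGMA), clause_symbol in SIGMA)
```

The size of the combined alphabet is the product of the sizes of its components, so adding feature systems quickly makes individual symbols rarer in a corpus.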

3. Sequence Analysis Methods

Substring Analysis: Identifies repeated patterns of consecutive symbols
Subsequence Analysis: Identifies patterns that keep their relative order but need not be consecutive
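
The difference between the two analyses can be sketched as follows. This is an illustration rather than the authors' code: the one-letter symbols, the example sequence, and the function names are assumptions.

```python
from collections import Counter


def substrings(seq, n):
    """Count every run of n consecutive symbols in seq."""
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))


def is_subsequence(pattern, seq):
    """True if the symbols of pattern occur in seq in order, gaps allowed."""
    remaining = iter(seq)
    # `symbol in remaining` consumes the iterator up to the first match,
    # so later symbols must be found after earlier ones.
    return all(symbol in remaining for symbol in pattern)


# Hypothetical sequence: a = action, m = mental, v = verbal.
seq = "aamvva"
bigrams = substrings(seq, 2)

print(dict(bigrams))
print("av" in bigrams, is_subsequence("av", seq))
```

Here `av` never occurs as a substring, yet it is a subsequence because an action is eventually followed by a verbal process.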

Similarity Metric: Uses cosine similarity

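\cos(s_1, s_2) = \frac{\sum_i x_i y_i}{\sqrt{\sum_i x_i^2} \, \sqrt{\sum_i y_i^2}}

where x_i and y_i are the i-th components of the vectors representing sequences s_1 and s_2.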
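
A minimal sketch of this similarity, assuming each sequence is represented by its vector of substring counts (the summary does not state which vectorization is used, so this choice and the function names are assumptions):

```python
import math
from collections import Counter


def ngram_counts(seq, n=2):
    """Substring counts of length n, used here as the feature vector."""
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))


def cosine(c1, c2):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(c1[k] * c2[k] for k in c1.keys() & c2.keys())
    norm1 = math.sqrt(sum(v * v for v in c1.values()))
    norm2 = math.sqrt(sum(v * v for v in c2.values()))
    if norm1 == 0 or norm2 == 0:
        return 0.0  # an empty vector is treated as dissimilar to everything
    return dot / (norm1 * norm2)


# Hypothetical sequences sharing one of their two bigrams.
print(cosine(ngram_counts("amv"), ngram_counts("ama")))
```

Because the counts are non-negative, the score lies between 0 (no shared substrings) and 1 (identical substring profile).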

Clustering Method: Hierarchical agglomerative clustering (Ward linkage)
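
To make the clustering step concrete, here is a naive pure-Python sketch of agglomerative clustering with Ward's criterion, together with a mean silhouette score for choosing the number of clusters. The feature vectors are hypothetical, the function names are assumptions, and a real analysis would more likely rely on a library routine such as SciPy's `linkage` with `method="ward"`.

```python
from itertools import combinations


def centroid(points):
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist(centroid(a), centroid(b))


def ward_clusters(points, k):
    """Merge the cheapest pair of clusters until k clusters remain (cubic time)."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]))
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return clusters


def silhouette(clusters):
    """Mean silhouette over all points, using Euclidean distance (needs >= 2 clusters)."""
    scores = []
    for ci, cluster in enumerate(clusters):
        for p in cluster:
            if len(cluster) == 1:
                scores.append(0.0)  # convention for singleton clusters
                continue
            a = sum(sq_dist(p, q) ** 0.5 for q in cluster) / (len(cluster) - 1)
            b = min(sum(sq_dist(p, q) ** 0.5 for q in other) / len(other)
                    for cj, other in enumerate(clusters) if cj != ci)
            denom = max(a, b)
            scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical 2-D feature vectors forming two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
best_k = max(range(2, 5), key=lambda k: silhouette(ward_clusters(points, k)))
print(best_k, ward_clusters(points, best_k))
```

Picking the `k` with the highest mean silhouette mirrors the selection rule reported in the results, where two clusters gave the maximum score.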

Technical Innovations

Automated Extraction: Uses Llama 3.1 8B instruction-tuned model to extract linguistic features through in-context learning, avoiding hand-crafted rules and expert annotation
Sequence Representation: Maps narratives to symbolic sequences supporting computational biology-inspired pattern analysis
Multi-scale Analysis: Multi-level pattern recognition from individual symbols to complex substrings
Psychological Association: Establishes connections between linguistic patterns and psychological states
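
The in-context extraction step might look roughly like the sketch below. It only builds a few-shot prompt and parses a completion; the prompt wording, the in-context examples, and the helper names are hypothetical and are not the authors' actual prompt. Sending the prompt to a model is left to the caller.

```python
PROCESS_TYPES = ("action", "mental", "verbal", "state")

# Hypothetical in-context examples (not the authors' prompt).
FEW_SHOT = [
    ("I wake in a dark room", "action"),
    ("I feel a cold wind", "mental"),
    ("I tell myself to move", "verbal"),
]


def build_prompt(clause):
    """Assemble a few-shot classification prompt ending at the label slot."""
    parts = ["Classify the process type of each clause as one of: "
             + ", ".join(PROCESS_TYPES) + "."]
    for text, label in FEW_SHOT:
        parts.append(f"Clause: {text}\nProcess: {label}")
    parts.append(f"Clause: {clause}\nProcess:")
    return "\n\n".join(parts)


def parse_label(completion):
    """Return the first valid process type found in the model output, else None."""
    for token in completion.lower().split():
        word = token.strip(".,:;\"'")
        if word in PROCESS_TYPES:
            return word
    return None


print(build_prompt("She is my sister"))
print(parse_label(" Verbal."))
```

Returning `None` for unparseable output lets a pipeline flag a clause for review instead of silently assigning a process type.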

Experimental Setup

Dataset

DreamBank Corpus:

Thousands of dream narratives collected in the United States
Analysis of five series: blind (long-term blind dreamers, n=361), ed (widower, n=139), izzy (adolescent, n=1091), merri (artist, n=202), viet (Vietnam War veteran with PTSD, n=566)
Benchmark construction: Random sampling of 10 narratives per series, totaling 720 dream narratives

Evaluation Metrics

Odds Ratio: Measures relative likelihood of specific substrings appearing in different series
Fisher's Exact Test (Holm-Bonferroni correction): Statistical significance testing
Silhouette Score: Clustering quality assessment
Cosine Similarity: Sequence similarity measurement
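
The first two metrics can be sketched in pure Python as below. The 2x2 layout (target series versus baseline, pattern versus all other patterns), the counts in the usage lines, and the function names are assumptions made for illustration.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Odds ratio of the 2x2 table [[a, b], [c, d]]; undefined when b * c == 0."""
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided p-value: total probability of tables no more likely than the observed one."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def holm_bonferroni(p_values):
    """Holm-adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted, running_max = [0.0] * m, 0.0
    for rank, i in enumerate(order):
        running_max = max(running_max, (m - rank) * p_values[i])
        adjusted[i] = min(1.0, running_max)
    return adjusted


# Hypothetical counts: pattern present / absent in the target series and in the baseline.
print(odds_ratio(3, 1, 1, 3), fisher_exact_two_sided(3, 1, 1, 3))
print(holm_bonferroni([0.01, 0.04, 0.03]))
```

An odds ratio above 1 means the pattern is relatively more frequent in the target series; the Holm step guards against false positives when many substrings are tested at once.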

Comparison Methods

Comparative analysis with baseline (norm)
Pattern comparison across different series

Implementation Details

Model: Llama 3.1 8B Instruct
Hardware: Tesla V100 32GB, 80 hours runtime
Preprocessing: spaCy sentence segmentation, language model sentence segmentation
Validation: Quantitative validation on 50 gold-standard sentences with 100% prediction accuracy

Experimental Results

Main Results

Vietnam War Veteran (viet) Case Analysis:

Substring Distribution Findings:

Verbal processes: odds about 40% higher than baseline (OR=1.4, p<0.05)
Mental processes: odds about 40% lower than baseline (OR=0.6, p<0.05)
Consecutive verbal process patterns significant: verbal.verbal (OR=2.00), verbal.verbal.verbal (OR=1.75)

Clustering Analysis:

Optimal clustering: 2 clusters with maximum silhouette score
Cluster 1 representative sequence: Highly action-oriented (action processes 23 times, mental processes 2 times), covering 274 sequences
Cluster 2 representative sequence: Action-state balanced (action processes 13 times, state processes 16 times, mental processes 4 times), covering 179 sequences

Case Analysis

Example Sequence Transformation:

"I wake in a dark room. I feel a cold wind. I tell myself to move."
→ Clause analysis → Feature extraction → Sequence: amv
→ Substrings: {am, mv}
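
The transformation above can be reproduced with a few lines of Python. The letters `a`, `m`, and `v` follow the example; using `s` for state processes and the function names are assumptions.

```python
# One-letter symbols per process type; "s" for state is an assumed convention.
SYMBOL = {"action": "a", "mental": "m", "verbal": "v", "state": "s"}


def encode(process_labels):
    """Map a list of per-clause process types to a symbolic sequence."""
    return "".join(SYMBOL[label] for label in process_labels)


def substrings_of_length(seq, n):
    """Set of distinct runs of n consecutive symbols."""
    return {seq[i:i + n] for i in range(len(seq) - n + 1)}


# Process types of the three clauses: wake (action), feel (mental), tell (verbal).
labels = ["action", "mental", "verbal"]
seq = encode(labels)
print(seq, sorted(substrings_of_length(seq, 2)))  # amv ['am', 'mv']
```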

Experimental Findings

Psychological State Association: The viet series constructs experience primarily through action and verbal processes rather than mental processes, potentially related to how trauma affects cognitive and emotional processing
Pattern Consistency: The veteran's narratives follow two templates: highly action-oriented structures or structures that balance action and state processes
Automation Effectiveness: The language model reaches 100% accuracy on the 50 gold-standard validation sentences

Related Work

Systemic Functional Linguistics Parsing

Early rule-based approaches: Limited coverage, domain-sensitive
Graph-based pipelines: Convert dependency trees to SFL networks
Supervised methods: Require expert-annotated data
This Paper's Innovation: Few-shot language model approach without hand-crafted grammars or verb dictionaries

Computational Analysis of Dream Narratives

Traditional methods: Dictionary-based manual coding systems
Distributional methods: Semantic space embeddings and topic clustering
Hybrid systems: Dictionary scoring + classifiers
Language model approaches: Sentiment detection and character prediction
This Paper's Distinction: Focus on "how it is said" rather than "what is said"

Conclusions and Discussion

Main Conclusions

Theoretical Contribution: Successfully formalizes style as sequence patterns of linguistic choices grounded in systemic functional linguistics
Method Effectiveness: Automated framework can reveal psychologically meaningful patterns
Application Potential: Supports narrative reconstruction and targeted interventions in therapeutic settings

Limitations

Automated Extraction Errors: Language models may misclassify processes or participants, affecting pattern reliability
Psychological Interpretation: Associations between linguistic choices and psychological states remain correlational and descriptive, requiring clinical validation
Feature Scope: Currently focuses only on process types; future work needs to extend to more fine-grained linguistic features

Future Directions

Author Profiling: Author characteristic inference based on subjective experience patterns
Style-Conditioned Generation: Generate narratives from choice sequences supporting therapeutic interventions
Complex Systems Approaches: Apply measures such as Lempel-Ziv complexity to quantify sequence redundancy
Clinical Validation: Combine with clinical assessments to validate psychological interpretations
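
As one possible reading of the complexity direction above, the sketch below counts the phrases in an LZ78-style parse of a symbolic sequence. The summary does not say which Lempel-Ziv variant is intended, so this variant, the function name, and the example sequences are assumptions.

```python
def lz78_phrase_count(seq):
    """Number of phrases in an LZ78-style parse: each phrase is the shortest
    prefix of the remaining input that has not been seen as a phrase before."""
    seen = set()
    phrase = ""
    count = 0
    for symbol in seq:
        phrase += symbol
        if phrase not in seen:
            seen.add(phrase)
            count += 1
            phrase = ""
    # A trailing, already-seen phrase still counts as one final phrase.
    return count + (1 if phrase else 0)


# Hypothetical sequences of equal length: one repetitive, one more varied.
print(lz78_phrase_count("aaaaaaaa"), lz78_phrase_count("amvsamvs"))
```

Fewer phrases for the same length indicate a more redundant sequence, which is the kind of signal such a measure would add to substring statistics.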

In-Depth Evaluation

Strengths

Interdisciplinary Innovation: Successfully integrates functional linguistics, computer science, and psychology
Methodological Advancement: First use of large language models to automate SFL analysis
Practical Value: Provides operational tools for therapeutic applications
Theoretical Rigor: Grounded in mature systemic functional linguistics theory
Scalability: Framework adaptable to different linguistic features and application scenarios

Weaknesses

Limited Validation: Validation on only 50 standard samples; requires larger-scale expert annotation validation
Psychological Association: Lacks direct validation against clinical diagnoses
Language Coverage: Only tested on English dream narratives; cross-linguistic applicability unknown
Feature Simplification: Current analysis relatively simple, not fully leveraging SFL richness

Impact

Academic Contribution: Provides new research paradigm for intersection of computational linguistics and psychology
Application Prospects: Broad application potential in digital therapeutics, author analysis, style generation, and related fields
Reproducibility: Authors provide complete prompts, hyperparameters, and extracted sequences supporting research reproduction

Applicable Scenarios

Clinical Psychology: Assist therapists in analyzing patient narrative patterns
Forensic Linguistics: Author identification and characteristic analysis
Literary Studies: Quantitative analysis of authorial style
Digital Health: Mental health monitoring through personal diaries and narratives
Educational Applications: Writing style guidance and personalized feedback

References

The paper cites rich interdisciplinary literature, including:

Halliday et al. (2014): Theoretical foundations of systemic functional linguistics
Tellier and Finkel (1995): Early work on formalizing linguistic style
Banks (2019): SFL practical guidance
Domhoff and Schneider (2008): Quantitative dream analysis methods
Extensive literature from computational linguistics, psychology, and cognitive science

Overall, the paper formalizes style as sequence patterns of linguistic choices, automates their extraction with a language model, and illustrates the approach on dream narratives. Its psychological interpretations remain correlational and await clinical validation.