
Formalizing Style in Personal Narratives

Cortal, Finkel
Personal narratives are stories authors construct to make meaning of their experiences. Style, the distinctive way authors use language to express themselves, is fundamental to how these narratives convey subjective experiences. Yet there is a lack of a formal framework for systematically analyzing these stylistic choices. We present a novel approach that formalizes style in personal narratives as patterns in the linguistic choices authors make when communicating subjective experiences. Our framework integrates three domains: functional linguistics establishes language as a system of meaningful choices, computer science provides methods for automatically extracting and analyzing sequential patterns, and these patterns are linked to psychological observations. Using language models, we automatically extract linguistic features such as processes, participants, and circumstances. We apply our framework to hundreds of dream narratives, including a case study on a war veteran with post-traumatic stress disorder. Analysis of his narratives uncovers distinctive patterns, particularly how verbal processes dominate over mental ones, illustrating the relationship between linguistic choices and psychological states.

Basic Information

  • Paper ID: 2510.08649
  • Title: Formalizing Style in Personal Narratives
  • Authors: Gustave Cortal, Alain Finkel (Université Paris-Saclay, CNRS)
  • Classification: cs.CL (Computation and Language), cs.AI (Artificial Intelligence)
  • Publication Date: October 13, 2025 (arXiv v2)
  • Paper Link: https://arxiv.org/abs/2510.08649

Abstract

Personal narratives are stories constructed by authors to make sense of their experiences. Style, the distinctive manner in which authors use language to express themselves, is fundamental to how these narratives convey subjective experience. However, there is no formal framework for systematically analyzing these stylistic choices. This paper proposes a novel approach that formalizes style in personal narratives as patterns in the linguistic choices authors make when conveying subjective experience. The framework integrates three domains: functional linguistics establishes language as a system of meaningful choices, computer science provides methods for automatically extracting and analyzing sequential patterns, and psychology supplies the observations to which these patterns are linked. Using language models, linguistic features such as processes, participants, and circumstances are automatically extracted. The framework is applied to hundreds of dream narratives, including a case study of a war veteran with post-traumatic stress disorder. Analysis of the veteran's narratives reveals distinctive patterns, particularly the dominance of verbal processes over mental ones, illustrating the relationship between linguistic choices and psychological states.

Research Background and Motivation

Problem Definition

  1. Core Problem: Lack of formal frameworks for systematically analyzing stylistic choices in personal narratives. While existing stylistics and stylometry research is abundant, there is a shortage of operational tools to capture how individual thought patterns are manifested in linguistic forms.
  2. Problem Significance:
    • Personal narratives are crucial ways humans understand the world and shape identity
    • In therapeutic settings, narrative reconstruction can facilitate recovery; formalized frameworks enable more precise identification of linguistic patterns associated with psychological states
    • A formal framework can support targeted interventions and therapeutic applications
  3. Limitations of Existing Approaches:
    • Traditional qualitative frameworks (such as Husserlian phenomenology and Hadamard's analysis of cognitive processes) provide rich descriptions but do not offer operational tools for capturing how style is manifested in linguistic forms
    • Existing systemic functional linguistics parsers are "experimental, domain-sensitive, and labor-intensive to adapt"
    • Lack of automated large-scale analysis methods
  4. Research Motivation: Building on work by Tellier and Finkel (1995), which defines linguistic style as lexical and syntactic patterns for expressing intent, this research develops a sequence-based framework to analyze how personal narratives convey subjective experience.

Core Contributions

  1. Theoretical Contribution: Proposes a sequence-based framework grounded in systemic functional linguistics, defining style as patterns in sequences of linguistic choices
  2. Methodological Innovation: Develops methodology using sequence analysis to automatically identify patterns
  3. Empirical Research: Demonstrates through dream narrative case studies how pattern analysis reveals psychological insights and supports therapeutic applications
  4. Technical Implementation: First attempt to automate systemic functional linguistics analysis using large language models

Methodology Details

Task Definition

  • Input: Personal narrative text
  • Output: Sequence patterns of linguistic choices that reveal how authors encode subjective experience
  • Constraints: Based on the transitivity system of systemic functional linguistics (processes, participants, circumstances)

Model Architecture

1. Linguistic Feature Classification System

Based on Halliday's systemic functional linguistics, particularly the transitivity system:

Process Types:

  • Action: Physical actions and events in the material world
  • Mental: Internal experiences such as thoughts, perceptions, and emotions
  • Verbal: Communicative behaviors
  • State: Existence, possession, or states of being

Participants: Realized through noun phrases

Circumstances: Realized through adverbial groups or prepositional phrases

2. Sequence Representation Framework

Each linguistic feature system is represented as a finite set Σ (alphabet):

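$\Sigma_{\text{process}} = \{\text{action}, \text{mental}, \text{verbal}, \text{state}\}$

Multiple alphabets are combined through Cartesian product:

$\Sigma = \Sigma_{\text{process}} \times \Sigma_{\text{tense}} \times \Sigma_{\text{aspect}}$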
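The product construction can be sketched in a few lines of Python. Only the process alphabet comes from the text above; the tense and aspect values, the sample narrative, and the helper `project` are hypothetical placeholders chosen for illustration.

```python
from itertools import product

# Process alphabet as listed in the text.
SIGMA_PROCESS = ("action", "mental", "verbal", "state")
# Hypothetical tense and aspect alphabets (illustrative values, not from the paper).
SIGMA_TENSE = ("past", "present", "future")
SIGMA_ASPECT = ("simple", "progressive", "perfect")

# Combined alphabet: every (process, tense, aspect) triple is one symbol.
SIGMA = list(product(SIGMA_PROCESS, SIGMA_TENSE, SIGMA_ASPECT))

# A narrative becomes a sequence of symbols, one per clause (made-up example).
narrative = [
    ("action", "present", "simple"),
    ("mental", "present", "simple"),
    ("verbal", "present", "simple"),
]
assert all(symbol in SIGMA for symbol in narrative)


def project(sequence, index):
    """Recover the sequence over a single component alphabet."""
    return [symbol[index] for symbol in sequence]


print(len(SIGMA))             # 4 * 3 * 3 = 36 combined symbols
print(project(narrative, 0))  # ['action', 'mental', 'verbal']
```

Projecting back onto one component shows that the combined alphabet refines the single-feature analysis and does not replace it.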

3. Sequence Analysis Methods

  • Substring Analysis: Identifies repeated patterns of consecutive symbols
  • Subsequence Analysis: Identifies patterns that preserve relative order but need not be consecutive
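The difference between the two notions is easy to see in code. The following is a minimal sketch, assuming one-letter symbols (a = action, m = mental, v = verbal, s = state); the function names and the example sequence are illustrative, not taken from the paper.

```python
from collections import Counter


def substring_counts(sequence: str, max_len: int = 3) -> Counter:
    """Count every contiguous substring of length 1..max_len."""
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(sequence) - n + 1):
            counts[sequence[i:i + n]] += 1
    return counts


def is_subsequence(pattern: str, sequence: str) -> bool:
    """True if the pattern's symbols occur in the sequence in order, gaps allowed."""
    remaining = iter(sequence)
    # `symbol in remaining` consumes the iterator up to the first match,
    # so later symbols must be found after earlier ones.
    return all(symbol in remaining for symbol in pattern)


sequence = "avvma"  # hypothetical encoded narrative
print(substring_counts(sequence, max_len=2)["vv"])  # 1
print("am" in sequence)                             # False: not contiguous
print(is_subsequence("am", sequence))               # True: 'a' ... 'm' in order
```

Substrings capture immediate transitions between choices, while subsequences tolerate intervening clauses.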

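Similarity Metric: Cosine similarity between the feature vectors x and y that represent sequences s1 and s2:

$$\cos(s_1, s_2) = \frac{\sum_i x_i y_i}{\sqrt{\sum_i x_i^2}\,\sqrt{\sum_i y_i^2}}$$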
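As an illustration, the sketch below computes cosine similarity between two encoded sequences. It assumes each sequence is represented by a vector of substring counts; this summary does not specify the vector representation, so that choice and the function names are assumptions.

```python
import math
from collections import Counter


def substring_vector(sequence: str, max_len: int = 2) -> Counter:
    """Sparse count vector over contiguous substrings (an assumed representation)."""
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(sequence) - n + 1):
            counts[sequence[i:i + n]] += 1
    return counts


def cosine_similarity(x: Counter, y: Counter) -> float:
    """Dot product of x and y divided by the product of their Euclidean norms."""
    dot = sum(x[key] * y[key] for key in x.keys() & y.keys())
    norm_x = math.sqrt(sum(value * value for value in x.values()))
    norm_y = math.sqrt(sum(value * value for value in y.values()))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # convention for empty sequences
    return dot / (norm_x * norm_y)


s1 = substring_vector("aaa")
s2 = substring_vector("vvv")
print(cosine_similarity(s1, s2))  # 0.0: no shared substrings
```

With count vectors all entries are non-negative, so the similarity lies between 0 and 1.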

Clustering Method: Hierarchical agglomerative clustering (Ward linkage)

Technical Innovations

  1. Automated Extraction: Uses Llama 3.1 8B instruction-tuned model to extract linguistic features through in-context learning, avoiding hand-crafted rules and expert annotation
  2. Sequence Representation: Maps narratives to symbolic sequences supporting computational biology-inspired pattern analysis
  3. Multi-scale Analysis: Multi-level pattern recognition from individual symbols to complex substrings
  4. Psychological Association: Establishes connections between linguistic patterns and psychological states

Experimental Setup

Dataset

DreamBank Corpus:

  • Thousands of dream narratives collected in the United States
  • Analysis of five series: blind (long-term blind dreamers, n=361), ed (widower, n=139), izzy (adolescent, n=1091), merri (artist, n=202), viet (Vietnam War veteran with PTSD, n=566)
  • Baseline (norm) construction: Random sampling of 10 narratives per series, totaling 720 dream narratives (a total that implies sampling across 72 series, not only the five analyzed above)

Evaluation Metrics

  • Odds Ratio: Measures relative likelihood of specific substrings appearing in different series
  • Fisher's Exact Test (Holm-Bonferroni correction): Statistical significance testing
  • Silhouette Score: Clustering quality assessment
  • Cosine Similarity: Sequence similarity measurement
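The first two metrics can be sketched with the standard library alone. The code below is a minimal illustration, not the authors' implementation: the 2x2 table layout, the function names, and the example counts are all assumptions.

```python
from math import comb


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio for the table [[a, b], [c, d]].

    Rows: target series / baseline. Columns: pattern present / absent.
    Raises ZeroDivisionError if b or c is zero.
    """
    return (a * d) / (b * c)


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher p-value: sum over tables no more likely than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    p_value = sum(
        prob(x) for x in range(low, high + 1) if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


def holm_bonferroni(p_values: list) -> list:
    """Holm-adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        running_max = max(running_max, (m - rank) * p_values[i])
        adjusted[i] = min(1.0, running_max)
    return adjusted


# Hypothetical counts: pattern present/absent in a series vs. the baseline.
print(odds_ratio(3, 1, 1, 3))  # 9.0
print(round(fisher_exact_two_sided(3, 1, 1, 3), 4))  # 0.4857
```

The Holm step matters because many substrings are tested at once; without a correction, some would appear significant by chance.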

Comparison Methods

  • Comparative analysis with baseline (norm)
  • Pattern comparison across different series

Implementation Details

  • Model: Llama 3.1 8B Instruct
  • Hardware: Tesla V100 32GB, 80 hours runtime
  • Preprocessing: spaCy sentence segmentation, then language model sentence segmentation
  • Validation: Quantitative validation on 50 gold-standard sentences with 100% prediction accuracy

Experimental Results

Main Results

Vietnam War Veteran (viet) Case Analysis:

Substring Distribution Findings:

  • Verbal processes: odds 40% higher than baseline (OR=1.4, p<0.05)
  • Mental processes: odds 40% lower than baseline (OR=0.6, p<0.05)
  • Consecutive verbal process patterns significant: verbal.verbal (OR=2.00), verbal.verbal.verbal (OR=1.75)

Clustering Analysis:

  • Optimal clustering: 2 clusters with maximum silhouette score
  • Cluster 1 representative sequence: Highly action-oriented (action processes 23 times, mental processes 2 times), covering 274 sequences
  • Cluster 2 representative sequence: Action-state balanced (action processes 13 times, state processes 16 times, mental processes 4 times), covering 179 sequences
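Selecting the number of clusters by silhouette score can be illustrated as follows. This is a minimal pure-Python sketch on toy one-dimensional data; the distance matrix, the labels, and the function name are assumptions, and the paper's actual distances between sequences are not reproduced here.

```python
def silhouette_score(dist, labels):
    """Mean silhouette over all points, given a full distance matrix and labels.

    Requires at least two clusters. Singleton clusters contribute 0.
    """
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)

    total = 0.0
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # silhouette of a singleton is defined as 0
        # a: mean distance to the other members of the point's own cluster.
        a = sum(dist[i][j] for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist[i][j] for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(labels)


points = [0.0, 1.0, 10.0, 11.0]  # toy data standing in for sequences
dist = [[abs(p - q) for q in points] for p in points]

good = silhouette_score(dist, [0, 0, 1, 1])
bad = silhouette_score(dist, [0, 1, 0, 1])
print(good > bad)  # True: the first labeling separates the data better
```

In practice one would compute the score for each candidate number of clusters and keep the partition with the highest value.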

Case Analysis

Example Sequence Transformation:

"I wake in a dark room. I feel a cold wind. I tell myself to move."
→ Clause analysis → Feature extraction → Sequence: amv
→ Substrings: {am, mv}
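The last two steps of this transformation can be sketched in Python. The per-clause process labels are taken as given here (in the paper they come from the language model); the symbol "s" for state and the function names are assumptions for illustration.

```python
# One-letter codes; "a", "m", "v" match the example above, "s" is assumed.
SYMBOLS = {"action": "a", "mental": "m", "verbal": "v", "state": "s"}


def encode(process_labels):
    """Map a list of per-clause process labels to a symbol string."""
    return "".join(SYMBOLS[label] for label in process_labels)


def substrings_of_length(sequence, n):
    """Set of contiguous substrings of exactly length n."""
    return {sequence[i:i + n] for i in range(len(sequence) - n + 1)}


# "I wake ..." -> action, "I feel ..." -> mental, "I tell myself ..." -> verbal
labels = ["action", "mental", "verbal"]
sequence = encode(labels)
print(sequence)                                   # amv
print(sorted(substrings_of_length(sequence, 2)))  # ['am', 'mv']
```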

Experimental Findings

  1. Psychological State Association: The viet series constructs experience primarily through action and verbal processes rather than mental processes, potentially related to how trauma affects cognitive and emotional processing
  2. Pattern Consistency: The veteran's narratives follow two templates: highly action-oriented structures or state-action alternating structures
  3. Automation Effectiveness: The language model reaches 100% accuracy on the 50 gold-standard validation sentences

Related Work

Systemic Functional Linguistics Parsing

  • Early rule-based approaches: Limited coverage, domain-sensitive
  • Graph-based pipelines: Convert dependency trees to SFL networks
  • Supervised methods: Require expert-annotated data
  • This Paper's Innovation: Few-shot language model approach without hand-crafted grammars or verb dictionaries

Computational Analysis of Dream Narratives

  • Traditional methods: Dictionary-based manual coding systems
  • Distributional methods: Semantic space embeddings and topic clustering
  • Hybrid systems: Dictionary scoring + classifiers
  • Language model approaches: Sentiment detection and character prediction
  • This Paper's Distinction: Focus on "how it is said" rather than "what is said"

Conclusions and Discussion

Main Conclusions

  1. Theoretical Contribution: Successfully formalizes style as sequence patterns of linguistic choices grounded in systemic functional linguistics
  2. Method Effectiveness: Automated framework can reveal psychologically meaningful patterns
  3. Application Potential: Supports narrative reconstruction and targeted interventions in therapeutic settings

Limitations

  1. Automated Extraction Errors: Language models may misclassify processes or participants, affecting pattern reliability
  2. Psychological Interpretation: Associations between linguistic choices and psychological states remain correlational and descriptive, requiring clinical validation
  3. Feature Scope: Currently focuses only on process types; future work needs to extend to more fine-grained linguistic features

Future Directions

  1. Author Profiling: Author characteristic inference based on subjective experience patterns
  2. Style-Conditioned Generation: Generate narratives from choice sequences supporting therapeutic interventions
  3. Complex Systems Approaches: Apply measures such as Lempel-Ziv complexity to quantify sequence redundancy
  4. Clinical Validation: Combine with clinical assessments to validate psychological interpretations
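To give a sense of how a Lempel-Ziv style measure quantifies redundancy, the sketch below counts phrases in an LZ78-style parse. This is a simpler variant chosen for illustration and may differ from the measure the authors have in mind; the function name and example sequences are assumptions.

```python
def lz78_phrase_count(sequence: str) -> int:
    """Number of phrases in an LZ78-style parse.

    Each phrase is the shortest prefix of the remaining input not seen
    as a phrase before. Fewer phrases for a given length means more redundancy.
    """
    seen = set()
    phrase = ""
    count = 0
    for symbol in sequence:
        phrase += symbol
        if phrase not in seen:
            seen.add(phrase)
            count += 1
            phrase = ""
    if phrase:
        count += 1  # trailing phrase that was already in the dictionary
    return count


print(lz78_phrase_count("aaaaaaaa"))  # 4: a | aa | aaa | aa
print(lz78_phrase_count("amvsamvs"))  # 6: a | m | v | s | am | vs
```

A narrative that repeats the same process type yields fewer phrases than one of equal length that varies its choices.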

In-Depth Evaluation

Strengths

  1. Interdisciplinary Innovation: Successfully integrates functional linguistics, computer science, and psychology
  2. Methodological Advancement: Presented as a first attempt to use large language models to automate SFL analysis
  3. Practical Value: Provides operational tools for therapeutic applications
  4. Theoretical Rigor: Grounded in mature systemic functional linguistics theory
  5. Scalability: Framework adaptable to different linguistic features and application scenarios

Weaknesses

  1. Limited Validation: Extraction is validated on only 50 gold-standard sentences; larger-scale validation against expert annotation is needed
  2. Psychological Association: Lacks direct validation against clinical diagnoses
  3. Language Coverage: Only tested on English dream narratives; cross-linguistic applicability unknown
  4. Feature Simplification: Current analysis relatively simple, not fully leveraging SFL richness

Impact

  1. Academic Contribution: Provides new research paradigm for intersection of computational linguistics and psychology
  2. Application Prospects: Broad application potential in digital therapeutics, author analysis, style generation, and related fields
  3. Reproducibility: Authors provide complete prompts, hyperparameters, and extracted sequences supporting research reproduction

Applicable Scenarios

  1. Clinical Psychology: Assist therapists in analyzing patient narrative patterns
  2. Forensic Linguistics: Author identification and characteristic analysis
  3. Literary Studies: Quantitative analysis of authorial style
  4. Digital Health: Mental health monitoring through personal diaries and narratives
  5. Educational Applications: Writing style guidance and personalized feedback

References

The paper cites rich interdisciplinary literature, including:

  • Halliday et al. (2014): Theoretical foundations of systemic functional linguistics
  • Tellier and Finkel (1995): Early work on formalizing linguistic style
  • Banks (2019): SFL practical guidance
  • Domhoff and Schneider (2008): Quantitative dream analysis methods
  • Extensive literature from computational linguistics, psychology, and cognitive science

Overall, the paper formalizes narrative style as sequence patterns of linguistic choices and pairs that formalization with an automated extraction pipeline. Its psychological interpretations remain correlational and await clinical validation.