There has been considerable interest in modelling the spread of information on social networks using machine learning models. Here, we consider the problem of predicting the spread of new information, i.e. when a user propagates information about a topic previously unseen by the user. In existing work, information and users are randomly assigned to a test or training set, ensuring that both sets are drawn from the same distribution. In the spread of new information, the problem becomes an out-of-distribution generalisation classification task. Our experimental results reveal that while existing algorithms, which predominantly use features derived from the content of messages, perform well when the training and test distributions are the same, these algorithms perform much worse when the test set is out-of-distribution, i.e. when the topic (hashtag) of the testing data is absent from the training data. We then show that if the message features are supplemented or replaced with features derived from users' profile and past behaviour, the out-of-distribution prediction is greatly improved, with the F1 score increasing from 0.117 to 0.705. Our experimental results suggest that a significant component of reposting behaviour for previously unseen topics can be predicted from users' profile and past behaviour, and is largely content-agnostic.
- Paper ID: 2505.15370
- Title: Modelling the Spread of New Information on Social Networks
- Authors: Ziming Xu, Shi Zhou, Vasileios Lampos, Ingemar J. Cox
- Classification: cs.SI (Social and Information Networks)
- Publication Date: October 14, 2025 (arXiv v3)
- Paper Link: https://arxiv.org/abs/2505.15370v3
This paper investigates the problem of predicting information spread on social networks, specifically predicting whether users will retweet information about previously unseen topics. Existing research typically assigns information and users randomly to training and test sets, ensuring both sets come from the same distribution. However, the new information spread problem is fundamentally an out-of-distribution generalization classification task. Experimental results demonstrate that existing algorithms, which primarily rely on message content features, perform well when training and test distributions are identical, but show significant performance degradation when the test set is out-of-distribution (i.e., test topics do not exist in training data). The study finds that supplementing or replacing message features with user profile and historical behavior features substantially improves out-of-distribution prediction performance, with F1 scores improving from 0.117 to 0.705. Results indicate that retweet behavior for unseen topics can be largely predicted through user profiles and historical behavior, and is essentially content-independent.
The core problem addressed in this paper is new information spread prediction, namely predicting whether users will retweet information about previously unseen topics. This is a typical out-of-distribution generalization problem, as test topics are completely absent from training data.
- Interdisciplinary Importance: Information spread prediction is significant across multiple disciplines including computer science, social sciences, political science, and marketing
- Practical Application Value: Possesses important applications in marketing campaigns, political propaganda, misinformation, and rumor propagation scenarios
- Theoretical Significance: Contributes to understanding the underlying mechanisms of information diffusion on social media
- Over-reliance on Message Content: Existing algorithms primarily use features extracted from message text content
- Lack of Out-of-Distribution Evaluation: Existing research typically employs random dataset partitioning, ensuring training and test data come from identical distributions
- Underestimation of User-Related Data: Important information such as user profiles, follow lists, and historical behavior are underutilized
New topics emerge frequently on social media platforms (e.g., breaking news), so a practical predictor must handle topics it has never seen. This out-of-distribution setting is both harder and more relevant in practice than traditional same-distribution classification.
- Proposed a New Evaluation Paradigm: First explicitly distinguishes between same-distribution and out-of-distribution prediction, providing a more comprehensive evaluation framework for retweet prediction research
- Constructed a Comprehensive Feature System: Identified and constructed 303 features, including 78 message-related features and 225 user-related features
- Revealed the Importance of User Features: Experiments demonstrate that user-related features are crucial for out-of-distribution prediction, with F1 scores improving from 0.117 to 0.705
- Provided Important Theoretical Insights: Discovered that retweet behavior is largely content-independent and primarily determined by user characteristics ("It is who we are, not what we see")
Retweet prediction is defined as predicting whether a recipient will retweet a message received from a sender:
$$f: \{M, U_S, U_R\} \rightarrow y \in \{0, 1\}$$
Where:
- $M$: Message
- $U_S$: Sender
- $U_R$: Recipient
- $y=1$: Recipient will retweet the message; $y=0$: Will not retweet
The message data contains the text content of 111,401 X (Twitter) messages, from which 78 message-related features are extracted:
- Topic Features (39): Topic identification using Twitter-roBERTa and LDA models
- Linguistic Features (10): Grammatical correctness, polarity, subjectivity, etc.
- Readability Features (11): Flesch reading difficulty, SMOG index, etc.
- Sentiment Features (5): Positive, negative, and neutral sentiment scores
- Emotion Features (8): Probabilities of anger, joy, fear, and other emotions
- Hate Speech Features (4): Aggressiveness and hate measures
- Tag Features (1): Presence of specific hashtags
The user data falls into three categories:
User Profile Data, Data(U-P):
- User profiles and follow lists
- 30 features extracted: follower count, influence measures, network relationships, etc.
User Historical Behavior Data, Data(U-HA):
- Metadata from the most recent 50 historical messages
- 38 features extracted: retweet rate, interaction patterns, inter-user interactions, etc.
User Historical Message Data, Data(U-HM):
- Text content from the most recent 50 historical messages
- 157 features extracted: aggregated features of historical messages, topic similarity, etc.
The decision tree (DT) models use XGBoost; feature importance analysis on these models reveals the critical role of user features. Main hyperparameters:
- Maximum depth: 8
- Learning rate: 0.3
- Number of estimators: 100
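As a rough illustration, the hyperparameters listed above could be passed to XGBoost's scikit-learn wrapper as follows. This is a minimal sketch, not the authors' code: the helper name `build_classifier` is hypothetical, and leaving every other setting at the library default is an assumption, since the summary lists only these three values.

```python
# Hyperparameters as listed above. Leaving all other settings at the
# library defaults is an assumption; the summary does not list them.
XGB_PARAMS = {
    "max_depth": 8,
    "learning_rate": 0.3,
    "n_estimators": 100,
}


def build_classifier(params=XGB_PARAMS):
    """Return an XGBoost classifier configured with the listed values.

    The import is deferred so this module loads even when xgboost
    is not installed.
    """
    from xgboost import XGBClassifier

    return XGBClassifier(**params)
```

Calling `build_classifier().fit(X_train, y_train)` would then train one DT variant on whichever feature columns `X_train` holds.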
The neural network (NN) models extend the SUA-ACNN model with added MLP components for processing user data:
- NN-M: Uses only message data
- NN-U: Uses only user data
- NN-ALL: Uses all data types
The BERT baseline uses BERT-base to process message text, generating semantic embeddings for prediction.
- Out-of-Distribution Evaluation Design: For each hashtag, trains on data from 13 other hashtags and tests on that specific hashtag
- Negative Sample Generation Strategy: Selects the most similar negative sample for each positive sample, ensuring evaluation relevance
- Multi-level Feature System: Systematically extracts features from multiple dimensions including messages, user profiles, and historical behavior
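The out-of-distribution design described above amounts to a leave-one-hashtag-out split. The sketch below shows one way to express it, assuming each sample is a dict with a hypothetical `"hashtag"` key; the toy data is made up for illustration.

```python
from collections import defaultdict


def leave_one_hashtag_out(samples):
    """Yield (held_out_hashtag, train, test) for every hashtag.

    `samples` is a list of dicts with a hypothetical "hashtag" key.
    The test set holds one hashtag; the training set holds all others.
    """
    by_tag = defaultdict(list)
    for sample in samples:
        by_tag[sample["hashtag"]].append(sample)
    for held_out in sorted(by_tag):
        train = [s for tag, group in by_tag.items()
                 if tag != held_out for s in group]
        test = list(by_tag[held_out])
        yield held_out, train, test


# Made-up toy data: three hashtags instead of the paper's 14.
toy = [
    {"hashtag": "#a", "label": 1},
    {"hashtag": "#a", "label": 0},
    {"hashtag": "#b", "label": 1},
    {"hashtag": "#c", "label": 0},
]
splits = list(leave_one_hashtag_out(toy))
```

With 14 hashtags this yields 14 splits, each training on 13 hashtags and testing on the remaining one, so no test topic ever appears in training.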
- Data Source: X platform (formerly Twitter) Academic API
- Time Range: July 27 to August 14, 2022
- Data Scale:
  - 111,401 messages
  - 44,014 retweet events (positive samples)
  - 79,707 unique users
  - 3.8 million historical messages
- Topic Coverage: 14 popular hashtags
Creates three datasets with different positive-negative sample ratios:
- 1:1 Dataset: Each positive sample paired with one most similar negative sample
- 1:5 Dataset: Each positive sample paired with 5 most similar negative samples
- 1:10 Dataset: Each positive sample paired with 5 similar + 5 random negative samples
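The three sampling schemes can be sketched as below. The summary does not say how similarity between samples is measured, so `similarity` is a hypothetical scoring function supplied by the caller (higher means more alike), and the integer "samples" are made up purely to keep the example small.

```python
import random

# Negatives per positive for each dataset: (most similar, random).
RATIOS = {"1:1": (1, 0), "1:5": (5, 0), "1:10": (5, 5)}


def pick_negatives(positive, candidates, similarity, ratio, seed=0):
    """Select negatives for one positive sample under the given ratio.

    `candidates` are non-retweet samples; `similarity(a, b)` is a
    hypothetical function, not the measure used in the paper.
    """
    n_similar, n_random = RATIOS[ratio]
    ranked = sorted(candidates,
                    key=lambda c: similarity(positive, c),
                    reverse=True)
    similar, rest = ranked[:n_similar], ranked[n_similar:]
    rng = random.Random(seed)
    # Random negatives are drawn from candidates not already chosen.
    return similar + rng.sample(rest, min(n_random, len(rest)))


def toy_similarity(a, b):
    """Toy stand-in: closer integers count as more similar."""
    return -abs(a - b)


candidates = [c for c in range(30) if c != 10]
negatives_1_10 = pick_negatives(10, candidates, toy_similarity, "1:10")
```

Drawing the random negatives only from the remaining candidates keeps the ten negatives distinct; whether the original datasets enforce this is not stated.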
Primarily uses F1 score:
$$F_1 = \frac{TP}{TP + \frac{1}{2}(FP + FN)}$$
For results across multiple hashtags, calculates overall mean and standard deviation.
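For concreteness, the metric and its aggregation can be computed as in the sketch below. F1 is taken in its standard form, TP divided by TP plus half of (FP + FN). The use of the sample standard deviation is an assumption, since the summary does not say which variant is reported, and the input values are made up.

```python
from statistics import mean, stdev


def f1_score(tp, fp, fn):
    """F1 = TP / (TP + (FP + FN) / 2); defined as 0.0 when undefined."""
    denom = tp + 0.5 * (fp + fn)
    return tp / denom if denom else 0.0


def summarize(per_hashtag_f1):
    """Return (mean, standard deviation) over per-hashtag F1 scores.

    Uses the sample standard deviation, which is an assumption.
    """
    return mean(per_hashtag_f1), stdev(per_hashtag_f1)


example_f1 = f1_score(tp=8, fp=2, fn=2)   # 8 / (8 + 2) = 0.8
mu, sigma = summarize([0.6, 0.8])         # made-up per-hashtag scores
```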
Conducts three types of experiments:
- Experiment I: Same-distribution prediction with mixed hashtags
- Experiment II: Same-distribution prediction with single hashtag
- Experiment III: Out-of-distribution prediction
F1 scores on 1:5 dataset:
| Model | DT-ALL | DT-U | DT-M | NN-ALL | NN-U | NN-M | BERT |
|---|---|---|---|---|---|---|---|
| F1 Score | 0.884±0.002 | 0.852±0.005 | 0.758±0.002 | 0.844±0.009 | 0.835±0.004 | 0.740±0.003 | 0.740±0.010 |
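Overall F1 scores for out-of-distribution prediction (Experiment III), reported as mean ± standard deviation across hashtags (μ̄±σ̄):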
| Model | DT-ALL | DT-U | DT-M | NN-ALL | NN-U | NN-M | BERT |
|---|---|---|---|---|---|---|---|
| F1 Score | 0.697±0.076 | 0.705±0.084 | 0.117±0.131 | 0.623±0.109 | 0.702±0.071 | 0.108±0.055 | 0.091±0.101 |
- Critical Role of User Features:
  - Models using only message features show dramatic performance degradation in out-of-distribution prediction
  - Models using only user features perform comparably to models using all features in out-of-distribution prediction
- Feature Importance Analysis:
  - Among the top 20 most important features, 17 are user-related
  - The most important feature is "whether the recipient follows the sender" (U-P_R_FollowS)
- Significant Performance Improvement:
  - Out-of-distribution prediction F1 score improves from 0.117 (DT-M) to 0.705 (DT-U), roughly a sixfold increase (about 503% relative improvement)
  - Demonstrates the importance of user features for new topic prediction
Through comparative experiments with different feature combinations, findings include:
- U-P and U-HA Features: Contribute most to out-of-distribution prediction
- U-HM Features: Show similar performance to message features with limited out-of-distribution capability
- Message Features: Nearly ineffective in out-of-distribution settings
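An ablation over feature combinations like the one summarized above can be organized by selecting columns per feature group. In this sketch the `U-P_` prefix follows the feature name U-P_R_FollowS quoted in this summary; the `M_`, `U-HA_`, and `U-HM_` prefixes and all other column names are hypothetical, chosen by analogy.

```python
# Column-name prefix per feature group. Only "U-P_" is attested in this
# summary (via U-P_R_FollowS); the other prefixes are assumptions.
GROUP_PREFIXES = {
    "M": "M_",
    "U-P": "U-P_",
    "U-HA": "U-HA_",
    "U-HM": "U-HM_",
}

# Feature groups used by each model variant (-M, -U, -ALL).
VARIANTS = {
    "M": ["M"],
    "U": ["U-P", "U-HA", "U-HM"],
    "ALL": ["M", "U-P", "U-HA", "U-HM"],
}


def select_columns(columns, groups):
    """Return the columns whose names start with any selected prefix."""
    prefixes = tuple(GROUP_PREFIXES[g] for g in groups)
    return [c for c in columns if c.startswith(prefixes)]


# Hypothetical column names, except U-P_R_FollowS.
columns = ["M_Sentiment", "U-P_R_FollowS", "U-HA_RetweetRate", "U-HM_TopicSim"]
user_only = select_columns(columns, VARIANTS["U"])
profile_and_activity = select_columns(columns, ["U-P", "U-HA"])
```

Passing an arbitrary list of groups, as in the last line, covers the finer-grained combinations compared in the ablation.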
Existing research falls into several categories:
- Message Popularity Prediction: Predicting message propagation scale
- Diffusion Tree Prediction: Predicting propagation paths and timing
- Retweet Prediction: Predicting specific user retweet behavior
- Feature Dependency: Over-reliance on message text features
- Evaluation Limitations: Lack of out-of-distribution evaluation
- Insufficient Data Utilization: Underutilization of user profile and behavioral data
- First systematic out-of-distribution evaluation
- Comprehensive consideration of user-related features
- Provides more realistic evaluation scenarios
- Content Independence: Retweet behavior is largely content-independent and primarily determined by user characteristics
- Generalization Capability of User Features: User profiles and historical behavior possess cross-topic generalization capability
- Importance of Evaluation Paradigm: Out-of-distribution evaluation is more meaningful for practical applications
- Platform Limitations: Research based solely on X platform data
- Time Window: Only considers retweet behavior within 24 hours
- Feature Engineering: Some feature extraction depends on specific tools and models
- Cultural Context: Does not consider behavioral differences across cultural backgrounds
- Cross-Platform Research: Extend to other social media platforms
- Dynamic Modeling: Consider temporal evolution of user behavior
- Causal Inference: Deepen understanding of causal relationships between user features and retweet behavior
- Real-Time Applications: Develop real-time prediction systems
- Innovative Problem Formulation:
  - First explicitly proposes out-of-distribution retweet prediction
  - More aligned with practical application scenarios
- Rigorous Experimental Design:
  - Multiple model comparisons for validation
  - Detailed ablation studies
  - Statistical significance analysis
- Comprehensive Feature Engineering:
  - Systematic construction of 303 features
  - Multi-dimensional feature importance analysis
- Profound Theoretical Contributions:
  - Important insight of "It is who we are, not what we see"
  - Provides new perspective for understanding social media behavior
- Data Representativeness:
  - Uses only 14 hashtags, potentially insufficient
  - Short time span, lacking long-term observation
- Feature Interpretability:
  - Psychological mechanisms of some user features unclear
  - Lacks in-depth analysis of feature interactions
- Practical Considerations:
  - Obtaining complete user history data may be difficult in practical applications
  - Insufficient consideration of privacy protection
- Model Complexity:
  - 303 features may contain redundancy
  - Lacks feature selection and dimensionality reduction analysis
- Academic Contribution:
  - Provides new evaluation paradigm for information spread research
  - Challenges assumptions of existing methods
- Practical Value:
  - Provides guidance for social media platform recommendation algorithms
  - Offers new insights for marketing and public opinion monitoring
- Reproducibility:
  - Detailed experimental setup and parameter descriptions
  - Open feature engineering methodology
- Social Media Platforms: Content recommendation and user behavior prediction
- Digital Marketing: Target user identification and content strategy
- Public Opinion Monitoring: Trending topic propagation prediction
- Academic Research: Social network analysis and behavior modeling
The paper cites 48 related references, covering:
- Information diffusion theory research
- Machine learning method applications
- Social media behavior analysis
- Natural language processing techniques
Key references include classical retweet prediction work, neural network models (such as BERT and SUA-ACNN), and foundational research in social network analysis.
Overall Assessment: This is a high-quality research paper with significant contributions in problem formulation, methodological innovation, and experimental validation. Particularly, the proposal of out-of-distribution prediction and the discovery of user feature importance open new directions for social media information spread research. Despite some limitations, its theoretical value and practical significance are prominent, and it is expected to have important impact on related fields.