Generative AI and Firm Productivity: Field Experiments in Online Retail

Fang, Yuan, Zhang et al.
We quantify the impact of Generative Artificial Intelligence (GenAI) on firm productivity through a series of large-scale randomized field experiments involving millions of users and products at a leading cross-border online retail platform. Over six months in 2023-2024, GenAI-based enhancements were integrated into seven consumer-facing business workflows. We find that GenAI adoption significantly increases sales, with treatment effects ranging from 0% to 16.3%, depending on GenAI's marginal contribution relative to existing firm practices. Because inputs and prices were held constant across experimental arms, these gains map directly into total factor productivity improvements. Across the four GenAI applications with positive effects, the implied annual incremental value is approximately $5 per consumer, an economically meaningful impact given the retailer's scale and the early stage of GenAI adoption. The primary mechanism operates through higher conversion rates, consistent with GenAI reducing frictions in the marketplace and improving consumer experience. We also document substantial heterogeneity: smaller and newer sellers, as well as less experienced consumers, exhibit disproportionately larger gains. Our findings provide novel, large-scale causal evidence on the productivity effects of GenAI in online retail, highlighting both its immediate value and broader potential.

Basic Information

  • Paper ID: 2510.12049
  • Title: Generative AI and Firm Productivity: Field Experiments in Online Retail
  • Authors: Lu Fang, Zhe Yuan, Kaifu Zhang, Dante Donati, Miklos Sarvary
  • Classification: econ.GN cs.AI q-fin.EC
  • Publication Date: October 10, 2025 (Preliminary version)
  • Paper Link: https://arxiv.org/abs/2510.12049

Abstract

This study quantifies the impact of generative artificial intelligence (GenAI) on firm productivity through large-scale randomized field experiments conducted on a leading cross-border online retail platform. Over a six-month period in 2023-2024, GenAI-enhanced features were integrated into seven consumer-facing business workflows. The research finds that GenAI adoption significantly increased sales, with treatment effects ranging from 0% to 16.3%, depending on GenAI's marginal contribution relative to existing firm practices. Since inputs and prices remained constant across experimental groups, these gains directly translate into improvements in total factor productivity (TFP). Among the four GenAI applications with positive effects, the implied annual incremental value is approximately $5 per consumer, representing an economically significant impact given the retailer's scale and the early-stage nature of GenAI adoption.

Research Background and Motivation

Problem Definition

Despite the rapid proliferation of GenAI tools and widespread interest in their potential to reshape productivity across industries, empirical evidence of measurable returns from GenAI on firm-level, revenue-generating productivity remains scarce. Existing research focuses primarily on individual-level task efficiency, which makes firm-level productivity gains difficult to detect.

Research Significance

  1. Practical Demand: Investors and industry practitioners express concerns about whether large-scale AI investments can translate into sustained commercial returns
  2. Theoretical Gap: Existing literature primarily focuses on supply-side efficiency gains, lacking evidence of demand-side value creation
  3. Methodological Challenge: Requires detailed revenue data and causal identification settings, which are rarely available in practice

Limitations of Existing Approaches

  1. Implementation Constraints: Technical expertise limitations and complementary investment requirements may delay implementation
  2. Scope Limitations: Most GenAI applications remain in pilot stages, focusing on narrowly defined tasks
  3. Identification Difficulties: Lack of detailed revenue data and causal identification settings required for rigorous empirical analysis

Core Contributions

  1. Provides Large-Scale Real-World Evidence: First to provide evidence of GenAI's causal impact on firm productivity through randomized field experiments involving millions of users and products
  2. Reveals Demand-Side Value Creation Mechanisms: Demonstrates that GenAI creates productivity gains by reducing market frictions and enhancing consumer experience, rather than solely through cost reduction
  3. Discovers Heterogeneous Effects: Smaller and newer sellers, as well as less experienced consumers, derive greater benefits from GenAI
  4. Quantifies Economic Impact: Estimates that four GenAI applications with positive effects create approximately $5 in annual incremental value per consumer

Methodology Details

Theoretical Framework

Based on the standard Solow growth model's Cobb-Douglas production function: \(Y = AK^{\alpha}L^{1-\alpha}, \quad 0 < \alpha < 1\)

where Y is output, K is capital stock, L is labor input, and A is total factor productivity (TFP).

Under conditions where capital and labor inputs remain constant: \(d\ln K = 0,\ d\ln L = 0 \Rightarrow d\ln Y = d\ln A\)
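
The step from the production function to the TFP interpretation follows from taking logs and differentiating. A short derivation, using only the symbols defined above and treating \(\alpha\) as constant:

```latex
\begin{align*}
\ln Y &= \ln A + \alpha \ln K + (1-\alpha)\ln L \\
d\ln Y &= d\ln A + \alpha\, d\ln K + (1-\alpha)\, d\ln L \\
d\ln K = d\ln L = 0 \;&\Rightarrow\; d\ln Y = d\ln A
\end{align*}
```

Read this way, a change in output measured with inputs and prices held fixed is attributed entirely to A. For small effects, a percentage change in sales is approximately equal to the corresponding log change.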

Experimental Design

Seven Business Workflows

  1. Pre-Sales Service Chatbot: 24/7 GenAI customer service vs. pre-programmed automated responses
  2. Search Query Optimization: GenAI semantic understanding and query optimization vs. basic translation
  3. Product Description Generation: GenAI-generated structured descriptions vs. manual descriptions
  4. Marketing Push Messages: GenAI-generated personalized messages vs. standardized messages
  5. Google Ads Title Optimization: GenAI-optimized ad titles vs. original titles
  6. Return Dispute Resolution: GenAI agents vs. manual handling
  7. Real-Time Chat Translation: GenAI real-time translation assistance vs. no translation support

Experimental Characteristics

  • Randomization Level: Consumer-level (6 experiments) and product-level (1 experiment)
  • Sample Size: Ranging from 30,000 to 13.7 million participants
  • Experimental Period: September 2023 to June 2024
  • Overlap Rate: Cross-experiment consumer overlap below 1%
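
The summary does not describe how the platform assigned consumers or products to experimental arms. One common way to implement unit-level randomization is deterministic hash bucketing; the sketch below illustrates that approach only as an assumption. The function name `assign_arm`, the experiment label, and the 50/50 split are all hypothetical.

```python
import hashlib

def assign_arm(unit_id: str, experiment: str, treat_share: float = 0.5) -> str:
    """Deterministically map a unit (consumer or product) to an arm.

    Hypothetical helper: salting the hash with the experiment name makes
    assignments in different experiments effectively independent.
    """
    digest = hashlib.sha256(f"{experiment}:{unit_id}".encode("utf-8")).hexdigest()
    # Map the first 8 hex digits to a number in [0, 1).
    bucket = int(digest[:8], 16) / 16**8
    return "treatment" if bucket < treat_share else "control"

# Illustrative IDs only; these are not identifiers from the study.
arms = [assign_arm(f"consumer-{i}", "chatbot") for i in range(10_000)]
treated_share = arms.count("treatment") / len(arms)
```

Because the assignment depends only on the ID and the experiment label, the same unit always lands in the same arm, which keeps exposure stable over the experimental period.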

Econometric Model

Basic regression specification: \(y_i = \beta \times Treat_i + \alpha_{c(i)} + \varepsilon_i\)

where \(y_i\) is the outcome variable, \(Treat_i\) is the treatment group indicator, and \(\alpha_{c(i)}\) is the cohort fixed effect.
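
A regression of this form, with a treatment indicator and cohort fixed effects, can be estimated by demeaning the outcome and the treatment indicator within each cohort and then running a one-variable least-squares fit. The sketch below does this in plain Python on synthetic data. The data-generating numbers are made up for illustration, and the summary does not say which software or standard errors the authors used.

```python
import random
from collections import defaultdict

def fe_treatment_effect(y, treat, cohort):
    """Within estimator of beta in y_i = beta * Treat_i + alpha_c(i) + e_i."""
    totals = defaultdict(lambda: [0.0, 0.0, 0])  # cohort -> [sum y, sum treat, count]
    for yi, ti, ci in zip(y, treat, cohort):
        totals[ci][0] += yi
        totals[ci][1] += ti
        totals[ci][2] += 1
    num = den = 0.0
    for yi, ti, ci in zip(y, treat, cohort):
        sum_y, sum_t, n = totals[ci]
        y_dm = yi - sum_y / n   # outcome minus its cohort mean
        t_dm = ti - sum_t / n   # treatment minus its cohort mean
        num += t_dm * y_dm
        den += t_dm * t_dm
    return num / den

# Synthetic data (not from the paper): 5 cohorts with different baselines.
rng = random.Random(0)
true_beta = 0.3
cohort = [rng.randrange(5) for _ in range(20_000)]
treat = [rng.randrange(2) for _ in range(20_000)]
y = [true_beta * t + 0.5 * c + rng.gauss(0.0, 1.0) for t, c in zip(treat, cohort)]
beta_hat = fe_treatment_effect(y, treat, cohort)
```

Demeaning removes the cohort intercepts, so the estimate compares treated and control units within the same cohort only.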

Experimental Setup

Data Sources

Partnership with a world-leading cross-border e-commerce platform, obtaining:

  • Consumer-level transaction data (spending, conversion, clicks)
  • Seller characteristic data (annual sales, years of operation, sub-accounts)
  • Product characteristic data (category concentration, price, sales volume)
  • Consumer demographic and shopping history data

Evaluation Metrics

  • Primary Metrics: Sales (USD), conversion rate
  • Secondary Metrics: Product views, clicks, orders, average cart value
  • Mechanism Metrics: Click-through rate, click-to-order rate
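
The summary names these metrics without defining them. The sketch below shows one plausible set of ratio definitions computed from per-arm totals. The definitions, the `ArmTotals` fields, and the example counts are assumptions for illustration, not the paper's exact constructions.

```python
from dataclasses import dataclass

@dataclass
class ArmTotals:
    """Hypothetical aggregate counts for one experimental arm."""
    users: int
    converters: int   # users who placed at least one order
    views: int
    clicks: int
    orders: int
    sales_usd: float

def funnel_metrics(t: ArmTotals) -> dict:
    """Assumed ratio definitions for the primary, secondary and mechanism metrics."""
    return {
        "conversion_rate": t.converters / t.users,
        "click_through_rate": t.clicks / t.views,
        "click_to_order_rate": t.orders / t.clicks,
        "average_cart_value": t.sales_usd / t.orders,
        "sales_per_user": t.sales_usd / t.users,
    }

# Made-up totals, chosen only to make the arithmetic easy to follow.
example = ArmTotals(users=1000, converters=40, views=20000,
                    clicks=1500, orders=50, sales_usd=1250.0)
metrics = funnel_metrics(example)
```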

Sample Statistics

Descriptive statistics of key variables across experiments show:

  • Conversion rate: 0.004-0.09
  • Average sales: $0.045-$2.24
  • Product views: 5-313
  • Product clicks: 0.22-8.23

Experimental Results

Main Results

Productivity Impact (Sales)

  1. Pre-Sales Service Chatbot: 16.3% growth (p<0.01)
  2. Search Query Optimization: 2.93% growth (p<0.05)
  3. Product Description Generation: 2.05% growth (p<0.05)
  4. Marketing Push Messages: 1.6% growth (not significant)
  5. Google Ads Title Optimization: -4.5% change (not significant)
  6. Return Dispute Resolution: 15% success rate improvement
  7. Real-Time Chat Translation: 5.2% consumer satisfaction improvement

Mechanism Analysis (Conversion Rate)

Significant conversion rate improvements across all effective workflows:

  • Pre-Sales Service Chatbot: 21.7% improvement
  • Search Query Optimization: 1.15% improvement
  • Product Description Generation: 1.27% improvement
  • Marketing Push Messages: 3.0% improvement

Intensive Margin Analysis

Average cart value does not change significantly in any workflow, indicating that GenAI drives growth primarily along the extensive margin (more consumers converting) rather than the intensive margin (higher spending by existing buyers).
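
The margin argument rests on a simple identity: if sales per consumer equal the conversion rate times average spending per converting consumer, then a proportional change in sales splits into a conversion part and a spending part. The sketch below works through that identity with hypothetical numbers. The function name is illustrative, and effects estimated in separate regressions need not satisfy the identity exactly.

```python
def implied_spend_change(sales_change: float, conversion_change: float) -> float:
    """Spending change per converter implied by (1+s) = (1+c) * (1+m).

    Inputs and output are proportional changes, e.g. 0.05 for +5%.
    """
    return (1 + sales_change) / (1 + conversion_change) - 1

# Hypothetical case 1: sales and conversion both rise 5%, so spending is flat.
flat = implied_spend_change(0.05, 0.05)

# Hypothetical case 2: sales rise 8% while conversion rises 5%,
# leaving roughly a 2.9% rise in spending per converter.
mixed = implied_spend_change(0.08, 0.05)
```

When the implied spending change is near zero, as in the first case, the sales gain is attributable almost entirely to additional converting consumers.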

Heterogeneity Analysis

Seller Heterogeneity

Smaller sellers derive greater benefits:

  • Sellers with lower annual sales: 3.68% sales growth vs. 2.18% for large sellers
  • Sellers with shorter operating tenure: 3.19% vs. 2.28%
  • Sellers with fewer sub-accounts: 3.48% vs. 0.97%

Consumer Heterogeneity

Less experienced consumers benefit more:

  • Shorter registration tenure: 22.4% sales growth vs. 13.7% for experienced consumers
  • Fewer login days: 18.5% vs. 15.0%
  • Lower past consumption: 25.9% vs. 8.6%
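
To compare the seller and consumer splits above on a common footing, the gap between the two groups in each split can be expressed in percentage points. The short calculation below uses only the figures reported in this summary. The dictionary labels are shorthand introduced here.

```python
# (effect for smaller/newer/less experienced group, effect for comparison group), in percent
reported = {
    "seller: annual sales": (3.68, 2.18),
    "seller: operating tenure": (3.19, 2.28),
    "seller: sub-accounts": (3.48, 0.97),
    "consumer: registration tenure": (22.4, 13.7),
    "consumer: login days": (18.5, 15.0),
    "consumer: past consumption": (25.9, 8.6),
}

# Gap in percentage points, rounded to avoid floating-point noise.
gaps = {name: round(small - large, 2) for name, (small, large) in reported.items()}

# The split with the widest gap between the two groups.
largest = max(gaps, key=gaps.get)
```

On these numbers the widest gap is the past-consumption split among consumers (17.3 percentage points), and the narrowest is the seller operating-tenure split (0.91 percentage points).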

Product Heterogeneity

Results vary by specific workflow:

  • Search Optimization: Low concentration categories, long-tail products, high-price products benefit more
  • Product Description: High concentration categories, high-price products benefit more
  • Pre-Sales Service: Long-tail products show more pronounced benefits

Economic Impact Quantification

Based on four GenAI applications with positive effects, the annualized incremental value is approximately $4.6-$5.0 per consumer, accounting for 5.5-6% of global e-commerce user revenue growth in 2023-2024.
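
The two reported ranges can be cross-checked with simple arithmetic: dividing the per-consumer value by its stated share of revenue growth gives the benchmark growth per user that the comparison implies. The sketch below pairs the lower bounds with each other and the upper bounds with each other, which is an assumption. The resulting figure is derived here and is not a number stated in the summary.

```python
def implied_benchmark(value_per_consumer: float, share_of_growth: float) -> float:
    """Benchmark revenue growth per user implied by value = share * benchmark."""
    return value_per_consumer / share_of_growth

# Assumed pairing: $4.6 with 5.5%, and $5.0 with 6%.
from_lower_bounds = implied_benchmark(4.6, 0.055)
from_upper_bounds = implied_benchmark(5.0, 0.06)
```

Both pairings give a benchmark of roughly $83 to $84 per user, so the two ranges are mutually consistent under this reading.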

Related Work

GenAI Economic Impact Research

Existing research primarily focuses on:

  • Individual productivity improvements (coding, writing, customer service, etc.)
  • Supply-side efficiency gains (task completion time, completion quantity)
  • Effect measurement in laboratory settings

This research fills the gap in firm-level, demand-side value creation studies.

Online Market Friction Reduction

Related technologies include:

  • Reputation and review systems alleviating information asymmetry
  • AI-driven personalized search and recommendations
  • Targeted advertising improving matching efficiency

This research extends this literature by demonstrating how GenAI further reduces multiple types of market frictions.

Conclusions and Discussion

Main Conclusions

  1. GenAI Can Generate Measurable Productivity Improvements: Significant sales growth observed across multiple business workflows
  2. Demand-Side Value Creation Mechanism: Productivity gains achieved through reducing market frictions and enhancing consumer experience
  3. Significant Heterogeneous Effects: Smaller sellers and less experienced consumers derive greater benefits
  4. Economically Significant Impact: Generates substantial incremental value even in the early adoption phase

Limitations

  1. Short-Term Effects: Relatively short experimental period (weeks to months), lacking long-term impact data
  2. Workflow Selection Bias: Seven workflows selected based on managerial judgment rather than systematic selection
  3. Labor and Capital Input Assumptions: The TFP interpretation relies on constant labor and capital inputs, which may change in the future
  4. External Validity: Single-platform experiment; competitor strategic responses not considered

Future Directions

  1. Long-Term Effects Research: Impact of consumer adaptation behavior and platform model optimization
  2. Broader Applications: Other business processes such as logistics, inventory management, and dynamic pricing
  3. General Equilibrium Effects: Competitive dynamics following industry-wide adoption
  4. Cost-Side Adjustments: Labor substitution and organizational structure adaptation

In-Depth Evaluation

Strengths

  1. Rigorous Methodology: Large-scale randomized field experiments provide strong causal identification
  2. Significant Practical Relevance: First to provide empirical evidence of GenAI's productivity impact at the firm level
  3. In-Depth Mechanism Analysis: Clearly identifies demand-side value creation channels
  4. Comprehensive Heterogeneity Analysis: Reveals differential effects across sellers, consumers, and products
  5. Precise Economic Quantification: Provides specific incremental value estimates

Weaknesses

  1. Limited External Validity: Single-platform experiment raises questions about result generalizability
  2. Missing Long-Term Effects: Unable to assess impacts of sustained use and consumer adaptation
  3. Incomplete Workflow Coverage: Does not encompass all possible GenAI application scenarios
  4. Overlooked Competitive Effects: Does not consider equilibrium effects from industry-level adoption

Impact

  1. Academic Contribution: Provides important empirical foundation for GenAI economic impact research
  2. Practical Value: Offers quantified evidence for corporate GenAI investment decisions
  3. Policy Implications: Supports policymaking to promote AI technology adoption
  4. Reproducibility: Clear experimental design provides paradigm for subsequent research

Applicable Scenarios

  1. E-Commerce Platforms: Directly applicable to GenAI deployment in online retail environments
  2. Service Industries: Customer service and content generation applications
  3. Platform Economics: Friction reduction applications in two-sided markets
  4. Technology Investment Assessment: Corporate AI return-on-investment evaluation

References

The paper cites an extensive literature, including:

  • Brynjolfsson et al. (2025): GenAI's impact on workplace productivity
  • Noy and Zhang (2023): Experimental evidence of GenAI productivity effects
  • Acemoglu (2025): Simple macroeconomic analysis of AI
  • Syverson (2011): Survey of productivity determinants

Overall Assessment: This is a high-quality empirical research paper that provides compelling evidence of GenAI's impact on firm productivity through large-scale field experiments. The research design is rigorous, and the results carry important theoretical and practical significance, making substantial contributions to understanding the economic impact of AI technology. Its limitations do not diminish its value as a pioneering study in this field.