Executive Summary: The Detection Arms Race
The proliferation of generative AI has created an unprecedented challenge for digital authenticity. As of late 2025, publicly available AI systems can generate photorealistic images, convincing video, and text that is often indistinguishable from human writing. This capability, while enabling legitimate creative and productivity applications, simultaneously enables misinformation, fraud, and manipulation at scale. Distinguishing AI-generated "synthetic" content from authentic human-created content has become a critical requirement for media organizations, platforms, regulators, and enterprises.
This study presents the first vendor-neutral, empirically rigorous comparative evaluation of synthetic media detection systems across all three primary modalities: text, image, and video. Through standardized testing against purpose-built evaluation datasets, we assess the real-world performance of 14 leading detection systems, revealing substantial variation in accuracy, significant vulnerability to adversarial techniques, and concerning performance degradation as generation technology advances.
Our findings indicate that the detection landscape is characterized by an asymmetric "arms race" in which generation capabilities are advancing more rapidly than detection capabilities. This dynamic has significant implications for digital trust infrastructure and the governance frameworks organizations deploy to manage synthetic media risks.
Scope of the Challenge
The synthetic media detection challenge has grown dramatically. By late 2025:
- An estimated 23 million AI-generated images are created daily across major platforms
- Text generated by LLMs accounts for approximately 8-12% of new web content
- AI-generated video, while less prevalent, is growing at 340% year-over-year
- Social media platforms report that synthetic content is involved in approximately 40% of coordinated inauthentic behavior incidents
The detection challenge extends beyond identifying obviously fake content. Modern generative systems can produce "hybrid" content blending authentic and synthetic elements, subtle modifications to authentic content, and synthetic content specifically crafted to evade detection. Any detection system must perform across this full spectrum of sophistication.
Research Approach
Our evaluation methodology combined standardized benchmark testing with real-world operational assessment:
- Benchmark datasets: We assembled evaluation datasets of 50,000 images, 10,000 text samples, and 1,000 videos balanced between authentic and synthetic content from current-generation AI systems
- Adversarial testing: We tested detection resilience against common evasion techniques (compression, cropping, paraphrasing, format conversion)
- Cross-model testing: We evaluated detection performance across content from multiple generation systems, including systems released after detector training
- Operational assessment: We conducted structured interviews with 78 organizations using detection systems in production environments
Central Findings
Our analysis yields the following central findings:
- Detection accuracy varies dramatically: Top-performing systems achieved 89-94% accuracy on benchmark datasets, while lower-performing systems achieved only 62-68%
- Adversarial vulnerability is significant: Even top systems experienced 15-25% accuracy degradation when content was processed through common transformations
- Cross-model generalization is weak: Detection accuracy dropped by 12-30% when evaluating content from AI systems not represented in training data
- Text detection lags image detection: AI-generated text is significantly harder to detect than AI-generated images, with best-in-class text detectors achieving only 78% accuracy
- Operational deployment gaps exist: Laboratory performance does not reliably predict real-world effectiveness due to distribution shift and deployment context
Critical Finding
No detection system evaluated achieved reliable performance (>90% accuracy) across all modalities and test conditions. Organizations should not rely on any single system as a definitive authenticity arbiter but should deploy detection as one component of broader digital trust frameworks.
The Synthetic Media Detection Technology Landscape
Before presenting detailed results, we provide context on the detection technology landscape, including primary detection approaches, key vendors, and market dynamics.
Detection Approaches and Methods
Synthetic media detection systems employ several distinct technical approaches:
- Artifact analysis: Identifying systematic artifacts introduced by generation processes (compression patterns, frequency domain anomalies, rendering inconsistencies)
- Statistical modeling: Analyzing statistical properties that differ between human-created and AI-generated content (word frequency distributions, sentence structure patterns, semantic coherence)
- Classifier-based detection: Training neural networks to distinguish authentic from synthetic content based on learned features
- Provenance verification: Validating content authenticity through cryptographic signatures, metadata analysis, or blockchain-based tracking rather than content analysis
- Ensemble methods: Combining multiple detection approaches to improve robustness and reduce false positive/negative rates
Each approach has distinct strengths and limitations. Artifact analysis is interpretable but vulnerable to adversarial cleaning. Statistical modeling works well for current content but may not generalize to future generation systems. Classifier-based approaches achieve high accuracy on in-distribution content but face generalization challenges. Provenance verification is theoretically more robust but requires infrastructure adoption that is currently limited.
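As a concrete illustration of the artifact-analysis family, the sketch below computes one simple frequency-domain statistic with NumPy: the fraction of spectral energy outside a low-frequency radius. The statistic, the cutoff value, and the test images are illustrative assumptions for this report; production detectors rely on far richer learned features.

```python
import numpy as np

def high_freq_energy_ratio(image: np.ndarray, cutoff: float = 0.25) -> float:
    """Fraction of spectral energy outside a normalized low-frequency radius.

    A crude frequency-domain statistic of the kind artifact-analysis
    detectors inspect; the 0.25 cutoff is an arbitrary illustration.
    """
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(image))) ** 2
    h, w = spectrum.shape
    yy, xx = np.ogrid[:h, :w]
    # Distance of each frequency bin from the spectrum center, normalized
    # so that the edge of the smaller dimension sits at radius 1.0.
    radius = np.sqrt((yy - h / 2) ** 2 + (xx - w / 2) ** 2) / (min(h, w) / 2)
    return float(spectrum[radius > cutoff].sum() / spectrum.sum())

# Smooth gradients concentrate energy at low frequencies; white noise
# spreads it across the spectrum, so its ratio is much higher.
rng = np.random.default_rng(0)
smooth = np.tile(np.linspace(0.0, 1.0, 64), (64, 1))
noisy = rng.standard_normal((64, 64))
print(high_freq_energy_ratio(smooth), high_freq_energy_ratio(noisy))
```

Real detectors combine many such signals; a single statistic like this is trivially evaded, which is exactly the "adversarial cleaning" weakness noted above.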
Vendor Landscape
We evaluated systems from 14 vendors across three categories:
| Category | Vendors Evaluated | Primary Modalities |
|---|---|---|
| Specialized Detection Vendors | 5 vendors (Vendor A-E) | Image, Video, Text |
| Platform/Cloud Provider Tools | 4 vendors (Vendor F-I) | Image, Video, Text |
| Academic/Open Source Systems | 5 systems (System J-N) | Image, Text |
Vendors are anonymized in this report to maintain a neutral, non-commercial focus. Performance data is attributed to vendor categories rather than specific products to avoid inadvertent vendor endorsement or criticism.
Market Evolution and Investment Trends
The synthetic media detection market has grown substantially, reaching an estimated $890 million in 2025, with projections of $2.4 billion by 2028. Investment is driven by:
- Regulatory requirements (EU AI Act disclosure mandates, platform trust and safety requirements)
- Enterprise risk management (fraud prevention, brand protection, content authentication)
- Media and journalism applications (source verification, misinformation detection)
- Legal and evidentiary needs (digital forensics, litigation support)
However, market growth has been accompanied by inflated vendor claims and limited independent evaluation. Our research addresses this gap by providing empirical performance data unconfounded by vendor marketing interests.
Evaluation Methodology
Rigorous methodology is essential for meaningful detection system evaluation. This section details our approach to dataset construction, testing protocols, and metric calculation.
Dataset Construction
We constructed evaluation datasets for each modality, ensuring balanced representation of authentic and synthetic content, diversity of generation systems, and varied content types and domains.
Image dataset (n=50,000):
- 25,000 authentic images from curated photography collections with verified provenance
- 25,000 synthetic images from 8 generation systems (Midjourney V6, DALL-E 3, Stable Diffusion XL, Imagen 3, Firefly 3, Leonardo.ai, Ideogram 2.0, Flux)
- Categories: portraits, landscapes, objects, documents, mixed scenes
- Resolutions ranging from 512x512 to 4K to test resolution sensitivity
Text dataset (n=10,000):
- 5,000 authentic text samples from verified human authors (news, academic, creative, business writing)
- 5,000 synthetic text samples from 6 LLM systems (GPT-4, GPT-4o, Claude 3, Gemini 1.5, Llama 3.1, Mistral Large)
- Length range: 100-2,000 words to test length sensitivity
- Domains: journalism, academic papers, marketing copy, fiction, technical documentation
Video dataset (n=1,000):
- 500 authentic videos from verified sources
- 500 synthetic videos from 4 generation systems (Sora, Runway Gen-3, Pika, Kling)
- Duration: 5-60 seconds; resolution: 720p-4K
Testing Protocol
Testing followed a standardized protocol ensuring comparability across systems:
- Baseline evaluation: All systems tested against the full dataset with no adversarial modification
- Adversarial evaluation: Testing against transformed content (compression, cropping, format conversion, paraphrasing)
- Cross-model evaluation: Testing against content from generation systems released after detector training cutoffs
- Edge case evaluation: Testing against hybrid content, subtle modifications, and boundary cases
- Latency evaluation: Measuring processing time per item under production-like conditions
Each evaluation was conducted three times with randomized presentation order to control for system variability. Results are reported as means with 95% confidence intervals.
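The per-run aggregation described above can be reproduced with a standard t-interval. The helper below is a minimal sketch using Python's standard library; the accuracy values are illustrative, and with only three runs the interval is necessarily wide.

```python
import math
import statistics

def mean_ci95(runs: list[float]) -> tuple[float, float]:
    """Mean and 95% confidence half-width for a few repeated runs.

    Uses two-sided t critical values for small samples (df = n - 1);
    only df 2-5 are tabulated here, matching the 3-run protocol.
    """
    t_crit = {2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571}
    n = len(runs)
    half = t_crit[n - 1] * statistics.stdev(runs) / math.sqrt(n)
    return statistics.fmean(runs), half

# Illustrative accuracies from three randomized evaluation passes.
mean, half = mean_ci95([0.941, 0.944, 0.942])
print(f"{mean:.3f} ± {half:.3f}")
```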
Metrics Framework
We report multiple metrics to capture different performance dimensions:
- Accuracy: Proportion of correct classifications (synthetic correctly identified as synthetic, authentic correctly identified as authentic)
- Precision: Proportion of synthetic classifications that are correct (avoiding false positives)
- Recall: Proportion of synthetic content correctly identified (avoiding false negatives)
- F1 Score: Harmonic mean of precision and recall, balancing both error types
- AUC-ROC: Area under the receiver operating characteristic curve, measuring discrimination ability across thresholds
We prioritize F1 score and AUC-ROC as primary metrics, as they capture overall performance balance. However, optimal threshold selection depends on use case: high-stakes applications may prefer high precision (minimizing false accusations of inauthenticity), while screening applications may prefer high recall (minimizing missed synthetic content).
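The threshold-dependent metrics above follow directly from a confusion matrix, treating "synthetic" as the positive class. The sketch below uses hypothetical counts chosen to land near the best specialized vendor's image-detection row; the counts themselves are illustrative, not measured data.

```python
def detection_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Accuracy, precision, recall, and F1 from a confusion matrix,
    with 'synthetic' as the positive class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }

# Hypothetical balanced sample: 946 of 1,000 synthetic items caught,
# 62 of 1,000 authentic items misflagged.
m = detection_metrics(tp=946, fp=62, fn=54, tn=938)
print(m)
```

Note that AUC-ROC cannot be computed from a single confusion matrix; it requires the underlying scores, since it integrates over all possible thresholds.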
Image Detection Results
Image detection is the most mature modality, with established techniques and multiple commercial offerings. Our evaluation reveals strong but not definitive performance from leading systems.
Baseline Performance
| System Category | Accuracy | Precision | Recall | F1 Score | AUC-ROC |
|---|---|---|---|---|---|
| Best Specialized Vendor | 94.2% | 93.8% | 94.6% | 0.942 | 0.978 |
| Specialized Vendor Median | 89.4% | 88.2% | 90.8% | 0.895 | 0.954 |
| Best Platform Tool | 91.7% | 90.4% | 93.1% | 0.917 | 0.962 |
| Platform Tool Median | 85.3% | 83.6% | 87.4% | 0.855 | 0.921 |
| Best Academic/Open Source | 87.8% | 86.2% | 89.6% | 0.879 | 0.938 |
| Academic/Open Source Median | 79.5% | 77.8% | 81.4% | 0.796 | 0.872 |
The best-performing systems achieve accuracy exceeding 90%, indicating meaningful detection capability. However, even the best systems incorrectly classify approximately 6% of images, representing substantial error rates at scale. For an organization processing 1 million images monthly, a 6% error rate represents 60,000 incorrect classifications.
Adversarial Robustness
Detection performance degraded significantly when content was subjected to common transformations:
| Transformation | Best System Accuracy | Median Accuracy | Median Degradation (relative) |
|---|---|---|---|
| JPEG compression (quality 70) | 86.4% | 78.2% | -12.5% |
| Cropping (25% border removal) | 88.1% | 80.6% | -9.8% |
| Resolution downscaling (50%) | 83.7% | 74.8% | -16.3% |
| Format conversion (PNG to JPEG) | 89.8% | 82.3% | -7.9% |
| Combined transformations | 76.2% | 68.4% | -23.5% |
These findings are concerning because content in real-world circulation commonly undergoes such transformations through normal distribution channels (social media re-uploads, messaging compression, web optimization). Adversarial actors can also apply transformations deliberately to evade detection, so limited robustness to these transformations is a critical capability gap.
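A transformation harness of this kind can be approximated even without image codecs. The pure-NumPy sketch below (block-average downscaling plus border cropping; all parameter values are illustrative) shows the shape of such a pipeline; a real evaluation would additionally re-encode through actual JPEG compression at the stated quality levels.

```python
import numpy as np

def downscale_half(img: np.ndarray) -> np.ndarray:
    """Halve resolution by 2x2 block averaging (a crude downscale)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    blocks = img[:h, :w].reshape(h // 2, 2, w // 2, 2)
    return blocks.mean(axis=(1, 3))

def crop_border(img: np.ndarray, frac: float = 0.25) -> np.ndarray:
    """Remove frac of each dimension as a symmetric border."""
    dy = int(img.shape[0] * frac / 2)
    dx = int(img.shape[1] * frac / 2)
    return img[dy:img.shape[0] - dy, dx:img.shape[1] - dx]

# Chain transformations the way re-uploaded content accumulates them.
img = np.arange(64 * 64, dtype=float).reshape(64, 64)
transformed = crop_border(downscale_half(img))
print(transformed.shape)
```

Running every evaluation item through such a chain, then re-scoring each detector, yields the "combined transformations" row of the table above.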
Cross-Model Generalization
We tested detection systems against images from Flux, a generation system released after most detector training cutoffs. Results revealed significant generalization challenges:
- Best specialized vendor accuracy on Flux: 78.3% (vs. 94.2% baseline, a 15.9 percentage-point drop, or 16.9% relative degradation)
- Median specialized vendor accuracy on Flux: 71.8% (vs. 89.4% baseline)
- Platform tool median accuracy on Flux: 64.2% (vs. 85.3% baseline, only modestly better than chance on a balanced dataset)
This generalization gap reflects a fundamental challenge: detection systems learn to identify artifacts of specific generation systems rather than universal synthetic characteristics. As generation technology advances, detection systems require continuous updating to maintain effectiveness.
The Generalization Problem
Detection systems function as pattern recognizers tuned to signatures of known generation systems. Each new generation system introduces novel patterns that existing detectors may not recognize. This creates an inherent lag between generation capability advancement and detection capability catch-up.
Text Detection Results
Text detection is substantially more challenging than image detection, reflecting both the absence of clear physical artifacts in generated text and the broad diversity of human writing styles from which AI text must be distinguished.
Baseline Performance
| System Category | Accuracy | Precision | Recall | F1 Score | AUC-ROC |
|---|---|---|---|---|---|
| Best Specialized Vendor | 78.4% | 81.2% | 74.6% | 0.778 | 0.856 |
| Specialized Vendor Median | 72.6% | 75.8% | 68.4% | 0.719 | 0.798 |
| Best Platform Tool | 74.8% | 78.3% | 70.2% | 0.741 | 0.822 |
| Platform Tool Median | 68.2% | 71.4% | 64.1% | 0.675 | 0.746 |
| Best Academic/Open Source | 71.6% | 74.2% | 68.4% | 0.712 | 0.784 |
Even the best text detection systems achieve accuracy below 80%, meaning roughly one in five classifications is incorrect. This performance level is inadequate for most high-stakes applications, including academic integrity enforcement, journalism verification, or legal authentication.
Text Detection Limitations
Current text detection accuracy is insufficient for applications where incorrect classification carries significant consequences. False positives (human text classified as AI) can harm innocent individuals, while false negatives (AI text undetected) undermine the detection purpose. Organizations should use text detection as a risk indicator requiring human judgment rather than an automated arbiter.
Length Sensitivity
Detection accuracy varied significantly with text length:
- Short texts (100-200 words): Best system accuracy 64.2%, median 58.6%
- Medium texts (500-800 words): Best system accuracy 79.8%, median 73.4%
- Long texts (1,500-2,000 words): Best system accuracy 86.2%, median 81.8%
Shorter texts provide fewer features for detection systems to analyze, increasing uncertainty. This presents particular challenges for social media content, comments, and messaging, where text is typically brief.
Paraphrasing Vulnerability
Text detection systems proved vulnerable to simple paraphrasing techniques. When AI-generated text was lightly edited or paraphrased (maintaining meaning while changing word choice and sentence structure), detection accuracy dropped substantially:
- Original AI text detection accuracy: 78.4% (best system)
- After light paraphrasing (5-10% word substitution): 62.4% accuracy
- After moderate paraphrasing (sentence restructuring): 54.8% accuracy
- After human editing pass: 48.2% accuracy (effectively chance level, since 50% corresponds to random guessing on a balanced dataset)
This vulnerability means that minimally sophisticated actors can evade text detection with modest effort, significantly limiting the practical utility of text detection in adversarial contexts.
Video Detection Results
Video detection is the least mature modality, reflecting the relatively recent emergence of high-quality video generation systems. Our evaluation reveals significant capability gaps.
Baseline Performance
| System Category | Accuracy | Precision | Recall | F1 Score |
|---|---|---|---|---|
| Best Specialized Vendor | 82.6% | 84.2% | 80.8% | 0.825 |
| Specialized Vendor Median | 74.2% | 76.8% | 71.2% | 0.739 |
| Platform Tool Median | 69.8% | 72.4% | 66.6% | 0.694 |
Video detection accuracy falls between image and text detection. However, the smaller evaluation dataset (limited by available AI-generated video) and rapid evolution of video generation systems mean that these results should be interpreted cautiously.
Emerging Challenges
Video detection faces unique challenges beyond those confronting image detection:
- Temporal consistency: Detection systems analyze both individual frames and temporal patterns. Some systems are susceptible to "frame injection" attacks where authentic frames are interspersed with synthetic content.
- Compression artifacts: Video compression introduces artifacts that can mask or mimic synthetic generation signatures, increasing detection difficulty.
- Lip-sync deepfakes: Facial reenactment videos modify only specific regions, requiring detection systems to identify subtle local anomalies.
- Processing requirements: Video analysis is computationally intensive, with processing times ranging from 10 seconds to several minutes per minute of video, limiting throughput for high-volume applications.
The video generation landscape is evolving rapidly. Systems like Sora represent a significant capability advancement, and detection systems will require substantial development to keep pace.
Operational Effectiveness Assessment
Laboratory performance metrics do not tell the complete story. Through interviews with 78 organizations using detection systems in production, we assessed real-world operational effectiveness and implementation challenges.
Deployment Contexts
Organizations deploy detection systems across diverse use cases:
- Content moderation: Platform trust and safety teams screening user-generated content (32% of sample)
- Journalism and fact-checking: Media organizations verifying source material (18% of sample)
- Academic integrity: Educational institutions evaluating student submissions (16% of sample)
- Fraud detection: Financial services and insurance verifying submitted documentation (14% of sample)
- Marketing and brand protection: Brands monitoring for synthetic content involving their assets (12% of sample)
- Legal and forensic: Litigation support and legal evidence authentication (8% of sample)
Laboratory vs. Production Performance Gaps
Organizations consistently reported lower performance in production than laboratory benchmarks suggested:
- 84% of organizations reported production accuracy "somewhat" or "significantly" below vendor claims
- Typical estimated performance gap: accuracy 12-18% lower in production than benchmark claims
- Primary cited factors: distribution shift (production content differs from training/benchmark data), edge cases, and content quality variation
Distribution Shift
Detection systems trained on curated datasets face "distribution shift" when deployed on real-world content. Production content includes unusual formats, edge cases, and content types underrepresented in training data. This shift systematically degrades performance relative to benchmark results.
Integration Challenges
Organizations reported various implementation challenges beyond core detection accuracy:
- Latency constraints: Real-time applications require sub-second response times; many systems exceed this threshold, particularly for video
- Scale limitations: High-volume applications (millions of items daily) face infrastructure and cost challenges
- Threshold calibration: Converting probability scores to binary decisions requires threshold selection that balances false positive/negative rates for specific use cases
- Result explainability: Most systems provide confidence scores without explanation, complicating human review and escalation decisions
- Continuous updating: Maintaining effectiveness requires regular model updates as generation technology advances, incurring ongoing costs and complexity
A Framework for Detection Governance
Given detection limitations, organizations require governance frameworks that deploy detection capabilities appropriately within broader authenticity assurance strategies.
Risk-Proportionate Deployment
Detection system deployment should be calibrated to the risk profile of the application:
| Risk Level | Example Applications | Recommended Approach |
|---|---|---|
| Low | Content discovery, trend analysis | Automated detection with sampling verification |
| Medium | Platform moderation, marketing monitoring | Automated screening with human review for flagged content |
| High | Academic integrity, journalism verification | Detection as input to human judgment, never sole arbiter |
| Critical | Legal evidence, fraud adjudication | Multiple detection systems, expert forensic analysis, provenance verification |
Human-In-The-Loop Requirements
For medium and higher-risk applications, human review should be integrated into detection workflows:
- Review triggers: Define confidence thresholds below which human review is mandatory
- Reviewer training: Ensure reviewers understand detection limitations and can apply contextual judgment
- Escalation paths: Establish clear escalation for ambiguous cases requiring expert assessment
- Feedback loops: Use human decisions to improve detection calibration over time
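The review-trigger idea above can be sketched as a simple confidence-band router. The band boundaries and action labels below are placeholders that each organization would calibrate to its own risk tolerance.

```python
def route(confidence: float, low: float = 0.30, high: float = 0.90) -> str:
    """Route a detector confidence score (probability content is synthetic)
    into an action band; thresholds are illustrative placeholders."""
    if confidence >= high:
        return "flag-for-action"     # high confidence synthetic
    if confidence >= low:
        return "human-review"        # ambiguous band: review is mandatory
    return "treat-as-authentic"      # low suspicion; sample-audit later

print([route(c) for c in (0.95, 0.55, 0.10)])
```

Human decisions in the middle band can then feed back into recalibrating `low` and `high` over time, closing the feedback loop described above.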
Multi-Layer Detection Strategies
Organizations should not rely on single detection systems. Multi-layer strategies improve robustness:
- Ensemble detection: Deploy multiple detection systems and require consensus for high-confidence classification
- Provenance verification: Complement content analysis with metadata, EXIF, and C2PA provenance verification where available
- Contextual analysis: Incorporate non-content signals (account history, posting patterns, distribution networks) into authenticity assessment
- Source verification: For critical applications, verify original sources rather than relying solely on content analysis
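Ensemble detection with a consensus requirement can be sketched in a few lines; the vote threshold, agreement rule, and verdict labels below are illustrative assumptions, not a recommended policy.

```python
def ensemble_verdict(scores: list[float], threshold: float = 0.5,
                     min_agree: int = 2) -> str:
    """Combine independent detector scores into a verdict, requiring at
    least min_agree detectors to flag content before labeling it synthetic."""
    votes = sum(s >= threshold for s in scores)
    if votes >= min_agree and votes == len(scores):
        return "synthetic-consensus"
    if votes >= min_agree:
        return "synthetic-majority"
    if votes == 0:
        return "authentic-consensus"
    return "escalate"  # detectors disagree: defer to further checks

print(ensemble_verdict([0.92, 0.88, 0.76]))
```

Disagreement between detectors is itself a useful signal: "escalate" cases are natural candidates for the provenance, contextual, and source checks listed above.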
Defense in Depth
No single detection approach is sufficient. Effective synthetic media governance requires "defense in depth" combining multiple detection methods, human judgment, contextual analysis, and organizational policies into comprehensive authenticity assurance frameworks.
Future Outlook: The Evolving Detection Landscape
The detection landscape is evolving rapidly. We outline anticipated developments and their implications for detection strategy.
Generation Technology Trajectory
Generation technology continues advancing rapidly:
- Image generation is approaching photorealistic quality across all content types
- Video generation is transitioning from short clips to extended, coherent sequences
- Multimodal generation combining text, image, and video is emerging
- Real-time generation enabling interactive synthetic media is developing
This trajectory suggests that detection will face increasing difficulty. Each generation advancement may introduce capabilities that existing detection systems cannot recognize, requiring continuous detector development to maintain even current performance levels.
Detection Technology Innovations
The detection field is pursuing several promising research directions:
- Foundation model detectors: Large-scale detection models trained on diverse synthetic content may achieve better generalization than current narrow detectors
- Watermarking and content authentication: Embedding imperceptible signals in generated content at creation time enables verification without content analysis
- Provenance standards: Coalition for Content Provenance and Authenticity (C2PA) standards enable cryptographic verification of content origin and history
- Adversarial-robust architectures: Detection systems designed for adversarial robustness may better resist evasion techniques
While these innovations show promise, none has yet demonstrated reliable, scalable performance in production environments. Organizations should monitor developments while maintaining realistic expectations.
Policy and Standards Developments
The regulatory environment is evolving to address synthetic media:
- EU AI Act: Mandates disclosure of AI-generated content and requires transparency measures from AI system providers
- Platform policies: Major platforms are implementing synthetic media labeling requirements and moderation policies
- Industry standards: C2PA and similar initiatives are developing technical standards for content provenance
- Sector-specific requirements: Financial services, healthcare, and other regulated sectors are developing synthetic media governance requirements
These developments will shape detection system requirements and deployment practices. Organizations should monitor regulatory evolution and prepare for increasing compliance obligations.
Conclusion: Navigating Detection Uncertainty
Our comprehensive evaluation reveals a synthetic media detection landscape characterized by significant capability, significant limitations, and rapid evolution. Detection systems provide meaningful but imperfect capability to identify AI-generated content, with performance varying substantially by modality, content type, and deployment context.
Key takeaways for organizations deploying detection systems:
- Set realistic expectations: Even best-in-class systems have material error rates. Detection should inform rather than determine high-stakes decisions.
- Deploy risk-proportionately: Match detection governance to application risk, with greater human oversight for higher-stakes contexts.
- Plan for evolution: Detection effectiveness will fluctuate as generation technology advances. Build organizational capability for continuous adaptation.
- Layer defenses: Combine multiple detection approaches with provenance verification, contextual analysis, and human judgment.
- Monitor developments: The detection landscape is evolving rapidly. Stay informed about new capabilities and approaches.
The generation-detection arms race will continue. Organizations that approach this challenge with clear-eyed assessment of current limitations and commitment to evolving practice will be best positioned to maintain digital trust in an era of synthetic media proliferation.
The Vanderhelm Perspective
We advocate for "informed skepticism" regarding synthetic media detection: recognition that detection provides valuable but imperfect signals, and that sustainable digital trust requires multi-layered approaches combining technology, human judgment, organizational process, and societal norms. Detection is necessary but not sufficient; it must be embedded in comprehensive authenticity governance frameworks.
Frequently Asked Questions
Can I rely on detection systems to definitively identify AI content?
No. Even the best systems have material error rates, and performance varies by content type and context. Detection should inform human judgment rather than serve as a definitive arbiter, particularly for high-stakes applications.
Why is text detection so much harder than image detection?
Images contain physical artifacts from the generation process (frequency patterns, texture inconsistencies) that can be analyzed. Text has no such physical properties; detection relies on statistical patterns in word choice and structure, which are more subtle and variable. Additionally, the diversity of human writing styles creates a broader baseline that AI text must be distinguished from.
Can bad actors easily evade detection?
For images, simple transformations (compression, cropping) can significantly degrade detection accuracy. For text, light paraphrasing or editing substantially reduces detectability. Sophisticated adversaries with detection system knowledge can deliberately optimize evasion. Detection is more effective against casual misuse than deliberate adversarial attack.
How should I choose a detection system?
Consider your specific use case, required modalities, volume requirements, latency constraints, and integration needs. Request vendor benchmarks against datasets representative of your content, and conduct independent evaluation before deployment. Be skeptical of vendor accuracy claims, which often exceed real-world performance.
Will detection improve enough to solve this problem?
Detection technology is advancing, but generation technology is advancing faster. The fundamental asymmetry favors offense (generation) over defense (detection). We expect detection to remain useful but imperfect indefinitely. Sustainable solutions will require complementary approaches including provenance verification, disclosure requirements, and societal adaptation.
What about watermarking and provenance solutions?
Watermarking (embedding signals in generated content) and provenance verification (cryptographic tracking of content history) are promising approaches that do not rely on content analysis. However, adoption is nascent, watermarks can be removed or destroyed, and provenance systems require infrastructure that is not yet widely deployed. These approaches complement rather than replace detection.
References
- Farid, H. (2022). Creating, using, misusing, and detecting deep fakes. Journal of Online Trust and Safety, 1(4).
- Rossler, A., et al. (2019). FaceForensics++: Learning to detect manipulated facial images. IEEE International Conference on Computer Vision.
- Mitchell, E., et al. (2023). DetectGPT: Zero-shot machine-generated text detection using probability curvature. International Conference on Machine Learning.
- Coalition for Content Provenance and Authenticity. (2023). C2PA Specification, Version 1.3. c2pa.org.
- Verdoliva, L. (2020). Media forensics and deepfakes: An overview. IEEE Journal of Selected Topics in Signal Processing, 14(5).
- European Parliament. (2024). Regulation laying down harmonised rules on artificial intelligence (AI Act). Official Journal of the European Union.
- Grinbaum, A., & Adomaitis, L. (2023). The ethical challenges of AI-generated content. AI and Ethics, 3(4).
- OpenAI. (2024). Understanding and mitigating AI-generated content risks. OpenAI Safety Report.
- Google DeepMind. (2024). SynthID: Watermarking and identifying AI-generated content. DeepMind Technical Report.
- Meta AI Research. (2024). Detection and attribution of AI-generated content. Meta Technical Paper.
- Groh, M., et al. (2022). Deepfake detection by human crowds, machines, and machine-informed crowds. PNAS, 119(1).
- Zellers, R., et al. (2019). Defending against neural fake news. NeurIPS 2019.
- Dufour, N., et al. (2019). Deepfakes detection dataset. Google AI Blog.
- Stanford Internet Observatory. (2024). Synthetic media and information operations. SIO Research Report.
- Partnership on AI. (2023). Responsible practices for synthetic media. PAI Framework Document.
