What is a deepfake? How they work and how to detect them

A deepfake is synthetic media — a video, image, or audio recording — where a person's appearance, voice, or actions have been generated or manipulated by artificial intelligence. The word combines "deep learning" (the AI technique used to create them) and "fake." While the term first emerged around 2017, today's deepfakes are produced in seconds by consumer-accessible tools, making them one of the fastest-growing vectors for fraud, disinformation, and identity crime.

What makes something a deepfake?

Not every manipulated image is a deepfake. The term specifically refers to synthetic media generated or altered by AI and deep learning models — distinct from traditional photo editing like Photoshop or video compositing effects. Three categories account for the vast majority of deepfakes in circulation.

  • Face swaps — one person's face replaced with another's in a video or photo
  • Synthetic faces — entirely AI-generated faces of people who have never existed
  • Voice clones — AI-generated audio that replicates a specific person's voice

The defining characteristic is AI involvement in the generation process. A photo cropped or color-corrected in editing software is not a deepfake. An image where a face was generated by Stable Diffusion or swapped using FaceSwap is. This distinction matters because the artifacts left by AI generation are fundamentally different from those left by traditional manipulation — and they require different detection methods.

How deepfakes are created

Two main AI architectures produce the vast majority of deepfakes. Knowing how they work explains why detection is both possible and challenging.

Generative Adversarial Networks (GANs) dominated deepfake creation from 2017 to roughly 2022. They work by training two competing neural networks simultaneously: a generator that produces synthetic images, and a discriminator that tries to tell them apart from real ones. Over millions of training cycles, the generator learns to produce images that the discriminator can no longer distinguish from real ones. GANs leave distinctive frequency-domain artifacts in the images they produce: patterns that AI detectors can identify even when the image looks perfect to human eyes.
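The adversarial setup described above can be made concrete with the standard GAN losses from the original formulation (Goodfellow et al., 2014). This is a minimal sketch in plain Python, not the training code of any particular deepfake tool. The function names and the sample probabilities are illustrative assumptions, and the sketch shows only how the two losses are computed from the discriminator's outputs.

```python
import math

def discriminator_loss(d_real: float, d_fake: float) -> float:
    """Binary cross-entropy for the discriminator.

    d_real: discriminator's probability that a real image is real.
    d_fake: discriminator's probability that a generated image is real.
    """
    return -(math.log(d_real) + math.log(1.0 - d_fake))

def generator_loss(d_fake: float) -> float:
    """Non-saturating generator loss: low when the discriminator is fooled."""
    return -math.log(d_fake)

# Early in training (illustrative values): the discriminator easily spots fakes,
# so its loss is low and the generator's loss is high.
early = (discriminator_loss(0.9, 0.1), generator_loss(0.1))

# At the theoretical equilibrium the discriminator outputs 0.5 for everything,
# meaning it can no longer tell generated images from real ones.
equilibrium = (discriminator_loss(0.5, 0.5), generator_loss(0.5))

print(early, equilibrium)
```

Each network's loss falls only when the other's rises, which is why the two improve together during training.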

Diffusion models are the newer architecture behind Stable Diffusion, DALL-E, Midjourney, Flux, and GPT-Image-2. Rather than competing networks, diffusion models learn to reverse a noise-adding process — starting from random noise and gradually resolving it into a coherent image. Diffusion models produce higher-quality, more photorealistic images than GANs. They are now the dominant technology behind AI-generated images, and they leave a different signature of artifacts that modern detectors are trained to identify.
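The noise-adding process that a diffusion model learns to reverse can be sketched in a few lines. This example assumes the linear noise schedule and closed-form forward step from the DDPM formulation (Ho et al., 2020). The step count, schedule endpoints, and four-value toy "image" are illustrative choices. Real generators use a trained neural network for the reverse direction, which is not shown here.

```python
import math
import random

def alpha_bar_schedule(steps: int, beta_start: float = 1e-4, beta_end: float = 0.02) -> list[float]:
    """Cumulative product of (1 - beta_t) for a linear beta schedule."""
    alpha_bars = []
    running = 1.0
    for t in range(steps):
        beta = beta_start + (beta_end - beta_start) * t / (steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def add_noise(x0: list[float], alpha_bar: float, rng: random.Random) -> list[float]:
    """Forward step: x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * noise."""
    signal = math.sqrt(alpha_bar)
    noise = math.sqrt(1.0 - alpha_bar)
    return [signal * v + noise * rng.gauss(0.0, 1.0) for v in x0]

rng = random.Random(0)
alpha_bars = alpha_bar_schedule(1000)
pixels = [0.8, -0.2, 0.5, 0.1]  # toy "image" with four pixel values

slightly_noisy = add_noise(pixels, alpha_bars[10], rng)      # mostly original signal
almost_pure_noise = add_noise(pixels, alpha_bars[-1], rng)   # signal almost gone
print(slightly_noisy, almost_pure_noise)
```

By the final step almost none of the original signal remains. Generation runs this process in the opposite direction, starting from pure noise.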

Key Stat

ScamAI researchers found that GPT-Image-2 cannot reliably recognize its own generated documents — demonstrating how detection must remain ahead of generation (arXiv:2604.25213).

For voice deepfakes, text-to-speech models trained on recordings of a target voice can generate new speech in that person's voice from any text input. Platforms including ElevenLabs, PlayHT, and Resemble AI make voice cloning accessible to anyone with a few minutes of sample audio; a phone call, a public speech, or a podcast appearance is enough.

The four main uses of deepfakes

Identity fraud deepfakes impersonate real individuals for financial gain. These include AI-cloned voices used in CEO fraud — where attackers call employees impersonating executives to authorize wire transfers — and AI-generated selfies submitted to bypass KYC identity verification during account opening. A 2024 case saw a Hong Kong-based finance employee transfer $25 million after a deepfake video call appeared to show his CFO and company colleagues.

Disinformation deepfakes fabricate false evidence of real people saying or doing things they never did. Political deepfakes have proliferated ahead of elections worldwide. ScamAI research documented 47 state-actor influence campaigns using AI-generated media in 2025 alone.

Romance scam deepfakes use AI-generated or AI-manipulated profile photos and cloned voices to build false relationships and extract money from victims. Over 70% of reported romance scam profiles now involve AI-generated or manipulated imagery, according to FTC data.

Document forgery deepfakes use AI to alter financial statements, identity documents, insurance claims, and medical records. ScamAI's AIForge-Doc research (arXiv:2602.20569) demonstrated that AI-edited financial documents are undetectable by human reviewers without specialized tools — the study found humans perform near chance level on modern AI-edited documents.

Why human detection fails

Human detection of deepfakes is unreliable at any level of expertise. ScamAI's research ("Do deepfake detectors work in reality?", IEEE Workshop on Security Implications of Deepfakes and Cheapfakes, 2025) found that trained human reviewers correctly identify deepfake faces at rates approaching random chance when evaluating high-quality modern deepfakes. The same study found that older deepfake detectors trained on lab-generated samples also fail significantly when tested against in-the-wild deepfakes.

The intuitive checks people apply — looking for blurry edges, unnatural eye movements, or irregular hair — no longer work reliably. Modern GAN and diffusion-based deepfakes produce pixel-perfect results that fool these heuristics. The artifacts that remain exist in the frequency domain and in statistical patterns across pixels, not in visible features detectable by the eye.

Important

Human reviewers perform near random chance on high-quality modern deepfakes. Visual inspection alone is no longer a reliable defense.

How AI-based deepfake detection works

AI-based deepfake detection analyzes artifacts that are invisible to the human eye but that each generation technology leaves behind consistently. ScamAI's Eva-v1 model applies several complementary analysis approaches.

  • Frequency-domain analysis — GAN and diffusion models leave distinctive patterns in the Fourier transform of an image that natural camera images do not exhibit
  • Temporal consistency analysis — deepfake videos often contain subtle timing artifacts in blinks, lip sync, and head motion across frames
  • Spectral artifact detection — AI-synthesized voice has characteristic patterns in spectrograms that differ from natural human speech variability
  • Semantic inconsistency detection — AI-generated images sometimes produce impossible geometry such as extra fingers, asymmetric features, or inconsistent shadows
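As a rough illustration of the frequency-domain idea in the first bullet, the sketch below measures how much of a signal's spectral energy sits in its upper frequencies. It is a toy heuristic on a one-dimensional row of pixel values using a hand-written discrete Fourier transform, not ScamAI's Eva-v1 method. The function names, the cutoff, and the faint alternating pattern standing in for a generator artifact are all assumptions made for the example.

```python
import cmath
import math

def dft_magnitudes(signal: list[float]) -> list[float]:
    """Magnitudes of the discrete Fourier transform for bins 0..n//2."""
    n = len(signal)
    mags = []
    for k in range(n // 2 + 1):
        total = sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(signal))
        mags.append(abs(total))
    return mags

def high_frequency_ratio(signal: list[float], cutoff: float = 0.5) -> float:
    """Share of non-DC spectral energy above `cutoff` of the maximum frequency."""
    mags = dft_magnitudes(signal)[1:]  # drop the DC (mean brightness) bin
    energy = [m * m for m in mags]
    split = int(len(energy) * cutoff)
    total = sum(energy)
    return sum(energy[split:]) / total if total > 0 else 0.0

n = 64
# A slow brightness wave: stands in for a row of a natural camera image.
natural = [0.5 + 0.3 * math.sin(2 * math.pi * i / n) for i in range(n)]
# The same row with a faint pixel-to-pixel alternation added (hypothetical artifact).
synthetic = [v + 0.05 * (-1) ** i for i, v in enumerate(natural)]

print(high_frequency_ratio(natural))
print(high_frequency_ratio(synthetic))
```

The added pattern changes each pixel by only 0.05, yet it concentrates in the highest frequency bin and clearly shifts the ratio. Production detectors learn such signatures from data rather than relying on one hand-picked statistic.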

Reliable detection requires models trained specifically on synthetic media from contemporary generation tools — not general-purpose computer vision systems. ScamAI's Eva-v1 model is continuously updated to detect outputs from new generation tools as they emerge, including GPT-Image-2, Flux, and other platforms that were not available when earlier detection models were trained.

What detection accuracy looks like in practice

ScamAI's Eva-v1 model achieves 95.3% accuracy on image and video deepfake detection, and 98.5% accuracy on voice clone detection. These figures are measured on in-the-wild benchmarks — real deepfakes collected from the internet, not laboratory-generated samples that detection models were trained on.

Accuracy numbers require context. A 95.3% accurate detector applied to 1,000 images will generate approximately 47 incorrect results — either missing real deepfakes (false negatives) or flagging authentic images (false positives). For KYC onboarding, false positives represent friction for legitimate users; false negatives represent fraud risk. Choosing the right sensitivity threshold depends on the specific use case.
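The arithmetic behind the 47-error figure, and the reason accuracy alone does not say how those errors split, can be worked through directly. In the sketch below, the 10% deepfake prevalence and the two example error splits are hypothetical values chosen for illustration. Only the 95.3% accuracy and the 1,000-image sample come from the text above.

```python
def expected_errors(accuracy: float, n_items: int) -> float:
    """Expected number of wrong results for a detector with the given accuracy."""
    return (1.0 - accuracy) * n_items

def accuracy_from_counts(tp: int, tn: int, fp: int, fn: int) -> float:
    """Accuracy = correct results / all results."""
    return (tp + tn) / (tp + tn + fp + fn)

errors = expected_errors(0.953, 1000)
print(round(errors))  # 47

# Two hypothetical detectors on 1,000 images, 100 of which are deepfakes.
# Both make 47 errors and so have identical accuracy, but the risk differs.
friction_heavy = dict(tp=98, fn=2, tn=855, fp=45)  # flags many authentic images
fraud_heavy = dict(tp=60, fn=40, tn=893, fp=7)     # misses many deepfakes

print(accuracy_from_counts(**friction_heavy), accuracy_from_counts(**fraud_heavy))
```

Both hypothetical detectors report 95.3% accuracy, yet one misses 2 deepfakes in 100 and the other misses 40. That gap is why false positive and false negative rates should be examined alongside the headline number.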

Pro Tip

ScamAI's API returns a confidence score alongside each detection result. Setting your own threshold lets you tune the trade-off between false positives and false negatives for your specific use case.
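A threshold sweep makes this trade-off visible. The sketch below does not call ScamAI's API. It assumes each result has already been reduced to a confidence score between 0 and 1, with higher values meaning more likely a deepfake. The scores, labels, and function name are invented for illustration.

```python
def count_errors(scored: list[tuple[float, bool]], threshold: float) -> tuple[int, int]:
    """Return (false_positives, false_negatives) at the given threshold.

    scored: (confidence_score, is_actually_deepfake) pairs.
    An item is flagged as a deepfake when its score is >= threshold.
    """
    false_positives = sum(1 for score, is_fake in scored if score >= threshold and not is_fake)
    false_negatives = sum(1 for score, is_fake in scored if score < threshold and is_fake)
    return false_positives, false_negatives

# Hypothetical labeled validation set: (score, ground truth).
validation = [
    (0.97, True), (0.91, True), (0.74, True), (0.55, True),
    (0.62, False), (0.35, False), (0.12, False), (0.05, False),
]

for threshold in (0.3, 0.6, 0.9):
    fp, fn = count_errors(validation, threshold)
    print(f"threshold={threshold}: false positives={fp}, false negatives={fn}")
```

On this toy data, raising the threshold trades false positives for false negatives. A fraud-sensitive flow such as KYC onboarding might accept the extra friction of a lower threshold, while a use case where wrongly flagging authentic media is costly might choose a higher one.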
