Can GPT-4 Vision Detect Deepfake AI Images?
Based on All About AI's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Briefing
GPT-4 Vision can flag some AI-made images with high confidence, but its results are shaky, especially when the manipulation is subtle or when “common sense” cues dominate over pixel-level evidence. In repeated tests, the system latched onto visible inconsistencies (mismatched lighting, edge clarity, compositing artifacts), and its fake probabilities ranged from as low as 5% for a real photo to as high as 90% for certain manipulated composites.
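For readers who want to try this kind of probe themselves, here is a minimal sketch of asking a vision-capable GPT-4 model to score a single image, using the OpenAI Python SDK. The model name (gpt-4o), the prompt wording, and the image URL are assumptions added for illustration; the video does not document an exact prompt, and the original tests were run interactively rather than through code.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical URL for the image under suspicion.
image_url = "https://example.com/suspect-composite.jpg"

response = client.chat.completions.create(
    model="gpt-4o",  # assumption: any GPT-4-class model that accepts image input
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": (
                        "Inspect this image for signs of manipulation: noise patterns, "
                        "edge sharpness around objects, lighting direction, and "
                        "compositing artifacts. Finish with one line in the form "
                        "'Fake probability: N%'."
                    ),
                },
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }
    ],
    max_tokens=500,
)

print(response.choices[0].message.content)
```

Asking for a single “Fake probability: N%” line keeps the free-text answer easy to parse, which matters if you want to compare scores across many images.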
A clear win came with a cat-and-Trump composite. After analyzing artifacts, pixel characteristics, noise levels, and lighting direction, the model concluded the image was likely manipulated, assigning a 90% probability of being fake. The reasoning leaned on measurable-looking differences: the cat’s edges (around the ears and tail) appeared sharper than expected, the face and surrounding elements showed inconsistent clarity, and the cat and the man seemed to carry different noise patterns. Those quality mismatches, typical of an element pasted in as an overlay, pushed the assessment toward fabrication.
The system’s scoring was less consistent elsewhere, and not always grounded in technical evidence. A real Pope Francis image was rated as likely genuine, with only a 5% chance of being fake, suggesting the model can avoid overreacting when no obvious manipulation signals appear. But the separate viral “Pope in a puffy jacket” image drew an 80% fake probability, with the critique hinging largely on contextual plausibility: the unusual attire on a public religious figure raised suspicion even though the artifact checklist found no dramatic technical anomalies.
When the manipulation was harder to see, or when the image looked broadly coherent, scores dropped. A Lamborghini image generated with Stable Diffusion (along with an alternate off-road-style version) was given only a 10% fake probability, despite the creator’s expectation that it would be obviously synthetic. Similarly, a grainy UFO-related image presented in a U.S. Navy context was rated 20% fake (80% authentic), with the analysis treating the grain and screen-like overlays as consistent with device capture rather than editing.
The most telling pattern emerged in a direct comparison test using the Taylor Swift tattoo scenario. In one run, GPT-4 Vision assigned moderate fake probabilities (40–65%) to an image where tattoos were added. After uploading the “real” version for comparison, the model’s conclusion sharpened dramatically: the tattoo-added image was rated 90% fake. The takeaway is that the model’s confidence improves when it can anchor its judgment against a reference image, making “what changed” easier to detect.
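The same difference-based approach can be reproduced programmatically by sending a reference image and a suspect image in one request and asking the model to describe what changed. The sketch below is only an illustration of that idea: the model name, prompt wording, and URLs are placeholders, not what the video used.

```python
import re

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical URLs: the trusted reference first, then the suspect image.
reference_url = "https://example.com/original.jpg"
suspect_url = "https://example.com/edited.jpg"

response = client.chat.completions.create(
    model="gpt-4o",  # assumption: any GPT-4-class model that accepts image input
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": (
                        "The first image is a trusted reference. Compare the second "
                        "image against it, describe what has changed, and finish with "
                        "one line in the form 'Fake probability: N%'."
                    ),
                },
                {"type": "image_url", "image_url": {"url": reference_url}},
                {"type": "image_url", "image_url": {"url": suspect_url}},
            ],
        }
    ],
    max_tokens=500,
)

answer = response.choices[0].message.content
print(answer)

# Pull out the final percentage so the comparison stays machine-readable.
match = re.search(r"Fake probability:\s*(\d+)%", answer)
if match:
    print("Parsed fake probability:", int(match.group(1)))
```

Parsing the final percentage line keeps the score machine-readable while the rest of the reply preserves the model’s reasoning about what changed.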
Overall, the results point to a practical conclusion: GPT-4 Vision isn’t reliable enough to be trusted blindly for deepfake detection, but it can be useful—especially when manipulations create detectable compositing inconsistencies or when comparisons to known-good references are available. The broader implication is that AI-assisted detection may be necessary as synthetic imagery becomes more common and harder for humans to spot by eye, even if current tools still miss the mark.
Cornell Notes
GPT-4 Vision sometimes detects manipulated images with strong confidence, particularly when edits create clear compositing signals such as mismatched noise patterns, inconsistent lighting direction, or unusually sharp edges around inserted elements. In tests, a cat-and-Trump composite was rated 90% fake, while a real Pope Francis photo was rated only 5% fake. Results were less dependable elsewhere: a Stable Diffusion Lamborghini variant received just a 10% fake probability, and a grainy UFO-related capture was rated 20% fake. Confidence improved most when a reference image was provided; after comparing two Taylor Swift versions, the tattoo-added image jumped to 90% fake, suggesting “difference detection” boosts accuracy.
What kinds of visual inconsistencies most reliably pushed GPT-4 Vision toward “fake” in the tests?
How did GPT-4 Vision handle false positives when given real images?
Why did the “Pope puffy jacket” case land at a high fake probability even without obvious artifact findings?
What happened in the Taylor Swift tattoo experiment, and what changed after comparison?
Why might AI-generated-looking images still receive low fake probabilities?
Review Questions
- Which specific visual signals (e.g., noise, edge sharpness, lighting direction) most increased the fake probability in the cat-and-Trump composite?
- What evidence did the model rely on more heavily in the “Pope puffy jacket” case—technical artifacts or contextual plausibility?
- How did providing a reference image change the Taylor Swift tattoo detection outcome, and what does that imply about difference-based forensics?
Key Points
1. GPT-4 Vision can assign very high fake probabilities (up to 90%) when composites show measurable inconsistencies like mismatched noise patterns and lighting direction.
2. The model can also avoid false positives on some real photos, such as rating a Pope Francis image as only 5% likely fake.
3. Confidence drops when edits produce images that remain visually coherent, as seen with a Stable Diffusion Lamborghini variant rated 10% fake.
4. Contextual “common sense” reasoning can drive high fake scores even when the artifact checklist finds little, as in the Pope puffy jacket case at 80%.
5. Providing a reference image can dramatically improve detection, as the Taylor Swift tattoo-added image rose to 90% fake after comparison.
6. Deepfake detection remains too unreliable to trust blindly; it works best when manipulations create detectable compositing artifacts or when comparisons are available.