Recently I've been approached by different news organisations to comment on deepfaked images and videos. In most cases they already know whether the thing is a fake or not, but they want to know why. It's been a pretty fascinating thing to be tasked with, honestly, and some of the examples completely caught me by surprise (warning: Daily Mail link). Many of us see faked images on a daily basis now, but there's not a lot of writing about how fakes are detected other than one-liner folk knowledge on places like Twitter. I thought I'd write a bit about how I approach the problem.
I should stress before I go on - I'm not an 'expert' at detecting deepfakes. I don't think anyone really is today, as it's just not something people have had to specialise in. I do end up looking at a lot of AI-generated art as part of my job, so I maybe have a little more experience than average, but I want to be clear that I'm not saying these are foolproof ways to detect fakes. They're just a second opinion, for you to take, think about and combine with your own experience. Ok, enough disclaimers, let's dive in.
Text
Text is still one of the hardest things for AI image generators to do, and even the models that do it well now often struggle in photorealistic settings (also, a lot of fakes aren't made with the most cutting-edge commercial models). Text is a good thing to check because it needs to be both structurally correct (the letters need to look like letters, the font needs to be consistent) and semantically correct (it needs to mean something in a real language humans speak).
The biggest tell in the recent Trump deepfake that hit the news headlines was text - look at the hat being worn in the top-left of the image, and to a lesser extent on the shirt right-of-center:

Printed text in particular has high regularity requirements - letters will be the same size, aligned with one another and consistently styled. In the hat here we can see the letters don't look like letters, and they aren't centred as we'd expect them to be either. It's obviously possible to make a hat like that (actually making clothing that intentionally has what looks like deepfaked text would be pretty funny) but it's not usual for clothing, so it's a red flag.
Continuity Errors
Check out this image of Joe Biden. It's fake:

There are lots of ways that things in images can be connected to one another. For example, in the image above, we would find it weird if the people in the background were wearing clown costumes instead of US military uniforms. The people are connected through the meaning of the image; there's a consistency we expect. Connections can also be very fine-grained - on the US flag patch on Biden's arm, we expect the stripes to alternate between white and red and to be perfectly horizontal. These are examples of things that are loosely connected over a large distance (the people in the image) and more tightly connected to their immediate surroundings (the individual stripes on a flag).
AI image generators can handle both types of context, but they sometimes struggle when the two are combined - when something has to keep a consistent connecting detail across a gap, or across a big portion of the image. Look at the line on the inside edge of this plate, where it meets the pastry:

If we zoom in really close, we can see there's a line that follows the edge of the plate, curving, disappearing under the pastry. But it doesn't emerge out the other side. In fact, if you look closely, it stops abruptly just before it meets the pastry.

This is quite a subtle detail to pick up, but it's something you sometimes see when a shape is interrupted by something else. To us, the shape should obviously continue underneath the object and come out the other side. I like to think of this as the image generator wrestling between making the image more coherent locally and more coherent globally. It's tried to match the curve of the bottom of the pastry and align the line on the plate with it, instead of realising that the line is part of a circular pattern on the plate itself. You can see a similar effect in the same image in a wider shot. Check out the grain on the table:

Intuitively we know that wooden tables should have fairly consistent lines marking where pieces of wood were joined together. We can accept that they might not be perfectly aligned, because the table might have a more rustic look, but the line separating two pieces of wood in the top part of the table just disappears where it should emerge on the other side of the plate.

Again, it's not a guaranteed, damning bit of evidence, but it's something to look out for - shapes that you know should be completed, but that are obscured or covered by something else. For a more obvious example, I won't embed this here (cw: cats in distress, although it is fake) but here's a fake photo of two cats hugging amidst wreckage in a war. You can see that one of the paws supporting the ginger cat is at completely the wrong angle, because the bodies of both cats are obscuring it. To the AI, it could be connected to either cat - but the cat it appears to be connected to already has two forelegs visible.
Context Interference
AI image generators aren't very good at drawing archers. In fact, I was surprised to find people complaining about this specific use-case online:

There's actually something really interesting about the example images attached to the Reddit post, something that I see a lot with all sorts of deepfakes. Check this out and see what stands out to you (I mean in fairness there's a lot going on, this is not a good bit of AI art):

Many modern AI image generators use a process called diffusion, which starts with a noisy image and slowly removes the noise, replacing it with pixels that make it look more and more like an image that suits the input. We can imagine it like a big array of sensors, each one measuring something different about the image, all trained to look for things that we associate with images labelled as 'archers'. Because of the images the model was trained on, that will probably mean forests and the colour green, hoods, bows, that sort of thing. We can imagine each of these features getting its own little sensor (this is not how it works, of course, it's just an analogy) that detects its specific archer-related tell.
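To make the analogy slightly more concrete, here's a tiny toy sketch in Python. To be clear, this is not how diffusion models actually work - there's no neural network, no noise schedule, nothing learned - it just nudges random pixels around to please a couple of hand-written 'detectors', so you can see what 'jostling pixels to satisfy competing sensors' might look like. Every name and scoring rule in it is invented purely for illustration.

```python
# Toy analogy only: hand-written "archer detectors" scoring a random pixel grid.
# Real diffusion models denoise step by step under a learned neural network.
import random

def greenness(img):
    # Forests and green hoods: reward pixels whose green channel beats red/blue.
    return sum(px[1] - (px[0] + px[2]) / 2 for row in img for px in row)

def darkness(img):
    # Moody woodland lighting: reward a darker image overall.
    return -sum(sum(px) for row in img for px in row)

archer_detectors = [greenness, darkness]  # our invented "sensors"

def total_score(img):
    # The generator can't please every detector fully; it settles for a compromise.
    return sum(det(img) for det in archer_detectors)

# Start from pure noise: an 8x8 grid of random RGB values in [0, 1].
random.seed(0)
img = [[[random.random() for _ in range(3)] for _ in range(8)] for _ in range(8)]

# "Jostle the pixels around": try small random tweaks and keep any change
# that makes the combined detector score go up.
for _ in range(2000):
    y, x, c = random.randrange(8), random.randrange(8), random.randrange(3)
    old_val = img[y][x][c]
    old_score = total_score(img)
    img[y][x][c] = min(1.0, max(0.0, old_val + random.uniform(-0.1, 0.1)))
    if total_score(img) < old_score:
        img[y][x][c] = old_val  # revert tweaks the detectors don't like

print(f"final combined detector score: {total_score(img):.2f}")
```

Even in this toy, the two detectors pull against each other (one wants more green, the other wants everything darker, green included), so the loop ends up at a compromise rather than fully satisfying either - which is the part of the analogy that matters for what follows.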
The image generator can't satisfy all of our Archer Detectors at once, but by jostling the pixels around to maximise its score, it ends up finding its way to an image that scores relatively highly (this is linked to why AI image generators are good at finding images which are somehow both novel and not innovative at all at the same time). Anyway, imagine we're halfway through this process and you, the AI, are looking at an image like this:

You know that archers hold bows which have strings on, and you also know they often wear fantasy leather armour with lots of belts. The bit circled in red here is pleasing both aspects of your archer detector right now. But you've got a problem: you need to keep removing noise and making the image more 'final'. Bowstrings and belts look similar in a blurry, noisy image, but they don't look the same at all in a finished one. You decide to turn it into a belt, and you get what you see in the finished piece - a belt that almost perfectly follows the line where the bowstring should be. It's not that the bowstring is missing; it's been turned into something else entirely. I've tried to highlight it here:

This is a really interesting mode of failure to me. It's also vanishingly rare in real photographs, especially candids, where things are unposed and tend to be randomly arranged. In faked photos I most often see it in folds of clothing or in wires and cables, which often get confused with one another. You can sort of see another example of it here: a phone cable is draped over Biden's arm, seemingly leading to nothing. It's likely that at some point this also looked a bit like a shadow in the folds of his sleeves, or something else that eventually got completed into a cable.

Texture, Lighting and Composition
AI image generators have an interesting tendency to always try and make pretty images. This isn't accidental - it's the result of tweaking them towards making 'good' output, because that's often what people want. I've experienced this myself: when I was testing some generators last year to make slides for my New Scientist talk, I tried to create an image of a particular scene without dramatic sunset lighting, and no matter how hard I tried I simply couldn't do it. Many generators are engineered, fine-tuned or otherwise tweaked to juice up whatever prompt you give them.
As a result, one of the biggest tells I find in low-effort fakes is the textural quality of the image. The only problem is that it's quite hard to explain or teach someone what to look for - either it looks right to you or it doesn't. For example, look at this second image from Trump's recent AI deepfake set:

This looks more like a digital painting than a photograph. Most portraiture, especially quick stuff done for the news or political campaigns, or candids shot by the public, does not come out slick and fuzzy like this. There are no textural flaws on anyone's skin or clothes; everything is smooth and evenly lit, with no blemishes, tears or even strange glares. Photos never look like this.

Here's another image, allegedly a portrait photo of two people on their wedding day. Again, the lighting and skin texture are completely off. Even models on a photoshoot do not look like this without extensive photo editing after the fact, which is simply unlikely here. The problem is that not only is this hard to teach someone, it's also not very compelling as evidence. But it's a good thing to train your eye on, as a first-round suspicion of sorts.
Sleight of Hands
You'll notice I didn't mention fingers, hands or arms anywhere in this post. That's because while they're a common trope about AI, thinking about fakes only in terms of these specific examples leaves us open to being tricked by AI organisations who simply focus their efforts on those specific issues - and they've done just that. Prompters began to shift towards prompts which hid hands, and more recent models are much better at rendering them. Having too many fingers or hands is a snappy thing everyone remembers about AI generators, but it's important that we try and learn why they were tells in the first place. If we start to think in more general terms about what AI struggles with and how to spot it, we can keep looking for new examples even when the old ones get fixed.
AI technology is always shifting, so even these guidelines won't be appropriate forever. But the more you learn about how AI systems work, what they try to optimise for and the things they cannot do, the better you'll eventually be able to reason about new examples yourself. In the meantime, don't be drawn into making generalisations and assumptions about AI technology, because I guarantee you those same assumptions will be used to trick you before long.
Video may or may not become a whole new field for us soon. Sora made a big splash when OpenAI announced it recently, but video is an order of magnitude harder than images and comes with a lot of new complications. We can generate audio, but there aren't systems that do both at once yet, and then there are issues like lip syncing and so on. It's definitely possible to compose all of this stuff together right now, but we're not at a point where it's so trivial that the Internet is flooded with it. When that point arrives, we'll probably be able to apply some of the ideas listed here, but we'll need new ones too.
Conclusions
You can't ever know if something is really real or not on the Internet any more - but that has, to some extent, been true for a long, long time. It's also very easy to fool most people with things that don't even look particularly real. What we're experiencing now isn't exactly new; it's just a gear shift that we aren't used to. I see very smart people regularly share TikTok content that is clearly staged or fake and believe it's real - the Internet is and always has been full of lies, and that's also where a lot of its charm and playfulness comes from! I'm only saying all this because I don't want you to despair that this is the end of truth - you will adjust, you will learn to spot new things, and you'll also learn not to trust certain sources that you maybe did trust before. Ultimately that might be a good thing, because pre-genAI we probably fell for a lot of lies without even realising, so a bit of a wake-up call might help us in the long run.
In any case, I hope you found this interesting and my examples helpful! They are not clear-cut rules, just some insight into how I think about some of this stuff. Good luck out there and stay safe.
Thanks to Chris and Fed for feedback on an initial draft of this piece.