You tap your subject's face. The camera hesitates, hunts, then snaps to the brickwork six inches behind them. Sharp mortar lines, beautifully blurred nose. You tap again. Same result. The wall wins.
This isn't a cheap-camera problem. It happens on flagship phones, on cameras that cost more than a month's rent, on hardware that can identify a dog's breed in a dim room. The culprit is a fundamental mismatch between how phase-detection autofocus works and what repeating geometric patterns look like to a sensor.
The split-pixel trick that usually works brilliantly
Modern phone cameras focus using phase-detection autofocus (PDAF). Dedicated pixels on the sensor are masked so half see light from the left side of the lens, half from the right. The processor compares those two views: if they're offset, the lens is out of focus, and the direction of the offset tells the motor which way to move and by how much. Fast, usually under 100 milliseconds. It's why tapping a face in good light produces an almost instant lock.
The whole system depends on one assumption: the two half-images look different from each other only because of focus error. Fix the focus, align the images, done.
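To make the comparison concrete, here is a minimal sketch of the matching step in one dimension. It is an illustration, not any phone's actual firmware: the function names, the sum-of-absolute-differences cost and the random "texture" are all assumptions chosen to keep the idea visible.

```python
import random

def sad(left, right, shift):
    """Mean absolute difference between left[i] and right[i + shift] over their overlap."""
    pairs = [(left[i], right[i + shift])
             for i in range(len(left))
             if 0 <= i + shift < len(right)]
    return sum(abs(a - b) for a, b in pairs) / len(pairs)

def estimate_shift(left, right, max_shift):
    """Return the shift in [-max_shift, max_shift] with the lowest matching cost."""
    return min(range(-max_shift, max_shift + 1),
               key=lambda s: sad(left, right, s))

# Hypothetical scene: a non-repeating texture, like skin or foliage.
random.seed(0)
scene = [random.random() for _ in range(200)]

true_shift = 7                                    # offset caused by focus error
left = scene[20:120]                              # what the "left" pixels see
right = scene[20 - true_shift:120 - true_shift]   # same scene, displaced

found = estimate_shift(left, right, max_shift=15)
print(found)
```

On a texture with no repetition, only one shift lines the two views up, so the estimate is unambiguous. Its sign stands in for the direction the lens must move and its size for how far.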
Repeating patterns break that assumption completely.
Picture a tiled floor, thirty identical white squares separated by dark grout lines. The left-eye image and the right-eye image of that floor are offset, yes. But each is also nearly identical to itself shifted by one full tile. The processor can't tell whether the real offset is the small one caused by the focus error, or that same offset plus one whole tile, or plus two. So it picks a candidate, often the wrong one, and locks. Think of it like trying to align two identical combs: every tooth looks like a match.
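The comb effect is easy to reproduce with a toy one-dimensional "floor". The sketch below is an assumption-laden illustration rather than a real PDAF pipeline: the tile width, the search range and the mean-absolute-difference cost are invented for the example.

```python
def sad(left, right, shift):
    """Mean absolute difference between left[i] and right[i + shift] over their overlap."""
    pairs = [(left[i], right[i + shift])
             for i in range(len(left))
             if 0 <= i + shift < len(right)]
    return sum(abs(a - b) for a, b in pairs) / len(pairs)

def matching_shifts(left, right, max_shift, tol=1e-9):
    """List every shift in the search range whose matching cost is effectively zero."""
    return [s for s in range(-max_shift, max_shift + 1)
            if sad(left, right, s) <= tol]

period = 10
tile = [1.0] * 8 + [0.0] * 2        # 8 bright tile samples, 2 dark grout samples
scene = tile * 30                   # thirty identical tiles in a row

true_shift = 3                      # the offset actually caused by focus error
left = scene[50:150]
right = scene[50 - true_shift:150 - true_shift]

candidates = matching_shifts(left, right, max_shift=25)
print(candidates)                   # [-17, -7, 3, 13, 23]
```

Every candidate is the true offset plus or minus a whole number of tiles, and all of them match perfectly. Nothing in the signal itself says which one is real.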
Computer vision calls this the correspondence problem, a close cousin of the aperture problem: a matcher looking at a periodic texture cannot tell which repeat belongs to which. It has been documented in optical engineering since the film era. Contrast-based autofocus, the older method where the camera hunts for peak sharpness by moving the lens back and forth, fails on these scenes for a related reason: a repeating pattern can produce several local contrast peaks as the lens moves, not just one. The lens finds the nearest peak. Not yours.
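A minimal sketch of that "nearest peak" behaviour, assuming the simplest possible hill-climbing search. The contrast curve is invented for illustration, and real contrast autofocus usually adds coarse sweeps and other safeguards that this toy leaves out.

```python
def hill_climb(curve, start):
    """Step the lens one position at a time toward higher contrast.

    Stops at the first position whose neighbours are no sharper,
    which is a local peak and not necessarily the best one.
    """
    pos = start
    while True:
        neighbours = [p for p in (pos - 1, pos + 1) if 0 <= p < len(curve)]
        best = max(neighbours, key=lambda p: curve[p])
        if curve[best] <= curve[pos]:
            return pos
        pos = best

# Hypothetical contrast score at each lens position:
# a small false peak at index 3 and the true peak at index 10.
curve = [0.10, 0.20, 0.35, 0.50, 0.35, 0.25, 0.20,
         0.30, 0.55, 0.80, 1.00, 0.80, 0.50]

print(hill_climb(curve, start=1))   # 3  (stuck on the false peak)
print(hill_climb(curve, start=7))   # 10 (happened to start on the right slope)
```

Which peak the search lands on depends entirely on where the lens was when it started.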
Why some scenes are worse than others
Frequency matters. Fine patterns (window screens, herringbone fabric, a chain-link fence) confuse autofocus more aggressively than coarse ones because there are more ambiguous candidates within the sensor's search range.
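The arithmetic behind that is simple enough to sketch. Assuming a fixed search range measured in sensor samples (the numbers here are made up), the count of indistinguishable offsets grows as the pattern's period shrinks:

```python
def ambiguous_candidates(period, max_shift, true_shift=0):
    """Shifts within the search range that differ from the true one by whole periods."""
    return [s for s in range(-max_shift, max_shift + 1)
            if (s - true_shift) % period == 0]

coarse = ambiguous_candidates(period=40, max_shift=25)   # wide paving slabs
fine = ambiguous_candidates(period=4, max_shift=25)      # window screen

print(len(coarse))   # 1
print(len(fine))     # 13
```

With the coarse pattern, only the true offset fits inside the search range. With the fine one, a dozen impostors fit alongside it.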
Angle helps, a little. A brick wall shot dead-on is maximally periodic. Tilt the camera slightly and perspective distortion breaks the perfect repetition, giving the algorithm more to grip. Photographers do this instinctively, without knowing why it works. Knowing why it works is better.
Depth separation is the real fix. Take two people shooting the same tiled hotel lobby, floor stretching to a far wall. Priya is trying to focus on a bag sitting directly on the tiles. Tom is photographing a person standing in front of the tiles, about two metres of air between subject and wall. Priya's camera hunts and misfires repeatedly: the bag and the floor behind it are almost the same distance, so the phase signal is ambiguous. Tom's camera locks cleanly because the depth gap gives PDAF a clear reading that isn't replicated anywhere in the tile pattern. Depth separation, not subject size, is what saves the shot.
Object recognition helps too. If the camera app's AI layer identifies a face or a pet, it biases the focus search toward that region and ignores the competing pattern signals behind it. That's why portrait mode on most current phones handles brick walls better than a standard tap-to-focus: the face detection is doing the heavy lifting before the optical algorithm even runs. The optics haven't changed. The politics of which signal gets priority have.
What you can actually do about it
Tap and hold instead of just tapping. On most phones, a long press locks both focus and exposure (AF/AE lock). Move the camera so the focus point covers your subject against a plain background, lock, then reframe. Two seconds of deliberate work beats three ruined shots, every time.
Close the distance. Get nearer to your subject, or move the subject further from the background. The bigger the depth gap between the two, the harder it is for the pattern to fool the sensor.
Switch to portrait mode even when you don't want the blur. The face-detection layer stabilises focus independently of whatever chaos is happening behind your subject.
And if your phone is still hunting on a featureless concrete wall with no repeating structure at all, that's a different problem entirely: low contrast, not periodicity. Worth knowing the difference.
The camera isn't confused. It's doing exactly what the physics demands. The scene handed it a question with thirty identical right answers, and it picked one. Unfortunately, none of them were yours.