Do 404 errors hurt SEO? It’s a simple question. However, the answer is far from simple. Most 404 errors don’t have a direct impact on SEO, but they can eat away at your link equity and user experience over time.
There’s one variety of 404 that might be quietly killing your search rankings and traffic.
404 Response Code
What is a 404 exactly? A 404 response code is returned by the server when there is no matching URI. In other words, the server is telling the browser that the content is not found.
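To make that concrete, here is a minimal Python sketch of what "a 404" means at the status code level. The helper names is_not_found and describe are made up for illustration. The only thing taken from the standard library is http.HTTPStatus, which carries the standard code and reason phrase.

```python
from http import HTTPStatus


def is_not_found(status_code: int) -> bool:
    """Return True when the server reported that nothing matches the URI."""
    return status_code == HTTPStatus.NOT_FOUND


def describe(status_code: int) -> str:
    """Return the code with its standard reason phrase."""
    status = HTTPStatus(status_code)
    return f"{status.value} {status.phrase}"


if __name__ == "__main__":
    for code in (200, 301, 404):
        print(describe(code), "| not found:", is_not_found(code))
```

However you fetch a page, the number that comes back is all a browser or crawler has to go on, which is why the rest of this post focuses on where the broken link lives rather than on the code itself.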
404s are a natural part of the web. In fact, link rot studies show that links regularly break. So what’s the big deal? It’s … complicated.
404s and Authority
One of the major issues with 404s is that they stop the flow of authority. It just … evaporates. At first, this sort of bothered me. If someone linked to your site but that page or content is no longer there, the citation is still valid. At that point in time, the site earned that link.
But when you start to think it through, the dangers begin to present themselves. If authority passed through a 404 page I could redirect that authority to pages not expressly ‘endorsed’ by that link. Even worse, I could purchase a domain and simply use those 404 pages to redirect authority elsewhere.
And if you’re a fan of conspiracies then sites could be open to negative SEO, where someone could link toxic domains to malformed URLs on your site.
404s don’t pass authority and that’s probably a good thing. It still makes sense to optimize your 404 page so users can easily search and find content on your site.
Types of 404s
Google is quick to say that 404s are natural and not to obsess about them. On the other hand, they’ve never quite said that 404s don’t matter. The 2011 Google post on 404s is strangely convoluted on the subject.
The last line of the first answer seems to be definitive but why not answer the question simply? I believe it’s because there’s a bit of nuance involved. And most people suck at nuance.
While the status code remains the same there are different varieties of 404s: external, outgoing and internal. These are my own naming conventions so I’ll make it clear in this post what I mean by each.
Because some 404s are harmless and others are downright dangerous.
External 404s occur when someone else is linking to a broken page on your site. Even here, there is a small difference since there can be times when the content has legitimately been removed and other times when someone is linking improperly.
Back in the day many SEOs recommended that you 301 all of your 404s so you could reclaim all the link authority. This is a terrible idea. I have to think Google looks for sites that employ 301s but have no 404s. In short, a site with no 404s is a red flag.
A request for domain.com/foobar should return a 404. Of course, if you know someone is linking to a page incorrectly, you can apply a 301 redirect to get them to the right page, which benefits both the user and the site’s authority.
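A rough sketch of that rule in Python, assuming a site small enough to describe with two hand-made lookups. LIVE_PAGES, KNOWN_REDIRECTS and resolve are hypothetical names, and the paths are invented. The point is only the order of the checks: real pages first, deliberate redirects second, and a plain 404 for everything else.

```python
from typing import Optional, Tuple

# Hypothetical data: pages that exist, and known bad inbound links mapped
# to the page the linker clearly meant.
LIVE_PAGES = {"/", "/blog/404-errors"}
KNOWN_REDIRECTS = {"/blog/404-erors": "/blog/404-errors"}


def resolve(path: str) -> Tuple[int, Optional[str]]:
    """Return (status code, redirect target or None) for a requested path."""
    if path in LIVE_PAGES:
        return 200, None
    if path in KNOWN_REDIRECTS:
        # Only links we have reviewed get a 301.
        return 301, KNOWN_REDIRECTS[path]
    # No blanket redirect: unknown paths stay 404.
    return 404, None


if __name__ == "__main__":
    for requested in ("/", "/blog/404-erors", "/foobar"):
        print(requested, resolve(requested))
```

The design choice worth noticing is the missing catch-all. Nothing gets redirected unless someone put it in the redirect map on purpose.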
External 404s don’t bother me a great deal. But it’s smart to periodically look to ensure that you’re capturing link equity by turning the appropriate 404s into 301s.
Outgoing 404s occur when a link from your site to another site breaks and returns a 404. Because we know how often links evaporate this isn’t uncommon.
Google would be crazy to penalize sites that link to 404 pages. Mind you, it’s about scale to a certain degree. If 100% of the external links on a site were going to 404 pages then perhaps Google (and users) would think differently about that site.
They could also be looking at the age of the link and making a determination on that as well. Or perhaps it’s fine as long as Google saw that the link was at one time a 200 and is now a 404.
Overall these are the least concerning of 404 errors. It’s still a good idea, from a user experience perspective, to find those outgoing 404s in your content and remove or fix the link.
The last type of 404 is an internal 404. This occurs when the site itself is linking to a ‘not found’ page on its own site. In my experience, internal 404s are very bad news.
Over the past two years I’ve worked on squashing internal 404s for a number of large clients. In each instance I believe that removing these internal 404s had a positive impact on rankings.
Of course, that’s hard to prove given all the other things going on with the site, with competitors and with Google’s algorithm. But all things being equal eliminating internal 404s seems to be a powerful piece of the puzzle.
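Since these three names are my own, here is a small Python sketch of how I would tell them apart, given the page that contains the link and the URL that returns the 404. The function name classify_404 and the example hosts are assumptions for illustration. Note that it compares hostnames exactly, so www and non-www would count as different sites unless you normalize them first.

```python
from urllib.parse import urlsplit


def classify_404(source_url: str, broken_url: str, site_host: str) -> str:
    """Label a broken link by which side of it belongs to our site."""
    source_is_ours = urlsplit(source_url).hostname == site_host
    target_is_ours = urlsplit(broken_url).hostname == site_host
    if source_is_ours and target_is_ours:
        return "internal"   # our page links to our own missing page
    if target_is_ours:
        return "external"   # someone else links to our missing page
    if source_is_ours:
        return "outgoing"   # our page links to someone else's missing page
    return "unrelated"      # neither end is on our site


if __name__ == "__main__":
    host = "example.com"
    print(classify_404("https://example.com/post", "https://example.com/gone", host))
    print(classify_404("https://other.org/links", "https://example.com/gone", host))
    print(classify_404("https://example.com/post", "https://other.org/gone", host))
```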
Why Internal 404s Matter
If I’m Google I might look at the number of internal 404s as a way to determine whether the site is well cared for and shows attention to detail.
Does a high-quality site have a lot of internal 404s? Unlikely.
Taken a step further, could Google estimate the odds of a user encountering a 404 on a site and then use that to demote the site in search? I think it’s plausible. Google doesn’t want its users having a poor experience, so it might steer folks away from a site it knows has a high probability of ending in a dead end.
That leads me to think about the user experience when encountering one of these internal 404s. When a user hits one of these they blame the site and are far more likely to leave the site and return to the search results to find a better result for their query. This type of pogosticking is clearly a negative signal.
Internal 404s piss off users.
The psychology is different with an outgoing 404. I believe most users don’t blame the site for these but the target of the link instead. There’s likely some shared blame, but the rate of pogosticking shouldn’t be as high.
In my experience internal 404s are generally caused by bugs and absolutely degrade the user experience.
Finding Internal 404s
You can find 404s using Screaming Frog or Google Search Console. I’ll focus on Google Search Console here because I often wind up finding patterns of internal 404s this way.
In Search Console you’ll navigate to Crawl and select Crawl Errors.
At that point you’ll select the ‘Not found’ tab to find the list of 404s Google has identified. Click on one of these URLs and you get a pop-up where you can select the ‘Linked from’ tab.
I was actually trying to get Google to recognize another internal 404 but they haven’t found it yet. Thankfully I muffed a link in one of my posts and the result looks like an internal 404.
What you’re looking for are instances where your own site appears in the ‘Linked from’ section. On larger sites it can be easy to spot a bug that produces these types of errors by just checking a handful of these URLs.
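If you have more than a handful of these, a quick tally can surface the pattern faster than clicking through them. The Python sketch below assumes you have copied the ‘Linked from’ data into a plain dictionary by hand. It does not read any Search Console export format, and the names first_segment and internal_link_sources, along with the sample URLs, are invented for illustration. It counts internal linking pages by their first path segment, on the theory that a buggy template tends to live in one section of a site.

```python
from collections import Counter
from urllib.parse import urlsplit


def first_segment(url: str) -> str:
    """Return the first path segment of a URL, e.g. '/category'."""
    parts = [part for part in urlsplit(url).path.split("/") if part]
    return "/" + parts[0] if parts else "/"


def internal_link_sources(linked_from: dict, site_host: str) -> Counter:
    """Count internal pages that link to 404s, grouped by site section."""
    counts = Counter()
    for broken_url, sources in linked_from.items():
        for source in sources:
            # Keep only 'Linked from' entries that are on our own site.
            if urlsplit(source).hostname == site_host:
                counts[first_segment(source)] += 1
    return counts


if __name__ == "__main__":
    # Hand-entered sample: broken URL -> pages that link to it.
    report = {
        "https://example.com/products/123-": [
            "https://example.com/category/shoes",
            "https://example.com/category/hats",
        ],
        "https://example.com/old-page": ["https://other.org/links"],
        "https://example.com/abuot": ["https://example.com/blog/hello"],
    }
    for section, count in internal_link_sources(report, "example.com").most_common():
        print(section, count)
```

A section that dominates the tally is a good place to start looking for the bug.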
In this case I’ll just edit the malformed link and everything will work again. It’s usually not that easy. Most often I’m filing tickets in a client’s project tracking system and making engineers groan.
Correlation vs Causation
Some of you are probably shrieking that internal 404s aren’t the problem and that Google has been clear on this issue and that it’s something else that’s making the difference. #somebodyiswrongontheinternet
You’re right and … I don’t care.
You know why I don’t care? Every time I clean up internal 404s, it produces results. I’m not particularly concerned about exactly why it works. Mind you, from an academic perspective I’m intrigued but from a consulting perspective I’m not.
In addition, if you’re in the new ‘user experience optimization’ camp, then eliminating internal 404s fits very nicely, doesn’t it? So is it the internal 404s themselves that matter, the way users behave once they’re gone, or something else entirely? I don’t know.
Not knowing why eliminating internal 404s works isn’t going to stop me from doing it.
This is particularly true since 404 maintenance is entirely in our control. That doesn’t happen much in this industry. It’s shocking how many people ignore 404s that are staring them right in the face, whether by not looking at Google Search Console or by not tracking down the 404s that crop up in weblog reports or deep crawls.
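For the weblog route, here is a minimal Python sketch that pulls internal 404s out of an access log. It assumes your server writes the common "combined" log format (request, status, size, referer, user agent), so adjust the pattern if yours differs. LOG_LINE and internal_404s_from_log are hypothetical names and the sample lines are made up. A hit counts as internal when the referer is on your own host.

```python
import re
from urllib.parse import urlsplit

# Matches: "METHOD /path PROTOCOL" STATUS SIZE "REFERER"
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)"'
)


def internal_404s_from_log(lines, site_host):
    """Return (missing path, referring page) pairs for internal 404s."""
    hits = []
    for line in lines:
        match = LOG_LINE.search(line)
        if not match or match.group("status") != "404":
            continue
        referer = match.group("referer")
        # A referer of "-" has no hostname and is skipped here.
        if urlsplit(referer).hostname == site_host:
            hits.append((match.group("path"), referer))
    return hits


if __name__ == "__main__":
    sample = [
        '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /abuot HTTP/1.1" '
        '404 512 "https://example.com/blog/hello" "Mozilla/5.0"',
        '203.0.113.6 - - [10/Oct/2023:13:56:01 +0000] "GET /old HTTP/1.1" '
        '404 512 "https://other.org/links" "Mozilla/5.0"',
    ]
    for path, referer in internal_404s_from_log(sample, "example.com"):
        print(path, "linked from", referer)
```

Referers are not always sent, so treat this as a floor rather than a complete list.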
Make it a habit to check and resolve your Not found errors via Search Console or Screaming Frog.
404 errors themselves may not directly hurt SEO, but they can indirectly. In particular, internal 404s can quietly tank your efforts, creating a poor user experience that leads to a low-quality perception and pogosticking behavior.