How Spammers Hijack Abandoned URLs to Spread SEO Garbage Across the Internet


Illustration: Jim Cooke/Gizmodo

“Was The Morningside Post website hacked?” a friend asked me. The site, which I once co-edited, seemed to have died, and returned as a zombie version of itself. About five months ago, my successors at TMP—the student-run news publication at Columbia University’s School of International and Public Affairs—accidentally allowed their site’s web domain registration to lapse. A mysterious new owner snapped the site up, cloned its content, and transformed it all into sloppy, spammy garbage.

Today, there’s a new visual theme and a new, generic tagline, “World News.” References to Columbia University are gone. The old posts are there, in violation of TMP’s copyright, but they’re no longer formatted, just ugly walls of text. Author bylines now read “Morningside,” and later “Writer.” Comments sections are closed, and the old comments have been deleted.

The Morningside Post’s site, before (left) and after the conversion (right).

As of early June, there was just one new post, an advertorial promoting a Toronto-based drone photography company, SkySnap. It includes a prominent link to a strikingly similar article on the company’s site.

In an effort to get to the bottom of what had happened to my old site, I came to realize this wasn’t a one-off. The internet is packed with nondescript, cookie-cutter garbage sites: endless badly written WordPress blogs on marketing and weight loss, keyword-splattered business directories, link-spammed content sections, and fake crowds of braying social media bots.

Why? In large part, because “black hat” and “grey hat” search engine optimizers (SEOs)—those who knowingly violate Google’s rules—create vast networks of interlinked spam content sites, in part on the ruins of the old, Web 1.0 Internet, for the sake of boosting their clients’ own sites to the top of Google’s search rankings. (A spokesperson for Google requested detailed questions, then declined to comment for this story.)

Buying up dead domains for SEO is not a new technique. In the words of Jason Duke, industry veteran and co-founder of the.domain.name, an SEO data site, “Bringing a site back to life, taking the power of its links, that’s been around for 15-16 years.”

The Morningside Post’s new visual theme. “Before” on the left; “after” on the right.

After repeated requests for comment, Slava Gravets, CEO of SkySnap (and partner at AdWave, an SEO and web marketing firm) told Gizmodo, “It definitely looks like a sponsored link, like a bought link. We do a lot of link building. We sponsor posts on external websites… It’s very possible we bought the link as part of a batch.” He suggested that a blogger—who SkySnap hired to boost its Google ranking—may have just purchased the TMP domain through a for-hire exchange such as Fiverr. He claimed that SkySnap had no way of knowing the specific sources of all its incoming links and had no control over the website hijacked to link back to their services.

While it’s unclear who bought The Morningside Post’s site (TMP has since resumed publishing, now on a different site), the new owner’s reason for doing so was straightforward, Duke and Gravets both said. It’s all about generating backlinks that boost the company’s Google rank. Incoming links, also called backlinks, are links posted on other sites pointing to your site. Google’s founders Larry Page and Sergey Brin built their pioneering search algorithm on the insight that backlink networks signal important information about sites’ relative “quality ranks.”

TMP’s site has backlinks from sites associated with Columbia University, Harvard University, the Atlantic, Business Insider, Al Jazeera, the Huffington Post and others, according to Moz.com, an SEO data portal. Those links confer status. They tell Google that TMP’s site should be ranked above a similar but lower-status site that covers the same subjects.

“If CNN links to another website, if any publication does, in Google’s minds, that’s a vote of confidence,” said Barry Schwartz, news editor of Search Engine Land, an SEO and search engine marketing (SEM) news site. “That website should rank higher.”

Having that status means new links posted to TMP’s site confer status too, if substantially less than links posted to, say, Harvard’s site, or the Atlantic’s, or Gizmodo’s. That’s what makes TMP’s site valuable to SEOs, and that’s how their Google page ranking improves.
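How status flows along links can be illustrated with a toy version of the published PageRank idea. This is only a sketch, not Google's production algorithm: the site names, the damping factor of 0.85, and the iteration count are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy PageRank. `links` maps each page to the list of pages it links to."""
    # Collect every page that appears as a source or a target, in a fixed order.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical link graph: two well-known sites link to a student paper,
# which links to one client; an unlinked blog links to another client.
example_links = {
    "university": ["student_paper"],
    "magazine": ["student_paper"],
    "student_paper": ["client"],
    "obscure_blog": ["other_client"],
}

ranks = pagerank(example_links)
for site in sorted(ranks, key=ranks.get, reverse=True):
    print(f"{site}: {ranks[site]:.3f}")
```

In this toy graph, the client linked from the well-connected student paper ends up with a higher score than the client linked from the blog nobody links to, which is the effect a buyer of an expired domain is paying for.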

“You’re page one, or you’re nothing,” said Duke. “That’s the reality of making money on Google.”

SEO is a very lucrative hustle for people who know how to do it well: Businesses in the U.S. spent an estimated $65 billion on SEO services in 2016, according to Borrell Associates, a web marketing data firm.

“Google is the new Yellow Pages,” said Duke. “You know how you have ‘AAAA Doorlock Removals?’ This is the same thing, except instead of lots and lots of A’s, it’s lots of links.”

“Links” is a broad term, though. The nuances of Google’s algorithm, and those of its competitors, remain fundamentally mysterious to outsiders. To find out how this works, I talked to Kyle Duck, an SEO consultant and founder of Triumph.ai, who has been optimizing an army of websites to produce backlinks. He claims he used to pull $10,000 a month solely from acting as a middleman in link rental transactions.

Duck said that with the rise of artificial intelligence and deep learning techniques, even the companies themselves may not fully know how this works.

In its Webmaster Guidelines, Google instructs would-be web publishers to focus on creating high-quality content and rationally-organized sites, and to avoid techniques that game the system. The company warns against “link schemes,” such as link selling, “excessive” link exchanging, and automated link creation, as well as tactics like hiding invisible text, presenting false-front sites to search engines, and creating sites that trick users into installing malware or viruses that steal information or exploit their systems. An excerpt:

Basic Principles

Make pages primarily for users, not for search engines.

Don’t deceive your users.

Avoid tricks intended to improve search engine rankings. A good rule of thumb is whether you’d feel comfortable explaining what you’ve done to a website that competes with you, or to a Google employee. Another useful test is to ask, “Does this help my users? Would I do this if search engines didn’t exist?”

Think about what makes your website unique, valuable, or engaging. Make your website stand out from others in your field.

Google frequently removes sites from its listings that it finds to be in violation of the guidelines. The company is secretive about its exact methods of detection. But we know that some sites are demoted automatically, through the algorithm that looks for tell-tale signs of spam sites, and some through what are called “manual actions” by Google employees—humans overseeing some of the automatic findings and responding to user-submitted spam reports.

Messages owners receive when Google employees penalize their sites.

“There are many shades of gray,” Joost de Valk, founder and CEO of Yoast, the top SEO plugin on WordPress, running on over 4.6 million sites, told Gizmodo in an email. “From buying dropped, related domains and reviving them and adding links, to building large networks of sites on older domains and not caring much about the history—there are tons of SEOs out there who’ve never done anything like this, but there are also many who like the ‘quick and dirty,’” he wrote. “That’s what this is.”

One way the latter sort of SEO avoids detection, said Duke, is by not placing advertisements on their sites. When advertisers make accounts with, for example, Google AdWords, they’re given a unique code to insert into a site’s source code so that it directs ad revenue to their account, with one account per credit card. Investigators can search sites for that unique code, and if they find a network of a few hundred sites all with the same AdWords code, it’s pretty likely they’re all owned by the same entity. Without a unique advertiser-identification code, this method won’t work. The front page of TMP’s site sports a conspicuous, unfilled ad slot.
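The detection method Duke describes amounts to grouping sites by a shared identifier found in their source code. A minimal sketch follows, assuming the ad account ID appears in the page source as `ca-pub-` followed by digits; that pattern, the function names, and the sample domains are assumptions for illustration, not a description of how any investigator actually works.

```python
import re
from collections import defaultdict

# Assumption: ad account IDs show up in page source as "ca-pub-" plus digits.
AD_ID_PATTERN = re.compile(r"ca-pub-\d+")


def group_sites_by_ad_id(pages):
    """Map each ad account ID to the sorted list of domains whose source contains it.

    `pages` maps a domain name to that site's HTML source as a string.
    """
    groups = defaultdict(set)
    for domain, html in pages.items():
        for ad_id in set(AD_ID_PATTERN.findall(html)):
            groups[ad_id].add(domain)
    return {ad_id: sorted(domains) for ad_id, domains in groups.items()}


def likely_networks(pages, min_sites=2):
    """Keep only the IDs shared by at least `min_sites` different domains."""
    return {
        ad_id: domains
        for ad_id, domains in group_sites_by_ad_id(pages).items()
        if len(domains) >= min_sites
    }


# Hypothetical page sources; a real check would fetch these from the sites.
sample_pages = {
    "a.example": '<script data-ad-client="ca-pub-1111"></script>',
    "b.example": '<ins data-ad-client="ca-pub-1111"></ins>',
    "c.example": '<ins data-ad-client="ca-pub-2222"></ins>',
    "d.example": "<p>No ad code at all</p>",
}

print(likely_networks(sample_pages))
```

A site like `d.example`, which carries no ad code, never appears in any group. That is why leaving the ad slot empty defeats this particular check.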

The site’s new owners took one more step to cover their tracks. Gizmodo had reached out to the new owners of the domain through a TMP “Contact Us” form. The next day, on June 7th, 2016, two new posts went up on TMP’s site (see above). One, a story involving President Donald Trump and drone policy, included a blatant grammatical error in its opening sentence. The second, comparing Apple and Amazon products, had nothing to do with the content of the site, and included another spammy link (to a site called “Gizmotimes,” as it happens). Two additional posts have gone up since, following the same playbook.

These are done as a form of “link laundering,” wrote Duck in an email, of the two posts. “The idea is to add relevant content to the site [that doesn’t include] links to your target site. The goal is to pass a manual review by one of Google’s PhD linguistic contractors.”

Whatever PBN (Private Blog Network) operators are doing, it seems to be working, at least for some. Judging by the chatter on “black hat” SEO forums—as well as on Google’s own support forums—efforts to game the system abound.

“I haven’t read [Google’s webmaster guidelines] in a while, I have to confess,” said Duck. He suggested that what works best in SEO is not what Google says is “supposed” to work best, so these guidelines are probably not very useful against people intending to game SEO. “If you’re gonna hunt deer, it’d be like asking the deer, ‘Where are you gonna be?’” he said.

From Google’s Webmaster support forums. “PBN” stands for “Private blog network,” a group of sites from which SEOs can link to their target sites.

He said a link on a site like The Morningside Post’s could rent for as much as $50-$60 a month. When the domain sold at auction, after it was “dropped” from the registry, Duck estimated it could have gone for as much as $1,000.

But to focus on an individual site is to miss the forest for the trees, he suggested.

“My business partners and I owned more than 3,000 sites like this,” Duck said. “50 people worked on the network on a daily basis,” including interns as well as contract workers in the Philippines.

Search results on Fiverr, the service marketplace, for “Domain authority 40+”

“Any domain that has SEO value is going to get grabbed up automatically,” he added. Creating and managing sites, producing posts, avoiding detection—every step can be automated.

“Even if you think about all the websites that ever existed,” said Duck, “at this point, it’s hard to find decent ones.” That’s because so much of the web has already been colonized.

Thedoubledouble.com, before and after.

But how vast an expanse does the SEO spam web cover? Duck estimated as many as 80 percent of all websites exist solely for SEO purposes. The secondary domain data broker, Duke, roughly concurred. Others, like Pete Meyers, marketing scientist at Moz.com, disagreed. “That sounds crazy to me,” said Search Engine Land’s Schwartz. Using some back-of-the-envelope math, Meyers proposed that out of 1.2 billion websites on the internet, about 19.5 percent were dedicated to SEO spam.

“I asked Google just for fun,” said Meyers. “They said if they knew, they wouldn’t tell me. But, they don’t know.”
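For a sense of scale, the two competing estimates can be converted into absolute site counts. The sketch below uses only the figures quoted above (1.2 billion sites, 19.5 percent, 80 percent); the function name is a made-up helper, and integer arithmetic is used to avoid rounding noise.

```python
TOTAL_SITES = 1_200_000_000  # total websites, per Meyers's estimate


def spam_site_count(total_sites, share_per_thousand):
    """Return the number of sites implied by a share given in tenths of a percent."""
    return total_sites * share_per_thousand // 1000


meyers_estimate = spam_site_count(TOTAL_SITES, 195)  # 19.5 percent
duck_estimate = spam_site_count(TOTAL_SITES, 800)    # 80 percent

print(f"Meyers: {meyers_estimate:,} sites")
print(f"Duck:   {duck_estimate:,} sites")
print(f"Gap:    {duck_estimate - meyers_estimate:,} sites")
```

Even the lower estimate implies hundreds of millions of sites, and the two figures differ by more than 700 million.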

A site formerly run by the National Guild of Hypnotists, now linking to locksmiths in every zip code in the U.S.

Several of the experts told Gizmodo that outside of backlinks, social media post traffic is the greatest force affecting search rankings. “They can buy bot traffic directly,” said Augustine Fou, a cybersecurity and ad fraud consultant and an adjunct professor at New York University. “If you need to promote an article, you can have the botnet tweet it 100,000 times.”

Others said, just as confidently, that social media traffic had little or no impact.

Defenses against these newer methods aren’t yet particularly robust, according to Fou. And there might be a reason for that: the potential conflict of interest inherent in being both a policer of fraud and a seller of ads.

A national education nonprofit, with its keyword- and link-bombed comment section.

“If you think about the financial motives,” said Fou, echoing sentiments expressed by several of the experts we spoke with, “I’m not accusing Google, but they don’t have a lot of incentive to hurry up. If they detect a ton of bot traffic, and they clean it up, that means they just cut their topline ad revenue in half.”

I did my part, anyway. I officially reported the Morningside Post’s site to Google via the company’s link spam reporting tool. No action yet.


