Note: Wow. This got long, and somewhat technical. For today, some of you might want to look at cute pictures of cats instead. I won’t mind.
I noticed the other day a huge rush of spam comments from IP addresses starting with 108.62. I did a lookup and found that the whole block is owned by an outfit called Nobis Technology Group. Most of the addresses also mentioned Ubiquity Server Solutions, a massive hosting and colocation service. Basically, they supply the hardware and infrastructure, and their customers set up Web servers and whatnot.
Some of those customers (or the customers of the customers) send out a lot of spam. A truckload. In some cases the customer of a customer of a customer might have been lax and his server got hacked and turned into an unwitting spambot. In other cases the people using Ubiquity’s servers are likely institutional spammers.
Brief aside: Why does comment spam even exist in the first place? Google plays a big role there, with a number called PageRank. Part of PageRank (at least historically) was that more links pointing to a page made it land higher in Google searches. So the spam comment isn’t there to get readers of a blog to buy Doc Martens shoes; it’s to make that particular site land higher in Google’s results when someone searches for them.
The thing is, Google doesn’t publish PageRank numbers anymore, and they steadfastly maintain that comment spamming actually hurts your results in a search. That hasn’t stopped many companies from promising higher sales and taking people’s money in return for smearing their name all over the Internet.
Google could go a long way toward eliminating this sort of spam by publishing PageRank again, only now including the amount the rank was hurt by spamming activity. My shoe salesman above is not going to keep paying when Google shows the opposite of the desired result.
So anyway, using CloudFlare’s threat control, I blocked an entire range of IP addresses allocated to Ubiquity’s servers. Then another. I didn’t like this solution; I had no idea how many legitimate potential blog visitors I was blocking. After reading more, the answer surprised me.
The folks at Ubiquity point out that they have terms of service that prohibit using their infrastructure to spam people. When I sent them a complaint, they were professional and courteous. They asked for more specifics, then said they’d sent a complaint to the culprit, though only after they’d asked what my domain name was.
Question: Did they send a message to the culprit saying ‘stop spamming people’ or did it say ‘stop spamming that guy?’
On other blogs where people have ranted about Ubiquity, representatives of the company have responded with measured, rational responses, explaining what a huge uphill battle it is for them, and asking the community to keep sending reports when spam comes from their range. Those reports make it possible for them to put sanctions on clients who are in violation of their terms of service. It is a huge problem and not easily solved.
And yet. Other hosting companies don’t seem as bad, from where I’m sitting.
One of those responses from a Ubiquity representative threw out the argument (I’m paraphrasing) “While it’s theoretically possible to monitor all data to weed out the 500MB/s of spam from the 2GB/s of legitimate traffic, that would be really expensive and we wouldn’t be able to compete in this market.” My first takeaway: they think 20% of the traffic from their servers is unethical. Wow. Now, that’s reading a lot into a statement like that, so take it with a grain of salt. Also, it was in a comment to a blog post and may well have been a typo in the first place.
But still, it makes me wonder. And a request coming in to a server for data (legitimate traffic, like a request to load a Web page) is fundamentally different from robots on a server sending unrequested data OUT (a high percentage of which will be spam), and sending emails (almost all of which will be spam). A small random sampling of outbound requests (largely POSTs, in the comment-spam case) from their data centers would probably smoke out the most egregious violators pretty quickly, and wouldn’t require a lot of hardware to implement. (Not sure how I feel about this from a privacy standpoint.)
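For what it’s worth, that sampling idea fits in a few lines. Everything here is hypothetical: the record format, the thresholds, and the function name are all invented for illustration, assuming the host can tap a log of outbound requests per client.

```python
import random

def flag_spammy_clients(records, sample_rate=0.01, post_threshold=0.8,
                        min_sampled=20, seed=0):
    """Randomly sample outbound (client_id, method) records and flag
    clients whose sampled traffic is overwhelmingly POSTs.

    Hypothetical sketch: the thresholds are made-up knobs, not measured
    values. A real system would also need to look at destinations, rates,
    and SMTP traffic.
    """
    rng = random.Random(seed)
    counts = {}  # client_id -> (posts, total) among sampled records
    for client, method in records:
        if rng.random() >= sample_rate:  # skip ~99% of records by default
            continue
        posts, total = counts.get(client, (0, 0))
        counts[client] = (posts + (method == "POST"), total + 1)
    return {
        client
        for client, (posts, total) in counts.items()
        if total >= min_sampled and posts / total >= post_threshold
    }
```

The point of the sketch is just that sampling keeps the inspection cost tiny: at a 1% rate, only one record in a hundred is ever examined, yet a client sending thousands of POSTs an hour still stands out.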
Once I got the message that Ubiquity had sent their complaint to the spammer involved, I unblocked that range. Sure enough, in a few minutes more spam came through. I sent the report and back up went the blockade. In my casting around the Internet I read assertions, never contradicted (so they must be true!), that NO legitimate traffic would come from those IPs anyway; they are the addresses of big servers, not IPs that would appear when Joe User is surfing. So there’s no downside to blocking them. (I’ll put the blocked ranges in a comment below, if you want to follow suit.)
Although, as I put the blockade back up, I had a thought: If I complain about every violation, and cc Google, then the cost of NOT clamping down more effectively on the host’s clients goes up. At some point, if enough people complain enough times, the cost of fixing the problem at the source becomes less than the cost of continuing to do business the way they are now.
That goes not just for Ubiquity, but for all hosts, and for Google and the other search engines. There is no incentive for them to play nice unless we create one.
Yep, I’m proposing fighting spam with a deluge of emails, and I’m probably too lazy to do it effectively.
Of course, this blog is hosted at a data center that almost inevitably will have spammers. Do I want to pay more for my own hosting because my data center has to install a bunch of spam detectors? In my case, I’d be willing to pay a bit more to know my host is doing the right thing, but I think I’d be in the minority. That makes it really difficult for one host to unilaterally decide to take the high road. And you’d be alienating about 20% of your customers, if Ubiquity’s off-the-cuff numbers are an indication.
Hi Jerry, what about a CAPTCHA implementation? I set one up a long time ago
on my WordPress blog and the spam stopped…
Almost all the spam is getting intercepted before it reaches public eyes, and I’d rather not put any barriers up that affect legitimate commenters. Even spam that gets blocked adds to server load and bandwidth usage, however, and eventually that costs everyone.
As promised, the Ubiquity ranges I blocked:
108.62.0.0/16
173.208.0.0/16
23.19.0.0/16
64.120.0.0 – 64.120.127.255 [tricky, as CloudFlare doesn’t accept /17; might have to use Apache to block those]
Since I use CloudFlare to block them, the requests never even reach my servers. Others would probably be best served by blocking with .htaccess.
If I block other Ubiquity IPs, I’ll update the list.
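For anyone who wants to script this, here’s a minimal sketch using Python’s stdlib ipaddress module. The ranges are the ones listed above; the helper name is mine, and the /24 split at the end is just one illustration of a workaround when a tool won’t accept a /17 directly.

```python
import ipaddress

# Ranges from the list above.
BLOCKED = [ipaddress.ip_network(c) for c in
           ("108.62.0.0/16", "173.208.0.0/16", "23.19.0.0/16", "64.120.0.0/17")]

def is_blocked(addr):
    """True if addr falls inside any of the blocked ranges."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in BLOCKED)

# If a tool won't take a /17 directly, the same range can be handed over
# as smaller pieces, e.g. 128 /24s:
pieces = list(ipaddress.ip_network("64.120.0.0/17").subnets(new_prefix=24))
```

A check like `is_blocked()` could sit in a comment-moderation hook for setups that can’t block at the edge the way CloudFlare does.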
Now if you are talking about massive amounts of complaints to drive up the cost of USS doing business, could you be considered to be advocating a DDoS-like attack?
Wow. Now that I’m checking regularly, it’s amazing how many comment spams come from Nobis (the parent of Ubiquity). Another block added to the list above.
Comment spam hasn’t been about Page Rank since everyone started using the nofollow attribute. These days, it’s just about luring in the blog’s readers. Related link, and test of whether this blog uses nofollow properly: http://en.wikipedia.org/wiki/Nofollow
I have a user style rule I use to easily tell when nofollow is being used (you might want to change the way it displays though, the #-symbol is kind of ugly): a[rel~="nofollow"]:before { content: "# "; }
Publishing any data at all about PageRank would give attackers a better opportunity to game it, even if that data was designed to show spammers being punished.
Yeah, you may be right about Google publishing the numbers. However, it’s not whether PageRank is helped by comment spamming, it’s whether marketers can convince people that what they do will improve search engine results. It doesn’t have to be true, it just has to sound good.
And some of the spams at least do have links that could entice readers to click, so obviously it’s not all about PageRank; there’s the direct marketing value as well, as you say.
There may not be much Google can do about it in the long run, but I think they can try a little harder to at least get the word out that comment spamming hurts PageRank (and to make that true).
This blog sits on WordPress, which automatically adds nofollow to all links submitted by people who aren’t me. I hadn’t actually double-checked that in a while, so thanks for reminding me about that.
Now that Ubiquity is pretty much shut off from this blog (and the spammers seem to have stopped trying, so maybe some good was done), Singlehop is the new subject of my ire. I wrote this to them today:
Here is an updated list of NOBIS/UBIQUITY netblocks:
69.147.224.0/19
64.120.0.0/17
23.80.0.0/16
23.19.0.0/16
23.104.0.0/13
173.234.0.0/16
173.208.0.0/17
142.91.0.0/16
108.62.0.0/16
From what I understand, a lot of spam comes from so-called botnets, swarms of computers that each belong to normal, non-spammy people who have had their computers infected with a Sub7-like remote-control trojan.
This way, the spam is, by definition, coming from normal users.
Side note, I concur with the above poster that there is a certain degree of irony in calling for a concerted effort for citizens to email spam the heck out of companies whose IP ranges are perceived as generating too many spam blog comments. A “Spam-cott,” one might call it. Or rather, “It’s time to start fighting spam with spam.” Or perhaps, “Ok everybody knows you can’t out-spam a spammer. But maybe… just maybe… could we spam their internet service providers into submission?”
While my blockade of Ubiquity Solutions won’t stop the spam from unwitting regular folk whose refrigerators have been turned into spambots (yay Samsung!), that action empirically reduced the amount of crap getting scooped up by my crap-catchers.
Once we recognize that the ironic solution is also a complete pipe-dream, we have to realize that we little guys have no mechanism to make unethical behavior too expensive to be worth it, so we just keep building walls.