Note: Wow. This got long, and somewhat technical. For today, some of you might want to look at cute pictures of cats instead. I won’t mind.
I noticed the other day a huge rush of spam comments from IP addresses starting with 108.62. I did a lookup and found that the whole block is owned by an outfit called Nobis Technology Group. Most of the addresses also mentioned Ubiquity Server Solutions, a massive hosting and colocation service. Basically, they supply the hardware and infrastructure, and their customers set up Web servers and whatnot.
Some of those customers (or the customers of the customers) send out a lot of spam. A truckload. In some cases the customer of a customer of a customer might have been lax and his server got hacked and turned into an unwitting spambot. In other cases the people using Ubiquity’s servers are likely institutional spammers.
Brief aside: Why does comment spam even exist in the first place? Google plays a big role there, with a number called PageRank. Part of PageRank (at least historically) was that more links pointing to a page made it land higher in Google searches. So the spam comment isn’t there to get readers of a blog to buy Doc Martens shoes; it’s to make that particular site land higher in Google’s results when someone searches for them.
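For the curious, the core idea is simple enough to sketch in a few lines. This is a toy version of the classic PageRank iteration on a made-up five-page graph, nothing resembling Google’s actual ranking, but it shows why spammers want inbound links:

```python
# Toy PageRank: more links pointing at a page -> higher rank.
# The graph and page names are made up for illustration.

def pagerank(links, damping=0.85, iters=50):
    """links: dict mapping each page to the pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        # Every page gets a small baseline, plus a share of the rank
        # of each page that links to it.
        new = {p: (1 - damping) / n for p in pages}
        for src, outs in links.items():
            if not outs:
                continue
            share = damping * rank[src] / len(outs)
            for dst in outs:
                new[dst] += share
        rank = new
    return rank

# "shoes" gets spam links from two blogs; "honest" gets one real link.
graph = {
    "blog1": ["shoes"],
    "blog2": ["shoes"],
    "blog3": ["honest"],
    "shoes": [],
    "honest": [],
}
ranks = pagerank(graph)
print(ranks["shoes"] > ranks["honest"])  # True: more inbound links win
```

Two spam links beat one honest link, which is the entire business model of the comment spammer.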
The thing is, Google doesn’t publish PageRank numbers anymore, and they steadfastly maintain that comment spamming actually hurts your results in a search. That hasn’t stopped many companies from promising higher sales and taking people’s money in return for smearing their name all over the Internet.
Google could go a long way toward eliminating this sort of spam by publishing PageRank again, but now including the amount the rank was hurt by spamming activity. My shoe salesman above is not going to keep paying when Google shows the opposite of the desired result.
So anyway, using CloudFlare’s threat control, I blocked an entire range of IP addresses allocated to Ubiquity’s servers. Then another. I didn’t like this solution; I had no idea how many legitimate potential blog visitors I was blocking. After reading more, the answer surprised me.
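For what it’s worth, the range check CloudFlare does is easy to reproduce with Python’s standard library, which is handy for testing a blocklist before deploying it. The /16 below is my assumption about the block’s size, not something I confirmed against the actual allocation:

```python
# Check whether an address falls inside a blocked range, using the
# stdlib ipaddress module. The CIDR is an assumed size for the
# 108.62.x.x block mentioned in this post.
import ipaddress

BLOCKED = [ipaddress.ip_network("108.62.0.0/16")]

def is_blocked(addr):
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in BLOCKED)

print(is_blocked("108.62.44.7"))    # True: inside the blocked range
print(is_blocked("93.184.216.34"))  # False: unrelated address
```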
The folks at Ubiquity point out that they have terms of service that prohibit using their infrastructure to spam people. When I sent them a complaint, they were professional and courteous. They asked for more specifics, then said they’d sent a complaint to the culprit. Only after they’d asked what my domain name was.
Question: Did they send a message to the culprit saying ‘stop spamming people,’ or did it say ‘stop spamming that guy’?
On other blogs where people have ranted about Ubiquity, representatives of the company have responded with measured, rational responses, explaining what a huge uphill battle it is for them, and asking the community to keep sending reports when spam comes from their range. Those reports make it possible for them to put sanctions on clients who are in violation of their terms of service. It is a huge problem and not easily solved.
And yet. Other hosting companies don’t seem as bad, from where I’m sitting.
One of those responses from a Ubiquity representative threw out the argument (I’m paraphrasing) “While it’s theoretically possible to monitor all data to weed out the 500MB/s of spam from the 2GB/s of legitimate traffic, that would be really expensive and we wouldn’t be able to compete in this market.” My first takeaway: they think 20% of the traffic from their servers is unethical. Wow. Now, that’s reading a lot into a statement like that, so take it with a grain of salt. Also, it was in a comment to a blog post and may well have been a typo in the first place.
But still, it makes me wonder. And a request coming in to a server for data (legitimate traffic like a request to load a Web page) is fundamentally different from robots on a server sending unrequested data OUT (a high percentage of which will be spam), and sending emails (almost all of which will be spam). A small random sampling of outbound HTTP requests and email connections from their data centers would probably smoke out the most egregious violators pretty quickly, and not require a lot of hardware to implement. (Not sure how I feel about this from a privacy standpoint.)
Once I got the message that Ubiquity had sent their complaint to the spammer involved, I unblocked that range. Sure enough, in a few minutes more spam came through. I sent the report and back up went the blockade. In my casting around the Internet I read assertions that were not contradicted (so must be true!) that NO legitimate traffic would come from those IPs anyway; they were the addresses of big servers, not IPs that would appear when Joe User is surfing. So there’s no downside to blocking them. (I’ll put the blocked ranges in a comment below, if you want to follow suit.)
Although, as I put the blockade back up, I had a thought: If I complain about every violation, and cc Google, then the cost of NOT clamping down more effectively on the host’s clients goes up. At some point, if enough people complain enough times, the cost of fixing the problem at the source becomes less than the cost of continuing to do business the way they are now.
That goes not just for Ubiquity, but for all hosts, and for Google and the other search engines. There is no incentive for them to play nice unless we create one.
Yep, I’m proposing fighting spam with a deluge of emails, and I’m probably too lazy to do it effectively.
Of course, this blog is hosted at a data center that almost inevitably will have spammers. Do I want to pay more for my own hosting because my data center has to install a bunch of spam detectors? In my case, I’d be willing to pay a bit more to know my host is doing the right thing, but I think I’d be in the minority. That makes it really difficult for one host to unilaterally decide to take the high road. And you’d be alienating about 20% of your customers, if Ubiquity’s off-the-cuff numbers are an indication.