Voting Rings in Social Networks
Posted by Don in social network, evil, Digg, automation
A lot of "experts" in social media have made the assertion that if you participate in a voting ring on any of the major social sites, you'll get caught. There is a mystique around the all-knowing data centers that can track your activities on their site. If you cheat, you'll get burned. It's that simple.
I'm not going to pass judgement on whether or not you should participate in voting rings. It's probably not good for your karma. But I am going to say that there is a lot of voting ring activity out there, and if you don't understand the issues around it you're going to be at a disadvantage.
The Rings Exist
It doesn't take much Google searching to turn up a number of vote exchange networks. They're pretty blatant about it. Here are some of the major ones that I found with a cursory search:
- Piqqus - Formerly known as DiggBoss, members exchange social votes on Digg, StumbleUpon and Propeller.
- SubmitterBot - Exchange Digg and StumbleUpon votes, part of a larger service
- 1rst Link - Exchange links on a huge list of social networks, as well as a non-reciprocal link exchange.
- Spike The Vote - Formerly a Digg exchange site, it was sold on eBay for $1,000 to a Digger who shut it down.
- Stumblebot - Appears to be software that will generate stumbles. Not sure if this is a network or just a bot.
- Social Traffic Exchange - A forum for exchanging votes on several social media sites
There are lots more, but you get the idea. You'll notice I nofollowed those links because I don't want to encourage them. But the exchanges aren't limited to just forums and applications. Do a search for "social media" in Google and Yahoo Groups and you'll find several mailing lists all targeted at the same kind of activity.
This is Cheating!
Perhaps, but there is a ton of it going on. Is participating in one of these rings any different than the "A Listers" who send out 25 IMs in the morning to get the 15 votes on Sphinn required to get their articles to the Up and Coming section? Why is it always the same people getting to the front page on these sites? If you don't think there's some offsite networking going on in the social media world, you need to pull your head out of the sand.
Is it cheating when the system has become so corrupt that the way the "Big Names" get their stuff to the top is to solicit votes from their friends? It's a self-perpetuating cycle, because the people with offsite friend networks are the ones that get to the front page, and front page exposure on these networks is what leads even more people to follow you.
If you think it's possible for "great content" to simply rise to the top, then try this experiment. Go find the greatest Digg bait in the world, something that just can't miss. Submit it with an account that has no history and no friends. The only thing that's going to happen is that some other, more popular Digger is going to find your content and submit it again (ignoring your duplicate), and then it will get popular. This is a popular complaint about MrBabyMan -- people claim that he finds the gems with only a few votes and resubmits them in a different category.
The simple fact is that even the greatest content in the world requires promotion in order to get seen.
Analyze the Top Diggers
Here's an example of the activities of a top 100 Digger. You'd recognize the name, but I'm not going to out them. If you look at their history, they've been digging about 85 stories a day for the last two years. 39% of their submissions go popular.
How much work is 85 Diggs a day? If you put in 6 hours a day on the site, that's a Digg every 4.2 minutes. No breaks. No vacations. If you're taking the time to read the stories, you're digging everything you read. This person also submits about 5 stories a day. And they blog a lot. And they participate in a lot of other social networks and are at the top of those too. And they've got a full time job. They either work 20 hours a day, or they've got some special help.
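The pace is easy to check. Here's a quick back-of-the-envelope sketch in Python, using only the figures quoted above (85 Diggs a day, 6 hours a day, two years); the function name is just for illustration. It comes out to a little over four minutes per Digg.

```python
# Back-of-the-envelope check of the voting pace described in the post.
DIGGS_PER_DAY = 85   # figure quoted in the post
HOURS_ON_SITE = 6    # daily time on the site assumed in the post


def minutes_per_digg(diggs_per_day: int, hours_per_day: float) -> float:
    """Average gap between Diggs, in minutes."""
    return hours_per_day * 60 / diggs_per_day


gap = minutes_per_digg(DIGGS_PER_DAY, HOURS_ON_SITE)
total_over_two_years = DIGGS_PER_DAY * 365 * 2  # no breaks, no vacations

print(f"One Digg every {gap:.1f} minutes")
print(f"Roughly {total_over_two_years:,} Diggs over two years")
```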
I suppose it's possible that they're just super-human and can Digg like that, but it's much more likely that they've got a bot or a Greasemonkey script that handles a lot of the load. Or there's an entire agency behind that persona doing all that work. Just vote the first five pages each day or the submissions of other popular Diggers, with a 4 minute delay. When you submit something, send an email blast to 25 buddies to get those first votes. Enough people follow this person that they can get most anything to the front page. They also do a good job of submitting Diggable material, but one wonders how the heck they're making money at it. Negative stories about McCain and Bush will always do well with the right care and feeding, but it's tough to monetize them.
Why Can't Social Sites Catch These People?
It's mathematics, pure and simple. The problem is simply not computable in any reasonable amount of time.
Let's take our Top Digger in the above example and see if we could catch them by looking at the voting behaviors on their stories. The trick is that they send out 25 vote requests, but the pool of people they can request from is much larger, say 250. So for any given story, there's a 10% chance that a person out of the group will vote for it. And the average story gets a few hundred votes because they've become popular, so we're looking for 10% out of that.
This is a well-understood problem in computer science. What we're trying to find here are the functional dependencies in the data. We're saying that a submission by A leads to votes by B and C. If the variance is 0% -- in other words, every time A submits, B and C vote -- then it's pretty easy to spot. You can take a small sample of data, iterate through A, B, and C's behavior a single time, and you'll find that there is an exact correlation. We can see that A functionally determines B and C.
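Here's a minimal sketch of that zero-variance case in Python. The data layout (one set of voter names per story) and the function name `always_voters` are assumptions for illustration, not anything Digg is known to use. Anyone who shows up on every one of A's stories falls out of a single pass.

```python
from typing import List, Set


def always_voters(submissions: List[Set[str]]) -> Set[str]:
    """Return the voters who appear on every one of a submitter's stories."""
    if not submissions:
        return set()
    common = set(submissions[0])
    for voters in submissions[1:]:
        common &= voters  # keep only voters seen on every story so far
    return common


# Hypothetical data: the voters on three stories submitted by A.
stories_by_a = [
    {"B", "C", "x1", "x2"},
    {"B", "C", "x3"},
    {"B", "C", "x1", "x4"},
]

print(sorted(always_voters(stories_by_a)))  # ['B', 'C']
```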
But what if B & C only vote for A's stories 50% of the time? Now our nice and neat functional dependency algorithms won't work. We can't use a small random sample of data, we have to look at a much larger set in order to spot the trend. So instead of looking at 25 submissions to spot the trend, I'd have to look at all 2,500. And remember, out of the 1,000s of people that ever voted on a story submitted by A, I don't know who B & C are ahead of time. So I have to look at everyone that has ever voted on a submission by A. Now work the numbers if our voting pool only votes 10% of the time. If I look at our Top Digger's friends page I see that there are over 22,000 recent Diggs by people in their friend network. And that's a very small amount of Diggs compared to the total number of Diggs across all of their submissions. It's just not possible to spot the rings. If you had a billion dollars in venture capital and a giant supercomputer you still couldn't police it.
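To make the partial case concrete, here's a rough Python sketch of the kind of full pass described above: since nobody knows who B and C are ahead of time, every voter on every submission gets tallied. The data layout and the name `vote_rates` are assumptions for illustration, and the sample data is made up.

```python
from collections import Counter
from typing import Dict, Iterable, Set


def vote_rates(submissions: Iterable[Set[str]]) -> Dict[str, float]:
    """Fraction of a submitter's stories that each voter voted on."""
    counts: Counter = Counter()
    total = 0
    for voters in submissions:
        total += 1
        counts.update(voters)  # one tally per voter per story
    if total == 0:
        return {}
    return {voter: n / total for voter, n in counts.items()}


# Hypothetical data: four stories by A; B and C vote on half of them.
stories_by_a = [
    {"B", "C", "x1"},
    {"x2", "x3"},
    {"B", "C", "x4"},
    {"x1", "x5"},
]

rates = vote_rates(stories_by_a)
for voter, rate in sorted(rates.items(), key=lambda kv: (-kv[1], kv[0])):
    print(voter, rate)
```

In this toy sample the ordinary voter `x1` ends up with the same 50% rate as B and C, which is the trouble with small samples: the ring members don't stand out until you look at a lot more data.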
What Can Social Networks Do?
What is possible is to spot a ring if you have a hypothesis about who to look at ahead of time. For instance, let's say A is silly and submits something that is clearly spam. It gets 5 votes before it is marked as spam. Now checking the voting behavior of A vs 5 people is quite easy. And if you roll them up, what does it mean?
- You've taken out 5 people from a group of 250, which is pretty easy to rebuild.
- If you ban the Top Digger, you're opening yourself to the ultimate black hat attack. Want to take someone down? Just set up 5 fake accounts and have them digg the Top Digger's submissions 100% of the time. Then send a complaint to Digg that you've spotted a voting ring.
- Your process required manual intervention, which is quite expensive.
You can also send out employees to join these networks and participate, looking for people that are asking for their submissions to be voted up and banning them. They don't generally do this because someone can just ask for votes for someone else's submissions and have them wrongly accused of participating in the ring. If the sites are smart, they'll periodically insert stories from top users to be voted up. A black hatter could lay waste to hundreds of competitors by submitting their stories to various voting rings.
Likewise, they can track activity. If you vote on a story every 2 seconds you're leaving a clear footprint. Except that people do that all the time without consequences on Digg. Witness the Greasemonkey scripts used by the bury brigade that automatically bury stories that contain certain keywords or come from certain users. So if you're a top digger they're likely to check your history, and if you do something like digg a bunch of stories without a pause you'll get caught.
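The footprint check itself is trivial. Here's a sketch in Python that counts votes cast within a couple of seconds of the previous one; the 2-second threshold comes from the example above, while the function name and the timestamp format (seconds as floats) are assumptions for illustration.

```python
from typing import List


def rapid_fire_votes(timestamps: List[float], min_gap_seconds: float = 2.0) -> int:
    """Count votes cast no more than min_gap_seconds after the previous vote."""
    ordered = sorted(timestamps)
    return sum(
        1
        for earlier, later in zip(ordered, ordered[1:])
        if later - earlier <= min_gap_seconds
    )


# Hypothetical vote times in seconds: a burst of four, then two spaced-out votes.
votes = [0.0, 2.0, 4.0, 6.0, 300.0, 610.0]

print(rapid_fire_votes(votes))  # 3
```

Which is also why inserting pauses defeats it: spread the same votes a few minutes apart and the count drops to zero.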
There's no way they can catch everyone. They can't even catch a small percentage. What they can do is concentrate on policing their top users and clear spammers very carefully, and if they catch someone, make the ban very public, pour encourager les autres. And they can keep fostering the fantasy that anybody who cheats on a social network is going to get caught, aided and abetted by "A List" people who did exactly that on their way up.
But if users stay away from submitting stories that are clearly spam, insert pauses in their voting, and limit their ring activity to around 10%, there's no way they're going to get caught. At least not until we get a few orders of magnitude in compute power available.





I don't condone spam, but I also point to the fact that Oprah has a spam network under the definitions of Digg and others: Her book club.
A "group" (of millions) gets together and votes with their dollars on the next NYT bestseller. Was that truly the best book of the moment? Hell no. It was just the result of a "club" or group of people getting together to rally around a title, which generates ripples of success for the title that would otherwise have been impossible.
No one gets pissed about that and the hundreds of other real world examples of what people online call spam.