Ma.gnolia Blog: A Dirty Shame

This week Andy Baio uncovered how one unscrupulous link spammer was filling his account with Timesonline.co.uk bookmarks, shining a little spotlight on the world of link spamming, SEO, and SMO.

Today, he followed up with a post gathering reactions from those of us running some of the sites on which this user had created accounts. There’s some great stuff in there, though I feel that I came across somewhat less eloquent than my colleagues. Guess that’s what happens when you’re interviewed via IM.

One of the things I’d mentioned was that spamming “... costs us real money in time and resources.” To understand what I mean by this, I thought it would help to illustrate some of the different kinds of promotion we encounter as well as some statistics on service usage.

An Illustrated Guide

[Flickr set thumbnails: Had Enough Yet; You Can't Fool Me; Tagmania!; Trying to Appear Legit; One Bookmark, Many Accounts]

To illustrate what we see as link spam, I created a set on Flickr covering some of the major categories we see. This is not cute, “here’s my blog” type self-promotion. I’m sure for many of us, the first social bookmark we created was of our blog. In the world of real spam, there are infinite variations, but most of it falls into these major categories:

One Site, Many Accounts: This spammer has one site or one page they are trying to promote, so they just create a bunch of accounts and link to the same place in all of them.

Too Legit to Quit: This spammer will add one bookmark to a legitimate source of information to try and cover the Squidoo lens or illegal pharmacy bookmarks they’ve created.

Tagsploitation!: This spammer read somewhere that more tags mean their site will be seen more, so they bookmark one site with as many tags as they can think of. I’ve seen bookmarks with at least a hundred tags.

Joe Seo: This is someone who bought a make-money-at-home book which told them to develop several niche sites which run AdWords and then drive traffic to them through social bookmarking. They’re getting rich in their sleep.

You Can’t Fool Me: This person has a cool avatar and some profile information filled out, but a remarkable interest in dating sites, forex strategies, vacation homes, or mortgage refinancing.

Had Enough Yet?: Just as many bookmarks as they can possibly load up. Probably made use of our import facility.

The Numbers

This comes back to my quote about spammers having an actual cost to us. Though link spammers can take many forms, how much of Ma.gnolia do these users really account for? Quite a bit.

The thumbnail numbers of a sample of the past 3 months show that almost 80% of the actually-used new accounts created on Ma.gnolia were created by spammers. That means for every one real member that joined, about 4 spammers or their bots created an account.
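
For anyone who wants to check that ratio, here is a minimal sketch of the arithmetic. The function name is made up for illustration, and it assumes the spam share is given as a whole-number percentage of new accounts. If a share p of new accounts is spam, there are p / (1 - p) spam accounts for every real one.

```python
from fractions import Fraction


def spammers_per_real_member(spam_percent: int) -> Fraction:
    """Spam accounts created for every one legitimate account.

    spam_percent is the share of new accounts that are spam, as a whole
    percentage (hypothetical helper, not part of any Ma.gnolia tooling).
    """
    spam_share = Fraction(spam_percent, 100)
    # Real accounts make up the remaining share, so the ratio is p / (1 - p).
    # A share of 100 would divide by zero: there would be no real members.
    return spam_share / (1 - spam_share)


print(spammers_per_real_member(80))  # 4
print(spammers_per_real_member(75))  # 3
```

So an 80% spam share is four spam accounts per real member, while a 75% share would be three. Exact fractions are used here only to keep floating-point rounding out of the result.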

Similarly, in the same 3 month period, more than twice as many spam bookmarks were created as legitimate bookmarks.

All of these accounts and spam bookmarks tax our database, server infrastructure, storage, and bandwidth, and contribute to equipment and maintenance costs. That represents time and money that could be spent on our members who sincerely want to share with one another and build communities.

But, Why?

Why is link spam so prevalent? There are two reasons: it’s easy to create and spammers believe it will bring them money.

Making It Easy To Get In

Some link spammers spam the old-fashioned way: they actually create accounts and add bookmarks. Some of these people may speed this process by importing a file full of bookmarks. But the real drivers of link spam are APIs and SMO.

The brave new world of APIs has made it easier than ever to add as many bookmarks as you want, as fast as you want. Some enterprising souls, even more unscrupulous than spammers, have built tools which automate the process of creating accounts and filling them with bookmarks via the de facto standard del.icio.us-compatible API.

With one of these multi-posting, multi-submit tools, one person can generate accounts and bookmarks without ever having to even view Ma.gnolia or any of their other intended targets. Of course, these tools only sell for a low, low price of $99.95.

Finally, and actually depressing for me to think about, is outsourced SMO (Social Media Optimization). Yep, believe it or not, you can pay people in foreign countries pennies to create accounts and bookmark your site, bringing the cost and traceability of your link spamming down to almost nothing. Perhaps even less than buying one of the above-mentioned multi-submit tools.

Show Me The Money. Not!

Even as easy as it is to spam social bookmarking sites, it wouldn’t be worth it if it only meant that people would see those bookmarks on in-site searches.

What’s really driving link spam is the (usually) mistaken belief that these bookmarks create valuable back links which will raise their sites’ Google PageRank, pushing them up in search results. The real prize is visibility on Google (Google == Money), not in the social bookmarking world.

But the PageRank boost is just a cargo-cultish myth from a time before rel-nofollow. Most responsible web sites now stick a piece of code in their links, “rel=’nofollow’”, which lets search engines know that an outbound link isn’t to be trusted and shouldn’t be given “credit”.
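
To make that concrete, here is a minimal sketch of how a site might render a member-submitted link with the attribute in place. This is not Ma.gnolia’s actual code; the function name and the example URL are made up for illustration.

```python
from html import escape


def render_bookmark_link(url: str, title: str) -> str:
    """Render a user-submitted bookmark as an anchor marked rel="nofollow".

    Hypothetical helper for illustration only. The URL and title are
    HTML-escaped so that user input cannot break out of the markup.
    """
    return '<a href="{}" rel="nofollow">{}</a>'.format(
        escape(url, quote=True),  # escape &, <, > and quotes in the attribute
        escape(title),
    )


print(render_bookmark_link("http://example.com/?a=1&b=2", "Example"))
# <a href="http://example.com/?a=1&amp;b=2" rel="nofollow">Example</a>
```

A search engine that honors the attribute sees the link but does not count it as an endorsement of the target page.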

Even more ironic, or not, is that many spam bookmarks point to spam blogs which are trying to make money from Google AdWords on traffic peeled off from Google search, which is where they’re trying to boost their rankings. At least someone is making money here.

And In The End…

What does it all mean? I’m not sure. I’ve been meaning to write this post on link spam for over a year, and Andy finally motivated me to sit down and make it happen.

Perhaps I’ve been putting it off because I don’t have any grand answers or conclusions. No solution to save Ma.gnolia or the web from link spam.

What we do have is at least a better idea of what link spam is, where it comes from, and the kind of impact it has on services like Ma.gnolia. So…

Anyone? Ferris? Anyone?

Posted by Larry on January 31, 2008

Member Comments

John on February 1, 2008

Good article. One tiny correction:

If 75% of all new accounts are spammers (wow!), then for every one real new user there would be *three* new spammer or spambot accounts, not four.

If it's indeed four spammers to every one authentic joiner, then 80% of all new accounts are spammers.

Larry on February 1, 2008

Hey John,

Thanks for pointing out the math mismatch. It was a case of rounding in different directions, so I've updated the post with more accurate numbers.

otis on February 2, 2008

(this is Otis from Simpy - another social bookmarking service)

Larry, good post.

I'm in the same situation, though the numbers you cite sound a lot worse than what I *think* I see on Simpy. Knock on wood. I, too, have been meaning to talk about this more publicly.

My impression is that a good percentage of these people don't have a clue what they are doing, so one factor is education. For example, I have been meaning to modify the Simpy signup page to make it very explicit - "Looking for traffic? Click here" - and take people to the newly added FAQ entry about this: http://www.simpy.com/faq#useGuide

Even though we are in the same space, I am wondering if there is room for cooperation. I don't have (m)any concrete thoughts and wouldn't want to expose them here even if I did, but if you think we can suppress spam at least somewhat by cooperating, let me know.

Larry on February 2, 2008

Hi Otis,

Thanks for commenting. I agree completely that education is an important way to correct this sort of behavior, and that's an area where we could substantially improve. Your FAQ is a great example of this.

I'll keep you posted with our efforts and we'll see if we can maybe work together on this.

Frederik on February 4, 2008

Just thinking out loud here,

but is there no way to stop the API spamming ?
I haven't used the ma.gnolia API yet, but plan on using it.
is there no way to put a limit on the requests per API key, & if someone wants a higher amount of requests / day then they need to upgrade their API key (after you guys check it manually).
or maybe an additional requests limit on ip ?

more ideas :
- a report as spam button on pages with highly targeted keywords by spammers.
- some spam messages look like they would never get through a spam filter, try to include one ... automate the spam removal.
- are there some databases with spam-website urls ? block those.

Moni on March 20, 2008

Good to hear the truth about these things. Thanks so much for this post! It's incredible to think the situation is statistically racked up like that -- at least everyone knows it's not their imagination!

Ken on March 20, 2008

Hmm...interesting post, Larry; thanks! I agree with the larger picture, but I'm wondering how you draw the line. In many cases, it's obvious, but I'm not 100% sure from reading your post how "aggressive" you want to be in preserving Magnolia's resources.

Case in point: after being inspired by Jon Udell in this excellent post, The blurred line between personal information management and publishing, I began some time ago to "self-tag" all my own posts; first at delicious, now both here and there (for the time being, I'm still using both, though I'm leaning towards magnolia). I use the tags "kenkennedy" and "kenzoid" for my own posts (plus any other relevant categories).

I adore this concept; as Jon points out, tagging "in the cloud" provides me with improved results both in quality and breadth; I can do things (sorting, querying, API, etc.) here and at delicious that many blog tagging addons can't manage; and just as importantly, I can aggregate across many sites, blogs, comments, etc. It's great!

Now, I do this both for myself (for purposes of improved organization), and admittedly, because sites like delicious and magnolia give me a wider reach for my tagging. But I don't consider this spamming...honestly. I'm trying to add metadata to the cloud. It happens to be about me, but I tag FAR more things about other topics; I'm one category among many. Is that the difference?

How do you guys feel about this? Am I abusing the system? Please let me know; I really think the service is great, but if we're at cross-purposes here, I want to know early on. I don't want to be considered a "tagsploiter"! *grin*

Thanks so much for everything!

Larry on March 21, 2008

Hey Ken, thanks for commenting. You're definitely not a spammer, by any stretch.

Self-tagging is a great idea, and I do it myself here and in other places, like Flickr. Tagsploiters are looking to gather search hits or raise their results for keywords by purposely mistagging or overtagging for common search terms, not to improve the quality of the metadata.

Ken on March 22, 2008

Thanks for the response, Larry. It makes perfect sense. I knew in my gut that there was a difference, but I couldn't articulate it at the time, so I figured I'd just ask the experts. *grin* The clarification regarding mistagging and overtagging brings it home; active, purposeful "poisoning" of the metadata well, as it were.

Thanks again. And thanks for what you're doing here...I'm really, really impressed with ma.gnolia. From the powerful APIs to the community tools, the forward-looking OpenID support, and even the performance. I'm totally impressed and appreciative.

Larry on March 24, 2008

Thanks for the kind words and encouragement, Ken!
