Promote My Site


Mar 02
2008

Bad SEO Tool Security Can Get You Pwned

Posted by admin in SEO tool, Security, evil, architecture

Badly Architected SEO Apps

I was reading this really cool article on Chlorine Trifluoride, which apparently can basically burn through just about anything, including sand, asbestos tile, glass, and probably even leftover high school cafeteria pizza.  I completely love this description:

It is, of course, extremely toxic, but that's the least of the problem. It is hypergolic [ignites on contact - ed with AP chemistry] with every known fuel, and so rapidly hypergolic that no ignition delay has ever been measured. It is also hypergolic with such things as cloth, wood, and test engineers, not to mention asbestos, sand, and water, with which it reacts explosively.

Speaking of Explosive

We have been spending a lot of time looking at SEO tools while deploying some of our own (Yahoo Store SEO Analyzer, Digg Friend Finder, Backlink Pinger) and while we've talked a lot about SEO Application Architecture we never did much writing about security. I guess we thought that with all the, er, black hat stuff that can go on around this industry, people would be careful about how their SEO applications were architected.

Uh, No

Without naming names, though you'd recognize them as very big players, we found dozens of security holes in their applications, including but not limited to:

  • Wide Open Ajax Services - Ajax is a wonderful thing. And the browser's same-origin policy keeps scripts on other sites from calling your service from inside a visitor's browser. But if the service on the back end is willing to accept a call from anything and doesn't verify that it's your own client that's actually calling, then someone else can write an application that does the same thing you do, but uses your server to do the work. For example, another server running PHP could use curl to load one of your pages and then make web service calls to your "public" service and you'd be hard pressed to tell. You'd think you were getting lots of traffic, but you'd just be providing the back end for someone else.
  • Javascript Based Security - It's hard to believe, but we've seen plenty of applications that take a login in javascript, make an Ajax call to authenticate, and then enable a button or show content using javascript. Nothing on the server checks the request, so anyone who skips the javascript skips the security. If some hacker couldn't figure out how to break into those systems I think they'd get kicked out of their club.
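Both holes come down to the server trusting whatever calls it. Here is a minimal sketch of one server-side check, assuming a hypothetical setup where the server signs a short-lived token into each page it serves and verifies that token on every Ajax call. The names `SECRET_KEY`, `issue_token`, and `verify_token` are ours for illustration, not anything the tools above actually use.

```python
import hashlib
import hmac
import time
from typing import Optional

# Hypothetical server-side secret; it lives in server config, never in page source.
SECRET_KEY = b"replace-with-a-long-random-secret"
TOKEN_LIFETIME_SECONDS = 300


def _sign(session_id: str, issued_at: int) -> str:
    """HMAC-SHA256 over the session id and issue time."""
    message = f"{session_id}:{issued_at}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def issue_token(session_id: str, now: Optional[float] = None) -> str:
    """Create the token the server embeds in a page it actually served."""
    issued_at = int(time.time() if now is None else now)
    return f"{issued_at}:{_sign(session_id, issued_at)}"


def verify_token(session_id: str, token: str, now: Optional[float] = None) -> bool:
    """Run on the server for every Ajax call, before doing any work."""
    try:
        issued_at_text, signature = token.split(":", 1)
        issued_at = int(issued_at_text)
    except ValueError:
        return False  # malformed token
    current = time.time() if now is None else now
    if current - issued_at > TOKEN_LIFETIME_SECONDS:
        return False  # token too old
    expected = _sign(session_id, issued_at)
    # Constant-time comparison on bytes, so odd input cannot raise or leak timing.
    return hmac.compare_digest(expected.encode(), signature.encode())
```

This does not stop the curl trick outright, since a script can still load the page and pick up a token. What it does give you is a session to tie each call to, which is what you need to rate-limit or cut off a caller, and the decision is made on the server rather than in javascript.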

About That Picture

That is a picture of a couple of pounds of Chlorine Trifluoride going off inside an asbestos berm test container. Or is it your website as some hacker takes control of your PR checker (for example) and hoses down Google with it until they block your IP or penalize your site?

Ouch.

Perils of Outsourcing

Of the dozen or so tools we found with major security flaws, the most common theme was not age of deployment, or country, or sophistication of the tool. It was that the development was outsourced by a very non-technical person. Not non-technical as in "doesn't understand SEO" or "can't figure out how to tickle google" but as in: not much exposure to complex software engineering.

One thing you should know: we have some extremely technical people on staff.  (Not me, I just fetch the coffee.)  I think we could probably safely outsource applications built to a safe and sophisticated  architectural specification, but it'd be tricky.

We described our SEO architectural technical stack earlier, but here it is again:

Promote My Site SEO Application Technical Architecture

Here is my rule of thumb: if you can't understand that picture, you can't export the work.  You need someone working for you who "gets" it.  I'm not bragging - we're not perfect and there are a lot of things (*cough* graphic design *cough*) that we don't do very well and have to get help with.

Conclusion

If you are going to outsource some development and you'd like to avoid a meltdown, well, you should probably get someone on staff or at least locally consulting with you to ensure that you have proper security. If you can't look at the code that your overseas outsourcing partner is giving you and make sense of it, then you probably shouldn't be trying to play that game.

Feb 20
2008

Digg Friend Finder Top Ten Learnings

Posted by admin in social network, SEO tool, Digg, architecture

Well, Digg Friend Finder has really made a lot of people happy and has taught us quite a bit about launching SEO/SEM Tools.

Top 10 Digg Friend Finder Learnings

Of course it had to go into a top 10 list.....

1: People Don't Read Directions

The ratio of people using the tool to reading the directions is around 15:1. Or vice versa, I couldn't get that right in 5th grade either.

2: Free Products Need Support

Not surprising, but at least the questions are easy to answer.

3: Half the Traffic is SEO

At least judging by the queries. I'm not sure if I expected more or less.

4: Threads Are Confusing

Digg Friend Finder uses our distributed SEO architecture to run the queries against Yahoo and Digg in an Ajax app running in your browser. And because of technical limitations and API functionality we can't really know how many digg friends we can find until a bunch of threaded processes finish running against the digg api. So after you hit "find" it can take anywhere from 10 seconds to two minutes for the friends to come back.

People find this very confusing.
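One way to make that wait less confusing is to report progress as each query finishes instead of going quiet until the end. The sketch below is a stand-in written in Python rather than the tool's actual Ajax code, and `lookup_friends` with its fake delay and made-up user names is purely hypothetical.

```python
import concurrent.futures
import random
import time


def lookup_friends(batch_id: int) -> list:
    """Stand-in for one slow API query; the delay and names are made up."""
    time.sleep(random.uniform(0.01, 0.05))
    return [f"user_{batch_id}_{n}" for n in range(batch_id % 3)]


def find_friends(batch_count: int, on_progress=print) -> list:
    """Run all queries in parallel and report after each one completes."""
    found = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(lookup_friends, b) for b in range(batch_count)]
        # as_completed yields each future as soon as it finishes, in any order.
        finished = concurrent.futures.as_completed(futures)
        for done, future in enumerate(finished, start=1):
            found.extend(future.result())
            on_progress(f"{done}/{batch_count} queries finished, {len(found)} friends so far")
    return found


if __name__ == "__main__":
    find_friends(6)
```

The total is still unknown until the last query returns, but the user sees a counter moving the whole time.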

5: Everything is Black Hat

Sigh. No, man, this is about being efficient in your digg friending. It's not black hat - we're within the API and the TOS, and I wouldn't be ashamed to have my mom find out.

6: NO MAN YOU ARE A #$**&^^%% SPAMMER!!!!!

Well, gee, all caps, I'll take that seriously.

7: Users Click on Ads Even When They're Clumsy

Frankly we did not prioritize ad layout (we stink at that anyway) because we decided that we had other priorities during launch. But a lot of people clicked ads. A lot.

8: People Like Working Tools

We got a fair number of emails saying, in effect, "Thanks for making something that works." Yeah, well, we have a very large list of SEO tools we've looked at and a fair number of them simply don't work reliably. Which is very frustrating.

9: You Can Watch The World Wake Up

If you set the sitemeter geographic display on a refreshing page in Mozilla then you can watch people in India, Oz, China and other strange countries (ex: San Francisco!) wake up and bookmark over to your site. Pretty neat.

10: You Learn Something New Every Day

There are 384 posters in digg with HAMSTER and there are 213 with GERBIL. I'd have thought that hamsters were a lot more popular, but then I'd have never thought to look for either. Digg is a biiiiig community.

Feb 12
2008

Architecting SEO Apps for Digg

Posted by admin in SEO tool, ROI, Digg, capability, architecture

Digg is a sweet target for automated SEO tools because they have a powerful API, plenty of horsepower, and you can get a lot of ROI from even a little advantage. But you have to build it right or you’ll consume all your own bandwidth/CPU and produce a tool that is down a lot. And no matter how free something is, it’s not very useful if it doesn’t stay up and produce results.

The Obvious Way

Well, it’s usually wrong, or at the very best it’s what everyone else is doing, so there is no advantage gained. So if you were building a tool to mine digg for information you’d:

Digg API Via Server

Easily Banned

We would never do anything black hat (we just look like the jelly beans nobody wants when we dress as Men In Black) but it’s pretty easy to imagine that with a server based architecture Digg would choose to ban our service or choke our bandwidth rather than change their TOS so we’d stop:

How To Build it Unstoppable

Once again, you download a small AJAX application which hits the Digg API from your local machine.

Unstoppable Client Digg API

This conserves our bandwidth, CPU, and (most importantly) makes the overall application run faster and be unbannable.

What if Digg Hits Me Like You Worry They’ll Hit You?

Reasonable enough. But let’s take a look at why they’d hit us….

If five hundred people show up and each of them causes a thousand queries to the Digg API, you don’t even have to do the math to figure out that 500K hits on the API from one IP address is going to attract attention.

But everything we’re going to do is within the TOS of Digg, uses their API in a reasonable way, and can be used in a white-hat manner. So if the service runs from your computer then what Digg sees is five hundred new people using the API in an appropriate fashion. Their response to this sort of use is much more likely to be throwing some more hardware at the gateway.
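For anyone who does want to do the math, here it is as a small sketch. The user and query counts are the ones from the example above; the IP labels are placeholders, and the assumption that every visitor arrives from a distinct address is ours.

```python
USERS = 500
QUERIES_PER_USER = 1_000

total_queries = USERS * QUERIES_PER_USER

# Server-centric: every query leaves from the one server address.
server_side_per_ip = {"tool-server": total_queries}

# Client-side: each visitor's browser queries from its own address
# (assumes no two visitors share an IP).
client_side_per_ip = {f"visitor-{n}": QUERIES_PER_USER for n in range(USERS)}

print(f"total queries:            {total_queries:,}")
print(f"busiest IP, server model: {max(server_side_per_ip.values()):,}")
print(f"busiest IP, client model: {max(client_side_per_ip.values()):,}")
```

The total load on the API is the same either way. What changes is how much of it any single address is responsible for.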

Also, worst case, if you trip some sort of capacity limit and they slow you down, well, you're on a dynamic IP and our server ain't....

What Happens if Digg Changes their TOS

We’d change the service to stay inside the rules. If that were not possible, we’d pull it down. We’re a business and we have to co-exist within a framework of contracts and rules.

You HATE Free Stuff – Why Are You Doing This?

Well, we actually love “free” stuff that works, it’s the undependable and almost not-working free stuff that drives us nuts. But this is not a free service, we get several items of value:

  • Advertising on the ‘free’ pages
  • Name recognition for future, fee based services

How much does it cost to capture a customer? It can cost a lot and we do this sort of utility because we believe that this is the most cost efficient way to get paying customers. By paying we mean that people are either clicking ads or they are migrating to paid services.

What if People Abuse It?

Jeez, you don’t even know what it does yet and you’re worried about abuse? Well, you’re in good company, we worry too. But we are assuming that people using the site are adults, with good judgment and respect for their internet environment. If that is a bad assumption then we’ll regroup and start putting controls in place.

Coming soon……

We’re pretty excited about this tool and we think you will be too.

Feb 09
2008

iMacro SEO Automation Framework

Posted by admin in iMacro, capability, automation, architecture

In order to build an efficient SEO automation framework you must carefully separate value added manual tasks from repetitious automatable steps. For example, writing content on Social Networking is value added, but posting it to 23 different pligg sites and bookmarking the article at 50+ sites is not.

Manual or Automation

But determining which categories it goes into on the 23 pligg sites is an appropriate manual task. So it's not like you are just going to offshore someone to pick up the phone and say: "Dr. Newhart's office, hold please."

Let's posit, for the sake of argument, that you've built a framework of CMS, user administration, content creation, database driven workflow, and iMacro automation of specific tasks. Not a small task, but you can get there stepwise, so bear with me.

What we do here is to flowchart (remember that?) our process and mark boxes as M/A - where:

  • M = Manual
  • A = Automatic

The trick is to NOT mark something as "M" just because you don't know how to automate it. If you can conceive of something as non-manual, odds are that a bit of sweat equity will allow you to build a tool to automate the process.
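If the flowchart lives in code rather than on paper, the M/A marks become data you can sort on. A minimal sketch, using the steps from the pligg example above; the `Step` and `Mode` names are invented for illustration and are not part of any framework we've described.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    MANUAL = "M"
    AUTOMATIC = "A"


@dataclass(frozen=True)
class Step:
    name: str
    mode: Mode


# Steps come from the post's own example; the marks follow its reasoning.
WORKFLOW = [
    Step("Write the content", Mode.MANUAL),
    Step("Choose a category on each pligg site", Mode.MANUAL),
    Step("Post to 23 pligg sites", Mode.AUTOMATIC),
    Step("Bookmark the article at 50+ sites", Mode.AUTOMATIC),
]


def steps_marked(mode: Mode) -> list:
    """Return the names of the steps carrying the given mark."""
    return [step.name for step in WORKFLOW if step.mode is mode]


if __name__ == "__main__":
    print("Automate:", steps_marked(Mode.AUTOMATIC))
    print("Keep manual:", steps_marked(Mode.MANUAL))
```

Every step left in the manual list should be there because a person adds value, not because nobody has worked out how to automate it yet.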

Real World Automation Example

We recently had to create a BUNCH of customer records for a system migration, fix address fields, verify zip codes, etc, etc. And by BUNCH I mean it was 200K+ items. We had done this type of work before using Mark V1.0 Humans so we had an idea of times and costs. This time we did a bit of flowcharting and what-if'ing and decided that rather than spend 3 months doing the needful with copy/paste/excel and 40 people we would build an iMacro based toolset.

The tool creation took two weeks, we temp'd a dozen folks to manage exceptions, and were done in 2 months. For half the cost and at a measured 95% accuracy rate.

That is SO NOT SEO

Well, yeah, but did you think we'd tell you our internal SEO tricks? Uh, no.

But use your imagination. Say you have 50 keywords/phrases you buy on Google. Who else buys them? How do they rank in the SERPS? How are their websites organized? What else do they own?

You can certainly figure out all that stuff manually. Or you could write an iMacro to query google, capture the ads, present them to people to decode, send iMacro to run reports on their sites, look up their IPs, etc, etc.

If you Do It More Than Once

You might consider automating. Me, personally, it's only worth getting the geeks involved if I have to do it more than twice/week. Otherwise, in my experience, it may take me more time to specify than to do.

Summary: Automate

I think if I were smarter I'd move my 'automate' bar down to the things I do once/week for more than six months. But, then, I have a fair bit of technical support. If you are a roll-your-own person, your barrier to entry might be even lower than mine.
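The "is it worth it" question is a break-even calculation. Here is a rough sketch; every number in the usage lines is invented for illustration, and it ignores maintenance and the time the automated run itself takes.

```python
def weeks_to_break_even(build_hours: float,
                        manual_minutes_per_run: float,
                        runs_per_week: float) -> float:
    """Weeks of use before the time spent building the automation is paid back."""
    if manual_minutes_per_run <= 0 or runs_per_week <= 0:
        raise ValueError("manual time and frequency must be positive")
    manual_hours_per_week = manual_minutes_per_run * runs_per_week / 60
    return build_hours / manual_hours_per_week


if __name__ == "__main__":
    # Hypothetical task: 30 minutes by hand, 8 hours to specify and automate.
    print(weeks_to_break_even(8, 30, runs_per_week=2))  # twice a week
    print(weeks_to_break_even(8, 30, runs_per_week=1))  # once a week
```

With those made-up figures the twice-a-week task pays back in 8 weeks and the once-a-week task in 16, which still fits inside six months.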

When in doubt, automate with iMacro and capture your data in a relational database.

Feb 04
2008

Choosing a Web Application Manipulation Tool

Posted by admin in software, iMacro, capability, automation, architecture

The bad news is that if you choose the wrong tool you'll have a heck of a time unwinding the mistake. The good news is that these products are different enough and the choice is pretty clear.

Real World Web Application Manipulation Tool

We said earlier that a web application manipulation tool is one that drives a web site based on a link back to your backend application containing workflow and data driven information. In a real world setting this tool must have the following characteristics:

  • Supported application - from a commercially viable vendor or an active OSF-type community
  • Mature product - must have developer documentation and have deployment successes
  • Flexible - Must handle a wide variety of web-based applications

There were only four real world candidates that were close enough to analyze:

I was going to lump Chickenfoot in with CoScripter but I will break it out as it has some particularly interesting academic shortcomings.

ChickenFoot / CoScripter


Maturity Test: Failed

I would have dinged both of these tools as not being mature - CoScripter is less than a year old and ChickenFoot is barely 6 months.  I really don't care how smart the guys at IBM or MIT are - that's not a mature product.

Supported Application: Failed

But there is another problem with CoScripter, from IBM.  And the problem is IBM.  Normally (unless you remember OS/2!) it is a good thing to buy software from IBM, but it's not exactly in their software strike zone, is it?  Oh,  well, yeah, it's free and everything,  but how does it fit in with their Linux strategy?

It doesn't.  So CoScripter is only as alive as the interest of the researchers working (part time) on it.

ChickenFoot is even worse: senior project at MIT.  Next year, aside from NOT getting the girls, these guys  will be doing what, exactly?  Again, open source, but is that your business?

Flexible: Too Much So

Here is where the wheels really come off ChickenFoot.  It uses a pattern matching engine to figure out what it wants to click when you say click(“Submit”).  If there are, say, five submit buttons then you have to write a buncha javascript.  Uh, dude, how fragile is that?
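To see why matching on a label is fragile, here is a small sketch that counts what a label match would hit on a page with several identical buttons. It does not use ChickenFoot or reproduce its matching engine; the page markup and the `find_by_label` / `find_by_id` helpers are hypothetical.

```python
from html.parser import HTMLParser

# Made-up page with three buttons that all say "Submit".
PAGE = """
<form id="search"><input type="submit" value="Submit"></form>
<form id="login"><input type="submit" value="Submit" id="login-submit"></form>
<form id="comment"><input type="submit" value="Submit"></form>
"""


class SubmitFinder(HTMLParser):
    """Collects the attributes of every submit control on the page."""

    def __init__(self) -> None:
        super().__init__()
        self.buttons = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "input" and attributes.get("type") == "submit":
            self.buttons.append(attributes)


def _all_buttons() -> list:
    finder = SubmitFinder()
    finder.feed(PAGE)
    return finder.buttons


def find_by_label(label: str) -> list:
    """Match on the visible text, the way a click-by-label call would."""
    return [b for b in _all_buttons() if b.get("value") == label]


def find_by_id(element_id: str) -> list:
    """Match on an explicit identifier instead."""
    return [b for b in _all_buttons() if b.get("id") == element_id]


if __name__ == "__main__":
    print(len(find_by_label("Submit")), "buttons match the label")
    print(len(find_by_id("login-submit")), "button matches the id")
```

The label matches all three buttons and the id matches one. Any script that relies on the label has to carry extra logic to pick the right one, and that logic breaks the day the page gains another button.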

CoScripter and ChickenFoot Final Grade: D

AutoIt
Maturity and Support: Yes!

AutoIt is in the third incarnation, has an incredibly active community, and receives regular updates.  Best of all, it’s free, small, and looks a lot like visual basic.  And you can call Windows system level ‘stuff’ as well as COM, DOM, and all those other overloaded Microsoft Acronyms.

Which is the real problem:

Flexible: Yes - Everywhere But the Web

The web side is pretty much, well, krep. You can smack mouse clicks into exact locations in a programmatic window that you overlay on an IE region. And if that sounds like using a laser cannon to heat your Beenie Weenies, well, it is.

AutoIt Final Grade:  D

MacroExpress


Mature and Relatively Flexible

MacroExpress has many of the same powerful windows features of AutoIt but with numerous web features built in.  It is a well supported VB runtime like product, with a relatively active user group and lots of examples. 

It does not handle Java U/I issues, Ajax, Flash, etc.  I'd say that for plain vanilla HTML apps their web automation would work pretty well.  And, yes, I am aware that this is a diminishing crowd.

Well Supported - Not So Much

It costs under $40 and you get about that much support.  The user group/forum seems pretty effective, but there are persistent bug complaints that seem to go unresolved.

MacroExpress Final Grade:  D

iMacro


Very Mature

This product is several years old and is installed in a host of major corporations and startups. Of all the products, this is most like tools from 'the old days.' I was reminded more of MultEdit or WinZip or some other product with a cadre of developers and a wide installed base.

Properly Supported

When you buy iMacro (and the developer license starts at $500 and goes up pretty quickly) you get support.  Just like a real product.

Flexible Like A Cirque Contortionist

iMacro can handle Java, Direct Screen, Ajax, etc, etc.  It can even do fuzzy image recognition of bitmapped objects on screen.  Frankly we've been unable to find a situation where we couldn't  bang on an application using iMacro.

iMacro Final Grade:  A

What We Chose

This is probably pretty obvious: iOpus iMacro. For your amusement, I've placed the candidates on our SEO capability matrix, but I think I can summarize why this really works best: it is the simplest solution. It has a lot of sophistication under the covers, but a simple glass bottle full of red wine can have a lot of complexity, and history, and artistry too. So don't be fooled - the guys at iOpus have crafted a specialized tool that eschews the useless and focuses on completing a job just exactly right.

 

Conclusion

We'll start giving some concrete SEO examples using iMacro and some of the architectural framework we've discussed in earlier posts.

Feb 03
2008

Building Your Own Social Networking Automation

Posted by admin in social network, social bookmark, SEO tool, automation, architecture

It's a hallowed tradition in technology when you need something: build it. And once built, use it to support your core business. Nowadays big companies call it "Eating Your Own Dogfood." So we know that the Microsofties were tortured with Office/2007 (motto: you thought you knew where things were....) long before any of us.

What do YOU Need to Automate your Social Networking Tasks?

We've looked at minor productivity enhancers and semi-automated tools, but they all lack key elements (automation, reporting) or have architectural issues (RSS Bookmarker's server centric design).

SEO Capability Grid

At the end of the day, there is a lack of actionability and ROI. Which comes from solving part of the problem or solving it the wrong way. In the end, what is missing is a technology stack that is focused on solving the end-to-end problem.

Technology Stack

Technology stack is a term of art that discusses the solution's technical elements and their combination, specifically relating each piece of the stack to part of the critical path for the solution of the business problem through technology.

For example, choosing your OS might revolve around current investment, running cost, etc. In this case the choice of Windows/Server over Linux would not only relate to the other technical pieces (ex: VB versus PHP) but to the real world of budget and staff ability.

I believe the minimum technical stack to automate Social Network System usage will look something like this:

Technical Stack for Social Network Automation


Components Included

Client Side:

  • Browser (ex: IE, Mozilla, Opera)
  • ClientOS (ex: Windows, Linux, MacOS)

Server Side:

  • ServerOS
  • Database (ex: MySQL, Oracle, etc)
  • 3GL "glue" language (ex: Perl, VB, etc)
  • Reporting (ex: Crystal, Excel)
  • Workflow and Admin
  • Web Application Manipulation Language

Please note that the server is only conceptually separate from the client. You could certainly run it all on one machine, but if it is built in this fashion you'll be able to bring a horde of minions online when you're swamped with success.

This is the same reason you need to build in your thinking about Admin (user creation, security) and workflow (who does what in which order) from the beginning. You don't have to fully invest in activating it all at first, but if you don't slot it into the design, the retrofit will be horrible. And expensive.

I am certainly not going to get involved in the relative merits of Linux/Windows, Mozilla/IE, language_a/language_b - because we all have opinions and skills there, along with infrastructure and educational investment.

You want to use SQL Server or MySQL or Oracle? Fine, because that means you are solving the database nature of the problem. You want your Workflow/Admin in Nuke or WordPress or Joomla? Fine, because you're solving the problem at the right architectural level with the right tools.

What I want to focus on is the most difficult and arcane piece of the equation, the part that makes this application interestingly different than yet another CRM business application, this piece is the:

Web Application Manipulation Tool

The most basic assumption to start with is that you are developing an application to save you time, money, and to produce a competitive advantage. And that you are not going to write your own web browser with a built in voice recognition and replay capability. In Urdu.

The pointy end of the problem solving spear is a tool that you can use to programatically manipulate a web page: click buttons, insert text from your database, collect results, etc. This is a web application manipulation tool.

It must have the following capabilities:

  • Supported application - from a commercially viable vendor or an active OSF-type community
  • Mature product - must have developer documentation and have deployment successes
  • Flexible - Must handle a wide variety of web-based applications

Available Options for Web Application Manipulation Tools

We started with a list of over 20 contenders and boiled it down to the following serious products:

All these products placed strong showings in the above requirements, clearly leaving rivals behind.

To Be Continued.....

The next article will compare and contrast these four candidates and place them on the SEO Capability Grid.

Feb 02
2008

I'm Just Saying - Complexity is Complex

Posted by admin in software, mistakes, capability, architecture

We're trying to debug something that seems relatively simple, but we've got the following bits and pieces in the mix:

  • Joomla
  • Components galore for Joomla
  • PHP
  • Ajax
  • MySQL
  • Oracle
  • Two servers (same hosting center)
  • Development on Windows
  • Deployment on Linux
  • IE and Mozilla
  • Display widgets from DXHTML (awesome stuff!)

And some other stuff, I'm sure. Swear to gosh, simple problems can take forever to find.

On the upside, we find that complex problems are easily tackled and that the system's flexibility in meeting the needs of new solutions is outstanding.

True Story

Back in the days when nobody owned a domain and a 24K modem was trick, I was working on an embedded system. We'd compile in a development environment, test, then when it all looked plausible, we'd cross-compile from the x86 environment to the M68K hardware world. (Little Indian to Big Indian for the other geriatrics out there.)

One Friday morning everything stopped working from a software perspective. Everything. Lights didn't go on, lines wouldn't go from low (-5V) to high (+12V) to make the widgets widgetier. Nothing.

We got out the fricking oscilloscope. Nothing made sense.

We dumped the memory (128K of it!) onto some green bar paper and started reverse assembling it back to C. Still nothing made sense.

48 hours later, around 9am Monday, a colleague walked by, asked what was going on, listened to us explaining how we were totally baffulated, and glanced at the much scribbled green-bar.

"Shouldn't there be a memory offset for the pointer to the hardware PROM load right there?" he said, pointing at the first four bytes of the printout.

The bug was that we'd somehow forgotten to #include the hardware.h file.

And somehow we'd not noticed that the very first thing on the printout told us the problem.   For two days! 

We fixed the code (10 seconds), cross compiled (5 minutes) and were magically right back where we'd been three days ago.

Then we went and got drunk, which was not so easy to do at 10am on a Monday morning in a small Southern town.

Plus Ca Change

The more things change, the more they stay the same.  I betcha we're looking for something really tricky and what we're really seeing is something really simple.

Jan 31
2008

iMacro Introduction and Installation

Posted by admin in SEO tool, iMacro, automation, architecture

iMacro is an amazing piece of technology and it can help you automate any of your web browser-centric tasks. They have some great examples, a good documentation wiki, and an active support/user community.

Big Savings in Time and Money

We have been big users of this technology for over a year now and have found that it can pay for itself more quickly than any other product we've used. In one recent project we replaced three months of work by 14 people in a data entry center in India ($16,800) with 2 days of programming ($4K) and an enterprise copy of iMacro ($699). We then let six computers chug away for 10 days.

Do the math - time saved and money saved.  The bi-fecta!  What is your time worth?
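Here is that math written out as a short sketch. The dollar figures are the ones quoted above. Treating "three months" as roughly 90 days, running the programming and the machine time back to back, and leaving out the cost of the six computers are our assumptions.

```python
manual_cost = 16_800       # 14 people for three months
programming_cost = 4_000   # 2 days of programming
license_cost = 699         # enterprise copy of iMacro

automated_cost = programming_cost + license_cost
savings = manual_cost - automated_cost

manual_days = 90           # assumption: "three months" is about 90 days
automated_days = 2 + 10    # programming, then ten days of machine time

print(f"automated cost: ${automated_cost:,}")
print(f"money saved:    ${savings:,} ({savings / manual_cost:.0%} of the manual cost)")
print(f"time saved:     about {manual_days - automated_days} days")
```

That works out to $4,699 against $16,800, a saving of $12,101, with the job done in about 12 days instead of about 90.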

Review of Technology

We'll do a review of iMacro and why we think it is the technology to pick for automating web based tasks, but for now let's just review the installation.

Download Instructions

Go to iMacro and click on the download link: Download iMacro

 

Then download the iMacros Version 6: download imacros

Save the file:


 

Installation Instructions

Go wherever it is you save files when you download them and double-click the "imacros-setup.exe" file. Walk through the normal Windows installation procedure, accepting all the defaults.

When the install is complete you may see a reboot notice. This seems to occur depending on the status of installed Microsoft updates. I tend to NOT reboot at this point but to go onto the next step:

install is complete

 

 

Make sure you check the, er, checkbox for Clean Install and then click “Start the iMacros Browser.”

You'll see, well, you'll see the iMacro browser. Basically it's a windows program that has the IE browser built in.

Yep, I know what you're thinking: you can do a lot with that. Yes, you can.

Final Step

Now check your desktop for these three icons:

iMacro desktop icons

At this point, if you got the ‘reboot’ notice earlier above you can reboot your computer.

Congratulations, your iMacro is fully installed.

Now

If you’re here because you were in the midst of installing some of our tools, well, for gosh sakes, get back where you belong and finish! We'll start posting some handy iMacro code soon, stay tuned.
Jan 30
2008

Architecting SEO Tools For Success

Posted by admin in software, SEO tool, capability, architecture

I'd like to get a bit geeky on everyone - not the chicken biting geeky, but the other kind - and talk about architecting SEO tools for successful deployment. I'm going to skip by discussions of PC/Mac UI, Ajax/Ruby, vi/EMACS, etc, etc. I want to talk about basic system architecture.

Service with a Smile

The most commonly deployed form of system architecture for SEO tools is server-centric.

Server based SEO

In this model you use a browser to make requests of some mysterious backoffice system that queries around the internet (in this example, Yahoo) and brings you back some pretty results on your screen.

Yawn.

As the service gets more popular it does what? Slows down.

But, here's the problem: as the service gets more popular, or does more stuff that gives you a competitive advantage, guess what happens? Exactly.

Server There is a Fly in my Soup

IP Banned Server SEO

Happens all the time. So people do crazy stuff like anonymizing their server, hopping IPs, etc, etc.

But all that stuff is a bandaid. Once you're server centric, popular, and whacking the smack out of some other guy's site, well, banning is going to occur. Or they'll help you experience "serial temporary outages" - anything to get you to go away.

 

There is an Easier Way

You just have to do a little work and have the actual grunt work happen on the user's side.

Client XML Based SEO

In this instance what we're showing is a lightweight AJAX app running in the browser. The actual mysterious query happens on the client side, XML is sent back to the server, which grinds the data up, and then it sends it back for display.
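The split of work can be sketched in a few lines. This is Python standing in for what would really be javascript in the browser plus whatever runs on the server, and the XML layout and function names (`client_build_report`, `server_summarize`) are invented for illustration rather than taken from our tools.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from urllib.parse import urlparse


def client_build_report(query: str, result_urls: list) -> str:
    """Client side: wrap the results the user's machine gathered as XML."""
    root = ET.Element("report", attrib={"query": query})
    for url in result_urls:
        ET.SubElement(root, "result").text = url
    return ET.tostring(root, encoding="unicode")


def server_summarize(report_xml: str) -> dict:
    """Server side: grind the XML into something worth displaying."""
    root = ET.fromstring(report_xml)
    hosts = Counter(urlparse(r.text or "").netloc for r in root.iter("result"))
    return {"query": root.get("query"), "results_per_host": dict(hosts)}


if __name__ == "__main__":
    # The URLs stand in for whatever the client-side query returned.
    xml_report = client_build_report(
        "seo tools",
        ["http://a.example/1", "http://a.example/2", "http://b.example/x"],
    )
    print(server_summarize(xml_report))
```

The server never touches the search engine. It only sees the XML the client hands back, which also means it has to treat that XML as untrusted input and validate it before using it.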

What a Load of Trouble!

Not really. Let's say Yahoo dislikes Cartoon Man's use of this service and blocks him.

DHCP SEO Architecture

Our intrepid cartoon man just gets a new IP. And goes on his merry way.

But that is even less likely to happen because instead of one server hitting Yahoo a zillion times a day (think the last guy working there will notice?) you have a few thousand users hitting Yahoo a few times a day each.

The other thing users will notice is that it is much faster for almost any service - after all the client PC is really sitting around not doing much, most of the time, ain't it?

Conclusion

There is a fair amount of work required up front to get an advanced SEO friendly architecture like this working but it pays benefits because it is simply more robust in the wild.

Jan 29
2008

10 Very Evil Things That Could Happen When You Use a Free Theme in your CMS

Posted by Don in software, open source, capability, architecture

Have you ever installed a theme or a component without reading and thoroughly understanding the source code of what you've just installed?

SEOmoz had a great article on Choosing the Right CMS Platform for Your Website and Dawud Miracle ran The Ultimate Resource For Free WordPress Themes . Both are great articles worth reading.

Everybody Loves Free Themes

What do these two seemingly unrelated articles have in common? They both implicitly guide readers towards using systems that are heavily dependent upon external themes. There's absolutely nothing wrong with that and I heartily support leveraging open source software where it makes sense. This blog runs on Joomla with several components and a commercial template from an outside vendor.

Themes and CMS

What is a theme within the framework of a content management system? A theme is just a mechanism for a content management system to separate the graphical user interface from the underlying code. Rather than trying to be all things to all people, the CMS developers have put hooks into the system that allow third parties to control what the user interface looks like without having to dive into the innards of the underlying system. It all works very well, and there are literally thousands of themes to choose from for just WordPress.

The Birds and Bees for Themes

So what's the catch? It's quite minor, but it could be disastrous if you don't know what you're doing. A theme is really just a collection of PHP files and the associated graphics and CSS files that allow the page to be rendered for the user. For instance, if you look at the directory structure of the default WordPress theme you see something like:

404.php         comments.php        header.php  page.php        search.php
archive.php     comments-popup.php  images      rtl.css         sidebar.php
archives.php    footer.php          index.php   screenshot.png  single.php
attachment.php  functions.php       links.php   searchform.php  style.css

All harmless enough. The different files implement "methods" for dealing with the different rendering events, so when it's time for the system to draw the footer of the page it just calls footer.php. Here's what the code looks like:

What?

That is the default theme from WordPress - you get to give them a free link back to their site. Yes, it's linkware, which is really a paid link, and Google still counts it, but that's a different rant.

Leaving the Door Unlocked

But what else could a malicious person do? See the PHP statements? They tell the web server that it's no longer looking at plain HTML and should now invoke the PHP processor to interpret this code. That PHP code can do all sorts of wonderful things. For a theme, it generally transfers something to the output stream that will end up on your web page. But there are literally thousands of functions available in PHP, and any code could be contained within a template. If someone had evil and very black hat intentions, they could:

Do These 10 Bad Things

1> Cloak your pages so that they look normal to everyone except the search engine bots. The bots get shown a page of spammy links.

2> Implement an Ajax based function that sends any form data entered (for example, login and passwords from the comments) to an external web site.

3> Cloak your pages so that they look fine to you, but if someone arrives at the page from a search engine they get a different page with the evil template developer's AdSense.

4> Watch the IP addresses that view the pages (phone home) and make a good guess as to which addresses probably belong to the owner. Cloak the pages so that the site owner sees their own content, but everyone else sees the template developer's content.

5> Collect the email addresses of anyone who enters a comment and phone them home. In fact, in certain systems they could access the MySQL database and just query for all users and emails.

6> Change the IDs for ads and affiliate programs so that revenue flows to the theme producer and not the site owner.

7> Open a command window for pages served to certain IP addresses that allows the template developer to enter any PHP command (and thus any operating system command since PHP can shell out to the OS).

8> Send home the configuration data for the CMS, such as the administrator's user ID and the "salt" for the password. If you don't know what the "salt" is, your takeaway should be that it's a lot easier to use brute force methods to crack someone's password if the attacker knows the salt.

9> Modify the robots.txt file so that search engines don't see any pages. Or do see the wrong pages. Or block the Googlebot IPs and let Ask.com through. Can you imagine the head-scratching to figure that out?

10> Ping the, er, ping services 1,000 times on each publish and comment so that you get banned.
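Item 9, at least, is easy to check for. Here is a minimal sketch in Python (themes are PHP, but any language will do for a checker) that uses the standard library's urllib.robotparser to report which crawlers a given robots.txt locks out. The function name blocked_agents and the sample TAMPERED file are made up for illustration, and the sketch only covers user-agent rules in robots.txt, not blocking by IP address.

```python
from urllib.robotparser import RobotFileParser


def blocked_agents(robots_txt, agents, path="/"):
    """Return the crawlers from `agents` that robots_txt forbids from fetching `path`."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, path)]


# Hypothetical tampered file: Googlebot is locked out, everyone else is welcome.
TAMPERED = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Allow: /
"""

if __name__ == "__main__":
    print(blocked_agents(TAMPERED, ["Googlebot", "Slurp", "Teoma"]))
```

Run against the tampered sample, this reports Googlebot as the only crawler that is blocked. In practice you would feed it the robots.txt your live site is actually serving, not the copy sitting on your disk.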


Evil You

I was trying to come up with 10 because that's what you're supposed to do, and I thought it would be hard - it wasn't.

Don't think that I'm letting some super secret ideas out to the blackhat community. If they're good enough to do those things and evil enough to think of them, they've already figured it out.

Don't Panic!

Don't go screaming off into the night with fear that any WordPress theme you download is going to bite you. I personally have never seen a theme that did any of those things (with the exception of changing the Amazon ID -- I saw a plug-in that did that, but they were upfront about it in their documentation).

But I do read through the source code of any third party open source tool I'm going to install.
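If a theme has a lot of files, a quick first pass can point you at the lines that deserve the closest look. The sketch below, in Python, walks a theme directory and flags calls to PHP functions that can run code, shell out, decode hidden payloads, or talk to the network. The function names on the watch list are real PHP functions, but which ones to put on the list is our own assumption, and scan_theme is a made-up helper, not part of any CMS. A hit is not proof of evil and a clean result is not proof of innocence; it is a reading aid, not a replacement for reading.

```python
import re
import sys
from pathlib import Path

# Assumed watch list: real PHP functions, but the selection is ours.
SUSPICIOUS_CALLS = [
    "eval", "base64_decode", "gzinflate", "str_rot13",
    "exec", "shell_exec", "system", "passthru", "popen", "proc_open",
    "curl_exec", "fsockopen", "mail",
]

# Match a listed name followed by "(", but not when it is the tail of a longer
# identifier (my_exec), a variable ($eval) or a method call (->exec).
CALL_PATTERN = re.compile(
    r"(?<![\w>$])(" + "|".join(re.escape(name) for name in SUSPICIOUS_CALLS) + r")\s*\(",
    re.IGNORECASE,
)


def scan_theme(theme_dir):
    """Return (file, line number, function name) for every flagged call."""
    findings = []
    for php_file in sorted(Path(theme_dir).rglob("*.php")):
        text = php_file.read_text(encoding="utf-8", errors="replace")
        for line_number, line in enumerate(text.splitlines(), start=1):
            for match in CALL_PATTERN.finditer(line):
                findings.append((str(php_file), line_number, match.group(1).lower()))
    return findings


if __name__ == "__main__":
    # Scan the directory given on the command line, or the current one.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for path, line_number, name in scan_theme(target):
        print(f"{path}:{line_number}: {name}()")
```

Point it at the unpacked theme directory before you upload anything, then go and read whatever it flags, along with everything it doesn't.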

Do You?
