A while back I was talking with some people about a specific type of database and the true impact it could have on user-to-client conversion for businesses. While mapping out the current possibilities, limitations, and ‘the great unknowns’, we realized that much of what was required has already been done by the likes of Google. One of my first reactions was simply “Crap. They’ve already done it, are doing it, and have a few years and billions of dollars head start.” This was shortly followed by “Crap! They’ve already done it, and are doing it!”
So let’s shift gears for a second. Sure, there is the Google we all know and love for searching, SEO, email, chat, shared documents, YouTube, etc. Literally, the list goes on for quite a while. But how much user data are they actually harvesting from a) the people who use their sites, and b) the sites that gladly log their data with Google in trade for some poor analytics? It’s one thing to gather some data from your own group of sites, but now hook up millions and millions of websites with billions of users logging data into your systems. The best part is that all you have to provide in return is some approximate/estimated metrics that are more than 24 hours old. You don’t need to provide database access, and the site owners are completely limited to exactly what you give them. Do you honestly believe that because a specific item of user data isn’t available in your Google Analytics (GA) account, they’re not storing it?
So why then are we using it? Well, for small sites it’s cheap and it’s good. Even for medium-sized sites it’s a good, solid platform that can provide the necessary data to make good decisions. My big issue is with large corporate sites giving away their data. OK, but what are the alternatives, right? Well, sadly there aren’t many. There are a bunch of companies that claim to have superior analytics, and I’ve found one that answered a lot of questions about the specific database that started all of this. I’m not sure how good their product is, but it is very interesting to say the least.
Is there anything available that is open source? Well, actually, now that you mention it, sort of. While doing some of this digging I came across Piwik:
Piwik is a downloadable, open source (GPL licensed) web analytics software program. It provides you with detailed real time reports on your website visitors: the search engines and keywords they used, the language they speak, your popular pages… and so much more.
Piwik aims to be an open source alternative to Google Analytics.
Oh really? After installing it on my trusty webserver, and noticing how truly dismal my traffic is (meh, most of the content is read via RSS anyway, which sometimes makes me wonder why I bother having anything other than an RSS feed), I began to notice that this crew is onto something. Sure, it’s still got a ways to go before it becomes a serious contender to GA, but wow!
OK, so now look at it this way: GA is top dog, but barely gives a dog a bone, right? Piwik is out proving that you can own your own data. They have made great strides in gathering data just like GA, and really, if they had some funding or more coders they could be in serious contention in a very short time. Which raises the question: if you’re a massive corporation with the bankroll and the developers to actually make this happen, then why are you giving your data away?
Data is everything: without data you have no information, without sufficient information you cannot build knowledge, and without knowledge you cannot take action.