
Cre8asiteforums Internet Marketing
and Conversion Web Design


How Accurate Are Your Analytics Tools And Data?

analytics, data tracking, privacy, marketing tools, server logs

11 replies to this topic

#1 cre8pc


    Dream Catcher Forums Founder

  • Admin - Top Level
  • 13638 posts

Posted 18 December 2012 - 12:15 PM

I've brought this up before and now have related concerns about analytics and the integrity of data.

First, I don't like how Google Analytics data differs from actual server logs. I feel that the data coming straight from the server is pure and, with good software, you can remove search engine bots and known bad IPs to get the human traffic and behavior.

My additional curiosity is over user manipulation and control. I have installed Ghostery, a Chrome extension that tells me all the junk going on in the background from each web site I go to. I have the ability to turn off their tracking, including Google Analytics and ClickTracks. There are search engines that we can use that don't track our movements.

Marketers and site owners are dependent on their data. How much of it is a true picture of activity? Why not just use server logs, which capture everything?
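
For anyone wondering what "remove search engine bots and known bad IPs" might look like in practice, here is a minimal sketch, assuming the server writes the common Apache/Nginx "combined" log format. The bot markers and the bad-IP list are made-up placeholders, not a vetted blacklist, and user-agent matching only catches bots that identify themselves.

```python
import re

# Combined log format:
# ip ident user [time] "request" status bytes "referer" "user-agent"
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

# Hypothetical placeholders for illustration only.
BOT_MARKERS = ("bot", "crawler", "spider", "slurp")
BAD_IPS = {"203.0.113.7"}


def human_hits(lines):
    """Return parsed log entries that are not from listed IPs or self-declared bots."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip lines that are not in combined format
        if match.group("ip") in BAD_IPS:
            continue
        agent = match.group("agent").lower()
        if any(marker in agent for marker in BOT_MARKERS):
            continue
        hits.append(match.groupdict())
    return hits


sample = [
    '198.51.100.4 - - [18/Dec/2012:12:15:01 -0500] "GET /index.html HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (Windows NT 6.1) Firefox/17.0"',
    '66.249.66.1 - - [18/Dec/2012:12:15:02 -0500] "GET /index.html HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.7 - - [18/Dec/2012:12:15:03 -0500] "GET /contact.html HTTP/1.1" '
    '200 2048 "-" "Mozilla/5.0 (Windows NT 6.1) Firefox/17.0"',
]

kept = human_hits(sample)
print(len(kept), "of", len(sample), "requests kept")
```

Bots that fake a browser user-agent sail straight through a filter like this, so it is a starting point rather than a clean human count.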




#2 EGOL


  • Hall Of Fame
  • 5500 posts

Posted 18 December 2012 - 12:35 PM

I have used server logs processed through ClickTracks, server logs processed through WebLogExpert, and live tracking through GetClicky.

The server logs methods usually return higher traffic counts. I think that they count bots.

GetClicky has lower traffic counts. I think that they filter out some bot traffic.

I am a numbers junkie and I like GetClicky best.

But.... we need to hear from iamlost on this thread. He is a no-BS traffic counter.

#3 iamlost


    The Wind Master

  • Site Administrators
  • 4644 posts

Posted 18 December 2012 - 01:20 PM

Ha. You don't want to get me started on this subject :)

"The server logs methods usually return higher traffic counts. I think that they count bots."

"GetClicky has lower traffic counts. I think that they filter out some bot traffic."

The server logs note every user-agent request. So, of course that includes any bots not blocked at the router/firewall in front of the server.

Most blacklists are in .htaccess files, so visitors (including bots) blocked that way are never served the requested URLs and so are never seen by third-party tracking.

Further, all third-party add-ons, be it GoAn or GetClicky, require that the site add a tracking code (usually JavaScript). This means:
---they can only track users with JavaScript turned on (probably most visitors)
---older/simpler bots cannot handle JavaScript and are not counted, but over the past few years more and more bots can, and those bots will be counted.

If a site uses spider/bot traps or similar then for more accurate data one needs to subtract such catches from the third party numbers and information.
Note: Subtraction is easy, properly accounting for specific catches in all the filters, graphs, etc. is usually not.

Then there is how third party services recognise and track or not SE bots. A whole other chapter or three here.

Sites are typically inundated by bots. And most haven't got a clue to what degree. And I don't know of a non-custom software that does an acceptable (to me) job. Think what not appropriately blocking/accounting for bots does:
* it dilutes conversion numbers.
* it reduces direct ad rates
* it increases competition with scrape and republish sites
---which can lower SE visibility
---which lowers visitors/customers
---which lowers revenue
* etc.
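
To put a number on the first bullet, here is a minimal sketch of how uncounted bot visits dilute a conversion rate. All figures are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def conversion_rate(conversions, visits):
    """Conversions divided by visits; 0.0 when there are no visits."""
    return conversions / visits if visits else 0.0


# Hypothetical figures for illustration only.
reported_visits = 10_000  # what the analytics package shows
bot_visits = 4_000        # visits that were really bots
conversions = 120         # only humans convert

diluted = conversion_rate(conversions, reported_visits)
actual = conversion_rate(conversions, reported_visits - bot_visits)

print(f"reported: {diluted:.1%}, humans only: {actual:.1%}")
```

With these made-up numbers the reported rate is 1.2% while the humans-only rate is 2.0%, so the same site looks noticeably worse than it is.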

To answer your title-question: my analytics tools and data are quite accurate (not perfect but within a comfortable confidence).
Are most webdevs' tools and data? No. Not even close. IMO.

#4 earlpearl


    Hall of Fame

  • Hall Of Fame
  • 1682 posts

Posted 18 December 2012 - 01:51 PM

I tend to use G Analytics and often back it up with one other source. Once in a blue moon I check a small volume of logs against my various analytics. I'm fairly comfortable with G Analytics on that basis. Everyone reports somewhat different data. When bots get mixed with traffic it gets pretty confusing.

Here is another area on a different level where you want some reliance on your own data.

Our SMBs get hit up by web advertisers every week. A lot of them are the IYP types (internet yellow pages) and all the directories of traffic. The SMBs are of totally different types.

Virtually every time we add "free directories" for citations (for G Maps) we get hit up to join those entities as a paid "premium" member. Sometimes the solicitations are slick and by phone, sometimes by email, sometimes both. Everyone wants part of the pie.

Of the times I speak with them, EVERY one has overstated the traffic they claim to deliver to us compared with what our analytics show. Every time. I checked on this with a group of SEOs that do local SEO for a lot of clients. Not one of those SEOs believes the claims of the other sources.

I like something I can at least periodically check against log files. Everything else is BS imho. :D

#5 Black_Knight


    Honored One Who Served Moderator Alumni

  • Hall Of Fame
  • 9339 posts

Posted 24 December 2012 - 12:36 PM

Years ago, I had a few articles around on how no form of tracking, even server-side, was ever accurate. None at all. Moreover, the type of inaccuracy is variable, and thus even what appear to be clear trends can be utterly misleading. Not only can users cache pages, but so can their networks (some workplaces may serve multiple visitors a cached page grabbed earlier so extra visitors are completely invisible to the server) and even some large ISPs have used extensive caching, with AOL being the most famous. Indeed, at one time, no AOL user could ever grab a page from a site. AOL was set in such a way that the whole thing was a vast firewall, and any user requesting a URL caused an AOL bot to grab the page, cache it, and serve the cached copy to the user. AOL might even use a different bot IP to grab subsequent pages, so one visitor might appear to be 6 different ones, while 10,000 later visitors would not appear in tracking via server logs at all.

That's the absolute most basic level of tracking defeated unpredictably. The more advanced stuff gets even more confusing.

The only truth is this: No method of tracking, and no form of analytics, is accurate. The best they give is a snapshot of some random stuff that may or may not be indicative, and is a bit like 2 or 3 different scouts in positions that give limited but different perspectives passing Chinese whispers of what they have seen. :)

But we like stats. We like to feel that if we can put numbers on things we can control them. Just remember that all of those numbers, no matter what the source, are barely more than guesswork, and there's no way to make the data from them accurate even to a predictably inaccurate generality. The numbers could be spot on, or be massively and critically wildly off, in any random direction (under reported or over).

#6 earlpearl


    Hall of Fame

  • Hall Of Fame
  • 1682 posts

Posted 24 December 2012 - 12:53 PM

@Black Knight: I seem to recall reading some of that info years ago. I absolutely devour statistics. Like you said above:

"But we like stats. We like to feel that if we can put numbers on things we can control them"

True for me. LOL.

So on this Christmas Eve I'd just like to thank you, Black Knight, for making my day!!! :D

#7 glyn


    Sonic Boom Member

  • Hall Of Fame
  • 2620 posts

Posted 24 December 2012 - 02:26 PM

All that we hear or see is but a dream within a dream.

#8 Ken Fisher


    Mach 1 Member

  • 250 Posts Club
  • 432 posts

Posted Yesterday, 08:57 AM

I thought EGOL would appreciate refreshing this thread.


I am a stats junkie too! It really sucked when The Gorg went secure. Geesh that was a good time looking at search phrases etc.


To the point. I signed up for getClicky (GC) an hour ago (paid version).


I know it's really early (30 mins of data), but these numbers are so different from GA. What's really frustrated me in the past with GA is the inaccuracy of bounce and time on site. Reading more of the above, I guess every program is more of a guesstimate? But I'm really looking forward to looking at heat map data. I guess that doesn't show for a while?


As of 30 minutes now my bounce rate is 20% compared to that dreadful GA number over 75%. Granted GA has counted about 500 visitors today and getClicky only 72.


So EGOL, how do you compare GA to GC in bounce? Possible? Anything positive that I can take to the bank? I could use some confidence.


This should be an entertaining day...

#9 bobbb


    Sonic Boom Member

  • Hall Of Fame
  • 2192 posts

Posted Yesterday, 10:51 AM

I think you will have to wait to get 1 complete day of data to be able to get numbers and compare. Will be interesting. I often look at my GA and say "quoi?" Agreed, interpreting all that data is confusing at times.


Believe I have read that time on site is inaccurate (shows 0) in GA if a visitor never clicked to another page. I suspect Landing pages and All pages also count a visitor twice when the back button is clicked. Most visitors will land on page A, follow one or two links, then click back to page A. So pages B and A are counted twice. A->B->C->B->A->exit, possibly back to the original Google SERP. I have observed this. So now sessions is the real number.
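
A minimal sketch of both effects, assuming time on site is simply the gap between the first and last recorded hit of a session (that is my understanding of how it behaves, not something confirmed in this thread). The function names, timestamps and page path are hypothetical.

```python
def session_duration(hit_times):
    """Seconds between the first and last recorded hit.

    With a single hit there is nothing to subtract from, so the result
    is 0 no matter how long the visitor actually stayed on the page.
    """
    if len(hit_times) < 2:
        return 0
    return max(hit_times) - min(hit_times)


def page_counts(path):
    """Return (pageviews, unique pages) for one session's page path."""
    return len(path), len(set(path))


# Hypothetical sessions; times are seconds since the session started.
one_page_visit = [0]                  # read one page for ten minutes, then left
multi_page_visit = [0, 40, 95, 130, 180]
path = ["A", "B", "C", "B", "A"]      # the A->B->C->B->A pattern

print(session_duration(one_page_visit))    # 0
print(session_duration(multi_page_visit))  # 180
print(page_counts(path))                   # (5, 3)
```

Under that assumption the one-page reader shows 0 seconds, and the back-button path racks up 5 pageviews across only 3 distinct pages in a single session.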

#10 EGOL



  • Hall Of Fame
  • 5500 posts

Posted Yesterday, 12:00 PM

For the past few months I have been running clicky and GA simultaneously on the same site.


The numbers are very similar.   How many people are on the site, what pages are they viewing.


The big difference between the two is how bounce rate is counted.  Clicky counts bounce rate as explained here.... http://clicky.com/he...ent/bounce-rate
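
To show how much the definition alone can move the number, here is a minimal sketch comparing a "single pageview = bounce" rule with a time-aware rule. The time-aware rule and its 30-second cutoff are my assumptions for illustration, so check the linked page for the actual definition; the visits are made up.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    pageviews: int
    seconds_on_site: float


def bounce_rate_single_page(visits):
    """Every one-page visit is a bounce, however long the visitor stayed."""
    if not visits:
        return 0.0
    bounces = sum(1 for v in visits if v.pageviews == 1)
    return bounces / len(visits)


def bounce_rate_time_aware(visits, min_seconds=30):
    """A one-page visit is a bounce only if it was also shorter than min_seconds.

    min_seconds=30 is an assumed, illustrative threshold.
    """
    if not visits:
        return 0.0
    bounces = sum(
        1 for v in visits if v.pageviews == 1 and v.seconds_on_site < min_seconds
    )
    return bounces / len(visits)


# Hypothetical visits: mostly people who read one long article and leave.
visits = [Visit(1, 5), Visit(1, 240), Visit(1, 600), Visit(3, 180), Visit(1, 90)]

print(f"single-page rule: {bounce_rate_single_page(visits):.0%}")
print(f"time-aware rule:  {bounce_rate_time_aware(visits):.0%}")
```

Same five visits, 80% bounce under one rule and 20% under the other, which is why the two tools can agree on traffic and still disagree wildly on bounce.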

#11 glyn


    Sonic Boom Member

  • Hall Of Fame
  • 2620 posts

Posted Yesterday, 03:16 PM

I also find the numbers match. I like Clicky because you can very easily look at individual activity, rather than G's preference for the big zoomed-out view and nonsensical marketing vomit menu labelling.

#12 Ken Fisher


    Mach 1 Member

  • 250 Posts Club
  • 432 posts

Posted Yesterday, 04:24 PM

"nonsensical marketing vomit menu labelling."


LOL. You mean all that crap I can't understand..and can't find half the time? I'm glad I'm not alone. Horrible usability.


I have about seven hours of numbers and page views per visit are close to GA, but average time on site (session?) is drastically different as is the bounce.


Avg Time per Visit / Avg Session Duration: GC 3:56 / Gorg 1:56

Bounce: GC 18% / Gorg 80.7%


I did read that link earlier EGOL. Yes, that explains things.


These numbers make me feel like I have something of value for people to look at. Prior to that I was beginning to think I can't do anything to keep a visitor around.


I'm thinking I was expecting something else on the heat map stuff. Like how many visitors actually scrolled or hit the bottom of a page? Or maybe there's a gizmo/filter to set that up.
