
How can 10,000 unique visitors mean an audience of 100?

A distinct advantage of internet advertising is the ability to accurately measure the audience (through page views), and to know precisely how many people took an interest in the ad by clicking on it. 'Click fraud' (simulating different people by repeated clicking) is detected by automated software, and 'unique visitors' (based on the IP addresses) deals with the problem of the same person refreshing a page in order to simulate a different person.

This is how Google has made billions of dollars, so it must be pretty reliable overall.

However, how can 10,000 unique visitors equal an audience of 100? To answer this, we have to consider the network within which the ad is displayed. For this example, let's imagine a random blog advertising network called 'BlogAdNet'. BlogAdNet works by registering thousands of blogs, all of which allocate space for advertisements to be displayed automatically as and when BlogAdNet wants. BlogAdNet then goes to potential clients and says, for example, 'Our network of blogs receives 10,000 unique visitors a day'; but this does not necessarily mean 10,000 different people.

Imagine a very dense network of 100 bloggers, all of whom visit each other's blogs every day, so each blogger reads 99 other blogs and each blog counts 99 unique visitors. 99 x 100 = 9,900. Add one other person visiting all 100 blogs (imagine BlogAdNet doing regular monitoring) and you reach 10,000. So, the 10,000 unique visitors could in fact be 100 people, plus that one other person.
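As a rough sketch of that arithmetic in Python (the blogger names and the monitoring visitor are made up for illustration, and 'unique visitors' is assumed to be counted per blog and then added up across the network):

```python
# Hypothetical dense network: 100 bloggers who all read each other daily,
# plus one extra visitor (e.g. someone at BlogAdNet monitoring every blog).
bloggers = [f"blogger_{i}" for i in range(100)]
monitor = "blogadnet_monitor"

# Each blog keeps its own set of unique visitors.
unique_visitors = {
    blog: {reader for reader in bloggers if reader != blog} | {monitor}
    for blog in bloggers
}

# What the network reports: per-blog unique visitors, added up.
reported_total = sum(len(visitors) for visitors in unique_visitors.values())

# What an advertiser actually reaches: distinct people across all blogs.
distinct_people = set().union(*unique_visitors.values())

print(reported_total)        # 10000
print(len(distinct_people))  # 101
```

The gap between the two numbers comes entirely from counting the same readers once per blog instead of once per network.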

I've used NodeXL (a useful social network analysis (SNA) tool that integrates with Excel) to think about a few examples that demonstrate how SNA can give more insight into the behavioural aspects of blog readers. Represented in an SNA graph, the dense network of 100 readers would look like this (except that I've scaled it down to ten users to make it easier to see):
[Image: SNA graph of the ten-blog network, with every blog connected to every other]

Everyone is connected to everyone else, and nobody is more 'influential' than others.

However, this would be very unusual. Most networks are clustered. Using the above ten blogs, I've chosen A, B and C as the 'top bloggers': everyone visits them, and they always visit each other (but don't visit the others). D, E and F always visit A, B and C, and each other. G, H and I are a similarly clustered sub-group. And J, who is visited by nobody (aww), always visits A, B and C (like everyone else), and also D, F, G and I.

Now, the same network, based on the same calculations, looks like this:
[Image: SNA graph of the clustered ten-blog network, with node size based on in-degree]

The size of each node is based on its 'in-degree' - i.e. the number of incoming visitors. So A, B and C are the biggest, and J the smallest.
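For anyone who prefers code to charts, here is a minimal sketch of the in-degree count in plain Python rather than NodeXL. The visiting rules are the ones described for blogs A to J; the variable names are just for illustration.

```python
top = ["A", "B", "C"]
clusters = [["D", "E", "F"], ["G", "H", "I"]]
blogs = top + clusters[0] + clusters[1] + ["J"]

# Each pair (reader, blog) means "reader visits blog".
visits = set()
for reader in blogs:
    for blog in top:                  # everyone visits A, B and C
        if reader != blog:
            visits.add((reader, blog))
for cluster in clusters:              # D, E, F and G, H, I visit each other
    for reader in cluster:
        for blog in cluster:
            if reader != blog:
                visits.add((reader, blog))
for blog in ["D", "F", "G", "I"]:     # J's extra reading list
    visits.add(("J", blog))

# In-degree: how many different readers arrive at each blog.
in_degree = {blog: 0 for blog in blogs}
for reader, blog in visits:
    in_degree[blog] += 1

for blog in blogs:
    print(blog, in_degree[blog])
```

A, B and C each get 9 incoming visitors, D, F, G and I get 3, E and H get 2, and J gets none.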

You can also calculate 'Betweenness'. In a network, it's not only the direct connections that matter - someone 'between' you and another person may be relaying your thoughts, or enhancing your reputation.
[Image: SNA graph of the same network, with node size based on betweenness]

So, node J is now bigger than the nodes in the two sub-groups (D, E, F and G, H, I). So, in theory, J could be seeing something on A's blog, and then telling others about it; or starting conversations in their comments sections and acting as a 'bridge' between sub-groups D, E, F and G, H, I. Or maybe J is just a lurker, who never says anything? The only way to find out would be to go and look at what J does. This points to one of the limitations of SNA - you can detect the presence of a link, but you don't always know what it means in practice.
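A sketch of how betweenness could be computed for this ten-blog example, using Brandes' algorithm in plain Python. One assumption to flag loudly: the links are treated as undirected here. If direction is respected, J can never sit 'between' anyone, because nobody visits J. Whether NodeXL made the same choice for the graph is a guess on my part.

```python
from collections import deque

top = ["A", "B", "C"]
clusters = [["D", "E", "F"], ["G", "H", "I"]]
blogs = top + clusters[0] + clusters[1] + ["J"]

# Assumption: a visit in either direction counts as one undirected link.
neighbours = {blog: set() for blog in blogs}
def link(a, b):
    if a != b:
        neighbours[a].add(b)
        neighbours[b].add(a)

for reader in blogs:
    for blog in top:
        link(reader, blog)
for cluster in clusters:
    for reader in cluster:
        for blog in cluster:
            link(reader, blog)
for blog in ["D", "F", "G", "I"]:
    link("J", blog)

def betweenness(adj):
    """Brandes' algorithm for an unweighted, undirected graph."""
    score = {v: 0.0 for v in adj}
    for source in adj:
        # Breadth-first search, counting shortest paths from source.
        order, dist = [], {source: 0}
        paths = {v: 0 for v in adj}
        paths[source] = 1
        preds = {v: [] for v in adj}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, sharing out the credit.
        credit = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                credit[v] += paths[v] / paths[w] * (1 + credit[w])
            if w != source:
                score[w] += credit[w]
    # Each unordered pair was counted once from each end.
    return {v: s / 2 for v, s in score.items()}

scores = betweenness(neighbours)
for blog in blogs:
    print(blog, round(scores[blog], 3))
```

Under these assumptions J comes out at 1.0, against 0.2 for D, F, G and I and 0 for E and H, which is consistent with J appearing bigger than both sub-groups. A, B and C still score highest, since they connect everyone.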

The Eigenvector Centrality calculation combines the above, looking at the number of connections each blog has, and the degree of the blogs it connects to:
[Image: SNA graph of the same network, with node size based on eigenvector centrality]

E and H are now smaller, because they have fewer connections overall. J remains apparently influential, but the lack of incoming links is not reflected here.
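A minimal sketch of eigenvector centrality for the same ten blogs, using simple power iteration: each blog's score is repeatedly replaced by the sum of its neighbours' scores. As with betweenness, the links are assumed to be undirected, which would explain why J's lack of incoming links doesn't show up. The scaling (top score set to 1) is my own choice and not necessarily what NodeXL reports.

```python
top = ["A", "B", "C"]
clusters = [["D", "E", "F"], ["G", "H", "I"]]
blogs = top + clusters[0] + clusters[1] + ["J"]

# Assumption: a visit in either direction counts as one undirected link.
neighbours = {blog: set() for blog in blogs}
def link(a, b):
    if a != b:
        neighbours[a].add(b)
        neighbours[b].add(a)

for reader in blogs:
    for blog in top:
        link(reader, blog)
for cluster in clusters:
    for reader in cluster:
        for blog in cluster:
            link(reader, blog)
for blog in ["D", "F", "G", "I"]:
    link("J", blog)

def eigenvector_centrality(adj, iterations=100):
    """Power iteration: a blog scores highly if its neighbours score highly."""
    score = {v: 1.0 for v in adj}
    for _ in range(iterations):
        new = {v: sum(score[w] for w in adj[v]) for v in adj}
        biggest = max(new.values())
        score = {v: s / biggest for v, s in new.items()}  # top score = 1
    return score

scores = eigenvector_centrality(neighbours)
for blog in sorted(blogs, key=scores.get, reverse=True):
    print(blog, round(scores[blog], 3))
```

A, B and C come out on top, followed by J, then D, F, G and I, with E and H lowest.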

OK, I've got to stop this, and get on with writing my thesis!! :-|

Some conclusions

The density of a blogger network tends to depend on a few factors such as: geographical location, shared cultural features, blog genre, gender, and interest. For example, Malaysian bloggers/readers are more likely to read other Malaysian blogs; or female bloggers/readers interested in fashion and makeup will read blogs that focus on that. The density will be increased when they go to events together, when they link to each other, and so on.
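To put a number on 'density', one common SNA definition is the share of all possible reader-to-blog links that actually exist. The sketch below applies that definition (my assumption, since it isn't spelled out above) to the two ten-blog examples: the fully connected network and the clustered one.

```python
def density(blogs, visits):
    """Directed density: actual links divided by possible links."""
    n = len(blogs)
    return len(visits) / (n * (n - 1))

top = ["A", "B", "C"]
clusters = [["D", "E", "F"], ["G", "H", "I"]]
blogs = top + clusters[0] + clusters[1] + ["J"]

# Fully connected example: everyone visits everyone else.
full = {(r, b) for r in blogs for b in blogs if r != b}

# Clustered example: everyone visits A, B, C; sub-groups visit each other;
# J also visits D, F, G and I.
clustered = {(r, b) for r in blogs for b in top if r != b}
for cluster in clusters:
    clustered |= {(r, b) for r in cluster for b in cluster if r != b}
clustered |= {("J", b) for b in ["D", "F", "G", "I"]}

print(density(blogs, full))       # 1.0
print(round(density(blogs, clustered), 2))
```

The clustered network has 43 of the 90 possible links, so its density is a little under half that of the fully connected one.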

If you want to measure influence on the internet, relying on classic data that is based on non-contextualised quantities is not enough. For example, saying ‘There are 5,000 mentions of new product X since we launched the campaign’ does not tell you the relative importance of each mention. You can combine that with unique visitors: ‘5,000 mentions, of which 200 were on blogs that receive more than 2,000 daily unique visitors’. But still, what if all those 2,000 visitors are part of a densely clustered network who mostly read each other’s blogs?

The subjective and 'thick' understanding of the contextual meaning of links still needs human eyes. But they can be helped by automated processes that, for example, detect key words, emotional content, etc.

What do you think? How important can SNA be in elucidating these more subjective social aspects of online interaction?

I’m still learning about SNA, and don’t know much about what happens in social media monitoring companies, so if anyone has any corrections or advice, please use the comments section below. Thanks! :-)



Jess on :

Wooots! very detailed :-) u go Julian!

Huai Bin on :

Very interesting post Julian. You have some very valid points there. I understand the words but not so much the charts. I agree, this is something that has to be factored into SM.

julian on :

Jess - it could be lots more detailed!

Huai Bin - Thanks. The charts basically have links between the 'blogs' represented by the lines (there are arrows to represent the direction, but you can't really see them), and the size of the node/blog represents its relative size in terms of the calculation made.
If that's any help... :-|
