
Social Network Analysis of the Social Media Club - Kuala Lumpur

SMCKL is a group that meets occasionally to explore matters relevant to social media and industry. The most recent meeting was about social media monitoring tools, and featured three presentations by comScore, Brandtology and JamiQ. They were interesting, but I was surprised that nobody was talking about social network analysis - so I thought I'd do a little demonstration here.

There was much tweeting going on before and after the evening, which was also an occasion for people to meet and network. Using NodeXL, I gathered all the tweets with the hashtag #smckl: in all there were 71 tweeters, and 757 'edges' (i.e. links in the form of 'Followed' relationships, 'Mentions', or 'Replies to'). The following examples only take into account the Followed relationship - i.e. I am only showing a link between tweeters when one follows the other.

A question for social media monitoring has to be: how influential is any particular tweeter? Here I'll look at two ways of visualising that.

Followers
A common measure is how many followers a tweeter has.
[Image: NodeXL visualisation of the #smckl Twitter network]

In these images, the size of the profile picture is proportional to the number of followers - the bigger the profile picture, the more followers. Also, the more central the tweeter is, the more ties s/he has with the other tweeters. The person in the middle is the most embedded in the network, with the most ties to other people, directly or indirectly. On the other hand, as you can see, some tweeters are right on the edge, with only a couple of lines attaching them to the denser cluster in the middle. They are outliers, less likely to be influential within this group.

The first picture was very dense, so I have filtered out all tweeters with fewer than 500 followers:
[Image: NodeXL visualisation of the #smckl Twitter network, tweeters with at least 500 followers]

and then all those with fewer than 1,000 followers:
[Image: NodeXL visualisation of the #smckl Twitter network, tweeters with at least 1,000 followers]

Again, a pattern emerges of a denser cluster in the middle with a few outliers. What this suggests is that most people at the SMCKL evening already know each other. But not all: I said above that outliers are less likely to be influential within that group - it's important to note here that the person with the most followers (@victorliew) is an 'outlier'. This suggests that he could be an important 'bridge' for this group to connect to another group. The question would be - who is he? And why are so many people following him?

How can 10,000 unique visitors mean an audience of 100?

A distinct advantage of internet advertising is the ability to accurately measure the audience (through page views), and to know precisely how many people took an interest in the ad by clicking on it. 'Click fraud' (simulating different people by repeated clicking) is detected by automated software, and 'unique visitors' (based on the IP addresses) deals with the problem of the same person refreshing a page in order to simulate a different person.

This is how Google has made billions of dollars, so it must be pretty reliable overall.

However, how can 10,000 unique visitors equal an audience of 100? To answer this, we have to consider the network within which the ad is displayed. For this example, let's imagine a random blog advertising network called 'BlogAdNet'. BlogAdNet works by registering thousands of blogs, all of which allocate space for advertisements to be displayed automatically as and when BlogAdNet wants. BlogAdNet then goes to potential clients and says, for example, 'Our network of blogs receives 10,000 unique visitors a day'; but this does not necessarily mean 10,000 different people. Imagine a very dense network of 100 bloggers, all of whom visit each other's blogs every day - each blogger reads 99 other blogs every day. Each blog counts its own unique visitors, so the network total is 99 x 100 = 9,900. Add one other person visiting all 100 blogs (imagine BlogAdNet doing regular monitoring) and you get another 100, making 10,000. So, the 10,000 unique visitors could in fact be just 100 people, plus that one other person.
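The arithmetic can be checked with a short sketch. This is only an illustration, assuming that each blog counts its own unique visitors and that the network simply adds those counts together; the names `bloggers`, `monitor` and `visitors_per_blog` are made up for the example.

```python
# Hypothetical model: 100 bloggers who each read the other 99 blogs daily,
# plus one extra person (e.g. a BlogAdNet monitor) who visits every blog.
bloggers = [f"blogger_{i}" for i in range(100)]
monitor = "monitor"

# Each blog records its own set of unique visitors (the owner is excluded).
visitors_per_blog = {
    blog: {reader for reader in bloggers if reader != blog} | {monitor}
    for blog in bloggers
}

# Adding up the per-blog uniques gives the headline figure.
summed_uniques = sum(len(v) for v in visitors_per_blog.values())

# Counting distinct people across the whole network gives the real audience.
distinct_people = len(set().union(*visitors_per_blog.values()))

print(summed_uniques)   # 10000
print(distinct_people)  # 101
```

The gap between the two numbers is the point: the headline figure counts visits to blogs, not different people.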

I've used NodeXL (a useful social network analysis (SNA) tool that integrates with Excel) to think about a few examples that demonstrate how SNA can give more insight into the behavioural aspects of blog readers. Represented in an SNA graph, the dense network of 100 readers would look like this (except that I've scaled it down to ten users to be easier to see):
[Image: SNA graph of the dense ten-blog network]

Everyone is connected to everyone else, and nobody is more 'influential' than others.

However, this would be very unusual. Most networks are clustered - using the above ten blogs, I've chosen A, B and C as the 'top bloggers': everyone visits them, and they always visit each other (but don't visit the others). D, E and F always visit A, B and C, and each other. G, H and I are a similarly clustered sub-group. And J, who is visited by nobody (aww), always visits A, B and C (like everyone else), and also D, F, G and I.

Now, the same network, based on the same calculations, looks like this:
[Image: SNA graph of the clustered ten-blog network]

The size of the nodes is based on the 'in-degree' - i.e. the number of incoming visitors. So A, B and C are the biggest, and J the smallest.
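For anyone who wants to check the in-degree figures without NodeXL, here is a minimal sketch in plain Python. The `visits` dictionary is my transcription of the ten-blog example described above (visitor on the left, blogs visited on the right); it is not NodeXL output.

```python
# Who visits whom in the ten-blog example (visitor -> blogs visited).
top = ["A", "B", "C"]
visits = {
    "A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"],
    "D": top + ["E", "F"], "E": top + ["D", "F"], "F": top + ["D", "E"],
    "G": top + ["H", "I"], "H": top + ["G", "I"], "I": top + ["G", "H"],
    "J": top + ["D", "F", "G", "I"],
}

# In-degree: the number of other bloggers who visit a given blog.
in_degree = {blog: 0 for blog in visits}
for visitor, visited in visits.items():
    for blog in visited:
        in_degree[blog] += 1

# List the blogs from most visited to least visited.
for blog in sorted(in_degree, key=in_degree.get, reverse=True):
    print(blog, in_degree[blog])
```

A, B and C each come out with nine incoming visitors and J with none, which matches the node sizes described above.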

You can also calculate 'Betweenness'. In a network, it's not only the direct connections that matter - someone 'between' you and another person may be relaying your thoughts, or enhancing your reputation.
[Image: the same SNA graph, with nodes sized by betweenness]

So, the node J is now bigger than the nodes in the two sub-groups DEF and GHI. In theory, J could be seeing something on A's blog, and then telling others about it; or starting conversations in their comments sections and acting as a 'bridge' between sub-groups DEF and GHI. Or maybe J is just a lurker, who never says anything? The only way to find out would be to go and look at what J does. This points to one of the limitations of SNA - you can detect the presence of a link, but you don't always know what it means in practice.
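Betweenness can also be worked out by hand for a network this small. The sketch below uses Brandes' algorithm in plain Python. Two assumptions to flag loudly: I treat every visit as an undirected tie (with directed links J would score zero, because nobody visits J), and the scores are unnormalised. I don't know exactly which settings NodeXL applied, so the numbers need not match its output.

```python
from collections import deque

# Visitor -> blogs visited, transcribed from the ten-blog example.
top = ["A", "B", "C"]
visits = {
    "A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"],
    "D": top + ["E", "F"], "E": top + ["D", "F"], "F": top + ["D", "E"],
    "G": top + ["H", "I"], "H": top + ["G", "I"], "I": top + ["G", "H"],
    "J": top + ["D", "F", "G", "I"],
}

# Assumption: ignore direction, so a visit either way counts as one tie.
adj = {blog: set() for blog in visits}
for visitor, visited in visits.items():
    for blog in visited:
        adj[visitor].add(blog)
        adj[blog].add(visitor)

def betweenness(adj):
    """Unnormalised betweenness for an unweighted, undirected graph."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        dist = {s: 0}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, sharing out path credit.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each pair of blogs was counted once from each end, so halve.
    return {v: b / 2 for v, b in score.items()}

scores = betweenness(adj)
for blog in sorted(scores, key=scores.get, reverse=True):
    print(blog, round(scores[blog], 3))
```

Under these assumptions J scores higher than any of D to I, because J sits on some of the shortest routes between the two sub-groups, while A, B and C still score highest.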

The Eigenvector Centrality calculation combines the above, looking at the number of connections each blog has, and the degree of the blogs it connects to:
[Image: the same SNA graph, with nodes sized by eigenvector centrality]

E and H are now smaller, because they have fewer connections overall. J remains apparently influential, but the lack of incoming links is not reflected here.
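Eigenvector centrality can be approximated with a simple power iteration: every blog starts with the same score, and on each round a blog's new score is the sum of its neighbours' scores. The sketch below is an illustration only. It again assumes undirected ties (which would explain why J's lack of incoming links does not show up) and rescales so that the top score is 1; the function name and the number of rounds are my own choices, not NodeXL's.

```python
# Visitor -> blogs visited, transcribed from the ten-blog example.
top = ["A", "B", "C"]
visits = {
    "A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"],
    "D": top + ["E", "F"], "E": top + ["D", "F"], "F": top + ["D", "E"],
    "G": top + ["H", "I"], "H": top + ["G", "I"], "I": top + ["G", "H"],
    "J": top + ["D", "F", "G", "I"],
}

# Assumption: ignore direction, so a visit either way counts as one tie.
adj = {blog: set() for blog in visits}
for visitor, visited in visits.items():
    for blog in visited:
        adj[visitor].add(blog)
        adj[blog].add(visitor)

def eigenvector_centrality(adj, rounds=200):
    """Power iteration; scores are rescaled so the largest is 1."""
    score = {v: 1.0 for v in adj}
    for _ in range(rounds):
        # A blog's new score is the sum of its neighbours' current scores.
        new = {v: sum(score[w] for w in adj[v]) for v in adj}
        biggest = max(new.values())
        score = {v: value / biggest for v, value in new.items()}
    return score

centrality = eigenvector_centrality(adj)
for blog in sorted(centrality, key=centrality.get, reverse=True):
    print(blog, round(centrality[blog], 3))
```

With these assumptions the ranking is A, B and C first, then J, then D, F, G and I, with E and H last.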

OK, I've got to stop this, and get on with writing my thesis!! :-|

Some conclusions

The density of a blogger network tends to depend on a few factors such as: geographical location, shared cultural features, blog genre, gender, and interest. For example, Malaysian bloggers/readers are more likely to read other Malaysian blogs; or female bloggers/readers interested in fashion and makeup will read blogs that focus on that. The density will be increased when they go to events together, when they link to each other, and so on.
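One common way to put a number on density is to divide the links that actually exist by the links that could exist; for directed links among n blogs that maximum is n x (n - 1). As a rough sketch, here is that calculation for the clustered ten-blog example from earlier (the `visits` dictionary is my transcription of it, and the `density` function is hypothetical, not a NodeXL feature):

```python
# Visitor -> blogs visited, transcribed from the ten-blog example.
top = ["A", "B", "C"]
visits = {
    "A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"],
    "D": top + ["E", "F"], "E": top + ["D", "F"], "F": top + ["D", "E"],
    "G": top + ["H", "I"], "H": top + ["G", "I"], "I": top + ["G", "H"],
    "J": top + ["D", "F", "G", "I"],
}

def density(visits):
    """Share of possible directed links that are actually present."""
    n = len(visits)
    links = sum(len(visited) for visited in visits.values())
    return links / (n * (n - 1))

# The fully connected version: everyone visits everyone else.
everyone = {b: [o for o in visits if o != b] for b in visits}

clustered_density = density(visits)
complete_density = density(everyone)
print(round(clustered_density, 3))
print(complete_density)
```

The fully connected network has a density of 1, while the clustered version comes out at a little under half of that.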

If you want to measure influence on the internet, relying on classic data that is based on non-contextualised quantities is not enough. For example, saying ‘There are 5,000 mentions of new product X since we launched the campaign’ does not tell you the relative importance of each mention. You can combine that with unique visitors: ‘5,000 mentions, of which 200 were on blogs that receive more than 2,000 daily unique visitors’. But still, what if all those 2,000 visitors are part of a densely clustered network who mostly read each other’s blogs?

The subjective and 'thick' understanding of the contextual meaning of links still needs human eyes. But they can be helped by automated processes that, for example, detect key words, emotional content, etc.

What do you think? How important can SNA be in elucidating these more subjective social aspects of online interaction?

I’m still learning about SNA, and don’t know much about what happens in social media monitoring companies, so if anyone has any corrections or advice, please use the comments section below. Thanks! :-)

A historical chronology of English language blogs in Malaysia

OK, the title pretty much says it all :-)

To get an overall view of the history of blogs in Malaysia, and my fieldwork, I've made a table.

Of course, this only represents what I know of, and the events and so on that I was able to attend during my fieldwork. There are many, many thousands of blogs out there, and I can never hope to cover all of what blogs have been to all bloggers over the years.

So - I'd really appreciate any feedback! Anything I've missed out, got wrong... please tell me!

It's too long to post as a table (or rather, I don't know how to convert the Word table into html), so I've uploaded it as a pdf.

Just to give you an idea of what it looks like, here's a screenshot - click on the picture to get the full version!
[Image: screenshot of the table on the history of Malaysian blogs]