Which country blogs the most?
Just some thoughts on an international report on blogging by Sysomos I saw today (blog post here). I’m always really interested in international comparisons of blogging, but this one - based on the methodological details they give anyway - is not as useful as I hoped.
To be fair, they are also faced with the perennial problem of not knowing the total population (of blogs), so whatever sample is taken can ultimately only be said to represent itself. And I don't know their precise way of collecting data, which may have taken into account some of the critiques outlined below. So I'd love to hear from anyone with more details on the data sampling and so on.
• It says "Over 100 million blog posts analyzed" - this does not mean the same as 100 million blogs. For example, I have about 291 blog posts on my blog. Measuring blog posts alone would favour countries that have an older blogging population; and also blogs that have permalinked posts (most do, but not all). So, I wonder how they identified individual blogs.
• They also say that "a third of blog posts are from the U.S." If 29.2% of all bloggers are from the US, and 33.3% of the blog posts are - then, if the sample works out at roughly one post per blogger overall, each US blogger has on average a bit more than one post (33.3/29.2 ≈ 1.14). Which would make sense if they have taken a 'snapshot' of the blogosphere I suppose - i.e. over one day or something. But it still raises questions in my mind - that would mean on one day almost 10% (see chart below) of Americans were blogging/had a blog (assuming one blog per person, which is not always the case). Which seems like a lot.
• How did they identify blogs? Is it based on URL - e.g. those on the blogspot or WordPress domains, or perhaps identifying the software used (e.g. WordPress, or Serendipity)? What happens to blogs on self-hosted domains? Could this introduce a bias - for example - against certain countries or languages (e.g. China)?
• They say "We analyzed more than 100 million blog posts that provided information about their age, gender and location information." I would have some questions here too:
**When they say "blog posts", does that include the sidebar information, or just the blog post?
For example: a blog post (such as this) may have the words: "I love living in Malaysia, because I am a man who loves good food, and when you’re over forty like me, good food is important." Or, the blog post may have nothing revealing in it, but on the sidebar the profile says: "Male, Malaysia, Age: 40". If you're counting all the sidebars, that's a lot of duplicates.
**Perhaps the location data is based on some IP address analysis? I don’t know how that would work, but if they could work out where the last update was made from, that would be a good bet. Otherwise, many blogs are hosted in the US, but belong to people outside (e.g. all those on blogspot).
• Anyway. A final thing. I was interested to see that Malaysia ranked 14th in the number of blog posts. Which is not bad for a smallish Asian country. But I thought it would be interesting to get some idea of the proportionate number of blog posts - because America has a population of about 310m, so it’s not surprising there is more blog activity. So here are a couple of tables (population figures taken from the CIA World Factbook)
The first table is derived from Sysomos; I have assumed the total number of blog posts was 100 million, from which I get the "Putative no. of blog posts" per country.

Then, dividing the blog posts by population, we get the proportion of blog posts to the population - giving us in theory the ranking of the countries where blogging is the most popular

So according to this - Sweden is at the top, and Malaysia comes first in Asia! (OK I dunno why China isn’t in there either, maybe they can't search in Chinese?)
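For anyone who wants to redo the sums, here is a minimal sketch of the calculation described above: turn each country's percentage share into a putative number of posts (assuming, as I did, a total of exactly 100 million), then divide by population and rank. The function names are my own invention for illustration, and the only figures filled in are the two quoted in this post (the US share of 33.3% and a population of about 310m); the other countries' shares and populations would have to come from the Sysomos report and the CIA World Factbook.

```python
# Assumption from the post: the Sysomos sample is treated as exactly 100 million posts.
TOTAL_POSTS = 100_000_000


def putative_posts(share_percent, total_posts=TOTAL_POSTS):
    """Estimated number of posts for a country, given its percentage share."""
    return share_percent / 100 * total_posts


def posts_per_capita(share_percent, population, total_posts=TOTAL_POSTS):
    """Putative posts divided by population (posts per person)."""
    return putative_posts(share_percent, total_posts) / population


def rank_by_posts_per_capita(countries, total_posts=TOTAL_POSTS):
    """Rank countries from most to fewest posts per person.

    `countries` maps a country name to (share_percent, population).
    Returns a list of (name, posts_per_person) tuples, highest first.
    """
    rates = [
        (name, posts_per_capita(share, population, total_posts))
        for name, (share, population) in countries.items()
    ]
    return sorted(rates, key=lambda item: item[1], reverse=True)


# The only figures quoted in the post itself: US share 33.3%, population ~310m.
us_posts = putative_posts(33.3)
us_rate = posts_per_capita(33.3, 310_000_000)
print(f"{us_posts:,.0f} putative US posts")
print(f"{us_rate:.3f} posts per person in the US")
```

Bear in mind this ranks posts per person, not bloggers per person, so it carries all the caveats above about what the sample actually represents.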