Corante


Many-to-Many

May 22, 2003

Bursty Community Formation in Blogspace

Posted by Clay Shirky

Absolutely fascinating paper on community formation in blogspace, by Ravi Kumar, Prabhakar Raghavan, Jasmine Novak, and Andrew Tomkins, called On the Bursty Evolution of Blogspace. (Free ACM account required -- it's so worth it, just for this article.)

The authors develop a method of measuring time-stamped link-space, so that blogspace can be mapped based not just on links, but on links by date. This lets them track the formation of communities, each defined here as a dense cluster of weblogs all pointing back and forth to one another.

Using this method, they put some meat on the bones of what everyone knows:

"Within a community of interacting bloggers, a given topic may become the subject of intense debate for a period of time, then fade away. These bursts of activity are typified by heightened hyperlinking amongst the blogs involved -- within a time interval."

They then go on to identify several examples of communities coalescing in a brief period of time around a set of posts -- WannaBeGirl's blog poetry in 2000, or Dawn's Funniest/Sexiest Blogger poll from 2002. (Unsurprisingly, both examples used posts about other people to get those people's attention.)
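
To make the idea of time-stamped links concrete, here is a minimal sketch of how one might spot a burst inside a single community. It is not the paper's algorithm: the link data is invented, the function names are hypothetical, and a naive "well above the average month" threshold stands in for whatever burst model the authors actually use.

```python
from collections import Counter
from datetime import date

# Hypothetical data: (source_blog, target_blog, date_of_link). Not from the paper.
links = [
    ("a", "b", date(2002, 1, 3)),
    ("b", "a", date(2002, 3, 10)),
    ("a", "c", date(2002, 5, 1)),
    ("b", "c", date(2002, 5, 2)),
    ("c", "a", date(2002, 5, 5)),
    ("c", "b", date(2002, 5, 9)),
    ("a", "b", date(2002, 5, 20)),
    ("b", "a", date(2002, 5, 21)),
    ("a", "x", date(2002, 5, 22)),  # points outside the community, so it is ignored
]
community = {"a", "b", "c"}


def links_per_month(links, community):
    """Count links with both endpoints inside `community`, keyed by (year, month)."""
    counts = Counter()
    for src, dst, day in links:
        if src in community and dst in community:
            counts[(day.year, day.month)] += 1
    return counts


def bursty_months(counts, factor=2.0):
    """Return months whose count is at least `factor` times the mean active month.

    The threshold rule is an assumption made for illustration only.
    """
    if not counts:
        return []
    mean = sum(counts.values()) / len(counts)
    return sorted(month for month, n in counts.items() if n >= factor * mean)


monthly = links_per_month(links, community)
print(bursty_months(monthly))
```

On this toy data, May 2002 carries six of the eight in-community links, so it is the only month flagged. The point is simply that dating each link turns a static map into something you can watch change.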

They outline their method for crawling and analysing blogspace while looking for these burst-forming communities, and the algorithm looks like a useful feature for ongoing exploration of blogspace. (Paging David Sifry. David Sifry to the white courtesy telephone...) They also segment blogs by in-bound links:

"...pages linked-to by an enormous number of other pages are too well-known for the type of communities we seek to discover; so, we summarily remove all pages that contain more than a certain number of in-links."

They do this in order to differentiate between community participation and publishing (an argument I've been groping towards in Communities, Audiences and Scale, and Weblogs, Power Laws and Inequality, but the algorithms here are far more precise than my descriptions).
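
The in-link cutoff is simple enough to sketch. The version below is my own illustration under stated assumptions: the edge list is made up, `prune_popular` and `max_in_links` are hypothetical names, and the paper does not give the threshold value in the passage quoted here.

```python
from collections import Counter


def prune_popular(edges, max_in_links):
    """Drop every page with more than `max_in_links` distinct in-linking pages.

    `edges` is a list of (source, target) pairs. Returns the surviving edges
    and the set of pages that were removed as too well-known.
    """
    # Deduplicate so repeated links from the same source count once.
    in_links = Counter(dst for _, dst in set(edges))
    popular = {page for page, n in in_links.items() if n > max_in_links}
    kept = [(s, d) for s, d in edges if s not in popular and d not in popular]
    return kept, popular


# Hypothetical toy graph: "hub" is linked to by everyone, "a" and "b" talk to each other.
edges = [("a", "hub"), ("b", "hub"), ("c", "hub"), ("a", "b"), ("b", "a"), ("hub", "c")]
kept, popular = prune_popular(edges, max_in_links=2)
print(popular, kept)
```

Once "hub" is removed, what is left is the back-and-forth between "a" and "b", which is the conversational kind of linking the community search is after.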

Finally, they analyze the changes in their data set overall, and come to two remarkable conclusions: first, 2001 really was the unusual year, with the link structure at both the macro and micro levels taking a sharp jump in density.

Second, there is a core set of blogs that forms a Strongly Connected Cluster, and it is growing rapidly:

"But up to this point, blogspace is not a coherent entity -- the overall size has grown but the interconnectedness is not significant. At the start of 2001, the largest component begins to grow in size relative to the rest of the graph, and by the end of 2001 it contains about 3% of all nodes. In 2002, however, a threshold behavior arises, and the size of the component increases dramatically, to over 20% by the present day. This giant component still appears to be expanding rapidly, doubling in size approximately every three months. Clearly this growth cannot continue and must plateau within two years."

Oh, and they prove that blogspace is not a random graph, and conclude that blogspace can better be analyzed as a set of inter-networking communities than as a set of stand-alone blogs.
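
For readers who want to see what "the largest component contains 3% of all nodes" means operationally, here is a small sketch that measures the share of blogs in the largest strongly connected group, where every blog can reach every other by following links. It uses Kosaraju's two-pass algorithm as a stand-in, since the post does not say how the authors computed it; the function name and the toy graph are hypothetical.

```python
from collections import defaultdict


def largest_scc_fraction(nodes, edges):
    """Fraction of `nodes` in the largest strongly connected component.

    Assumes every endpoint in `edges` also appears in `nodes`.
    """
    fwd, rev = defaultdict(list), defaultdict(list)
    for s, d in edges:
        fwd[s].append(d)
        rev[d].append(s)

    # Pass 1: record nodes in order of DFS completion on the forward graph.
    order, seen = [], set()
    for root in nodes:
        if root in seen:
            continue
        seen.add(root)
        stack = [(root, iter(fwd[root]))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(fwd[nxt])))
                    break
            else:
                # All neighbours explored: this node is finished.
                order.append(node)
                stack.pop()

    # Pass 2: sweep the reversed graph in reverse completion order;
    # each sweep collects exactly one strongly connected component.
    best, assigned = 0, set()
    for root in reversed(order):
        if root in assigned:
            continue
        assigned.add(root)
        size, stack = 0, [root]
        while stack:
            node = stack.pop()
            size += 1
            for nxt in rev[node]:
                if nxt not in assigned:
                    assigned.add(nxt)
                    stack.append(nxt)
        best = max(best, size)
    return best / len(nodes) if nodes else 0.0


# Hypothetical toy graph: a, b, c link in a cycle; d and e hang off the end.
nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d"), ("d", "e")]
print(largest_scc_fraction(nodes, edges))
```

Run on a snapshot of the link graph for each date, a number like this is what would trace out the growth curve described in the quote.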

It's too early to tell for sure, but this paper feels absolutely seminal. I know it's a pain to set up another online account, but do it anyway, and then go read the paper. (Thanks, Hylton)
