January 21, 2004

Link propagation and "discovery credit"

Posted by Seb Paquet

William Blaze has a post up on Abstract Dynamics titled Amplification and Stratification, tracing the linkflow in blog space, in which he analyzes how a link to Linton Freeman's article, "Visualizing Social Networks" in the Journal of Social Structure, was passed from weblog to weblog until it had reached quite a few eyeballs. He cites it as an example of blog-enabled amplification but points out that some things were lost in the process. In particular, credit to the original publisher of the article, to the source of the link, and to the blogger who originally dug it up didn't propagate widely along with the link itself. (Go read the post.)

I agree with Blaze that this is an instance of a general problem, and this connects to recent discussions of fairness in weblogs. For instance, as he points out, within the "political economy of linking" there can be incentives not to point to one's sources. While there's a general norm of bloggers linking to sources, the practice is not universal and few chains of credit go all the way, with the unfortunate consequence that promising sources can remain obscure for longer than they would otherwise.

Unlike chains of oral gossip, however, blogs are on the public record, and this is another area where blog crawlers can perhaps help a little bit. For instance, the Technorati page for the link in question enables us to trace it back to William's post (but unfortunately no further).

A few questions spring out from this. It is generally accepted that giving credit for creation is important; does the same hold for "link discovery credit"? Will (should) the practice of linking to sources of links come to be taken very seriously by bloggers, out of a shared concern to keep things fair and transparent, in a similar manner to standards of citation in academia? Should one link to the immediate source or make an effort to trace links back to the original source? (Is it always clear which is "the" original source?)
[Addendum, by Clay: It's worth noting that the Freeman link appeared on many-to-many after I found it on, not on a blog as Blaze surmises. More on this here.]

Comments (17) + TrackBacks (0) | Category: social software


1. Nick Levay on January 21, 2004 5:03 PM writes...

I posted this in a comment to an earlier post about social network visualization, but it seems relevant here too. We recently posted a collection of graphs and statistics based on the activity present in the MemeStreams community during 2003.

MemeStreams tracks the flow of links between the system's users, builds a model of the social network present on the system, and gives users tools capable of aggregating content based on it. The main page is a "democratic view" of what's currently most popular in the system, but the view the Reputation Agent gives is unique to every user.

Our system does track several metrics that would certainly qualify as "link discovery credit". We feel that the practice of tracking the sources of links will be taken seriously by bloggers as useful tools that can grok the information start to appear. We consider MemeStreams one of those tools, although still a very young and immature one. MemeStreams currently does not see outside its "walled garden" of proof of concept code.

Permalink to Comment

2. Alex on January 21, 2004 6:29 PM writes...

There is enough of a critical mass of MT users (including this blog) that it might be fun to play with a plugin to automate and store (as metadata if not visibly) the attribution of discovery. I've done some experimenting with a system of examining and categorizing *text* in this way, as well as explicit links. The question, which I'm now grappling with, is whether you can model post hoc ergo propter hoc. That is, how many times do you see a 1-2 pattern of link mentions between two blogs before you can surmise that 2 is reading and repeating 1? (Or are they just both reading 0 at two different times during the day?)
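
A minimal Python sketch of the 1-2 counting described above, offered as an illustration and not as the plugin itself. The `(blog, url, timestamp)` input shape, the function names, and the `min_count` threshold are all assumptions made for this example. It only counts how often one blog mentions a link before another; it cannot tell whether both were simply reading a third blog, so the "reading 0" question stays open.

```python
from collections import Counter, defaultdict
from itertools import combinations


def count_lead_follow(mentions):
    """Count, for each ordered pair (a, b), the links a mentioned before b.

    `mentions` is an iterable of (blog, url, timestamp) tuples. This input
    shape is an assumption made for the sketch.
    """
    first_seen = defaultdict(dict)  # url -> {blog: earliest timestamp}
    for blog, url, ts in mentions:
        seen = first_seen[url]
        if blog not in seen or ts < seen[blog]:
            seen[blog] = ts

    pairs = Counter()
    for seen in first_seen.values():
        # Order the blogs by when each one first mentioned this link.
        ordered = sorted(seen.items(), key=lambda item: item[1])
        for (a, ta), (b, tb) in combinations(ordered, 2):
            if ta < tb:  # identical timestamps carry no ordering information
                pairs[(a, b)] += 1
    return pairs


def likely_followers(pairs, min_count=3):
    """Return pairs (a, b) where a led b at least `min_count` times and
    more often than b led a. The threshold is arbitrary."""
    return sorted(
        (a, b)
        for (a, b), n in pairs.items()
        if n >= min_count and n > pairs.get((b, a), 0)
    )
```

Raising `min_count` trades recall for fewer coincidental pairs; where to set it is exactly the open question.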

Permalink to Comment

3. fat kid on January 21, 2004 7:11 PM writes...

I should probably not even be wasting the comment space on this, as I generally have no idea what I'm talking about. However, blogger (yes, I know) has a cool feature called "blog this" - I'm curious if there's a way to track this back to the previous "referring link" or something to give credit where credit is due, as the source typically doesn't lie too deep in the chain. There are a number of downsides to this I realize, one probably being the technical impossibility, but another large one being, you'd have to get the folks at blogger/google to buy in on it.

Feel free to delete my comment if it makes no sense. :-D

Permalink to Comment

4. Mike G on January 21, 2004 7:11 PM writes...

If you're going to give credit for link discovery, then you soon have the absurdity of some obscure blogger claiming co-equal credit with Tom Friedman for his latest piece, as if no one would have seen the NYT editorial page without Blogdude's imprimatur... the kind of technologically-based who's-linking-who that Alex talks about is far more interesting than the kind of inane "First!"-shouting you see on, say.

Permalink to Comment

5. bryan on January 21, 2004 8:49 PM writes...

I always try to cite the original source (i.e., NYT, WaPo, Newsweek, CNN, Yahoo! whatever), and the place where I found it. If you're really interested, most of the places where I found it include a reference to where they found it. It can get quite confusing if I'm trying to trace a piece back through numerous bloggers. Feh! I'll give you the original source and one secondary source. Possibly two if the place I found it was just a quote.

Permalink to Comment

6. Jay Currie on January 21, 2004 9:18 PM writes...

Where a statement comes from, whether it is the New York Times or Bob's blog, I always link to source. If I have found the link at another blog I will "Hattip or via" that blog.

This makes sense if someone wants to check to see if what I am quoting is properly contextualized. (And it also gives credit where it's due.)

But I don't think it makes any sense to go more than one hattip back. If the blog I get the material from hattips then it will go back up the chain.
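
The idea that one hat-tip per post is enough, as long as each blog in turn credits its own source, can be sketched in a few lines of Python. The `via` mapping and the function name are hypothetical; the sketch assumes each post records at most one credited source.

```python
def trace_to_source(post, via):
    """Follow hat-tips from `post` back as far as the record goes.

    `via` maps a post to the post it credited; a post that credited
    nobody has no entry. Returns the chain, starting with `post`.
    """
    chain = [post]
    seen = {post}
    while chain[-1] in via:
        nxt = via[chain[-1]]
        if nxt in seen:  # posts crediting each other in a loop: stop here
            break
        chain.append(nxt)
        seen.add(nxt)
    return chain
```

The chain ends wherever a blog stopped crediting its source, which is the weak point of relying on single hat-tips.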

Permalink to Comment

7. Vik Rubenfeld on January 21, 2004 9:25 PM writes...

Your readers may be interested in a new forum, where Bloggers on all blogging systems meet and talk about subjects of mutual interest. It's called Forum4Bloggers, and has already been recognized by GeekPress.

Sys Admin

Permalink to Comment

8. triticale on January 21, 2004 9:33 PM writes...

As a new, and aspiring to be up-and-coming, blogger, I always try to give discovery credit on anything I have reason to comment on. I figure that the favor will be returned, and I have indeed gotten "hat-tips" for some of the weird stuff I find outside the blogosphere.

Permalink to Comment

9. anon on January 21, 2004 11:55 PM writes...

Discovery credit is interesting, but I usually find it only of value when the author of the post that has brought the original source to light has something unique to say about it - the value add, so to speak. Otherwise it is merely the blog equivalent of the email header information which we so frequently find gets in the way (unless, of course, we recognize the name of a friend or famous figure in the header, but it is rather more infrequent that this matters to the discussion at hand.)

I suppose this is part of my ingrained bias against mere "linkers". I usually have extensive access to a wide variety of sources, and see much of what appears in link list type blogs long before the individuals in question. I am interested in unique authorship and interpretation most of all - the insight which makes that individual worth reading above and beyond the source itself.

These days, it's often for the quality of the summary as much as anything else. I find I do not have as much time as I would like to read the 1500 plus pages of text a day that comes across my desktop and must ruthlessly triage. The collaborative filtering environment of the blogsphere is good for that, at least.

However, manners dictate that the hat tip is desirable when easily accomplished, as for many aspirants it may be the only tangible recognition (outside traffic numbers) that the effort they expend on their blog has an impact. As for using such credit for any sort of automated tracking or analysis of meme dissemination patterns - this is an overreach, to be sure, both in terms of the technology and the requirement.

And while we seek to explore the emergent properties from an academic perspective, the spectre of such metadata used in litigation is frightening and likely inevitable (say for copyright infringement or other alleged IP law violations, expanded targets for defamation and libel suits, or the deliberate attempts by tyrants and censors to eradicate the free and open speech that threatens them from the refuge of the faceless whispers on a virtual wind passing between those many islands in the blogsphere - or in the net as a whole.)

Permalink to Comment

10. Tom Coates on January 22, 2004 3:45 AM writes...

It's a difficult thing to automate, because - of course - just because something is posted second doesn't mean that it was directly inspired by the first link. That much should be clear from watching links to after a keynote when normally dozens of people - completely without prompting from each other - link to the same pages of new products.

Practically, the emerging form of link-logs has had an impact on via links and giving credit. This has happened for a variety of reasons, one being that many people have organised their sites so that a link-log post constitutes nothing more or less than a link and some link-text. Whether or not that's justifiable is another matter of course.

With regards to the etiquette issue, I tend to believe that there's a clear but unenforceable idea that you should link through to the person from whom you got the link, but I'm beginning to wonder whether or not this is justifiable in the medium term. It could be that we are reaching a time where the commentary or the supplementary content is more important than just the finding of links, where a link should be given because of the value-added, not simply because it was the original source of the link in question. This does - in fact - appear to be the way things are going.

Permalink to Comment

11. Catfish N. Cod on January 22, 2004 8:54 AM writes...

I pondered this problem a good while ago. Please examine my own (now defunct due to work) blog, Every major link for a post that I used, I tracked back as far as I could; then I linked each blog/site in succession.

For instance, on a post relating to an MSNBC interview of David Kay, I would write at the top of the post:

(Link path: Sgt. Stryker, Pejmanesque, Instapundit, Andrew Sullivan, MSNBC)

...and link to each post, in order, that I'd followed to find the original story.

If I followed several paths, I could use a bracket to indicate branches, like this:

(link path: The Command Post, Winds of Change, The Agonist, {U.S. News & World Report; Sydney Morning Herald})

The Agonist linked to two major stories in his post; I linked 'em both.

I'm sure software could be set up to spider Technorati or someplace and automate this process; I did it by hand, which is tiring and time-killing. Most people wouldn't do it. But it could probably be easily integrated as a module into blogging software. If it became ubiquitous, you wouldn't even need to trace every link, just add your name to the front of an ever-lengthening chain of back-reference posts.

In a way, this is similar to isnad, the original citation system developed for the compilation of the Muslim hadith ("sayings"). (A good article is at .) From the article:

"An isnad is in the form "A said that B said that C wrote (in a lost work) that D read in (a work known to exist but also lost) E that Muhammad had said...", where A, B, C, D, E were known figures with known histories."

Same problem, same solution. Convergent evolution!
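
As a rough illustration of the link-path notation in this comment, here is a small Python sketch. It assumes a path is held as a list whose items are either a site name or a list of site names for a branch; the representation and both function names are invented for the example and do not come from any existing blogging software.

```python
def format_link_path(path):
    """Render a link path in the bracketed style used above.

    Each item of `path` is a site name, or a list of site names when a
    post pointed to several stories (a branch).
    """
    parts = []
    for step in path:
        if isinstance(step, str):
            parts.append(step)
        else:
            parts.append("{" + "; ".join(step) + "}")
    return "(Link path: " + ", ".join(parts) + ")"


def extend_link_path(my_blog, inherited_path):
    """Add your own blog to the front of a chain inherited from the post
    you found the link on, leaving the inherited list untouched."""
    return [my_blog] + list(inherited_path)
```

For example, `format_link_path(extend_link_path("My Blog", ["Instapundit", "MSNBC"]))` gives `(Link path: My Blog, Instapundit, MSNBC)`.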

Permalink to Comment

12. anon on January 22, 2004 11:29 AM writes...

"In a way, this is similar to isnad, the original citation system developed for the compilation of the Muslim hadith" (Catfish N. Cod)

Excellent choice of historical systems analogy. It also illustrates the fundamental problems that emerge in a system for citation in which the path of the citation accretes more complexity (and demands more attention) than the source itself. One does not wish to create something that will impose upon the blogsphere the kind of debates over the authenticity and primacy of sourcing as one finds in discussions of Islamic jurisprudence, for example.

The same problem / same solution does have a superficial appeal because we are used to thinking of these discussions as aspects of a citation problem in a world of works which are ephemeral at best. However, this begins to touch on the root impermanence question that underlies all web content, not just the blogsphere - it is a system of transient communications, not of formal archives (the excellent work of the wayback project aside).

It is my experience that systems which better enable aspects of the transient communications for the right people at the right time fare better than ambitions towards formal archiving which expend tremendous effort to impact users of only a small percentage of its content. (The major exception, which in a way does prove the rule, being Google's archive of the Usenet - a fantastic resource but one that was possible only through the scale achieved by a major search provider - thus the failure of the earlier MyDeja implementation in a stand alone format.)

That said, I myself prefer that which helps to overcome the unbearable lightness of blogging - or of other transient communications which I know inevitably will be lost to time. I unfortunately find myself in a minority of users - and thus forced to expend much time and effort keeping extensive personal files towards this end.

Permalink to Comment

13. cbryton on January 24, 2004 4:42 PM writes...

Perhaps the answer lies in the practice of a Brazilian lawyer I know, who heads his page with instructions on the norms of citation to be used when referring to his articles. The Creative Commons license requires attribution at minimum, but has little to say about what form that should take. Know anybody who's ever enforced that provision of their license? Look: Not all of us are academics weaned on Turabian, or the APA style guide, or the system of uniform citation taught in law schools. But I don't think it's so much a question of fairness to the "discoverer" (who gives a damn who got up earliest to read the New York Times?) as of a loss of value as the link gets disseminated: A post that doesn't lead me back to the primary sources isn't worth that much to me, any more than a State of the Union address that cites bogus intelligence to justify the actions of a government, for example. In the case of such a post, I might credit it as having brought the subject to my attention, and then attempt to assemble all the proper facts and citations on my own from other sources. It's a matter of individual responsibility (journalistic ethics), and, among discerning consumers of information, ought to enhance that individual's reputation, status, and value in the marketplace of ideas. But then again, once any idiot can publish on the Internet, most people publishing on the Internet will be idiots. And a lot of them will acquire authority they don't deserve, the susceptibilities of human nature being what they are.

But what're ya gonna do? If you slept through that academic writing prerequisite I taught for six years, I guess there's nothing more I can do.

Permalink to Comment

14. cbryton on January 24, 2004 6:47 PM writes...

Also a propos: HowToQuote from NetVillage.

Permalink to Comment

15. Seb on January 25, 2004 7:12 PM writes...

Open Access News editor and Philosophy prof. Peter Suber wrote cogently (as usual) about this very topic two years ago:

"When I summarize a news story or scholarly article in the newsletter, I
have two sources. One is the full news story or full-text article I'm
summarizing. The other is the web page, email list, discussion forum,
current awareness service, blog, or friend who alerted me to the
former. When I first learn about an article in the same place or from the
same organization that published it, then the two sources are the same.

I always credit the first kind of source by linking to the full story or
full-text article.

How hard should I try to credit the second kind of source?

The rules of scholarly citation generally cover the first but not the
second kind of source. Most non-scholarly electronic publications also
disregard or hide the second kind of source. These practices may be
justified by the traditional function of a citation: to help readers
deepen the inquiry, follow-up the author's discussion, read further on the
same subject, verify that the author has quoted or paraphrased accurately,
or examine the provenance of the cited facts or the credentials of the
cited authority. Giving readers access to the first kind of source is
necessary for these tasks, while giving them access to the second kind is not.

But now nearly all blogs give credit to both kinds of sources, and a
growing number of online newsletters are doing the same. Their reason is
not to help readers follow-up the specific item containing the citation,
but to show gratitude to the labor of others and to introduce readers to
resources they might find useful. These are good reasons and I share
them. Citing two sources rather than one is more time-consuming for the
writer and perhaps more confusing for the reader, but it has benefits that
might outweigh these costs."

Permalink to Comment

16. Roberta on March 30, 2004 3:35 AM writes...

This is an interesting source considering the article published in MSN "Social Networking in the Digital Age"

Permalink to Comment

17. Seb on June 12, 2004 2:19 PM writes...

There is now an online petition for crediting sources:

The text says: "In order for the blogosphere to be taken seriously as a news medium, bloggers themselves should commit to crediting/sourcing the other members of our community who break legitimate news. In signing this petition, you indicate that you promise to abide by these practices from today on forward."

Permalink to Comment

