Corante

Many-to-Many

May 16, 2005

Ontology Is Overrated: Social advantages in tagging

Posted by Clay Shirky

This spring, I gave a pair of talks on opposite coasts on the subject of categorization and tagging. The first was entitled Ontology Is Overrated, given at the O’Reilly ETech conference in March. Then, in April I gave a talk at IMCExpo called Folksonomies & Tags: The rise of user-developed classification.

I’ve just put up an edited concatenation of those two talks, coupled with invaluable editorial suggestions from Alicia Cervini. It’s called Ontology is Overrated: Categories, Links, and Tags. Though much of it is not about social software per se, I try to extend the argument that the ‘people infrastructure’ hidden in traditional classification systems is an Achilles’ heel for systems that have to operate at internet scale, and that the logic of tagging overcomes that weakness:

DSM-IV, the 4th version of the psychiatrists’ Diagnostic and Statistical Manual, is a classic example of a classification scheme that works because of these characteristics [of the user base]. DSM-IV allows psychiatrists all over the US, in theory, to make the same judgment about a mental illness, when presented with the same list of symptoms. There is an authoritative source for DSM-IV, the American Psychiatric Association. The APA gets to say what symptoms add up to psychosis. They have both expert cataloguers and expert users. The amount of ‘people infrastructure’ that’s hidden in a working system like DSM-IV is a big part of what makes this sort of categorization work.

This ‘people infrastructure’ is very expensive, though. One of the problems users have with categories is that when we do head-to-head tests (we describe something and then ask users to guess how we described it), there’s a very poor match. Users have a terrifically hard time guessing how something they want will have been categorized in advance, unless they have been educated about those categories in advance as well, and the bigger the user base, the more work that user education is.

More at Ontology is Overrated: Categories, Links, and Tags.

Comments (15) + TrackBacks (0) | Category: social software


COMMENTS

1. Frank Ruscica on May 16, 2005 2:17 PM writes...

Ontology-directed classification, then, is the happy middle ground...

3. Craig Hubley on May 16, 2005 5:09 PM writes...

You can't learn much about this by talking about it, only by doing it in practical applications that involve very high consequences for errors of classification, but also have the flexibility to change representations that just aren't working to the satisfaction of the users. While military and legal applications have the high consequences, they don't have the flexibility. So politics and psychiatry may be the muddled middle ground where you can actually change the definitions, but where the stakes are very high (people's personal freedom, social well-being and thriving of their surrounding ecosystems at stake).

Those interested in this can review Living Platform in practice - an experience paper about a whole year working with wikis in political platform formation in a minor-becoming-major Canadian federal political party. I'll answer any questions there people want to raise.

Links and tags were the only mechanisms used for organizing. There was deliberately no category nor ontology scheme. Several projects at different levels (the Imagine Halifax municipal platform, some Ontario and Nova Scotia and Alberta organizing efforts) were using similar technologies. Only now are these being combined into the ontology suitable to guide classification of the policies, protocols, etc.

After the departure of the instigators, there was a renewed emphasis on imposed categories at the Green Party of Canada itself. But this was also in the context of a general attack on the principles of a self-organized means of writing the platform - a power backlash - so if anything it is evidence that the links and tags do threaten, in time, the categories and ontologies that exist in the minds of the power figures. They just do not notice until you have stolen nearly all of their dinosaur eggs. ;-)

4. jim wilde on May 16, 2005 7:45 PM writes...

Hi,

Scale matters! Since ideas are everywhere, having the option of using a taxonomy or tagging system makes it so much easier to find and discover them.

5. jim wilde on May 17, 2005 8:32 AM writes...

"Networks in expanding cultural spaces" by Grant McCracken, at This Blog Sits at the Intersection of Anthropology and Economics, shows how Ideascape could work to connect seemingly random events. This is a post, "accidental magic", that goes from random events to using del.icio.us to find exotic solutions to exotic problems.

6. Adam Marsh on May 17, 2005 1:09 PM writes...

Great essay, I especially appreciate the helpful charts! One minor issue: after the "Tag Distributions" chart, you refer to "the characteristic long tail of people who use many fewer tags than the power taggers." I can't see a way that this is a "long tail" according to the common usage: the observation that for many distributions, the number of elements with outlying values (the "tail") may be cumulatively significant compared to the number of elements clustered near the average. More details are in http://www.econometa.com/archives/9; I'd be interested to hear your thoughts.

7. Bill Seitz on May 17, 2005 5:11 PM writes...

Your long dissection of the Yahoo taxonomy ("ontology" may not be the appropriate label) makes it sound as though "that's why more people use Google". Which I don't think is really the case.

Two stronger arguments might be:
* Yahoo only classifies "sites" or "significant sub-sites" dealing with a coherent subject. So a really valuable single HTML page that doesn't fit within the apparent scope of its containing entry will never show up in Yahoo.

* Despite only categorizing baskets of content, Yahoo could never really keep up with the flow of new content coming onto the web. Ergo new stuff would not be in there. Maybe this doesn't really matter in terms of the knowledge-augmentation potential for the result-set, but it affects perceived value, and therefore popularity of usage.

These are, of course, both side effects of the "high cost" of human infrastructure, which you *do* cover in detail...

8. Bill Seitz on May 17, 2005 5:24 PM writes...

Is the "next level" to eliminate tags altogether and just have people write free-text paragraphs (e.g. a caption for a Flickr photo), and use full-text-search technologies against that metadata?

9. Peter Jones on May 19, 2005 6:03 AM writes...

Incidentally, I wonder why Clay stops at unique identifiers. They are clearly not necessary to tagging as he describes it. Resources may be unique on the web, but paths to them often aren't, resulting in multiple possible addresses. I.e. URIs are /not unique identifiers/. We don't really have unique identifiers on the WWW.

Secondly, much of the del.icio.us appeal is in the flat simplicity of tagging, but users are restricted to that. One wonders what they would have achieved with a system that allowed them to create more complex tagging structures with similar ease. Dmoz.org?

Thirdly, Clay Shirky seems to confuse resource filing with indexing occasionally, or to overemphasise that as a driver for categorisation schemes. But another way of looking at the same pattern is to ask what the shortest meaningful categorisation path down the tree would be /that eliminates confusion in the user/ as it operates.

11. Peter Jones on May 19, 2005 6:23 AM writes...

Also, much of the reason for not using the category navigation on say, Yahoo or Google, might be down to ease of interface use and interface 'speed'. If a user could simply float up and down and across the category tree using mouse-overs as is possible in a standalone 3D app then maybe the category trees would be used more.

12. Tim on May 19, 2005 4:49 PM writes...

Really great essay. Nice to see this so clearly stated.

Peter, I was wondering recently if it would be useful to add another layer to the tagging on del.icio.us, like "location:Paris", providing metadata for the metadata, to avoid results matching say, "celebrity:Paris".

My gut tells me it's unnecessary, and that people would be more apt to search "Paris France" or "Paris Hilton" to find their desired result than to perform some advanced location search. The difference would be not adding more defined keywords, but just plain more keywords.

13. Star Lancaster on May 20, 2005 8:56 PM writes...

Semantic confirmation of the self.

14. Mark Heumann on May 21, 2005 9:24 PM writes...

The Times Literary Supplement published a scathing review of the American Psychiatric Association's DSM-IV. The review criticized the profession's practice of defining syndromes based on similar behaviors that may have very different origins--in short, creating intellectual constructs that are then treated as phenomenal reality. In other words, unicorns become horses with horns.
The review got me to thinking about the social impact and expense of such taxonomies. In the US, at least, if a syndrome appears in the DSM-IV,
1. The insur*nce companies will pay for treatment of it and therefore
2. The pharmaceutical companies will develop a drug for it, and
3. The doctors will prescribe that drug to those who manifest the symptoms.
(I recommend Dr. Simon Sobo's critique in "A Reevaluation of the Relationship between Psychiatric Diagnosis and Chemical Imbalances".)
Of course, if it doesn't appear in the DSM-IV, it doesn't get treated because no drug is developed for it because the insur*nce companies won't pay for it. It may exist, but if it isn't represented in the taxonomy, it is not professionally recognized. We database types would say, "That's not a class of information that's valuable to the enterprise."
Note: Asterisks have been entered above because this site's spam filter targeted certain terms. This too is taxonomy at work.

15. Tony on May 24, 2005 10:31 PM writes...

You have a very orderly mind. You'd make a good cataloguer.
