TOM: But my big task today is to introduce Clay Shirky. Clay and I share a connected social graph that has distinct subgraphs around two different professions, starting with an undergraduate mentor, and extending for both of us through theater, and on into something or other to do with the internet. I would be hesitant to define what I do, and I certainly can't define what he does.
Clay, on the other hand, is much more social and collaborative than I am. He has 228,228 Twitter followers, 925 Facebook friends, 377 connections on LinkedIn, and claims, although I think he may be hiding behind anonymity somewhere, a few dozen articles in Wikipedia. He also has two very widely admired books on social media and collaborative construction of knowledge, an astonishingly huge collection of insightful essays and TED Talks that you can find on the web, and a hairstyle for which I have unbounded admiration. Everybody, here comes Clay Shirky.
[APPLAUSE]
[INAUDIBLE]
CLAY SHIRKY: Thank you, everybody. This doesn't sound so-- let me pick this up. Is that better? Yeah. Great. OK.
Yeah. So as Tom said, I study theory and practice of social media. This is the part of the talk where I have to say, I am not a lawyer. So what I want to talk to you today about is the collaborative construction of knowledge, the way that people can come together, find one another, find shared interests, and produce collaborative work.
And rather than talk about that in the abstract, I want to just start with a story that I think illustrates some of what's changed. And the story starts with this map. About a year and a half ago, DARPA, the Defense Advanced Research Projects Agency, decided that they wanted to see how people collaborate under conditions of imperfect information. So they put up 10 red weather balloons around the United States, each about 10 feet wide, tethered to the ground, close to the ground.
They were all visible from roadways and walkways. But nobody had this map except for DARPA. And DARPA said, here's the challenge. You've got 30 days.
If you can tell us where all 10 balloons are to within a mile of their actual location, we'll give you $40,000. That was the prize. So a team at MIT gets wind of this. They say, well, we know how to solve this problem, right?
Every balloon has been seen by somebody. We'll just pay people to tell us where the balloons are. So if you tell us where a balloon is, and you turn out to be right, and we win the prize money, we'll split the prize money with you. So now they've solved the information disclosure problem.
Now they have a marketing problem. And the marketing problem goes something like this. You need to do a national ad campaign at vanishingly low cost, and your target audience is 10 people.
Can you use billboards? No, you can't. Can you use 30-second spots on television? No, you can't. None of the traditional media is adequate to the information-finding problem that the MIT team is facing.
So they say, well, we know how to solve this problem, too. We won't just pay you if you tell us where a balloon is. We'll pay you if you introduce us to someone who tells us where a balloon is, and we'll even pay you if you introduce us to someone who introduces us to someone who tells us where the balloon is.
So now there's an incentive to spread this information because the cost of saying, hey, this thing is going on, is very low. And the potential advantage to you is you get a little payout if the MIT team wins. And so Facebook and Twitter light up with news of this campaign, and so forth.
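As a sketch of how that incentive chain works: the dollar amounts below follow published accounts of the MIT team's scheme ($2,000 per balloon to the finder, halved at each step up the referral chain); the talk itself doesn't give the figures, and the names are hypothetical.

```python
def balloon_payouts(chain, finder_reward=2000.0):
    """Pay the balloon finder, then halve the reward at each step up the
    referral chain.

    `chain` lists people from the finder back through their referrers,
    e.g. ["finder", "inviter", "inviters_inviter"].
    """
    payouts = {}
    reward = finder_reward
    for person in chain:
        payouts[person] = reward
        reward /= 2  # each referrer up the chain gets half as much
    return payouts

# A three-person referral chain for one balloon:
payouts = balloon_payouts(["alice", "bob", "carol"])
```

Because the series halves at every step, the total paid out per balloon can never exceed twice the finder's reward, so the team could promise payouts to arbitrarily long chains without risking the prize money.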
There's three important things about the MIT team strategy. First is-- oops. The first is that my clicker's not working. The first is that they won. The MIT strategy worked. They found the 10 balloons.
The second thing-- I don't know if you can see this column over here. The MIT team was the only team to correctly identify the location of all 10 balloons. All of the other teams checking in identified 9 or fewer. But the third thing was the time.
DARPA had allotted 30 days for the red balloon challenge to unfold, and the MIT team solved the problem in 9 hours. So DARPA had over-provisioned the amount of time it might take to solve this problem by a factor of 80. And that's what I mean when I say, these tools provide new ways of doing things.
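That factor of 80 is just the talk's two numbers expressed in the same units: 30 days is 720 hours, and 720 divided by 9 is 80.

```python
allotted_hours = 30 * 24   # DARPA's 30-day window, in hours
solved_hours = 9           # how long the MIT team actually took
factor = allotted_hours / solved_hours
print(factor)  # 80.0
```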
The change here isn't just that formerly hard problems have become easy. The change is that formerly impossible problems have become trivial. And what that means is that these aren't just new tools for doing things the old way. They're tools that actually change the organization that uses them.
So what happens when this comes to academic settings? There was an experiment done a few years back by the Smithsonian also looking at openness. What happens when we share more widely? What happens particularly when we take jobs that were previously reserved solely for curators, and we also invite the public in?
So the Smithsonian put up several thousand images on the photo sharing service Flickr. And these images were all copyright cleared, and so there were no issues about using them or reusing them. And they asked people on Flickr to tag the photos, to give them completely free-form labels with no preexisting ontology anywhere in sight. Just, what does this photo mean to you?
I don't know why I picked that up again since I knew it didn't work. So here are the most popular tags from that photo collection, a whole collection of tags in what's called a tag cloud, where the font size indicates something about the frequency of the tags. Here's a list of just the A's.
And here's a list rendered as if you had a very long, skinny browser. And in fact, you can't really see the tags. But it's an enormously long list produced by this cumulative user population.
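One common way to build such a tag cloud is to scale each tag's font size with the logarithm of its frequency, so wildly popular tags don't drown out the rest. A minimal sketch, with made-up counts (the real Smithsonian tag frequencies aren't given in the talk):

```python
import math

def tag_cloud_sizes(tag_counts, min_pt=10, max_pt=36):
    """Map tag frequencies to font sizes, linearly in log-frequency."""
    lo = math.log(min(tag_counts.values()))
    hi = math.log(max(tag_counts.values()))
    span = (hi - lo) or 1.0  # avoid division by zero if all counts are equal
    return {
        tag: round(min_pt + (math.log(n) - lo) / span * (max_pt - min_pt))
        for tag, n in tag_counts.items()
    }

# Hypothetical frequencies for three of the tags mentioned in the talk:
sizes = tag_cloud_sizes({"cyanotype": 120, "moustache": 45, "steampunk": 300})
```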
So going back to the list of popular tags, I want to call your attention to three in particular, "cyanotype," "moustache," and "steampunk." So the cyanotype tag brings up one of the classic problems with any collection of scholarly materials, which is that some people are interested in the contents, and some people are interested in the form. Now, photos historically get categorized by what? By the objects or images that are pictured in the photo.
But people interested in the early history of photography have to look across the metadata a different way. And this group of people coming in and tagging things on Flickr simply created a cyanotype tag to pull those kinds of photos out. Ditto "moustache." If you were interested in the history of facial hair, then there is a collection now of these photos that show you different men with mustaches.
Now, this is not the sort of thing that professional catalogers would have the time to do, not because they wouldn't be interested, there is, in fact, scholarly work on facial hair, but because that level of detail often defeats the amount of time professional curators can bring to a task like this. And then there's "steampunk," and steampunk is a brand of science fiction that assumes, essentially, that Charles Babbage's difference engine worked. It assumed that many of the inventions we take for granted in the second half of the 20th century actually unfolded in the second half of the 19th century, and imagined a kind of Industrial Revolution that-- early Industrial Revolution that mirrors many of our social settings.
So here are the photos in the steampunk category. Now, this is a completely fictional category. These are photos collected in an alternate universe. But why not? Why not let people pull that kind of value out of the corpus?
The problem with cataloging mustaches is just that curator time couldn't be given over to that. But there's no institution in the world that would generate a steampunk category using professional time. And yet, these are the nation's assets. Why can't that kind of value be created if you use this system?
And so the Smithsonian, seeing this happen, began to experiment with more forms of openness. And among other things, they put up their fish photo collection. So here is a puddingwife wrasse. They put the puddingwife wrasse up on Flickr alongside a little bit of a description of the photo.
And in the comment thread, people began to discover and report new kinds of values. So the pho-- the two comments in the middle there from the fishiologist said, oh, I'm an admin for a group called Fish Species Fishbase. And Fishbase is a scientific-- a collection of fish being cataloged by ichthyologists. And this photograph is useful for that database.
So here is a scholarly use of the material that was discovered by pull, and not by push. The Smithsonian didn't need to know about Fishbase to make this transaction happen. By making the material available, they made it possible for someone to discover it and pull it in. And the very next picture is someone saying, hey, this looks great. I put it on a purse, right?
[LAUGHTER]
So is this a serious use of scholarly materials, or is this a silly use of scholarly materials? To which the answer is, yes, right? Once you flip the bit to saying, we're going to open this up, you get all kinds of uses.
And you can't divide them into the categories that we're used to like, these people over here are serious, and those people over there are silly. In fact, once you go to a pull-based system, you make the materials open, and people find it and take the value with it that they want, you get surprises and juxtapositions like this. So I'm going to pull out three patterns from the Smithsonian experiment.
The first is, all of this stuff, the ability to identify cyanotypes, and moustache photos, and steampunk, the ability to pull fish photos into a scientific database and a fashion designer's one-off purse, all of that relies on openness. Openness is the basic choice here on top of which all the rest of this value is built. The second is that openness doesn't just produce new kinds of value. It produces surprising kinds of value.
Very often, if you talk to institutions sitting on valuable data and you say, what if you were to open this up, the question comes back, well, what would people do with it? We can't imagine what people would do if we were to, say, take the World Bank hunger data and expose it on the web, or whatever. But the answer to that question is, we don't know what people will do with it. That's why we're opening it up.
You can't predict in advance what people will do, what kind of value people will create, once you open it. And it's actually discovering that value, rather than predicting that value, that becomes the exercise an organization has to go through. And the third is that the serious and the silly get all mixed up together, that the categories we're accustomed to of being able to identify some kinds of institutions as doing good, serious, careful work, and other things as being flippant or offhand, get mixed-up in these environments. And that's actually one of the things that is hardest for institutions to get used to. As you walk institutions through needing to open up, explaining that the fish purse doesn't, in fact, invalidate the Fishbase, is part of what those institutions have to internalize.
So this experiment created value. DARPA, obviously, was able to see problem-solving in a new way. What resource are these organizations relying on? What is it that's making this work? My answer to that question has been the cognitive surplus, which is my name for the collected free time and talents of the industrialized world, coupled with a network that, for the first time, is group-oriented in a way that allows us to accumulate that free time and talent to create large collaborative projects.
So if I'm putting this forward as a new resource, I needed to be able to talk about it to say how big it was, and so forth. And there were no obvious metrics. I started writing a book, which was published in 2010, by the title Cognitive Surplus. And for that book, I needed to be able to say how big is the cognitive surplus.
So I invented a metric, and I started with Wikipedia. Wikipedia is the largest joint collaborative project most people have any kind of repeated experience with. And with my friend Martin Wattenberg, who worked then at-- studied Wikipedia then at IBM, is now at Google, we worked out a "back of the envelope" calculation to the question, how big is Wikipedia as a measure of human effort?
And the answer is, by 2008, Wikipedia had taken something like 100 million hours of human thought to create. Every edit and every article, every argument on every talk page, and every language added up to something like 100 million hours. "Back of the envelope" calculation, but it's the right order of magnitude. So that's obviously an enormous project. Anything that takes 100 million hours is a huge thing.
But how big is it relative to free time in general? So I needed something to compare it to, and I took television, number 1 use of free time worldwide. So imagine a graphic representation of television-watching. How big is that?
Wikipedia is 100 million hours. Television is 200 billion hours in the United States alone every year. Rendered graphically, Wikipedia is here. The entire 10-year history of that project is a rounding error on a single year's television-watching in a single country. This is an enormous surplus.
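The rounding-error claim is easy to check from the talk's own figures: a single year of US television time amounts to two thousand Wikipedia-sized projects.

```python
wikipedia_hours = 100e6        # ~100 million hours over the project's whole history
us_tv_hours_per_year = 200e9   # ~200 billion hours of TV watched per year in the US

ratio = us_tv_hours_per_year / wikipedia_hours
print(ratio)  # 2000.0 -- one year of US TV equals 2,000 Wikipedias
```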
And what this means is that we don't need some kind of wholesale change in behavior. We don't need a participatory age of Aquarius to see things starting to change. If a small percentage of people simply commit a small percentage of their time not just to consuming, but also to producing and to sharing, the cumulative effects of that can be enormous.
And to put that another way, in terms of the size of the surplus, we spend a Wikipedia project's worth of time every weekend just watching ads. So even small changes in the way people commit their time can yield large results. And this use of the cognitive surplus can bring about really incredible scientific, literary, artistic, and political kinds of value of the highest order.
It also will bring about a lot of lolcats, as you see. Lolcats are cute pictures of cats, made cuter with the addition of cute captions. And when you advance the thesis that the ability of people to coordinate and collaborate and commit their time to things they're interested in is going to be a big deal, the question you often get asked back, as I often do, is, what do you mean? You mean the thing with the cats? That's the thing that's going to change society?
But here's the thing. That always happens. And I don't just mean that always happens with the internet. I mean, that always happens with media. It did not take long after the invention of the printing press for the first erotic novels to appear. Once you have a financial interest in selling books, it was a fairly easy move for someone to say, hey, you know what I bet people would pay for?
[LAUGHTER]
It took another 150 years for anybody to even think of the scientific journal. So we're used to looking at the printing press as being this very big, very serious backstop of all of what we take for granted in colleges and universities. But in fact, it didn't happen that way. We got erotic novels because we're the kind of people we are, and the printing press is the kind of thing it is. That was a fairly obvious move.
We only got scientific journals when a group of people committed themselves to taking the printing press and using it in ways that were counterintuitive, given the society at the time. The printing press was actually an act-- so the scientific journal was an act of incredible imagination. We get lolcats because we're the kind of people we are, and the web is the kind of medium it is. The open question for us is, when we want to take those same tools and press them into service for something more socially engaged, what will we do?
So the academy, colleges, universities, the broad collection of institutions that think of themselves as academic or research institutions, have a pair of biases that make this a hard transition for us. One is improving and sharing knowledge in the world, making life better by generating new knowledge and sharing it. And the other is a high degree of respect for expertise and authority. And those things are in real tension.
And it's no accident, therefore, that when you see people experimenting with radical forms of openness, trying to create new kinds of value in academic settings, they're often doing it outside of the traditional structures. Right now, it looks ad hoc because it is ad hoc. Because if you want to experiment with this kind of openness, you often have to go outside the world you've grown up in, if you're in a college, if you're in a university, if you're in a research institution.
So here's an example of that happening. And I think it's one of the most interesting and important examples going on today. This is a post from Tim Gowers' weblog of a couple of years ago. And Gowers is a Fields Medal-level mathematician.
And he had a post which asked, is massively collaborative mathematics possible? He looked at the way mathematics was done, and it was still one or a small number of mathematicians typically working on their own, or in a very tight group, for years on a problem, which would then get submitted to a journal, and would go another-- between half a year and a year and a half, waiting to be refereed and published, at which point, only the people who were reading that journal would see it early. Sometimes, it would spread. Conversation would then grow up in the pages of that journal.
The whole process would involve a handful of people and take half a decade. Gowers said, what if you didn't do it that way? What if you didn't assume that that's the way mathematics was? What if you just assumed that that's the way our institutions happen to have been, but now we could try it a different way?
And in particular, there were two things Gowers emphasized. What if anybody who had anything to say about a problem could chip in, even if it was partial, even if it was just to say, I think this is probably wrong, but-- and that kind of contribution-- that kind of contribution wouldn't be a detriment to the process. That's how it would unfold.
And so out of this post, something called Polymath was formed, the Polymath Weblog. And Polymath takes on mathematical problems in massive groups. They will have a group of people who will get together and say, I think there's something here. I think there's a problem.
They'll characterize the problem. If it's interesting enough to the community, they'll post it as a specific problem. And they just number the problem.
So here's a post around the problem that they came to call Polymath 4, which was a discovery of primes problem. And they'd just run it like a wiki, and they'd just have everybody kick in and say, this is someplace I think you could start. Maybe you could do it this way, all of the different variants of the problem. And they'd work out what a solution might look like in a very large group in public.
And in the case of Polymath 4, the case of the prime-finding problem, they actually got a result that they thought was interesting, and then did what academics do next, which is they submitted it to a journal for publication, in particular, the journal Mathematics of Computation. And the editors looked at it, and they sent back this review. I don't know if you can read that with the font, but they said things like, oh, let's cut the space in runtime.
You've got an extra "is" in there. This isn't actually doing much to improve the math. This is proofreading. And I'm all in favor of proofreading and everything, but the actual math was done already.
And then my favorite comment from the editor-- "The paper does not clearly say who the author is," with the sense that, this must have been an omission because obviously, you would know who the author of a mathematical proof was. So the people back on Polymath say, well, they've asked us to give them an author list, or they won't publish us. So let's just make one up because nobody really knows.
What the editors at the journal were incapable of imagining was that the paper had come in without clearly stating who the author was because it wasn't clear who the author was. And so on Polymath, they just said, look. We're just going to put up a page. And if you think you did enough work on this problem to be listed as an author, put your name here. And professional regard, rather than any kind of standard annotation, is the thing that holds the author list together.
Scientific journals-- back from the days of Philosophical Transactions, scientific journals used to be in the business of increasing the speed and the scope at which knowledge was disseminated. And now they are in the business of decreasing the speed and decreasing the scope. In fact, the Polymath problem had not only been solved, but the results were sitting there, out in the open, for everyone to see, timestamped and edit-checked.
The function of the journal was essentially a rubber stamp. It might as well have been called, from the point of view of the authors, The Journal of I Can Haz Tenure? Because it didn't perform a syndicating function. It just performed the kind of rubber stamp that said, well, if your department needs to see something that says X, we can do X for you. But all of the valuable work that we used to rely on journals for, except for the proofreading, has actually moved onto the blog.
And this clash of open and closed, expert and amateur, how will we manage these systems, this is showing up over, and over, and over again. And because every organization assumes that whatever has happened in the outside world, their problems are special and their organizational model is special, so the things that have happened elsewhere don't apply to them, that pattern unfolds all the time. So many people in the medical profession have observed that if you search for any significant medical condition or procedure, one of the first, if not the first link Google will hand you back, is Wikipedia.
So if you search for "biopsy," the first thing you get is the Wikipedia article on biopsy. So there you go. Off to Wikipedia, an enormous amount of traffic.
And so people at Harvard decided that this was terrible, that sending people to an amateur, group-edited encyclopedia, rather than carefully-vetted information, was not the way to do it. So they said, Wikipedia has got the right idea, but they've got the wrong process. We need Medpedia.
So what we're going to do is we're going to get serious scholars to come in, volunteer for Medpedia, and create a database of actual information the public can rely on and trust. And we will attempt, in as much as we can, to displace Wikipedia as the first stop for people with medical conditions. So now we have a head-to-head test. There's one experiment in really radical global openness, and there's the other, which says, we've got the experts who are going to create the medical encyclopedia that should exist if it was written by doctors, which ours is.
So in one of these-- here is the article for the biopsy in one of these examples. This is the article for biopsy in its entirety. This is not a quote.
So that's pretty lame. It tells you not much more than you could have gotten by asking a slightly more knowledgeable friend. And then on the other of these reference works, you actually get a long list, different kinds of biopsies, the procedures, the effects, the dangers, all carefully annotated and organized. But the funny thing is, the good one is Wikipedia's version.
So why, given this list of incredible contributors, did the Medpedia version fail? Why did the biopsy article amount to no more than a half-hearted paragraph? And the answer is, experts hate talking to amateurs.
There's no reason for someone to take time out of their day to go and write an article on biopsy. Now, the articles on rare liver cancers on Medpedia, those were quite good because that's the kind of thing that an expert wanting to show their work to another expert could get credit for. But the idea of involving the general public as consumers has worked better when we also involve the general public as producers.
It doesn't mean that the general public has access to the expert information, but it does mean that the general public can frame the debate so that when the experts come along, they can add the information in a form that's actually designed for explaining things. The happiest situation for an expert to be in is, if I do nothing, nothing changes. And that's what killed Medpedia.
If there was an article of the sort that a doctor should obviously write, and nobody did anything, it would just sit there. You don't have that advantage. You don't have that regard on Wikipedia. Someone is going to write the biopsy article. And so if it's already out there and it's already getting traffic, you might as well make it better.
And this is what Medpedia missed. They already had everyone's attention who spoke English. All they had to do was have the people they recruited make the Wikipedia articles better.
But instead, their bias was, no. We're going to pull behind the castle keep. We're going to have just the experts do it. And the results were the ones you saw.
And so this is what I mean about, these are not just new tools for doing things the old way. They don't make things faster or better. These are tools that change the organizations that adopt them.
If you're really committed to moving medical information, for example, into the public sphere, you have to go to where the public already is, which brings me to the law. Now, we're all-- obviously, all citizens are affected by the law, lawyers and non-lawyers. And the law is plainly a moving, annotated body of text that matters enormously both in its expert interpretation, but also in the way that citizens understand it. Because we all bend our lives towards or away from things that are legal or illegal, even if we are not lawyers.
Now, there has been in the last 30 years an absolute revolution in shared annotation of text documents. If you look at what the programmers are doing, if you look at the way that they manage source code-- and something that I'm especially interested in right now is GitHub, which is the social layer on top of the most popular open source code management system. You see incredible abilities to manage very large-scale participatory networks of people who are annotating a shared textual document. And almost none of this right now has shown up in legal practice.
And there have been several threads in various question-answering forums asking, why is that? Why is it that the revolution in shared annotation hasn't actually come into legal circles? I'm partial to this explanation posted on Quora--
[LAUGHTER]
--a few months back, which is part of a thread that was exactly this conversation. What lawyers do all day, and particularly, legislators creating the law, is a problem very similar to what programmers are managing with source code. And yet, they don't seem to use any of these tools. But as dire as this graph seems to be, this situation is actually changing.
If you look, you can see on GitHub, on the large-scale social sharing service for source code, people are starting to post not just source code, but legal documents. So for example, the state code of Utah is up there. The New York Senate has started a project called Open Legislation, which doesn't just use GitHub as a repository, but has a citizen-friendly front end in which you can see your legislator and see which laws he or she has either sponsored or voted on.
So this is a way of actually trying to bring the law into focus with politics. I elect someone. They are my representative. What is it they're doing?
These kinds of experiments are also spreading, in some cases, to national governments, although not yet to the US government. There's an experiment in putting the laws of Germany up online. There's an experiment in putting the laws of Iceland up online. So more work needs to be done on this movement, obviously. But it is happening now.
The overlap of lawyers and GitHub is starting to move into a larger overlap in that Venn diagram. There is movement to make these documents accessible, and visible, and change-tracked, and written in markup so that they can be consumed by all kinds of tools. And that's wonderful. We need that. We need a lot of that. But that's not all we need.
Because when you go back to the Smithsonian photographs, when you look at what happens when people share these documents, it isn't just enough to have access to them. People have to be able to understand them as well. And that's the next big push. It's the thing that set Medpedia apart from Wikipedia, which is that Medpedia still fell back on the habit of experts deferring to one another, and not writing in a way, either at a length or in a language, that would help lay people understand.
And yet, that's the piece that people are crying out for, which is, how do I understand the laws being made in my name? And so the annotation movement is, I think, one of the really interesting ones right now, in part because it's so new, and in part because, as I said before, most of the attempts to do this are ad hoc and outside of traditional structures. Because it's such a strange idea that, rather than falling back on 6-point Agate type and a 52-page iTunes legal agreement that you lie and say you read just so you can get back to playing your music, the idea that you would actually go out of your way to explain to people what's going on is a remarkable change, and one that's going to take a considerable effort on top of just making the legal documents open.
So there is, obviously, great work on annotation from things like the Legal Information Institute, or the State Decoded, which, in fact, has just recently released their fourth version on GitHub. But it isn't just in the places that say, we want to set up a general institute for doing this kind of legal explanation. This is also being pulled into the environment because there are new kinds of need for individuals and businesses. Let me give you a little case study.
This is a site called Thingiverse. And full disclosure, I'm on the board of the parent company, MakerBot. MakerBot makes a 3D printer.
And in the recent release of the most updated 3D printer, what's called the Replicator II, MakerBot moved to an open-plus-closed source strategy, somewhat like Apple, which is to say, closed source in the engine, open source on the interface. And this caused great consternation among the community that uses it. But MakerBot also hosts Thingiverse, which is a collection of files you can download and print on 3D printers. So for instance, you can download and you can get the instructions and download and print this UPS truck, the parts that make this UPS truck toy.
And on Thingiverse, there was an uproar around the changed terms of service for MakerBot. So people went and looked at the terms of service on Thingiverse, and there is a little clause in Thingiverse, in the Thingiverse terms of service, that makes reference to users' moral rights. And this is a technical term.
It's a term of art. It has more to do with IP legislation, or at least enforcement in Europe, than it does here. But just that phrase, as read by the civilians, sounded like an ethical assertion over-- of Thingiverse over the users' creations, rather than just being a legal strategy for managing risk of lawsuit. And the people who went and looked at that term wrongly believed that Thingiverse had just changed those legal terms at the same time as they launched Replicator II. And they believed that, even though the terms had changed back in March, because nobody ever reads those documents.
When you click to agree, what you're really saying is, I will accept the consequences of not understanding this. And we do that over, and over, and over again. But this time was different because Thingiverse relies on community participation. It only works if people say, hey, I made this cool UPS truck, and somebody takes that same file and says, OK, I modified that to turn it into a panel truck, or, I modified that so that it had a different-shaped window. And that ability to exchange files and to work off of each other's work is common to all of these examples of openness.
So on the Thingiverse blog, the company lawyer had to do something that rarely happens. He had to go out and describe, in plain English, to his own users what they were trying to do with the legal terms, which was basically, we have a non-exclusive right to use this material if it's on Thingiverse. You can always get a copy. You can always do with it as you'd like.
And so he had to go and explain what was a carefully-constructed legal document. He had to also explain it in terms the users could understand. And that's new.
We're used to these kinds of legal documents being disclaimers of the CYA variety. In an increasing number of community-oriented cases, the legal agreement between the users and the hosting site becomes a critical piece of corporate strategy. So now there is a real business need for annotation, not just a sense that it's the right thing to do. This is about sharing information. This is about helping people understand their world.
There's now a need for annotation because if people don't understand the contract, they often won't trust you. But the language in which the contract is written, for a variety of reasons, can't be expressed in plain English. And where the law and the real world have a gap, someone has to fill that gap.
I was recently asked to review legal documents, as a member of the board of another company, for an employee who was leaving. And there was a clause in that document that said, the employee leaving will hand over all copies of the documents relevant to this company. And I wrote the CEO and I said, you understand that if he was able to do that, he should have been fired earlier because it meant he wasn't using backup software. But if he's using Dropbox, or Google Docs, or any of the rest of it, he actually can't hand over all copies.
If he's doing his job, then what the contract requires him to do isn't possible. And the CEO went to the lawyer. And the lawyer came back and said, yes, we know that this has no bearing on a world of electronic documents. But we actually can't write a contract that describes what we mean without pretending like everything is on paper and can be handed over. And those kinds of gaps, which used to be oddities of the law hidden in the back in the agate type, are now increasingly front and center.
So let me end with what I had thought of as a kind of remarkable push in this direction of annotation as a service that's come forward in the culture, but has become considerably more remarkable in the last couple of days, and that's a service called Rap Genius. You may have heard of Rap Genius because of Marc Andreessen, the man who wrote the original graphical browser for the World Wide Web. His firm, Andreessen Horowitz, recently invested $15 million in Rap Genius. And Rap Genius is simplicity itself.
It puts up rap lyrics. You go and you click on a lyric, and you can add an annotation. So here is Buju Banton's "Champion." "Watch how dem girls wahgwan." And then the translation. "Watch how the girls dance to this."
Every line can be annotated in that way. And it's up there because rappers are overpaid orators who speak in their own incomprehensible argot. And I will allow you to make the parallel with lawyers on your own.
[LAUGHTER]
But the interesting thing that just happened with Rap Genius-- and, I mean, after I'd put-- in fact, Tom will tell you. I was taking new screenshots this morning. The interesting thing that just happened to Rap Genius is, as of this morning, if you log in, you see on Rap Genius things like Pusha T's "Pain" lyrics, or PAZ's "Cott's Face" lyrics. But you also see the lyrics to the Mayflower Compact, or the lyrics to the fifth Virginia convention.
People are not waiting for Rap Genius to be turned into a general purpose platform for annotating text, which was Andreessen Horowitz's original idea. They're just jamming the legal documents into Rap Genius as it exists today. And Rap Genius adds the word "lyrics" to the end of everything.
[LAUGHTER]
So when you put the Mayflower Compact up there, it sounds like the pilgrims are the rap group, and the Mayflower Compact is their new hit song. And when you go to the pilgrim compact, you see that people are annotating this often with material pulled from, yes, Wikipedia. So I had meant to end this talk by saying, oh, Rap Genius. Look. You can look at Buju Banton.
And standing from here, you can maybe see your way to a new tool for legal annotation. And between the time we started talking about this conference and now, that has already happened. And it hasn't even happened because the people at Rap Genius said, oh, we could expand into annotation of political documents. It's happened because once the general purpose tool was up there, people started pushing legal documents into it, even though you still get the Rap Genius aura around the whole thing, the orange-on-black color scheme, and the fact that "lyrics" gets added to everything.
So I want to end where I started, which is the three things that I think the Rap Genius example shows us. One, again, openness is the baseline of everything. You can't talk about these other kinds of value if people can't get to them. And that doesn't mean complete openness for everybody all the time.
There are lots of experiments like Docracy or the crowdsourced legal annotation project that are specifically open to people who have been educated or trained as lawyers. But there's still an openness there that says they don't all work for one firm or they're not all working in one institution. The second thing is the surprises.
Rap Genius, as you can tell from the URL, was not put up to annotate the Mayflower Compact. And even though the founders have been talking about general purpose text annotation, the surprise here is that people have found their way to this tool and pushed it in the direction of legal annotation before anybody was ready for it. These are the same kinds of surprises as you see with a DARPA project being finished not in 30 days, but in 9 hours. These are the same kinds of surprises as people using a photo of a fish both for a scientific database and for textile design.
And third, and Rap Genius illustrates this point better than I ever could have imagined it would, there's no more saying, all the serious people over here, and all the silly people over there. All the real uses of this tool over here, and all the cultural uses of this tool that seem off to the side, on this side. We forget that the printing press used to be like that, too. The erotic novels and the scientific journals were all mixed-up.
It took a long time for us to regard books as being serious, because we came to agree to ignore all the stuff that wasn't. And so the reverence in which we hold the printing press is often overblown. If you took the contents of a medium-sized bookstore and you shook it out on the street with no categorization at all, you could tell someone, there's some Auden in there. But if they waded in, they'd get Chicken Soup for the Hoosier Soul.
The self-help section is vastly larger than the poetry section, but we block that out because we know how to read a bookstore. We know how to get to the serious stuff if we care about it. We're still in the confused era where the serious stuff and the silly stuff are side-by-side. And that will sort itself out not in the world so much as in our minds.
We'll be able to figure out, these people are annotating the law, and those people are annotating T-Pain's lyrics. But for now, we've got to get comfortable with the serious stuff and the silly stuff existing side-by-side because when new capabilities like this show up in the world, people are so desperate for annotation. People are so desperate for tools that take arcane language of any sort and make it comprehensible to a wide audience that they will use those tools wherever they show up.
And any institutional bias against things that don't look serious is going to end up, in this environment, being an institutional bias against taking advantage of this kind of value. And there I'll end. Thank you very much.
[APPLAUSE]
And Tom, do we have time for a couple of questions, or--
TOM: Yeah, sure. Yeah.
CLAY SHIRKY: I think we have time for a couple questions. Oh, good. Lights.
AUDIENCE: Question.
CLAY SHIRKY: Yes, sir.
AUDIENCE: My question is [INAUDIBLE] to online communities. What do you see as the difference between the ones that actually [INAUDIBLE]
CLAY SHIRKY: Right. Right.
AUDIENCE: [INAUDIBLE].
CLAY SHIRKY: Right. Right.
AUDIENCE: You talked a little bit about [INAUDIBLE].
CLAY SHIRKY: Oh, many, many, many, many more. In fact, the normal case of online community is failure, as-- and I'm sorry. Which online community are you?
AUDIENCE: [INAUDIBLE] called [INAUDIBLE]. It's an online community of about 60,000 [INAUDIBLE].
CLAY SHIRKY: Oh, wow.
AUDIENCE: [INAUDIBLE].
CLAY SHIRKY: Uh-huh. Right. And I should repeat the question, in case everybody didn't hear it, which is, when you look at communities that succeed, it looks like maybe there's a template. But then when you survey all the communities that have been tried, most of them fail. And so for every Rap Genius, you get a lot of sites that are just crickets.
The first answer, really, the main answer, is luck. I wish I could tell you different. But YouTube was just one of many services trying to do video sharing; they had a slight edge on uploading and file conversion. It happened to be the service people were using when the "Lazy Sunday" video from Saturday Night Live came along.
So Friday night, there was a competition for which video sharing service was going to be a big deal. Monday morning, it was done. It's just done. So they got lucky.
Once you get past admitting that, which nobody wants to hear, but which is the main story, the other is this. Everything, and I mean everything that you see in the community world that's big and good, started small and good, and the work went into making it bigger, rather than starting big and mediocre, and having the work go into making it better. And one of the reasons that large media companies have historically had a hard time with anything communal is that they count, if they're newspapers, in units of tens or hundreds of thousands. If they're television, they count in units of millions.
And so when you say, the way to start a community is to get three dozen people who really like something and let them talk about it until another three dozen people find their way there, no one wants to wait. They want to say, let's start with 100,000 people. That'll be great because we know what a community of 100,000 can do. If you start with 100,000, it's not a community.
And so Wikipedia started with 12 articles. Twitter started with this little intuition about sharing stuff on SMS. Flickr started with a handful of people sharing photos. You see these growth curves where--
When Facebook came along, Myspace was enormous, and the idea that they could ever catch up was seen as ridiculous. But of course, if your growth curve goes like that, then at some point, the two curves intersect. So the attempt to flood a community with so many people that it works at large scale, first crack out of the box, seems, to big organizations, to be intuitively the right thing to do because you start where you mean to end up. But in fact, it turns into something that's so anonymous and disconnected that you never get the cultural core you can build on.
And convincing people that they should do something small and good, particularly inside big organizations, is so difficult that most of the ones that have worked well inside big organizations were hidden from the bosses until they had gotten to a certain scale. Because the temptation is to just say, well, we'll just pour some Agent Orange on it and see how fast it will grow-- Agent Orange in one dose is a fertilizer, and in another dose is a defoliant, because it's the same process. And pushing scale pushes it to the defoliant end of things. Yes, sir.
AUDIENCE: [INAUDIBLE] putting your content out there and important ideas because you [INAUDIBLE] Rap Genius [INAUDIBLE] suggested giving people a cool tool for annotation, in this case, [INAUDIBLE] is also important because you don't want to let people--
CLAY SHIRKY: Yep. Yep.
AUDIENCE: [INAUDIBLE] you operate free law sites but some of us [INAUDIBLE] have the resources, we're faced with a choice. We can next put the Kentucky Supreme Court opinions online, content, or we could build a new tagging feature.
CLAY SHIRKY: Yeah, yeah, yeah.
AUDIENCE: Which do you prioritize?
CLAY SHIRKY: That's a really interesting question. Let me say first about the Smithsonian thing. I under-emphasized the fact that the tagging tool they used had been developed by Flickr. And so, in a way, there is-- I think your point is absolutely right. But in the case of the Smithsonian, they went into a fairly robust environment.
That is a really hard question to answer. What I would say is, very often, people talk about the minimum viable product, which is to say, what's the basic thing you can do? Set aside the bells and whistles for a moment.
So on Facebook, for example, in the early days, you could link to people in two ways, and you could see what they were posting on their page. Almost everything we associate with Facebook now, the wall, and the timeline, and so forth, that came later. The basic piece was this social graph. Ditto Twitter, but they had a directed social graph, and so forth.
So I would say, if you have a community that's gathered around the content you have and is working hard on it, then you have some sense of what the next tool you might work on might be. If you don't yet have that, then you have to figure out, is it because the community is not interested in the content, or because the tools aren't there yet? That's the harder problem, in a way, which is to talk to the people trying to use the site and say, would you rather we bring another state in, or would you rather we give you a tool to do X?
Reaching out and talking to your own community, I think, can't be overstated as a valuable way to do this, partly because you gather information, as you would with any research effort. But also, partly, when they see you caring, you create what's called the Hawthorne effect, which came from a study of the Hawthorne Works, a Western Electric factory outside Chicago, where they would go and tell the workers, we're doing a study on worker efficiency. And they'd lower the lights, and worker efficiency would go up.
And they'd make it brighter, and worker efficiency would go up. And they'd move breaks around, and worker efficiency would go up. And they'd cancel breaks, and worker efficiency would go up. And it turned out, the thing that made worker efficiency go up was saying, we're studying you.
None of the other stuff made anything like the difference of saying, we're paying attention. So I'd go to your community. And again, if they're working fine now, you can guess that more refined ways of working will probably animate them more.
But if they're not, I'd go and talk with them. Is it that you're not finding the content interesting, or is that you're not finding the tools useful? In many cases, they'll say, we'd like both right away, please, at no cost. I don't mean to over--
You're not going to get a neatly-sorted list either. But you will get some input here. The other thing I'd say is, to the degree that you can make it easier over time for the users to bring in the content, your organization, and this is on a longer time scale than the question you were asking, will come to focus more on the tools.
So if you say, here are the various ways you could import state laws, and we're going to make those tools open for you, and you can do some of that work, you will, A, be able to shift more to the tool and platform end of things. But B, the next set of laws that gets pulled in, the next set of documents of any sort that gets pulled in, is likeliest to be stuff that's of interest to existing users. And you won't have to guess at that anymore. And that, over the long term, is where you want to be, in part because guessing what your users will do next is a mug's game compared to letting them do what they were going to do next anyway. Yes, sir.
AUDIENCE: I want to ask about the-- when you were talking about [INAUDIBLE].
CLAY SHIRKY: Yep.
AUDIENCE: One of the things you said is we're going to have to get accustomed to living side-by-side [INAUDIBLE]. And I wonder if you feel like that distinction is even a useful one.
CLAY SHIRKY: I don't. But people think in those terms. Right.
AUDIENCE: I see what you're [INAUDIBLE]
CLAY SHIRKY: Right. Right. Yeah. No, this is right, whether the serious-silly distinction I was driving is a useful one. No. I--
I have a hard time telling you that the steampunk people are doing something less useful or less valuable than the cyanotype people. I don't feel like I'm in the position to adjudicate. I use that language, frankly, because within the academy, that's the objection that you often hear, which is, we let this stuff happen.
And then we got lolcats. I mean, you don't get the scientific journals without also getting the erotic novels. You don't get the Rap Genius without also getting the lolcats because it's the abundance itself that's at stake here. It's the ability of people to waste things that were previously precious while trying new things.
But the academy is so naturally conservative as an institution. And in many ways, we want it to be conservative. It's there to conserve historical continuity with a conversation that's been going on for millennia. But at the same time, we also have to have it adapt.
And so I split the difference by talking about the serious and silly stuff because I'm not-- the important argument isn't, I think, take the steampunk people seriously. The important part of the argument is, don't regard the presence of steampunk authors on your site as being evidence that it's failing. Don't regard the fact that a textile designer found as much value, or possibly more, out of this photo of a fish as an ichthyologist.
And frankly, if you can get people to tolerate that, to accept that those kinds of uses are going to be a side effect of openness, and at the very least, not to regard them as harmful, you're well on your way to opening it up. If you can get them to regard it as valuable, so much the better for experimentation. But from my point of view, the real hurdle is not regarding silly uses of the tool as being dispositive of the value of the service as a whole. Yes, ma'am?
AUDIENCE: [INAUDIBLE]
CLAY SHIRKY: No. Awesome. Awesome.
[LAUGHTER]
AUDIENCE: [INAUDIBLE] because she makes incredible [INAUDIBLE], but she also contributes to [INAUDIBLE] in an incredibly rich way. The same [INAUDIBLE]
CLAY SHIRKY: Right. Right. That's wonderful. What's her name?
AUDIENCE: I'm sorry?
CLAY SHIRKY: What's her name?
AUDIENCE: Penny Elrich.
CLAY SHIRKY: Penny Elrich because she just had a username on the-- that's fantastic. And there it is. Also, the value you get out of that kind of participation is often not the kind you could either plan or pay for. That's fantastic. Yes, sir.
AUDIENCE: A lot of the communities that have succeeded in creating shared content have had content that a large number of people have the ability to share.
CLAY SHIRKY: Mhm.
AUDIENCE: So this is a stochastic process of selecting from a large population of eligible people, also a population that usually has not had an opportunity of expressing themselves.
CLAY SHIRKY: Right.
AUDIENCE: As you move from those collections to collections that require some amount of expertise, and whose practitioners already have mechanisms to express themselves, trying to replicate the same kind of success becomes, I would have thought, quite a bit harder.
CLAY SHIRKY: It becomes harder. It's not clear how much harder it becomes. If you look, for example, at the Harvard Med experiment, Medpedia, that had the characteristics that you just described. And yet, everybody who came to Medpedia was biased to improve the information the average user was getting, which is to say, they had already signed up for exegesis that would reach the general public.
And the obstacle there wasn't so much the desire to participate as the inability to participate in where the public already was. So in that case, the hurdle is, are you willing, as a doctor, to go in and improve the Wikipedia article on biopsy, or will you only do it in some place that feels like a refereed journal? And so that is a hard problem. But it's culturally, I think, a different problem.
I've emphasized places where, as you say, the corpus is open. I think the shareability of the content is, frankly, less an issue than the thing you said about working professionals having existing outlets. But if you look at, say, Docracy or The Collaborative Legal Annotation Project, those are projects that assume that lawyers will get involved. And one of the surprising estuaries in this-- between the ocean of the amateur and the river of the experts-- one of the surprising middle zones turns out to be students. Wikipedia is sometimes called the ABD SmackDown. The-- yeah.
[LAUGHTER]
Essentially, it's people who know enough to make the Wikipedia article better, but have not yet finished their doctorate and are looking for any kind of writing to do other than working on the actual dissertation, which will get you to 100 million hours like that.
[LAUGHTER]
In many cases, the amount of information a member of the general public needs is perfectly clearly delivered by a knowledgeable student of the subject, rather than by the senior-most expert of the subject. And I think that some of that kind of opening up-- that we actually have a very large pool of people who don't have public platforms to talk about this material but are eager to explain it to their peers. That's turned out to be--
And again, none of this is universally generalizable. But that's turned out to be a big source of this kind of material. In fact, Wikipedia ran a project recruiting in law schools to try and improve some of its legal articles. There are a few too many jailhouse lawyers editing the Wikipedia articles, "jailhouse lawyers" in the metaphorical sense, and not enough people who've studied the subject. And so they went and recruited not law school professors, but law school students to improve those articles.
TOM: We have time for one more.
CLAY SHIRKY: One more? One more question.
Yes, sir.
AUDIENCE: Do you think contributions will get better as TV gets worse?
[LAUGHTER]
CLAY SHIRKY: It's funny. I'm not even convinced exactly that TV's getting worse. But that's Steven Johnson's territory, not mine. I think the contributions will get better as people get better at it, which is to say--
Etienne Wenger's work on Communities of Practice, the work he started in the '90s about groups of people who come together to get better at a task without having a shared boss or a shared purpose, shows that those networks turn out to be, A, incredibly powerful, and B, remarkably easily activated on the internet. There's a great study of Silicon Valley from-- gosh, 10, 12 years ago now, where a programmer who ran into a problem they were working on was far likelier to call a friend who worked for a competitor than to talk to their own boss.
And that says something about boss culture, too. But it says something about this network business. So the thing that Wenger has found-- one of the most robust findings in the Communities of Practice literature is, all other variables controlled for, when groups of people who do something together talk to each other about it, everybody gets better, not from any one mechanism. There's just a slowly rising tide.
And as people get better at the idea of contribution, and they get better at the kind of midway dance of, you've done something that I can improve on, but I don't want to piss you off, and so I'm going to edit it lightly and talk to you lightly, and together, we'll do something about it-- those kinds of tools-- and I think we'll see it on Rap Genius. I just literally found this happening. I was astonished to open it up and see these two things at the top.
We'll see what happens on Rap Genius. That will be a really interesting place to watch because those people are going to have to recruit an annotative community completely separate from the people going in and annotating Buju Banton. They are two very different kinds of texts. They have very different annotative needs. And I think if I could point to anything that will make this better that's different than better tools or better content, to the earlier question, I think it will be our growing mutual regard for one another's participation. Anyway, thank you very much, and thank you to the LII.
[APPLAUSE]
The overlapping but not synonymous categories of expertise and authority have been tied up with the concepts of access to canonical information and the right to certain forms of public speech. Who can see what information, and who can talk about it in sanctioned ways, have co-evolved with the mechanisms of access and speech themselves.
As the digital revolution alters the cost of ubiquitous access to information, and provides the material basis for every consumer of information to be a producer, the practical possibilities inherent in the current media landscape are now at odds with decades (and in some cases centuries) of assumptions about how expertise and authority interact.
Clay Shirky, Writer, Consultant, and Teacher on New Technology and Social Media at NYU, delivered this keynote address at the 2012 Law Via the Internet Conference on October 9, 2012. Shirky is today's leading voice on the social and economic impact of internet technologies.