SmoothSpan Blog

For Executives, Entrepreneurs, and other Digerati who need to know about SaaS and Web 2.0.

Swartz Stole Academic Articles? Why Weren’t They Free In The First Place?

Posted by Bob Warfield on July 19, 2011

Aaron Swartz, one of the founders of Reddit, is garnering a great deal of attention after having been indicted on charges he stole over four million academic documents.  Because he is a quasi-celebrity and political activist, the indictment is prompting a number of articles in my blog reader, and it has certainly rung Techmeme’s bell more than once.  The details of the alleged theft are certainly titillating:

This time around, Swartz circumvented MIT’s guest registration process altogether when he connected to MIT’s computer network. By this point, Swartz was familiar with the IP addresses available to be assigned at the switch in the restricted network interface closet in the basement of MIT’s Building 16. Swartz simply hard-wired into the network and assigned himself two IP addresses. He hid the Acer laptop and a succession of external storage drives under a box in the closet, so that they would not be obvious to anyone who might enter the closet.

But here is my problem: I can’t understand why those academic articles weren’t in the public domain anyway.  Why should he have had to steal them?

I read a lot of academic articles, and it has seemed to me for a long time that having to pay a small fortune for each and every article reprint is a total ripoff and contrary to what the institutions funding the research behind those articles should stand for.  You don’t have to search very long on some academic subject before you run into a link similar to this one.  That’s $31.50 for one single academic paper that should be available for free.  It’s published by a prof from Cal. Berkeley and another from U Illinois, Urbana-Champaign.  These are two very fine institutions.  Why then do they condone charging others $31.50 to review, for academic purposes, research they paid for?  What’s the justification for doing this?

Schools should be Open Sourcing all of their research to foster sharing and the general advancement of knowledge.  Once upon a time that sharing depended on articles appearing in scholarly journals that collected in University Libraries.  Aren’t we past that?  Doesn’t it make more sense to publish on the Internet?  Shouldn’t I be able to go to any University’s web presence and find a page that has every publication any of their researchers has ever made?  Beyond that, shouldn’t the Library of Congress be collecting and making them freely available if any Federal Funding at all went into paying for the research?  Can anyone in this time of Open Source formulate a good argument for why it isn’t in the best interests of education to open up these scholarly papers?

And if we have to pay anything at all, what in the world is the justification for it being the princely sum of $31.50?  How about 99 cents?  The incremental cost of each article is zilch to add to these databases.  Give them to Amazon if the publishers can’t figure out how to do it for less than $31.50.  They won’t have a problem.

Swartz’s biggest problem is he apparently broke into a Walled Garden that should never have been built to start with.  Someone needs to start a movement to shine a light on this and push schools to publish their research in a much more modern way and with a lot less profit motive.  It would be a boon for the advancement of knowledge just to have it all searchable by the public search engines.


3 Responses to “Swartz Stole Academic Articles? Why Weren’t They Free In The First Place?”

  1. dbmoore said

    One could argue that the research underlying those articles was paid for not by the universities but by the US taxpayer, and thus the papers are required to be in the public domain …

  2. A Sharma said

    I agree with you Bob. Moreover, even from a purely selfish perspective – the professors and researchers are hurting themselves. The key to long term success of research is adoption of their ideas by others including industry and other academics. By charging $5 or $20, they are essentially locking themselves out of the market of ideas. Meanwhile the entrepreneurs (who are arguably supposed to be driven by money) are freely sharing ideas. No wonder, by way of example – research in collaboration for over 2 decades did much less than what Facebook, Twitter, Chatter, Google Plus etc. are doing.

    Academia risks irrelevance by sticking to this path – just so they can continue to fund their conferences and journals – missing the point of all this – sharing ideas.

  3. Jay Godse said

    If research is publicly funded, then the public should be able to read it for no more than the minimal cost of distribution. $1 sounds about right. After all, Apple manages to distribute songs for that much and still make huge profits.

    However, the next generation of people is bypassing this academic system. It’s called Wikipedia, and Wikipedia lets people contribute knowledge to the commons with very liberal licenses. Github enables software people to contribute working software to the commons with very liberal licenses (e.g. Apache 2.0, GPLv3, etc). With these licenses other users can study the knowledge or source code, and improve it by creating a derivative work and republishing it on Wikipedia or Github. Most academics still pooh-pooh Wikipedia, but over time it is becoming a better and better source of knowledge for the people who matter – the same people who contribute taxes which ultimately fund most research.

    That is the original intent of “pub”-lishing papers. Publishing is intended to make the knowledge public. Github and Wikipedia will win, not because their publishers are necessarily smarter, more inventive, or whatever, but because peer review is possible instantly, and because peers can work together on a promising piece of work to rapidly improve it. With many Github projects, each contributor is listed. You can also look at the history of each project to see who contributed what. I suspect that something similar is possible with Wikipedia. The whole benefit of peer review is to make research better by providing feedback. Github and Wikipedia make the archiving and transmission of new work almost instantaneous, which enables rapid feedback. Just as with research, poor work or poorly communicated work can be published, but nobody reads it, understands it, uses it, contributes to it, or cares. And it happens a lot faster.

    In fact, with Github, positive peer review is tacitly achieved when more and more people start contributing to a project or start using a project. It’s one thing to say that you think a piece of software is brilliant or useful. But it’s a lot more believable when you start using it in your own projects.
