Josh Lerner and Jean Tirole*
This paper is an initial exploration of the determinants of open source license choice. It first enumerates the various considerations that should figure into the licensor’s choice of contractual terms, in particular highlighting how the decision is shaped not just by the preferences of the licensor itself, but also by those of the community of developers. The paper then presents an empirical analysis of the determinants of license choice using the SourceForge database, a compilation of nearly 40,000 open source projects. Projects geared toward end users tend to have restrictive licenses, while those oriented toward developers are less likely to have them. Projects that are designed to run on commercial operating systems and those geared toward the Internet are less likely to have restrictive licenses. Finally, projects that are likely to be attractive to consumers, such as games, are more likely to have restrictive licenses. A more tentative conclusion based on a much smaller sample is that projects that involve software developed in a corporate setting are likely to have more restrictive licenses. These findings are broadly consistent with theoretical predictions.
An extensive body of work has examined the economics of technology licensing. In particular, theoretical studies have intensely scrutinized several aspects of how profit-maximizing firms should license their intellectual property, including the timing of the licensing transaction (i.e., whether before or after the discovery has been made), whether exclusive licenses should be employed, and the nature of the fees that should be charged (e.g., the tradeoff between royalties and flat fees).1
But the question of the optimal scope of technology licenses has been much less thoroughly scrutinized. More concretely, should the licensee be free to use the technology as he sees fit, being able to commercialize follow-on inventions, or should his use be narrowly circumscribed? This paper examines this question in a special context: the licensing of open source software.
The open source process (a method of software development in which contributors freely submit code to a project leader, who in turn makes the improved code widely available) is an interesting arena in which to start thinking about license scope because the standard considerations (e.g., timing, exclusivity, fee structure) are irrelevant. Users of open source software must typically consent to a licensing arrangement, which may impose a variety of restrictions. For instance, the user may be limited in his ability to distribute a modified version of the program as a proprietary commercial product without releasing the underlying source code.2
This paper first explores the various considerations that figure into the licensor’s decision of how restrictive a license to employ. It highlights the complex set of motivations that may drive the choice of license. It then suggests that permissive licenses will be more common in cases where projects have strong appeal to the community of open source contributors, and restrictive ones commonplace when the appeal is more fragile. We suggest that projects geared towards developers may be more likely to fall into the former category, while those geared towards individual end users are more likely to fall into the latter.
The paper then presents an empirical analysis of the prevalence of different types of open source licenses. The analysis employs the SourceForge database, a compilation of nearly 40,000 open source projects that has hitherto been largely unexplored by academics. We focus on two critical characteristics of these licenses:
Whether the license requires that, when modified versions of the program are distributed, the source code must be made generally available. Such a provision is sometimes referred to as a “copyleft” provision. In the empirical analysis in this paper, we term such licenses “restrictive.”
Whether the license restricts modified versions of the program from mingling their source code with other software that does not employ such a license. Such a clause is sometimes termed a “reciprocal” or a “viral” provision. For purposes of the empirical analysis in this paper, we term this a “highly restrictive” requirement.
These licenses, it should be acknowledged, are complex legal documents that have not yet been tested in court.3 Significant ambiguities remain about their interpretation. What is critical for our analysis, however, is the relative ordering of the restrictiveness of the agreements, not their absolute restrictiveness. We will consider three classes of licenses: unrestrictive (for example, the BSD license), restrictive (e.g., LGPL), and highly restrictive (GPL). (See below for a fuller discussion of these licenses.)
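As a purely illustrative sketch of the ordinal coding described above, the two license characteristics can be mapped to the three classes as follows. The names `Restrictiveness`, `classify`, and `EXAMPLES` are hypothetical and not drawn from the paper's data work, and the sketch assumes that a "viral" provision only arises in a license that also has a "copyleft" provision.

```python
from enum import IntEnum


class Restrictiveness(IntEnum):
    """Ordinal coding of the three license classes (hypothetical names)."""
    UNRESTRICTIVE = 0
    RESTRICTIVE = 1
    HIGHLY_RESTRICTIVE = 2


def classify(copyleft: bool, viral: bool) -> Restrictiveness:
    """Map the two license provisions to a restrictiveness class.

    Assumption: a viral provision presupposes a copyleft provision,
    so the combination (copyleft=False, viral=True) is rejected.
    """
    if viral and not copyleft:
        raise ValueError("viral provision without copyleft is not modeled")
    if viral:
        return Restrictiveness.HIGHLY_RESTRICTIVE
    if copyleft:
        return Restrictiveness.RESTRICTIVE
    return Restrictiveness.UNRESTRICTIVE


# The three example licenses named in the text, as (copyleft, viral) flags.
EXAMPLES = {
    "BSD": (False, False),
    "LGPL": (True, False),
    "GPL": (True, True),
}

coded = {name: classify(*flags) for name, flags in EXAMPLES.items()}
```

Because only the relative ordering matters for the analysis, an integer-valued enumeration suffices: the comparison `coded["BSD"] < coded["LGPL"] < coded["GPL"]` holds regardless of the particular integers chosen.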
The results are largely consistent with the framework above: more restrictive licenses are more common in projects geared towards end users and in such applications as games and desktop applications. We explore the robustness of the results to the use of a variety of definitions of the independent variables. In an exploratory analysis using a much smaller sample, we examine the licensing terms of projects that are spun out of corporations. The results are at least broadly consistent with theoretical suggestions.
The Legal Foundations of Open Source Licensing
Software developers have long been able to obtain copyright protection for their works. When for-profit companies manufacture proprietary software products, these copyrighted works are typically licensed rather than sold. By licensing the software, software manufacturers can limit their liability if the product does not work effectively, and restrict the rights that the users would normally have (e.g., the ability to simultaneously run the software on several computers). (For a detailed rationale for this approach, see Neukom and Gomulkiewicz .)
In the early days of the computer software industry, however, much of the software was made available without an explicit license governing its use.4 (For a history of the open source movement, see Lerner and Tirole  and the references cited therein). By the early 1980s, programmers had become disturbed by instances of behavior that they deemed to be unethical.5
In response to these events, MIT programmer Richard Stallman developed a new approach to distributing software in the mid-1980s. Rather than dedicating the software to the public domain, he required users to license the code under the GNU General Public License, or GPL.6 This license essentially required that the program’s source code (the underlying programming commands) be freely available and that modifications to the code be allowed. One of Stallman’s major concerns, however, related to those who sought to commercialize modifications to the code. He limited the ability of software developers to undertake such activities in two critical ways: by ensuring that any derivative works remain subject to the same license and by prohibiting the mixing of open and closed source software in any distributed works. In this way, he limited the danger of commercial exploitation of these discoveries. A variant of the GPL, known as the Lesser GPL, or the LGPL, allows greater flexibility in regard to the “mixing” requirement: in particular, programs are allowed to link with (or employ) other programs or routines that are not themselves available under an open source license. In other respects, though, the LGPL is similar to the GPL.
Meanwhile, several alternative licenses were introduced:
Perl, a UNIX-based programming language that allows for the automation of many system administration tasks, was originally made available by its founder, Larry Wall, under the GPL. He soon decided that the terms were too restrictive, and developed what was termed the “Artistic License.” With a few limitations, users were free to develop commercial products based on the Perl code. Nor were any limitations placed on the mingling of proprietary and open source code.
Another variant was the family of BSD-type licenses,7 which also allowed a great deal of flexibility to users, as long as credit was given to the University of California for the underlying code in the documentation of any derivative version. BSD-type licenses, which have been adopted by many projects (including the Apache web server), are today the most popular alternative license to the GPL and the LGPL.
A further family of alternative licenses consists of those introduced by commercial companies that have “opened up” some of their proprietary code (i.e., made the source code available to open source programmers). These licenses have frequently added specialized provisions to address copyright and liability concerns of the corporate parent.
In 1998, a variety of open source leaders came together to establish a consistent set of criteria for what constituted an open source license, which they termed the “Open Source Definition.” Among the requirements for the license of a program to be considered “open source” were that:
The source code for the program must be available at little or no charge.
Redistribution of the program, in source code or other form, must be allowed without fee.
Distributions of modified software must be allowed without discrimination.
Distribution of those modifications under the same terms as the original program must be permitted.
This definition was broad enough to encompass both the GPL and those licenses that allow users greater liberty in how they use the code.8
Table 1 summarizes the leading open source licenses. For each license that has been approved as falling under the “Open Source Definition” (as well as two other broad classes of related licenses), we report, as discussed in the introduction, whether the license has what we term “restrictive” and “highly restrictive” features.9
Despite uncertainties surrounding the enforceability of open source licenses,10 it is clear that software developers care critically about the choice of license used. Decisions to switch between license types11—for instance, the WINE project’s recent move from the BSD-like X11 license to the LGPL license12—have proven intensely controversial.
The Choice of License: Some Considerations