The Scope of Open Source Licensing





The Determinants of Open Source Licenses


We then examine the determinants of the license types employed in these contracts. We first explore the individual licensing components, and then use an index of license scope.
We begin by summarizing the distribution of projects along two measures of license scope that we discussed in Section 2: whether the license is restrictive or not and whether it is highly restrictive or not. These tabulations are challenging because some cases are complex. Some projects operate under multiple licenses: in these instances, different sections of the code may be under different licenses, or the contributor may be able to choose the license he wishes to govern his contribution. In other cases, a single license may allow a user to choose the degree of protection he wishes to have. We thus code each project as to whether all or some of the code contributed was subject to restrictive or highly restrictive provisions.
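The coding rule just described can be sketched in a few lines of Python. This is an illustration only, not the procedure actually used to build the tables. The names code_project and LICENSE_ATTRS are hypothetical, the five license entries are copied from Table 1, and “some” is read here as “at least one,” so a project whose licenses are all restrictive is also flagged as having some restrictive licenses.

```python
# Illustrative attributes for a few licenses, copied from Table 1:
# name -> (restrictive, highly_restrictive)
LICENSE_ATTRS = {
    "General PL": (True, True),
    "Lesser General PL": (True, False),
    "Mozilla PL 1.1": (True, False),
    "BSD L": (False, False),
    "MIT L": (False, False),
}


def code_project(licenses):
    """Return the four all/some flags for one project's list of license names."""
    if not licenses:
        raise ValueError("a project needs at least one license")
    restrictive = [LICENSE_ATTRS[name][0] for name in licenses]
    highly = [LICENSE_ATTRS[name][1] for name in licenses]
    return {
        "all_restrictive": all(restrictive),
        "some_restrictive": any(restrictive),  # "some" = at least one (assumption)
        "all_highly_restrictive": all(highly),
        "some_highly_restrictive": any(highly),
    }


# A hypothetical dual-licensed project: part GPL, part BSD.
flags = code_project(["General PL", "BSD L"])
```

For the dual-licensed example, only the two “some” flags are true, which is the kind of mixed case that makes the tabulations difficult.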
Table 4 highlights several patterns:

        • Highly restrictive licenses are less common for more mature projects. This pattern may reflect a “vintage effect”: it may have been more common for older projects to employ licenses other than the GPL. Alternatively, this may reflect a “survival effect”: projects with the GPL may have been less successful in attracting contributions. (When we examine the impact of these factors on different “vintages” of projects below, we will be able to shed some light on this question.)

        • Highly restrictive licenses are less common for projects operating in commercial environments such as Microsoft Windows or Apple’s Cocoa. But projects operating in the X11 environment—a network-transparent window system developed at MIT which runs on a wide range of computing and graphics machines—are more likely to be highly restrictive.

        • Highly restrictive licenses are significantly more common for projects that run under the POSIX family of operating systems, as opposed to other proprietary ones (or those which are operating system independent).

        • Consistent with the framework in Section 3.B, highly restrictive licenses are more common for applications geared towards end-users, but significantly less common for those applications aimed towards software developers. Highly restrictive licenses are also more common for projects geared to systems administrators, which may reflect either the weak community appeal of these efforts or the intrinsic preferences of the licensors (since commercial benefits are likely to be low).

        • Also consistent with the above framework, applications that are consumer oriented—e.g., desktop tools and games—are substantially more likely to have highly restrictive licenses. Those geared to the software development process are much less so. Similarly, products geared to technical users (e.g., scientific and engineering programs and database software) are less likely to have highly restrictive licenses.

        • Highly restrictive licenses are much more common for projects whose natural language is other than English, with the exception of Japanese.

When we examine in Table 5 the presence of restrictive provisions, we find a similar pattern. Exceptions include the absence of any significant pattern involving products geared to system administrators, and a somewhat different mixture of topics where restrictive provisions are commonplace.
Tables 6 and 7 then examine these patterns in a regression framework. Because the dependent variable is in each case a dummy, we employ a probit specification. For each class of variables, we omit one of the dummy variables from the specification to serve as the reference category: the dummies denoting projects in the planning stage, those operating in a Console (Text) environment, those geared towards other audiences, those whose natural language is English, those geared toward an “other” operating system, and those with an “other” topic.
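Dropping one dummy per class keeps the remaining dummies from summing to the constant term, so each reported coefficient is read relative to the omitted category. A minimal sketch of that encoding follows. The function name dummy_columns is hypothetical, and the label “Beta” is a placeholder for an intermediate development stage rather than a category taken from the tables.

```python
def dummy_columns(values, categories, omitted):
    """Build one 0/1 column per category, skipping the omitted reference category."""
    if omitted not in categories:
        raise ValueError("the omitted category must be one of the categories")
    kept = [c for c in categories if c != omitted]
    return {c: [1 if v == c else 0 for v in values] for c in kept}


# Four hypothetical projects and their development stage.
stages = ["Planning", "Mature", "Beta", "Planning"]
columns = dummy_columns(stages, ["Planning", "Beta", "Mature"], omitted="Planning")
```

Projects in the planning stage have zeros in every remaining column, so they form the baseline against which the other stage coefficients are measured.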
The primary differences in the results from those in the univariate analyses are as follows:

  • Software geared toward developers is sharply different from that geared towards other users, being much less likely to have highly restrictive licenses.

  • Among the projects less likely to have highly restrictive licenses are those related to software development, desktop applications, the Internet, multimedia, and printing. The tendency to see fewer such licenses in Internet-related projects is consistent with the arguments concerning standard setting above.

  • Projects whose natural language is Japanese are far less likely to have highly restrictive licenses, while German and Spanish ones are much more likely to be so.

The results in Table 7 are similar, with the exception again of no significant pattern involving products geared to system administrators, and a somewhat different mixture of topics where restrictive licenses are commonplace.
These effects are not only statistically significant, but economically meaningful as well. Consider, for instance, the first regression in Table 6. A project in the planning stages (the omitted case) has a 12% higher predicted probability of all licenses being highly restrictive than one in the mature stages. A project geared towards individual end-users has a 23% higher probability of all licenses being highly restrictive than one oriented to developers.
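Differences in predicted probability of this kind follow from the probit form P(y = 1 | x) = Φ(x'β), where Φ is the standard normal distribution function. The sketch below shows the calculation for a single dummy variable. The baseline index and the coefficient are made-up numbers chosen for illustration and are not taken from Table 6, and the function names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_dummy_effect(baseline_index, coefficient):
    """Change in predicted probability when a dummy switches from 0 to 1.

    baseline_index is x'beta with the dummy set to 0; coefficient is the
    probit coefficient on the dummy.
    """
    return normal_cdf(baseline_index + coefficient) - normal_cdf(baseline_index)


# Hypothetical values: baseline index 0.2, dummy coefficient -0.5.
effect = probit_dummy_effect(0.2, -0.5)
```

Because Φ is nonlinear, the same coefficient implies a different change in probability at a different baseline, which is why such effects are reported for a stated reference case.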
The regression analysis in Table 8 looks at restrictive and highly restrictive licenses in a single specification. To do this, we employ indexes, which measure whether the project has various licensing provisions. Because of the ambiguities surrounding the interpretation of cases where there are alternative licenses, we proceed in two ways. In the first regression, the index takes on the value 4 if all licenses are highly restrictive; 3 if some are highly restrictive; 2 if all licenses are restrictive but none are highly restrictive; 1 if some are restrictive but none are highly restrictive; and 0 otherwise. In the second regression, the index takes on the value 2 if all licenses are highly restrictive; 1 if all are restrictive and some (but not all) are highly restrictive; and 0 otherwise.
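The two index definitions can be written out as short functions of the all/some flags. This is a sketch under stated assumptions rather than the code behind Table 8. The function names are hypothetical, and the second index follows the wording above literally, so a project whose licenses are all restrictive but none highly restrictive receives 0.

```python
def index_five_level(all_restrictive, some_restrictive,
                     all_highly_restrictive, some_highly_restrictive):
    """First index: 0 (least restrictive) to 4 (all licenses highly restrictive)."""
    if all_highly_restrictive:
        return 4
    if some_highly_restrictive:
        return 3
    if all_restrictive:
        return 2
    if some_restrictive:
        return 1
    return 0


def index_three_level(all_restrictive, all_highly_restrictive,
                      some_highly_restrictive):
    """Second index: 2, 1, or 0, following the text's definition literally."""
    if all_highly_restrictive:
        return 2
    # Reaching this point means not all licenses are highly restrictive.
    if all_restrictive and some_highly_restrictive:
        return 1
    return 0


# A hypothetical project under both the General PL and the Lesser General PL:
# all licenses restrictive, some (but not all) highly restrictive.
first = index_five_level(True, True, False, True)
second = index_three_level(True, False, True)
```

The example project scores 3 on the first index and 1 on the second, showing how the two codings treat the same mixed-license case.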
We estimate ordered logit regressions because of the nature of the dependent variable. In an ordered logit specification, a license that was rated as a “4” would be treated as having a narrower scope than one rated as a “2,” but not necessarily twice as much so. The findings in Table 8 are largely consistent with the analyses reported above, particularly those in Table 7.
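To make the point about ordering concrete, recall that an ordered logit assigns P(y ≤ j) = Λ(κ_j − x'β), where Λ is the logistic distribution function and the κ_j are increasing cutpoints. Only the order of the categories enters the model, never the numerical distance between the index values. The sketch below computes the category probabilities. The cutpoints and index values are invented for illustration and are not estimates from Table 8, and the function names are hypothetical.

```python
import math


def logistic_cdf(z):
    """Logistic cumulative distribution function."""
    return 1.0 / (1.0 + math.exp(-z))


def ordered_logit_probs(linear_index, cutpoints):
    """Probabilities of categories 0..len(cutpoints) for a given x'beta."""
    if list(cutpoints) != sorted(cutpoints):
        raise ValueError("cutpoints must be in increasing order")
    # Cumulative probabilities P(y <= j); the top category closes at 1.
    cumulative = [logistic_cdf(c - linear_index) for c in cutpoints] + [1.0]
    probs = []
    previous = 0.0
    for value in cumulative:
        probs.append(value - previous)
        previous = value
    return probs


# Four hypothetical cutpoints give the five categories of the 0-4 index.
cutpoints = [-1.0, 0.0, 1.0, 2.0]
low = ordered_logit_probs(0.0, cutpoints)
high = ordered_logit_probs(1.0, cutpoints)
```

Raising the linear index shifts probability toward the higher categories, but the spacing between categories is governed by the estimated cutpoints rather than by the labels 0 through 4.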
One concern with the analysis in Table 8 is the presence of projects with multiple licenses. We explore the robustness of the results in unreported regressions. Rather than denoting projects that have “all highly restrictive” and “some highly restrictive” licenses, we treat the cases with multiple licenses in two different ways. We first re-estimate the equations, eliminating all projects that have multiple licenses. We also rerun the regressions employing the maximum degree of restrictiveness of any license. The results are little changed in either case.
We also undertook an analysis that attempted to control for the age of the open source project. As noted above, we were concerned that a survival effect might be at work: the characteristics of older projects might be different from others. This effect might lead to the conclusion that a given feature affected the choice of license, when it was actually the age that was critical.
While we do not know the date at which the project was initiated, we do have a proxy for this measure: when the project was added to the SourceForge database. (Because the database only began operations in 1999, this measure does not allow us to identify the oldest projects.) We employ this measure in several ways. Table 9 shows the most direct approach. We re-estimate the regression reported in the first column of Table 6, restricting the sample first to the oldest projects (those added to the SourceForge database in its first year of operations) and then to the youngest (those added in 2002).
The patterns relating to stage of development disappear in these regressions, underscoring the suggestion that this measure may be capturing a vintage effect. But at the same time, the key explanatory variables differ little across the time periods. Projects geared toward end-users tend to have highly restrictive licenses, while those oriented toward developers are less likely to do so. Projects that are designed to run on commercial operating systems are less likely to have highly restrictive licenses. Finally, types of projects that are likely to be attractive to consumers—such as games—are more likely to have highly restrictive licenses.
In unreported regressions, we explore the impact of time in a variety of ways. We employ dummy variables denoting the year the project was added to the SourceForge database as independent variables. We also include interaction terms between the date of inclusion and the other key independent variables. These changes have only a very modest effect on the results.
One prediction offered in Section 3.B was that projects that were born out of corporations should differ from others. We suggested that in cases where a corporation made its own code available to third parties, the license type should be particularly constraining. We examine this possibility in an exploratory analysis. From a careful examination of news stories and corporate web sites, we identified 51 entries where we could unambiguously determine that the project originated with proprietary software developed by a corporation. While the number of such cases is modest, such an approach allows us to at least tentatively explore this theoretical suggestion.
As Table 10 reports, projects that involve software developed in a corporate setting are likely to have more restrictive licenses. While the effects are in the predicted direction, and the magnitudes of the coefficients are in some cases substantial, the results never become statistically significant. Nonetheless, the results are at least suggestive.
We also address the concern that the inactive projects (ones where no code contributions are made to the SourceForge site) listed on the site are identified in a manner that introduces some biases. We rerun the regressions reported here, restricting the sample to the approximately ten thousand observations with code contributions. We also repeat the analysis, weighting the observations by a number of activity measures: the number of bugs reported, the number of active developers, and the percentile of activity of the project. While, as discussed in Footnote 25, the mixture of licenses employed changes somewhat when such weights are employed, the magnitude and significance of the key independent variables are little changed.
Another concern was that the ideological considerations discussed in Section 3.A may distort the decisions being made. To partially address this concern, we re-ran the regressions reported in Table 8, eliminating those with BSD and GPL licenses, the two licenses whose use has attracted the most polarized debate. The results remained similar: for instance, those projects geared toward end users and system administrators were likely to be more restrictive, while those oriented toward developers were significantly more permissive.



  1. Conclusions

This paper examines the scope of licensing in open source software, a topic of both academic and practical interest. We first enumerate the various considerations that should figure into the licensor’s choice of contractual terms. We highlight how the decision is shaped not just by the preferences of the licensor itself, but also by those of the community of users. For instance, a commercial company releasing software to the open source community may choose a more restrictive license because of suspicion about its ultimate intentions.
The paper then presents an empirical analysis of the prevalence and success of different types of open source licenses, employing the SourceForge database, a compilation of nearly 40,000 open source projects that has hitherto been largely unexplored by academics. The results are largely consistent with the framework above:

        • Restrictive licenses are less common for projects operating in commercial environments or that run on proprietary operating systems.

        • Consistent with the framework in Section 3.B, restrictive licenses are more common for applications geared towards end-users and system administrators, but significantly less common for those applications aimed towards software developers.

        • Also consistent with the framework, applications that are consumer oriented—e.g., desktop tools and games—are substantially more likely to have restrictive licenses. Those geared to the software development process are much less so.

        • Similarly, products geared to technical users are less likely to have restrictive licenses.

This version of the paper leaves a number of issues open, which we hope will be explored in subsequent work. In particular, two avenues seem promising for further study:



  • The first of these is getting a better understanding of the other key inputs that go into the choice of license. For instance, how does the fear of adverse outcomes such as hijacking, forking, and the failure to develop complementary software products change with the type of project, its stage of development, and the nature of the licensor? How do the license terms of complementary software products interact with the choice of license?

  • Second, the consequence of the choice of license on project success is an interesting issue. To what extent does this decision matter? It might be possible to identify cases where licensors were constrained in their choice of license, which might allow the implications of license type to be identified.



References
Bezroukov, Nikolai, 2002, “BSD vs. GPL, Part 2: The Dynamic Properties of BSD and GPL Licenses in the Context of the Program Life Cycle,” http://www.softpanorama.org/Copyright/License_classification/social_dynamics_of_BSD_and_GPL.shtml (accessed September 17, 2002).
Dodd, Jeff C., and Brian Martin, 2000, “Building a Cathedral Over the Bazaar: A Preliminary View of Certain Licensing Practices in the Open Source and Free Software Communities,” Unpublished working paper, Mayor, Day, Caldwell & Keeton.
Gallini, Nancy T., 1984, “Deterrence by Market Sharing: A Strategic Incentive for Licensing,” American Economic Review, 74, 931-941.
Gallini, Nancy, and Brian D. Wright, 1990, “Technology Transfer under Asymmetric Information,” Rand Journal of Economics, 21, 147-160.
Gandal, Neil, and Katharine Rockett, 1995, “Licensing a Sequence of Innovations,” Economics Letters, 47, 101-107.
Hammerly, Jim, Tom Paquin, and Susan Walton, 1999, “Freeing the Source: The Story of Mozilla,” in Chris DiBona, Sam Ockman, and Mark Stone, editors, Open Sources: Voices from the Open Source Revolution, Cambridge, Massachusetts, O’Reilly, pp. 197-206.
Katz, Michael L., and Carl Shapiro, 1986, “How to License Intangible Property,” Quarterly Journal of Economics, 101, 567-589.
Lee, Steve H., 1999, “Open Source Software Licensing,” Unpublished working paper, Harvard University.
Lerner, Josh, and Jean Tirole, 2002, “Some Simple Economics of Open Source,” Journal of Industrial Economics, 50, 197-234.
McGowan, David, 2001, “Legal Implications of Open-Source Software,” University of Illinois Law Review, 2001, 241-304.
Mundie, Craig, 2001, “The Commercial Software Model,” http://www.microsoft.com/presspass/exec/craig/05-03sharedsource.asp (accessed September 17, 2002).
Neukom, William H., and Robert W. Gomulkiewicz, 1993, “Licensing Rights to Computer Software,” in Technology Licensing and Litigation 1993, (Practicing Law Institute Patents, Copyrights, Trademarks and Literary Property Course Handbook Series No. G4-3897, 1993), New York, Practicing Law Institute, pp. 775-___.
Perens, Bruce, 1999, “The Open Source Definition,” in Chris DiBona, Sam Ockman, and Mark Stone, editors, Open Sources: Voices from the Open Source Revolution, Cambridge, Massachusetts, O’Reilly, pp. 171-188.
Rockett, Katharine E., 1990, “Choosing the Competition and Patent Licensing,” Rand Journal of Economics, 21, 161-172.
Shepard, Andrea, 1987, “Licensing to Enhance Demand for New Technologies,” Rand Journal of Economics, 18, 360-368.
Williamson, Oliver, 1975, Markets and Hierarchies: Analysis and Antitrust Implications, New York, The Free Press.
Williamson, Oliver, 1985, The Economic Institutions of Capitalism, New York, The Free Press.

Figure 1: Illustration of license choice.




Table 1: Open source software licenses. The table summarizes all Open Source Initiative-approved licenses, as well as selected others. The final two columns indicate the number of observations of each license type in the SourceForge database.


License Name                                  Restrictive?  Highly        Observations  Observations with
                                                            Restrictive?  in Sample     Activity Data

OSI Approved Licenses
Apache Software L                             N             N             301           121
Apple Public Source L 1.2                     Y             N             15            3
Artistic L                                    N             N             736           223
BSD L                                         N             N             1,708         618
Common PL                                     Y             N             34            18
Eiffel Forum L                                Y             N             5             3
General PL                                    Y             Y             18,133        5,801
IBM PL 1.0                                    Y             N             33            7
Intel OSL                                     N             N             10            6
Jabber OSL                                    Y             N             20            7
Lesser General PL                             Y             N             2,501         1,047
MIT L                                         N             N             395           151
MITRE Collaborative Virtual Workspace L (a)   Y             Y/N           5             1
Motosoto L                                    Y             N             0             0
Mozilla PL 1.0                                Y             N             229           76
Mozilla PL 1.1                                Y             N             134           62
Nethack PL                                    Y             N             16            6
Nokia OSL                                     Y             N             5             2
Open Group Test Suite L                       N             N             1             0
Python (CNRI) L                               N             N             162           53
Python Software Foundation L                  N             N             0             0
Qt PL                                         Y             N             136           39
Ricoh Source Code L                           Y             N             5             3
Sleepycat L                                   Y             N             5             2
Sun Industry Standards Source L (b)           N             N             26            9
Sun PL                                        Y             N             0             0
University of Illinois/NCSA OSL               N             N             1             1
Vovida Software L 1.0                         N             N             1             0
W3C L                                         N             N             0             0
X.Net L                                       N             N             0             0
Zope PL 2.0                                   N             N             125           47
zlib/libpng L                                 N             N             0             0

Other/Proprietary                             ?             ?             531           220

Public Domain                                 N             N             820           244


Definitions:
Restrictive: Y implies that the source code from modifications to the program must be made available.
Highly Restrictive: Y implies that the program cannot be compiled with proprietary programs.

Abbreviations:
L = License
OS = Open Source
PL = Public License

Notes:
(a) Licensees can choose between two possible options.
(b) Deviations from certain industry standards, however, must be documented.