Geometric growth of the Linux kernel

Yesterday I read the article “Growth, Evolution, and Structural Change in Open Source Software” by Michael Godfrey and Qiang Tu. The article analyses the growth of the Linux kernel, measured in lines of code, since the first release. The statistical model the authors fitted to the uncommented lines of code is shown below. Even after crossing two million lines of code, the Linux kernel continues to enjoy geometric growth (more precisely, the fitted model is quadratic in time, so the growth is faster than linear). The model for the commented lines of code shows a similar trend.

Model: y = 0.21 * x^2 + 252 * x + 90,055

y = size in uncommented LOC

x = days since v1.0

r^2 = 0.997 (coefficient of determination calculated using least squares)
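
To get a feel for what the model implies, here is a minimal Python sketch that evaluates it, its derivative (estimated lines added per day), and the day on which it reaches a given size. The function names are my own and do not come from the paper; only the three coefficients are taken from the model quoted above.

```python
import math

# Coefficients of the quadratic model quoted above.
A, B, C = 0.21, 252.0, 90055.0


def linux_loc(days):
    """Estimated uncommented LOC, `days` days after v1.0 (hypothetical helper)."""
    return A * days ** 2 + B * days + C


def growth_rate(days):
    """Derivative of the model: estimated lines added per day."""
    return 2 * A * days + B


def days_to_reach(target_loc):
    """Larger root of A*x^2 + B*x + (C - target_loc) = 0."""
    disc = B * B - 4 * A * (C - target_loc)
    if disc < 0:
        raise ValueError("the model never takes this value")
    return (-B + math.sqrt(disc)) / (2 * A)


if __name__ == "__main__":
    for d in (0, 1000, 2000):
        print(d, round(linux_loc(d)), round(growth_rate(d)))
    print("days to reach 2,000,000 LOC:", round(days_to_reach(2_000_000)))
```

Because the estimated growth rate itself rises steadily with x, the size curve bends upward instead of following a straight line. Keep in mind that this only extrapolates the published fit; it says nothing about kernel versions outside the period the authors measured.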

Linux enjoys the active support of an ever-increasing open source developer community, which enables it to sustain such tremendous growth. More than half of the code belongs to the various drivers, which are independent of the system. The authors have also analyzed Fetchmail, the GCC compiler, and the VIM editor, and concluded that ‘the evolution of each open source system is different and cannot be generalized’.

The interesting question is: what is the trend for open source development as a whole? Is it increasing linearly, increasing geometrically, or maybe even decreasing? Successful projects like MySQL, Apache, Eclipse, SugarCRM, and OpenOffice suggest that open source as a whole must be growing at a superlinear rate. Still, a formal analysis of open source is required to validate this hypothesis.
