The birth of the open source enterprise stack
For the first ten years of its life, free software was largely a hacker's tool. All the early programs - Emacs, GCC, Perl, Linux - were written by coders for coders (usually themselves). It was the rapid uptake of the Internet by business in the mid-1990s that led to free software being used by companies, not just their employees.
The unplanned nature of this move online meant that computer departments were often asked to create a Web presence without being allocated extra funds. Free software was the obvious solution. The ready availability of GNU/Linux and Apache, whose first official release had appeared in December 1995, meant both were soon found in many companies, but generally unofficially. Software engineers knew it was easier simply to install the code than to go through formal approval processes that were bound to be skeptical of this new kind of software. The same was true for Samba, which allowed IT departments to add low-cost file and print servers to Windows networks.
At this stage, then, free software was on the periphery of companies, providing non-critical functions, and often invisibly as far as management was concerned. Gradually, though, word got out about the reliability and attractive price-performance characteristics of free software in general, and GNU/Linux in particular.
Similarly, software suppliers were discovering that their engineers were not only using free software but sometimes had even ported major proprietary software packages to GNU/Linux on their own initiative, as happened with Software AG's Adabas D database, which shipped in 1996 as part of the Caldera Solutions CD. This fact, together with the growing use of GNU/Linux within companies, prompted the release in 1998 of official ports of the main enterprise-level databases: those from Oracle and Informix in July, and IBM's in December. It was a significant moment in the rise of open source in companies: free software was now countenanced officially, and started to play a mission-critical role.
At the same time, free software began to provide more complex business solutions through the deployment of what came to be termed the LAMP stack: GNU/Linux, Apache, MySQL and Perl/PHP/Python - the term LAMP was coined in 1998 by Michael Kunze in the German c't magazine. The stack represented a more sophisticated version of the approach based around the earlier Common Gateway Interface, which was used to interface Web servers with external applications like databases.
MySQL had first appeared in 1995. As well as representing an important breakthrough for open source application software in the enterprise, it also brought with it a new business model. In the beginning, the copyright for open source code had either been assigned to the Free Software Foundation to allow more effective enforcement of the GNU GPL, or remained with the various individual coders who had contributed. In the case of the MySQL code, though, it is the software house MySQL AB, which was created around the software, that owns all the copyrights.
Because of this, MySQL AB is able to employ a dual-licensing policy, offering its database under the GNU GPL or a commercial license. Some have seen this development as a threat to the core ethos of the open source world, because it raises the specter of a new, more subtle kind of vendor lock-in. Although the most popular, MySQL is by no means the only free database program: others include Firebird, Ingres and PostgreSQL.
The early years of the 21st century were ones of steady gains for free software within the enterprise. In the wake of the dotcom crash, which saw first-generation open source companies like Linuxcare, TurboLinux and VA Linux scaling back their operations dramatically, there were relatively few venture capitalists or IT start-ups that were willing to take a chance on new areas of free software. But corporate use of GNU/Linux in particular flourished, as the free operating system was increasingly used to save money by allowing companies to move from expensive proprietary hardware running Unix to commodity systems based on Intel processors.
One open source company that did appear during this time was Gluecode. It offered a commercial version of Apache Geronimo, the J2EE server project of the Apache Foundation. This was an important development, because it moved open source closer to the heart of the enterprise. Gluecode received a validation of sorts in 2005, when IBM bought the company, and added the open source product to its WebSphere Application Server line as a Community Edition.
IBM presumably preferred to cannibalize its own sales rather than see another increasingly-popular open source middleware company, JBoss, do the same. The JBoss project began in 1999, and, like MySQL, introduced a novel business approach to working with open source. It effectively commissions code for free software projects by hiring their top coders, thereby adding an element of commercial direction to the open source development process that was hitherto lacking. Also like MySQL, JBoss the company generally retains the copyright in the JBoss code. The JBoss way received its own vote of confidence when the company was acquired in April 2006 for $350 million by Red Hat, after being courted by Oracle, which has been on something of an open source spending spree.
The acquisition of Gluecode and JBoss, and Oracle's interest in the latter, firmly establishes middleware as the new hotspot for enterprise open source. Alongside IBM's WebSphere Application Server Community Edition and JBoss, there are several other free programs, including Enhydra, JOnAS and WSO2 Tungsten. Together, they represent a key piece in the creation of an open source enterprise stack, with GNU/Linux as the foundation.
It is here, rather than on the desktop, that free software's next big gains are likely to take place, and a subsequent feature will explore the surprising richness of the upper layers of the emerging open source enterprise stack, in areas such as systems management, customer relationship management, business intelligence, enterprise content management, enterprise resource planning and communications.
Glyn Moody writes about open source at opendotdotdot.
The birth of the open source enterprise stack
Posted Jun 26, 2006 18:32 UTC (Mon) by dmantione (guest, #4640) [Link]
Let's put it bluntly: now the whole world uses a crap database which, despite
recent improvements, is still hopelessly behind. The whole world uses a slow,
inelegant language, PHP, which is full of pitfalls like SQL and HTML
injection vulnerabilities. Despite improvements over the years, PHP still
sucks.
The result is that there are many, many software packages out there that
are not as good as they could be. (phpBB security leaks, anyone?)
So, what should we be proud of? An insecure, unreliable web?
Seriously, there were much better alternatives available at the time
(PostgreSQL, AOLserver), which became niches, one reason for which was the
LAMP marketing.
So, LAMP a success? It is a matter of taste.
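The SQL injection pitfall mentioned above is not specific to PHP; it appears wherever user input is pasted into the text of a query. As a minimal sketch, here it is in Python with the standard `sqlite3` module rather than PHP and MySQL. The `users` table and the function names are hypothetical, chosen only for illustration.

```python
import sqlite3


def make_demo_db():
    """Build a throwaway in-memory database with two users (illustrative data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    conn.commit()
    return conn


def find_user_unsafe(conn, name):
    # Vulnerable: the input becomes part of the SQL text, so a quote
    # character in `name` can change the meaning of the query.
    query = "SELECT id, name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # The "?" placeholder passes the input as data, never as SQL.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()
```

With the input `nobody' OR '1'='1`, the unsafe version returns every row in the table, while the parameterized version returns nothing, because no user has that literal name.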
The birth of the open source enterprise stack
Posted Jun 26, 2006 19:38 UTC (Mon) by dskoll (subscriber, #1630) [Link]
I agree. PHP and MySQL suck. We use them both, unfortunately, because there are some very useful apps that demand them (eg SugarCRM). However, our database of choice is PostgreSQL, and we're moving to Perl for most product development. (Yeah, you can bash Perl, but it's a model of orthogonality compared to PHP.)
The birth of the open source enterprise stack
Posted Jun 30, 2006 7:13 UTC (Fri) by tekNico (subscriber, #22) [Link]
> Yeah, you can bash Perl
Ok, I'll take it: I bash Perl. What next, you'll ask us to dash Ruby, or even (shudder) ash Python? ;-)
The birth of the open source enterprise stack
Posted Jun 26, 2006 20:54 UTC (Mon) by jeroen (guest, #12372) [Link]
As Jeff Waugh said in his closing talk at FOSDEM, LAMP means Linux, Apache, Most of our scripting languages start with P, PostgreSQL. :-)
The birth of the open source enterprise stack
Posted Jun 26, 2006 21:53 UTC (Mon) by tjc (guest, #137) [Link]
> Seriously, there were much better alternatives available at the time (PostgreSQL, AOLserver), which became niches, one reason for which was the LAMP marketing.
I don't recall that LAMP had much in the way of marketing (other than good press), but I might not have been paying attention.
One thing that PHP and MySQL do have in common is good documentation, and they are both easy to install, configure, and use, which are things that other projects often overlook or put off, sometimes indefinitely.
The birth of the open source enterprise stack
Posted Jun 27, 2006 7:33 UTC (Tue) by nix (subscriber, #2304) [Link]
PostgreSQL's documentation isn't half bad either, of course, unless the several thousand pages of the docs are purely an illusion :)
The birth of the open source enterprise stack
Posted Jun 27, 2006 14:22 UTC (Tue) by tjc (guest, #137) [Link]
It's not the amount of documentation that's important, it's how well it's written, and -- especially important -- how well it's indexed (or searchable -- I'm a big fan of "on one page" manuals).
I can't comment on PostgreSQL's current state of documentation, but five years ago when I was doing a lot of internet programming I found MySQL and PHP both to have decent documentation, which is one of the major reasons I used them. I was able to learn enough PHP in a few days to get started, owing mostly to its C-like syntax and comprehensive online function index, and MySQL was greatly aided by Paul DuBois' excellent book on the subject.
At the time PostgreSQL was complicated to configure, and Python didn't look much like C. If I were doing internet programming today (and had more time), I might choose differently.
The birth of the open source enterprise stack
Posted Jun 29, 2006 6:29 UTC (Thu) by nix (subscriber, #2304) [Link]
PostgreSQL still is complex to configure, but only a very small amount needs to be done unless you're running a large site (in which case MySQL would need configuration too, if it coped at all).
The default config values for some things (shared memory size, notably) are laughably small: suitable for testing only. I don't know why they start out *so* small, but it's easy to change them.
The birth of the open source enterprise stack
Posted Jun 27, 2006 19:57 UTC (Tue) by hingo (guest, #14792) [Link]
I think you forget the most important reason. MySQL was available for Win32, while PostgreSQL was not (until now). Remember that only a few years ago 99% of students learning to do websites had a modem connection to the internet, and 99% of them did not use Linux on their desktop computer.
I honestly think this is the one single reason MySQL beat PGSQL. Those kids simply never got to even try PGSQL. I also think this same issue has played a major role in the popularity of many Linux desktop apps, and how ironic is that!
The birth of the open source enterprise stack
Posted Jun 27, 2006 21:04 UTC (Tue) by tjc (guest, #137) [Link]
> Remember that only a few years ago 99% of students learning to do websites had modem connection to the internet, and 99% of them did not use Linux on their desktop computer.
I'm not so sure about that. Most universities use UNIX or Linux in the student labs. Maybe some community colleges and trade schools use Windows, and self-taught people learning at home, but I wouldn't think that it would be anywhere near 99%.
The birth of the open source enterprise stack
Posted Jun 30, 2006 7:44 UTC (Fri) by grouch (guest, #27289) [Link]
"One thing that PHP and MySQL do have in common is good documentation [...]"
I would agree that PHP has good documentation, but MySQL's is atrocious. The comparison of documentation between MySQL and PostgreSQL alone is sufficient to use and recommend PostgreSQL.
The birth of the open source enterprise stack
Posted Jun 26, 2006 22:33 UTC (Mon) by davidw (guest, #947) [Link]
Marketing? Yes, but not quite like you say.
I'm a big Tcl fan (see http://antirez.com/articoli/tclmisunderstood.html for those of you who aren't familiar with the language, or formed your ideas about it in 1996), but I'm going to have to say that rather than a LAMP success, Tcl really fell down on the marketing front a few times:
*) John Ousterhout leaving the helm created lots of confusion. There was no longer a focal point for development, a Larry Wall, a Matz, someone who was the face of Tcl.
*) AOLserver was late to the open source game and didn't "play nicely" with what was already *the* web server. When I wrote mod_dtcl (subsequently Apache Rivet) it certainly wasn't the first server side Tcl system, but I think it was the first open source one, and this was in 1998, when PHP was already starting to be used...
As far as MySQL is concerned, I guess it's just a case of worse is better - "oh, I don't care about data integrity, I just need *fast*" - followed by "well, now we need to add payments, and mysql hasn't really given us trouble so far, and I guess we can handle transactions in PHP...".
The birth of the open source enterprise stack
Posted Jun 26, 2006 23:08 UTC (Mon) by khim (subscriber, #9252) [Link]
Don't know about PHP but for MySQL... they are mostly right. I've seen huge systems based on MySQL which are handling billions of $$ every year - and they do work. IMO it's mostly a matter of programming culture and less a matter of the pure SQL capability of the system used. While MySQL is not perfect (and neither is PostgreSQL) it does scale, and you can make it work - maybe not as easily as PostgreSQL, but it can be done: I've seen it done.
PHP... I've yet to see a big, complex, reliable system with heavy use of PHP. Sometimes PHP is used "on the side", but when a system becomes complex and PHP is in the middle of it... it's an inevitable disaster...
P.S. I've seen huge systems written in Python, Java (of course) and even Ruby (hmm, it works) or Perl (horrors), but PHP... nope. Don't really know why... But maybe it's the attitude? PHP is the only language where a "security update" can break half of the scripts on your site, and when the question is raised on the mailing list the answer will be "oh, these scripts were never written to the specifications, so it's OK to break them - you just need to fix them".
The birth of the open source enterprise stack
Posted Jun 27, 2006 7:44 UTC (Tue) by nix (subscriber, #2304) [Link]
In any case, it's amazing how often huge systems can get away without using transactions, as long as their load is comparatively low with respect to system speed and the things they connect to are reliable enough that they don't need to roll back often.
One of my least finest hours was a day in 1999 when I broke the transaction management in a (large!) financial application by typoing in a hook and making all rollbacks throughout that application into commits instead. Nobody noticed for *six months*, until a major stockmarket feed went down. (Then all our customers tried to roll back at once...)
(And as for isolability, we all know how good certain major databases are at *that*. The PostgreSQL manual makes the point that perfect isolation isn't possible without giving the database what amounts to a theorem prover *and* complete knowledge of your app's control flow: but most databases, including major expensive ones with clowns with E in their surnames as CEOs of their controlling company, don't even try: you have to *ask* for half-decent isolation, and when you do your transaction goes read-only! PostgreSQL never had *that* problem, but it doesn't seem to stop people using said major database...)
The birth of the open source enterprise stack
Posted Jun 30, 2006 6:43 UTC (Fri) by oak (guest, #2786) [Link]
> One of my least finest hours was a day in 1999 when I broke the
> transaction management in a (large!) financial application by typoing
> in a hook and making all rollbacks throughout that application into
> commits instead. Nobody noticed for *six months*, until a major
> stockmarket feed went down.
This seems an important functionality which should have had a test case
(preferably written before the functionality was fully implemented)
which tests both conditions where rollbacks should happen and where not.
How easy is it to write test cases for LAMP (and particularly PHP)
software? Is there some test-suite that could be used? I would think
the end user UI tests (needing to use "HTTP API") to be particularly
onerous, but at least LAMP would be good for storing and visualizing
the test results. ;-)
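As a rough illustration of the kind of test case meant here, the following sketch exercises both the path where a transaction should commit and the path where it should roll back. It uses Python's standard `sqlite3` module rather than a LAMP stack, and the `accounts` table, `transfer` and `balances` are hypothetical names invented for the example.

```python
import sqlite3


def make_bank():
    """Create an in-memory database with two accounts (illustrative data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; undo everything if src would go negative."""
    try:
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
        )
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
        )
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE name = ?", (src,)
        ).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")
        conn.commit()
    except Exception:
        # If this line were mistyped as conn.commit(), the failing-transfer
        # check in a test would catch it.
        conn.rollback()
        raise


def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

A test then needs two checks: a valid transfer changes both balances, and a failing transfer raises and leaves the balances exactly as they were.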
The birth of the open source enterprise stack
Posted Jun 30, 2006 17:06 UTC (Fri) by nix (subscriber, #2304) [Link]
Yes, it should have had a testcase. Alas the environment in question had such an insane build system that I hadn't at the time written a testsuite framework for it. (I wrote one immediately after this, ahem, incident: I only need my face rubbing in the bleeding obvious *once* ;) )
There are several test frameworks for webbish stuff: Cactus (part of Apache Jakarta) is half of what's needed. I don't have much experience with any of them (due to violent allergies to HTML and webbish stuff in general).
The birth of the open source enterprise stack
Posted Jul 5, 2006 19:48 UTC (Wed) by pimlott (guest, #1535) [Link]
> it's amazing how often huge systems can get away without using transactions
It is amazing, but it doesn't make it right! When something finally does go wrong, it's extremely hard to debug and fix. Yes, I know, sometimes the cost of cleaning up the occasional mess is lower than the cost of doing it right. But given the ubiquity of DBMSs with transactions, it's hard to see why you'd risk it.
> The PostgreSQL manual makes the point that perfect isolation isn't possible without giving the database what amounts to a theorem prover *and* complete knowledge of your app's control flow
I'm curious what you're talking about, because I can't find anything like that in the PostgreSQL manual. I think perfect isolation is straightforward using read and write logs, and that PostgreSQL does exactly that. What am I missing?
> that perfect isolation isn't possible without giving the database what amounts to a theorem prover *and* complete knowledge of your app's control flow: but most databases, including major expensive ones with clowns with E in their surnames as CEOs of their controlling company, don't even try: you have to *ask* for half-decent isolation, and when you do your transaction goes read-only!
I agree that they don't even try. It's a travesty that the isolation level required for reliability in most applications--that the SQL standard says must be the default--is not the default in any DBMS I know. And when you do turn it on, many systems perform badly; PostgreSQL, with multi-version concurrency control, does quite well. But what do you mean about transactions going read-only?
The birth of the open source enterprise stack
Posted Jul 7, 2006 0:38 UTC (Fri) by nix (subscriber, #2304) [Link]
> > The PostgreSQL manual makes the point that perfect isolation isn't possible without giving the database what amounts to a theorem prover *and* complete knowledge of your app's control flow
> I'm curious what you're talking about, because I can't find anything like that in the PostgreSQL manual. I think perfect isolation is straightforward using read and write logs, and that PostgreSQL does exactly that. What am I missing?
See section 12.2.2.1 of the PostgreSQL manual, and the discussion of predicate locking. (Note, when it says `the details of every query', it means the details. In the presence of random-access cursors I suspect this reduces to solving the halting problem.)
> But what do you mean about transactions going read-only?
If you turn on serializable isolation in some transaction in said expensive proprietary RDBMS, you can no longer carry out INSERTs or UPDATEs in that transaction. It makes it really very useful :/
The birth of the open source enterprise stack
Posted Jul 7, 2006 1:24 UTC (Fri) by pimlott (guest, #1535) [Link]
> See section 12.2.2.1 of the PostgreSQL manual, and the discussion of predicate locking.
Thank you, I am enlightened. I guess this could be solved by logging at the table level: when the where clause is non-trivial, mark the entire table read, and mark the table written on insert. Then, when you commit, you would see (in one of the transactions) that the table has been modified since you queried it.
> If you turn on serializable isolation in some transaction in said expensive proprietary RDBMS, you can no longer carry out INSERTs or UPDATEs in that transaction.
Hmm... I've used serializable in (if memory serves) Informix, Oracle, and Sybase, reading and writing. Have I not gone expensive enough?
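The table-level logging idea in this comment can be sketched as a toy, single-threaded, in-memory model. This is only an illustration of the suggestion, not how PostgreSQL or any other DBMS implements isolation; `Store`, `Transaction` and `Conflict` are hypothetical names, and writes are simply buffered until commit.

```python
class Conflict(Exception):
    """Raised at commit when a table read by the transaction has since changed."""


class Store:
    def __init__(self):
        self.tables = {}    # table name -> list of committed rows
        self.versions = {}  # table name -> count of committed writes

    def create(self, table):
        self.tables[table] = []
        self.versions[table] = 0


class Transaction:
    def __init__(self, store):
        self.store = store
        self.read_versions = {}  # table -> version seen at first read
        self.pending = []        # buffered (table, row) inserts

    def select(self, table, predicate):
        # Mark the entire table as read, whatever the predicate is.
        self.read_versions.setdefault(table, self.store.versions[table])
        return [row for row in self.store.tables[table] if predicate(row)]

    def insert(self, table, row):
        self.pending.append((table, row))

    def commit(self):
        # Abort if any table we read was written by a commit after our read.
        for table, seen in self.read_versions.items():
            if self.store.versions[table] != seen:
                raise Conflict(table)
        for table, row in self.pending:
            self.store.tables[table].append(row)
        # Mark each table we wrote as modified.
        for table in {t for t, _ in self.pending}:
            self.store.versions[table] += 1
```

If two transactions each query a table and then insert into it, the first to commit succeeds and the second raises `Conflict`. The scheme is deliberately coarse: it also aborts transactions whose reads were not actually affected by the other write, which is the price of not tracking predicates.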
The birth of the open source enterprise stack
Posted Jul 8, 2006 22:58 UTC (Sat) by nix (subscriber, #2304) [Link]
My experience of Oracle is that at least in some releases, it had (and was documented as having!) the nasty read-only semantics I described :(
The birth of the open source enterprise stack
Posted Jun 27, 2006 15:48 UTC (Tue) by dmantione (guest, #4640) [Link]
The fact that MySQL powers big $$ systems does not mean it should power those
systems. I also still need to meet the first person that concludes
PostgreSQL is too slow for his application. In fact, I have yet to see
the first installation where MySQL was consciously chosen after comparison
to other databases.
The birth of the open source enterprise stack
Posted Jun 29, 2006 6:01 UTC (Thu) by jwb (guest, #15467) [Link]
Well, I'd be happy to introduce you to some of these applications. For an example, I have written a system which caches the information about the last trade of hundreds of thousands of stocks. During active trading, this information comes at the rate of thousands of updates per second. The amount of information to be stored is basically fixed, as there are a fixed number of stocks to be traded. In MySQL, the database size stays nice and constant, I get atomic updates, and I also get a transaction rate up to 10k/sec on hardware costing less than $2000.
If I were to use PostgreSQL for this load, I would end up with a huge, ever-growing file, because updates are handled like inserts. I would have an always-running VACUUM process, but the file would probably grow slightly at peak times anyway. I would likely have unbounded index growth, as VACUUM does not reclaim index space (only REINDEX does this, and it requires exclusive access in some versions). In all likelihood, I would need a nightly maintenance period, which I can't afford because there are stock markets all over the planet.
So there's an example of a load where you'd better use MySQL.
The birth of the open source enterprise stack
Posted Jun 29, 2006 16:08 UTC (Thu) by dskoll (subscriber, #1630) [Link]
> If I were to use PostgreSQL for this load, I would end up with a huge, ever-growing file, because updates are handled like inserts.
Correct, sort-of. VACUUM would reclaim the space.
> I would likely have unbounded index growth, as VACUUM does not reclaim index space
This is no longer true. Since 8.0 (and possibly since 7.4), VACUUM reclaims index space.
The birth of the open source enterprise stack
Posted Jun 30, 2006 17:07 UTC (Fri) by nix (subscriber, #2304) [Link]
... and therefore so does the autovacuum daemon in 8.1+.
The birth of the open source enterprise stack
Posted Jul 4, 2006 7:57 UTC (Tue) by nlucas (subscriber, #33793) [Link]
You might want to check SQLite for this kind of application.
No server setup as it's only a library using a single file as a database, but *very* fast (but ACID by default, so you need to know how to make it fast).
It's not good for multiple simultaneous writers, because it locks the entire database on a write, but most applications only write once and read many.
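As a hedged sketch of what "knowing how to make it fast" can look like, the example below keeps one row per stock symbol and applies a whole batch of updates inside a single transaction, so the commit cost is paid once per batch rather than once per row. It uses Python's standard `sqlite3` module; the `last_trade` table and the function names are assumptions made up for the example, not part of the system described earlier in the thread.

```python
import sqlite3


def open_quotes(path=":memory:"):
    """Open (or create) a database holding the last trade for each symbol."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS last_trade ("
        "symbol TEXT PRIMARY KEY, price REAL, volume INTEGER)"
    )
    return conn


def apply_ticks(conn, ticks):
    """Apply a batch of (symbol, price, volume) updates in one transaction."""
    # `with conn` commits once at the end of the block, or rolls the whole
    # batch back if an exception is raised inside it.
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO last_trade (symbol, price, volume) "
            "VALUES (?, ?, ?)",
            ticks,
        )
```

Because `symbol` is the primary key, `INSERT OR REPLACE` overwrites the existing row for a symbol, so the number of rows stays bounded by the number of symbols no matter how many ticks arrive.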
The birth of the open source enterprise stack
Posted Jun 29, 2006 13:41 UTC (Thu) by jkouyoumjian (guest, #34201) [Link]
Wow. Thank you for that wisdom.
Microsoft, Sun, Oracle, SAP and the rest are slow, insecure and suck too, especially if you configure them wrong and don't understand what they were designed to do.
The difference is that I would rather not pay for my sucky software.
It is possible to write good, secure and fast applications with any of these tools. They don't have to suck. Unfortunately, things don't just work. You have to understand how to use them.
FOOT - the desktop stack
Posted Jun 26, 2006 20:22 UTC (Mon) by ayeomans (guest, #1848) [Link]
"It is here, rather than on the desktop, that free software's next big gains are likely to take place" - I doubt this is true in numerical terms. Maybe not even in financial terms.
My bet on the biggest gains is free software which runs on Windows as well as other OS platforms, in particular FOOT -
Firefox
OpenOffice.org
Thunderbird
OFBiz
Posted Jun 26, 2006 22:37 UTC (Mon) by davidw (guest, #947) [Link]
For those interested in really 'businessy' sorts of things, one system to keep an eye on is OFBiz, at http://www.ofbiz.org. It's a system with modules for accounting, ecommerce, CRM, content management, warehouse management, manufacturing, and so on. But more than that, it's a flexible toolkit and community of developers who deal with those sorts of problems every day, which makes it a good group to learn from and share ideas with.
It is a relatively heavy duty Java app, which isn't everyone's cup of tea, but Java does seem to be strong for the sorts of things the application does.
(Disclaimer: I'm helping them become part of the Apache Software Foundation, but don't really have any other direct interests in the project).
MySQL didn't invent the dual-licensing biz model
Posted Jun 29, 2006 6:21 UTC (Thu) by JoeBuck (subscriber, #2330) [Link]
Aladdin was doing it with Ghostscript earlier on, as was Cygnus with Cygwin (that is, allow use in proprietary software products in exchange for money, but give the world a free one under the GPL).
Russ Nelson's Crynwr also used dual licensing, with the slogan "When I write proprietary software I want to get paid".
The birth of the open source enterprise stack
Posted Jun 29, 2006 12:13 UTC (Thu) by jpmcc (guest, #2452) [Link]
> It was the rapid uptake of the Internet by business in the mid-1990s that led to free software being used by companies, not just their employees.
Not so: open-source was born in companies in the early dawn of computing. The earliest documented example of an open-source project in the literature is the SHARE project launched by 17 early adopters of the IBM 704 computer in August 1955. By its first anniversary, the project had created 700 programs and had saved of the order of 1.5 million US Dollars.
Open-source started in the commercial world - it made sense for companies over fifty years ago; it makes even more sense today. Shame that the IT industry lost the plot somewhere along the line.
The birth of the open source enterprise stack
Posted Jun 30, 2006 6:36 UTC (Fri) by oak (guest, #2786) [Link]
Maybe free software here refers to the licenses which maintain the freedoms that FSF and Debian are interested in?