
KS2009: How Google uses Linux


By Jonathan Corbet
October 21, 2009
LWN's 2009 Kernel Summit coverage
There may be no single organization which runs more Linux systems than Google. But the kernel development community knows little about how Google uses Linux and what sort of problems are encountered there. Google's Mike Waychison traveled to Tokyo to help shed some light on this situation; the result was an interesting view on what it takes to run Linux in this extremely demanding setting.

Mike started the talk by giving the developers a good laugh: it seems that Google manages its kernel code with Perforce. He apologized for that. There is a single tree that all developers commit to. About every 17 months, Google rebases its work to a current mainline release; what follows is a long struggle to make everything work again. Once that's done, internal "feature" releases happen about every six months.

This way of doing things is far from ideal; it means that Google lags far behind the mainline and has a hard time talking with the kernel development community about its problems.

There are about 30 engineers working on Google's kernel. Currently they tend to check their changes into the tree, then forget about them for the next 18 months. This leads to some real maintenance issues; developers often have little idea of what's actually in Google's tree until it breaks.

And there's a lot in that tree. Google started with the 2.4.18 kernel - but they patched over 2000 files, inserting 492,000 lines of code. Among other things, they backported 64-bit support into that kernel. Eventually they moved to 2.6.11, primarily because they needed SATA support. A 2.6.18-based kernel followed, and they are now working on preparing a 2.6.26-based kernel for deployment in the near future. They are currently carrying 1208 patches to 2.6.26, inserting almost 300,000 lines of code. Roughly 25% of those patches, Mike estimates, are backports of newer features.

There are plans to change all of this; Google's kernel group is trying to get to a point where they can work better with the kernel community. They're moving to git for source code management, and developers will maintain their changes in their own trees. Those trees will be rebased to mainline kernel releases every quarter; that should, it is hoped, motivate developers to make their code more maintainable and more closely aligned with the upstream kernel.

Linus asked: why aren't these patches upstream? Is it because Google is embarrassed by them, or is it secret stuff that they don't want to disclose, or is it a matter of internal process problems? The answer was simply "yes." Some of this code is ugly stuff which has been carried forward from the 2.4.18 kernel. There are also doubts internally about how much of this stuff will be actually useful to the rest of the world. But perhaps about half of this code could be upstreamed eventually.

As much as 3/4 of Google's code consists of changes to the core kernel; device support is a relatively small part of the total.

Google has a number of "pain points" which make working with the community harder. Keeping up with the upstream kernel is hard - it simply moves too fast. There is also a real problem with developers posting a patch, then being asked to rework it in a way which turns it into a much larger project. Alan Cox had a simple response to that one: people will always ask for more, but sometimes the right thing to do is to simply tell them "no."

In the area of CPU scheduling, Google found the move to the completely fair scheduler to be painful. In fact, it was such a problem that they finally forward-ported the old O(1) scheduler and can run it in 2.6.26. Changes in the semantics of sched_yield() created grief, especially with the user-space locking that Google uses. High-priority threads can make a mess of load balancing, even if they run for very short periods of time. And load balancing matters: Google runs something like 5000 threads on systems with 16-32 cores.
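
Google's locking code was not shown; as a purely hypothetical sketch of why the sched_yield() semantics matter, user-space locks of this sort often spin briefly and then yield to get out of the lock holder's way. Under the O(1) scheduler the yield reliably deferred to other runnable tasks, while under CFS the yielding thread may be run again almost immediately, turning a loop like the one below into wasted CPU time (names and spin counts are invented):

    /* Hypothetical spin-then-yield lock of the kind user-space locking
     * code often used; its behavior depends heavily on what sched_yield()
     * actually does under the running scheduler. */
    #include <sched.h>
    #include <stdatomic.h>

    typedef struct { atomic_int locked; } yield_lock_t;

    static void yield_lock(yield_lock_t *l)
    {
        int spins = 0;

        while (atomic_exchange_explicit(&l->locked, 1, memory_order_acquire)) {
            if (++spins < 100)
                continue;          /* short busy-wait while contended */
            spins = 0;
            sched_yield();         /* defer to the lock holder... maybe */
        }
    }

    static void yield_unlock(yield_lock_t *l)
    {
        atomic_store_explicit(&l->locked, 0, memory_order_release);
    }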

On the memory management side, newer kernels changed the management of dirty bits, leading to overly aggressive writeout. The system could easily get into a situation where lots of small I/O operations generated by kswapd would fill the request queues, starving other writeback; this particular problem should be fixed by the per-BDI writeback changes in 2.6.32.

As noted above, Google runs systems with lots of threads - not an uncommon mode of operation in general. One thing they found is that sending signals to a large thread group can lead to a lot of run queue lock contention. They also have trouble with contention for the mmap_sem semaphore; one sleeping reader can block a writer which, in turn, blocks other readers, bringing the whole thing to a halt. The kernel needs to be fixed to not wait for I/O with that semaphore held.

Google makes a lot of use of the out-of-memory (OOM) killer to pare back overloaded systems. That can create trouble, though, when processes holding mutexes encounter the OOM killer. Mike wonders why the kernel tries so hard, rather than just failing allocation requests when memory gets too tight.

So what is Google doing with all that code in the kernel? They try very hard to get the most out of every machine they have, so they cram a lot of work onto each. This work is segmented into three classes: "latency sensitive," which gets short-term resource guarantees, "production batch," which has guarantees over longer periods, and "best effort," which gets no guarantees at all. This separation of classes is done partly by dividing each machine into a large number of fake "NUMA nodes." Specific jobs are then assigned to one or more of those nodes. One thing added by Google is "NUMA-aware VFS LRUs" - virtual memory management which focuses on specific NUMA nodes. Nick Piggin remarked that he has been working on something like that and would have liked to have seen Google's code.
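
The fake-NUMA partitioning and the NUMA-aware LRU code are Google-internal, but the general technique of confining a job's memory to a subset of (possibly fake) NUMA nodes can be sketched with the stock memory-policy system calls. The snippet below is only an illustration under that assumption; the node number is arbitrary, and a real setup would also use cpusets to confine CPUs:

    /* Sketch: restrict the calling process's page allocations to one NUMA
     * node, roughly what a job-placement system might do after booting
     * with something like numa=fake=<N>.  Node 2 is an arbitrary example.
     * Build with -lnuma for the set_mempolicy() wrapper. */
    #include <numaif.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        unsigned long nodemask = 1UL << 2;   /* bind to (fake) node 2 only */

        if (set_mempolicy(MPOL_BIND, &nodemask, sizeof(nodemask) * 8)) {
            perror("set_mempolicy");
            return EXIT_FAILURE;
        }

        /* From here on, this process's memory comes from node 2. */
        return EXIT_SUCCESS;
    }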

There is a special SCHED_GIDLE scheduling class which is a truly idle class; if there is no spare CPU available, jobs in that class will not run at all. To avoid priority inversion problems, SCHED_GIDLE processes have their priority temporarily increased whenever they sleep in the kernel (but not if they are preempted in user space). Networking is managed with the HTB queueing discipline, augmented with a bunch of bandwidth control logic. For disks, they are working on proportional I/O scheduling.
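
SCHED_GIDLE itself is not in the mainline kernel; the closest upstream relative is the SCHED_IDLE policy. As a rough sketch only, a best-effort task might demote itself like this:

    /* Sketch: put the calling thread into mainline's SCHED_IDLE class,
     * which runs only when nothing of higher priority wants the CPU.
     * Google's SCHED_GIDLE adds the anti-inversion priority boost
     * described above; this is just the nearest stock equivalent. */
    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>

    int main(void)
    {
        struct sched_param param = { .sched_priority = 0 };  /* must be 0 */

        if (sched_setscheduler(0, SCHED_IDLE, &param) == -1) {
            perror("sched_setscheduler(SCHED_IDLE)");
            return 1;
        }

        /* ... run best-effort batch work here ... */
        return 0;
    }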

Beyond that, a lot of Google's code is there for monitoring. They monitor all disk and network traffic, record it, and use it for analyzing their operations later on. Hooks have been added to let them associate all disk I/O back to applications - including asynchronous writeback I/O. Mike was asked if they could use tracepoints for this task; the answer was "yes," but, naturally enough, Google is using its own scheme now.

Google has a lot of important goals for 2010; they include:

  • They are excited about CPU limits; these are intended to give priority access to latency-sensitive tasks while still keeping those tasks from taking over the system entirely.

  • RPC-aware CPU scheduling; this involves inspection of incoming RPC traffic to determine which process will wake up in response and how important that wakeup is.

  • A related initiative is delayed scheduling. For most threads, latency is not all that important. But the kernel tries to run them immediately when RPC messages come in; these messages tend not to be evenly distributed across CPUs, leading to serious load balancing problems. So threads can be tagged for delayed scheduling; when a wakeup arrives, they are not immediately put onto the run queue. Instead, they wait until the next global load balancing operation before becoming truly runnable.

  • Idle cycle injection: high-bandwidth power management so they can run their machines right on the edge of melting down - but not beyond.

  • Better memory controllers are on the list, including accounting for kernel memory use.

  • "Offline memory." Mike noted that it is increasingly hard to buy memory which actually works, especially if you want to go cheap. So they need to be able to set bad pages aside. The HWPOISON work may help them in this area.

  • They need dynamic huge pages, which can be assembled and broken down on demand.

  • On the networking side, there is a desire to improve support for receive-side scaling - directing incoming traffic to specific queues. They need to be able to account for software interrupt time and attribute it to specific tasks - networking processing can often involve large amounts of softirq processing. They've been working on better congestion control; the algorithms they have come up with are "not Internet safe" but work well in the data center. And "TCP pacing" slows down outgoing traffic to avoid overloading switches.

  • For storage, there is a lot of interest in reducing block-layer overhead so it can keep up with high-speed flash. Using flash for disk acceleration in the block layer is on the list. They're looking at in-kernel flash translation layers, though it was suggested that it might be better to handle that logic directly in the filesystem.

Mike concluded with a couple of "interesting problems." One of those is that Google would like a way to pin filesystem metadata in memory. The problem here is being able to bound the time required to service I/O requests. The time required to read a block from disk is known, but if the relevant metadata is not in memory, more than one disk I/O operation may be required. That slows things down in undesirable ways. Google is currently getting around this by reading file data directly from raw disk devices in user space, but they would like to stop doing that.
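
How Google's user-space I/O path works was not described in any detail; as a loose sketch under that caveat, reading from a raw block device while bypassing the page cache (and any metadata lookups) generally means an aligned O_DIRECT read. The device path and sizes below are placeholders:

    /* Hypothetical sketch: read 4KB from the start of a raw block device
     * with O_DIRECT.  Buffer address, file offset, and length must all be
     * suitably aligned for O_DIRECT to work. */
    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(void)
    {
        const size_t len = 4096;
        void *buf;
        int fd;

        if (posix_memalign(&buf, 4096, len))          /* aligned buffer */
            return 1;

        fd = open("/dev/sdb", O_RDONLY | O_DIRECT);   /* placeholder device */
        if (fd < 0) {
            perror("open");
            return 1;
        }

        if (pread(fd, buf, len, 0) < 0)               /* aligned offset */
            perror("pread");

        close(fd);
        free(buf);
        return 0;
    }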

The other problem was lowering the system call overhead for providing caching advice (with fadvise()) to the kernel. It's not clear exactly what the problem was here.
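
For reference, the call in question is posix_fadvise() at the C library level; a minimal use looks like the fragment below (which advice flags Google actually cares about was not stated, so these are just common examples):

    /* Hand caching advice to the kernel for an open file: expect
     * sequential reads, then drop the cached pages once the data has
     * been consumed.  The per-call overhead of doing this across huge
     * numbers of files is what is being discussed above. */
    #include <fcntl.h>

    static int advise_file(int fd, off_t length)
    {
        if (posix_fadvise(fd, 0, length, POSIX_FADV_SEQUENTIAL))
            return -1;

        /* ... read the file ... */

        return posix_fadvise(fd, 0, length, POSIX_FADV_DONTNEED);
    }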

All told, it was seen as one of the more successful sessions, with the kernel community learning a lot about one of its biggest customers. If Google's plans to become more community-oriented come to fruition, the result should be a better kernel for all.





Perforce

Posted Oct 21, 2009 4:35 UTC (Wed) by SLi (subscriber, #53131) [Link]

What on earth makes people (or companies) use Perforce? Especially Google, who I can't believe makes these decisions based on marketing hype and buzzwords? What's good in Perforce?

Perforce

Posted Oct 21, 2009 5:19 UTC (Wed) by bradfitz (subscriber, #4378) [Link]

An increasing number of us use a clever coworker's git-atop-perforce bridge, so we feel like we're working in a git world, but certain operations sync to Perforce and use the many perforce presubmit/postsubmit checks/scripts/etc, enforcing normal code review processes and such.

So with stuff like that, or git-svn, if you're going to have a blessed "central" repo anyway, who really cares if that repo is actually git, svn, perforce, etc, as long as you can use your DVCS of choice at the edge?

The alternative is changing years of accumulated tools & checks every time a new VCS comes out and you change your master repo's storage format.

*shrug*

Perforce

Posted Oct 21, 2009 22:02 UTC (Wed) by ianw (guest, #20143) [Link]

Same inside VMware, another big user of Perforce. We've got an increasing number of developers using git-p4 wrappers, and now even tools using it - the git-based pre-submission build testing stuff even tweets at @vmwarepbs if you submit something that breaks the build :)

Although everyone always talks about getting rid of it, there is so much build, QA and release infrastructure built around it, I can't fathom it could ever happen. But, using git wrappers, us developers can pretty much forget that it's even there :)

Perforce

Posted Oct 21, 2009 6:07 UTC (Wed) by dlang (guest, #313) [Link]

There are cases that current free VCS systems cannot handle well, if at all. These are related to storing large binary blobs (including compiled binaries) in the VCS, which is a fairly common thing to do in corporate environments. Instead of storing lots of tar files of releases, they like to check all the files of each release into the VCS.

Perforce handles this well (for all its shortcomings in other areas).

Perforce

Posted Oct 21, 2009 7:46 UTC (Wed) by epa (subscriber, #39769) [Link]

svn also copes fairly well with big blobs. But perhaps when Google started out to pick a version control system all those years ago, svn wasn't mature enough.

Git sounds like it should cope well with large objects in the repository, but the general view is that it doesn't perform so well. I wonder why not.

Perforce

Posted Oct 21, 2009 9:25 UTC (Wed) by cortana (subscriber, #24596) [Link]

I just ran into one of the reasons yesterday. We were trying to check in a bunch of large binary files, some several hundred megabytes large. Git ran out of memory with a fairly uninformative error message while 'packing' objects, whatever that means...

Fortunately, a short trip to #git revealed the cause of the problem: git compresses objects before sending them to a remote repository; it simply ran out of virtual memory while compressing some of the larger files.

There were two fixes.

1. Use a 64-bit version of git. I'd be happy to, but there isn't an x64 binary download available from the msysgit web site.

2. Tell git not to perform the compression; 'echo * -delta > .git/info/attributes'. Somewhat undocumented, but at least I will be able to search for this LWN comment if I ever run into this problem again. :)

-delta

Posted Oct 21, 2009 23:31 UTC (Wed) by joey (guest, #328) [Link]

Me and my 50GB git repos thank you for that! But since finding LWN comments in future is not my strong suit, I sent in a patch to document it in gitattributes(1) ;)

It looks to me like this makes large object commits fast, but git pull will still compress the objects, and it still tends to run out of memory when they're large.

-delta

Posted Oct 21, 2009 23:51 UTC (Wed) by cortana (subscriber, #24596) [Link]

Thanks so much for that. I would have suggested a patch, honest, but I'm super busy at work at the moment... ;)

Presumably git-pull running out of memory would be a server-side issue? And in that case, if you're not running a sensible 64-bit operating system on your server then you deserve what you get... ;)

Perforce

Posted Oct 21, 2009 12:13 UTC (Wed) by dlang (guest, #313) [Link]

Quote:
Git sounds like it should cope well with large objects in the repository, but the general view is that it doesn't perform so well. I wonder why not.

git mmaps the files to access them, and the pack definition is limited to no more than 4G (and since the over-the-wire protocol for download is the same as a pack file you run into limits there)

4G is _huge_ for source code, especially with the compression that git does, but when you start storing binaries that don't diff against each other, the repository size can climb rapidly.

This has been documented several times, but it seems to be in the category of 'interesting problem, we should do something about that someday, but not a priority' right now.

Perforce

Posted Oct 21, 2009 14:30 UTC (Wed) by drag (guest, #31333) [Link]

Yes. Git is exceptionally good at managing text.

Pretty shitty at everything else. It's too bad, because I'd like to use it for synchronizing my desktop.

Perforce

Posted Nov 1, 2009 18:54 UTC (Sun) by mfedyk (guest, #55303) [Link]

You probably want to look at couchdb and the fuse driver for it.

Perforce

Posted Oct 21, 2009 9:22 UTC (Wed) by nix (subscriber, #2304) [Link]

It's also useful if your project consists of a large number of loosely-related changes, so you really don't *want* tree-wide commits. Amazing though it may sound some organizations actually depend on this sort of thing.

Perforce

Posted Oct 21, 2009 11:54 UTC (Wed) by jonth (guest, #4008) [Link]

Speaking as someone who works for one of those companies, I can give you a background into the reasons why we chose and continue to use Perforce.

First, some background. The company had grown over the course of its first three years to a two-site, 100-person company. Up to that point, we had used CVS with a bunch of bespoke scripts to handle branches and merges. We used GNATS (I _really_ wouldn't recommend this) as our bug tracking system. These decisions had been taken somewhat by default, but by 2005 it was clear that we needed something new that would scale to our needs in the future. Our requirements were:

a) Integration of bug tracking and source control management. For us, we felt that it was vital to understand that SCM is only half of the problem. I think that this tends to be overlooked in non-commercial environments.
b) Scalable to multi-site and 100s of users.
c) Ease of use.
d) Support.
e) Stability.
f) Speed.
g) Windows/Linux support. We're predominantly a Linux shop, but we have teams who write Windows drivers.

We looked at the following systems (in no particular order):

a) git. Git had been active for about 6 months when we started looking at it. We liked the design principles, but at that time there was no obvious way to integrate it into an existing bug tracking system. It also had no GUI then (although I'm a confirmed command line jockey, a GUI for these things definitely improves productivity) and there was no Windows version of git. Finally, the underlying storage was still somewhat in flux, and all in all, it seemed just too young to risk the future of the company on it.
b) Mercurial. Many of the problems we had with git also applied to Mercurial. However, even then it did integrate with Trac, so we could have gone down that route. In the end, like git, it was just too new to risk.
c) Clearcase/Clearquest. Too slow, too expensive, and rubbish multi-site support.
d) Bitkeeper. Nice solution, but we were scared of the "Don't piss Larry off" license.
e) Perforce/Bugzilla. Provided "out of the box" integration with Bugzilla, worked pretty well with multi-site using proxies, had a nice GUI, scaled well, was stable (our major supplier had used it for a few years), had client versions for Windows and Linux, and was pretty quick, too.
f) MKS. No better than CVS.
g) SVN. In many ways, similar to Perforce in terms of how it is used. In fact, one part of the company decided to use SVN instead of Perforce. However, this lasted for about 6 months. I don't know the details but due to some technical difficulties, they gave up and moved over to Perforce.

All in all, Perforce integrated with a customized version of Bugzilla, while not perfect (git/mercurial/bk's model of how branches work is more sensible I think), gave us the best fit to our needs. We now have ~200 users spread all over the world, with no real performance problems. The bug tracking integration works well. Perforce's commercial support is responsive and good, we've never lost any data and we can tune the whole system to our needs.

If we had to revisit the decision, it's possible that Mercurial/Trac would have fared better, but to be honest the system we chose has stood the test of time and so there is no reason to change.

Perforce

Posted Oct 21, 2009 12:16 UTC (Wed) by ringerc (subscriber, #3071) [Link]

I wouldn't be too surprised if their svn issues involved use of the Berkeley DB backend and, as an almost inevitable result unless incredible care is taken, corruption of the Berkeley DB.

BDB is great if used in the (optional) transactional mode on an utterly stable system where nothing ever goes wrong. In other words, in the real world I like to see its use confined to throw-away databases (caches, etc).

I've been using SVN with the fsfs backend for years both personally and in several OSS projects and I've been very happy. Of course, the needs of those projects and my personal needs are likely quite different to your company's.

Perforce

Posted Oct 21, 2009 19:21 UTC (Wed) by ceswiedler (guest, #24638) [Link]

What on earth makes people use git? Especially you, who I can't believe makes these decisions based on word-of-mouth hype and buzzwords? What's good in git?

People use Perforce because it works very well for centralized version control, and that's what a lot of companies need. It enforces user security, integrates with a lot of other software, can be backed up centrally, and has a lot of very good tools. On the other hand, it doesn't scale as well as DVCSs do, and can't be used offline.

Git it

Posted Oct 21, 2009 21:11 UTC (Wed) by man_ls (guest, #15091) [Link]

Git is lightning fast (at least for code, I don't know for binaries), it's distributed and (surprise surprise) it's addictive! The cycle of 'commit, commit, commit, push when you're ready' is amazingly productive. I'm using it in my first project as single developer and I wouldn't change it for anything else I've used -- including cvs, svn, ClearCase, AccuRev and a few others too shitty to mention.

Git it

Posted Oct 31, 2009 4:55 UTC (Sat) by Holmes1869 (guest, #42043) [Link]

I'm in the exact same situation. Been using it for 3 months now for a personal project (hope to put it up on Gitorious.org someday), and I've just been blown away. "git rebase -i" (I use it a lot since no one else is depending on me) is just amazing. "git add -i" (for committing individual pieces of a file) has single-handedly made me despise using SVN at my day job. I really used to love SVN too.

That being said, I feel that some of the git features will only ever be used by people that take source control seriously. The people I work with check-in code without commit messages, mistakenly commit files that they forgot they changed (or other random files that ended up in their sandbox), and don't ever perform a simple 'svn diff' (or Subclipse comparison) just to make sure they are checking in what they want. Do you think these people care that they can re-order or squash a commit to create a single pristine, neat, atomic commit to fix exactly one particular bug? Probably not unfortunately. I hope to one day work with people that do care.

Perforce

Posted Oct 22, 2009 7:38 UTC (Thu) by cmccabe (guest, #60281) [Link]

> What on earth makes people (or companies) use Perforce?

I've worked with perforce, subversion, and git in the past. The three systems all have very different philosophies.

perforce has some abilities that none of the other systems have. When you start editing a file, it tells you who else has it open. You can look at their changes, too.

Both perforce and subversion can check out part of a repository without checking out the whole thing. Git can't do this. Arguably, you should use git subprojects to solve this problem. I've never actually done that, so I don't know how well it works.

Of course, git allows you to work offline, which neither perforce nor subversion can do. git also allows you to merge changes from one client to another ("branch," in git lingo). I've definitely been frustrated in the past by having to manually port changes from one perforce client to another-- even wrote scripts to automate it. What a waste.

"p4 merge" is a powerful command, much more powerful than "svn copy." p4 preserves the "x was integrated into y" relationships between files, whereas svn does not. Imagine a company that has branches for product 1.0, 2.0, and 3.0. It periodically integrates changes from 1.0 into 2.0, and 2.0 into 3.0. In this situation, the relative lack of sophistication of svn copy is a real Achilles heel. Imagine how much pain renaming a file in version 2.0 causes for the hapless svn copy user. Each time the build monkey does the integration from 1.0 to 2.0, he has to remember the files that were renamed. Except that with perforce, the system remembers it for him.

git I think has heuristics to detect this sort of thing. In general git was built from the ground up to do merging on a massive basis.

perforce also has excellent Windows support, a pile of GUI tools, and was about a dozen years earlier to the party. git and svn are catching up with these advantages, but it will take some time.

C.

Perforce

Posted Oct 22, 2009 19:17 UTC (Thu) by dsas (guest, #58356) [Link]

Subversion 1.5 has merge support; it's not as good as, say, bzr or git, but it's better than svn copy.

Perforce

Posted Oct 30, 2009 21:57 UTC (Fri) by lkundrak (subscriber, #43452) [Link]

Our company is just moving away from Subversion, though it served us well for some time. As the project grows, the svn merge support is really "worse than nothing." With a development team of ~30 engineers, I am constantly seeing issues like being unable to merge after moving a file with a mergeinfo property from a subtree merge, and wasting considerable time fixing up mergeinfos after engineers branch off other branches and do cross-branch merges. Moreover, the merge mechanism is different in 1.4, 1.5 and 1.6. Cryptic error messages tend to scare people off a bit too. Squashing all the commits of the merge source into one commit in the target greatly devalues certain tools, such as annotate; that was not an issue when the project was smaller, but it gets rather annoying as the history grows.

Perforce

Posted Oct 29, 2009 3:05 UTC (Thu) by tutufan (guest, #60063) [Link]

> When you start editing a file, it tells you who else has it open.

Wow. I can almost hear the punch card reader in the background. Talk about an obsolete mindset. If I'm editing file X, do I really want to know whether somebody, somewhere, working on some idea that I have no idea about, is trying out something that also somehow involves file X, something that ultimately may never see the light of day? No.

If we get to the point of merging, I think about it then (if necessary).

Perforce

Posted Nov 4, 2009 21:44 UTC (Wed) by jengelh (subscriber, #33263) [Link]

Repository size is one issue. Just imagine if all of kernel.org, gnome.org and freedesktop.org (perhaps add in all of XFree86 too for fun) lived in a single repository. Oh, and don't forget the binary blobs that are so popular in corporate repos. Initial clone with --depth=MAX, anyone?

Sure, you could split it up, but it is all too tightly integrated. Should anything go git in the future, I would guess all repositories would start with a fresh slate.

KS2009: How Google uses Linux

Posted Oct 21, 2009 9:13 UTC (Wed) by sdalley (subscriber, #18550) [Link]

> Mike noted that it is increasingly hard to buy memory which actually works, especially if you want to go cheap.

Wow. Who actually wants to buy memory that doesn't work, or flips random bits on an off-day?

I'd prefer my code, documents and calculations not to be twiddled with behind my back, thank you.

Reminds me of a new improved version of an HP text editor I used once, backalong. We were maintaining a large Fortran program which started misbehaving in odd ways, then stopped working altogether. Turned out that each time you saved the file, the editor would lose a random line or two.

KS2009: How Google uses Linux

Posted Oct 21, 2009 9:37 UTC (Wed) by crlf (subscriber, #25122) [Link]

> Wow. Who actually wants to buy memory that doesn't work, or flips random bits on an off-day?

The issue is one of probability and large numbers. Memory errors are already common today, and the continued increase in density will not help matters tomorrow.

KS2009: How Google uses Linux

Posted Oct 21, 2009 12:28 UTC (Wed) by sdalley (subscriber, #18550) [Link]

Thank you, very interesting paper.

So, one chance in three per annum of suffering a memory error on a given machine, roughly.

With ECC memory, which Google uses as standard, 19 out of 20 of these errors will be transparently corrected.

With non-ECC memory, as in commodity PCs, stiff biscuit every time.

KS2009: How Google uses Linux

Posted Oct 21, 2009 21:08 UTC (Wed) by maney (subscriber, #12630) [Link]

You imply that denser chips will cause higher error rates, but that is not what they found:

We studied the effect of chip sizes on correctable and uncorrectable errors, controlling for capacity, platform (dimm technology), and age. The results are mixed. When two chip configurations were available within the same platform, capacity and manufacturer, we sometimes observed an increase in average correctable error rates and sometimes a decrease.

There were other, also mixed, differences when comparing only memory module sizes, but that mixes together differences in chip density and number of chips on the module - and quite possibly chip width as well.

The best we can conclude therefore is that any chip size effect is unlikely to dominate error rates given that the trends are not consistent across various other confounders such as age and manufacturer.

Which, I think, summarizes decades of experience that refuted various claims that the ever-shrinking memory cells just had to lead to terrible error problems. I may still have an old Intel whitepaper on this from back in the days when chip sizes were measured in Kbits.

KS2009: How Google uses Linux

Posted Oct 21, 2009 12:30 UTC (Wed) by nye (guest, #51576) [Link]

>Who actually wants to buy memory that doesn't work, or flips random bits on an off-day?

Anyone who wants to buy real memory that exists in the physical world, really.

KS2009: How Google uses Linux

Posted Nov 7, 2009 21:01 UTC (Sat) by jlin (guest, #61855) [Link]

I've read from a conference presentation that Google actually goes to DRAM manufacturers and buys, in bulk, memory chips that failed QA, and makes the RAM DIMM modules themselves in order to take advantage of scale. For Google, the low price outweighs the trouble of validating the failed RAM chips for salvageable parts.

KS2009: How Google uses Linux

Posted Oct 21, 2009 14:23 UTC (Wed) by cma (guest, #49905) [Link]

What a waste of resources... Google could just work closely with the kernel community. Come on, Google, what are you waiting for? Besides, if the Linux kernel code is GPLv2, why don't they release their code and respect the GPLv2 license terms?

KS2009: How Google uses Linux

Posted Oct 21, 2009 14:36 UTC (Wed) by drag (guest, #31333) [Link]

I don't know if you noticed or not, but the GPL licensing terms only kick in during distribution. Seeing how a corporation is an independent legal person, I don't think that moving software and hardware around internally really counts as distribution. And I don't think that Google has any plans on selling its systems to other people.

So the GPL is pretty irrelevant.

So it is just a business case of whether working with the kernel community is going to be more profitable or not. And so far they have decided that taking care of stuff internally is a better approach. Maybe that will change.

GPL doesn't require, but maintenance kills you

Posted Oct 21, 2009 15:00 UTC (Wed) by dwheeler (guest, #1216) [Link]

Correct, the GPL doesn't require the release of this internal source code. However, the GPL does have an effect (by intent): Google cannot take the GPL'ed program, modify it, and sell the result as a proprietary program. Thus, what Google is doing is almost certainly wasting its own resources, by trying to do its own parallel maintenance. They could probably save a lot of money and time by working with the kernel developers; it's a short-term cost for long-term gain. And as a side-effect, doing so would help all other kernel users.

There's probably some stuff that will stay Google-only, but if they worked to halve it, they'd probably save far more than half their money. Google can do this, in spite of its long-term inefficiencies, because they have a lot of money... but that doesn't mean it's the best choice for them or anyone else.

Appliance kernel source?

Posted Oct 21, 2009 15:11 UTC (Wed) by dmarti (subscriber, #11625) [Link]

If you buy a Google Search Appliance, you should be able to request a copy of the source code to any GPL software on it. (Could be that they're maintaining a whole extra kernel for the GSA, though.)

Appliance kernel source?

Posted Oct 21, 2009 18:17 UTC (Wed) by ncm (guest, #165) [Link]

By their reports, you wouldn't _want_ to see the code.

Appliance kernel source?

Posted Oct 27, 2009 9:13 UTC (Tue) by dsommers (subscriber, #55274) [Link]

That's exactly why I would insist on seeing the code ;-)

Appliance kernel source?

Posted Oct 30, 2009 22:25 UTC (Fri) by cdibona (guest, #13739) [Link]

You guys are killing me; we've had this up at The GSA Mirror for years and years. Enjoy!

Chris DiBona

KS2009: How Google uses Linux

Posted Oct 21, 2009 21:57 UTC (Wed) by jmm82 (guest, #59425) [Link]

I believe the reasons why they are not contributing code to the kernel were outlined:

1. They are not using kernels that are close to Linus's git head.
2. Some code would not be wanted in the mainline kernel.
3. Some code is not good enough to get into the mainline kernel.
4. They don't want to have 30 people saying the code will only get in if it does this or that - i.e., they don't want to make it support features they are not using.
5. Some code is proprietary and they want to protect the IP. As was stated above, as long as they are not distributing the code, the changes are their property.
6. A lot of their patches are code backported from mainline, so it is already in the kernel.

I think that, moving forward, you will see Google have a few developers working on mainline to try to influence future kernels, because it will be financially cheaper to carry as few patches as possible. Also, I feel they will always have some patches whose IP they consider too valuable to give back, and they will continue to maintain those outside the mainline.

KS2009: How Google uses Linux

Posted Oct 21, 2009 22:46 UTC (Wed) by cpeterso (guest, #305) [Link]

If Google simply made their (non-proprietary) patches or git trees available to the public, then other kernel developers could adapt (or be inspired by) Google's work to the mainline kernel.

KS2009: How Google uses Linux

Posted Oct 21, 2009 22:21 UTC (Wed) by dany (guest, #18902) [Link]

Thanks for the report; as always, it's interesting reading.

KS2009: How Google uses Linux

Posted Oct 23, 2009 22:13 UTC (Fri) by dvhart (guest, #19636) [Link]

A couple of thoughts while reading the article.

1) sched_yield with CFS trouble
Does the /proc/sys/kernel/sched_compat_yield flag help? This is the third time I've run into sched_yield() behavior issues in the last couple of weeks, all related to userspace locking. I'd really like to know what we (kernel developers) can do to make this recurring problem go away. Big applications often seem to need better-performing locking constructs. Adaptive mutexes, implemented with futexes, seem like they could address a lot of this (a rough sketch of the idea appears at the end of this comment), with a couple of exceptions: the spin time is not user-configurable AFAIK, and the additional accounting etc. done in support of POSIX seems to make even the uncontended calls into glibc too expensive. A common response seems to be that userspace locking isn't the right answer and that applications should rely on OS primitives. Unfortunately, as Chris Wright mentioned during Day 1, these developers have empirical evidence to the contrary. I was reviewing some for a different project today; sometimes the performance difference is staggering.

2) Mike asked why the kernel tries so hard to allocate memory - why not just fail to allocate if there is too much pressure. Why isn't disabling overcommit enough?
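
As a rough sketch of the adaptive idea in point 1 (spin briefly in user space hoping the owner drops the lock, then sleep in the kernel via futex()), something like the following could serve; the constants and structure are invented for illustration:

    /* Illustrative adaptive mutex: a fixed spin phase, then a futex-based
     * sleep.  0 = unlocked, 1 = locked, 2 = locked with waiters. */
    #include <linux/futex.h>
    #include <stdatomic.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    static long futex(atomic_int *uaddr, int op, int val)
    {
        return syscall(SYS_futex, uaddr, op, val, NULL, NULL, 0);
    }

    static void adaptive_lock(atomic_int *lock)
    {
        for (int spins = 0; spins < 200; spins++) {   /* adaptive spin phase */
            int expected = 0;
            if (atomic_compare_exchange_weak(lock, &expected, 1))
                return;
        }
        /* Contended: mark waiters present and sleep until woken. */
        while (atomic_exchange(lock, 2) != 0)
            futex(lock, FUTEX_WAIT, 2);
    }

    static void adaptive_unlock(atomic_int *lock)
    {
        if (atomic_exchange(lock, 0) == 2)            /* waiters were present */
            futex(lock, FUTEX_WAKE, 1);
    }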

KS2009: How Google uses Linux

Posted Oct 24, 2009 1:26 UTC (Sat) by Tomasu (subscriber, #39889) [Link]

2) Probably because they actually want some overcommit, but they don't want the OOM killer to go wild killing everything, and definitely not the WRONG thing.

KS2009: How Google uses Linux

Posted Oct 25, 2009 19:24 UTC (Sun) by oak (guest, #2786) [Link]

In the Maemo (at least the Diablo release) kernel source there are configurable limits for when the kernel starts to deny allocations and when it OOM-kills (besides notifying user space about the crossing of these and some earlier limits). If a process is set as "OOM-protected", its allocations will also always succeed. If "OOM-protected" processes waste all the memory in the system, then they too can get killed.

KS2009: How Google uses Linux

Posted Nov 8, 2009 13:08 UTC (Sun) by vbeeno (guest, #61876) [Link]

Wow, Linux totally rocks. Always has and always will.

Jess
www.private-web.se.tc

LWN web design

Posted Nov 9, 2009 9:33 UTC (Mon) by kragil (guest, #34373) [Link]

This article finally hit Digg ( http://digg.com/linux_unix/How_Google_uses_Linux_2 ) and from the comments you can see that LWN does not really appeal to most of the Digg crowd (even in the Linux section).

I think hiring a web designer from this century (new colours, CSS stuff) could really improve this site. I am not talking about a lot of JS or Flash, just a newer, more modern look.

Kernel hackers seem to complain that new blood is lacking, but to an ignorant observer a lot of stuff seems stuck in 1996 (just compare the Rails, GNOME, and KDE news and planet sites to what the kernel has).

I won't even mention microblogging :)

LWN web design

Posted Nov 9, 2009 10:32 UTC (Mon) by k8to (guest, #15413) [Link]

I for one greatly appreciate the current design.

LWN web design

Posted Nov 9, 2009 12:59 UTC (Mon) by quotemstr (subscriber, #45331) [Link]

Good. Digg repellent.

LWN web design

Posted Nov 9, 2009 14:17 UTC (Mon) by kragil (guest, #34373) [Link]

True, most current subscribers probably like or don't mind the design.
But it sure as hell does not appeal to new subscribers (in general).
And the general lack of good web design (I am talking about good layout, fonts and colours - not JS, Flash or anything) is probably keeping new contributors, and to some extent even new Linux users, away.

LWN web design

Posted Nov 9, 2009 15:00 UTC (Mon) by anselm (subscriber, #2796) [Link]

In what way is the LWN.net design not »good«? It does what it is supposed to do in a very unobtrusive way -- unlike many of the newer sites. Not chasing after the latest visual fashions does not automatically make its layout, fonts, and colours »bad«. Exactly what about these do you think is keeping new users away?

(Having said that, table-based layout isn't exactly the done thing these days, but you only addressed »layout, fonts, and colours«, not HTML-level implementation, and as far as I'm concerned there's nothing whatsoever wrong with those. Also, registered users can tweak the colours to their own liking, and it probably wouldn't be impossible to allow the fonts to be tweaked, too.)

I'm sure that the LWN.net team will welcome constructive hints as to how to improve the LWN.net »experience« without giving up the strengths of the site, i.e., no-frills Linux info and commentary. For the time being, however, I'd much rather Jon & co. spend their time on giving us the best Linux news possible than chase the newest fads in web design. After all, people come here for the content, not to see great web design.

LWN web design

Posted Nov 9, 2009 15:57 UTC (Mon) by kragil (guest, #34373) [Link]

I never said anything about coming here for great web design, just about not thinking GeoCities is still alive.

And I am no web designer, but AFAIK you can do a lot of nice stuff with just CSS (even rounded corners etc.). A lot of small changes done by a professional would certainly add up. And it would still be backwards compatible and no-frills.

Just read a few comments on the link above to look beyond your little bubble.
I don't think a more professional look would be bad for LWN. Quite the opposite.

LWN web design

Posted Nov 9, 2009 16:46 UTC (Mon) by anselm (subscriber, #2796) [Link]

Right. Rounded corners. Rounded corners will definitely make all the difference! Honestly, if you can't come up with better suggestions than this ...

I just had a look at the comments on the Digg article you quoted above and I'm not convinced that encouraging that sort of crowd to come here is something Jon & co. should spend time and energy on. If the web design is what keeps them away then I would surely recommend keeping things as they are.

(Incidentally, looking at the Digg site itself, I'll take current LWN.net over the Digg design any day, thank you very much. Maybe it's just me, but I like my fonts readable and navigation where I can actually find it. Also very incidentally, unlike you I in fact pay LWN.net money every month so they can keep doing what they are doing so well. I wonder how many of the Digg users you revere so much would -- even if LWN.net looked like Digg?)

LWN web design

Posted Nov 9, 2009 23:26 UTC (Mon) by kragil (guest, #34373) [Link]

Again, I never said LWN should work, look or do anything like Digg; I only mentioned rounded corners as one of the more advanced things you can do with CSS. You can also do gradients, text shadows and a lot more. Google is your friend here. I guess you read what you want to read.
I just think a more professional, more modern look that appeals to _more_ people (not just Digg users) and makes a good impression has no downsides, but a lot of people hate change and want to do and have things the same way for eternity. Fine, I guess that is how life conditions them in the long run; I just happen to have a less conservative attitude, and most websites that never change and don't adapt have a good chance of dying, even with the big stream of money you send their way.

LWN web design

Posted Nov 9, 2009 23:40 UTC (Mon) by quotemstr (subscriber, #45331) [Link]

most websites that never change and don't adapt have a good chance of dying
Yes, but that adaptation doesn't necessarily have to come in the form of the latest design trends. Objectively speaking, LWN is perfectly usable. What you're objecting to is LWN's not following web fashion. Not following web fashion hasn't seemed to hurt craigslist, despite the existence of many more hip competitors.

LWN web design

Posted Nov 10, 2009 0:22 UTC (Tue) by kragil (guest, #34373) [Link]

I never visited craigslist until now, but I think the reason for their continued success is their big user base; even langpop uses CL as a data source. And you could say they have a super-minimalistic design (no pics, just standard link colours and nothing else, although for Cologne they have a little "de" for Germany next to their name). So they do have a design; LWN, not so much.

LWN web design

Posted Nov 13, 2009 1:11 UTC (Fri) by foom (subscriber, #14868) [Link]

I just noticed the program "Marketplace":
Marketplace... Craig's List. Without the Ugly.

Love Craig's List but hate how painful and ugly it is? Me too. So I made Marketplace. It takes the pain and ugly out and leaves the good stuff in.

I agree with the other comments that LWN could do with some sprucing up. It doesn't really bother me that it's currently ugly, I still read (and pay for) it. But I wouldn't mind it being nicer, either, and it might keep other readers from running in terror.

LWN web design

Posted Nov 10, 2009 0:10 UTC (Tue) by anselm (subscriber, #2796) [Link]

If your resources are limited (as LWN.net's are) it makes sense to stay away from stuff that is essentially eye candy for people who must always have the latest and greatest, and concentrate on stuff that benefits all your readers, like compelling content. If I was in charge of HTML/CSS development for LWN.net, I would consider some changes but they would not in the first instance touch the visual appearance -- I would probably move to a purely CSS-based (instead of table-based) layout to make the site more accessible. I might change some of the visual design but only to improve readability, not to make substantial changes to the layout as it appears. IMHO, such changes would be worthwhile but they would not be changes for change's sake the way you seem to be advocating. (Feel free to suggest anything specific that will actually improve the reader's experience if you can think of something.)

As far as Google is concerned, when the site was new it looked completely unlike all the other search engines precisely because it went back to the basics and presented essentially a logo, text entry widget and search button. In spite of this »conservative attitude« it still went on to become the most popular search engine on the planet. Again, it was the content that made the difference, not the (lack of) bling; people were attracted more by the good results Google produced than they were turned off by the straightforward appearance. Also, in spite of not changing its appearance substantially during the last decade or so, www.google.com isn't likely to go away anytime soon, either.

Finally, the »big stream of money« from subscribers is, to a large degree, what keeps LWN.net going. Jon & co. may, in fact, be very interested in updating their web design but perhaps they can't afford to spend the time or money at this moment. So if you want them to be able to contemplate non-essential tasks like HTML/CSS hacking, instead of whining about how LWN.net will go away if they don't »adapt« you should really contribute to their bottom line by taking out a subscription, which will certainly have a much greater impact than any amount of whining.

LWN web design

Posted Nov 10, 2009 1:11 UTC (Tue) by kragil (guest, #34373) [Link]

Well, sometimes you have to spend money to earn it.
And I suggested that you google for CSS capabilities and designs.

LWN web design

Posted Nov 10, 2009 10:51 UTC (Tue) by anselm (subscriber, #2796) [Link]

This is getting silly. If you can't point to anything specific and constructive that you would actually change to improve LWN.net other than »use rounded corners, they're cool« this must mean that the site isn't so bad to begin with, so I'll politely and respectfully suggest you shut up.

LWN web design

Posted Nov 10, 2009 12:06 UTC (Tue) by kragil (guest, #34373) [Link]

Again you don't listen. My suggestion (I spell it out especially for you because you seem to be unable to grasp simple stuff; disrespect goes both ways):

Get a good web designer who knows modern web design (page layout, usability, style, colours, logo, etc.) and have him/her improve the crappy first impression this site makes. That may or may not include rounded corners; I don't know, as I am not a web designer, as I already mentioned and explained - that was just one tiny technical example, which still does not fit into your brain. All I know is that this site's design (unprofessional logo with green font, annoying flashy ads, black, red or blue text, grey and orange boxes, no advanced CSS, etc.) undeniably makes a bad impression, which does not help anybody.
This doesn't have to cost a lot. There are a lot of talented young web monkeys out there who don't charge much per hour. The first thing, though, is to acknowledge that not everything is peachy.

LWN web design

Posted Nov 10, 2009 12:31 UTC (Tue) by hppnq (guest, #14462) [Link]

Another cup of Open Source, anyone?

LWN web design

Posted Nov 10, 2009 12:33 UTC (Tue) by quotemstr (subscriber, #45331) [Link]

My original point, if I may flesh it out a bit, is that the kind of person bothered by LWN's layout probably won't get much out of LWN's content in the first place. LWN's attraction to me is the deep, literate, and mature coverage, and to a lesser extent, the informative and useful comment section. I couldn't care less how the site looks, and would be just as happy (no, happier) if I could read it over NNTP. Changing lwn.net to pander to the Digg crowd would compromise what makes LWN worthwhile in the first place. Frankly, the kind of person who judges a site based on how Web 2.0 it is would find the articles here boring, and would post vapid comments saying so. It'd be an Eternal September.

LWN web design

Posted Nov 10, 2009 12:40 UTC (Tue) by nye (guest, #51576) [Link]

This idea may be absolutely unthinkable to you, but it is actually possible to appreciate good design without being a sub-literate fool, despite what your prejudices may lead you to feel.


Copyright © 2009, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds