Thoughts on the ext4 panic

By Jonathan Corbet
October 29, 2012

In just a few days, a linux-kernel mailing list report of ext4 filesystem corruption turned into a widely-distributed news story; the quality of ext4 and its maintenance, it seemed, was in doubt. Once the dust settled, the situation turned out to be rather less grave than some had thought; the bug in question only threatened a very small group of ext4 users using non-default mount options. As this is being written, a fix is in testing and should be making its way toward the mainline and stable kernels shortly. The bug was obscure, but there is value in looking at how it came about and the ripples it caused.

The timeline

On October 23, user "Nix" was trying to help track down an NFS lock manager crash when he ran into a little problem: the crash kept corrupting his filesystem, making the debugging task rather more difficult than it would otherwise have been. He reported the problem to the linux-kernel mailing list; he also posted a warning for other LWN readers. The ext4 developers moved quickly to find the problem, coming up with a hypothesis within a few hours of the initial report. Unfortunately, the hypothesis turned out to be wrong.

Before that became clear, though, a number of news outlets had posted articles on the problem. LWN was not the first to do so ("first" is not at the top of our list of priorities), but, late on the 24th, we, too, posted an item about the issue. It quickly became clear, though, that the original hypothesis did not hold water, and that further investigation was in order. That investigation, as it turns out, took a few days to play out.

Eric Sandeen eventually tracked the problem down to this commit, which found its way into the mainline during the 3.4 merge window. That change was meant to be a cleanup, gathering the inode allocation logic into a single function and removing some duplicated code. The unintended result was to cause the inode bitmap to be modified outside of a transaction, introducing unchecksummed data into the journal. If the system crashed during that time, the next mount would encounter checksum errors and refuse to play back the journal; the filesystem was then seen as being corrupt.
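The failure mode described above can be illustrated with a toy model. What follows is a minimal sketch, not the jbd2 implementation: the `Transaction` class, the `commit()` and `replay()` helpers, and the use of CRC32 are all assumptions made for illustration. It shows only the general idea that a block changed after its checksum was recorded causes replay to be refused.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Transaction:
    """A toy journal transaction: block payloads plus a commit checksum."""
    blocks: list
    checksum: int


def checksum_of(blocks):
    """CRC32 chained over all block payloads in order."""
    crc = 0
    for block in blocks:
        crc = zlib.crc32(block, crc)
    return crc


def commit(blocks):
    """Seal a transaction by recording the checksum of its blocks."""
    return Transaction(blocks=list(blocks), checksum=checksum_of(blocks))


def replay(journal):
    """Replay transactions in order; refuse at the first checksum mismatch."""
    replayed = []
    for index, txn in enumerate(journal):
        if checksum_of(txn.blocks) != txn.checksum:
            raise ValueError(f"checksum mismatch in transaction {index}")
        replayed.extend(txn.blocks)
    return replayed


good = commit([b"inode table", b"inode bitmap v1"])
bad = commit([b"inode table", b"inode bitmap v1"])
# Simulate the bug in this toy model: the bitmap changes after the
# checksum was computed, i.e. outside the transaction meant to cover it.
bad.blocks[1] = b"inode bitmap v2"
```

In this model, `replay([good])` succeeds, while `replay([good, bad])` raises an error on the second transaction, which is the analogue of the mount-time refusal to play back the journal.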

The interesting thing is that, on most systems, this problem will never come about because, on those systems, the journal checksums do not actually exist. Journal checksumming is an optional feature, not enabled by default, and, evidently, not widely used. Nix had turned on the feature somewhat inadvertently; most other users do not turn it on at all, even if they are aware it exists. Anybody who has journal checksums turned off will not be affected by this bug, so very few ext4 users needed to be concerned about potential data corruption.
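For readers wondering whether they are in the affected group, a rough check is to look for the relevant mount options. The sketch below assumes that the options appear in `/proc/mounts`-style text when enabled; it looks for `journal_checksum` and also for `journal_async_commit`, on the assumption that the latter turns checksumming on internally. The function name and the sample text are invented for illustration.

```python
def ext4_mounts_with_journal_checksum(mounts_text):
    """Return mount points of ext4 filesystems whose option list
    contains journal_checksum or journal_async_commit."""
    flagged = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # not a well-formed mounts line
        _device, mount_point, fs_type, options = fields[:4]
        if fs_type != "ext4":
            continue
        opts = set(options.split(","))
        if opts & {"journal_checksum", "journal_async_commit"}:
            flagged.append(mount_point)
    return flagged


# Hypothetical sample in /proc/mounts format, for illustration only.
SAMPLE = (
    "/dev/sda1 / ext4 rw,relatime,journal_checksum 0 0\n"
    "/dev/sdb1 /home ext4 rw,relatime 0 0\n"
    "tmpfs /tmp tmpfs rw 0 0\n"
)
```

On a live system, the same function could be fed the contents of `/proc/mounts`; an empty result would suggest that none of the mounted ext4 filesystems has the feature turned on.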

As an interesting aside, checksums on the journal are a somewhat problematic feature; as seen in this discussion from 2008, it is not at all clear what the best response should be when journal checksums fail to match. The journal checksum may not be information that the system can reasonably act upon; indeed, as in this case, it may create problems of its own.

Eric's patch appears to fix the problem; corrupted journals that were easily observed before its application do not happen afterward. There will naturally be a period of review and testing before this change is merged into the mainline — nobody wants to create a new problem through undue haste — but kernel releases with a version of the fix (it has already been revised once) should be available to users in short order. But most users will not really care, since they were not affected by the problem in the first place. They may care more about the plans to improve the filesystem test suites so that regressions of this nature can be more easily caught in the future.

Analysis

In retrospect, the media coverage of this bug was clearly out of proportion to that bug's impact. One might attribute that to a desire for sensational stories to drive traffic, and that may well be part of what was going on. But there are a couple of other factors that are worth keeping in mind before jumping to that judgment:

Many media outlets employ editors and writers who, almost beyond belief, are not trained in kernel programming. That makes it very hard for them to understand what is really going on behind a linux-kernel discussion even if they read that discussion rather than basing a story on a single message received in a tip. They will see a subject like "Apparent serious progressive ext4 data corruption," along with messages from prominent developers seemingly confirming the problem, and that is what they have to go with. It is hard to blame them for seeing a major story in this thread.
Even those who understand linux-kernel discussions (LWN, in its arrogance, places itself in this category) can be faced with an urgent choice. If there were a data corruption bug in recent kernels, then we would be beyond remiss to fail to warn our readers, many of whom run the kernels in question. There comes a point where, in the absence of better information, there is no alternative to putting something out there.

The ext4 developers certainly cannot be faulted for the way this story went. They did what conscientious developers do: they dropped everything to focus on what appeared to be a serious regression affecting their users. They might have avoided some of the splash by taking the discussion private and not saying anything until they were certain of having found the real problem, but that is not the way our community works. It is hard to imagine that pushing development discussions out of the public view is going to make things better in the long run.

Thus, one might conclude that we are simply going to see an occasional episode like this, where a bug report takes on a life of its own and is widely distributed before its impact is truly understood. Early reports of software problems, arguably, should be treated like early software: potentially interesting, but likely to be in need of serious review and debugging. That's simply the world we live in.

A more serious concern may apply to the addition of features to the ext4 filesystem. Ext4 is viewed as the stable, production filesystem in the Linux kernel, the one we're supposed to use while waiting for Btrfs to mature. One might well question the addition of new features to this filesystem, especially features that prove to be rarely used or that don't necessarily play well with existing features. And, sure enough, Linux filesystem developers have raised just this kind of worry in the past. In the end, though, the evolution of ext4 is subject to the same forces as the rest of the kernel; it will go in the directions that its developers drive it. There is interest in enhancing ext4, so new features will find their way in.

Before getting too worried about this prospect, though, it is worth thinking about the history of ext4. This filesystem is heavily used with all kinds of workloads; any problems lurking within will certainly emerge to bite somebody. But problems that have affected real users have been exceedingly rare and, even in this case, the number of affected users appears to be countable without running out of fingers. Ext4, in other words, has a long and impressive record of stability, and its developers are determined to keep it that way; this bug can be viewed as the sort of exception that proves the rule. One should never underestimate the value of good backups, but, with ext4, the chances of having to actually use those backups remain quite small.

Exceptions and rules

Posted Oct 29, 2012 14:52 UTC (Mon) by rfunk (subscriber, #4054)

Argh. This is an excellent article, marred only by the serious (mis)use of the nonsensical expression "exception that proves the rule". The net is full of explanation about the original meaning of this phrase, and why almost every modern use of it is wrong.

Exceptions and rules

Posted Oct 29, 2012 14:57 UTC (Mon) by corbet (editor, #1)

We have an internal reviewer who will be gratified by your comment.. sorry to have marred your experience. I'll make a rule not to use that phrase again, with only rare exceptions.

Exceptions and rules

Posted Oct 29, 2012 16:53 UTC (Mon) by Trelane (subscriber, #56877)

> I'll make a rule not to use that phrase again, with only rare exceptions.

I cannot take exception to that exceptional rule.

Exceptions and rules

Posted Oct 30, 2012 11:40 UTC (Tue) by zack (subscriber, #7062)

> I'll make a rule not to use that phrase again, with only rare exceptions.

Chapeau.

Exceptions and rules

Posted Oct 29, 2012 15:33 UTC (Mon) by nix (subscriber, #2304)

It wasn't a misuse. This exception did indeed prove (i.e. test) the rule, and find it to be unchanged. ext4 is stable as hell.

Exceptions and rules

Posted Oct 29, 2012 15:40 UTC (Mon) by rfunk (subscriber, #4054)

If there is an actual exception, then the rule fails the test. If the rule passes the test, there's no exception.

Exceptions and rules

Posted Oct 29, 2012 16:29 UTC (Mon) by khim (subscriber, #9252)

> If there is an actual exception, then the rule fails the test. If the rule passes the test, there's no exception.

Yup. The 1001st wrong interpretation of the phrase. If you stop complaining for a minute and actually read up on the meaning of the phrase in question, you'll find out that the original principle is exceptio probat regulam in casibus non exceptis ("the exception confirms the rule in cases not excepted") and that this article used the phrase correctly.

The very fact that this tiny quirk, which needed non-default (and non-recommended) mount options, warranted such a huge ruckus means that the rule is "ext4 is rock-solid", in accordance with the exceptio probat regulam in casibus non exceptis principle.

Exceptions and rules

Posted Oct 29, 2012 16:38 UTC (Mon) by rfunk (subscriber, #4054)

I was responding to nix's interpretation above. You're correct about the original meaning, of course. But I disagree that the usage here fits the original meaning. Even if I accept that it does, however, just the fact that the phrase is widely misunderstood and misused should consign it to the dustbin; if most people don't understand it correctly, communication isn't happening.

Exceptions and rules

Posted Oct 29, 2012 16:49 UTC (Mon) by khim (subscriber, #9252)

> I was responding to nix's interpretation above. You're correct about the original meaning, of course. But I disagree that the usage here fits the original meaning.

It does. We had a rule: ext4 is rock-solid and is robust no matter what you throw at it (unless you are doing something totally insane like using a broken RAID controller). Now we have an exception: if you use ext4 with the non-default mount option journal_checksum then you may expect data corruption in some rare cases. The fact that such a small exception raised such a ruckus can be used as support for the original rule (after all, if the filesystem corrupted data all the time, then yet another bug in it would hardly be newsworthy material).

> Even if I accept that it does, however, just the fact that the phrase is widely misunderstood and misused should consign it to the dustbin; if most people don't understand it correctly, communication isn't happening.

Most people on this planet don't understand English. Should we stop using it and close LWN?

Exceptions and rules

Posted Oct 29, 2012 18:40 UTC (Mon) by randomguy3 (subscriber, #71063)

See Wikipedia's article on the phrase, which enumerates 5 usages (from most correct to least correct) from Fowler's Modern English Usage. I think that the article's usage would fit in number 3 (loose rhetorical): the use of the phrase is to draw attention to the fact that something is uncommon. The existence of the bug does not imply that bugs are uncommon. Nor is the ruckus itself an exception from anything.

> Most people on this planet don't understand English. Should we stop using it and close LWN?

"Most people on the planet" is irrelevant. "Most people who read LWN" is much more relevant, and "tech-savvy English speakers" is a good approximation of that.

Exceptions and rules

Posted Oct 29, 2012 16:29 UTC (Mon) by epa (subscriber, #39769)

I think the idea is that the apparent exception to the rule, on closer examination, turns out not to be an exception after all. So the rule still holds. It is silly, though; as you say, if the rule holds then there was no exception after all.

Exceptions and rules

Posted Oct 29, 2012 15:37 UTC (Mon) by tialaramex (subscriber, #21167)

"almost every modern use of it is wrong"

Or, to look at it another way, perhaps it's the usage mavens who are wrong. We don't all have to be Humpty Dumpty, but if the vast majority of people choose to use a word, a phrase or even an entire language in a way that overturns precedent, you eventually have to bend to their will or be broken by it. This isn't an example like misnegation, where you can argue that perhaps someone didn't mean what they wrote; the author intended exactly these words, they just didn't intend them to mean quite what they once did.

Even eggcorns, which begin with a mistaken re-analysis of a spoken word or phrase, sometimes enter the mainstream. Once upon a time "upmost" was clearly sometimes a mistake for "utmost"; today it's unclear. In another century there may be usage mavens insisting "upmost" should be preferred as more logical, and "down the pipe" might likewise displace "down the pike".

Exceptions and rules (digression)

Posted Oct 29, 2012 15:55 UTC (Mon) by Richard_J_Neill (guest, #23093)

The modern use of this is confusing. After all, "The proof of the pudding is in the eating" means "The right way to test a pudding is by eating it".
Also, "80% proof whisky" means "tested, contains 40% ethanol".

Exceptions and rules (digression)

Posted Oct 29, 2012 19:54 UTC (Mon) by davidstrauss (guest, #85867)

It's "80 proof," not "80% proof." "Proof" is like "percent," except with a denominator of 200 -- and only really used for liquor.

Exceptions and rules (digression)

Posted Oct 29, 2012 23:10 UTC (Mon) by barryascott (subscriber, #80640)

I like the historic origin of testing alcohol content by seeing if gunpowder will ignite.

http://en.wikipedia.org/wiki/Alcohol_proof

Exceptions and rules

Posted Oct 29, 2012 15:56 UTC (Mon) by rfunk (subscriber, #4054)

But those exact words don't mean anything sensible in most cases. An exception cannot logically show that a rule is valid in the usual modern sense of "prove", and this is what most people mean. The modern idea seems to be that because this phrase exists, it must be valid to use an exception to prove that a rule is valid, which is of course logical nonsense. Logical nonsense is much different from simply giving new definitions to words or coining new words.

Exceptions and rules

Posted Oct 29, 2012 16:34 UTC (Mon) by tialaramex (subscriber, #21167)

Yes, I think we're all familiar with the "But it's not logical" argument from the Lynne Trusses of this world. The problem with that argument generally, and here specifically, is that language isn't about sharing propositions of formal logic. Interpreting other people's utterances as logical propositions makes for a briefly amusing diversion in a light drama, but it's a terrible way to carry on a real conversation.

"It's like the Somme out there"
"How does it resemble the Somme?"
"I mean it's incredibly muddy"
"But I am in fact willing to believe that it is muddy"

The legal origin of this idea of an "exception that proves the rule" is fascinating, but the phrase has taken on a life of its own. You will notice that people are also content to say "third time is a charm" (there is no logical reason to believe that third attempts are special in any way) and "If a job's worth doing it's worth doing well" (likewise, a shoddy job may be the only economic or practical option) and many other phrases which can't be defended logically. They're not doing it wrong, these phrases aren't intended to be truthful statements about the world, any more than anybody thought exceptions /actually/ prove a rule.

Exceptions and rules

Posted Oct 29, 2012 16:42 UTC (Mon) by rfunk (subscriber, #4054)

The other examples you cite at least have internal logic. People may or may not believe them (and many people do in fact believe them as truths about the world), but they make internal sense without needing to believe them as true.

Exceptions and rules

Posted Oct 29, 2012 16:41 UTC (Mon) by khim (subscriber, #9252)

"Exception that proves the rule" is part of the phrase. The full legal principle is exceptio probat regulam in casibus non exceptis ("the exception confirms the rule in cases not excepted") as you, I hope, know.

A classic example would be "Special leave is given for men to be out of barracks tonight till 11.00 p.m."; "The exception proves the rule" means that this special leave implies a rule requiring men, except when an exception is made, to be in earlier. The value of this in interpreting statutes is plain.

A modern example would be: now we know that the exceptional case of ext4 with journal_checksum is not stable. Applying the exceptio probat regulam in casibus non exceptis principle means that ext4 is stable unless the non-standard option journal_checksum is used, and this is indeed the case.

IOW: this article is a rare case where the phrase "exception that proves the rule" is used correctly.

P.S. When you start badmouthing people and explaining that "the net is full of explanation about the original meaning of this phrase, and why almost every modern use of it is wrong", it's a good idea to refresh your own knowledge and see if you understand what the phrase means and when it's appropriately used.

Exceptions and rules

Posted Nov 8, 2012 12:01 UTC (Thu) by Wol (subscriber, #4433)

This comment actually gets to the root of the problem. The modern understanding of the word "prove" has changed.

Hence, the saying no longer means what it did, because the meaning of the words has changed underneath it.

Cheers,
Wol

Exceptions and rules

Posted Oct 30, 2012 20:01 UTC (Tue) by cmccabe (guest, #60281)

Usually comments on LWN are very informative. This thread, however, is the exception that... never mind.

Exceptions and rules

Posted Oct 31, 2012 0:43 UTC (Wed) by nix (subscriber, #2304)

Agreed. I guess this just shows that the ext4 problem really *was* a tempest in a thimble (too small to be a teapot): thoroughly dull once the panic died away and its complete lack of terrifying implications sank in.

Thoughts on the ext4 panic

Posted Oct 29, 2012 15:43 UTC (Mon) by nix (subscriber, #2304)

> introducing unchecksummed data into the journal

More serious than being unchecksummed, this was a modification made outside all journal transactions. That's... not supposed to happen.

> it is not at all clear what the best response should be when journal checksums fail to match

It is clear that the current response is far suboptimal, but the right approach (aborting only those blocks with corrupted checksums) isn't implemented, and since it relates only to an obscure feature that basically nobody turns on, implementing it hasn't been considered terribly important.
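To make the alternative concrete, here is a purely hypothetical sketch of what "aborting only those blocks with corrupted checksums" could look like. It assumes a per-block checksum, which is an assumption of this example rather than a description of the journal format; the function name and tuple layout are invented, and the sketch ignores whether skipping a block while applying later ones is actually safe.

```python
import zlib


def replay_skipping_bad_blocks(journal_blocks):
    """Apply journaled blocks whose checksum matches; skip the rest.

    journal_blocks is a list of (block_number, payload, recorded_crc)
    tuples. Returns (applied, skipped): a dict of block number to
    payload, and the list of block numbers that failed verification.
    """
    applied = {}
    skipped = []
    for block_number, payload, recorded_crc in journal_blocks:
        if zlib.crc32(payload) == recorded_crc:
            applied[block_number] = payload
        else:
            # Only this block is abandoned; replay carries on.
            skipped.append(block_number)
    return applied, skipped
```

The design choice here is to report the skipped block numbers rather than raise an error, so that a caller could hand them to a checker such as fsck instead of abandoning the whole journal.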

> it has already been revised once

Twice. Eric complexified the patch, then Ted simplified it again.

> the media coverage of this bug was clearly out of proportion to that bug's impact
> [...]
> Many media outlets employ editors and writers who, almost beyond belief, are not trained in kernel programming.

In other news, a degree of wind expected on the US east coast today. :)

> the addition of features to the ext4 filesystem

... surely doesn't apply here. This feature has been around for years and years, about as long as ext4 itself. Even this bug caused no more than the loss of a few log entries and half a dozen emails. fsck cleaned up the mess remarkably well, well enough that I thought nothing of smashing the corruption hammer into the same filesystem over and over again while helping characterize the bug.

Quote of the week goes to jcorbet

Posted Oct 29, 2012 17:08 UTC (Mon) by dskoll (subscriber, #1630)

> Many media outlets employ editors and writers who, almost beyond belief, are not trained in kernel programming.

Jon, that was priceless.

Thoughts on the ext4 panic

Posted Oct 29, 2012 17:29 UTC (Mon) by cesarb (subscriber, #6266)

> One should never underestimate the value of good backups, but, with ext4, the chances of having to actually use those backups remain quite small.

"One should never underestimate the value of good backups, but, with ext4, the chances of having to actually use those backups _because of ext4_ remain quite small."

Fixed. As far as I know, ext4 does nothing to reduce the need for backups (unlike some other filesystems with things like built-in mirroring to separate disks or even separate machines).

Thoughts on the ext4 panic

Posted Nov 8, 2012 6:00 UTC (Thu) by kevinm (guest, #69913)

Mirroring, whether to separate disks or separate machines, also does nothing to reduce the need for backups.

In the case of application- or user- level corruption, mirroring will simply replicate the error very efficiently.

Thoughts on the ext4 panic

Posted Nov 8, 2012 7:49 UTC (Thu) by dlang (guest, #313)

to err is human, to really foul things up requires a computer

and to _really_ trash things requires automation :-)

automated replication (including mirroring) is a wonderful way to propagate corruption (as someone who has automated the trashing of several hundred systems with a single command, I can really attest to this one)

Thoughts on the ext4 panic

Posted Nov 8, 2012 15:34 UTC (Thu) by nix (subscriber, #2304)

Quite. I did, after all, spot this problem on a hardware RAID-5 array. The RAIDness of the array didn't help: the corruption was faithfully written out. It had neither the evil nor corrupted bits set, after all.

Thoughts on the ext4 panic

Posted Nov 11, 2012 2:16 UTC (Sun) by steffen780 (guest, #68142)

Sorry, but that claim is just wrong. Mirroring and replication do REDUCE the need for backups. They do, however, not ELIMINATE it. They protect against many forms of hardware failure, including what I would assume is the most common cause of hw-caused data loss: disk failures. Hence they reduce the need for backups. Of course they do not get rid of user-/app-level failures. They're not supposed to (and by definition cannot), and hence they do not eliminate the need for backups... but they do definitely reduce it.

Thoughts on the ext4 panic

Posted Nov 12, 2012 3:58 UTC (Mon) by kevinm (guest, #69913)

The need for backups is binary - you either need backups, or you don't need backups.

Thoughts on the ext4 panic

Posted Nov 12, 2012 5:10 UTC (Mon) by neilbrown (subscriber, #359)

The backups you need can take any of a range of values from none to offsite copies that let you reconstruct the state of anything at any moment.

Linguistic bug threatens linux users!!!!!

Posted Oct 29, 2012 20:36 UTC (Mon) by bkw1a (subscriber, #4101)

In the news:

"A nasty language error in a recent Linux Weekly News article has been proven capable of causing a massive cascade of user comments. The error slipped by previously-respected LWN editor Jonathan Corbet. What does an error of this magnitude say about the process of Linux article development?"

Next up: FRANKENSTORM !!!!!

Linguistic bug threatens linux users!!!!!

Posted Oct 29, 2012 22:25 UTC (Mon) by nix (subscriber, #2304)

My understanding is that massive hurricanes with the accompanying power surges and days submerged in dirty water can cause significant damage to ext4 filesystems. This is *obviously* the fault of inadequate testing and the people responsible should be ashamed of themselves for not running ext4 tests at the bottom of swimming pools like any *real* filesystems developer would.

(Sorry, just channelling Phoronix forum users for a moment.)

Linguistic bug threatens linux users!!!!!

Posted Oct 30, 2012 12:47 UTC (Tue) by man_ls (guest, #15091)

> What does an error of this magnitude say about the process of Linux article development?

Don't worry, Corbet is just the exception that proves the rule.

Thoughts on the ext4 panic

Posted Oct 29, 2012 21:27 UTC (Mon) by bronson (subscriber, #4806)

Nice job Nix. Your tenacity has made the world a better place. Not necessarily because many people were going to lose data to this bug, but because you showed how something that looks like an obviously simple and safe feature can cut in surprising ways.

Thoughts on the ext4 panic

Posted Oct 29, 2012 22:36 UTC (Mon) by nix (subscriber, #2304)

Oh yes. For why async commits require checksumming, see fs/jbd2/recovery.c:do_one_pass(), the switch case for JBD2_COMMIT_BLOCK.

As an aside, this function provides a fairly strong argument against eight-character tab indentation, wrapping at 80 characters, or both... the lines are so squashed and short that it's really terribly hard to read. Personally I'm more in favour of a '70 character lines, disregarding indentation' rule, which prevents lines becoming too long to be readable without constraining indentation depth hugely: in do_one_pass(), the effective line length due to one while loop plus a switch statement plus a loop or conditional inside one of the cases is about thirty characters, which is just ridiculous. (As for pandering to the tiny minority of people still writing code on VT102s, they can go and get some hardware built sometime in the last twenty years. Punched cards are obsolete: so are 80-character terminals, IMNSHO.)
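The '70 character lines, disregarding indentation' rule is easy to state as a check. The following is a minimal sketch of one possible reading of that rule; the function name and the default limit of 70 are assumptions for illustration, and this is not an existing kernel style checker.

```python
def overlong_lines(source, limit=70):
    """Return (line_number, length) pairs for lines that exceed the
    limit once leading whitespace (the indentation) is ignored."""
    result = []
    for number, line in enumerate(source.splitlines(), start=1):
        content = line.lstrip()  # drop tabs and spaces used for indentation
        if len(content) > limit:
            result.append((number, len(content)))
    return result
```

Under this reading, a deeply nested line is judged only by the text the eye actually has to sweep across, so three tabs of indentation cost nothing against the limit.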

Thoughts on the ext4 panic

Posted Oct 30, 2012 0:55 UTC (Tue) by ikm (subscriber, #493)

When one runs a terminal emulator, it's usually 80-chars wide by default. I have a strong feeling that the 80-chars wrapping is still considered relevant largely because of that default.

Thoughts on the ext4 panic

Posted Oct 30, 2012 4:28 UTC (Tue) by martinfick (subscriber, #4455)

Or perhaps enough programmers still like to put more than one terminal or editor on their screen at once, side by side? I suspect that those who use IDEs forget that this is even possible.

Thoughts on the ext4 panic

Posted Oct 30, 2012 7:14 UTC (Tue) by Aliasundercover (subscriber, #69009)

What is it with the IDEs not allowing multiple source windows? I can't imagine working without my screen filled with various views of my source. The IDEs I look at give tabs but no way to display multiple at once. When writing code I normally want to see a header defining the structures I'm using, some other source file which my code interacts with, yet another source file where I did something vaguely similar once before ...

I don't see a Linux IDE which allows this. Visual Studio on Windows does, I just have to tell it to work in its multiple document mode.

If given the choice between a pack of x-terms and vi straight out of the 80s and a modern IDE with all the bells and whistles but just one source window I would take x-terms and vi. Hard to imagine how anyone can use these IDEs much less design them the way they are.

Thoughts on the ext4 panic

Posted Oct 30, 2012 12:42 UTC (Tue) by fb (guest, #53265)

> What is it with the IDEs not allowing multiple source windows? [...]
> I don't see a Linux IDE which allows this.

FYI, just checked and Eclipse (at least Juno/4.2) allows this. I can drag a source tab and place it next to my main source view.

Thoughts on the ext4 panic

Posted Oct 30, 2012 17:26 UTC (Tue) by marcH (subscriber, #57642)

It is indeed much easier to split and generally manage the screen with a full screen Eclipse than with most window managers.

Thoughts on the ext4 panic

Posted Oct 30, 2012 21:17 UTC (Tue) by khc (guest, #45209)

But only if all the windows you need to manage are Eclipse windows, of course. I recently started to play with Dart using the Dart Editor, which is basically Eclipse; it works OK, but I don't know if I'd consider managing a full-screen Eclipse along with other windows "easy".

Thoughts on the ext4 panic

Posted Oct 31, 2012 8:40 UTC (Wed) by marcH (subscriber, #57642)

> but I don't know if I'd consider managing a full screen eclipse along with other windows "easy"

No it's not easy; you are not supposed to do that. Instead, you are supposed to install and use whatever plug-ins give you the functionality you need *within* Eclipse.

Eclipse is the new Emacs: a Developer Operating System :-)

http://help.eclipse.org/indigo/index.jsp?topic=%2Forg.ecl...

Thoughts on the ext4 panic

Posted Nov 1, 2012 3:52 UTC (Thu) by Aliasundercover (subscriber, #69009)

> FYI, just checked and Eclipse (at least Juno/4.2) allows this. I can drag a source tab and place it next to my main source view

Hmm, I last poked at Eclipse a while back and got stuck in tile land till I gave up and moved on to try other IDEs which also left me in tiles. I find tiles most unsatisfying as I wind up spending my time fiddling with them and scrolling too small window fragments even on the biggest screen. I much prefer overlapping windows I can customize. I have been working in Kate which lets me arrange things as I like, permits multiple windows into the same source file and shares the screen well with windows from other programs.

Taking another look based on your message I got the Eclipse 3.7 Ubuntu offered me and found I can indeed get decent looking windows if I fiddle enough. It will take time to know if the fiddling stays excessive or I can tame it with more knowledge.

Of course it wants to mangle my code as I type. It really wants me to follow someone's idea of correct style and I know I will be some time getting it to stop helping me by making an awful mess of my code. Why exactly does it insist on extra * characters on each line? (No, don't answer that, I just want to turn it off.)

/*
* I can really do without these fool *s.
* Much preference fussing and still I get *s.
*/

Will have to see. Looking better than last time. Qt Creator and KDevelop never got me this far, not past tiles and tabs with them. So much IDE stuff filling the screen, so little space for the source code I actually care about.

Thanks

Thoughts on the ext4 panic

Posted Oct 31, 2012 9:25 UTC (Wed) by cesarb (subscriber, #6266)

> What is it with the IDEs not allowing multiple source windows? [...] I don't see a Linux IDE which allows this.

VIM has split/vsplit. Qt Creator also has split modes. And as others have already noted, Eclipse also has something of the sort. IIRC, Emacs can split the screen too.

Thoughts on the ext4 panic

Posted Oct 31, 2012 13:20 UTC (Wed) by nix (subscriber, #2304)

Emacs has always been able to split the screen, since long long before X support existed. (It calls the panes 'windows' and the GUI windows 'frames' because Emacs does everything differently.)

Emacs 24 can split the screen, both manually and automatically, in so many ludicrous ways I have not begun to explore the possibilities. (The window-management code was rewritten by Martin Rudalics, and the new code is... very flexible. Almost too flexible to understand.)

Thoughts on the ext4 panic

Posted Nov 1, 2012 9:32 UTC (Thu) by ncm (guest, #165)

"So many ways"?

1. Wrong

Any others?

Thoughts on the ext4 panic

Posted Nov 1, 2012 19:50 UTC (Thu) by daglwn (guest, #65432)

> I don't see a Linux IDE which allows this.

Emacs/CEDET.

Thoughts on the ext4 panic

Posted Oct 30, 2012 10:38 UTC (Tue) by nix (subscriber, #2304) [Link]

Well, yeah. Given how wide modern monitors are, 80 chars is still silly narrow. I can easily fit *three* 120-char-wide panes side by side. (Now maybe I use a smaller font than most, but even a normal xterm font should be able to fit two.)

Thoughts on the ext4 panic

Posted Oct 30, 2012 5:00 UTC (Tue) by wblew (subscriber, #39088) [Link]

Interestingly enough, I have not used a terminal window, for source code viewing, that is less than 150 characters for a number of years.

I truly enjoy programming on my 1920x1200 27" monitor.

Thoughts on the ext4 panic

Posted Oct 30, 2012 5:01 UTC (Tue) by MTecknology (subscriber, #57596) [Link]

I love 80 char limits. Thanks to larger screens we can make that a soft limit too! I find that it still makes a crazy amount of sense, especially when it comes to readability. I believe that's covered in chapter two of the Linux kernel coding style? "The limit on the length of lines is 80 columns and this is a strongly preferred limit."

Thanks nix! I love that summary. It saves me from having to read anything or do any of my own research and still sound smart when I explain it to others! :D

Thoughts on the ext4 panic

Posted Oct 30, 2012 10:39 UTC (Tue) by nix (subscriber, #2304) [Link]

I agree with the limit on the length of lines: it should generally be shorter, more like 70 chars. That has nothing to do with terminal emulator width and everything to do with readability and the human eye.

But that width is the length of the line from its first character to its last: you do not sweep your eye over the indentation, so it should not count.
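To make the distinction concrete, here is a minimal sketch of measuring a line the way I mean: from its first non-blank character to its last character, ignoring indentation. The helper name visible_width() is made up for illustration and is not taken from any existing tool.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Width of a source line as the eye sweeps it: leading spaces and tabs
 * and the line ending are not counted. Hypothetical helper, for
 * illustration only.
 */
static size_t visible_width(const char *line)
{
    size_t start, end;

    assert(line != NULL);
    start = strspn(line, " \t");    /* skip the indentation */
    end = strcspn(line, "\r\n");    /* stop at the line ending */
    return end > start ? end - start : 0;
}
```

With this measure, a statement such as foo(bar); counts as 9 characters whether it sits at the left margin or under two tabs, so a 70-character limit would apply to the text itself rather than to the screen column it ends in.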

Thoughts on the ext4 panic

Posted Oct 30, 2012 18:53 UTC (Tue) by vonbrand (guest, #4458) [Link]

According to Poynton's "Ten Common Mistakes in the Typesetting of Technical Documents", a 66-character line of text is widely considered ideal for readability. I've seen other claims in the same vein (e.g. in the LaTeX memoir and KOMA packages documentation, aimed at serious book-writing). The 80 characters come from the IBM punched cards of yore, but their design in turn surely wasn't completely random either.

The decree from the $POWERS_THAT_BE is enshrined in the Linux coding style; trying to change that is futile (or at least, there are more fruitful outlets for your creative energies).

Thoughts on the ext4 panic

Posted Oct 30, 2012 21:00 UTC (Tue) by dlang (guest, #313) [Link]

80 characters was based on 10 characters per inch on standard typewriters combined with 8 1/2" wide paper and the fact that you needed margins of ~1/4" to avoid problems with trying to type up to the edge of the paper.

When teletypes were built, they used the same print mechanisms and so had the same limits.

When terminals were built, they mimicked the printed stuff (so that you could see everything that you could see on the paper, and it was a waste to have anything wider, since the people who were still using paper wouldn't be able to see it)

IBM punch cards were 80 columns to match the paper as well.

The problem is none of these are good reasons any longer.

As for the ideal column width to read, go do some research on why newspapers use such narrow columns, the ideal width for reading is surprisingly narrow, and NOT 66 characters.

A paperback book is about the outer edge of what a good width is (no matter what the font size).

Thoughts on the ext4 panic

Posted Nov 4, 2012 23:37 UTC (Sun) by marcH (subscriber, #57642) [Link]

> As for the ideal column width to read, go do some research on why newspapers use such narrow columns, the ideal width for reading is surprisingly narrow, and NOT 66 characters.

Source code and newspaper articles are quite different types of "literature". They are typically laid out in extremely different ways. It would be a very surprising coincidence if their "ideal widths" were the same.

Thoughts on the ext4 panic

Posted Nov 6, 2012 18:25 UTC (Tue) by anselm (subscriber, #2796) [Link]

Newspaper columns are as narrow as they are because that lets the publisher fit more stuff on a page and cut paper costs, not because narrow columns are especially easy to read.

Newspapers even go to the trouble of having special fonts designed for them (where do you think Times Roman got its name?) in order to be able to cram more type into a narrow column, thus making the effective column width greater.

Thoughts on the ext4 panic

Posted Oct 30, 2012 9:54 UTC (Tue) by niner (subscriber, #26151) [Link]

The solution to the horrible formatting of do_one_pass() is not to increase the character limit, but to factor out parts of the function to decrease indentation levels. Some parts are indented 8 levels.

To quote CodingStyle: "The answer to that is that if you need more than 3 levels of indentation, you're screwed anyway, and should fix your program."

do_one_pass() is 390 lines long, making it very hard to see its structure anyway, which just strengthens the indication of needed refactoring.
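As a rough illustration of what factoring out buys, here is a toy sketch. It is not the jbd code and every name in it (struct rec, sum_positive(), one_pass()) is hypothetical. It assumes the innermost loop forms a logical unit of its own; once that loop is lifted into a helper, the outer function is down to a loop around a switch.

```c
#include <assert.h>
#include <stddef.h>

/* All names below are hypothetical; this is not the jbd recovery code. */
enum rec_type { REC_DATA, REC_COMMIT };

struct rec {
    enum rec_type type;
    const int *vals;
    size_t nvals;
};

/* The former innermost loop, lifted into its own function. */
static int sum_positive(const int *vals, size_t nvals)
{
    int sum = 0;

    assert(vals != NULL || nvals == 0);
    for (size_t i = 0; i < nvals; i++) {
        if (vals[i] <= 0)
            continue;   /* early continue instead of one more nesting level */
        sum += vals[i];
    }
    return sum;
}

/* The outer pass now shows its structure: a loop around a switch. */
static int one_pass(const struct rec *recs, size_t nrecs)
{
    int total = 0;

    for (size_t i = 0; i < nrecs; i++) {
        switch (recs[i].type) {
        case REC_DATA:
            total += sum_positive(recs[i].vals, recs[i].nvals);
            break;
        case REC_COMMIT:
            return total;   /* stop at the first commit record */
        }
    }
    return total;
}
```

Whether the real function has such a boundary is a separate question; the sketch only shows the mechanics when one exists.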

Thoughts on the ext4 panic

Posted Oct 30, 2012 10:43 UTC (Tue) by nix (subscriber, #2304) [Link]

So... you think that if you split each case statement into its own function, it would somehow become better designed?

Sorry, I'm as much a fan of functional decomposition as the next man -- actually I go a bit nuts about it -- but the key to that is splitting functions at logical boundaries. Splitting at random because the indentation passes some arbitrary 'too deep' level, when you would not have split it otherwise, is ridiculous -- and this function indicates that the kernel developers do not in fact do this ridiculous thing. (Obviously there *is* an effective length/complexity limit, and perhaps do_one_pass() is past it -- but a hypothetical function with a long stretch of low indentation with a bit of high indentation in the middle of it is *not* made any clearer by splitting the high indentation into a completely arbitrary separate function!)

Thoughts on the ext4 panic

Posted Oct 30, 2012 11:00 UTC (Tue) by niner (subscriber, #26151) [Link]

It may become better designed or maybe not. I did not read the code in detail because I had too little time to decipher it.

Maybe I was a bit unclear on this: if there is indeed not a single logical boundary in the function, then well, tough luck. Then this is one of the rare cases where there's just no good solution. Even with an increased character limit, the function would not be that much more readable. It would still be too long to get a good grasp of the structure quickly.

I did not mean that arbitrarily cutting up the function would be a solution, just that usually it is very much possible to decompose and achieve increased readability. That's why I used the word "indication" in my last sentence.

Thoughts on the ext4 panic

Posted Oct 30, 2012 14:14 UTC (Tue) by viro (subscriber, #7872) [Link]

oh, give me a break...

tagp = pointer; // fairly fat expression, actually
while (tagp - pointer <= size) { // again, and so's 'size'
        tag = (journal_block_tag_t *) tagp;
        flags = be32_to_cpu(tag->t_flags);
        /* couple of lines */
        err = jread(&obh, journal, io_block);
        if (err) {
                success = err;
                printk(....);
        } else {
                /* about 50 lines */
        }
skip_write:
        tagp += sizeof(journal_block_tag_t);
        if (!(flags & JFS_FLAG_SAME_UUID))
                tagp += 16;

        if (flags & JFS_FLAG_LAST_TAG)
                break;
}

In a case. Wrapped in a while. Only at Dibbler's, and that's cutting me own throat...

Seriously, in this case the 80-column heuristic has worked nicely - the code structure is badly obfuscated and 4-character tabs would only hide a visible indicator of trouble.

Eight character tabs

Posted Oct 30, 2012 13:01 UTC (Tue) by man_ls (guest, #15091) [Link]

The 80-char wrapping is obsolete, but the 8-char tab rule is just... puzzling. I usually have 4-char tabs and it works fine. I have used 2-char tabs and they work fine most of the time -- and that was for Python where indentation can get a bit hairy without {} blocks to guide the eye. The kernel coding style implies that 8-char tabs are easier on the eye, but that is a bit subjective -- for me they are much harder to read.

Is there any real reason why you cannot display tabs as 4 characters locally in your Linux code? After all Linux does use the tab character. I realize that you risk being burnt alive at the stake in some Linux conference if you forget to change tab length and they spot you, but that is a small price to pay for being able to read code comfortably.

Eight character tabs

Posted Oct 30, 2012 14:32 UTC (Tue) by nix (subscriber, #2304) [Link]

Changing the local tab width is safe as long as you never ever use spaces to indent anywhere, even on the ends of tabs to line up arguments in multi-line argument lists. That's... rare, and if people proceed to change tab widths and then line up arguments using *that* width, the result is just a mess for everyone. (Just another reason why it is basically a mistake to use tabs for anything. Spaces everywhere! I note that even GNU software, long a weird 8-char-tab-for-indentation-but-four-char-indent holdout, is slowly switching to spaces, project by project, because dealing with tabs has failure cases that just don't exist if you use spaces everywhere.)

Eight character tabs

Posted Oct 30, 2012 14:41 UTC (Tue) by paulj (subscriber, #341) [Link]

Well, dealing with spaces has failure cases that just don't exist if you use tabs everywhere :). But yes, over all, too many people will get tabs wrong that standardising on spaces - inferior as they are - is the only option.

Eight character tabs

Posted Oct 30, 2012 17:26 UTC (Tue) by nix (subscriber, #2304) [Link]

I don't see anything wrong with spaces. They make your uncompressed source code 10% or so larger, but these days that has no effect at all on anyone unless you're dealing with a source tree as large as, say, Chromium's. Everyone knows how big a space is, nobody can decide to make it take up more horizontal space... spaces are nice and uncontroversial and always work. Tabs don't.

Eight character tabs

Posted Oct 30, 2012 21:49 UTC (Tue) by man_ls (guest, #15091) [Link]

The problem with spaces is exactly that you cannot change indentation on the fly just by changing the local configuration. Every developer can choose the tabstop that suits them better. It is even possible to have different configurations for different situations: one for the laptop screen, another when the laptop is hooked to a monitor, etcetera. Spaces don't allow that.

Also, it takes longer to press space four times (or eight) than tab once. You can usually configure your favorite editor to insert the spaces when tab is pressed, even to delete them (:set expandtab in vim), so this issue is not so important.

As to the failure case you mention above, it makes sense to indent always using tabs, consistently. I have now remembered my painful days of aligning arguments using spaces -- it is just not worth it. I prefer to use short argument lists and set line wrap at 120 so I don't ever see multi-line argument lists; if they arise then just indent into a new line.

Eight character tabs

Posted Oct 31, 2012 0:42 UTC (Wed) by nix (subscriber, #2304) [Link]

Unfortunately changing indentation on the fly by changing tabs never, ever works. It only works if all developers are utterly rigorous about only ever indenting with tabs, always indenting with strictly one more tab, never using tabs to line anything up -- and, oh yes, never trying to make anything line up at all. If you like a coding style which lines up wrapped parameters one column after the open bracket of the call on the previous line (a very common style) you cannot do it, even with tabs, and expect indentation-changing-via-tab to work.

I have never worked on a codebase where changing indentation via tab width changes did anything but turn the codebase into goo. It is hopeless.

The way to reindent is via automatic reindentation programs (such as GNU indent). Teach that your indentation style, and you're home free. Nothing else works.

Eight character tabs

Posted Oct 31, 2012 1:05 UTC (Wed) by bronson (subscriber, #4806) [Link]

... and then somehow unreindent before you commit your code? Otherwise, unless you enforce company-wide indent settings (good luck!), most of your changes will be whitespace churn.

I'm at the point where I'm ready to declare that, on a real world codebase with multiple team members, nothing at all works. Nobody likes the whitespace gestapo.

Eight character tabs

Posted Oct 31, 2012 8:52 UTC (Wed) by man_ls (guest, #15091) [Link]

You can try telling git to ignore whitespace.

My point of view is a bit different: tabs are good, therefore avoid everything that doesn't work with tabs. Lining things up is evil, hard wrap-up limits are evil, and so on. Yes, it requires a lot of discipline, but then so does software development in general.

Eight character tabs

Posted Oct 31, 2012 13:17 UTC (Wed) by nix (subscriber, #2304) [Link]

Yeah, automated reindentation before commit only works if you have something pre-commit that reindents again. I've worked on projects with such rules, and it does work (certainly better than what we had before, with people with all sorts of tab widths and odious 'optimizing' editors that translated spaces to tabs of whatever width the user had set throughout the entire file on every save). Better yet is to adapt to whatever the coding standard of the project is and just eat your whining most of the time. (Says the guy who whined about part of the kernel's coding standards just a few posts up. Hypocrisy is *good* for you!)

Eight character tabs

Posted Oct 31, 2012 11:40 UTC (Wed) by etienne (guest, #25256) [Link]

> spaces are nice and uncontroversial and always work. Tabs don't.

Well, have you ever used an editor with a variable-width font?
Why - maybe because text is easier to read...
Then, you can only use tabs to align stuff - unless that is the beginning of the line. If you comment a line using "//" at its beginning, stuff stays aligned.
I do not want a hard limit on the number of chars, because not all chars are the same width...

Also, in some cases (a long debug printf() line), breaking the long line makes the structure of the function look more complex than it really is; just tell your editor not to wrap and ignore the end of the printf() line - it is just a printf()!

Eight character tabs

Posted Oct 31, 2012 13:21 UTC (Wed) by nix (subscriber, #2304) [Link]

Variable-width fonts don't work at all for most programming languages for the reasons you state. I actually tried to use a variable-width font only for comments for a while, but it really messes up ASCII-art, and the places where you find ASCII-art in comments are generally the places where the code is complex enough to *need* it, and you don't want to have difficulty piled on by interpreting proportional-font damage too.

Eight character tabs

Posted Oct 31, 2012 17:08 UTC (Wed) by tialaramex (subscriber, #21167) [Link]

printf() - a function famous for never having been implicated in any complicated security bugs. Oh wait.

No, thanks all the same but I'd rather actually see the code I'm maintaining and not trust that the bits I can't see aren't important.

I like the ASCII control characters, but most of them don't belong in my source code and that includes U+0009 TAB. Use soft tabs, set the continuous integration software to barf on hard tabs in source code, along with mysterious trailing whitespace and similar sins.
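For what it's worth, the per-line check is tiny. A minimal sketch, assuming the CI job feeds each source line to a function like this and fails the build on any non-zero result; check_whitespace() and the WS_* flags are made-up names, not part of any real CI tool.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result flags for one source line. */
enum {
    WS_OK       = 0,
    WS_HARD_TAB = 1 << 0,   /* line contains a U+0009 TAB */
    WS_TRAILING = 1 << 1    /* space or tab just before the line ending */
};

/* Inspect a single line; the terminating "\n" or "\r\n" is ignored. */
static int check_whitespace(const char *line)
{
    size_t len;
    int flags = WS_OK;

    assert(line != NULL);
    len = strcspn(line, "\r\n");
    if (memchr(line, '\t', len) != NULL)
        flags |= WS_HARD_TAB;
    if (len > 0 && (line[len - 1] == ' ' || line[len - 1] == '\t'))
        flags |= WS_TRAILING;
    return flags;
}
```

A driver would read each file with fgets() and report the file name and line number for every line where the result is not WS_OK.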

Eight character tabs

Posted Oct 31, 2012 18:52 UTC (Wed) by nix (subscriber, #2304) [Link]

I don't understand your debug printf() comment at all. You can always break a printf() line, even in the format string, with string literal concatenation.
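A minimal sketch of what I mean, with a made-up message and a made-up function name (format_report() is not a kernel function). Adjacent string literals are joined by the compiler, so the format below is a single string even though it spans two source lines.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical debug message. The two adjacent literals are concatenated
 * at compile time, so snprintf() sees one format string.
 */
static int format_report(char *buf, size_t len, unsigned long block,
                         unsigned int expected, unsigned int got)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len,
                    "journal: block %lu has bad checksum "
                    "(expected 0x%08x, got 0x%08x)\n",
                    block, expected, got);
}
```

The output is byte-for-byte what the one-line version of the format would produce; the only thing to watch is the space at the end of the first literal, which is easy to drop when splitting.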

Eight character tabs

Posted Nov 1, 2012 9:41 UTC (Thu) by ncm (guest, #165) [Link]

Me, I've never understood why people insist that printf format strings should be indented at all. I push them to the left margin, and it brings no confusion, but much clarity.

Eight character tabs

Posted Nov 1, 2012 12:06 UTC (Thu) by etienne (guest, #25256) [Link]

But most of the time you want to understand the structure of the function you are about to modify, even if you have (probably commented) printf's showing that you entered the function with those parameters, that you are calling that other function with those parameters, what is the value of each of the fields of that structure...
Having 3/4 lines of code with no algorithmic meaning does not help to see the structure of the code and is difficult to comment/uncomment at once; it is easier to tell your editor not to wrap lines and so not display the end of those (probably commented) lines.
Obviously you can write the functions print_mystruct_and_cpt(), print_mystruct_and_idx(), ... that you call only once in a commented out block.
Linux source does not have many "logging" lines, so it is not a real problem there.

Eight character tabs

Posted Oct 30, 2012 18:11 UTC (Tue) by marcH (subscriber, #57642) [Link]

> But yes, over all, too many people will get tabs wrong that standardising on spaces - inferior as they are - is the only option.

Tab indentation was invented to illustrate "The perfect is the enemy of the good".

Let everyone have his own indentation taste from the same source? Great idea on paper. Falls apart on a regular basis. Because it's not compatible with lining up across lines, because it's not compatible with a max width, etc. And because people are only human and will not be disciplined enough.

Eight character tabs

Posted Oct 31, 2012 11:37 UTC (Wed) by Jonno (subscriber, #49613) [Link]

Lining up arguments in tab-indented code works just fine, iff you use the same number of tabs for indentation, and then line up the arguments with spaces.
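This can be checked mechanically. The sketch below is only an illustration with hypothetical names (column_of(), lines_up() and the three sample lines are made up): it computes the display column of a token for a given tab width, then compares a continuation line aligned with spaces after the indentation tab against one aligned with a second tab.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Display column (0-based) at which `token` starts when `line` is shown
 * with tab stops every `tabwidth` columns.
 */
static size_t column_of(const char *line, const char *token, size_t tabwidth)
{
    const char *pos = strstr(line, token);
    size_t col = 0;

    assert(tabwidth > 0 && pos != NULL);
    for (const char *p = line; p < pos; p++) {
        if (*p == '\t')
            col += tabwidth - (col % tabwidth);  /* jump to next tab stop */
        else
            col++;
    }
    return col;
}

/* A call whose second line should line up under the first argument. */
static const char first_line[]    = "\tstrncpy(dst,";
static const char space_aligned[] = "\t        src, n);";  /* 1 tab + 8 spaces */
static const char tab_aligned[]   = "\t\tsrc, n);";         /* 2 tabs */

/* Returns 1 if "src" in the continuation sits under "dst" at this width. */
static int lines_up(const char *continuation, size_t tabwidth)
{
    return column_of(first_line, "dst", tabwidth) ==
           column_of(continuation, "src", tabwidth);
}
```

The tab-plus-spaces continuation lines up at tab widths 2, 4 and 8, because both lines start with the same single tab and differ only in fixed-width characters after it. The two-tab continuation lines up only at width 8, which is the failure people usually blame on tabs in general.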

Eight character tabs

Posted Oct 31, 2012 18:09 UTC (Wed) by marcH (subscriber, #57642) [Link]

"In theory, theory and practice are the same. In practice, they are not."

Thoughts on the ext4 panic

Posted Oct 30, 2012 5:33 UTC (Tue) by shentino (guest, #76459) [Link]

It's possible to make something break without being broken yourself.

Plenty of bugs are caused by one patch exposing a defect in another patch.

Thoughts on the ext4 panic

Posted Oct 30, 2012 9:14 UTC (Tue) by marcH (subscriber, #57642) [Link]

> Journal checksumming is an optional feature, not enabled by default, and, evidently, not widely used. Nix had turned on the feature somewhat inadvertently;

... which reminds us one of the ways Apple achieves higher Quality, while most others still fail: no optional features.

(another, related way is: bloat control)

Thoughts on the ext4 panic

Posted Oct 30, 2012 15:10 UTC (Tue) by salimma (subscriber, #34460) [Link]

... though when it comes to file systems, Apple's HFS+ is actually replete with seldom-used optional features - such as, mind-boggling as it might seem, case sensitivity.

But these are just exceptions that prove the rule... *ducks for cover*