Why people don't test development distributions

By Jonathan Corbet
July 6, 2009

Development distributions play a crucial role in the free software ecosystem. They are the proving ground where much new software is first exposed to a wider user community; they are also the place where this software demonstrates how well it plays with other packages. Distributors would like to see wider testing of their development releases, but, as your editor's recent experience shows, there are limits to how wide this testing community can be expected to be.

Your editor has a habit of running development distributions on real-work machines. There is no better way to stay on top of what the development communities (at both the distributor and upstream levels) are up to; it's also a way to help the community by finding and reporting bugs. Much of June was spent traveling, though, with the result that these machines were generally on the wrong side of an ocean and, thus, fell behind the leading edge. On return, after shoveling out a horrifying inbox, your editor decided to bring his desktop system up to current Rawhide. After all, what could possibly go wrong?

Anybody who has worked with development distributions for any period of time knows that the early part of the distribution development cycle is when things are most likely to go wrong. That's when the distribution-wide, disruptive changes go in. Traffic on the mailing lists suggested that, after the Fedora 11 release, Rawhide did not disappoint anybody looking to add a little adrenaline to their working day. Still, it seemed that things had settled out a bit; one tester responded to a query from your editor by saying:

You have missed all the fun! :-) Rawhide just got back to usable state where I can begin reporting bugs again. Firefox has been completely weird, Evolution won't even start here, the kernel has done a good job of cooking my system drawing about twice the normal amount of power...

So your editor upgraded. Sound stopped working. The screen saver started leaving the display in a weird, low-color-resolution state. And, most annoyingly, the keyboard layout went fully into psychedelic country. Selecting the indispensable GNOME "caps lock is another Control" option yielded a keyboard with no Control key at all; turning that option off restored control - to the Alt-left key. The Alt modifier appeared to be entirely unobtainable - a situation which can only serve to cause extreme misery to any serious Emacs user.

All inconvenient, but, then, development distributions can be like that; one should not venture into that world if one is not prepared to encounter occasional bizarre behavior. Often, in cases like this, the best thing to do is to report the problems and follow the leading edge closely in the hope that fixes will be uploaded soon. So that's what your editor did.

Big mistake. Just before the holiday weekend in the US, somebody uploaded a broken prelink which hosed most important executables on the system. The result was a box which wouldn't boot and which couldn't really even be fixed from a rescue disk. It now seems that running prelink -au * from a rescue disk might be a way for other afflicted users to get their boxes back. By the time that was posted, though, your editor, drawing on many years of system administration experience, had come to the reasoned conclusion that it was a good time to run away screaming.

A helpful hint for development distribution users: have at least one other root-suitable partition set aside on the system. All useful files not directly tied to the distribution should be stored elsewhere. If things get really ugly, one can always boot an emergency backup partition and end up with a usable system. This article is currently being typed using a system kept on such a partition.

Others recommend running development distributions within virtualized guests or on sacrificial boxes. Both of those techniques are useful, but they miss an important point: the best way to find problems in new software is to use it for real work. If people are not trying to actually get things done with a development distribution, they are going to miss a lot of the bugs. Those bugs will then turn up after the (allegedly) stable release, biting users who didn't think they were signing up for alpha-level software. We need people doing more than just convincing themselves that the testing box boots properly.

For this reason, Fedora, like other distributors, would like to see more people testing its development distribution. Your editor would like to see that too; testing of early releases is one of the "prices" that many of us need to pay to help ensure that our free software is as good as we expect it to be. Besides, tracking an evolving system is often fun; it can help to bring users further into our community. But it is hard to tell most users that they should be running a development distribution if it's liable to leave them with a smoking wreckage of a system when they really need to get some work done.

And, it should be noted, problems like this are certainly not limited to Rawhide; Ubuntu testers who updated gdm at the wrong time will certainly be questioning their karma as this is being written.

So, what can be done to make development distributions safer for a wider community of testers? Absolute safety seems unattainable, but there are some things which could be done:

* Create a version of the distribution containing packages which have shown a relatively low level of combustibility. The alpha releases done by some distributors are a step in this direction; there is usually an attempt made to stabilize things a little bit prior to the release. But these releases tend to leave testers somewhat behind the current state of the art. Debian's "testing" distribution is probably the best example of how this can be done on an ongoing basis.

* Provide an indication of the state of the distribution. Many beaches are equipped with red flags which are posted when dangerous currents are present. Wouldn't it be nice if an apt-get upgrade could respond with a message like "the current threat condition is orange, you may want to reconsider"?

* Provide a built-in rollback system which can undo the effects of an ill-advised upgrade, even if the system as a whole has been reduced to rubble. The Btrfs snapshot mechanism should be well suited to this sort of feature - once Btrfs is stable enough to be used on a root partition.
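
As a rough illustration of the flag idea, a client-side check might look like the Python sketch below. Everything in it is hypothetical: the flag levels, the threshold, and the notion that a distributor would publish a count of open blocker bugs are assumptions made for illustration, not an existing apt-get or yum feature.

```python
from enum import Enum


class Flag(Enum):
    GREEN = "green"
    ORANGE = "orange"
    RED = "red"


def flag_for(open_blockers: int, boot_breaking: bool) -> Flag:
    """Map a (hypothetical) distribution status report to a beach-style flag."""
    if boot_breaking:
        return Flag.RED
    if open_blockers >= 5:  # threshold chosen arbitrarily for this sketch
        return Flag.ORANGE
    return Flag.GREEN


def should_warn(flag: Flag) -> bool:
    """An upgrade tool would ask for confirmation on anything but green."""
    return flag is not Flag.GREEN
```

An upgrade tool could call should_warn() before touching any packages and let the tester decide whether today is a good day to swim.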

This is an issue which merits some thought. If we can make testing easier and safer, we should end up with more testers. That, in turn, should lead to more stable releases and, just as importantly, users who have more invested in the software and the process which creates it. It is hard to see how those could fail to be good things.

Why people don't test development distributions

Posted Jul 6, 2009 21:23 UTC (Mon) by drag (guest, #31333) [Link]

I know the answer :)

You make the stable system snapshots of the unstable system. This reflects the true nature of software, where progress is a relentless grind and releases are merely development snapshots that are given special consideration.

This is how Debian does it with Sid and the testing snapshots and it works terrifically. I have used Debian unstable for years at a time without the major breakages that Fedora and Ubuntu users suffer from. Just lots of little breakage from time to time.

Then to solve the little breakages you use the brilliant Conary package management system, which was developed by former Red Hat developers at rPath and is incorporated into the Foresight Linux distribution.

Conary has many very significant design advantages over the RPM or dpkg systems.

* Packages are very easy to make compared to deb or rpm format.

* Only updates the specific files that need to be updated, not entire software packages.

* Features 'rollbacks' so that users can freely and simply do downgrades in order to avoid the breakages that plague normal Linux users.

Apt-get and Yum and such are designed to make upgrades easier... Conary makes going both ways easier. It is by far a much superior design.

If I was ruler of the planet I would have a single distribution core, featuring mainly 'Linux plumbing', that is under continuous development. Then Debian and Fedora folks would be forced to port their stuff over to Conary packages and use the common core for building their custom distributions.

This would have a massive effect of reducing workload, eliminating redundant packaging and thus increasing the quality and number of packages dramatically.

Why people don't test development distributions

Posted Jul 6, 2009 21:33 UTC (Mon) by einstein (guest, #2052) [Link]

You got my vote sir!

Why people don't test development distributions

Posted Jul 6, 2009 21:48 UTC (Mon) by kragil (guest, #34373) [Link]

Yeah, distributed package management and bug tracking is still needed.

I sure hope the benefits of distributed SCM will eventually lead to more .deb and .rpm collaboration on bugs and packaging.

Why people don't test development distributions

Posted Jul 7, 2009 15:33 UTC (Tue) by salimma (subscriber, #34460) [Link]

yum (Fedora) and zypper (openSUSE) support delta RPMs, so only incremental changes need to be downloaded during an update. yum supports rollbacks too.

Why people don't test development distributions

Posted Jul 7, 2009 18:54 UTC (Tue) by kragil (guest, #34373) [Link]

Yum does not support rollbacks in every situation. Snapshots are better.

Why people don't test development distributions

Posted Jul 6, 2009 21:38 UTC (Mon) by nix (subscriber, #2304) [Link]

> A helpful hint for development distribution users: have at least one other root-suitable partition set aside on the system.

An alternative: have a busybox linked against uClibc linked to /bin/ash or similar. Then you can use that if glibc --- or, as in this case, *everything* dynamically linked --- gets fubared. For extra paranoia mark it immutable.

prelink is an awesome piece of work, but to be honest I look at anything editing ELF executables like it does and remain amazed that it ever works. There aren't many things out there that treat executables like a big data structure and tromp around in them modifying stuff *on the disk*. Only valgrind scores higher in my personal wizardliness scale, and if valgrind goes wrong, the system doesn't implode.

Automating prelink runs after a glibc or prelink update seems particularly risky: the thing should be run manually so you can verify that the result worked. (prelink itself is statically linked, and your existing *running* binaries will still work, so prelink -ua will function no matter how badly prelink goes wrong.)

This is probably not suitable for release versions of distros: unskilled users won't know how to fix things even if they do go wrong. But unskilled users won't be running rawhide.

The biggest mistake most of these people made was to reboot. Unix boxes are extremely robust even after horrific filesystem catastrophes until you reboot them, after which you're often in reinstall territory. Vaped dynamic loader, erased /lib, encrypted all of /bin and /usr/bin? The system's still running to some degree so you can figure out how to fix it. Reboot, and things get harder, 'cos that system isn't booting without help. I suppose livecds do help here.

Why people don't test development distributions

Posted Jul 7, 2009 8:24 UTC (Tue) by nim-nim (subscriber, #34454) [Link]

In this case, prelink -ua was a killer without any reboot involved. Like reboot, prelink is one of the few operations with instantaneous system-wide consequences.

Why people don't test development distributions

Posted Jul 7, 2009 12:12 UTC (Tue) by nix (subscriber, #2304) [Link]

You don't get my point. Sure, a broken prelink (like a broken libc install) kills exec() of new processes: but *existing ones continue to run*, so with sufficient ingenuity you can often use them to recover things, even though new exec()s die. (Even new dlopen()s from those existing processes will often be unaffected, unless the dlopen()ed libraries are also named as DT_NEEDED by some process, which is fairly rare for plugins and such.)

Reboot, and that possibility is lost.

Why people don't test development distributions

Posted Jul 7, 2009 12:42 UTC (Tue) by nim-nim (subscriber, #34454) [Link]

That's theory.

If you know beforehand the problem, with a lot of ingenuity and chance, you may recover.

In the real world you'll always have closed a process you need for recovery (because you don't expect every new exec to fail) so the system is unrecoverable.

And BTW, existing processes will try to respawn children that keep dying, so you'll have to deal with a fork bomb simultaneously (and the first reflex will be to run init 1 to quieten the system, which will close existing terms, and guess what will happen when you try to open a new one?)

Why people don't test development distributions

Posted Jul 7, 2009 16:31 UTC (Tue) by joey (guest, #328) [Link]

Existing processes are unlikely to respawn more than 1 process each, which will fail to start, which does not qualify as a fork bomb, surely?

I had excellent luck once recovering from a hosed ld.so using only zsh. zsh, you see, includes a built-in ftp client that doesn't need to fork at all in order to download a file. Only an open root shell was needed.

Why people don't test development distributions

Posted Jul 7, 2009 17:26 UTC (Tue) by nim-nim (subscriber, #34454) [Link]

> Existing processes are unlikely to respawn more than 1 process each,
> which will fail to start, which does not qualify as a fork bomb, surely?

That depends on how many do at the same time, if some of them are configured to maintain a process worker pool, and how quickly they retry.
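
To put rough numbers on that, here is a small Python sketch. The supervisor counts, pool sizes, and retry intervals are invented for illustration and do not describe any particular system.

```python
def failed_execs_per_second(supervisors):
    """Estimate the rate of doomed exec() attempts.

    Each supervisor is a (pool_size, retry_interval_seconds) pair: it tries
    to keep pool_size workers alive and retries each dead one after the
    given interval.
    """
    return sum(pool / interval for pool, interval in supervisors)


# One getty-style respawner retrying every 5 seconds.
quiet = failed_execs_per_second([(1, 5.0)])

# Twenty daemons, each keeping 8 workers alive and retrying every 0.1 seconds.
noisy = failed_execs_per_second([(8, 0.1)] * 20)
```

Under these made-up figures the first case produces a failed attempt every few seconds, while the second produces on the order of a thousand per second, which is where the fork-bomb comparison starts to look fair.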

Why people don't test development distributions

Posted Jul 9, 2009 16:39 UTC (Thu) by Tet (subscriber, #5433) [Link]

> If you know beforehand the problem, with a lot of ingenuity and chance, you may recover

Sometimes, the level of ingenuity necessary is beyond that of mere mortals...

Why people don't test development distributions

Posted Jul 11, 2009 9:48 UTC (Sat) by nix (subscriber, #2304) [Link]

That was the war story I was referring to, as well. It's almost alone:
I've seen a single additional tale in which an IRIX kernel hacker did
something similar to get a dead IRIX box back to life, and that's all.
(The 'as one whose world has just come to an end' story doesn't approach
the same level of brilliance, sorry.)

People who can do things like that are *rare*.

Why people don't test development distributions

Posted Jul 6, 2009 21:51 UTC (Mon) by lmb (subscriber, #39048) [Link]

Our editor's experience is part of a death spiral.

As developers know that users won't bother testing the development releases, they develop a "lax" attitude towards pre-submission and sometimes even pre-compile testing. As users know that developers do this, they are less willing to test; and the loop starts over again.

(From years of experience on enterprise distributions, they exhibit a bit of the same pattern - IHVs and ISVs will only bother testing the RCs, so the alpha and early beta releases are of expectable quality due to the very same mechanism.)

Further, "hey it is a beta" is used as an excuse for substandard quality, lack of regression and feature tests etcetera. Development distributions suffer from a further problem - namely, instead of just testing a single package, one gets the "best" of all test cycles rolled into one, making the overall experience even less pleasant, and even a small failure in a single component can cause a cascade of failures; or, combined, the total sum of all minor annoyances can render the system unusable.

And, of course, packagers are not always knowledgeable upstream developers, but cram in updates from the upstream development cycle too - again turning the distribution not only into an integration testbed, but into a whack-a-mole playground.

In my not so humble opinion, this is a terrible misunderstanding of "agile" development and how "release early, release often" should work. And it makes me want to smack down developers with a Test-Driven-Development book.

The only way around this is to take all developers and force their own development distribution down their own throats. And to enforce better quality assurance, as in public floggings of developers who submit packages which they clearly have not run themselves at the very least.

But to say that this is making me grumpy is not quite true. I'm a reasonable man, so it makes me bloody furious.

Why people don't test development distributions

Posted Jul 7, 2009 6:43 UTC (Tue) by malor (guest, #2973) [Link]

Even worse, then you get teams that notice and decry the lack of testing, and then trick users into doing it, like KDE did with 4.0. They've explicitly stated that they called it 4.0 to get more testing. They did some mental gymnastics, claiming that putting "this is unstable!" in the release notes meant it was okay, but in practically the same breath, they admit that they called it 4.0 to get testers. In other words, they weren't "lying", they were just taking advantage of people who didn't read the README. They knew that 4.0 would get tested because a .0 release has an implied level of quality and release-worthiness. In no way, shape, or form, was this quality level met by the code released at the time.

This is pure scorn of a user community... the people using KDE are the same people making the other shit that KDE depends on. Shipping them a broken desktop just screws them up and slows them down. Their time is worth just as much, if not more, than the KDE dev team's. But it's pretty apparent that the KDE people don't think that's true. Rather, their need for testers outweighs every other consideration -- so they'll fool people, and then point to the README as justification.

I have no doubt that yet another apologist will show up and say "no we didn't!", but yes you did. Anyone involved with making that decision who didn't actively campaign against it is a selfish asshole.

Why people don't test development distributions

Posted Jul 7, 2009 9:16 UTC (Tue) by dholland (subscriber, #14680) [Link]

"a .0 release has an implied level of quality and release-worthiness"

I know more people who say ".0 - it'll be buggy, don't touch it" than I do people who say ".0 - new release! it'll be great! install it now!"

So maybe the implication is there - but not necessarily the one you mean.

Why people don't test development distributions

Posted Jul 7, 2009 12:15 UTC (Tue) by fb (guest, #53265) [Link]

Seriously, can you honestly define what your friends call "buggy" wrt .0 releases?

I sincerely suspect it is something like: "there will be a number of dumb bugs no one found yet. In two or three weeks time there will be a bugfix release. Use that one".

KDE 4.0 was "buggy" more along the lines of: "It doesn't work. Features were all removed. The core libs are still in full development. Please don't mention the applications. Developers are fully aware of it. They expect to bring it to 'work' within a year's time, meaning that it will take close to 2 years (or more) for it to work again."

Why people don't test development distributions

Posted Jul 16, 2009 15:34 UTC (Thu) by Wol (subscriber, #4433) [Link]

The *most* *important* feature of 4.0 was the API freeze.

App developers weren't writing/testing apps because the underlying libraries were a moving target. 4.0 was the release that said "this target has stopped moving, please start porting your apps".

So OF COURSE there were no apps worth speaking of on 4.0.

Cheers,
Wol

Why people don't test development distributions

Posted Jul 7, 2009 12:16 UTC (Tue) by etrusco (guest, #4227) [Link]

Don't be silly; this is more of a Windows joke than anything else.

Why people don't test development distributions

Posted Jul 7, 2009 12:14 UTC (Tue) by nix (subscriber, #2304) [Link]

What are you talking about? Modulo one unfortunate omission (in the release notes, whoops), KDE 4.0 was always, always considered a stable *library* release, i.e. a release for *app developers*.

Now can we finally shut up about KDE 4.0? Nobody wants to hear this skeleton of a horse be flogged yet again.

Why people don't test development distributions

Posted Jul 7, 2009 12:41 UTC (Tue) by malor (guest, #2973) [Link]

ie, if caught with pants down, redefine pants.

Why people don't test development distributions

Posted Jul 7, 2009 13:37 UTC (Tue) by nix (subscriber, #2304) [Link]

My understanding is that 4.0 was meant to be a libraries-are-frozen app-developers-are-go release from long before the actual release date. It's just a shame that everyone was so used to this that they didn't think to mention it in the release notes!

'Redefine' implies post-facto. This is more like 'define'.

Why people don't test development distributions

Posted Jul 7, 2009 13:20 UTC (Tue) by fb (guest, #53265) [Link]

Don't you think it is quite trollish to tell somebody to "shut up about KDE 4.0" during a discussion about testing development distributions?

IMHO KDE-4.0 is a textbook example of software release planning and management failure. Should it be kept from being discussed **in the context of release testing and planning** because it embarrasses some? I don't think so.

Why people don't test development distributions

Posted Jul 7, 2009 13:36 UTC (Tue) by nix (subscriber, #2304) [Link]

It's not because it embarrasses people. It's because it's *boring*. We've been over this dead horse until only a pile of bones is left.

Why people don't test development distributions

Posted Jul 9, 2009 22:58 UTC (Thu) by johnflux (guest, #58833) [Link]

As a KDE developer - I agree. The whole KDE 4.0 fiasco was a mess.

But the developers who are embarrassed by it all just keep quiet and get to work trying to make the code better, leaving behind the noisy people who claim that KDE isn't to blame.

Why people don't test development distributions

Posted Jul 6, 2009 23:12 UTC (Mon) by sjlyall (guest, #4151) [Link]

Are automated build images a possible answer?

I know automation won't pick up everything but I would have thought the prelink and the gdm bugs mentioned would have been picked up by even simple "apply the update and reboot" testing.

To tell the truth I thought most distributions already did this.
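
A minimal sketch of such an "apply the update and check" step, in Python, might look like this. The function name and the choice of commands are assumptions for illustration; a real harness would run inside a freshly updated image and exercise far more than a handful of executables.

```python
import subprocess
import sys


def smoke_test(commands, timeout=10):
    """Run each command; return those that fail to start or exit non-zero."""
    failures = []
    for argv in commands:
        try:
            result = subprocess.run(
                argv,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout,
            )
            if result.returncode != 0:
                failures.append(argv)
        except (OSError, subprocess.TimeoutExpired):
            # OSError covers an executable that is missing or cannot be loaded.
            failures.append(argv)
    return failures
```

Calling smoke_test([["/bin/sh", "-c", "true"], [sys.executable, "-c", "pass"]]) after an update returns an empty list when both programs still start; anything else is a reason to hold the update back.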

Prelink but not gdm

Posted Jul 7, 2009 0:16 UTC (Tue) by khim (subscriber, #9252) [Link]

> I know automation won't pick up everything but I would have thought the prelink and the gdm bugs mentioned would have been picked up by even simple "apply the update and reboot" testing.

The prelink bug would certainly be detected, but gdm... no luck: the bug there killed your session when you tried to update, while an offline upgrade worked fine (it just killed the session of the unfortunate user who tried to use the system at the time).

The real solution is shown by Gentoo: package is added to the system in "masked" state. It's possible to use it - but you need to specifically ask for it. Once enough "success stories" are obtained the mask is removed and all "unstable" users are upgraded. Works well so far: I have certainly never seen an unstable Gentoo system which refused to even boot!

P.S. The number of success reports required differs by type of package: for core packages (like baselayout, prelink, or glibc) it can be "a few months of testing", but for a fringe package like tofrodos it can be "one user other than the packager"...

Prelink but not gdm

Posted Jul 7, 2009 14:49 UTC (Tue) by mjthayer (guest, #39183) [Link]

> The real solution is shown by Gentoo: package is added to the system in "masked" state. It's possible to use it - but you need to specifically ask for it.

You echo my thoughts. I'm sure many users would be happy (even eager) to test certain unstable packages that they are interested in. Compare that to the number who would be happy to work with an unstable distribution (I for one would be very reluctant to). The only thing that Gentoo is lacking in this respect is a way to pull in the dependencies of the unstable packages you wish to test without having to remove the stable versions that your other packages are using.

Prelink but not gdm

Posted Jul 7, 2009 19:05 UTC (Tue) by cry_regarder (subscriber, #50545) [Link]

Fedora already has this. It is called bodhi. See:

http://bodhi.fedoraproject.org

But the discussion is about development __distributions__ not development __packages__.

Prelink but not gdm

Posted Jul 9, 2009 19:40 UTC (Thu) by yokem_55 (subscriber, #10498) [Link]

While it is rare for a stable or even unstable Gentoo update to completely hose a system, a couple of years ago the developers pushed through an upgrade to expat which bumped the .so version number, which in turn hosed everything depending on expat. On top of this, the usual tool for fixing missing libraries (revdep-rebuild) had some nasty weaknesses that made fixing one's system decidedly non-trivial. I personally had no idea what expat was or how much of a system depended on it (nearly everything that ran in X, to start). There are still people running into this, as the support thread for the issue in the Gentoo forums shows.

Prelink but not gdm

Posted Jul 12, 2009 7:56 UTC (Sun) by dirtyepic (guest, #30178) [Link]

> The real solution is shown by Gentoo: package is added to the system in
> "masked" state. It's possible to use it - but you need to specifically
> ask for it. Once enough "success stories" are obtained the mask is
> removed and all "unstable" users are upgraded.

We do? ;) There are only a couple situations I can think of where a new package/version would be added to the tree in a masked state:

- you know it's going to break things and you need guin- er, testers to smoke out the worst of the bugs (eg. gcc-4.4)
- you have a large number of packages with interdependencies between each other that need to all be made available at the same time (eg. gnome)

...unless by masking you're actually referring to keywording (~arch / arch). Then yes, packages need to be in ~arch (unstable) a minimum of one month before they can be stabilized. We don't need success stories, just a lack of bug reports. Stabilization is done by dedicated arch-tester teams, so you will always have at least one other set of eyes on a package before it hits stable.
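
Reduced to its bare rule, the keywording policy described above can be sketched in Python as follows. This is only an illustration: the function name and the 30-day approximation of "one month" are assumptions, and actual stabilization is a manual decision by the arch-tester teams rather than an automated check.

```python
from datetime import date, timedelta

# "A minimum of one month", approximated as 30 days for this sketch.
MIN_TESTING_PERIOD = timedelta(days=30)


def ready_for_stable(entered_testing: date, open_bugs: int, today: date) -> bool:
    """True if a ~arch package has aged long enough with no open bug reports."""
    aged = today - entered_testing >= MIN_TESTING_PERIOD
    return aged and open_bugs == 0
```

The point of the rule is visible in the signature: no success reports are counted, only elapsed time and the absence of bugs.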

IMO Gentoo generally manages to keep the unstable tree in working order and usable enough for everyday work. I think this can be attributed in part to the fact that we don't do releases and therefore don't have that period of churn that conventional distros do where everything is getting updated at once. Basically, we don't have a development cycle; we're always in development.

Why people don't test development distributions

Posted Jul 7, 2009 0:33 UTC (Tue) by nirik (subscriber, #71) [Link]

The prelink issue would only have been picked up if it was installed and the regular nightly cron job was run... granted, an automated test could have run that manually, but it's not normal right after an update.

Why people don't test development distributions

Posted Jul 7, 2009 8:30 UTC (Tue) by lmb (subscriber, #39048) [Link]

One could, with reasonable strength of argument, come to the conclusion that software that cannot be automatically tested well is badly designed.

(The counter argument that there are some things that are extremely difficult to test - like usability - does not diminish the assessment that very few software packages get anywhere close to an acceptable test coverage.)

Why people don't test development distributions

Posted Jul 7, 2009 12:48 UTC (Tue) by MathFox (guest, #6104) [Link]

It would help a lot if the infrastructural packages came with self-tests that are routinely run before a new version is shipped. (Yes, writing tests takes time; catching the bugs in time saves a lot of time.) For a lot of software it is not economical to write a 100% coverage automatic test (games!) or the hardware is not available. However that should not stop you from testing the 80-90% that can be tested.

Why people don't test development distributions

Posted Jul 7, 2009 19:32 UTC (Tue) by Max.Hyre (subscriber, #1054) [Link]

It makes a big difference to actually enforce testability. You get your nose rubbed in how much better things work.

In one wonderful (in the sense of getting to follow all those ``best practices'' you read about in books) job, we had automated testing, and were required to build in test points. Which were then evaluated in the code reviews.

Of course, we were building medical devices, and you tend to be a bit more careful when the FDA is looking over your shoulder. :-) I suspect we were within shouting distance of the 80–90 % number.

But that's only 80–90 % number. If that's what we could do under duress, it's hardly surprising we're lucky to get maybe 30 % in the real world. It's human nature (especially programmer nature) to believe both

* I made no mistakes this time, and
* I can't afford to take the time for real testing.

Until we figure how to overcome those predilictions, we'll have alpha-level ``releases''.

Typo

Posted Jul 7, 2009 19:39 UTC (Tue) by Max.Hyre (subscriber, #1054) [Link]

`Predilections'. So much for level of testing.

Why people don't test development distributions

Posted Jul 8, 2009 11:05 UTC (Wed) by nix (subscriber, #2304) [Link]

prelink *has* self-tests, but perhaps this is a problem that only shows up if ld-linux.so.2 itself is prelinked, in which case you wouldn't see it unless you prelinked the whole system.

(I'm running eglibc 2.10-head and prelink and see no trouble, though. Local RH patch breaking something? What to?)

Why people don't test development distributions

Posted Jul 7, 2009 0:54 UTC (Tue) by dankamongmen (subscriber, #35141) [Link]

Major recent upheavals in debian-unstable, related to the multiarch migration and eglibc move, left me sheepishly explaining that "it'd be more trouble than worth to rebuild [some 32-bit applications on sid-amd64] for a few days". That was unpleasant.

But, the bugs were filed and quickly fixed; the big wheel kept on turnin'. Unstable provides me the crucial support for new kernels (and thus new APIs) I need more than anything. Indeed, signalfd() breakage (DBTS 533360) wouldn't have any meaning on most "stable" distributions (backport-happy RHEL aside).

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=533360
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=533362

Another factor why people don't test development distributions

Posted Jul 7, 2009 1:06 UTC (Tue) by cesarb (subscriber, #6266) [Link]

There is another factor that reduces the amount of testing of development distributions: the stable releases work too well.

If the most recent stable release of your favorite distribution does everything you want, with just a couple of very minor issues that you already worked around by editing the correct magic files in /etc, you do not feel as compelled to try something newer (and can even end up not bothering with upgrading when your favorite distribution releases its next stable release).

If, on the other hand, there is something missing or broken, or a new feature you cannot live without (even if you lived without it for your whole life until then), it is more probable that you will want to try a future release even before it is declared as finished.

The current tendency seems to be more towards the former than towards the latter.

Another factor why people don't test development distributions

Posted Jul 7, 2009 8:34 UTC (Tue) by knweiss (guest, #52912) [Link]

> There is another factor that reduces the amount of testing of development distributions: the stable releases work too well.

This is the optimistic view. A pessimist may say that people already face enough problems with the stable releases.

Another factor why people don't test development distributions

Posted Jul 9, 2009 14:55 UTC (Thu) by elanthis (guest, #6227) [Link]

Seconded. I run into far, far, far more bugs in every Linux stable distro release than I did with even the Beta of Windows 7. It's ridiculous. I get why, sure -- hobbyist developers have the right to only work on what they want (they're doing it for free and most of us are just freeloading off of their hard work), and features and rearchitecting are far more fun than bug hunting. Free Software evolves at a very rapid pace but many (not all, of course) projects tend to have very low quality assurance standards.

Fedora 11 has a lot of bugs. I have no doubt that Fedora 12 will fix all of those big Fedora 11 bugs and then replace 3-4 core system components with some "new and improved" do-over project that introduces a whole new slew of bugs.

Upgrading a Linux distro is a gamble. You want to get all the stupid bugs the current distro has fixed and hope that the new set of bugs doesn't impede your work/fun more than the old set.

I think a large part of this has to do with the whole release strategy of Linux distributions. Install a specific set of software. Get no updates other than critical bug/security fixes. 6 months later, install a new version of EVERYTHING all at once. Get no updates other than critical bug/security fixes. 6 months later, install a new version of EVERYTHING all at once, again. Repeat ad nauseam.

You can't get updated software without updating everything, so it's impossible to test the bits you care about without being forced to test all the crap you didn't see a need to change in the first place.

Another factor why people don't test development distributions

Posted Jul 9, 2009 15:52 UTC (Thu) by kamil (subscriber, #3802) [Link]

Such selective updates are possible in, e.g., Gentoo. You can also skip on various rapidly evolving components -- my completely up to date systems do not have pulseaudio on them, and they are still running KDE 3.5.

Mind you, I'm not advocating that every newbie should switch to Gentoo; I'm just pointing out that there are alternatives to the "reinstall everything twice a year"-hell. There are costs involved, but to me at least, they are worth it.

twice a year hell, better than constant hell ?

Posted Jul 10, 2009 8:54 UTC (Fri) by langagemachine (guest, #56890) [Link]

> I'm just pointing out that there are alternatives to the "reinstall everything twice a year"-hell. There are costs involved, but to me at least, they are worth it.

Not 100% convinced about this. After 2 years running Gentoo, I have just decided to drop it in favor of another 'rolling' distro (Arch Linux) on the grounds that I spent more time fixing upgrade conflicts than actually using the computer (well, maybe not more time, but too much time).

I am fairly pleased with Arch Linux, although this week, a system upgrade hosed bash-completion; I could get it back thanks to a gross hack, but this has left me wondering about the whole 'rolling' distro concept: do constant upgrades not mean that you are constantly unsettling your system ?

So, update every week or upgrade twice a year, the hassle is probably the same ?

Come to think of it, perpetual instability is characteristic of life, as my biology teacher would teach us ...

twice a year hell, better than constant hell ?

Posted Jul 10, 2009 14:54 UTC (Fri) by kamil (subscriber, #3802) [Link]

There's some hassle with either approach, I'd just say it's a different sort of hassle.

If I update, say, the X server in Gentoo, I can expect and be prepared for some problems with, say, 3D, but I still expect the kernel to boot, sound to function, etc.

If I upgrade the whole distro, all bets are off. Will the printer still work afterwards? No idea.

I don't know which approach ultimately costs more time, I just know from experience that I personally found the distro upgrades more frustrating than individual package updates.

Why people don't test development distributions

Posted Jul 7, 2009 3:56 UTC (Tue) by dkite (guest, #4577) [Link]

What really is fun is running code written yesterday.

Oddly, if enough people run testing distros or build from git/svn/cvs the results are better. There are enough people running kde from svn that it works most of the time. I'm not sure which happened first; a stable repository or large numbers of users. No matter.

I realize that having bleeding edge kernels and system libraries may be a bit more exciting.

Derek

Why people don't test development distributions

Posted Jul 7, 2009 7:26 UTC (Tue) by jeroen (guest, #12372) [Link]

Provide an indication of the state of the distribution. Many beaches are equipped with red flags which are posted when dangerous currents are present. Wouldn't it be nice if an apt-get upgrade could respond with a message like "the current threat condition is orange, you may want to reconsider"?

Apt-listbugs does something like that: it shows the release-critical bugs of the packages you are about to install or upgrade and asks whether you want to continue.
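
As a rough illustration of the idea (not how apt-listbugs itself is implemented), the check amounts to intersecting the set of packages about to be upgraded with the list of open release-critical bugs. The sketch below assumes Debian's release-critical severities (critical, grave, serious); the `Bug` record, the bug numbers, and the colour thresholds are all made up for the example.

```python
from dataclasses import dataclass

# Severities that Debian treats as release-critical.
RC_SEVERITIES = {"critical", "grave", "serious"}


@dataclass(frozen=True)
class Bug:
    number: int      # hypothetical bug number
    package: str
    severity: str


def release_critical(pending, bugs):
    """Return the release-critical bugs filed against packages about to be upgraded."""
    pending = set(pending)
    return [b for b in bugs if b.package in pending and b.severity in RC_SEVERITIES]


def threat_condition(rc_bugs):
    """Map the number of release-critical bugs to a flag colour (arbitrary thresholds)."""
    if not rc_bugs:
        return "green"
    return "orange" if len(rc_bugs) < 3 else "red"


# Hypothetical data for illustration only.
bugs = [
    Bug(1001, "libc6", "grave"),
    Bug(1002, "xterm", "minor"),
    Bug(1003, "prelink", "normal"),
]
found = release_critical(["libc6", "xterm"], bugs)
print(threat_condition(found), [b.number for b in found])
```

With this sample data only the grave libc6 bug counts, so the upgrade would be flagged orange and the user asked whether to continue.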

Why people don't test development distributions

Posted Jul 7, 2009 8:21 UTC (Tue) by nim-nim (subscriber, #34454) [Link]

Rawhide has been especially nasty lately, probably because the Fedora 11 cycle was stretched over-long. The prelink bug was probably the worst rawhide crash in the past five years (even normal distro rescue tools could not manage it).

You can have a pretty good idea of rawhide's level of brokenness by checking the blocker list for next release (if only people could be more proactive in keeping it current)

https://bugzilla.redhat.com/show_bug.cgi?id=F12Blocker

(and this has *nothing* to do with the packaging tools used, every single big problem rawhide hit lately was in upstream code)

Development distributions are always on the edge of the knife. If people make an effort to use them and report bugs (and maintainers make the effort to fix them at once, not procrastinate because it's a devel release "no one cares about"), you have a virtuous circle. If not, problems snowball quickly.

Why people don't test development distributions

Posted Jul 7, 2009 8:39 UTC (Tue) by russell (guest, #10458) [Link]

How would the prelink problem occur in the first place? Did the developer not experience the problem or (more likely) not even test it? Perhaps more stability could be achieved by making developers who break things do QA penance for a month.

Why people don't test development distributions

Posted Jul 7, 2009 9:15 UTC (Tue) by nim-nim (subscriber, #34454) [Link]

Probably because actual prelinking is deferred to a cron job, so you could update glibc and have the system go south hours later, when the cron job executed.

Why people don't test development distributions

Posted Jul 7, 2009 9:46 UTC (Tue) by russell (guest, #10458) [Link]

That doesn't stop the developer from running it manually. I believe the reason "People don't test development distributions" is because the developers don't either.

Why people don't test development distributions

Posted Jul 7, 2009 9:38 UTC (Tue) by marcH (subscriber, #57642) [Link]

> Rawhide just got back to usable state where I can begin reporting bugs again. Firefox has been completely weird, Evolution won't even start here, the kernel has done a good job of cooking my system drawing about twice the normal amount of power... [...] Sound stopped working. The screen saver started leaving the display in a weird, low-color-resolution state. And, most annoyingly, the keyboard layout went fully into psychedelic country. [...] somebody uploaded a broken prelink which hosed most important executables on the system.

Seriously, what is the point of making such software available? I mean, to anyone? What is a development distribution supposed to be exactly, just a mere and blind aggregation of random git snapshots?!

I am ready to test software tagged as a beta _release_ by developers, but random source code no thanks. Who would be?

See lmb's post above for more.

Why people don't test development distributions

Posted Jul 7, 2009 10:41 UTC (Tue) by jengelh (subscriber, #33263) [Link]

I am sorry to hear that Your Editor has repeatedly experienced problems with rawhide (including being stuck some years ago at the start of a presentation). It is well known that rawhide is perhaps the most disruptive choice. Wasn't Fedora even the first to screw up sound by including PA?

The distro with the U has a track record of being not as broken, but then keeps that broken state over the entire release as no updates for annoying things are released. So that is no choice either from my POV.

In the obvious move, I am suggesting to try something different. (And it makes no sense to post what I am thinking about because that would be followed by lots of arguments again.) Needless to say I used development cycle releases from ****************** (don't try to count, it's random) on shipped embedded devices without issue.

Why people don't test development distributions

Posted Jul 10, 2009 21:07 UTC (Fri) by Velmont (guest, #46433) [Link]

It's true that Ubuntu remains broken through its entire lifespan. Why is that? To not get new bugs? It's very troubling and unsatisfying.

Why people don't test development distributions

Posted Jul 10, 2009 22:25 UTC (Fri) by jengelh (subscriber, #33263) [Link]

Because it only attracts the deadweight from the Windows base - the real developers stay with what they had before ;-)

Why people don't test development distributions

Posted Jul 22, 2009 3:09 UTC (Wed) by maco (guest, #53641) [Link]

Not enough developers who know what they're doing. Plenty of folks running around trying to triage bugs and getting stuck, not so many folks who know how to code, let alone code well. If left to hack, they're more efficient, but then folks like me who want to learn and have patches to fix those annoying issues keep interrupting them going "hey hey can you upload this patch?" and breaking their concentration. Patch acceptance procedures are being streamlined to try to fix that.

Why people don't test development distributions

Posted Jul 7, 2009 12:44 UTC (Tue) by rsidd (subscriber, #2582) [Link]

I don't think the Ubuntu bug compares at all with what you describe in Fedora. The upgrade kills the X session. Big deal: you continue the upgrade from a console (anyone running the development distribution should know how to do that, right?) and then restart gdm.

Time was I would always upgrade from a console, without X or any important program running, just to be safe. The reliability of apt-get and dpkg has spoiled us and we take fewer precautions, but even so, it's rare to find something in Ubuntu or Debian that is hard to recover from for someone comfortable with the console. That does not seem to be true of Fedora or RPM-based distros in general.

Why people don't test development distributions

Posted Jul 7, 2009 13:16 UTC (Tue) by nim-nim (subscriber, #34454) [Link]

It's rare in Fedora too. In fact I can't remember any other instance right away (there may have been some), and I've been running rawhide continuously on my home system for about a decade.

Why people don't test development distributions

Posted Jul 7, 2009 13:25 UTC (Tue) by rsidd (subscriber, #2582) [Link]

I don't use Fedora but I keep reading articles like this one (LWN, 01/2009)

Why people don't test development distributions

Posted Jul 7, 2009 13:52 UTC (Tue) by nim-nim (subscriber, #34454) [Link]

You keep reading this kind of article because Fedora is a very active and open project. Fedora makes headlines about its problems because it always tries to improve, so heated post-mortem public discussions are common and expected.

Users of other distributions seem to prefer tip-toeing around problems to avoid giving their choice a bad rep.

Why people don't test development distributions

Posted Jul 7, 2009 13:38 UTC (Tue) by mrshiny (subscriber, #4266) [Link]

I'd be willing to run a development distro except I have so many problems with the stable one, I can't imagine dealing with more problems. Maybe it's my choice of Distro, but sometimes Fedora's stable updates break random things for me. I already feel like I'm testing beta software.

Endless updates

Posted Jul 7, 2009 14:40 UTC (Tue) by southey (guest, #9466) [Link]

In at least the case of Fedora rawhide you often get updates, which is a little tiresome because there can be many, many packages to update at the same time. If you choose to update, the update then often fails, usually because of some fake dependency on another package (especially if the package is backwards compatible). It also becomes tedious to even select which updates to apply or find the package(s) that are blocking your update.

But on the positive side, it does allow you to get the major changes to the core packages like KDE and Gnome without having to wait until the next release in a year's time. Also with Fedora you can stop at a release if you get a system you like!

Why people don't test development distributions

Posted Jul 7, 2009 15:30 UTC (Tue) by salimma (subscriber, #34460) [Link]

From the bug posting, it looks more like a glibc/prelink mismatch after a *glibc* update, not a broken prelink -- though, granted, both ought to have been upgraded at the same point (and ideally maintained by the same person).

Why people don't test development distributions

Posted Jul 7, 2009 16:31 UTC (Tue) by iabervon (subscriber, #722) [Link]

One thing I really like about Gentoo is how the "development version" works. The user gets to list all of the packages where they would like to get development versions (which can be "everything"), and they get development versions of those and not of other packages. (Of course, this depends on the distribution supporting being in a mixed state, which is much easier with a source-based distribution than with a binary distribution, where you'd have to make some hard choices about which versions of libraries everything links against; there are still obviously some cases where some combinations fail to meet version requirements for dependencies).

This is really handy for being able to test particular packages that you're interested in, and being able to adjust these as needed. For example, I could positively identify that X.org server 1.5 (and not anything else) broke my arrow keys. Also, I could then say that I didn't want to test it any more, because I'd determined that it had problems and didn't feel compelled to suffer needlessly until there was a possible fix. (Actually, it turned out that I was accidentally remapping the wrong keys in a config file because keycodes had been renumbered, so the fix was to my configuration, but I could use my arrow keys in 1.4 while I figured this out).

I suspect that it would be really helpful to the process in general to be able to tell your package manager: don't give me a system where bug #X is known to be present. Then you could do your update, find that sound doesn't work, report the bug (or find the existing bug report), say that bug's a show-stopper for you, update again (causing that package to be downgraded to an earlier version), and find out whether the new version of your music player is broken or not. When the sound bug might be fixed, the package manager would try upgrading again so you could test, and you could either say that it's still broken (and again revert to the known-good version) or find that it now works.
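
A minimal sketch of that selection rule, assuming the package manager had access to a mapping from package versions to the bugs known to affect them (no existing package manager is claimed to work this way). The function name, the version strings, and bug number 42 are all hypothetical.

```python
def pick_version(available, known_bugs, show_stoppers):
    """Return the newest version with no bug the user marked as a show-stopper.

    available:     versions ordered oldest to newest
    known_bugs:    dict mapping version -> set of bug ids known to affect it
    show_stoppers: set of bug ids the user refuses to live with
    """
    for version in reversed(available):
        if not (known_bugs.get(version, set()) & show_stoppers):
            return version
    return None  # every available version is affected


# Hypothetical history: bug 42 (no sound) arrives in 2.1 and is fixed in 2.2.
available = ["2.0", "2.1", "2.2"]
known_bugs = {"2.1": {42}}

print(pick_version(available[:2], known_bugs, set()))  # nobody has complained yet
print(pick_version(available[:2], known_bugs, {42}))   # bug 42 marked as a show-stopper
print(pick_version(available, known_bugs, {42}))       # a possible fix is released
```

The three calls walk through the cycle described above: the first update picks 2.1, marking the bug causes a downgrade to 2.0, and once 2.2 appears without the bug recorded against it the package is upgraded again for retesting.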

Why people don't test development distributions

Posted Jul 7, 2009 16:57 UTC (Tue) by joey (guest, #328) [Link]

Reading the article, I can't help but think Jon is generalizing from one development distribution and reaching some not so general conclusions, although there's good stuff in there too. (I'm looking forward to btrfs rollbacks..)

My anecdote: I've run Debian unstable on all my servers and desktops for 10+ years, and only 3 times have I had breakage on the order of a broken libc or bad prelink. If I had chosen to run Debian testing, I'd have missed 2 of the breakages (testing was not available when the first happened).

> Distributors would like to see wider testing of their development releases, but, as your editor's recent experience shows, there are limits to how wide this testing community can be expected to be.

According to popcon, currently 30% of the subset of Debian users who choose to enable reporting use unstable or testing, not stable. If anything, some in Debian may wish to see less wide use of its development branches, but stable is not useful for lots of users.

> Anybody who has worked with development distributions for any period of time knows that the early part of the distribution development cycle is when things are most likely to go wrong.

This is certainly true, and a look at the debian-user mailing list will find similar warnings about using unstable after a release. Less so for the testing distribution, as the way testing's algorithm reacts to lots of churn and bugs in unstable is to stop updating many packages from unstable until the churn quiets down.

> Provide an indication of the state of the distribution. Many beaches are equipped with red flags which are posted when dangerous currents are present. Wouldn't it be nice if an apt-get upgrade could respond with a message like "the current threat condition is orange, you may want to reconsider"?

As previously noted, apt-listbugs can do just that. It looks like about 1 in 10 users of unstable (or testing) uses apt-listbugs.

Why people don't test development distributions

Posted Jul 8, 2009 4:20 UTC (Wed) by rsidd (subscriber, #2582) [Link]

Debian has a bad reputation because of the constant bickering on what "free" really means (but these discussions are useful in the long run, to everybody, not just to Debian!). And also because of the painful delays in its "stable" releases, but this is very misleading: Debian testing and unstable, as you say, are much more stable than the releases of many other distros.

I think the reason for Debian's success is the package management system: nothing is perfect but Debian's is as close as one can get (further improvements will require support from the OS/filesystem, such as snapshotting and rollbacks). The newer distros realise this and nearly all of them have chosen to use Debian's package management rather than RPM. If Fedora can make RPM+yum as robust and stable as dpkg+apt, then many problems may go away. But might it not be easier to just scrap RPM altogether and use dpkg+apt? Is it the NIH syndrome? Or do RPM/yum really offer something to Fedora that dpkg/apt don't?

Why people don't test development distributions

Posted Jul 8, 2009 13:31 UTC (Wed) by nim-nim (subscriber, #34454) [Link]

dpkg+apt's main advantage is the existence of a large package pool (Debian) with little involvement by a commercial entity like Red Hat, and just static/problematic enough it's not succeeding as much as it could. So you can take the 80% done by Debian, add your 20% and get something dramatically better from the user point of view. You don't see any newer distro that chose dpkg+apt without forking the Debian Repository too.

If you try the same thing with the Fedora or Red Hat package pool you'll have a hard time proving your fork is better because there are so many public efforts pouring in the main branch (as Oracle, Mandriva, etc. learnt). So it's not attractive for third-parties that want to make their own name. It takes behemoths like Intel or Oracle to try to outcompete Red Hat on its home turf nowadays.

If Ubuntu manages to raise the limit Debian-side the same way, you'll see third-parties looking elsewhere for a new base.

Why people don't test development distributions

Posted Jul 9, 2009 1:15 UTC (Thu) by rsidd (subscriber, #2582) [Link]

Well, the main advantage is that upgrades *just work*. Even across distros. I've gone from Knoppix to Debian to Ubuntu via apt-get dist-upgrade. Ubuntu has a graphical update tool that takes care of some subtle things during the upgrade process, but using command-line apt-get dist-upgrade will not break your system. Fedora, last I checked, still recommends backup+reinstall for upgrading.

Why people don't test development distributions

Posted Jul 9, 2009 20:59 UTC (Thu) by nim-nim (subscriber, #34454) [Link]

I'm happy for you.

I'm sure people have the same stories for Fedora/RHEL/CentOS derivatives ("not recommended" is different from "you can make it work most of the time with little pain").

Though I doubt any distro ever shortlisted a package management system because it made it easy to upgrade to a different distro.

Why people don't test development distributions

Posted Jul 9, 2009 1:47 UTC (Thu) by PaulWay (guest, #45600) [Link]

The snapshotting idea is a good one, and easily implementable using LVM. Just make sure that your LVs don't use all of your disk space and you can create snapshots to your heart's content. Then you can simply tell GRUB to boot off that snapshot (AFAIK).

Having an automated system to do this in development versions of distros would be awesome though. I've done it manually on my work machine prior to trying the 'preupgrade' process to Fedora 11 and it would be nice to have that done automatically. LVM should make that a snap, if you'll forgive the pun.
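
For what it's worth, the manual step is a single lvcreate call, so an automated pre-upgrade hook would not need much. A minimal sketch, assuming a volume group named vg0 with a logical volume named root and enough free extents left for a 5G snapshot (all three are placeholders); it only builds and prints the command rather than running it.

```python
import shlex


def snapshot_command(vg, lv, snap_name, size="5G"):
    """Build the lvcreate invocation that snapshots /dev/<vg>/<lv>."""
    return [
        "lvcreate", "--snapshot",
        "--size", size,        # space reserved for blocks that change after the snapshot
        "--name", snap_name,
        f"/dev/{vg}/{lv}",
    ]


# Placeholder names; substitute the real volume group and logical volume.
cmd = snapshot_command("vg0", "root", "root-pre-upgrade")
print(shlex.join(cmd))
```

To actually take the snapshot, the list could be passed to subprocess.run(cmd, check=True) as root before the upgrade starts.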

Have fun,

Paul

Why people don't test development distributions

Posted Jul 9, 2009 10:46 UTC (Thu) by nix (subscriber, #2304) [Link]

My understanding is that snapshotting the root filesystem, especially under memory pressure, is still deadlock-prone: and after the deadlock has hit everything accessing the root fs blocks, which basically means a reboot.

Why people don't test development distributions

Posted Jul 9, 2009 16:58 UTC (Thu) by mdz@debian.org (guest, #14112) [Link]

Thanks for this article. It will make interesting reading for all of the folk who ask "why don't distributions just ship pristine upstream source without any patches?" The answer, of course, is that the result doesn't actually work at all (even when updating versions from a stable state).

It turns out that making a working system out of the whole mess is actually a fair bit of work. :-)

Why people don't test beta releases and RCs

Posted Jul 10, 2009 21:46 UTC (Fri) by Richard_J_Neill (guest, #23093) [Link]

In my experience (variously with Mandrake/Mandriva and Ubuntu), I'm usually willing to install ONE of the beta or RC releases before the final release, and put some effort into testing it. I usually find about a dozen bugs varying from outright breakage to smaller errors, and I report these. I usually track down the root cause as best I can, and include a fix or workaround.

However, the distros never get these bugs fixed in time for the release. They don't even tend to fix them within the lifespan of that release. Sometimes they are fixed in the dev version only. So what's the point?

I think there is an implied social contract here: if I (the user) test a distro alpha/beta and take the time to make a correct and detailed bug report, with all the information needed, and if I spend multiple hours of my time doing it, I *expect* the distributor to read the bug report and get the fix deployed in a timely manner, and pushed into the *stable* release, not just the next dev cycle. But most distros don't do this.

Why people don't test beta releases and RCs

Posted Jul 14, 2009 9:00 UTC (Tue) by marcH (subscriber, #57642) [Link]

> However, the distros never get these bugs fixed in time for the release. They don't even tend to fix them within the lifespan of that release. Sometimes they are fixed in the dev version only. So what's the point?

Just lack of resources? You get what you paid for.

Bug reports like yours are still tremendously useful. I mean not just to the upstream developer(s): you have no idea how many end users are delighted to find and apply your fix or workaround.

Why people don't test development distributions

Posted Jul 14, 2009 9:45 UTC (Tue) by jcm (subscriber, #18262) [Link]

I enjoy having a laptop that actually works without surprises - you know, where you can plug in a projector and have a reasonable expectation that it works as well as it did the last time you tried a few weeks ago, and that an update didn't break it yesterday. And many other examples. I admit to being somewhat amused when people say "OMG, it broke horribly!" as a machine they rely upon randomly breaks at the worst moment (I'm not attacking Jon here!).

No. The only possible answer is to run development distributions on a dedicated development machine - I usually do it under KVM, and use full screen X forwarding and the like (to the same laptop) so it feels much like the real thing. Except I can reboot and generally things don't fall apart. Those who want to test on the bare metal - I'm sure there are plenty of second systems out there, kernel test boxes, and the like. Old laptops - you name it - but running any development distribution for daily productivity purposes is a giant time vampire and will bite eventually.

Jon.

Why people don't test development distributions

Posted Jul 21, 2009 16:06 UTC (Tue) by DarthCthulhu (guest, #50384) [Link]

One of the best things about GoboLinux is the ability to have (and run!) multiple versions of software on the system. None of this: Upgrade Everything And Never Be Able to Go Back. If you upgrade a library or piece of software and it turns out to suck, you can always go back to using the one that doesn't suck. You get the best of both worlds: fast access to Shiny New Software, as well as Continuing Access to Stuff That Works.

I've even run KDE3 and KDE4 at the same time! Literally, both were running and operational in parallel. When KDE4 would inevitably crash hard, I could continue working with the KDE3 interface while KDE4 eventually raised itself from confused sloth to restart.

This does mean you have to occasionally go in and manually uninstall old libraries and software you're not using any more (by 'manually', I mean run the RemoveProgram script; you can also do it by rm-ing the files and symlinks if you want), but that is a small price to pay.

Second class bug reporting

Posted Jul 23, 2009 13:19 UTC (Thu) by pboddie (subscriber, #50784) [Link]

This was kind of mentioned in the article and mostly ignored by the comments, but I have to take issue with the continual persuasion to run development distributions. Just because I'm a software developer, it doesn't mean that I'm developing and testing the various distribution components from kernel through the user interface to the different applications - I write my own software, too, and want a reasonable level of stability to be able to do so, even though I may want to install cutting-edge releases of some packages. In other words, while I'm perfectly happy to report bugs, testing the fundamental components of the system is not exactly my job. And in some cases, it's clear that testing the basics was nobody's job.

What one sees with Ubuntu, for example, is that when reporting bugs, there's a resistance to do a great deal with them if one isn't running the latest pre-release version. The argument goes that only the upcoming release is fixable and the previous releases are mostly "done". Thus, "you can help with the new release" takes priority over improving something which is already doing its job. (It's also irritating to see that things like the packages Web site also default to the latest release, rather than "any", but I suppose they'd get complaints either way.)

So, the choice is between running something that probably won't be fixed or something which is possibly broken. How about a better range of choices, distro makers?