So what?

Story: Test: Do Linux filesystems need defragmentation? (Total Replies: 8)
chalex

Dec 10, 2007
7:59 PM EDT
Thank you for not making any unfounded claims based on your findings. You could have said something like "my system is so much snappier now!"

That's great: you've found that some of your files are fragmented, and then you found a utility to defragment them. Your average number of fragments per file went down.

"As the results show, it's worth the effort to try Con Kolivas defragmentation script." Sure, that's true if you care about the average number of fragments per file. However, you did not explain why I should care about a metric like that.
hkwint

Dec 10, 2007
10:01 PM EDT
Quoting:However, you did not explain why I should care about a metric like that.


It's almost impossible to benchmark in my opinion, though I _think_ I felt the difference back when I used WinXP. It's merely peace of mind you're buying: of the factors that could slow down your experience, at least file fragmentation is no longer one of them.
set

Dec 10, 2007
10:50 PM EDT
While your article is interesting in that it introduced me to the 'filefrag' util (used by the first script) and Kolivas' script, I have to agree with chalex that you really do not conclude with anything useful. Fragmentation isn't inherently bad; it just means that a file is not 100% contiguous on disk. Your own results indicate that only a few large files were susceptible to fragmentation, which makes sense and shouldn't present much of a performance problem. Fragmentation presented real problems on crufty old filesystems, like FAT-based ones, when after a while it was hard to create a file that wasn't broken into chunks all over the place. (Fragmentation hurts you in two ways: it might waste your read-ahead, and it costs you in seeking.) [It also can impact file recovery after a disaster, but that's not performance.]
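
For the curious: 'filefrag' ships with e2fsprogs and reports the extent count directly (it usually needs root, and the path here is just an example):

    filefrag /var/log/messages      # prints something like: /var/log/messages: 13 extents found
    filefrag -v /var/log/messages   # -v additionally lists each extent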

Basically, your conclusion is that 'it makes me feel better', which is fine, but don't pretend it's any more than that. And don't try to draw inferences from your old use of another operating system with an unspecified filesystem. Fragmentation was a problem that was solved long ago, and if you actually found that it was a significant problem with a filesystem, I would say that filesystem is probably not very good. At least you do provide the tools to allow someone to investigate further.

As far as benchmarking, a read-throughput test wouldn't be too hard: just 'time find /foo -type f -exec cat {} \; > /dev/null' (find a way to invalidate the page cache -- exercise for the student), defrag, and repeat the test.
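
Something like this should do it (untested sketch; the cache drop needs root and a 2.6.16 or later kernel, and /foo is whatever tree you want to measure):

    sync                                 # flush dirty pages first
    echo 3 > /proc/sys/vm/drop_caches    # drop page cache, dentries and inodes
    time find /foo -type f -exec cat {} \; > /dev/null   # timed read of every file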
hkwint

Dec 11, 2007
12:34 AM EDT
Quoting:Basically, your conclusion is that 'it makes me feel better', which is fine, but don't pretend it's any more than that.


That's true. The most important point for me was that the partitions that probably matter most for system performance, /usr and /var, were not fragmented, as I had assumed. I assumed so because a Slackware user pointed out Gentoo made his system slow 'because of filesystem fragmentation', and I guess that's no issue on my /usr and /var partitions. The rest of the article is 'but what if these results were worse, and I would like to defragment?' Maybe that wasn't clear enough. If files on my home partition - like my music files - are a bit fragmented, that would probably not present a problem; let alone a Linux ISO that is uploaded through BitTorrent.

It's a bit sad I can't 'un-defragment' to do the benchmarking you suggest on the system I tested, but I might try it on my other PC. Perhaps I should test how much time it takes for VMware to start WinXP from a fragmented and a defragmented image, but I'm afraid I can't give quantitative and representative results. Moreover, I'm no hero in the technical details, as you might have understood by now. It's a bit like the RSDL scheduler maybe (also by Con Kolivas), where people _felt_ it made their system snappier, but Con was unable to present Linus with thorough benchmarks actually showing the improvements people felt. That's the reason I didn't go into that and didn't draw conclusions; it's really hard to do a proper benchmark, and my feelings about the improvement in performance probably aren't a good measure either. Nonetheless, as indicated, I might try the read-throughput test.
jezuch

Dec 11, 2007
8:11 AM EDT
Quoting:a Slackware user pointed out Gentoo made his system slow 'because of filesystem fragmentation', and I guess that's no issue on my /usr and /var partitions.


More likely it was slow because there were lots and lots of small files scattered throughout the disk. That, I would expect, is the result of multiple compile jobs (a "typical Gentoo workload" ;) ) - there are no (or not many) large files involved in that. It depends on the filesystem, but the typical block size (or "allocation unit" in Windows-speak) is 1 kB or 4 kB - you won't find many source files larger than that :) A file that fits in a single block cannot fragment at all, so file fragmentation should not be the issue.
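
A quick way to check that claim (assuming GNU find; the kernel tree path is just an example):

    # count the files that fit within a single 4 kB block, versus the total
    find /usr/src/linux -type f -size -4k | wc -l
    find /usr/src/linux -type f | wc -l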
gus3

Dec 11, 2007
8:40 PM EDT
Quoting:a Slackware user pointed out Gentoo made his system slow 'because of filesystem fragmentation'
That was probably me. I did state that, or something closely resembling it, several months ago on these boards.

If I were to say the same right now, I would not have the courage of my conviction. I would hedge my bets by saying, "but I haven't moved Portage to its own filesystem yet, so take it with a grain of salt." However, I stand by the statement that my old 550 MHz P3 laptop right now has a snappier response running Slackware-current than my powerful 2.1 GHz Athlon desktop had with the best-tuned Gentoo I could muster. And that's saying something.
set

Dec 12, 2007
12:55 AM EDT
Just a quick comment re: gus3 (as it touches on the lack of quantifiable measurements and apples-to-oranges comparisons). Logically, we would expect a 2.1 GHz Athlon to perform better than a 550 MHz P3, but attempting to compare the two, running two completely different distributions with possibly different kernels/filesystems/video cards/etc., doesn't really mean much. And when the metric is 'snappier', it gets even fuzzier, because even if you were comparing apples to apples (testing on the same hardware), kernel configuration and version could make a tremendous difference.

So, I hope that 'And that's saying something.' means that further investigation is warranted, rather than case closed.
gus3

Dec 12, 2007
1:10 AM EDT
@set:

Take it as you will, but when I switched from Gentoo to Slackware, I took a step backwards in software versions, so some "innovations" were gone, which should have resulted in a slowdown, not a speedup. Add to that the default i486 optimizations for Slackware (rather than the Athlon optimizations I gave to Gentoo), and it does make one wonder what external circumstance could have caused Gentoo's slowdown.

Given the Portage system, my #1 suspect is filesystem fragmentation.

I use the same distro on all my systems (perhaps different versions, but all Slackware right now). Yes, they have different hardware, but I seriously doubt that it would affect a subjective "snappiness" judgment in the manner I observed.
hkwint

Dec 12, 2007
11:39 AM EDT
gus3: You're right, it was you who started my '(de)frag journey', thanks for that BTW, it was interesting.

If you say you find Slackware snappier than Gentoo, I believe you. However, it seems file fragmentation wasn't an issue on my system, and of course, as a 'real Gentoo user', I was glad to find that out; we wouldn't like Gentoo being responsible for file fragmentation, would we? ;)

I have tried to understand this whole (de)frag issue by reading a lot of discussions (I believe I spent several hours trying to catch up) - amongst other things on the Gentoo forums, an old Linux Gazette benchmark, and phsolide's remarks in the other thread beneath this article - but the more I read, the more difficult and complex it all becomes (as phsolide suggests). It seems that because of LBA, two data blocks that are logically next to each other aren't necessarily physically next to each other. I also made a great blunder by forgetting I use EVMS (mainly a frontend to LVM2), which maps my 'binary data' through several virtual layers to the physical hard disk. I'm glad I only reported fragmentation numbers and no performance improvements or anything like that.

Looking at what other users say, I became more confused, especially when it comes to ReiserFS. A lot of the comments made on the issue were contradictory.

Anyway, these are the most important things I 'learnt':

- Neither ReiserFS v3 nor ext3 has a defragmentation program; ReiserFS v4 will have its own.
- However, I understood ext3 isn't that susceptible to file fragmentation. On the other hand, a lot of people say 'ReiserFS v3 fragments like hell'.
- Most people agree that putting /var, /tmp and /usr/portage on different partitions helps prevent fragmentation.
- Tail packing in ReiserFS partly prevents fragmentation.
- XFS doesn't allocate blocks right away like most filesystems do these days; it delays allocation until the data is flushed to disk (synced, I believe). Hans Reiser believes this is the way to go, since it prevents fragmentation, and this mechanism will be used in ReiserFS v4.
- From Dino I understood file fragmentation really was an issue 'back in the old UNIX days'.
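
By the way, as far as I understand it, Con Kolivas' script (and scripts like it) basically just rewrites each file so the filesystem allocates its blocks afresh. A naive sketch of the idea (not the actual script, and the path is a hypothetical example):

    f=/home/hkwint/some/fragmented/file   # hypothetical example file
    cp -p -- "$f" "$f.tmp"                # the new copy gets freshly allocated blocks
    sync                                  # make sure the copy is on disk
    mv -- "$f.tmp" "$f"                   # replace the original with the copy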

The connection between file fragmentation and performance, which chalex started this topic with, is also a difficult subject, and I'm afraid there's not much I can say about it, since I don't have the knowledge for it.

Maybe it would be a good idea to find the (kernel-module) maintainers of several of those filesystems and ask them for some comments on this whole issue.
