NUMA-style GPU usage

Story: AMD Launches Open GPU Website
Total Replies: 15
Bob_Robertson

Feb 14, 2008
6:51 AM EDT
Here's what I'd like to see.

Right now, Linux is good with Symmetric Multi Processing. Those of you lucky people with multi-core CPUs know what I mean, it's practically automatic at this point. Debian stopped providing pre-packaged kernels without SMP turned on years ago.

Right now, Linux is good with NUMA, Non-Uniform Memory Access systems. SMP without the "S", as it were.

But I, like lots of others, am sitting at a machine with a GPU. Until or unless specifically called upon by a program that demands its attention, it pretty much sits idle.

Might it be possible to utilize this GPU, that has its own dedicated 64M of RAM (in my 5 year old machine), that is optimized for mathematical functions, during normal functioning when not being called upon specifically for graphical manipulation? Hmmm....

http://www.wired.com/techbiz/it/news/2007/10/ps3_supercomput...

Right now, the PS3 is being utilized as a supercomputer node, with diverse CPUs in it, specifically a general purpose CPU and multiple IBM "cell" processors, in one box. My thought is to utilize the GPU in much the same way.

Maybe a kernel module specific to that GPU would have to be built, which would preclude closed-source GPUs like Nvidia (which sucks, because I have an Nvidia), but the recent actions by the ATI division of AMD to completely document their GPU systems makes it possible. And there are possibilities in the F/OSS versions of "TheOpenGraphicsCard".

It might be a way to supercharge older machines with relatively cheap "older" video cards. "Why did you put two graphics cards in your server when you don't even use the onboard video?" Hehe!

Sander_Marechal

Feb 14, 2008
7:27 AM EDT
Quoting:Debian stopped providing pre-packaged kernels without SMP turned on years ago.


Not quite. In newer kernels, SMP support can be turned on and off at boot-time instead of at compile-time. It's for kernels 2.6.18 and up IIRC, which just happens to be what Etch ships with. In those kernels, when you boot and Linux detects that you do not have an SMP machine, it overwrites in-memory the parts that deal with SMP with NOPs (no-operation instructions). It's self-adapting code. Pretty cool :-)

So, non-SMP kernels haven't disappeared at all.
gus3

Feb 14, 2008
7:43 AM EDT
Quoting:when you boot and Linux detects that you do not have an SMP machine, it overwrites in-memory the parts that deal with SMP with NOOPs (No Operation).
Self-modifying code?!? That relies on so many parts outside the kernel working properly, it isn't funny.

And whose bright idea was that? Gee, I can almost guess...
Sander_Marechal

Feb 14, 2008
8:47 AM EDT
Quoting:Self-modifying code?!? That relies on so many parts outside the kernel working properly, it isn't funny.


Why? It's only the in-kernel stuff that gets modified. I don't see at all what it would have to do with anything outside the kernel.
Bob_Robertson

Feb 14, 2008
10:19 AM EDT
Ah, no, SMP is turned on at compile time. It doesn't modify itself, it simply doesn't use SMP if it's not an SMP machine.

Exactly the same as compiling in a particular filesystem and then not using any devices with that file system. It's in there, just not used.

The performance hit and "wasted kernel space" is so small as to be irrelevant. That's why Debian does it, especially since new hardware has multi-core CPUs as a rule rather than as an exception.
Sander_Marechal

Feb 14, 2008
2:38 PM EDT
Bob, sorry to disagree, but you're wrong. See the Debian documentation at http://www.debian.org/releases/stable/i386/ch02s01.html.en

Quoting:2.1.5. Multiple Processors

Multiprocessor support — also called “symmetric multiprocessing” or SMP — is available for this architecture. The standard Debian 4.0 kernel image was compiled with SMP-alternatives support. This means that the kernel will detect the number of processors (or processor cores) and will automatically deactivate SMP on uniprocessor systems.

The 486 flavour of the Debian kernel image packages for Intel x86 is not compiled with SMP support.


"SMP alternatives" is the self-modifying code I described above. There is no 486-smp variant. Only SMP alternatives in the default Debian kernel. For a good review of SMP alternatives, see this LWN article: http://lwn.net/Articles/164121/

I see that not only does it work at boot time, but all the time. You can switch from SMP to non-SMP if you hot-swap a CPU. Very useful for e.g. Xen guests when you migrate a running guest from a non-SMP system to an SMP system. Instant SMP support without even a reboot :-)

Quoting:A more interesting - and more controversial - feature of this patch is that, when the kernel is converted between the SMP and uniprocessor mode, the overwritten instructions are remembered. At some point in the future, then, the alternatives code can reverse the change, switching the kernel back to the full SMP implementation. The code is then run whenever a CPU hotplug event happens, optimizing the kernel for the system's new configuration. A system can be initially booted with a single processor, and the alternatives code will edit out all of the SMP-related instructions. If another processor is added later on, the kernel will be automatically converted back into a fully SMP-capable mode. If processors are removed, the SMP code can be taken out too. All within a running system, with no need to reboot.

This feature may seem useful to a rather small minority of users - and it is. But that minority may be bigger than one thinks. Virtualization systems (and Xen in particular) are implementing the ability to configure the number of (virtual) CPUs in each running instance on the fly, in response to the load on each. So it may really be that a busy, virtualized server will have CPUs hot-plugged into it, and that those processors will go away when the load drops. Enabling the kernel to reconfigure itself on the fly when this happens will allow each Xen instance to run a kernel which is optimized for its current situation.
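To make the "remember and reverse" idea concrete, here is a minimal user-space sketch. It is not the kernel's code: the struct patch_site record and the functions patch_for_up() and patch_for_smp() are hypothetical names, and a plain byte array stands in for kernel text. It assumes each patch site is a one-byte x86 LOCK prefix (0xF0) that gets replaced by a one-byte NOP (0x90).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define X86_NOP 0x90u /* x86 single-byte NOP */

/* Hypothetical record of one patch site inside a code image. */
struct patch_site {
    size_t  offset;  /* position of the LOCK prefix in the image */
    uint8_t saved;   /* original byte, remembered so it can be restored */
    int     patched; /* nonzero while the NOP is in place */
};

/* Uniprocessor mode: remember each original byte, then overwrite it with a NOP. */
void patch_for_up(uint8_t *image, size_t image_len,
                  struct patch_site *sites, size_t nsites)
{
    for (size_t i = 0; i < nsites; i++) {
        assert(sites[i].offset < image_len);
        if (sites[i].patched)
            continue; /* already in uniprocessor form */
        sites[i].saved = image[sites[i].offset];
        image[sites[i].offset] = X86_NOP;
        sites[i].patched = 1;
    }
}

/* SMP mode: put the remembered bytes back. */
void patch_for_smp(uint8_t *image, size_t image_len,
                   struct patch_site *sites, size_t nsites)
{
    for (size_t i = 0; i < nsites; i++) {
        assert(sites[i].offset < image_len);
        if (!sites[i].patched)
            continue; /* already in SMP form */
        image[sites[i].offset] = sites[i].saved;
        sites[i].patched = 0;
    }
}
```

Calling patch_for_up() when the last extra CPU goes away and patch_for_smp() when one is hot-plugged mirrors the round trip the article describes. The real kernel also has to worry about things this sketch ignores, such as making sure no CPU is executing the bytes while they are rewritten.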


jezuch

Feb 14, 2008
3:19 PM EDT
Quoting:Maybe a kernel module specific to that GPU would have to be built, which would preclude closed-source GPUs like Nvidia (which sucks, because I have an Nvidia)


There's GPGPU (general-purpose computing on GPUs), which AFAICS is a somewhat hot topic recently. I can't imagine anything GP- without a public API, possibly open (or at least "open"). Older cards like yours would probably be outside of the scope of this, but...
Bob_Robertson

Feb 14, 2008
4:25 PM EDT
Sander, I still don't see where it is "self modifying". All I see is what I said, that if there isn't more than one CPU, it doesn't use SMP.

If there is something more in what you wrote, I just don't see it. Sorry.

> Older cards like yours would probably be outside of the scope of this, but...

The video card producers seem to be getting the idea about openly publishing their specs, so I'm sure the situation will improve. However, I can say from experience that the people writing the "nv" and "nouveau" GPL drivers are having to reverse engineer the Nvidia cards, so it's clear that Nvidia isn't playing entirely nice as yet.
gus3

Feb 14, 2008
9:20 PM EDT
Sander:

Quoting:Why? It's only the in-kernel stuff that gets modified. I don't see at all what it would have to do with anything outside the kernel.
Well, the compiler and linker, for starters. An off-by-one error in the wrong place, and you put your whole "let's overwrite the SMP code" at risk.

All the more reason to start using a properly-segmented programming model.
Sander_Marechal

Feb 14, 2008
10:36 PM EDT
Quoting:Sander, I still don't see where it is "self modifying".


Did you read the LWN article? It's not like the kernel simply skips the SMP instructions. It edits them out. Or it overwrites some parts of the kernel with different code optimized for single processor systems. Here is the simplest example I could find from the SMP Alternatives patch:

    alternative_smp(
        __raw_spin_lock_string,
        __raw_spin_lock_string_up,
        "=m" (lock->slock) : : "memory");

When compiling this, the kernel will be compiled to use __raw_spin_lock_string. __raw_spin_lock_string_up is also compiled but put aside in a special section. When the kernel detects a CPU change to non-SMP, it patches the kernel in-memory to replace __raw_spin_lock_string with __raw_spin_lock_string_up. Of course, you can patch not just single calls but whole sections of code this way. Here's an example that executes two different sets of assembler instructions:

    alternative_smp("lock; subl $1,%0\n\t"
                    "jns 1f\n"
                    "pushl %%eax\n\t"
                    "leal %0,%%eax\n\t"
                    "call " helper "\n\t"
                    "popl %%eax\n\t"
                    "1:\n",
                    "subl $1,%0\n\t",
                    "=m" (*(volatile int *)rw) : : "memory")

It's self-modifying code. No matter what way you look at it.
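For anyone who wants to see the replacement step in isolation, here is a rough user-space sketch of swapping a longer SMP instruction sequence for a shorter uniprocessor one. Everything in it is an assumption made for illustration: struct alt_entry and apply_up_alternative() are hypothetical names, not the patch's own, and a byte array stands in for kernel memory. The leftover bytes are filled with one-byte x86 NOPs (0x90) so the patched region keeps its original length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define X86_NOP 0x90u /* x86 single-byte NOP */

/* Hypothetical descriptor for one replaceable region. It plays the role of
 * the bookkeeping that says where the SMP code is and which uniprocessor
 * code was "put aside" for it. */
struct alt_entry {
    size_t         site_offset; /* start of the SMP sequence in the image */
    size_t         site_len;    /* length of the SMP sequence */
    const uint8_t *up_code;     /* uniprocessor replacement bytes */
    size_t         up_len;      /* must not exceed site_len */
};

/* Overwrite the SMP sequence with the uniprocessor one and pad the tail
 * with NOPs. Returns 0 on success, -1 if the entry does not fit. */
int apply_up_alternative(uint8_t *image, size_t image_len,
                         const struct alt_entry *e)
{
    assert(image != NULL && e != NULL);
    if (e->up_len > e->site_len ||
        e->site_offset > image_len ||
        e->site_len > image_len - e->site_offset)
        return -1;
    memcpy(image + e->site_offset, e->up_code, e->up_len);
    memset(image + e->site_offset + e->up_len, X86_NOP,
           e->site_len - e->up_len);
    return 0;
}
```

Because the region never changes size, nothing after it has to move, which is what makes this kind of in-place patching practical.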

Quoting:An off-by-one error in the wrong place, and you put your whole "let's overwrite the SMP code" at risk.


From what I understand from the SMP Alternatives patch, the compiler marks the start and end of sections that should be overwritten at compile time, so this happens automagically. The only place where you could have an off-by-one error is in the implementation of the alternative_smp() macro and in the code that does the patching. I'm pretty sure that they verified that code thrice. It's not that much code anyway. Have a look at the original patch at LWN: http://lwn.net/Articles/163810/
gus3

Feb 14, 2008
10:46 PM EDT
I'm not talking about the Linux kernel code being off-by-one. I'm talking about some corner case in the compiler or linker themselves.
Sander_Marechal

Feb 15, 2008
1:21 AM EDT
I presume that such bugs in the compiler are quickly found and fixed.
Bob_Robertson

Feb 15, 2008
7:40 AM EDT
$ uname -a
Linux dierdre 2.6.24-1-686 #1 SMP Mon Feb 11 14:37:45 UTC 2008 i686 GNU/Linux

Gee, something must be wrong with my system. It has only one single-core cpu, and it still shows up as "SMP". It didn't modify itself.

Oh well, it's a monumentally stupid thing to argue about anyway, so Sander, you're absolutely right. Good for you.
Sander_Marechal

Feb 15, 2008
1:22 PM EDT
Are you running a vanilla Debian kernel? Not all distros carry smp alternatives. It's a patch. Also, you're running a 686 kernel. Hyper-Threading counts as SMP. How many CPUs does /proc/cpuinfo show? Also, perhaps uname -a is always saying SMP because it can be used that way. I have no idea if the smp alternatives code also changes the kernel's uname when switching between smp and non-smp mode.
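Both checks can be done from a small C program instead of by eye. This is only a sketch: online_cpus() and version_mentions_smp() are made-up helper names, and _SC_NPROCESSORS_ONLN is a widely available extension (glibc has it) rather than something POSIX guarantees. As far as I know, the "SMP" tag in the version string is fixed when the kernel is built, so it says what the kernel can do, not which mode it is currently patched into.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Number of processors currently online, or -1 if the system cannot say.
 * This should match the number of "processor" entries in /proc/cpuinfo. */
long online_cpus(void)
{
    return sysconf(_SC_NPROCESSORS_ONLN);
}

/* Returns 1 if a kernel version string (the part "uname -v" prints)
 * contains the "SMP" tag, 0 otherwise. */
int version_mentions_smp(const char *version)
{
    assert(version != NULL);
    return strstr(version, "SMP") != NULL;
}
```

The version string itself can be obtained with uname(2) from <sys/utsname.h>, whose version field holds the same text that "uname -v" prints.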

Quoting:it's a monumentally stupid thing to argue about anyway


I'm not arguing. I'm just pointing out that Debian did not stop providing non-SMP kernels like you claimed they did.
Bob_Robertson

Feb 15, 2008
1:26 PM EDT
> How many CPU's does /proc/cpuinfo show?

1. No hyperthreading.

> Are you running a vanilla Debian kernel?

linux-image-2.6.24-1-686_2.6.24-4_i386.deb

> Also, perhaps uname -a is always saying SMP because it can be used that way.

Exactly what I said in the first place.

Sander_Marechal

Feb 15, 2008
1:41 PM EDT
Quoting:Exactly what I said in the first place.


You said that Debian stopped providing non-SMP kernels and that the kernel skips the SMP parts, which would be a performance hit on non-SMP hardware. But you can rest assured that your kernel will run as fast as possible on your non-SMP system, thanks to smp alternatives :-)

It's just one of these things about the kernel that made me go "Wow!" when I first read about it.
