From: Keith W. <ke...@tu...> - 2008-05-19 12:10:08
Just reposting this with a new subject line and less preamble.

----- Original Message ----
> > Well the thing is I can't believe we don't know enough to do this
> > in some way generically, but maybe the TTM vs GEM thing proves it's
> > not possible.

I don't think there's anything particularly wrong with the GEM
interface -- I just need to know that the implementation can be fixed
so that performance doesn't suck as hard as it does in the current one,
and that people's political views on basic operations like mapping
buffers don't get in the way of writing a decent driver.

We've run a few benchmarks against i915 drivers in all their
permutations; to summarize, the results look like:

- For GPU-bound apps there are small differences, perhaps up to 10%.
  I'm really not concerned about these (yet).
- For CPU-bound apps, the overheads introduced by Intel's approach to
  buffer handling impose a significant penalty, in the region of
  50-100%.

I think the latter is the significant result -- none of these
experiments in memory management significantly change the command
stream the hardware has to operate on, so what we're varying is
essentially the CPU behaviour needed to achieve that command stream.
And it is in CPU usage where GEM (and Keith/Eric's now-abandoned TTM
driver) significantly disappoint.

Or to put it another way, GEM & master/TTM seem to burn huge amounts of
CPU just running the memory manager. This isn't true for master/no-ttm,
or for i915tex using userspace sub-allocation, where the CPU penalty
for getting decent memory management seems to be minimal relative to
the non-ttm baseline.

If there's a political desire not to use userspace sub-allocation, then
whatever kernel-based approach you want to investigate should
nonetheless make some effort to hit reasonable performance goals -- and
neither of the two current kernel-allocation-based approaches is at all
impressive.

Keith

==============================================================

And on an i945G, dual-core Pentium D 3GHz, 2MB cache, 800MHz FSB,
single-channel RAM:

Openarena timedemo at 640x480:
--------------------------------------------
master w/o TTM:  840 frames, 17.1 seconds: 49.0 fps
                 12.24s user 1.02s system 63% cpu 20.880 total
master with TTM: 840 frames, 15.8 seconds: 53.1 fps
                 13.51s user 5.15s system 95% cpu 19.571 total
i915tex_branch:  840 frames, 13.8 seconds: 61.0 fps
                 12.54s user 2.34s system 85% cpu 17.506 total
gem:             840 frames, 15.9 seconds: 52.8 fps
                 11.96s user 4.44s system 83% cpu 19.695 total

KW: It's less obvious here than in some of the tests below, but the
pattern is still clear -- compared to master/no-ttm, i915tex gets about
the same ratio of fps to CPU usage, whereas both master/ttm and gem are
significantly worse, burning much more CPU per fps, with a large chunk
of the extra CPU spent in the kernel. The particularly worrying thing
about GEM is that it isn't hitting *either* 100% CPU *or* maximum
framerates from the hardware -- that's really not very good, as it
implies hardware is being left idle unnecessarily.

glxgears:
A: ~1029 fps, 20.63user  2.88system 1:00.00elapsed 39%CPU (master, no ttm)
B: ~1072 fps, 23.97user 18.06system 1:00.00elapsed 70%CPU (master, ttm)
C: ~1128 fps, 22.38user  5.21system 1:00.00elapsed 45%CPU (i915tex, new)
D: ~1167 fps, 23.14user  9.07system 1:00.00elapsed 53%CPU (i915tex, old)
F: ~1112 fps, 24.70user 21.95system 1:00.00elapsed 77%CPU (gem)

KW: The high CPU overhead imposed by GEM and (non-suballocating)
master/TTM should be pretty clear here. master/TTM burns 30% of the CPU
just running the memory manager!! GEM gets slightly higher framerates
but uses even more CPU than master/TTM.

fgl_glxgears -fbo:
A: n/a
B: ~244 fps, 7.03user 5.30system 1:00.01elapsed 20%CPU (master, ttm)
C: ~255 fps, 6.24user 1.71system 1:00.00elapsed 13%CPU (i915tex, new)
D: ~260 fps, 6.60user 2.44system 1:00.00elapsed 15%CPU (i915tex, old)
F: ~258 fps, 7.56user 6.44system 1:00.00elapsed 23%CPU (gem)

KW: GEM & master/ttm burn more CPU to build and submit the same command
streams.

openarena 1280x1024:
A: 840 frames, 44.5 seconds: 18.9 fps (master, no ttm)
B: 840 frames, 40.8 seconds: 20.6 fps (master, ttm)
C: 840 frames, 40.4 seconds: 20.8 fps (i915tex, new)
D: 840 frames, 37.9 seconds: 22.2 fps (i915tex, old)
F: 840 frames, 40.3 seconds: 20.8 fps (gem)

KW: No CPU measurements taken here, but almost certainly GPU-bound. A
lot of similar numbers; I don't believe the deltas have anything in
particular to do with memory-management interface choices...

ipers:
A: ~285000 Poly/sec (master, no ttm)
B: ~217000 Poly/sec (master, ttm)
C: ~298000 Poly/sec (i915tex, new)
D: ~227000 Poly/sec (i915tex, old)
F: ~125000 Poly/sec (gem, GPU lockup on first attempt)

KW: No CPU measurements in this run, but all are almost certainly 100%
pinned on CPU.
- i915tex (in particular i915tex, new) shows similar performance to
  classic -- ie. low CPU overhead for this memory manager.
- GEM is significantly worse even than master/ttm -- hopefully this is
  a bug rather than a necessary characteristic of the interface.

texdown:
A: total texels=393216000.000000  time=3.004000 (master, no ttm)
B: total texels=434110464.000000  time=3.000000 (master, ttm)
C: (i915tex new -- whoops, crashes)
D: total texels=1111490560.000000 time=3.002000 (i915tex old)
F: total texels=279969792.000000  time=3.004000 (gem)

Note the huge (3x-4x) performance lead of i915tex, despite the
embarrassing crash in the newer version. I suspect this is unrelated to
command handling; probably somebody has disabled or regressed some
aspect of the texture upload path...

NOTE: The reason i915tex does so well relative to master/no-ttm is that
we can upload directly to "VRAM"... master/no-ttm treats VRAM as a
cache & always keeps a second copy of the texture safe in main memory,
hence texture-upload performance isn't great on master/no-ttm.

Here's what we're seeing on an i915, 3GHz Celeron, 256kB cache,
dual-channel RAM, ReportDamage disabled, DRM master (no gem results on
this machine...):

=======================================================================
Test               i915tex_branch        i915 master, TTM      i915 master, classic
gears              1033fps, 70.1% CPU    726fps, 100% CPU      955fps, 56% CPU
openarena          47.1fps, 17.9u 2.7s   31.5fps, 21.1u 8.7s   39fps, 17.9u 1.3s
texdown            1327MB/s              551MB/s               572MB/s
texdown, subimage  1014MB/s              134MB/s               148MB/s
ipers, no help     255 000 tri/s,        139 000 tri/s,        241 000 tri/s,
  screen             100% CPU              100% CPU              100% CPU

I would summarize the results like this:

- master/no-ttm has a basically "free" memory manager in terms of CPU
  overhead.
- master/ttm and GEM gain a proper memory manager but introduce a huge
  CPU overhead & consequent performance regression.
- i915tex makes use of userspace sub-allocation to resolve that
  regression & achieve comparable efficiency to master/no-ttm (the
  sub-allocation idea is sketched after this message).
- A separate regression seems to have killed texture-upload performance
  on master/ttm relative to its ancestor i915tex.

Keith
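A minimal C sketch of the userspace sub-allocation idea referenced
above: take one large buffer from the kernel memory manager up front,
then carve small allocations out of it with a simple free list, so the
hot alloc/free path never enters the kernel. This is an illustration
only -- the names are hypothetical, not the actual i915tex allocator,
and freeing/coalescing is elided for brevity:

    #include <stddef.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* One large kernel allocation, sub-divided entirely in userspace. */
    #define POOL_SIZE (16u * 1024 * 1024)
    #define ALIGN_UP(x, a) (((x) + (a) - 1) & ~((size_t)(a) - 1))

    struct hole {
        size_t offset, size;
        struct hole *next;
    };

    static struct hole *free_list;

    static void sub_init(void)
    {
        free_list = malloc(sizeof(*free_list));
        free_list->offset = 0;
        free_list->size = POOL_SIZE;
        free_list->next = NULL;
    }

    /* First-fit carve: returns a byte offset into the big buffer, or
     * -1 when the pool is exhausted (the caller would then fall back
     * to a real kernel allocation). */
    static long sub_alloc(size_t size, size_t align)
    {
        struct hole **link, *h;

        for (link = &free_list; (h = *link) != NULL; link = &h->next) {
            size_t start = ALIGN_UP(h->offset, align);
            size_t pad = start - h->offset;

            if (h->size < pad + size)
                continue;
            if (pad) {                 /* split off alignment padding */
                struct hole *rest = malloc(sizeof(*rest));
                rest->offset = start;
                rest->size = h->size - pad;
                rest->next = h->next;
                h->size = pad;
                h->next = rest;
                link = &h->next;
                h = rest;
            }
            h->offset += size;         /* shrink the hole from its front */
            h->size -= size;
            if (h->size == 0) {        /* hole fully consumed: unlink it */
                *link = h->next;
                free(h);
            }
            return (long)start;
        }
        return -1;
    }

    int main(void)
    {
        sub_init();
        printf("vertex data at %ld\n", sub_alloc(4096, 4096));
        printf("state object at %ld\n", sub_alloc(300, 64));
        return 0;
    }

The point of the pattern is simply that the per-allocation cost is a
few pointer operations instead of an ioctl, which is where the CPU-time
gap in the numbers above comes from.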
From: Thomas H. <th...@tu...> - 2008-05-19 14:07:52
Keith Whitwell wrote:
> Texdown
> 1327MB/s (i915tex)
> 551MB/s (master, ttm)
> 572MB/s (master, no-ttm)
> Texdown, subimage
> 1014MB/s (i915tex)
> 134MB/s (master, ttm)
> 148MB/s (master, no-ttm)

GEM on this machine (kernel 2.6.24) is hitting

Texdown           342MB/s
Texdown, subimage  76MB/s

> - a separate regression seems to have killed texture upload
>   performance on master/ttm relative to its ancestor i915tex.

Actually, I think these are mostly issues stemming from not using
write-combined mappings, and instead using write-back mappings with
clflush and a chipset flush before binding to the GTT (the difference
is sketched after this message). Note that, from what I can tell, the
i915 gem driver is still using mmap for these operations.

/Thomas
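For context, a sketch of the write-back-plus-clflush pattern Thomas
describes: data written through a cacheable (write-back) mapping has to
be flushed out of the CPU caches line by line before the GPU can safely
read it through the GTT, whereas a write-combined mapping needs only a
cheap store fence. The helper below is hypothetical, assuming x86 with
SSE2:

    #include <emmintrin.h>   /* SSE2: _mm_clflush, _mm_mfence */
    #include <stddef.h>
    #include <stdint.h>

    #define CACHE_LINE 64

    /* Flush a range written through a write-back mapping so the data
     * is visible to the GPU.  Every touched cache line is flushed
     * individually; this per-line cost (plus the chipset flush that
     * follows, not shown) is what every texture upload pays on this
     * path. */
    static void flush_range_for_gpu(const void *addr, size_t len)
    {
        uintptr_t p   = (uintptr_t)addr & ~(uintptr_t)(CACHE_LINE - 1);
        uintptr_t end = (uintptr_t)addr + len;

        for (; p < end; p += CACHE_LINE)
            _mm_clflush((const void *)p);
        _mm_mfence();   /* order the flushes before kicking the GPU */
    }

With a write-combined mapping none of this loop is needed, which is one
plausible reason the subimage numbers diverge so sharply.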
From: Keith P. <ke...@ke...> - 2008-05-19 18:01:07
On Mon, 2008-05-19 at 05:09 -0700, Keith Whitwell wrote:
> I think the latter is the significant result -- none of these
> experiments in memory management significantly change the command
> stream the hardware has to operate on, so what we're varying
> essentially is the CPU behaviour to achieve that command stream. And
> it is in CPU usage where GEM (and Keith/Eric's now-abandoned TTM
> driver) do significantly disappoint.

Your GEM results do not match mine; perhaps we're running different
kernels? Anything older than 2.6.24 won't be using clflush and will
instead use wbinvd, a significant performance impact. Profiling would
show whether this is the case.

I did some fairly simple measurements using openarena and enemy
territory. Kernel version 2.6.25, CPU 1.3GHz Pentium M, 915GMS with the
slowest possible memory. I'm afraid I don't have a working TTM
environment at present; I will try to get that working so I can do more
complete comparisons.

                          fps    real    user   kernel
glxgears classic:         665
glxgears GEM:             889
openarena classic:        17.1   59.19   37.13    1.80
openarena GEM:            24.6   44.06   25.52    5.29
enemy territory classic:   9.0  382.13  226.38   11.51
enemy territory GEM:      15.7  212.80  121.72   40.50

> Or to put it another way, GEM & master/TTM seem to burn huge amounts
> of CPU just running the memory manager.

I'm not seeing that in these demos; actual allocation is costing about
3% of the CPU time. Of course, for this hardware, the obvious solution
of re-using batch buffers would eliminate that cost entirely (sketched
after this message). It would be nice to see the kernel time reduced
further, but it's not terrible so far.

--
kei...@in...
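A sketch of the batch-buffer re-use idea: keep retired batch buffers on
a free list keyed by the fence (breadcrumb) of their last submission,
and recycle one as soon as the GPU has finished with it. All names and
signatures here are hypothetical illustrations, not the actual driver
interface:

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <stdlib.h>

    struct batch_buf {
        uint32_t handle;       /* kernel buffer-object handle */
        uint32_t last_fence;   /* breadcrumb from its last submission */
        struct batch_buf *next;
    };

    static struct batch_buf *free_batches;

    /* Assumed driver hooks -- hypothetical signatures, not a real API. */
    extern uint32_t bo_create_batch(size_t size);
    extern bool     fence_passed(uint32_t fence);

    /* Hand out a retired batch buffer the GPU has finished with, or
     * fall back to a fresh kernel allocation.  Once a small
     * steady-state pool exists, the per-batch allocation cost
     * disappears. */
    static struct batch_buf *get_batch(void)
    {
        struct batch_buf **link = &free_batches;
        struct batch_buf *b;

        for (b = *link; b != NULL; link = &b->next, b = b->next) {
            if (fence_passed(b->last_fence)) {
                *link = b->next;       /* unlink and recycle */
                b->next = NULL;
                return b;
            }
        }
        b = calloc(1, sizeof(*b));
        b->handle = bo_create_batch(64 * 1024);
        return b;
    }

    /* Called after submission with the fence emitted behind the batch. */
    static void put_batch(struct batch_buf *b, uint32_t fence)
    {
        b->last_fence = fence;
        b->next = free_batches;
        free_batches = b;
    }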
From: Thomas H. <th...@tu...> - 2008-05-19 18:33:09
Keith Packard wrote:
> Your GEM results do not match mine; perhaps we're running different
> kernels? Anything older than 2.6.24 won't be using clflush and will
> instead use wbinvd, a significant performance impact. Profiling would
> show whether this is the case.
>
> I did some fairly simple measurements using openarena and enemy
> territory. Kernel version 2.6.25, CPU 1.3GHz Pentium M, 915GMS with
> the slowest possible memory. I'm afraid I don't have a working TTM
> environment at present; I will try to get that working so I can do
> more complete comparisons.
>
>                           fps    real    user   kernel
> glxgears classic:         665
> glxgears GEM:             889
> openarena classic:        17.1   59.19   37.13    1.80
> openarena GEM:            24.6   44.06   25.52    5.29
> enemy territory classic:   9.0  382.13  226.38   11.51
> enemy territory GEM:      15.7  212.80  121.72   40.50

Keith,

The GEM timings were done with 2.6.25, except the texdown timings on
the i915 system, which used 2.6.24. Indeed, Michel reported much worse
GEM figures with 2.6.23.

Your figures look a bit odd. Is glxgears classic CPU-bound? If not, why
does it give a significantly slower framerate than glxgears GEM? The
other apps are obviously GPU-bound judging from the timings; they
shouldn't really differ in frame rate?

/Thomas
From: Keith P. <ke...@ke...> - 2008-05-19 18:57:21
On Mon, 2008-05-19 at 20:32 +0200, Thomas Hellström wrote:
> The GEM timings were done with 2.6.25, except the texdown timings on
> the i915 system, which used 2.6.24. Indeed, Michel reported much
> worse GEM figures with 2.6.23.

We clearly need to find a way to generate reproducible benchmark data.
Here's what I'm running:

kernel:
    commit 4b119e21d0c66c22e8ca03df05d9de623d0eb50f
    Author: Linus Torvalds <tor...@li...>
    Date:   Wed Apr 16 19:49:44 2008 -0700

        Linux 2.6.25

    (there's a patch to export shmem_file_setup on top of this)

mesa (from git://people.freedesktop.org/~keithp/mesa):
    commit 8b49cc104dd556218fc769178b96f4a8a428d057
    Author: Keith Packard <ke...@ke...>
    Date:   Sat May 17 23:34:47 2008 -0700

        [intel-gem] Don't calloc reloc buffers

        Only a few relocations are typically used, so don't clear the
        whole thing.

drm (from git://people.freedesktop.org/~keithp/drm):
    commit 6e46a3c762919af05fcc6a08542faa7d185487a1
    Author: Eric Anholt <er...@an...>
    Date:   Mon May 12 15:42:20 2008 -0700

        [GEM] Update testcases for new API.

xf86-video-intel (from git://people.freedesktop.org/~keithp/xf86-video-intel):
    commit c81050c0058e32098259b5078515807038beb7d6
    Merge: 9c9a5d0... e9532f3...
    Author: Keith Packard <ke...@ke...>
    Date:   Sat May 17 23:26:14 2008 -0700

        Merge commit 'origin/master' into drm-gem

> Your figures look a bit odd. Is glxgears classic CPU-bound? If not,
> why does it give a significantly slower framerate than glxgears GEM?

glxgears uses 40% of the CPU in both classic and gem. Note that the gem
version takes about 20 seconds to reach a steady state -- the gem
driver isn't clearing the GTT actively, and so glxgears gets far ahead
of the GPU.

My theory is that this shows that using cache-aware copies from a
single static batch buffer (as gem does now) improves cache performance
and write bandwidth. (A sketch of that kind of copy follows this
message.)

--
kei...@in...
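A sketch of what a cache-aware copy out of a static batch buffer can
look like: commands are assembled in an ordinary cached staging buffer,
then copied to the GPU-visible destination with non-temporal
(streaming) stores, so the staging buffer stays hot in cache and the
stores get write-combined on the way out. This illustrates the general
technique only, not the actual GEM code; x86 with SSE2 is assumed:

    #include <emmintrin.h>   /* SSE2: _mm_load_si128, _mm_stream_si128 */
    #include <stddef.h>

    /* Copy a command batch from a cached staging buffer into the
     * GPU-visible buffer using streaming stores.  Assumes both
     * pointers are 16-byte aligned and len is a multiple of 16. */
    static void copy_batch_streaming(void *dst, const void *src, size_t len)
    {
        __m128i *d = (__m128i *)dst;
        const __m128i *s = (const __m128i *)src;
        size_t i;

        for (i = 0; i < len / 16; i++)
            _mm_stream_si128(&d[i], _mm_load_si128(&s[i]));

        _mm_sfence();   /* make the streaming stores globally visible */
    }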
From: Keith W. <ke...@tu...> - 2008-05-19 19:11:12
> glxgears uses 40% of the CPU in both classic and gem. Note that the
> gem version takes about 20 seconds to reach a steady state -- the gem
> driver isn't clearing the gtt actively and so glxgears gets far ahead
> of the gpu.
>
> My theory is that this shows that using cache-aware copies from a
> single static batch buffer (as gem does now) improves cache
> performance and write bandwidth.

I'm still confused by your test setup... Stepping back from cache
metaphysics, why doesn't classic pin the hardware, if it's still got
60% of the CPU to burn?

I think getting reproducible results makes a lot of sense. What
hardware are you actually using -- ie. what is this laptop?

Keith
From: Keith P. <ke...@ke...> - 2008-05-19 20:04:03
On Mon, 2008-05-19 at 20:11 +0100, Keith Whitwell wrote:
> I'm still confused by your test setup... Stepping back from cache
> metaphysics, why doesn't classic pin the hardware, if it's still got
> 60% of the CPU to burn?

glxgears under classic is definitely not pinning the hardware -- the
'intel_idle' tool shows that it's only using about 70% of the GPU. GEM
is pinning the hardware. Usually this means there's some
synchronization between the CPU and GPU causing each to wait part of
the time while the other executes. I haven't really looked at the
non-gem case though; the numbers seem similar enough to what I've seen
in the past.

> I think getting reproducible results makes a lot of sense. What
> hardware are you actually using -- ie. what is this laptop?

This is a Panasonic CF-R4.

--
kei...@in...
From: Thomas H. <th...@tu...> - 2008-05-20 09:21:50
Keith Packard wrote:
> glxgears under classic is definitely not pinning the hardware -- the
> 'intel_idle' tool shows that it's only using about 70% of the GPU.
> GEM is pinning the hardware. Usually this means there's some
> synchronization between the CPU and GPU causing each to wait part of
> the time while the other executes.
>
> This is a Panasonic CF-R4.

It turns out we were actually using a slightly stale version of GEM
from Eric's repos; Michel reran his tests with the indicated versions
without any significant changes.

I've rebuilt on an HP (Compaq) nx7300 laptop: 1GB single-channel i945G,
Celeron M 430 @ 1.73GHz, kernel 2.6.25-rc4. This is the third system we
test on. I've added the "teapot" demo, since it should be completely
CPU-bound, even on this machine.

After the tests, I must say Keith Whitwell's conclusions seem to hold:

* Intel's TTM and GEM approaches to buffer management translate into a
  lot of extra CPU usage and worse performance.
* With that approach, GEM might improve over TTM, but it's not seen
  here.
* Classic is apparently doing suboptimal syncs that limit its
  performance in some cases (gears, teapot and perhaps openarena);
  one should not benchmark framerates against classic in those cases.

And furthermore:

* GEM's tex(sub)image (map and copy to device) performance really
  sucks. It would be good to see some benchmarks using pwrite here
  (the pwrite path is sketched after this message).

Gears:
  Classic        845 fps @ 35% CPU
  Intel TTM      942 fps @ 61% CPU
  GEM            902 fps @ 61% CPU
  i915tex (TTM)  977 fps @ 40% CPU

Openarena + exec anholt @ 640x480:
  Classic        49.8 fps, 12.3u 1.1s
  Intel TTM      54.0 fps, 12.9u 4.6s
  GEM            50.3 fps, 12.0u 4.8s
  i915tex (TTM)  61.0 fps, 12.6u 1.7s

Ipers without help screen:
  Classic        333000 pps
  Intel TTM      254000 pps
  GEM            GPU lockup
  i915tex (TTM)  325000 pps

Teapot:
  Classic        65.5 fps (CPU at 77%)
  Intel TTM      70.3 fps
  GEM            GPU lockup
  i915tex (TTM)  77.0 fps

Texdown + subimage:
  Classic         452 +  510 MB/s
  Intel TTM       537 +  158 MB/s
  GEM             385 +   86 MB/s
  i915tex (TTM)  1185 + 1664 MB/s
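The pwrite path Thomas suggests benchmarking hands the copy to the
kernel, which can pick its own flushing strategy, instead of having the
application map the object and write through the mapping. A minimal
sketch of the call, using the i915 GEM pwrite ioctl (shown here as it
was later merged upstream -- the 2008 development trees may have
differed in detail), with error handling omitted:

    #include <stdint.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <drm/i915_drm.h>   /* header location can vary by tree */

    /* Upload texture bytes into a GEM object without mapping it: the
     * kernel performs the copy into the object's backing pages. */
    static int gem_upload(int drm_fd, uint32_t handle, uint64_t offset,
                          const void *data, uint64_t size)
    {
        struct drm_i915_gem_pwrite pwrite;

        memset(&pwrite, 0, sizeof(pwrite));
        pwrite.handle   = handle;   /* GEM object to write into */
        pwrite.offset   = offset;   /* byte offset within the object */
        pwrite.size     = size;
        pwrite.data_ptr = (uint64_t)(uintptr_t)data;

        return ioctl(drm_fd, DRM_IOCTL_I915_GEM_PWRITE, &pwrite);
    }

Benchmarking this against the map-and-memcpy path would separate the
cost of the copy itself from the cost of the mapping and flushing.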
From: Keith W. <ke...@tu...> - 2008-05-20 09:35:40
On Mon, May 19, 2008 at 9:03 PM, Keith Packard <ke...@ke...> wrote:
> glxgears under classic is definitely not pinning the hardware -- the
> 'intel_idle' tool shows that it's only using about 70% of the GPU.
> GEM is pinning the hardware. Usually this means there's some
> synchronization between the CPU and GPU causing each to wait part of
> the time while the other executes.

Yes, understood -- the question though is why... Classic has always
been more than able to pin the hardware in gears, and there's enough
buffering in the system to avoid letting the GPU go idle. It's not
exactly rocket science to dump enough frames of gears into the queue
to keep the GPU busy, as long as the CPU isn't itself pinned.

There aren't a huge number of synchronization points in the driver that
gears could hit -- the two are allocation of space for batch buffers,
and the throttle which prevents the app from getting more than a couple
of frames ahead of the hardware. The latter is where you would expect
gears to spend most of its wall time -- snoozing somewhere inside
swapbuffers, comfortably a frame or so ahead of hardware. If it's for
some reason starved of batchbuffer space, you might see it spending
time elsewhere -- stalled inside command submission or vertex emit...
or there may be some unexpected other case.

So possibilities are:
  - batchbuffer starvation -- has
  - over-throttling in swapbuffers -- I think we used to let it get
    two frames ahead - has this changed?
  - something else...

(The kind of swapbuffers throttle in question is sketched after this
message.)

An easy way to investigate is just to run gears under gdb and hit
ctrl-c periodically to see where it ends up -- ie. look for a pattern
in the stack traces. Most profiling tools out there try to figure out
where the CPU cycles go, but at this point we're trying to figure out
where the wall time goes.

> I haven't really looked at the non-gem case though; the numbers seem
> similar enough to what I've seen in the past.

I think it's important, if there are going to be performance
comparisons between various versions of the memory manager, that all
the versions are actually working at their best. The baseline in this
case is classic, and for some reason it seems to be operating less well
on your box than elsewhere. A consequence of that is that it risks
making all the new memory managers look better than they should,
because the baseline is artificially poor...

As much as I like to promote the new tech, I don't think crippling the
old to make it look good is a great strategy, so let's figure out why
gears has regressed on classic and then re-assess how that changes the
landscape for ttm and gem.

It's worth noting that even 'classic' has changed fairly significantly
over the last couple of years with the backport of the bufmgr_fake
functionality from i965, so there were plenty of opportunities for
regressions. It might be worth trying a Mesa-7.0-ish version of the
driver as well.

Keith
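A hypothetical sketch of the swapbuffers throttle under discussion:
each swap emits a fence, and frame N blocks until the fence of frame
N-2 has passed, so the app may queue up to two frames ahead of the
hardware. Names and signatures are illustrative, not the actual driver
code:

    #include <stdint.h>

    #define MAX_FRAMES_AHEAD 2

    /* Assumed driver hooks -- hypothetical signatures. */
    extern uint32_t emit_fence(void);        /* breadcrumb after a swap */
    extern void     wait_fence(uint32_t f);  /* sleep until it passes   */

    static uint32_t swap_fence[MAX_FRAMES_AHEAD];
    static unsigned frame;

    /* Called at the end of each SwapBuffers.  If this ever waits one
     * frame too eagerly (effectively N-1 instead of N-2), the CPU and
     * GPU alternate instead of overlapping -- which would look exactly
     * like the "60% CPU idle, 70% GPU busy" symptom described above. */
    static void swap_throttle(void)
    {
        unsigned slot = frame % MAX_FRAMES_AHEAD;

        if (frame >= MAX_FRAMES_AHEAD)
            wait_fence(swap_fence[slot]);    /* frame N-2 must be done */

        swap_fence[slot] = emit_fence();
        frame++;
    }

Sampling stack traces the way Keith suggests would show whether gears
is parked inside a wait like this one, or stalled elsewhere (eg.
waiting for batchbuffer space).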
From: Keith W. <ke...@tu...> - 2008-05-20 09:37:44
> So possibilities are:
>   - batchbuffer starvation -- has

I was going to say 'has this changed significantly' -- and the answer
is that it has, of course, with the bufmgr_fake changes... I can't tell
by quick inspection whether these are a likely culprit, but it's
certainly a significant set of changes relative to the classic version
of classic...

>   - over-throttling in swapbuffers -- I think we used to let it get
>     two frames ahead - has this changed?
>   - something else...

Keith
From: Dave A. <ai...@li...> - 2008-05-20 10:16:14
> So possibilities are:
>   - batchbuffer starvation -- has
>   - over-throttling in swapbuffers -- I think we used to let it get
>     two frames ahead - has this changed?

I would suspect this broke somehow at some point..

Dave.
From: Keith W. <ke...@tu...> - 2008-05-20 09:40:58
> * Classic is apparently doing suboptimal syncs that limit its
>   performance in some cases (gears, teapot and perhaps openarena);
>   one should not benchmark framerates against classic in those cases.

As I said elsewhere, I'd like to get to the bottom of this -- it wasn't
always this way. Otherwise we should abandon trunk 'classic' and use
one of the ye olde 7.0 versions.

Keith
From: Thomas H. <th...@tu...> - 2008-05-20 10:04:09
Keith Whitwell wrote:
>> * Classic is apparently doing suboptimal syncs that limit its
>>   performance in some cases (gears, teapot and perhaps openarena);
>>   one should not benchmark framerates against classic in those
>>   cases.
>
> As I said elsewhere, I'd like to get to the bottom of this -- it
> wasn't always this way. Otherwise we should abandon trunk 'classic'
> and use one of the ye olde 7.0 versions.

I agree. I did some benchmarks on TTM vs classic back then, and they
were quite similar, with TTM generally using slightly more CPU, as we
would expect. TTM would of course do better on apps exercising certain
texture functionality, due to its single texture copy.

/Thomas
From: Johannes E. <jcn...@go...> - 2008-05-20 10:49:57
Hi, everyone,

I wonder how you got any OpenGL app running using Keith's GEM tree. For
me even glxgears turns the screen black, although AFAIK not necessarily
crashing the X server. I will investigate that further.

Best regards, Johannes
From: Johannes E. <jcn...@go...> - 2008-05-20 10:59:28
Johannes Engel schrieb:
> I wonder how you got any OpenGL app running using Keith's GEM tree.
> For me even glxgears turns the screen black, although AFAIK not
> necessarily crashing the X server. I will investigate that further.

OK, at least that seems not to be reproducible: one restart later it no
longer occurs.

On my 945GM, GEM makes kwin4 with compositing feel much smoother, but
that's only subjective. glxgears does not pin the CPU, but returns
values similar to those with TTM.

Greetings, Johannes
From: Thomas H. <th...@tu...> - 2008-05-20 10:59:28
Johannes Engel wrote:
> I wonder how you got any OpenGL app running using Keith's GEM tree.
> For me even glxgears turns the screen black, although AFAIK not
> necessarily crashing the X server. I will investigate that further.

Johannes,

Double-check that you're not enabling AIGLX.

/Thomas
From: Johannes E. <jcn...@go...> - 2008-05-20 15:47:05
Thomas Hellström schrieb:
> Johannes,
>
> Double-check that you're not enabling AIGLX.

Without AIGLX it does not even run: I cannot compile the glcore driver,
since the source file seems to be missing its includes. :)

Greetings, Johannes