From: Carl-Daniel H. <c-d...@gm...> - 2003-04-28 13:08:14
|
[CC:ing lse-tech because they know better than me]

Henti Smith wrote:
> Hi all
>
> I had a discussion with somebody watching the whole M$ server launch and mentioned then new systems supports up to a terabyte of ram.
> I've tried looking for a hint at what the max memory support on linux is and cannot find it anywhere.
>
> can somebody here enlighten me on just what the maximum amount of memory linux can deal with ?

Linux supports up to 4 GB (~2^32 bytes) of memory on 32-bit architectures and 64 GB (~2^36 bytes) on x86 with PAE. No other operating system can support more on 32-bit since it is a limitation of the hardware.

On 64-bit systems, Linux supports up to 16 EB (~2^64 bytes) of memory, which is about 16 million times more than the 1 TB limit of MS.

Current Linux 2.4 allows 32 CPUs for 32-bit arches and 64 CPUs on 64-bit arches. However, this limit is (was?) being removed in 2.5, so you can have up to 32767 CPUs, which should be enough for you right now.
(Note: I said _right now_, lest anybody make jokes about 640K limit)

Regards,
Carl-Daniel
--
http://www.hailfinger.org/ |
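The address-space figures in the message above are plain powers of two. As a quick check of that arithmetic (not part of the original mail; the helper name `addressable_bytes` is made up for this sketch), something like the following reproduces the 4 GB, 64 GB and 16 EB numbers and the "about 16 million times" ratio:

```python
# Illustrative arithmetic only; nothing here comes from kernel source.
GIB = 1 << 30
TIB = 1 << 40
EIB = 1 << 60


def addressable_bytes(address_bits: int) -> int:
    """Number of bytes reachable with the given number of address bits."""
    return 1 << address_bits


plain_32bit = addressable_bytes(32)  # 32-bit physical addressing
pae_x86 = addressable_bytes(36)      # x86 with PAE: 36 physical address bits
full_64bit = addressable_bytes(64)   # theoretical ceiling of a 64-bit address

print(plain_32bit // GIB, "GiB")     # 4 GiB
print(pae_x86 // GIB, "GiB")         # 64 GiB
print(full_64bit // EIB, "EiB")      # 16 EiB
print(full_64bit // TIB)             # 16777216, i.e. about 16 million times 1 TiB
```

These are theoretical ceilings of the address width only; the replies further down the thread discuss why real hardware and kernels reach far less.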
From: Andi K. <ak...@su...> - 2003-04-28 17:13:57
|
> Cool. Sorry to be pestering about the 64-bit limits, but can we really
> use 2^64 bytes of memory on ia64/ppc64/x86-64 etc.? (AFAIK, 64-bit
> arches don't suffer from a small ZONE_LOWMEM.)

No. The hardware has far smaller physical limits.

Current AMD64 CPUs are limited to 40-bit physical and 48-bit virtual addresses (the virtual limit per process in the current Linux kernel is 39 bits).

Itanium 2, AFAIK, supports a bit more: 50 bits (51 or 52, I forgot) physical, probably more virtual.

Other 64-bit architectures are somewhere in between.

The actual limit in the machines is even less. You will have a hard time finding an affordable machine (64-bit or not) with more than 8 DIMM slots. That's 16GB max with 2GB DIMMs.

-Andi |
From: David M. <da...@na...> - 2003-04-28 17:55:32
|
>>>>> On Mon, 28 Apr 2003 19:13:53 +0200, Andi Kleen <ak...@su...> said: >> Cool. Sorry to be pestering about the 64-bit limits, but can we >> really use 2^64 bytes of memory on ia64/ppc64/x86-64 etc.? >> (AFAIK, 64-bit arches don't suffer from a small ZONE_LOWMEM.) Andi> No. The hardware have far smaller physical limits. Andi> Current AMD64 CPUs are limited to 40bit physical, 48bit virtal Andi> (the virtual limit per process in the current Linux kernel is Andi> 39bits) Andi> Itanium 2 afaik support a bit more 50bits (51 or 52, I forgot) Andi> physical, probably more virtual. Itanium 2 supports all 64 virtual address bits and 50 physical bits (in what way is "1024 times more" "a bit more"? ;-). --david |
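To put the bit widths quoted in the two messages above into familiar units, here is a small sketch (not from the thread; `human_size` and the `limits` table are hypothetical names introduced for illustration). It also checks the "1024 times more" remark and the 8-slot DIMM figure:

```python
UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]


def human_size(num_bytes: int) -> str:
    """Format an exact power-of-two byte count using the largest whole unit."""
    unit = 0
    while num_bytes >= 1024 and num_bytes % 1024 == 0 and unit < len(UNITS) - 1:
        num_bytes //= 1024
        unit += 1
    return f"{num_bytes} {UNITS[unit]}"


# Bit widths as quoted in the thread; the labels are only descriptive.
limits = {
    "AMD64 physical (40 bits)": 40,
    "AMD64 virtual (48 bits)": 48,
    "Linux per-process virtual on AMD64 (39 bits)": 39,
    "Itanium 2 physical (50 bits)": 50,
}

for label, bits in limits.items():
    print(f"{label}: {human_size(1 << bits)}")

# 50 physical bits versus 40 physical bits: ten extra bits.
print((1 << 50) // (1 << 40))        # 1024

# Eight DIMM slots filled with 2 GiB modules.
print(human_size(8 * (2 << 30)))     # 16 GiB
```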
From: Gerrit H. <gh...@us...> - 2003-04-28 18:31:48
|
On Mon, 28 Apr 2003 10:53:53 PDT, David Mosberger wrote: > >>>>> On Mon, 28 Apr 2003 19:13:53 +0200, Andi Kleen <ak...@su...> said: > > >> Cool. Sorry to be pestering about the 64-bit limits, but can we > >> really use 2^64 bytes of memory on ia64/ppc64/x86-64 etc.? > >> (AFAIK, 64-bit arches don't suffer from a small ZONE_LOWMEM.) > > Andi> No. The hardware have far smaller physical limits. > > Andi> Current AMD64 CPUs are limited to 40bit physical, 48bit virtal > Andi> (the virtual limit per process in the current Linux kernel is > Andi> 39bits) > > Andi> Itanium 2 afaik support a bit more 50bits (51 or 52, I forgot) > Andi> physical, probably more virtual. > > Itanium 2 supports all 64 virtual address bits and 50 physical bits > (in what way is "1024 times more" "a bit more"? ;-). > > --david 0x400 is just one more bit, albeit slid around a byte or two. ;) gerrit |
From: David M. <da...@na...> - 2003-04-28 19:06:48
|
>>>>> On Mon, 28 Apr 2003 11:31:04 -0700, Gerrit Huizenga <gh...@us...> said: Gerrit> 0x400 is just one more bit, albeit slid around a byte or Gerrit> two. ;) Nice try! ;-) --david |
From: Andi K. <ak...@su...> - 2003-04-28 14:10:28
|
> Linux supports up to 4 GB (~2^32 bytes) of memory on 32-bit
> architectures and 64 GB (~2^36 bytes) on x86 with PAE. No other

That's far too optimistic. 64GB will need patches, like the pgcl patch. It is unlikely to work out of the box.

Just do the math. There is 900MB of low memory for all the kernel data structures on IA32, and a 44-byte struct page for each 4K page. This gives 704MB just for the mem_map array. That leaves 196MB for the kernel to manage your 64GB of memory and your processes. Unlikely to work.

2.5 uses 40 bytes for a struct page, but that doesn't help much.

Yes, you can move the kernel:user split to give the kernel more memory at the expense of the application, but that is unlikely to make the user programs happy and helps only to a very limited extent (even 2GB of lowmem is probably not enough to make it run well with 64GB).

The realistic limit currently is ~16GB with an IA32 box. For more you need a 64-bit architecture.

-Andi |
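The "just do the math" step above can be spelled out. The sketch below uses only the figures given in the message (900MB of lowmem, 4K pages, 44 bytes per struct page in 2.4 and 40 bytes in 2.5); the function name `mem_map_bytes` is invented for this illustration and is not a kernel interface:

```python
MIB = 1 << 20
GIB = 1 << 30

LOWMEM = 900 * MIB  # low memory figure quoted in the message


def mem_map_bytes(ram_bytes: int, page_size: int = 4096,
                  struct_page_size: int = 44) -> int:
    """Size of a mem_map array holding one struct page per page of RAM."""
    return (ram_bytes // page_size) * struct_page_size


for ram_gib in (16, 32, 64):
    used = mem_map_bytes(ram_gib * GIB)
    left = LOWMEM - used
    print(f"{ram_gib} GiB RAM: mem_map {used // MIB} MiB, "
          f"{left // MIB} MiB of lowmem left")
# The 64 GiB line shows 704 MiB for mem_map and 196 MiB left.

# With the 40-byte struct page mentioned for 2.5:
print(mem_map_bytes(64 * GIB, struct_page_size=40) // MIB)  # 640
```

Because the `page_size` parameter is explicit, the same function also shows why a larger allocation unit shrinks the array, which is presumably part of the appeal of the pgcl approach discussed later in the thread.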
From: Carl-Daniel H. <c-d...@gm...> - 2003-04-28 15:12:10
|
Andi Kleen wrote: > > Linux supports up to 4 GB (~2^32 bytes) of memory on 32-bit > > architectures and 64 GB (~2^36 bytes) on x86 with PAE. No other > > > That's far too optimistic. 64GB will need patches, like the pgcl > patch. It is unlikely to work out of the box. Just do the math. [explanatory math snipped] > Realistic limit currently is ~16GB with an IA32 box. For more you need <marketingspeak> This means ~16GB is available on IA32 right now with vanilla kernels, and 64GB is available from vendors who are willing to apply the pgcl patch. Conclusion: Linux supports 64GB on IA32 </marketingspeak> > an 64bit architecture. > > On 64-bit systems, Linux supports up to 16 EB (~2^64 bytes) of memory That statement is OK? Thanks, Carl-Daniel -- http://www.hailfinger.org/ |
From: Andi K. <ak...@su...> - 2003-04-28 15:16:53
|
> and 64GB is available from vendors who are willing to apply the pgcl
> patch.

Nobody is doing that. pgcl is 2.5 only and still seems to be quite unstable. Also it's extremely intrusive.

-Andi |
From: Dave H. <hav...@us...> - 2003-04-28 16:53:22
|
Andi Kleen wrote:
>>and 64GB is available from vendors who are willing to apply the pgcl
>>patch.
>
> Nobody is doing that. pgcl is 2.5 only and seems to be still quite instable.
> Also it's extremly intrusive.

Bill will probably wake up any time now and chime in, but don't forget all of the drivers.

# grep -r PAGE_SIZE drivers/ | wc -l
893

Each one of those needs to be audited before pgcl is acceptable to a wide audience. We've already seen plenty of stuff that breaks. ext2/3 look to be all right, but I know that JFS is broken.
--
Dave Hansen
hav...@us... |
From: Martin J. B. <mb...@ar...> - 2003-04-28 16:58:41
|
>>> and 64GB is available from vendors who are willing to apply the pgcl >>> patch. >> >> Nobody is doing that. pgcl is 2.5 only and seems to be still quite >> instable. Also it's extremly intrusive. > > Bill will probably wake up any time now and chime in, but don't forget > all of the drivers. > ># grep -r PAGE_SIZE drivers/ | wc -l > 893 > > Each one of those needs to be audited before pgcl is acceptable to a > wide audience. We've already seen plenty of stuff that breaks. ext2/3 > look to be all right, but I know that JFS is broken. Well, the upside is that he's only doing s/PAGE_SIZE/MMU_PAGESIZE/ in most places, which are normally both 4K. So it will have no effect whatsoever unless you explicitly turn it on. M. |
From: William L. I. I. <wl...@ho...> - 2003-04-28 22:41:38
|
At some point in the past, Dave Hansen wrote: >> Each one of those needs to be audited before pgcl is acceptable to a >> wide audience. We've already seen plenty of stuff that breaks. ext2/3 >> look to be all right, but I know that JFS is broken. On Mon, Apr 28, 2003 at 09:58:31AM -0700, Martin J. Bligh wrote: > Well, the upside is that he's only doing s/PAGE_SIZE/MMU_PAGESIZE/ > in most places, which are normally both 4K. So it will have no effect > whatsoever unless you explicitly turn it on. The JFS issue is general to PAGE_SIZE > 4KB, pgcl-induced or not. shaggy et al are already aware of it. Most of the driver stuff I've seen is ioremap() of O(PAGE_SIZE) which just gets denied so it fails to probe. IDE was worse (as usual), and AGP needed an unusual amount of tweaking, which probably will be typical for the graphics drivers in general. Block stuff seems to be well-abstracted, so basically the only semantically significant needed fix for block drivers is for 512*q->max_len < PAGE_SIZE (not yet done). -- wli |
From: Dave J. <da...@co...> - 2003-04-28 23:51:37
|
On Mon, Apr 28, 2003 at 03:40:25PM -0700, William Lee Irwin III wrote: > Most of the driver stuff I've seen is ioremap() of O(PAGE_SIZE) which > just gets denied so it fails to probe. IDE was worse (as usual), and > AGP needed an unusual amount of tweaking, which probably will be > typical for the graphics drivers in general. Is this stuff in the current pgcl patch? I've not looked at it, but wouldn't mind a look-see sometime. Dave |
From: William L. I. I. <wl...@ho...> - 2003-04-29 00:01:44
|
On Mon, Apr 28, 2003 at 03:40:25PM -0700, William Lee Irwin III wrote: >> Most of the driver stuff I've seen is ioremap() of O(PAGE_SIZE) which >> just gets denied so it fails to probe. IDE was worse (as usual), and >> AGP needed an unusual amount of tweaking, which probably will be >> typical for the graphics drivers in general. On Tue, Apr 29, 2003 at 12:50:23AM +0100, Dave Jones wrote: > Is this stuff in the current pgcl patch? I've not looked at it, > but wouldn't mind a look-see sometime. Not much of it. Basically I've only swept the drivers for the systems I've been hacking on. My driver-fu is limited anyway. Most of it is really boring, basically changing size specifications for blocks of memory used by the driver from being defined in terms of MMUPAGE_SIZE when they need to be in 4KB units (which is the hardware pagesize on most cpus). IDE was sizing its PRD tables in terms of PAGE_SIZE, so that needed quickfixing and it's otherwise mostly immune to the effect(s), and that's in the patch. AGP was fiddling around with something, which very well may have been some kind of GART aperture for all I know about it, and needed to use MMUPAGE_SIZE to think of its size correctly. Hugh's 2.4.x code had a better sampling of what's needed for DRM and AGP in general, along with various fixes for other framebuffer drivers, but it predated 2.4.8, at which time some kind of enormous DRM merge happened and clobbered things. -- wli |
From: Dave J. <da...@co...> - 2003-04-29 00:07:19
|
On Mon, Apr 28, 2003 at 05:00:14PM -0700, William Lee Irwin III wrote: > AGP was fiddling around with something, which very well may > have been some kind of GART aperture for all I know about it, and needed > to use MMUPAGE_SIZE to think of its size correctly. A lot of GARTs can only operate on 4KB pages. As long as this is kept in mind, things should tick along just fine. Even those that can operate with different size pages, we still treat as 4KB. Dave |
From: William L. I. I. <wl...@ho...> - 2003-04-29 00:14:35
|
On Mon, Apr 28, 2003 at 05:00:14PM -0700, William Lee Irwin III wrote: >> AGP was fiddling around with something, which very well may >> have been some kind of GART aperture for all I know about it, and needed >> to use MMUPAGE_SIZE to think of its size correctly. On Tue, Apr 29, 2003 at 01:06:02AM +0100, Dave Jones wrote: > A lot of GARTs can only operate on 4KB pages. As long as this is kept in > mind, things should tick along just fine. Even those that can operate > with different size pages, we still treat as 4KB. That's basically the gist of it. The "larger page" is purely a software construct anyway, so the search and replace approach should be fine. i.e. s/PAGE_SIZE/MMUPAGE_SIZE/ for the appropriate bits of GART drivers It won't fix the issues with true hardware page sizes of larger than 4KB, but the issues raised by the larger software page size (really get_free_pages() allocation unit) are very fixable there. -- wli |
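The search-and-replace approach described above is easy to get subtly wrong if the pattern also matches longer identifiers. A minimal sketch, assuming a whole-word match is what is wanted (the sample C line, the `HPAGE_SIZE` neighbour and the helper names are made up for illustration and are not taken from the pgcl patch):

```python
import re

# Whole-word match so identifiers such as HPAGE_SIZE or MMUPAGE_SIZE
# are left untouched.
PAGE_SIZE_RE = re.compile(r"\bPAGE_SIZE\b")


def to_mmupage_size(source: str) -> str:
    """Rewrite PAGE_SIZE to MMUPAGE_SIZE in a chunk of source text."""
    return PAGE_SIZE_RE.sub("MMUPAGE_SIZE", source)


def count_matching_lines(source: str) -> int:
    """Count lines mentioning PAGE_SIZE as a whole word."""
    return sum(1 for line in source.splitlines() if PAGE_SIZE_RE.search(line))


sample = "table_len = PAGE_SIZE / 8; huge = HPAGE_SIZE;\nreturn 0;"
print(to_mmupage_size(sample))
print(count_matching_lines(sample))  # 1
```

Note that `count_matching_lines` is stricter than the plain `grep -r PAGE_SIZE` quoted earlier in the thread, which also counts lines containing longer identifiers that merely include the substring.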
From: William L. I. I. <wl...@ho...> - 2003-04-28 22:36:05
|
Andi Kleen wrote: >> Nobody is doing that. pgcl is 2.5 only and seems to be still quite instable. >> Also it's extremly intrusive. On Mon, Apr 28, 2003 at 09:52:20AM -0700, Dave Hansen wrote: > Bill will probably wake up any time now and chime in, but don't forget > all of the drivers. > # grep -r PAGE_SIZE drivers/ | wc -l > 893 > Each one of those needs to be audited before pgcl is acceptable to a > wide audience. We've already seen plenty of stuff that breaks. ext2/3 > look to be all right, but I know that JFS is broken. I don't have a good estimate for speed-of-processing on the driver front. My current guesstimates are based on something around 5 drivers a day per person. -- wli |
From: William L. I. I. <wl...@ho...> - 2003-04-28 22:34:49
|
At some point in the past, someone wrote: >> and 64GB is available from vendors who are willing to apply the pgcl >> patch. On Mon, Apr 28, 2003 at 05:16:49PM +0200, Andi Kleen wrote: > Nobody is doing that. pgcl is 2.5 only and seems to be still quite instable. > Also it's extremly intrusive. It's not ready for a vendor to ship, no. OTOH 2.5 itself isn't either. -- wli |
From: William L. I. I. <wl...@ho...> - 2003-04-29 04:04:00
|
On Mon, Apr 28, 2003 at 05:16:49PM +0200, Andi Kleen wrote: > Nobody is doing that. pgcl is 2.5 only and seems to be still quite instable. > Also it's extremly intrusive. Unfortunately true. For all the merits of the technique itself, I'm not hugh, and my results are proportionally less impressive. That said, the intrusiveness aspect is easily dealt with as the patch itself is very easy to chop into small pieces and merge incrementally. I'd also say my progress toward stabilization is steady, though, of course, things are by no means perfect. At this point I'm literally more concerned about cleanliness than pure stability, as the stability aspects lacking appear to be related primarily to sweeping through code I can't regularly test anyway. After that, of course, due diligence demands the driver sweeps etc. be carried out before fully merging, but it's ultimately busywork. Not to say it's any less essential, for a kernel would be useless without drivers, but that's what it is. If this is not your experience, I'd love to hear things like bugreports and so on. Whatever feedback I can get I'd be very grateful for. I feel that the highmem emphasis of my own particular effort has marginalized the patch, and no one's really trying it out for things like large fs blocksize and the linear space reductions and speedups. Of course, that may be partially due to the fact I've not merged some of the things needed to properly combat fragmentation into the patch, but those will be taken care of in the next 72-96 hours (if they're not and I can't write about them ajh will strangle me, but I've already taken care of most of it anyway, and just haven't posted a release with the stuff). -- wli |
From: Dave H. <hav...@us...> - 2003-04-28 16:46:57
|
Andi Kleen wrote:
> Realistic limit currently is ~16GB with an IA32 box. For more you need
> an 64bit architecture.

Let's say 32GB :) It boots just fine with 2.5.68, no additional patches. There's even half a gig of lowmem free.

curly:~# cat /proc/meminfo
MemTotal:     32688576 kB
MemFree:      32644196 kB
Buffers:          3632 kB
Cached:           8068 kB
SwapCached:          0 kB
Active:           9420 kB
Inactive:         4616 kB
HighTotal:    32112640 kB
HighFree:     32098240 kB
LowTotal:       575936 kB
LowFree:        545956 kB
SwapTotal:           0 kB
SwapFree:            0 kB
Dirty:             160 kB
Writeback:           0 kB
Mapped:           4596 kB
Slab:             8316 kB
Committed_AS:     5544 kB
PageTables:        260 kB
VmallocTotal:   114680 kB
VmallocUsed:      3792 kB
VmallocChunk:   110888 kB

--
Dave Hansen
hav...@us... |
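For readers who want to check the "half a gig of lowmem free" remark against the numbers, here is a small sketch that parses a few of the fields quoted above (the `parse_meminfo` helper is hypothetical; only the field values are taken from the message):

```python
# Subset of the /proc/meminfo output quoted in the message above.
SAMPLE = """\
MemTotal: 32688576 kB
MemFree: 32644196 kB
HighTotal: 32112640 kB
HighFree: 32098240 kB
LowTotal: 575936 kB
LowFree: 545956 kB
"""


def parse_meminfo(text: str) -> dict[str, int]:
    """Map each 'Name: value kB' line to its integer value in kB."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])
    return fields


info = parse_meminfo(SAMPLE)

# High and low memory together account for all of MemTotal.
print(info["HighTotal"] + info["LowTotal"] == info["MemTotal"])  # True

# Free low memory in MiB: a little over half a gigabyte.
print(info["LowFree"] // 1024)                                   # 533
```

On a live system the same function could be fed the contents of `/proc/meminfo`, although the `HighTotal` and `LowTotal` fields only appear on kernels built with highmem support.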
From: Carl-Daniel H. <c-d...@gm...> - 2003-04-28 16:58:05
|
Dave Hansen wrote: > Andi Kleen wrote: > >>Realistic limit currently is ~16GB with an IA32 box. For more you need >>an 64bit architecture. > > > Let's say 32GB :) It boots just fine with 2.5.68, no additional > patches. There's even half a gig of lowmem free. Cool. Sorry to be pestering about the 64-bit limits, but can we really use 2^64 bytes of memory on ia64/ppc64/x86-64 etc.? (AFAIK, 64-bit arches don't suffer from a small ZONE_LOWMEM.) Regards, Carl-Daniel -- http://www.hailfinger.org/ |
From: Dave H. <hav...@us...> - 2003-04-28 17:17:43
|
Carl-Daniel Hailfinger wrote: > Cool. Sorry to be pestering about the 64-bit limits, but can we really > use 2^64 bytes of memory on ia64/ppc64/x86-64 etc.? (AFAIK, 64-bit > arches don't suffer from a small ZONE_LOWMEM.) First of all, I'm not sure any of the 64-bit arches even fully support 64-bit physical addresses. If I remember correctly the first hammers support 40 bits, with more to be added later. Power4 is in close to the same boat, but I know they go up to 256GB today (I seem to recall something about 44-bit being the limit, though). Don't forget that highmem starts to be needed before the 4G boundary. The kernel has only 1GB of virtual space (look for PAGE_OFFSET, which defines it), which means that you start needing to pull all of the highmem trickery before you get to the actual limits. Nobody knows how far it will go. It's fairly safe to say that, at this rate, Linux will keep up with whatever hardware anyone produces. Unless, of course, someone gets even more perverse than PAE. :) -- Dave Hansen hav...@us... |
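The PAGE_OFFSET point above can be illustrated with a little arithmetic. This is a sketch under stated assumptions: `0xC0000000` is assumed as the i386 default for PAGE_OFFSET (the 3G/1G split), and the 128 MiB set aside for vmalloc and other mappings is likewise an assumed default rather than a figure from this thread. The function name is invented for the example.

```python
MIB = 1 << 20
GIB = 1 << 30

ADDRESS_SPACE_32BIT = 1 << 32

# Assumption: default i386 PAGE_OFFSET, i.e. a 3G user / 1G kernel split.
PAGE_OFFSET_3G_1G = 0xC0000000
# Assumption: alternative 2G/2G split, as hinted at by "move the
# kernel:user split" elsewhere in the thread.
PAGE_OFFSET_2G_2G = 0x80000000
# Assumption: virtual space reserved for vmalloc and friends.
VMALLOC_RESERVE = 128 * MIB


def kernel_virtual_bytes(page_offset: int) -> int:
    """Virtual address space above PAGE_OFFSET, available to the kernel."""
    return ADDRESS_SPACE_32BIT - page_offset


kernel_space = kernel_virtual_bytes(PAGE_OFFSET_3G_1G)
print(kernel_space // GIB)                                # 1
print((kernel_space - VMALLOC_RESERVE) // MIB)            # 896
print(kernel_virtual_bytes(PAGE_OFFSET_2G_2G) // GIB)     # 2
```

Under these assumptions, any RAM beyond roughly 896 MiB cannot be mapped permanently by the kernel, which is why highmem handling kicks in well below the 4G boundary on 32-bit x86.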
From: Carl-Daniel H. <c-d...@gm...> - 2003-04-28 17:36:24
|
Dave Hansen wrote: > Carl-Daniel Hailfinger wrote: > >>Cool. Sorry to be pestering about the 64-bit limits, but can we really >>use 2^64 bytes of memory on ia64/ppc64/x86-64 etc.? (AFAIK, 64-bit >>arches don't suffer from a small ZONE_LOWMEM.) > > [...] > Don't forget that highmem starts to be needed before the 4G boundary. > The kernel has only 1GB of virtual space (look for PAGE_OFFSET, which > defines it), which means that you start needing to pull all of the > highmem trickery before you get to the actual limits. It seems I misunderstood the concept of highmem. I thought highmem was not needed on 64-bit arches. Thanks for pointing that out to me. > > Nobody knows how far it will go. It's fairly safe to say that, at this > rate, Linux will keep up with whatever hardware anyone produces. That is the answer the original poster was looking for. > Unless, of course, someone gets even more perverse than PAE. :) hehe ;-) Can you say PAE in userspace? Regards, Carl-Daniel -- http://www.hailfinger.org/ |
From: Andi K. <ak...@su...> - 2003-04-28 17:46:37
|
> > Don't forget that highmem starts to be needed before the 4G boundary.
> > The kernel has only 1GB of virtual space (look for PAGE_OFFSET, which
> > defines it), which means that you start needing to pull all of the
> > highmem trickery before you get to the actual limits.
>
> It seems I misunderstood the concept of highmem. I thought highmem was
> not needed on 64-bit arches. Thanks for pointing that out to me.

No, your original understanding was correct. Highmem is not used on 64-bit.

-Andi |
From: William L. I. I. <wl...@ho...> - 2003-04-28 22:58:47
|
Dave Hansen wrote: >> Unless, of course, someone gets even more perverse than PAE. :) On Mon, Apr 28, 2003 at 07:36:08PM +0200, Carl-Daniel Hailfinger wrote: > hehe ;-) Can you say PAE in userspace? In a sense, already done; c.f. sys_remap_file_pages(). -- wli |
From: William L. I. I. <wl...@ho...> - 2003-04-28 22:50:09
|
On Mon, Apr 28, 2003 at 10:16:43AM -0700, Dave Hansen wrote: > Nobody knows how far it will go. It's fairly safe to say that, at this > rate, Linux will keep up with whatever hardware anyone produces. > Unless, of course, someone gets even more perverse than PAE. :) 32-bit kernels on 64-bit machines with RAM capacities larger than 64GB would seem to raise issues far more severe than i386 PAE, but to date little (if any) interest has been expressed in using Linux in such scenarios. -- wli |