I’d like to propose an enhancement to the memory management in RamDyn by integrating 2MiB large pages as an alternative to AWE physical memory. While AWE offers a way to prevent memory from being written to the pagefile, it comes with performance drawbacks. Specifically, AWE tends to be several times slower than regular memory when it comes to random 4K read/write operations.
On the other hand, 2 MiB large pages are also non-pageable, but without the performance hit. In fact, using large pages could even provide a small performance boost by reducing page-table overhead and minimizing TLB misses.
Both AWE and large pages rely on SeLockMemoryPrivilege ("Lock pages in memory") for access to locked memory, so users will need that privilege to use this feature.
This approach would effectively achieve the same goals as AWE but with far better performance, especially for workloads that require high-speed random access to memory.
You can find more details about large page support in the official Microsoft documentation.
This feature should only be used if the BlockSize is at least 2 MiB and a power of two, ensuring proper alignment and optimal performance.
I’ve implemented the use of 2 MiB large pages for dynamically allocated memory in RamDyn. The core functionality is working, but the implementation is partial: the command-line option, RamDiskUI support, and registry entry still need to be added.
I can post the current implementation, but these additional features have not yet been added.
PS:
There are also huge pages (1 GiB), but I have not yet attempted to implement them.
The AWE API has the advantage of allowing allocation of more than 4 GB on some 32-bit editions of Windows.
For 64-bit software, indeed, this advantage no longer exists.
However, as the documentation says, allocating large pages can often be slow because of memory fragmentation.
In some cases, I assume it can be even slower than AWE. But you will not see that with a benchmark.
Besides, this feature has a bug on some versions of Windows:
https://sourceforge.net/p/sevenzip/discussion/45797/thread/e730c709/
Not so great eh?
Every day I have to use Windows, I wonder how Microsoft manages to do things in the most convoluted or incompetent way. I know Linux isn’t perfect either, but I really wish Microsoft would implement something like a proper tmpfs instead of constantly pushing AI features nobody asked for.
Anyway, I’ll just use this and see if any issues come up.
Sorry for bothering you.
Well, in fact this is a very interesting matter.
I could work around the crashes by adding that as an option, but regarding performance, I have read several complaints.
Because of RAM fragmentation, it should be faster (and much faster than AWE) if the RAM blocks are allocated once and for all, and slower if allocations and deallocations are done repeatedly. And benchmarks such as CrystalDiskMark preallocate the space used, so you can hardly see the time required by the allocations.
All that really deserves some tests...
For now, I would be interested in any benchmarks you have (even with preallocated files), or any specific behavior you notice.
I've completed implementing all the core functionality of the large pages feature, including the UI. However, I haven't tackled localization.
I ran the ramdisk all day yesterday under various workloads — using it as a compilation directory, running fio tests, and even installing virtual machines directly onto it. I encountered no errors, crashes, or instability during any of these tests.
In terms of performance, it significantly outperforms the AWE-based solution. I'm running on a system with 128 GiB of RAM, so I didn’t run into any memory fragmentation issues. For my use case, a block size of 2 MiB seems ideal. I suspect larger sizes — especially 1 GiB — might cause issues, but I haven't tested that in detail.
Here's the command I used to test and benchmark:
I've attached the runtime logs for the following configurations:
* Dynamic allocation
* Dynamic allocation (AWE)
* Dynamic allocation (Large Pages)
* Full allocation
* Full allocation (AWE)
Some observations:
* Dynamic allocation with AWE is noticeably slow.
* Dynamic allocation with Large Pages performs about the same as standard dynamic allocation.
* I haven’t tested full allocation with Large Pages.
PS: Newer versions of Windows 11 no longer include wmic by default, and the following snippet causes the build process to fail:

    @for /f %%I in ('wmic OS get LocalDateTime ^| find "."') do @set D=%%I
    echo #define APP_VERSION "%D:~0,8%">inc\build.h
    echo #define APP_NUMBER %D:~0,8%>>inc\build.h
The README should mention that users can re-enable it using:
I suggest replacing it with this PowerShell-based version which works reliably:
Let me know if you'd like me to test any other scenarios or provide more feedback.
Last edit: Milan 2025-09-01
Thanks a lot for all this information.
Yes, the difference is awesome. But of course, with 128 GB of RAM, you are unlikely to face RAM fragmentation.
And with so much RAM, the big question comes back: why the hell keep a page file? But let's not be too unpleasant. :)
Anyway, this is definitely something to add.
And yes, I hadn't noticed about wmic. It is not available by default on Windows 11 24H2 (my development system is still on a previous version). A bit surprising, as it is still a useful tool.
Yeah, with 128 GiB of RAM, fragmentation is usually not an issue, but I still prefer to keep a page file, mainly because some company tools I use actually require it. I’d rather have my browser paged out than risk a crash if memory runs low. That’s also why I want to avoid the ramdisk itself being paged out — it defeats the whole purpose of having one :)
I’ve attached the changes I made. Please review them before accepting. I’ve marked a few spots with FIXME and TODO to make them easier to find.
Some quick notes:
* RamDiskUI/resource.rc: The checkbox positioning needs adjustment; it’s currently getting cut off. Also, I’d recommend renaming ID_CHECK200000 (0x200000 = 2^21 bytes = 2 MiB, i.e., the large-page size) to something more descriptive or to your liking.
* RamDiskUI/resource.h: Same note about ID_CHECK200000.
* RamDiskUI/RamDiskUI.c: I’ve hardcoded the text since I wasn’t sure how to localize it properly within the current structure (no offense intended). You might want to consider using GetPrivateProfileString for this. That said, I haven’t dug into why the current file parsing is implemented the way it is; I'm just flagging this as a thought, not a critique.
* It’s been a while since I wrote GUI code, so you might want to double-check the checkbox creation and toggling behavior, especially the enabling/disabling and the dependency logic ensuring that "Use Large Pages" and "Use AWE physical memory" can't both be enabled at the same time.
P.S. Just for fun I’ve attached some fio benchmark results for a Linux VM running on a Windows-hosted ramdisk. I tested:
* Native Linux tmpfs
* Loop-mounted NTFS using ntfs3 (kernel driver)
* Loop-mounted NTFS using ntfs-3g (FUSE driver)
Even in a VM on top of a Windows ramdisk Linux tmpfs is still blazing fast compared to anything on the Windows side. Not blaming you or Olof at all — I’ve got great respect for both of you. Just venting a bit of frustration at Microsoft’s priorities.
Last edit: Milan 2025-09-01
Thanks. I just took a look. Even if it seems correct, don't be offended if I write something different. :)
But it's still good to have an example that works.
I need to take time to dig into that.
I now have my implementation of large-page allocations in RamDyn. Here are the results with CrystalDiskMark.
The small requests with AWE are pretty bad...
For the large pages, as expected, the results are close to the virtual memory allocations. However, large requests with several queues seem to be a bit slower. I don't know why, but it's not a huge difference.
But as I said, a benchmark does not always reflect real-life performance, and we cannot see here the slowdown that occurs when memory is very fragmented.
Nonetheless, for users who have a lot of RAM (and therefore likely less fragmentation) and, even so, a page file, and who want their ramdisk never to end up in the pagefile, it will be interesting.
The fun is over for me, as I have to work on the GUI...