While that may be true currently, it can easily be wrapped with a malloc that over-allocates and stores the offset in the spare bytes. That is pretty much what _aligned_malloc does already (as far as I understand this code: https://github.com/Alexpux/mingw-w64/blob/master/mingw-w64-crt/misc/mingw-aligned-malloc.c ). What I hadn't thought of (and you undoubtedly have) is API interoperability: with this change, MSVC couldn't free a buffer that mingw-w64 code malloc-ed (unless there was some manual "unwrapping"...
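To make the wrapping idea concrete, here is a minimal sketch of what I mean. The names `wrapped_malloc`, `wrapped_free` and `WRAP_ALIGN` are made up for illustration, and this is not how mingw-w64's `_aligned_malloc` is actually written; it only shows the over-allocate-and-store-the-offset technique, assuming a fixed 16-byte target alignment.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical target alignment; must be a power of two and at most 255
 * so that the offset fits into a single byte. */
#define WRAP_ALIGN 16u

/* Hypothetical wrapper: returns a pointer aligned to WRAP_ALIGN. */
void *wrapped_malloc(size_t size)
{
    unsigned char *raw;
    unsigned char *user;
    size_t offset;

    /* Guard against overflow when adding the slack bytes. */
    if (size > SIZE_MAX - WRAP_ALIGN)
        return NULL;

    /* Over-allocate by WRAP_ALIGN bytes. */
    raw = malloc(size + WRAP_ALIGN);
    if (raw == NULL)
        return NULL;

    /* Distance to the next aligned address strictly above raw.
     * This is in the range 1..WRAP_ALIGN, so there is always at least
     * one spare byte in front of the returned pointer. */
    offset = WRAP_ALIGN - (size_t)((uintptr_t)raw % WRAP_ALIGN);
    user = raw + offset;

    /* Store the offset in the spare byte just before the user pointer. */
    user[-1] = (unsigned char)offset;
    return user;
}

/* Hypothetical counterpart: recovers the original pointer and frees it. */
void wrapped_free(void *ptr)
{
    unsigned char *user = ptr;

    if (user == NULL)
        return;
    free(user - user[-1]);
}
```

This also shows the interoperability problem directly: the pointer handed to the caller is not the one the underlying `malloc` returned, so passing it to a plain `free` (for example from MSVC-compiled code) would be invalid.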
For the second half / background info / related discussion, see: https://sourceforge.net/p/mingw-w64/bugs/779/ Also mentioned there, but I forgot to add it here: this disparity only arises on i386. Also, the alignment of pointers returned by the current malloc implementation (described by the same/similar wording in the standard) is currently only 8, not 16.
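For anyone who wants to check the disparity on their own toolchain, something along these lines should do. It is only a rough sketch assuming a C11 compiler; `observed_alignment`, `min_malloc_alignment` and `print_alignment_report` are helper names I made up, and sampling a few allocations only gives an upper bound on what malloc guarantees, not a proof.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: largest power of two (capped at 4096) that
 * divides the address of p. */
static size_t observed_alignment(const void *p)
{
    uintptr_t addr = (uintptr_t)p;
    size_t align = 1;

    while (align < 4096 && (addr & align) == 0)
        align <<= 1;
    return align;
}

/* Hypothetical helper: smallest alignment seen over `count` live
 * allocations of `size` bytes each. Returns 0 if allocation fails. */
static size_t min_malloc_alignment(size_t count, size_t size)
{
    void **blocks = calloc(count, sizeof *blocks);
    size_t min_align = 4096;
    size_t i;

    if (blocks == NULL)
        return 0;

    /* Keep all blocks alive so malloc cannot hand back the same one. */
    for (i = 0; i < count; i++) {
        blocks[i] = malloc(size);
        if (blocks[i] == NULL) {
            min_align = 0;
            break;
        }
        if (observed_alignment(blocks[i]) < min_align)
            min_align = observed_alignment(blocks[i]);
    }

    for (i = 0; i < count; i++)
        free(blocks[i]);
    free(blocks);
    return min_align;
}

/* Prints what the headers claim next to what malloc was seen to deliver. */
static void print_alignment_report(void)
{
    printf("alignof(max_align_t)         = %zu\n", alignof(max_align_t));
    printf("min observed malloc alignment = %zu\n",
           min_malloc_alignment(64, 24));
}
```

Calling `print_alignment_report()` from `main` on an i386 build would show whether the first number is larger than the second, which is the mismatch described above.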
max_align_t definition outdated (on i386) since GCC 7
max_align_t depends on header inclusion order