I use PGGB, so my albums are 30 GB of WAV files, but reading them into RAM only reaches 600 MB/s (even with parallel file read enabled) on an NVMe drive capable of 3500 MB/s reads.
Could you make the read speed faster?
In preliminary testing, it seems possible to read WAV data at 1 GB/s to 2 GB/s while converting it to PCM16/24/32/32-bit float, using a memory-mapped file read and assembly-optimized bit-depth conversion. The bottleneck of the process appears to be copying data into main memory. The following is the Intel Advisor result.
ASM PCM16toF32 100M sample conversion in 0.022285 sec. 4.487364 Gsamples/sec
C++ PCM16toF32 100M sample conversion in 0.053660 sec. 1.863582 Gsamples/sec
ASM PCM16to24 100M sample conversion in 0.068168 sec. 1.466968 Gsamples/sec
C++ PCM16to24 100M sample conversion in 0.144177 sec. 0.693593 Gsamples/sec
ASM PCM24toF32 100M sample conversion in 0.029918 sec. 3.342425 Gsamples/sec
C++ PCM24toF32 100M sample conversion in 0.110455 sec. 0.905349 Gsamples/sec
ASM PCM24to32 100M sample conversion in 0.029540 sec. 3.385252 Gsamples/sec
C++ PCM24to32 100M sample conversion in 0.102460 sec. 0.975993 Gsamples/sec
ASM PCM16to32 100M sample conversion in 0.039539 sec. 2.529155 Gsamples/sec
C++ PCM16to32 100M sample conversion in 0.048266 sec. 2.071843 Gsamples/sec
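For readers who want to see what the conversions being timed above do, here is a minimal scalar sketch of two of them in portable C++. It is not the assembly or C++ code that produced those numbers: the function names `Pcm16ToF32` and `Pcm24To32` are hypothetical, the input is assumed to be little-endian packed PCM, and placing the 24-bit value in the upper 24 bits of the 32-bit output is an assumption about the target layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: little-endian signed 16-bit PCM -> 32-bit float in [-1, 1).
std::vector<float> Pcm16ToF32(const std::vector<uint8_t>& src) {
    assert(src.size() % 2 == 0);  // whole samples only
    const size_t n = src.size() / 2;
    std::vector<float> dst(n);
    for (size_t i = 0; i < n; ++i) {
        // Assemble the sample from bytes so the result does not depend on host endianness.
        const uint16_t u = static_cast<uint16_t>(src[2 * i] | (src[2 * i + 1] << 8));
        const int16_t s = static_cast<int16_t>(u);
        dst[i] = static_cast<float>(s) / 32768.0f;
    }
    return dst;
}

// Hypothetical helper: packed little-endian signed 24-bit PCM -> 32-bit PCM.
// Assumption: the 24-bit value is left-aligned (low byte of the output is zero).
std::vector<int32_t> Pcm24To32(const std::vector<uint8_t>& src) {
    assert(src.size() % 3 == 0);  // whole samples only
    const size_t n = src.size() / 3;
    std::vector<int32_t> dst(n);
    for (size_t i = 0; i < n; ++i) {
        const uint32_t u = (static_cast<uint32_t>(src[3 * i]) << 8)
                         | (static_cast<uint32_t>(src[3 * i + 1]) << 16)
                         | (static_cast<uint32_t>(src[3 * i + 2]) << 24);
        dst[i] = static_cast<int32_t>(u);
    }
    return dst;
}
```

A loop like this touches every input byte once and writes a larger output buffer, which is consistent with memory traffic being the limiting factor once the arithmetic itself is vectorized.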
A 5 GB/s WAV file read was achieved using 8 I/O completion port queues with 8-thread parallel reads and AVX512F/AVX512BW bit-depth conversion. I am now integrating it into the PlayPcmWin code.
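The general idea of splitting one file into regions that are read concurrently can be sketched in portable C++. This is a swapped-in technique, not the I/O completion port implementation described above: it uses plain `std::thread` workers with one `std::ifstream` each, and the function name `ReadFileParallel` is hypothetical.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper: read a whole file into memory using numThreads workers,
// each reading one contiguous region. Returns an empty vector on any failure.
std::vector<uint8_t> ReadFileParallel(const std::string& path, unsigned numThreads) {
    std::ifstream probe(path, std::ios::binary | std::ios::ate);
    if (!probe) return {};
    const size_t total = static_cast<size_t>(probe.tellg());
    probe.close();

    std::vector<uint8_t> buf(total);
    if (total == 0) return buf;
    if (numThreads == 0) numThreads = 1;

    // Round up so the regions cover the whole file.
    const size_t chunk = (total + numThreads - 1) / numThreads;
    std::atomic<bool> ok{true};
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < numThreads; ++t) {
        const size_t begin = std::min(total, static_cast<size_t>(t) * chunk);
        const size_t end = std::min(total, begin + chunk);
        if (begin == end) break;  // nothing left for this worker
        workers.emplace_back([&path, &buf, &ok, begin, end] {
            // Each worker opens its own stream and writes to a disjoint slice of buf.
            std::ifstream in(path, std::ios::binary);
            in.seekg(static_cast<std::streamoff>(begin));
            in.read(reinterpret_cast<char*>(buf.data() + begin),
                    static_cast<std::streamsize>(end - begin));
            if (!in) ok = false;
        });
    }
    for (auto& w : workers) w.join();
    if (!ok) return {};
    return buf;
}
```

Buffered stream reads like these go through the OS file cache, so this sketch only illustrates the region splitting; it should not be expected to reach the throughput quoted above.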
Anonymous - 2022-08-24
Wow, great!
Implemented in PlayPcmWin 5.0.85, but the read performance improvement is not enough. I found that the read bottleneck is in the C# marshaling part of the code...
Anonymous - 2023-01-23
Thanks, but on my system there is no improvement.
True. I'm fixing the bottleneck, but it is not completed yet. The C# WAV file reader contains a lot of workaround code for malformed WAV files, and I'd like to port it to native C++ while keeping compatibility, which takes some time...
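To give a sense of what such a workaround can look like in native C++, here is a minimal sketch of a RIFF chunk walker that tolerates one common kind of malformation: a `data` chunk whose declared size is larger than what the file actually contains. Which malformations the PlayPcmWin reader handles is not stated here, so this case is an assumption, and `DataChunk`, `ReadLe32`, and `FindDataChunk` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical result type for the sketch.
struct DataChunk {
    bool found = false;
    size_t offset = 0;  // byte offset of the first PCM byte
    size_t size = 0;    // usable PCM bytes, clamped to the file length
};

// Read a little-endian 32-bit value independent of host endianness.
static uint32_t ReadLe32(const uint8_t* p) {
    return static_cast<uint32_t>(p[0]) | (static_cast<uint32_t>(p[1]) << 8)
         | (static_cast<uint32_t>(p[2]) << 16) | (static_cast<uint32_t>(p[3]) << 24);
}

// Walk the RIFF chunks of an in-memory WAV file and locate the "data" chunk.
DataChunk FindDataChunk(const std::vector<uint8_t>& file) {
    DataChunk r;
    if (file.size() < 12 || std::memcmp(file.data(), "RIFF", 4) != 0
        || std::memcmp(file.data() + 8, "WAVE", 4) != 0) {
        return r;  // not a RIFF/WAVE file
    }
    size_t pos = 12;
    while (pos + 8 <= file.size()) {
        const uint8_t* hdr = file.data() + pos;
        const size_t declared = ReadLe32(hdr + 4);
        const size_t remaining = file.size() - (pos + 8);
        if (std::memcmp(hdr, "data", 4) == 0) {
            r.found = true;
            r.offset = pos + 8;
            // Workaround: trust the file length over an oversized declared size.
            r.size = declared < remaining ? declared : remaining;
            return r;
        }
        if (declared > remaining) break;       // truncated non-data chunk: give up
        pos += 8 + declared + (declared & 1);  // RIFF chunks are padded to even length
    }
    return r;
}
```

The RIFF top-level size field is deliberately ignored in this sketch, since a writer that gets the `data` size wrong may well get that field wrong too.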
This is a pretty interesting problem. I'd like to look into it when I have some spare time.
Great.
This is the PCM 24-bit to 32-bit float conversion performance.