ggml webgpu: shader library organization (#19530)
* Basic JIT compilation for mul_mat, get_rows, and scale (#17)
* scale jit working
* preliminary working jit for get_rows and mul_mat, needs refining
* simplified mul_mat preprocessing switch statement
* get_rows fixes, mul_mat refinement
* formatted + last edits
* removed some extraneous prints
* fixed get_rows, fixed workgroup dispatch in mul_mat; output is no longer gibberish
* small fix
* some changes, working
* get_rows and mul_mat jit fixed and working
* Update formatting
* formatting
* Add header
---------
Co-authored-by: Neha Abbas <nehaabbas@ReeseLevines-MacBook-Pro.local>
Co-authored-by: Reese Levine <reeselevine1@gmail.com>
* Start work on all-encompassing shader library
* refactor argmax, set_rows
* Refactor all but flash attention and mat mul
* flash attention and matrix multiplication moved to new format
* clean up preprocessing
* Formatting
* remove duplicate constants
* Split large shaders into multiple static strings
---------
Co-authored-by: neha-ha <137219201+neha-ha@users.noreply.github.com>
macOS/iOS:
Linux:
Windows:
- Windows x64 (CPU)
- Windows arm64 (CPU)
- Windows x64 (CUDA 12) - CUDA 12.4 DLLs
- Windows x64 (CUDA 13) - CUDA 13.1 DLLs
- Windows x64 (Vulkan)
- Windows x64 (SYCL)
- Windows x64 (HIP)
openEuler: