From: Moore, S. <st...@sa...> - 2025-10-29 20:32:42
If you want more details on the challenges of virtual functions on the GPU, see https://kokkos.org/kokkos-core-wiki/ProgrammingGuide/Kokkos-and-Virtual-Functions.html. Note that this isn't a Kokkos issue but a GPU issue (we would have the same problem if using raw CUDA).

Stan

________________________________
From: Moore, Stan <st...@sa...>
Sent: Wednesday, October 29, 2025 2:27 PM
To: Max Amer <max...@gm...>; SPARTA Mailing list <spa...@li...>
Subject: Re: [EXTERNAL] [sparta-users] Kokkos limit on command instances

Hi Max,

The problem is that GPUs don't support virtual functions very well: it is a chore to get the vtable into GPU memory. There can also be a performance hit from using virtual functions on the device, so the current workaround is to use a compile-time limit instead. Yes, you can change the limit in the code and then recompile, and it should work. If that doesn't work, please let us know.

Regards,
Stan

________________________________
From: Max Amer <max...@gm...>
Sent: Wednesday, October 29, 2025 6:36 AM
To: SPARTA Mailing list <spa...@li...>
Subject: [EXTERNAL] [sparta-users] Kokkos limit on command instances

Dear developers,

I have recently started using Kokkos for GPU acceleration, and in reusing the codes I have for MPI CPU computing I have come across an "issue" that is giving me some complications. I have noticed that there is a hard limit on the number of instances that commands such as compute_surf or surf_collide can have (i.e. I can only use two surf_collide diffuse commands, two surf_collide transparent commands, or two compute_surf commands). This is an issue when you have a simulation where you want to compute more than two surfaces (which is my case) and define different surf_collide model conditions for each surface.

The update_kokkos.cpp and .h files have the definitions

    #define KOKKOS_MAX_SURF_COLL_PER_TYPE 2
    #define KOKKOS_MAX_TOT_SURF_COLL 10

which set the limits for the number of surf_collide instances per type and the total count; I guess a similar variable is implemented for compute_surf. If I change those limits in the code and recompile, would I then be able to run my simulations with Kokkos for more surfaces and collision models, or would it not work due to other factors implicit in the structure of the code? GPU acceleration seems very useful, but only if I have the same flexibility to play with multiple geometry configurations, and I need to know that for proper benchmarking.

Thank you in advance for your attention.

Max

P.S. I sent a couple of other emails in the past months that have gone unanswered. I don't know whether that is because there is no clear or good response to them or because they got lost in a mailbox, but I think some points are worth looking over, as they could add interesting features to SPARTA!
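
For readers following the thread, here is a minimal sketch of the general pattern Stan describes: replacing virtual dispatch with fixed-size storage whose capacity is set at compile time. It is plain host C++ (no Kokkos or CUDA) and is not SPARTA's actual implementation. Every name in it (`MAX_PER_TYPE`, `MAX_TOTAL`, `SurfCollideTable`, `DiffuseModel`, `TransparentModel`) is hypothetical, and the "collision" arithmetic is a toy placeholder rather than real surface physics. The two constants only mirror the values quoted in the message above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-ins for the compile-time limits quoted in the thread.
constexpr std::size_t MAX_PER_TYPE = 2;   // instances allowed per model type
constexpr std::size_t MAX_TOTAL    = 10;  // instances allowed across all types

enum class ModelType { Diffuse, Transparent };

// Plain-data models with no virtual functions, so there is no vtable pointer
// that would have to be valid in device memory.
struct DiffuseModel {
  double accommodation = 0.0;
  // Toy placeholder: reverse and scale the incoming normal velocity.
  double apply(double v) const { return -accommodation * v; }
};

struct TransparentModel {
  // The particle passes through unchanged.
  double apply(double v) const { return v; }
};

struct SurfCollideTable {
  struct Slot { ModelType type; std::size_t index; };

  // Fixed-size storage: capacity is baked in at compile time.
  std::array<DiffuseModel, MAX_PER_TYPE> diffuse{};
  std::array<TransparentModel, MAX_PER_TYPE> transparent{};
  std::array<Slot, MAX_TOTAL> slots{};
  std::size_t ndiffuse = 0, ntransparent = 0, nslots = 0;

  // Register a diffuse model; returns its global id.
  std::size_t add_diffuse(double accommodation) {
    if (ndiffuse == MAX_PER_TYPE || nslots == MAX_TOTAL)
      throw std::length_error("compile-time surf_collide limit exceeded");
    diffuse[ndiffuse] = DiffuseModel{accommodation};
    slots[nslots] = Slot{ModelType::Diffuse, ndiffuse++};
    return nslots++;
  }

  // Register a transparent model; returns its global id.
  std::size_t add_transparent() {
    if (ntransparent == MAX_PER_TYPE || nslots == MAX_TOTAL)
      throw std::length_error("compile-time surf_collide limit exceeded");
    transparent[ntransparent] = TransparentModel{};
    slots[nslots] = Slot{ModelType::Transparent, ntransparent++};
    return nslots++;
  }

  // Dispatch on an enum tag instead of through a virtual call.
  double collide(std::size_t id, double v) const {
    assert(id < nslots);
    const Slot& s = slots[id];
    switch (s.type) {
      case ModelType::Diffuse:     return diffuse[s.index].apply(v);
      case ModelType::Transparent: return transparent[s.index].apply(v);
    }
    return v;  // unreachable for valid tags
  }
};
```

In this sketch, a third call to `add_diffuse` throws until `MAX_PER_TYPE` is raised and the code is rebuilt, which is the trade-off discussed in the thread: the limit is fixed at compile time, and raising it means editing the constant and recompiling.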