From: Nnaemeka A. <nna...@en...> - 2025-09-30 18:12:10
Hi Stan,
Yeah, I tried "global mem/limit grid", but that didn't work:
SPARTA (24 Sep 2025)
Running on 10 MPI task(s)
Created orthogonal box = (-0.01 -0.015 -0.015) to (0.0400064 0.015 0.015)
WARNING: Could not acquire nearby ghost cells b/c grid partition is not clumped (../grid.cpp:473)
Created 184320000 child grid cells
CPU time = 6.14946 secs
create/ghost percent = 86.6197 13.3803
ERROR on proc 3: Global mem/limit setting cannot exceed 2GB (../update.cpp:1872)
ERROR on proc 4: Global mem/limit setting cannot exceed 2GB (../update.cpp:1872) etc...
Where do I find the number of particles per MPI rank? Unfortunately, the terminal output is all I get; the log.sparta file remains empty. Is there an optimal number of particles/cells/elements per MPI rank for SPARTA?
Thanks,
N.
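A rough answer to the particles-per-rank question can be worked out from the input script quoted further down the thread, without any log output. The sketch below assumes the flow fills the whole box at the free-stream density (it ignores the volume occupied by the surface and any compression around it), so the steady-state count is roughly nrho x box volume / fnum. The function name and the even split across ranks are illustrative assumptions, not SPARTA output.

```python
def estimate_particles(nrho, fnum, box_lo, box_hi, nranks):
    """Estimate the simulated particle count for a box filled at uniform density.

    Assumption: uniform free-stream density everywhere, particles spread
    evenly over the MPI ranks.
    """
    volume = 1.0
    for lo, hi in zip(box_lo, box_hi):
        volume *= hi - lo
    total = nrho * volume / fnum  # real molecules / (molecules per simulated particle)
    return total, total / nranks


# Values taken from the input script in this thread
total, per_rank = estimate_particles(
    nrho=4.689623478e21,
    fnum=1e10,
    box_lo=(-0.01, -0.015, -0.015),
    box_hi=(0.0400064091, 0.015, 0.015),
    nranks=10,
)
print(f"total ~ {total:.3e} particles, per rank ~ {per_rank:.3e}")
```

Under these assumptions the estimate comes to about 21 million particles in total, or roughly 2 million per rank, which is far fewer than the 184,320,000 grid cells. Once a run gets going, the np column already requested in stats_style reports the actual total.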
________________________________
From: Moore, Stan <st...@sa...>
Sent: 30 September 2025 15:06
To: Nnaemeka Anyamele <nna...@en...>; Steve Plimpton <sj...@gm...>
Cc: Moore, Stan via sparta-users <spa...@li...>
Subject: Re: [EXTERNAL] Re: [sparta-users] Unable to balance grid due to buffers exceeding 2GB (MPI limitation?)
It appears you are running out of memory (RAM). How many particles per MPI rank? You could try "global mem/limit grid" instead, but ultimately you may need to use more CPU nodes.
Stan
________________________________
From: Nnaemeka Anyamele <nna...@en...>
Sent: Monday, September 29, 2025 4:46 PM
To: Moore, Stan <st...@sa...>; Steve Plimpton <sj...@gm...>
Cc: Moore, Stan via sparta-users <spa...@li...>
Subject: Re: [EXTERNAL] Re: [sparta-users] Unable to balance grid due to buffers exceeding 2GB (MPI limitation?)
Hi Steve/Stan,
Ok so I've upgraded SPARTA and run with the following input file:
seed 1221
dimension 3
global gridcut 0.0 comm/sort yes
boundary o o o
create_box -0.01 0.0400064091 -0.015 0.015 -0.015 0.015
create_grid 400 240 240 levels 2 subset * * * * 2 2 2
global mem/limit 1024
balance_grid rcb cell
global nrho 4.689623478e+21 fnum 1e+10
species air.species N2 O2
mixture fluid N2 O2 vstream 710.6291787 0 0 temp 44.73286308
mixture fluid N2 frac 0.78849
mixture fluid O2 frac 0.21151
read_surf SPARTArun00008.geom
surf_collide 1 diffuse 295 0.9
surf_modify all collide 1
collide vss fluid air.vss relax constant
react none
fix inflow emit/face fluid xlo twopass
variable Nfreq equal 10
compute fluid5 property/grid all id xc yc zc
dump fluid5 grid all 5640 xport_stdy_grid.*.stdydata id c_fluid5[*]
compute model1 surf all fluid n nflux_incident fx fy fz press etot
compute model1b reduce sum c_model1[*]
dump model1 surf all ${Nfreq} xport_stdy_model.*.stdydata id c_model1[*]
compute model2 property/surf all id area xc yc zc
dump model2 surf all 5640 xport_stdy_surf.*.stdydata id c_model2[*]
timestep 2.5e-07
stats 10
stats_style step cpu wall np nattempt ncoll nscoll nscheck c_model1b[*]
run 2820
It managed to complete the migration, but the program eventually terminated (and a few seconds later the terminal crashed!). No error was thrown; it just terminated, by the looks of it during or just after the surface-reading stage of initialisation. This is what was output in the terminal:
SPARTA (24 Sep 2025)
Running on 10 MPI task(s)
Created orthogonal box = (-0.01 -0.015 -0.015) to (0.0400064 0.015 0.015)
WARNING: Could not acquire nearby ghost cells b/c grid partition is not clumped (../grid.cpp:473)
Created 184320000 child grid cells
CPU time = 6.42229 secs
create/ghost percent = 85.3168 14.6832
Balance grid migrated 165888000 cells
CPU time = 171.819 secs
reassign/sort/migrate/ghost percent = 4.7303 0.212882 66.0955 28.9613
Reading surface file ...
6390 triangles
0 0.0283564 xlo xhi
-0.00499695 0.00499695 ylo yhi
-0.00498782 0.005 zlo zhi
9.82486e-06 min triangle edge length
1.99208e-09 min triangle area
Terminated
I'm not certain what this indicates. I had just restarted the PC to free up memory. What do you think?
Thanks!
N.
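The cell counts in the output above can be checked by hand, which also gives a feel for why the migration buffers are so large. The sketch below assumes that "subset * * * * 2 2 2" refines every level-1 cell into 2 x 2 x 2 children, that the 2 GB limit is 2^31 bytes, and that each rank sends an equal share of the migrated cells in a single buffer. The variable names are illustrative only.

```python
# Grid definition from the create_grid line in the input script
nx, ny, nz = 400, 240, 240
children_per_cell = 2 * 2 * 2   # assumed full 2x2x2 refinement at level 2
nranks = 10

cells = nx * ny * nz * children_per_cell
cells_per_rank = cells // nranks

# "Balance grid migrated 165888000 cells" from the terminal output
migrated = 165_888_000
migrated_fraction = migrated / cells
sent_per_rank = migrated / nranks   # assumes equal sends from every rank

# Largest per-cell payload that keeps one rank's send buffer under 2^31 bytes
limit_bytes = 2**31
max_bytes_per_cell = limit_bytes / sent_per_rank

print(f"child cells:        {cells:,}")
print(f"cells per rank:     {cells_per_rank:,}")
print(f"fraction migrated:  {migrated_fraction:.2f}")
print(f"byte budget / cell: {max_bytes_per_cell:.1f}")
```

The total matches the 184320000 child cells reported, and the migrated count is 90% of it, i.e. nearly every cell changes owner when RCB replaces the initial assignment on 10 ranks. Under the assumptions above, a single per-rank send buffer stays below 2^31 bytes only if each cell packs into less than roughly 129 bytes.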
________________________________
From: Moore, Stan <st...@sa...>
Sent: 29 September 2025 16:49
To: Steve Plimpton <sj...@gm...>; Nnaemeka Anyamele <nna...@en...>
Cc: Moore, Stan via sparta-users <spa...@li...>
Subject: Re: [EXTERNAL] Re: [sparta-users] Unable to balance grid due to buffers exceeding 2GB (MPI limitation?)
> A rummage through older threads on here suggested adding the command "global mem/limit 1024", but this then throws the following error
I think this may be fixed by https://github.com/sparta/sparta/pull/549. Can you please try with the latest version of SPARTA? You do need to keep "global mem/limit 1024".
Stan
________________________________
From: Steve Plimpton <sj...@gm...>
Sent: Monday, September 29, 2025 8:36 AM
To: Nnaemeka Anyamele <nna...@en...>
Cc: Moore, Stan via sparta-users <spa...@li...>
Subject: [EXTERNAL] Re: [sparta-users] Unable to balance grid due to buffers exceeding 2GB (MPI limitation?)
Seeing your input script would help diagnose this.
Steve
On Sun, Sep 28, 2025 at 5:26 PM Nnaemeka Anyamele via sparta-users <spa...@li...> wrote:
Hi,
I would like to run a case with 800 x 480 x 480 cells (= 184,320,000) using the 'balance_grid rcb cell' command, but I receive the following error:
SPARTA (20 Jan 2025)
Running on 10 MPI task(s)
Created orthogonal box = (-0.01 -0.015 -0.015) to (0.0400064 0.015 0.015)
WARNING: Could not acquire nearby ghost cells b/c grid partition is not clumped (../grid.cpp:471)
Created 184320000 child grid cells
CPU time = 7.91065 secs
create/ghost percent = 89.2979 10.7021
ERROR on proc 6: Migrate cells send buffer exceeds 2 GB (../comm.cpp:281)
ERROR on proc 2: Migrate cells send buffer exceeds 2 GB (../comm.cpp:281) etc...
The error is self-explanatory, but it's not clear what I can do to get around it. A rummage through older threads on here suggested adding the command "global mem/limit 1024", but this then throws the following error:
SPARTA (20 Jan 2025)
Running on 10 MPI task(s)
Created orthogonal box = (-0.01 -0.015 -0.015) to (0.0400064 0.015 0.015)
WARNING: Could not acquire nearby ghost cells b/c grid partition is not clumped (../grid.cpp:471)
Created 184320000 child grid cells
CPU time = 7.21874 secs
create/ghost percent = 88.0565 11.9435
ERROR on proc 2: Irregular comm recv buffer exceeds 2 GB (../irregular.cpp:653)
ERROR on proc 7: Irregular comm recv buffer exceeds 2 GB (../irregular.cpp:653) etc...
I tried running with spa_mpi_big and the result is the same.
Another thread suggested that the only way to get around this is to run on more MPI processes. I haven't tried running this on our cluster yet, as I'm hesitant to do so in case I wreck anything. Do you have any suggestions for how one might get past this issue before I try on the cluster?
Many thanks,
N.
_______________________________________________
sparta-users mailing list
spa...@li...
https://lists.sourceforge.net/lists/listinfo/sparta-users