From: Nnaemeka A. <nna...@en...> - 2025-09-29 22:47:08
Hi Steve/Stan,
OK, so I've upgraded SPARTA and run it with the following input file:
seed 1221
dimension 3
global gridcut 0.0 comm/sort yes
boundary o o o
create_box -0.01 0.0400064091 -0.015 0.015 -0.015 0.015
create_grid 400 240 240 levels 2 subset * * * * 2 2 2
global mem/limit 1024
balance_grid rcb cell
global nrho 4.689623478e+21 fnum 1e+10
species air.species N2 O2
mixture fluid N2 O2 vstream 710.6291787 0 0 temp 44.73286308
mixture fluid N2 frac 0.78849
mixture fluid O2 frac 0.21151
read_surf SPARTArun00008.geom
surf_collide 1 diffuse 295 0.9
surf_modify all collide 1
collide vss fluid air.vss relax constant
react none
fix inflow emit/face fluid xlo twopass
variable Nfreq equal 10
compute fluid5 property/grid all id xc yc zc
dump fluid5 grid all 5640 xport_stdy_grid.*.stdydata id c_fluid5[*]
compute model1 surf all fluid n nflux_incident fx fy fz press etot
compute model1b reduce sum c_model1[*]
dump model1 surf all ${Nfreq} xport_stdy_model.*.stdydata id c_model1[*]
compute model2 property/surf all id area xc yc zc
dump model2 surf all 5640 xport_stdy_surf.*.stdydata id c_model2[*]
timestep 2.5e-07
stats 10
stats_style step cpu wall np nattempt ncoll nscoll nscheck c_model1b[*]
run 2820
It managed to complete the migration, but the program eventually terminated (and a few seconds later the terminal crashed!). No error was thrown; it just terminated, by the looks of it during or just after the surface-reading stage of initialisation. This is what was output in the terminal:
SPARTA (24 Sep 2025)
Running on 10 MPI task(s)
Created orthogonal box = (-0.01 -0.015 -0.015) to (0.0400064 0.015 0.015)
WARNING: Could not acquire nearby ghost cells b/c grid partition is not clumped (../grid.cpp:473)
Created 184320000 child grid cells
CPU time = 6.42229 secs
create/ghost percent = 85.3168 14.6832
Balance grid migrated 165888000 cells
CPU time = 171.819 secs
reassign/sort/migrate/ghost percent = 4.7303 0.212882 66.0955 28.9613
Reading surface file ...
6390 triangles
0 0.0283564 xlo xhi
-0.00499695 0.00499695 ylo yhi
-0.00498782 0.005 zlo zhi
9.82486e-06 min triangle edge length
1.99208e-09 min triangle area
Terminated
Not certain what this is indicative of... I had just restarted the PC to reset memory etc... What do you think?
Thanks!
N.
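
Note on the figures in the log above: the short Python sketch below only cross-checks the numbers reported in this message. It reads the create_grid line as 400 x 240 x 240 parent cells, each refined 2 x 2 x 2, and compares the migrated cell count with 1 - 1/P for P = 10 MPI tasks. The 1 - 1/P comparison is an assumption (cells initially scattered evenly over the ranks, so only about 1/P already sit on their final owner); it is not something the log states.

```python
# Figures copied from the log above.
total_cells = 184_320_000      # "Created 184320000 child grid cells"
migrated_cells = 165_888_000   # "Balance grid migrated 165888000 cells"
mpi_tasks = 10                 # "Running on 10 MPI task(s)"

# Cross-check against the input script: 400 x 240 x 240 parent cells,
# each read here as being refined 2 x 2 x 2 at level 2.
cells_from_script = 400 * 240 * 240 * (2 * 2 * 2)

# Fraction of all cells that changed owner during balance_grid.
migrated_fraction = migrated_cells / total_cells

# Assumption: with cells scattered evenly over P ranks, about 1 - 1/P must move.
expected_fraction = 1 - 1 / mpi_tasks

# Average number of cells each rank owns after balancing.
cells_per_rank = total_cells // mpi_tasks

print(f"cells from script : {cells_from_script:,}")
print(f"migrated fraction : {migrated_fraction:.3f}")
print(f"1 - 1/P           : {expected_fraction:.3f}")
print(f"cells per rank    : {cells_per_rank:,}")
```

If the two fractions agree, nearly every cell changed owner during the balance, which would be consistent with the "grid partition is not clumped" warning and with the long migrate time in the log.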
________________________________
From: Moore, Stan <st...@sa...>
Sent: 29 September 2025 16:49
To: Steve Plimpton <sj...@gm...>; Nnaemeka Anyamele <nna...@en...>
Cc: Moore, Stan via sparta-users <spa...@li...>
Subject: Re: [EXTERNAL] Re: [sparta-users] Unable to balance grid due to buffers exceeding 2GB (MPI limitation?)
>A rummage through older threads on here suggested adding the command "global mem/limit 1024" but this then throws the following error
I think this may be fixed by https://github.com/sparta/sparta/pull/549. Can you please try with the latest version of SPARTA? You do need to keep "global mem/limit 1024".
Stan
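
Note on "global mem/limit": as I read the SPARTA documentation for the global command, mem/limit caps the buffer memory (in MBytes) used by operations such as load balancing, so that the work is done in several passes instead of one. The Python sketch below illustrates only that general idea of splitting a transfer into passes under a byte limit. It is not SPARTA's implementation; the function name chunk_bounds, the 200 bytes-per-cell figure, and the 1 MByte = 2**20 bytes conversion are all assumptions made for illustration.

```python
def chunk_bounds(n_items, bytes_per_item, limit_bytes):
    """Yield (start, stop) index ranges whose payload fits within limit_bytes.

    Hypothetical helper for illustration; not a SPARTA function.
    """
    # At least one item per pass, even if a single item exceeds the limit.
    per_pass = max(1, limit_bytes // bytes_per_item)
    for start in range(0, n_items, per_pass):
        yield start, min(start + per_pass, n_items)


if __name__ == "__main__":
    limit_bytes = 1024 * 2**20    # "mem/limit 1024", assuming 1 MByte = 2**20 bytes
    bytes_per_cell = 200          # hypothetical; the thread does not give this figure
    cells_to_send = 16_588_800    # about 90% of 18,432,000 cells on one of 10 ranks

    passes = list(chunk_bounds(cells_to_send, bytes_per_cell, limit_bytes))
    print(f"{len(passes)} passes, first pass covers cells {passes[0]}")
```

The point of the sketch is that a limit changes how many passes are needed, not the total amount of data moved, so it can avoid a single oversized buffer without reducing the overall memory the grid itself occupies.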
________________________________
From: Steve Plimpton <sj...@gm...>
Sent: Monday, September 29, 2025 8:36 AM
To: Nnaemeka Anyamele <nna...@en...>
Cc: Moore, Stan via sparta-users <spa...@li...>
Subject: [EXTERNAL] Re: [sparta-users] Unable to balance grid due to buffers exceeding 2GB (MPI limitation?)
Seeing your input script would help diagnose this.
Steve
On Sun, Sep 28, 2025 at 5:26 PM Nnaemeka Anyamele via sparta-users <spa...@li...> wrote:
Hi,
I would like to run a case with 800 x 480 x 480 cells (= 184,320,000) using the 'balance_grid rcb cell' command, but I receive the following error:
SPARTA (20 Jan 2025)
Running on 10 MPI task(s)
Created orthogonal box = (-0.01 -0.015 -0.015) to (0.0400064 0.015 0.015)
WARNING: Could not acquire nearby ghost cells b/c grid partition is not clumped (../grid.cpp:471)
Created 184320000 child grid cells
CPU time = 7.91065 secs
create/ghost percent = 89.2979 10.7021
ERROR on proc 6: Migrate cells send buffer exceeds 2 GB (../comm.cpp:281)
ERROR on proc 2: Migrate cells send buffer exceeds 2 GB (../comm.cpp:281) etc...
The error is self-explanatory, but it's not clear what I can do to get around it. A rummage through older threads on here suggested adding the command "global mem/limit 1024", but this then throws the following error:
SPARTA (20 Jan 2025)
Running on 10 MPI task(s)
Created orthogonal box = (-0.01 -0.015 -0.015) to (0.0400064 0.015 0.015)
WARNING: Could not acquire nearby ghost cells b/c grid partition is not clumped (../grid.cpp:471)
Created 184320000 child grid cells
CPU time = 7.21874 secs
create/ghost percent = 88.0565 11.9435
ERROR on proc 2: Irregular comm recv buffer exceeds 2 GB (../irregular.cpp:653)
ERROR on proc 7: Irregular comm recv buffer exceeds 2 GB (../irregular.cpp:653) etc...
Tried running with spa_mpi_big and the result is the same...
Another thread suggested that the only way to get around this is to run on more MPI processes. I haven't tried running this on our cluster yet, and I'm hesitant to do so in case I wreck anything. Do you have any suggestions for how I might get past this issue before I try on the cluster?
Many thanks,
N.
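
Note on the "more MPI processes" suggestion: the Python sketch below is a rough estimate of why more ranks would help, under stated assumptions. It assumes "2 GB" means 2**31 bytes, that each rank sends roughly (1 - 1/P) of its cells during the balance, and that all of a rank's outgoing cells would go into one buffer when no memory limit is set. The actual number of bytes SPARTA packs per cell is not given in this thread, so the sketch reports the per-cell budget that would fit rather than guessing a size. The function name per_rank_budget is hypothetical.

```python
TOTAL_CELLS = 184_320_000   # from "Created 184320000 child grid cells"
LIMIT_BYTES = 2**31         # assumption: "2 GB" means 2**31 bytes


def per_rank_budget(n_ranks, moved_fraction=None):
    """Return (cells sent per rank, bytes per cell that fit under the limit).

    Hypothetical helper; moved_fraction defaults to the assumed 1 - 1/P.
    """
    if moved_fraction is None:
        moved_fraction = 1 - 1 / n_ranks
    cells_sent = TOTAL_CELLS / n_ranks * moved_fraction
    return cells_sent, LIMIT_BYTES / cells_sent


if __name__ == "__main__":
    for n in (10, 20, 40, 80, 160):
        cells, budget = per_rank_budget(n)
        print(f"{n:4d} ranks: ~{cells:13,.0f} cells sent per rank, "
              f"~{budget:8.1f} bytes per cell fit in one buffer")
```

Under these assumptions the per-cell budget roughly doubles each time the rank count doubles, which is the sense in which running on more MPI processes relaxes the 2 GB buffer limit.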
_______________________________________________
sparta-users mailing list
spa...@li...
https://lists.sourceforge.net/lists/listinfo/sparta-users