WSClean has very different performance characteristics from the w-projection algorithm. When WSClean is run on a machine that does not have enough memory to hold all w-layers at once, it will make several passes over the measurement set: in the first pass it grids and FFTs only the layers with the lowest w-terms, in the second pass it processes the data with the second-lowest w-terms, and so on. A few passes typically do not slow down the imaging much, but if ten or more passes are needed, the performance might no longer be acceptable. In that case you can:
In the w-projection algorithm, the number of w-layers hardly affects the speed of the algorithm. In w-stacking, however, the run time is almost linear in the number of w-layers, so you should not make this number larger than necessary. If the observation has large w-values, because it is far off-zenith, a large number of w-layers might be required. Note that in such a case the w-projection algorithm will need a very large w-kernel as well, and also becomes extremely slow. In such cases it might be better to change the phase centre before imaging: see the chgcentre documentation.
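To make the link between w-values and the number of w-layers concrete, the sketch below uses a commonly quoted rule of thumb for w-stacking: the layer count should be at least 2π (w_max - w_min) (1 - sqrt(1 - l² - m²)), evaluated at the image position furthest from the phase centre. This is an illustration under stated assumptions, not WSClean's actual layer selection. The function names, the choice of units (w in wavelengths, l and m as direction cosines) and the purely linear cost model are all assumptions made for this example.

```python
import math


def estimate_w_layers(w_max, w_min, l_max, m_max):
    """Rule-of-thumb lower bound on the number of w-layers.

    w_max, w_min: largest and smallest w-values, in wavelengths (assumed units).
    l_max, m_max: largest direction cosines covered by the image.
    """
    r2 = l_max ** 2 + m_max ** 2
    if r2 >= 1.0:
        raise ValueError("l_max^2 + m_max^2 must be smaller than 1")
    # Maximum of (1 - sqrt(1 - l^2 - m^2)) over the image, reached at the corner.
    n_term = 1.0 - math.sqrt(1.0 - r2)
    layers = 2.0 * math.pi * (w_max - w_min) * n_term
    # Always use at least one layer.
    return max(1, math.ceil(layers))


def relative_stacking_cost(n_layers, reference_layers):
    """Relative run time, assuming cost is exactly linear in the layer count."""
    return n_layers / reference_layers


if __name__ == "__main__":
    near_zenith = estimate_w_layers(1000.0, -1000.0, 0.05, 0.05)
    off_zenith = estimate_w_layers(2000.0, -2000.0, 0.05, 0.05)
    print(near_zenith, off_zenith)
    print(relative_stacking_cost(off_zenith, near_zenith))
```

Doubling the range of w-values roughly doubles the estimated layer count, and under the linear model roughly doubles the gridding time. This is why reducing the w-values by changing the phase centre can pay off.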
The Fornax supercomputer is a good place to run WSClean, since it has 70 GB of memory.
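As a rough way to judge whether a machine has enough memory, the sketch below estimates how many passes over the measurement set would be needed. It assumes that each w-layer is a square grid of single-precision complex values (8 bytes per cell) and that all available memory can be used for w-layers. Both are simplifications, and the function name and parameters are hypothetical; WSClean's real memory accounting may differ.

```python
import math


def estimate_passes(n_layers, image_size_px, available_bytes, bytes_per_cell=8):
    """Estimate the number of passes over the measurement set.

    n_layers: total number of w-layers to grid.
    image_size_px: width (and height) of the square grid, in pixels.
    available_bytes: memory assumed to be usable for w-layers.
    bytes_per_cell: assumed size of one complex grid cell.
    """
    layer_bytes = image_size_px * image_size_px * bytes_per_cell
    layers_in_memory = available_bytes // layer_bytes
    if layers_in_memory < 1:
        raise ValueError("not even one w-layer fits in the available memory")
    # Each pass handles as many layers as fit in memory at once.
    return math.ceil(n_layers / layers_in_memory)


if __name__ == "__main__":
    # Hypothetical run: 256 w-layers of 10000 x 10000 pixels with 70 GB usable.
    print(estimate_passes(256, 10000, 70 * 10 ** 9))
```

Under these assumptions, 70 GB holds 87 layers of 10000 x 10000 pixels, so a hypothetical run with 256 w-layers would need 3 passes, which is well within the range described above as acceptable.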