Hi Sander,
On 7.02.2023 at 09:26, Sander Apweiler wrote:
> Dear Krzysztof,
> we have problems with a slow web UI and crashes of the endpoints. We got
> a lot of feedback from users that the web UI is quite slow, especially
> if they want to invite multiple people or just accept an invitation
> (the spinning wheel shows for more than two minutes).
> When we delete five registrations, the progress bar goes to ~95%,
> blinks and it takes two to three minutes to finish the deletion. If we
> want to delete ten registrations, the risk is high that the console
> endpoint crashes and we need to restart Unity.
> Switching the conflict resolution of an attribute statement from skip
> to merge took two minutes this morning.
>
> I'm pretty sure that our large number of users (14k+) is one of the
> reasons for this. It seems that the server itself is not under load. It
> has a load of 0.3 with 4 cores. Unity is allowed to use 8 GB RAM but the
> whole server uses just 5.4 GB at the moment.
>
> We already increased the number of workers to 32. Do you have some
> hints on how we can get better performance?
It is hard to say, and most likely profiling will be needed to identify
the root cause.
Before that, can you be more specific about what you mean by "endpoint
crashes"? Are there any exceptions in the logs? That might be very helpful.
Generally, many aspects influence application performance, not only
memory and CPU/threads. It might also be related to I/O (e.g. excessive
logging at DEBUG/TRACE level, or RDBMS access, e.g. too few
connections). There might be spikes in memory load which you won't
observe with OS tools; you need an APM for that, which I'd suggest
anyway for any bigger production instance. Then you will be able to
check detailed memory usage stats over time (if the JVM runs close to
its memory limit, GC kicks in and the app becomes very slow) and thread
utilization (there are a few thread pools).
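Until an APM is in place, a rough first look at GC pressure is possible
with the JDK's own jstat tool. The sketch below is only an illustration
and not a replacement for an APM: it assumes you have captured the text
output of `jstat -gcutil <pid> <interval>` into a string, that the
header contains the usual `O` (old generation, % used) and `FGC` (full
GC count) columns, and the 90% threshold and the function names are
made up for the example.

```python
def parse_gcutil(text):
    """Parse captured `jstat -gcutil` output into a list of dicts keyed by column name."""
    rows = [line.split() for line in text.strip().splitlines() if line.strip()]
    header, samples = rows[0], rows[1:]
    parsed = []
    for row in samples:
        values = []
        for cell in row:
            # jstat prints "-" for unavailable columns and may use a decimal
            # comma depending on the locale.
            values.append(None if cell == "-" else float(cell.replace(",", ".")))
        parsed.append(dict(zip(header, values)))
    return parsed


def memory_pressure(samples, old_gen_pct=90.0):
    """Return samples where the old generation is nearly full AND a full GC
    ran since the previous sample (illustrative threshold, not a rule)."""
    flagged = []
    for prev, cur in zip(samples, samples[1:]):
        if cur["O"] >= old_gen_pct and cur["FGC"] > prev["FGC"]:
            flagged.append(cur)
    return flagged
```

If `memory_pressure(parse_gcutil(captured_text))` keeps returning
samples while the UI is slow, that points towards the "JVM close to its
memory limit" scenario; an empty result suggests looking at I/O or the
thread pools instead.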
In general, my approach to performance is to first find a reproducible
case which is slow, then run it in some isolation (simulated on a
separate server, or even on prod in off-peak hours) with some extra
logging turned on, find which operations are slow (a gap in the logs or
a long reported operation), and proceed from there.
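The "gap in logs" step can be sketched as a small script. This is a
minimal illustration under assumptions: it expects each log entry to
start with a timestamp like `2023-02-07T09:26:01,123` (adjust the
pattern to your actual log layout), and the 5 second threshold and the
function name `find_gaps` are arbitrary choices for the example.

```python
import re
from datetime import datetime, timedelta

# Assumed timestamp prefix: "YYYY-MM-DD" + "T" or space + "HH:MM:SS" + "," or "." + millis.
TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2})[T ](\d{2}:\d{2}:\d{2})[,.](\d{3})")


def find_gaps(lines, threshold=timedelta(seconds=5)):
    """Return (gap, line_before, line_after) for consecutive timestamped
    lines that are at least `threshold` apart."""
    gaps = []
    prev_ts = prev_line = None
    for line in lines:
        match = TS_RE.match(line)
        if not match:
            continue  # e.g. stack trace continuation lines without a timestamp
        date_part, time_part, millis = match.groups()
        ts = datetime.strptime(f"{date_part} {time_part}", "%Y-%m-%d %H:%M:%S")
        ts += timedelta(milliseconds=int(millis))
        if prev_ts is not None and ts - prev_ts >= threshold:
            gaps.append((ts - prev_ts, prev_line.rstrip(), line.rstrip()))
        prev_ts, prev_line = ts, line
    return gaps
```

Running it over the log of one reproduced slow operation, for example
`find_gaps(open("unity-server.log"))` (the file name here is just a
placeholder), lists the entries between which time was lost; the line
before each gap is usually the best starting point for further digging.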
HTH,
Krzysztof