From: Duncan P. <du...@ex...> - 2021-03-01 15:32:49
Hi, from your responses I'm still curious about the arguments used to start the containers and set their memory allocation. I recently had the chance to debug a client's app and test this part intensively (nothing like a reproducer): dynamic memory allocation worked great, and so did recovering from crashes. There are essentially two immediate possibilities:

1) Killer query / memory leak: a container suddenly eating over 10 GB sounds like that. Nothing to do with the orchestration, just a plain old programming error, or adventurous users. Check with :latest and the logs to see whether there is a bad query or a (hopefully already fixed) memory leak.

2) ECS is stumbling when trying to reallocate memory, leading to stuck memory. Basically, the external commands to the container and its internal reporting are out of sync. To test that, run a single container locally in a minimum-memory environment and do some light browsing/editing; if you see crashes or runaway memory allocation, that's your problem. You then need to adjust the start commands (i.e. modify or override JAVA_TOOL_OPTIONS) to fit your deployment scenario. Adjusting, e.g., -XX:MaxRAMFraction from 1 to 2 should ensure that no internal/external conflicts occur. You can then continue to play with other options to tweak it further, or this might already be a suitable solution to your problem, depending on your overall load profile.

If none of that helps, I'd need to take a deeper look into your system as part of a proper consultancy arrangement; I'm very short on time at the moment.

Hope that helps
Duncan

Ceterum censeo exist-db.org esse conriganda

> On 1. Mar 2021, at 16:09, Woods, Jonathan <jon...@Va...> wrote:
>
> Hey!
>
> I didn't get that email. Not sure I get the replies.
>
> There aren't 7 instances. There are 2 instances running 7/8 tasks within ECS. Each service runs one task for the container. The containers themselves aren't dependent on each other.
> They all live within the same network (VPC), but that is it. For memory allocation, I don't have max (aka hard) limits on the containers. This allows the containers to use whatever memory they need. The instances the containers are running on are running out of memory, yes, but in theory there are 4 containers running on a 16 GB instance, and while monitoring the instances a container will start eating over 10 GB of memory, causing it to max out and then restart the container. As for storage, we are using the rexray plug-in to have consistent storage for the containers. We allocated 30 GB for each container and they aren't even close to utilizing that space.
>
> Hopefully that makes sense.
>
> Jonathan Woods
> Cloud Engineer, Cloud Services
> Information Technology | Vanderbilt University
>
> From: Winona Salesky <wsa...@gm...>
> Sent: Monday, March 1, 2021 9:05 AM
> To: Duncan Paterson <du...@ex...>
> Cc: exist-open <exi...@li...>; Woods, Jonathan <jon...@Va...>
> Subject: Re: [Exist-open] Running eXist-db Docker containers on AWS (Winona Salesky)
>
> Hi Duncan,
> Thanks for getting back to me. I missed this over the weekend, so sorry for the delay in responding. I'm looping in Jonathan Woods on this email, as he has handled the AWS installations.
> Here is my understanding of our set-up:
> 1. The 7 instances run permanently.
> 2. There are no cross-dependencies between the containers.
> Jonathan, perhaps you can answer the other questions having to do with memory?
>
> I have not yet tried running them on :latest, as I was hoping to hold out until the next official release; perhaps I will give it a try and see if it addresses any issues.
>
> Thanks for your help,
> -Winona
>
>
> On Sat, Feb 27, 2021 at 7:47 AM Duncan Paterson <du...@ex... <mailto:du...@ex...>> wrote:
> Hi Winona,
>
> I'm afraid the topology isn't quite clear to me yet from your description. Are all 7 instances supposed to run permanently, or on demand?
> Are there cross-dependencies between the containers? Do they all live on the same network? What memory arguments are you using to run the containers? What are the max and min memory allocations per container?
> Are they running out of memory due to something that happens inside eXist, or due to ECS memory reallocations?
> The logs you shared on Slack actually look more like an I/O issue with disk space; what do the logs say that makes you think it's memory-related?
>
> Have you tried switching to :latest, since many bug fixes are in 5.3.0-SNAPSHOT, to see if the problem persists there? Are you provisioning your own base images?
>
> Greetings
> Duncan
>
>
> We are moving our eXist-db v5.2 based applications to AWS using Docker and are experiencing some performance issues.
>
> We have an ECS cluster running two t3a.xlarge instances, which have 16 GB each. The ECS service/task runs on a spread deployment, which tries to load each task/container evenly between the nodes. We are running 7 eXist-db based Docker images.
>
> The apps all start out running fine, but eventually run out of memory, which can cause an unclean shutdown. This seems to happen every few days.
>
> Any help debugging our setup would be greatly appreciated.
> Thank you,
> -Winona
>
> Ceterum censeo exist-db.org <http://exist-db.org/> esse conriganda
>
> _______________________________________________
> Exist-open mailing list
> Exi...@li... <mailto:Exi...@li...>
> https://lists.sourceforge.net/lists/listinfo/exist-open
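[The soft/hard limit distinction Jonathan mentions maps onto two fields of an ECS task definition. A hedged sketch only; the field names are standard ECS parameters, but the image tag and all numbers are placeholders, not recommendations from the thread:]

```json
{
  "containerDefinitions": [
    {
      "name": "exist-app",
      "image": "existdb/existdb:latest",
      "memoryReservation": 2048,
      "memory": 3072,
      "environment": [
        {
          "name": "JAVA_TOOL_OPTIONS",
          "value": "-XX:+UseContainerSupport -XX:MaxRAMFraction=2"
        }
      ]
    }
  ]
}
```

Here "memoryReservation" is the soft limit ECS uses for task placement, while "memory" is the hard cap at which ECS kills the container. With no hard cap set, as in Jonathan's current setup, one leaking container can consume most of the 16 GB instance and starve its neighbours.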
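[For readers following the thread: the -XX:MaxRAMFraction tweak Duncan suggests works because a container-aware JVM sizes its default max heap as available RAM divided by MaxRAMFraction. A minimal sketch of that arithmetic, with illustrative values and a function name that is not from the thread:]

```python
def default_max_heap_mb(container_mem_mb: int, max_ram_fraction: int) -> float:
    """Approximate the JVM's default max heap inside a container:
    MaxHeapSize ~ available RAM / MaxRAMFraction (container-aware JDKs)."""
    return container_mem_mb / max_ram_fraction

# With MaxRAMFraction=1, the heap alone may claim the entire container
# allocation, leaving no headroom for metaspace, thread stacks, and
# native buffers, so the process gets OOM-killed and restarted.
heap_full = default_max_heap_mb(4096, 1)  # 4096.0 MB heap in a 4 GB container

# With MaxRAMFraction=2, half the container stays free for non-heap memory.
heap_half = default_max_heap_mb(4096, 2)  # 2048.0 MB heap in a 4 GB container

print(heap_full, heap_half)
```

This is why the flag change alone can stop the crash/restart loop: it reconciles what the JVM believes it may use internally with what ECS has granted the container externally.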