Investigating a problem: Docker container exiting
One of our websites keeps going down. It runs on Docker. My idea here is to describe the process of searching for the cause of the problem and solving it. I'll write as I follow the debugging process.
What I have:
- Error 500 in the browser. After running `docker-compose up --build -d`, everything works fine again.
- This is the second time it happens. I know when the site went down, because an automatic website monitor sent me an e-mail.
Context
First, we have three containers for this site: backend, frontend, and database. My guess is that the backend has the problem, because the frontend's favicon is still visible.
Steps for solving the problem
First try: using the `history` command to see if my brother restarted the container earlier. To show timestamps on the history entries, set `HISTTIMEFORMAT="%d/%m/%y %T"`, then use grep to filter for the 29th of May (a sketch follows below). Result: no entries from that day, only from today.
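A quick sketch of that check, assuming bash and that the history file actually carries timestamps (the date pattern is an example and must match the format you set):

```
# Show history entries with dates, then filter by the day of the outage
export HISTTIMEFORMAT="%d/%m/%y %T "
history | grep "29/05"
```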
Second try: running `ncdu` to see if there is some storage problem… 11 GB in use out of 20 GB available. Nearly 7 GB is used by Archiva. So disk space is not the problem.
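For reference, the kind of check I ran (the paths are examples):

```
# Interactive disk-usage breakdown, staying on one filesystem
sudo ncdu -x /
# Quick overall numbers
df -h /
```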
Third try: follow the logs for each container, starting with the backend. How to see the logs? Google… `docker logs CONTAINER` seemed useful only for the logs of the current run.

I followed a tutorial about this. It suggests:

docker inspect CONTAINER_ID
docker stats
docker stats --format "table {{.Name}}\t{{.MemUsage}}"

I can see that there is no unusually high memory usage at the moment (a one-shot check is sketched below). Let's go back to searching for stored logs.
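The one-shot check, using the documented Go-template format strings (the column choice is mine):

```
# Non-streaming memory overview for all running containers
docker stats --no-stream --format "table {{.Name}}\t{{.MemUsage}}\t{{.MemPerc}}"
```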
Searching on Google, I found where the logs are stored: `/var/lib/docker/containers/<container_id>/<container_id>-json.log`. `docker ps` shows the container IDs. The entries in the json.log carry timestamps, so maybe I can filter by the date the container stopped. Or we can use `docker logs <container_id> -t`, which prints timestamps too. A sketch of both approaches follows.
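Both approaches, assuming the backend container is the one we want; the name filter and the date are examples, not the real values:

```
# Grab the ID of the (possibly stopped) backend container
CONTAINER_ID=$(docker ps -aqf "name=backend")

# Filter the stored JSON-file log by the day the site went down (example date)
sudo grep "2023-05-29" "/var/lib/docker/containers/$CONTAINER_ID/$CONTAINER_ID-json.log"

# Same idea through the CLI, with timestamps prepended
docker logs -t "$CONTAINER_ID" 2>&1 | grep "2023-05-29"
```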
I was still unable to find the problem. But with the simple `htop` command I can see that backend.jar is consuming at least 15% of the memory. I'll rename the jar to make it easier to distinguish this application from the others.
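The same check without the interactive UI (the process name is whatever the jar is actually called on the server):

```
# Top memory consumers, then just the backend jar
ps aux --sort=-%mem | head -n 10
ps aux --sort=-%mem | grep -i "backend.jar"
```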
Second day
Today, a process was consuming 50% of the server's memory. Almost no memory available…
Docker
`docker stats` shows that the problematic container is tvgaspar_frontend.
Thinking about recent changes, the only one I can remember is adding the countdown banner. Maybe the tickTimer function is creating a memory leak.
```
tickTimer() {
  // removed: now = this.$moment()
  const now = new Date()
  // Only count down while the target date is still in the future
  if (this.parsedDate.isAfter(now)) {
    // Remaining time as a moment.js duration, split into the display fields
    const remainingTime = this.$moment.duration(this.parsedDate.diff(now));
    this.months = remainingTime.months();
    this.days = remainingTime.days();
    this.hours = remainingTime.hours();
    this.minutes = remainingTime.minutes();
    this.seconds = remainingTime.seconds();
  }
}
```
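The snippet itself only creates short-lived objects, so the leak is more likely in how the timer is scheduled. A minimal sketch of what I want to verify, assuming a Vue 2 Options API component that drives tickTimer with setInterval (the interval handle and the hook wiring are my assumption, not the actual code):

```
export default {
  data() {
    return { timerId: null, months: 0, days: 0, hours: 0, minutes: 0, seconds: 0 }
  },
  mounted() {
    // Keep the interval handle so it can be cleared later
    this.timerId = setInterval(this.tickTimer, 1000)
  },
  beforeDestroy() {
    // Without this, every mount of the banner leaves an interval running,
    // which keeps the component (and everything it references) alive
    clearInterval(this.timerId)
  },
  methods: {
    tickTimer() {
      /* as shown above */
    }
  }
}
```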
What I learned:
- The `docker stats` command, for spotting which container is using the memory.
- My guess was wrong. The problem was in the frontend, not in the backend.
The memory used by the process still seems to be slowly increasing… from 2.04% in the morning to 2.14% this afternoon. Maybe the problem is still there.
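To confirm it, I could log the container's memory percentage over time; a rough sketch (container name taken from docker stats above, interval chosen arbitrarily):

```
# Append a timestamped memory sample for the frontend container every 5 minutes
while true; do
  echo "$(date -Is) $(docker stats --no-stream --format '{{.MemPerc}}' tvgaspar_frontend)" >> frontend-mem.log
  sleep 300
done
```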