Help with debugging server slowdowns #4579

tomz ·

@robinshen

We have experienced extreme QuickBuild server performance issues twice since upgrading from 13.0.31 to 13.0.46. Restarting the QuickBuild service resolved the problem both times, but we are at a point in the project where unscheduled restarts are very costly. We could use some help debugging this if it happens again. Here are the details as best I can describe them.

  • The web interface slows down considerably, and many queued jobs get stuck in "checking build condition". Server load is above 150 active jobs, but that is normal for us. The "checking build condition" slowdown is very odd because it happens even where the build condition is set to "always run".
  • The first instance of the problem seemed to coincide with a specific server log WARNING related to a BuildID REST request we were making from a local software tool. We have since fixed the tool and the WARNING went away, but yesterday the problem happened again, this time with no WARNINGs or ERRORs in the server log.
  • During the second instance we checked the top command on the server, and the mysql instance appeared to be using abnormally high CPU (> 10%; normally it is < 5%).

We are currently using mysql Ver 14.14 Distrib 5.6.33, for Linux (x86_64) using EditLine wrapper, and QuickBuild server version 13.0.46. If we run into this problem again, what can we do to get more detailed logs? The server logs didn't indicate anything out of the ordinary.

We appreciate your help.

TZ

robinshen ADMIN ·

Even with the "always" option, the check build condition phase still has other work to do, such as running the snapshot-taking script and cloning or pulling the repository into the workspace.

When this issue happens again, please get a stack trace of the server by running "bin/server.sh dump" while the browser is waiting for a server response. Then send logs/console.log, which should contain the stack trace.
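
If it helps, the dump step can be scripted so that several stack traces are taken a few seconds apart during the slowdown; comparing them usually makes it easier to see which threads stay stuck. The following is only a sketch: the function name `capture_dumps`, the install path in the usage note, and the count and interval defaults are assumptions, not anything QuickBuild ships. The only command it runs is the "bin/server.sh dump" mentioned above.

```python
import subprocess
import time
from pathlib import Path


def capture_dumps(server_home, count=3, interval_s=10.0,
                  run=subprocess.run, sleep=time.sleep):
    """Run `bin/server.sh dump` `count` times, pausing between runs.

    `server_home` is the QuickBuild installation directory (an assumption:
    adjust it to your own setup). `run` and `sleep` are injectable only so
    the function can be exercised without a real server.
    Returns the exit code of each dump invocation.
    """
    script = Path(server_home) / "bin" / "server.sh"
    return_codes = []
    for i in range(count):
        # Each call asks the running server to write a stack trace.
        result = run([str(script), "dump"], check=False)
        return_codes.append(result.returncode)
        if i < count - 1:
            sleep(interval_s)
    return return_codes
```

For example, `capture_dumps("/opt/quickbuild")` (a hypothetical path) would request three dumps ten seconds apart, after which logs/console.log can be sent as usual.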

drdt ·

I have also observed slowness in recent updates when there are dependency jobs. Specifically, I had a job which populated a common folder with scripts used in my other builds. It did no compilation; all it did was a Perforce checkout and a recursive copy. It only had to run a few times a week, but every one of my other jobs depended on it to make sure the common folder was up to date.

These other jobs were all on a 5-minute intermittent schedule. Dozens of instances of the common job would queue up, each taking up to thirty seconds to figure out it didn't need to do anything. In turn, all of my other jobs would hang, and it ended up taking half an hour to get through the list of jobs. This was not the case with QB v10, but it definitely was in v13 and possibly v12.

We got around this by eliminating the common folder and instead checking out the scripts into the build workspace as part of each job. But maybe I should have reported the issue, since it seems like a systemic problem.
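
As a rough sanity check on the half-hour figure above, here is a back-of-the-envelope sketch. It assumes the queued condition checks run one after another, and the count of 60 queued instances is an illustrative guess at "dozens", not a measured value.

```python
def queue_drain_minutes(queued_instances, seconds_per_check):
    """Minutes needed to work through the queue if checks run strictly in sequence."""
    return queued_instances * seconds_per_check / 60


# Illustrative numbers: 60 queued instances at the 30-second worst case.
print(queue_drain_minutes(60, 30))  # 30.0
```

Under those assumptions, a few dozen no-op dependency checks are enough to account for a delay of roughly half an hour.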