Logs on the agent:
jvm 6 | 2021-10-28 10:00:54,356 WARN - Job still exists on job node and cancel command is issued (job id: 600d77ad-a409-462d-b133-bb01d7385d8c, build id: 20835045, job node: SWDD3922:8814)...
jvm 6 | 2021-10-28 10:00:53,439 WARN - Job still exists on job node and cancel command is issued (job id: aacc3de1-a3a8-4273-988a-32f8aac2a338, build id: 20835045, job node: SWDD3922:8814)...
jvm 6 | 2021-10-28 10:01:33,113 WARN -
jvm 6 | javax.servlet.ServletException: java.lang.OutOfMemoryError: Java heap space
jvm 6 | at com.caucho.hessian.server.HessianServlet.service(HessianServlet.java:385)
jvm 6 | at com.pmease.quickbuild.grid.GridServlet.service(GridServlet.java:36)
jvm 6 | at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:812)
jvm 6 | at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1669)
jvm 6 | at net.bull.javamelody.MonitoringFilter.doFilter(MonitoringFilter.java:239)
jvm 6 | at net.bull.javamelody.MonitoringFilter.doFilter(MonitoringFilter.java:215)
jvm 6 | at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
jvm 6 | at pl.samsung.srpol.dpi.filter.RemoteUserFilter.doFilter(RemoteUserFilter.java:47)
jvm 6 | at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
jvm 6 | at com.pmease.quickbuild.Quickbuild$AddCrossDomainFilter.doFilter(Quickbuild.java:1152)
jvm 6 | at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
jvm 6 | at com.pmease.quickbuild.Quickbuild$DisableTraceFilter.doFilter(Quickbuild.java:1177)
jvm 6 | at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
jvm 6 | at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
jvm 6 | at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:221)
jvm 6 | at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
jvm 6 | at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
jvm 6 | at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
jvm 6 | at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
jvm 6 | at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
jvm 6 | at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
jvm 6 | at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
jvm 6 | at org.eclipse.jetty.server.Server.handle(Server.java:499)
jvm 6 | at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:311)
jvm 6 | at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:258)
jvm 6 | at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:544)
jvm 6 | at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
jvm 6 | at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
jvm 6 | at java.lang.Thread.run(Thread.java:748)
jvm 6 | Caused by: java.lang.OutOfMemoryError: Java heap space
wrapper | Pinging the JVM took 7 seconds to respond.
jvm 6 | 2021-10-28 10:01:25,693 WARN - Job entry no longer exists at task node 'SWDD3922:8814', will cancel running job...
jvm 6 | 2021-10-28 10:01:25,693 WARN - Job entry no longer exists at task node 'SWDD3922:8814', will cancel running job...
wrapper | TERM trapped. Shutting down.
Logs on the server:
2021-10-28 18:29:58,855 [javamelody] WARN net.bull.javamelody - exception while collecting data: java.lang.OutOfMemoryError: GC overhead limit exceeded
java.lang.OutOfMemoryError: GC overhead limit exceeded
2021-10-28 18:29:58,855 [javamelody] WARN net.bull.javamelody - exception while collecting data: java.lang.OutOfMemoryError: GC overhead limit exceeded
java.lang.OutOfMemoryError: GC overhead limit exceeded
2021-10-28 18:29:58,856 [javamelody] WARN net.bull.javamelody - exception while collecting data: java.lang.OutOfMemoryError: GC overhead limit exceeded
java.lang.OutOfMemoryError: GC overhead limit exceeded
2021-10-28 18:30:20,240 [qtp1925094702-107801] ERROR com.pmease.quickbuild.grid.NodeServiceImpl - Unable to find job 'a07d9fe3-f7cc-4835-b742-a8caec4e800b' on node 'SWDD6821:8814' (Job is ever started: true).
These are the errors we see in the logs, and builds start to fail once they appear. Restarting the agent resolves the problem.
We also found a similar issue on the community forum: https://support.pmease.com/PMEase/QuickBuild/topics/4056/build-step-is-canceled-by-unknown-reason-and-the-build-status-is-failed-not-canceled;jsessionid=2F9FFF74EF6FF10D210C3E5767F8B6DE?0
However, that did not help. The problem occurs when more than 800 builds are triggered from a parallel step using the "trigger other builds" step.
We are currently using QuickBuild (QB) 8.