500 Socket Read Timeout errors #1199

ALG4 ·
Hi,

Occasionally, I start getting frequent "Read Timeout" errors when a build finishes, so any artifacts that are supposed to be copied back from an agent fail to transfer. Restarting the QB server usually helps, but the problem always comes back. I'm using 3.0.0 and plan to upgrade to 3.0.17 today. Looking through the fixes, I didn't see anything related to this, but it's time to upgrade anyway.

Thanks.

500: java.net.SocketTimeoutException: Read timed out
caused by: Read timed out
caused by: Read timed out
robinshen ADMIN ·
Please post full stack trace of this exception. Thanks!
ALG4 ·
Here's the trace:
20:34:39,575 [master>GCOV>Publish Test GCOV Report@rhel5-gcov.seattlead.bstep.us:8811] ERROR - Step 'Publish Test GCOV Report' is failed.
com.caucho.hessian.client.HessianConnectionException: 500: java.net.SocketTimeoutException: Read timed out
at com.caucho.hessian.client.HessianProxy.invoke(HessianProxy.java:198)
at $Proxy28.copyFilesFrom(Unknown Source)
at com.pmease.quickbuild.grid.GridImpl.transferFiles(GridImpl.java:161)
at com.pmease.quickbuild.model.Build.publish(Build.java:1028)
at com.pmease.quickbuild.plugin.htmlreport.HtmlReportPublishStep.run(HtmlReportPublishStep.java:96)
at com.pmease.quickbuild.plugin.htmlreport.HtmlReportPublishStep$$EnhancerByCGLIB$$ca1ac41a.CGLIB$run$6(<generated>)
at com.pmease.quickbuild.plugin.htmlreport.HtmlReportPublishStep$$EnhancerByCGLIB$$ca1ac41a$$FastClassByCGLIB$$f6f56f44.invoke(<generated>)
at net.sf.cglib.proxy.MethodProxy.invokeSuper(MethodProxy.java:215)
at com.pmease.quickbuild.DefaultScriptEngine$Interpolator.intercept(DefaultScriptEngine.java:271)
at com.pmease.quickbuild.plugin.htmlreport.HtmlReportPublishStep$$EnhancerByCGLIB$$ca1ac41a.run(<generated>)
at com.pmease.quickbuild.stepsupport.Step.execute(Step.java:449)
at com.pmease.quickbuild.stepsupport.StepJob.execute(StepJob.java:34)
at com.pmease.quickbuild.grid.GridJob.run(GridJob.java:120)
at java.lang.Thread.run(Thread.java:595)
Caused by: java.net.SocketTimeoutException: Read timed out
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:494)
at sun.net.www.protocol.http.HttpURLConnection$6.run(HttpURLConnection.java:1225)
at java.security.AccessController.doPrivileged(Native Method)
at sun.net.www.protocol.http.HttpURLConnection.getChainedException(HttpURLConnection.java:1219)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:906)
at com.caucho.hessian.client.HessianProxy.invoke(HessianProxy.java:180)
... 13 more
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:129)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
at java.io.BufferedInputStream.read(BufferedInputStream.java:313)
at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:681)
at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:626)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:957)
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:367)
at com.caucho.hessian.client.HessianProxy.invoke(HessianProxy.java:169)
... 13 more
robinshen ADMIN ·
It looks like the HTML publish step ran long enough to hit the network timeout. The socket read timeout in QuickBuild is currently set to 30 minutes, which should be sufficient in most cases. However, if the directory being published contains a lot of files, and there are many HTML publish steps in a single build, the total publish time may exceed 30 minutes.
To verify, please check the log entry just before the stack trace to see whether it was logged 30 minutes earlier.
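For anyone who wants to script that check, here is a minimal sketch rather than an official QuickBuild tool. It assumes every log entry starts with a time stamp in the same "HH:MM:SS,mmm" form as the trace above, and that the failing entry contains " ERROR - ". The function names and the sample INFO line are hypothetical.

```python
import re
from datetime import datetime, timedelta

# Assumed entry format, taken from the trace in this thread:
# "HH:MM:SS,mmm [thread] LEVEL - message"
TIMESTAMP = re.compile(r"^(\d{2}:\d{2}:\d{2},\d{3})\s")


def parse_time(line):
    """Return the time-of-day stamp at the start of a log line, or None."""
    match = TIMESTAMP.match(line)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%H:%M:%S,%f")


def gap_before_error(lines, marker=" ERROR - "):
    """Time between the first entry containing `marker` and the entry before it."""
    previous = None
    for line in lines:
        stamp = parse_time(line)
        if stamp is None:
            continue  # stack frames and "Caused by" lines carry no time stamp
        if marker in line and previous is not None:
            gap = stamp - previous
            if gap < timedelta(0):
                gap += timedelta(days=1)  # the log rolled past midnight
            return gap
        previous = stamp
    return None


if __name__ == "__main__":
    sample = [
        # Hypothetical preceding entry, invented for illustration only.
        "20:04:39,570 [master>GCOV>Publish Test GCOV Report] INFO - Publishing report...",
        "20:34:39,575 [master>GCOV>Publish Test GCOV Report] ERROR - Step 'Publish Test GCOV Report' is failed.",
        "com.caucho.hessian.client.HessianConnectionException: 500: java.net.SocketTimeoutException: Read timed out",
    ]
    print(gap_before_error(sample))
```

A gap of almost exactly 30 minutes would match the read timeout described above. A much shorter gap would suggest the failure has a different cause.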
ALG4 ·
Yes, it did happen exactly 30 min later. However, the odd thing is that all the html files actually did seem to get published anyway.

There were not that many files; in fact, the step before it publishes even more html files, and it only took a total of 14min to complete. So I'm confused about why it always happens with this step only. I've occasionally seen this behavior on different machines under a different build configuration: they seem to just get stuck when transferring files back. Everything is on a gigabit network, and other steps seem to be fine with transfer speeds.

This is on a RHEL5 virtual machine; however, I've seen this happen before on physical machines running AIX or HPUX.
robinshen ADMIN ·
QuickBuild currently has a bug that miscalculates the HTML publishing timeout when multiple HTML publish steps are involved. We will fix it in the next patch release (3.0.18).
ALG4 ·
I'm seeing this same issue again, a timeout after 30min. I'm on the current version, 3.1.37.
robinshen ADMIN ·
The read timeout fix requires all agents to be manually reinstalled. To do so, please follow step 5 of the procedure below:
http://wiki.pmease.com/display/QB31/Upg ... +And+Agent