Your browser was unable to load all of the resources. They may have been blocked by your firewall, proxy or browser configuration.
Press Ctrl+F5 or Ctrl+Shift+R to have your browser try again.

Socket read timeout #3624

jossy ·
Currently we experience a lot of socket read timeout exceptions during svn checkouts:

22:44:50,746 ERROR - Step 'master>checkout>checkout-svn-jloadtrace-plugins' is failed.
com.caucho.hessian.client.HessianConnectionException: 500: java.net.SocketTimeoutException: Read timed out
at com.caucho.hessian.client.HessianURLConnection.sendRequest(HessianURLConnection.java:165)
at com.caucho.hessian.client.HessianProxy.sendRequest(HessianProxy.java:300)
at com.caucho.hessian.client.HessianProxy.invoke(HessianProxy.java:171)
at com.sun.proxy.$Proxy24.saveRepositoryRuntimeCustomData(Unknown Source)
at com.pmease.quickbuild.plugin.scm.svn.SvnRepository.checkoutByRevision(SvnRepository.java:510)
at com.pmease.quickbuild.plugin.scm.svn.SvnRepository$$EnhancerByCGLIB$$7253f752.CGLIB$checkoutByRevision$4(<generated>)
at com.pmease.quickbuild.plugin.scm.svn.SvnRepository$$EnhancerByCGLIB$$7253f752$$FastClassByCGLIB$$1a579870.invoke(<generated>)
at net.sf.cglib.proxy.MethodProxy.invokeSuper(MethodProxy.java:215)
at com.pmease.quickbuild.DefaultScriptEngine$Interpolator.intercept(DefaultScriptEngine.java:261)
at com.pmease.quickbuild.plugin.scm.svn.SvnRepository$$EnhancerByCGLIB$$7253f752.checkoutByRevision(<generated>)
at com.pmease.quickbuild.plugin.scm.svn.SvnRepository.checkoutByRevision(SvnRepository.java:46)
at com.pmease.quickbuild.plugin.scm.svn.SvnRepository$$EnhancerByCGLIB$$7253f752.CGLIB$checkoutByRevision$5(<generated>)
at com.pmease.quickbuild.plugin.scm.svn.SvnRepository$$EnhancerByCGLIB$$7253f752$$FastClassByCGLIB$$1a579870.invoke(<generated>)
at net.sf.cglib.proxy.MethodProxy.invokeSuper(MethodProxy.java:215)
at com.pmease.quickbuild.DefaultScriptEngine$Interpolator.intercept(DefaultScriptEngine.java:261)
at com.pmease.quickbuild.plugin.scm.svn.SvnRepository$$EnhancerByCGLIB$$7253f752.checkoutByRevision(<generated>)
at com.pmease.quickbuild.repositorysupport.Repository.checkout(Repository.java:215)
at com.pmease.quickbuild.plugin.scm.svn.SvnRepository$$EnhancerByCGLIB$$7253f752.CGLIB$checkout$74(<generated>)
at com.pmease.quickbuild.plugin.scm.svn.SvnRepository$$EnhancerByCGLIB$$7253f752$$FastClassByCGLIB$$1a579870.invoke(<generated>)
at net.sf.cglib.proxy.MethodProxy.invokeSuper(MethodProxy.java:215)
at com.pmease.quickbuild.DefaultScriptEngine$Interpolator.intercept(DefaultScriptEngine.java:261)
at com.pmease.quickbuild.plugin.scm.svn.SvnRepository$$EnhancerByCGLIB$$7253f752.checkout(<generated>)
at com.pmease.quickbuild.repositorysupport.CheckoutStep.run(CheckoutStep.java:71)
at com.pmease.quickbuild.repositorysupport.CheckoutStep$$EnhancerByCGLIB$$3a4ff3b9.CGLIB$run$0(<generated>)
at com.pmease.quickbuild.repositorysupport.CheckoutStep$$EnhancerByCGLIB$$3a4ff3b9$$FastClassByCGLIB$$b8011158.invoke(<generated>)
at net.sf.cglib.proxy.MethodProxy.invokeSuper(MethodProxy.java:215)
at com.pmease.quickbuild.DefaultScriptEngine$Interpolator.intercept(DefaultScriptEngine.java:261)
at com.pmease.quickbuild.repositorysupport.CheckoutStep$$EnhancerByCGLIB$$3a4ff3b9.run(<generated>)
at com.pmease.quickbuild.stepsupport.Step.execute(Step.java:549)
at com.pmease.quickbuild.stepsupport.StepExecutionJob.executeStepAwareJob(StepExecutionJob.java:31)
at com.pmease.quickbuild.stepsupport.StepAwareJob.executeBuildAwareJob(StepAwareJob.java:45)
at com.pmease.quickbuild.BuildAwareJob.execute(BuildAwareJob.java:60)
at com.pmease.quickbuild.grid.GridJob.run(GridJob.java:106)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.SocketTimeoutException: Read timed out
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at sun.net.www.protocol.http.HttpURLConnection$10.run(HttpURLConnection.java:1890)
at sun.net.www.protocol.http.HttpURLConnection$10.run(HttpURLConnection.java:1885)
at java.security.AccessController.doPrivileged(Native Method)
at sun.net.www.protocol.http.HttpURLConnection.getChainedException(HttpURLConnection.java:1884)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1457)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1441)
at com.caucho.hessian.client.HessianURLConnection.sendRequest(HessianURLConnection.java:145)
... 37 more
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:170)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:704)
at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:647)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1536)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1441)
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
at com.caucho.hessian.client.HessianURLConnection.sendRequest(HessianURLConnection.java:132)
... 37 more


We do not really know why this happens quite often from different build nodes. Maybe you have a clue what might be going on there.

Alex
  • replies 7
  • views 5741
  • stars 0
robinshen ADMIN ·
There are three reasons for this:
1. QB server runs out of database connections, and agent times out while waiting for available connections. In this case, you may need to increase the database connection
2. QB server is under heavy load due to many builds running
3. There are intermittent network connection issue from agent to server

For reason 1&2, this issue should happen while system is busy, and if system is really too busy, you may need to split QB into multiple instances.
jossy ·
Hi Robin,

We had some issues with the database connection limits on db server side which have been resolved for about 1 week now.

We still have some issues with read timeouts which seem to be weird. On some nodes every night the live log cleanup fails.

[DefaultQuartzScheduler_Worker-44] ERROR com.pmease.quickbuild.log.DefaultLivelogManager - Error cleaning up livelog entries.
com.caucho.hessian.client.HessianConnectionException: 500: java.net.SocketTimeoutException: Read timed out
at com.caucho.hessian.client.HessianURLConnection.sendRequest(HessianURLConnection.java:165)
at com.caucho.hessian.client.HessianProxy.sendRequest(HessianProxy.java:300)
at com.caucho.hessian.client.HessianProxy.invoke(HessianProxy.java:171)
at com.sun.proxy.$Proxy24.queryUniqueResult(Unknown Source)
at com.pmease.quickbuild.persistence.AgentQueryCarrier.invoke(AgentQueryCarrier.java:59)
at com.sun.proxy.$Proxy23.uniqueResult(Unknown Source)
at com.pmease.quickbuild.entitymanager.impl.DefaultBuildManager.getStatus(DefaultBuildManager.java:458)
at com.pmease.quickbuild.persistence.SessionInterceptor.invoke(SessionInterceptor.java:49)
at com.pmease.quickbuild.log.DefaultLivelogManager.execute(DefaultLivelogManager.java:68)
at com.pmease.quickbuild.taskschedule.DefaultTaskScheduler$HelperTask.execute(DefaultTaskScheduler.java:103)
at org.quartz.core.JobRunShell.run(JobRunShell.java:216)
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:549)
Caused by: java.net.SocketTimeoutException: Read timed out
at sun.reflect.GeneratedConstructorAccessor2126.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at sun.net.www.protocol.http.HttpURLConnection$10.run(HttpURLConnection.java:1890)
at sun.net.www.protocol.http.HttpURLConnection$10.run(HttpURLConnection.java:1885)
at java.security.AccessController.doPrivileged(Native Method)
at sun.net.www.protocol.http.HttpURLConnection.getChainedException(HttpURLConnection.java:1884)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1457)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1441)
at com.caucho.hessian.client.HessianURLConnection.sendRequest(HessianURLConnection.java:145)
... 11 more
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:170)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:704)
at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:647)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1536)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1441)
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
at com.caucho.hessian.client.HessianURLConnection.sendRequest(HessianURLConnection.java:132)
... 11 more


This happens for about 2 weeks now. Every night at around 2:45 a.m. The server load did not increase significantly. Any additional ideas? How long is the read timeout?
robinshen ADMIN ·
The read timeout is 5min. The live log cleaning up needs to query database, and it will wait for free connections if existing connections have been used up. This might happen if you have many build agents. Do you add a lot more agents recently?
jossy ·
I just looked at the db pool usage of the last 6 hours and the maximum was 44% (22 of 50 connections used). We also saw that the QB DB backup runs from 1:00 am to around 3:00 am. Now I disabled the audit log backup and the duration dropped by 30 minutes. I'm a bit puzzled why it only uses 22 connections at max. Our second server instance used 100% maximum during this timeframe and has even much more agents connected and works fine.
I also checked if the database limits the connections, but it doesn't.
robinshen ADMIN ·
I just looked at the db pool usage of the last 6 hours and the maximum was 44% (22 of 50 connections used)

Is this being checked at database side, or at QB server side by enabling the connection pool log as below in "conf/log4j.properties"?
log4j.logger.com.mchange=TRACE
jossy ·
Obviously I made a mistake with the timeframe. The connection pool was fully used up with 50 connections.
How many connections is the DB backup using?

Does the backup lock the table which is being exported?
robinshen ADMIN ·
Database backup uses only one connection, but as all rows from all tables will be queried, it may affect database performance.