Your browser was unable to load all of the resources. They may have been blocked by your firewall, proxy or browser configuration.
Press Ctrl+F5 or Ctrl+Shift+R to have your browser try again.

java.net.SocketException: Unexpected end of file from server #1430

puneetg ·
Hi Robin

we have 116 windows/mac/linux agents in our setup. Once we migrated to QB3.1.37 ( from 3.1.12 ) , I see following error atleast 3-4 times day, retrying the build fixes the issue, Could you help me understand what maybe causing this?

Thanks
-Puneet

01:18:03,763 [master>checkout.code.cws>checkout.tools.and.setup@bammatrix02.efi.internal:8811] ERROR - Step 'master>checkout.code.cws>checkout.tools.and.setup' is failed.
java.lang.RuntimeException: Error executing grid job 'master>checkout.code.cws>checkout.tools.and.setup>setup.tools'
at com.pmease.quickbuild.grid.GridTaskFuture.get(GridTaskFuture.java:63)
at com.pmease.quickbuild.grid.GridTaskFuture.get(GridTaskFuture.java:85)
at com.pmease.quickbuild.stepsupport.SequentialStep.triggerChildren(SequentialStep.java:46)
at com.pmease.quickbuild.stepsupport.CompositeStep.run(CompositeStep.java:95)
at com.pmease.quickbuild.stepsupport.Step.execute(Step.java:442)
at com.pmease.quickbuild.stepsupport.StepJob.execute(StepJob.java:42)
at com.pmease.quickbuild.grid.GridJob.run(GridJob.java:122)
at java.lang.Thread.run(Thread.java:637)
Caused by: com.caucho.hessian.client.HessianConnectionException: 500: java.net.SocketException: Unexpected end of file from server
at com.caucho.hessian.client.HessianProxy.invoke(HessianProxy.java:195)
at $Proxy11.updateStepRuntime(Unknown Source)
at com.pmease.quickbuild.stepsupport.Step.updateRuntime(Step.java:688)
at com.pmease.quickbuild.plugin.basis.CommandBuildStep$$EnhancerByCGLIB$$601544e7.CGLIB$updateRuntime$74(<generated>)
at com.pmease.quickbuild.plugin.basis.CommandBuildStep$$EnhancerByCGLIB$$601544e7$$FastClassByCGLIB$$c42739d3.invoke(<generated>)
at net.sf.cglib.proxy.MethodProxy.invokeSuper(MethodProxy.java:215)
at com.pmease.quickbuild.DefaultScriptEngine$Interpolator.intercept(DefaultScriptEngine.java:273)
at com.pmease.quickbuild.plugin.basis.CommandBuildStep$$EnhancerByCGLIB$$601544e7.updateRuntime(<generated>)
at com.pmease.quickbuild.stepsupport.StepJob.beforeExecute(StepJob.java:32)
at com.pmease.quickbuild.grid.GridJob.run(GridJob.java:113)
... 1 more
Caused by: java.net.SocketException: Unexpected end of file from server
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at sun.net.www.protocol.http.HttpURLConnection$6.run(HttpURLConnection.java:1368)
at java.security.AccessController.doPrivileged(Native Method)
at sun.net.www.protocol.http.HttpURLConnection.getChainedException(HttpURLConnection.java:1362)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1016)
at com.caucho.hessian.client.HessianProxy.invoke(HessianProxy.java:177)
... 10 more
Caused by: java.net.SocketException: Unexpected end of file from server
at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:769)
at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:632)
at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:652)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1072)
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:373)
at com.caucho.hessian.client.HessianProxy.invoke(HessianProxy.java:166)
... 10 more
  • replies 20
  • views 10301
  • stars 0
robinshen ADMIN ·
Hi Puneet,

Compared to 3.1.12, 3.1.37 does not have any significant changes affecting this. This error means that the server might be overloaded if network connection is not a problem. How many builds are you running a day and how long it takes approximately? Also please make sure that all builds are running on agents to reduce the server load.

Regards
Robin
puneetg ·
Hi Robin

In general 10-20 builds are running at any time, we see this error intermittently.

I have also noticed if a user clicks on "Grid" entire system slows down, Is it possible to add a option to remove Grid from a user group? This way we can side step this issue.

Also our current host is Windows 2008 Server, 2 CPU and 4 GB, Do you have any suggestions in term of No of CPU and memory or OS? Is Linux better or 4CPU/8GB will make a difference ?

Thanks
-Puneet
robinshen ADMIN ·
Hi Puneet,

We are working to re-design the grid system in order to make it easier to configure and avoid the slow response when grid tab is clicked. Since this is not trivial, the work will be delivered in QB4. For your current setup, I think your current server is not power enough to serve more than one hundred agents. Upgrading to 4CPU/8G will definitely improve the situation, Windows/Linux does not make too many difference according to our test. Please also make sure that the server at least has a giga bit network connection.

Regards
Robin
puneetg ·
Hi Robin

thanks, I will upgrade the server VM to a better config and update you. Our VMWare and Quickbuild infrastucture is on Giga network.

Is it possible to add a preference item for user group to remove "Grid" in QB 3.1 latest build ?

thanks
-Puneet
robinshen ADMIN ·
Hi Puneet,

Instead of removing the grid tab, we will modify the latest 3.1 code to not poll load/jobs for each node, so that grid tab will not be slow down even when there are many nodes connected. I've filed an improvement request for this:
http://track.pmease.com/browse/QB-910

Please watch on this issue to get notified when it is done.

Regards
Robin
puneetg ·
Thanks very much.

-Puneet
puneetg ·
Robin

We have upgraded to 3.1.40 and Grid is very quick to respond with 118 build nodes. Thanks a lot for the fix.

As we have very large build chains and Continues Integration for SVN and clearcase repositories (we donot use QB clearcase modules but use custom linux scripts) setup every 10mins and during the that time server is busy for several minutes. We have seen some build errors are reported during this time. While we are looking into other solutions like installing a dedicated QB server just for changeset detection, I hope you can look into the issue as well in QB4.

thanks
-Puneet
robinshen ADMIN ·
Hi Puneet,

Since in QB3, all change detection happens on server even if steps are set to run on build agents, this might put a heavy load if many projects are configured with frequent schedule. In QB4, we will investigate if it is possible to shift the change detection phase into build agents. For now, if this is a problem, I suggest to queue the change detection by configuring a less value for the queue.

Regards
Robin
puneetg ·
Hi Robin

We are getting these errors more and more with build failures. I checked network connectivity and that seems to be fine.

Whats does "Error caching build status in grid node '{bammatrix03.efi.internal:8811}" mean?

Thanks
-Puneet

In the server log we see following:

2011-06-19 23:44:33,986 [Thread-1453] WARN com.pmease.quickbuild.DefaultBuildEngine - Error caching build status in grid node '{bammatrix03.efi.internal:8811}'.
com.caucho.hessian.client.HessianConnectionException: 500: java.net.SocketException: Unexpected end of file from server
at com.caucho.hessian.client.HessianProxy.invoke(HessianProxy.java:195)
at $Proxy53.cacheBuildStatus(Unknown Source)
at com.pmease.quickbuild.DefaultBuildEngine.cacheBuildStatusInGrid(DefaultBuildEngine.java:1275)
at com.pmease.quickbuild.DefaultBuildEngine.process(DefaultBuildEngine.java:286)
at com.pmease.quickbuild.DefaultBuildEngine.access$1(DefaultBuildEngine.java:242)
at com.pmease.quickbuild.DefaultBuildEngine$2.run(DefaultBuildEngine.java:753)
at java.lang.Thread.run(Unknown Source)
Caused by: java.net.SocketException: Unexpected end of file from server
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
at java.lang.reflect.Constructor.newInstance(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection$6.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at sun.net.www.protocol.http.HttpURLConnection.getChainedException(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
at com.caucho.hessian.client.HessianProxy.invoke(HessianProxy.java:177)
... 6 more
Caused by: java.net.SocketException: Unexpected end of file from server
at sun.net.www.http.HttpClient.parseHTTPHeader(Unknown Source)
at sun.net.www.http.HttpClient.parseHTTP(Unknown Source)
at sun.net.www.http.HttpClient.parseHTTPHeader(Unknown Source)
at sun.net.www.http.HttpClient.parseHTTP(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
at java.net.HttpURLConnection.getResponseCode(Unknown Source)
at com.caucho.hessian.client.HessianProxy.invoke(HessianProxy.java:166)
... 6 more
2011-06-19 23:44:34,283 [Thread-1452] WARN com.pmease.quickbuild.DefaultBuildEngine - Error caching build status in grid node '{bawqbvm02:8811}'.
com.caucho.hessian.client.HessianConnectionException: 500: java.net.SocketException: Unexpected end of file from server
at com.caucho.hessian.client.HessianProxy.invoke(HessianProxy.java:195)
at $Proxy53.cacheBuildStatus(Unknown Source)
at com.pmease.quickbuild.DefaultBuildEngine.cacheBuildStatusInGrid(DefaultBuildEngine.java:1275)
at com.pmease.quickbuild.DefaultBuildEngine.process(DefaultBuildEngine.java:286)
at com.pmease.quickbuild.DefaultBuildEngine.access$1(DefaultBuildEngine.java:242)
at com.pmease.quickbuild.DefaultBuildEngine$2.run(DefaultBuildEngine.java:753)
at java.lang.Thread.run(Unknown Source)
Caused by: java.net.SocketException: Unexpected end of file from server
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
at java.lang.reflect.Constructor.newInstance(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection$6.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at sun.net.www.protocol.http.HttpURLConnection.getChainedException(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
at com.caucho.hessian.client.HessianProxy.invoke(HessianProxy.java:177)
... 6 more
robinshen ADMIN ·
QB caches build status into each node in the grid to avoid frequent status polling in "trigger build" or "quickbuild repository checkout" step. The build will not be affected except that the step will take a bit longer. This seems to me much like a grid overload problem. When this error happens, please check system log ("logs/quickbuild.log") of agent "bammatrix02.efi.internal:8811" to see if there is any error being printed.
puneetg ·
Robin

I saw one failure today while checking out code from a svn repository:

Entire agent log is attached. Is this due to heavy load? I see only 20 builds executing that time. Also Will open user browser sessions impact server performance?

Thanks
-Puneet

======
jvm 1 | 2011-06-22 23:23:52,677 INFO - Finished serving files.
jvm 1 | 2011-06-22 23:24:40,334 INFO - Serving files (base dir: C:\buildagent\temp\06e77a76-fed2-4091-a7b8-fd1195d30114, file patterns: *)
jvm 1 | 2011-06-22 23:24:40,334 INFO - Finished serving files.
jvm 1 | 2011-06-22 23:24:52,522 INFO - Serving files (base dir: C:\buildagent\temp\98c16c9d-af4c-46f1-b6c9-ffd7e8cb379c, file patterns: *)
jvm 1 | 2011-06-22 23:24:52,522 INFO - Finished serving files.
jvm 1 | 2011-06-22 23:25:14,991 INFO - Serving files (base dir: C:\buildagent\temp\0020b957-b9a3-4814-ab29-9bc043dcc4ff, file patterns: *)
jvm 1 | 2011-06-22 23:25:15,007 INFO - Finished serving files.
jvm 1 | 2011-06-22 23:25:30,960 INFO - Serving files (base dir: C:\buildagent\temp\ca78ccc3-f59f-4ed0-baa5-2d50c5c99466, file patterns: *)
jvm 1 | 2011-06-22 23:25:30,976 INFO - Finished serving files.
jvm 1 | 2011-06-22 23:33:32,360 ERROR - Error connecting server.
jvm 1 | com.caucho.hessian.client.HessianConnectionException: 500: java.net.SocketException: Unexpected end of file from server
jvm 1 | at com.caucho.hessian.client.HessianProxy.invoke(HessianProxy.java:195)
jvm 1 | at $Proxy14.connect(Unknown Source)
jvm 1 | at com.pmease.quickbuild.grid.AgentConnectivityTask.run(AgentConnectivityTask.java:55)
jvm 1 | at java.lang.Thread.run(Unknown Source)
jvm 1 | Caused by: java.net.SocketException: Unexpected end of file from server
jvm 1 | at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
jvm 1 | at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
jvm 1 | at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
jvm 1 | at java.lang.reflect.Constructor.newInstance(Unknown Source)
jvm 1 | at sun.net.www.protocol.http.HttpURLConnection$6.run(Unknown Source)
jvm 1 | at java.security.AccessController.doPrivileged(Native Method)
jvm 1 | at sun.net.www.protocol.http.HttpURLConnection.getChainedException(Unknown Source)
jvm 1 | at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
jvm 1 | at com.caucho.hessian.client.HessianProxy.invoke(HessianProxy.java:177)
jvm 1 | ... 3 more
jvm 1 | Caused by: java.net.SocketException: Unexpected end of file from server
jvm 1 | at sun.net.www.http.HttpClient.parseHTTPHeader(Unknown Source)
jvm 1 | at sun.net.www.http.HttpClient.parseHTTP(Unknown Source)
jvm 1 | at sun.net.www.http.HttpClient.parseHTTPHeader(Unknown Source)
jvm 1 | at sun.net.www.http.HttpClient.parseHTTP(Unknown Source)
jvm 1 | at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
jvm 1 | at java.net.HttpURLConnection.getResponseCode(Unknown Source)
jvm 1 | at com.caucho.hessian.client.HessianProxy.invoke(HessianProxy.java:166)
jvm 1 | ... 3 more




2011-06-23 04:04:12,313 ERROR - Error monitoring job 'master>checkout.code.cws>checkout.cws>cws.checkout.clientapp.components>checkout.clientapp.components.discovery'.
jvm 1 | com.caucho.hessian.client.HessianConnectionException: 500: java.net.SocketException: Unexpected end of file from server
jvm 1 | at com.caucho.hessian.client.HessianProxy.invoke(HessianProxy.java:195)
jvm 1 | at $Proxy11.queryUniqueResult(Unknown Source)
jvm 1 | at com.pmease.quickbuild.persistence.AgentQueryCarrier.invoke(AgentQueryCarrier.java:60)
jvm 1 | at $Proxy13.uniqueResult(Unknown Source)
jvm 1 | at com.pmease.quickbuild.entitymanager.impl.DefaultBuildManager.getStatus(DefaultBuildManager.java:382)
jvm 1 | at com.pmease.quickbuild.entitymanager.impl.DefaultBuildManager$$EnhancerByGuice$$57f8d7d.CGLIB$getStatus$10(<generated>)
jvm 1 | at com.pmease.quickbuild.entitymanager.impl.DefaultBuildManager$$EnhancerByGuice$$57f8d7d$$FastClassByGuice$$5e93e908.invoke(<generated>)
jvm 1 | at com.google.inject.cglib.proxy.MethodProxy.invokeSuper(MethodProxy.java:187)
jvm 1 | at com.google.inject.InterceptorStackCallback$InterceptedMethodInvocation.proceed(InterceptorStackCallback.java:66)
jvm 1 | at com.pmease.quickbuild.persistence.TransactionInterceptor.invoke(TransactionInterceptor.java:49)
jvm 1 | at com.google.inject.InterceptorStackCallback$InterceptedMethodInvocation.proceed(InterceptorStackCallback.java:66)
jvm 1 | at com.google.inject.InterceptorStackCallback.intercept(InterceptorStackCallback.java:45)
jvm 1 | at com.pmease.quickbuild.entitymanager.impl.DefaultBuildManager$$EnhancerByGuice$$57f8d7d.getStatus(<generated>)
jvm 1 | at com.pmease.quickbuild.grid.GridJob$WatchDog.run(GridJob.java:254)
jvm 1 | at java.lang.Thread.run(Unknown Source)
jvm 1 | Caused by: java.net.SocketException: Unexpected end of file from server
jvm 1 | at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
jvm 1 | at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
jvm 1 | at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
jvm 1 | at java.lang.reflect.Constructor.newInstance(Unknown Source)
jvm 1 | at sun.net.www.protocol.http.HttpURLConnection$6.run(Unknown Source)
jvm 1 | at java.security.AccessController.doPrivileged(Native Method)
jvm 1 | at sun.net.www.protocol.http.HttpURLConnection.getChainedException(Unknown Source)
jvm 1 | at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
jvm 1 | at com.caucho.hessian.client.HessianProxy.invoke(HessianProxy.java:177)
jvm 1 | ... 14 more
jvm 1 | Caused by: java.net.SocketException: Unexpected end of file from server
jvm 1 | at sun.net.www.http.HttpClient.parseHTTPHeader(Unknown Source)
jvm 1 | at sun.net.www.http.HttpClient.parseHTTP(Unknown Source)
jvm 1 | at sun.net.www.http.HttpClient.parseHTTP(Unknown Source)
jvm 1 | at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
jvm 1 | at java.net.HttpURLConnection.getResponseCode(Unknown Source)
jvm 1 | at com.caucho.hessian.client.HessianProxy.invoke(HessianProxy.java:166)
robinshen ADMIN ·
Open browser sessions will affect server if the auto-refresh interval is specified. Please turn it off to see if the situation will be better by specifying it as 0 in administration/system setting page.
puneetg ·
Robin

We have 120 users and at one time we can have max 20 open browser sessions. Do you think this can impact server performance?

It seems to me server is not responding to clients intermittently, we measured the bandwidth usage and that is not high or maxed out.

I hope in next version you take care of this and continues integration, they put toll on the server activities. The CI checks needs to be moved to a different process or to an agent.

thanks
-Puneet
robinshen ADMIN ·
Hi Puneet,

You might be right on this. QB forks a new process to check for SCM changes, and this may make the server quite busy if many projects are checking changes frequently. We will investigate the possibility of moving change detection logic to agent. But eventually you may still get to the bottleneck of one server. So dividing configurations in a single server to multiple servers might still be necessary.

Regards
Robin
puneetg ·
Hi Robin

One more question with regards to this:

I see repetitive spikes of high CPU/Disk/Network Usage in every 10-12 hrs interval, this is suspicious as this happens over weekends as well. We have very few builds scheduled during the weekends.

Before I look more in our setup, Is Quickbuild server performing cleanup or other activity during this time?

thanks
-Puneet
robinshen ADMIN ·
Hi Puneet,

Please check the build auto-cleanup settings. If set, QB will delete obsolete builds at 12:00PM every night.

Regards
Robin
puneetg ·
Hi Robin

We need a way to schedule this over a weekend or trigger of an agent, We have 100+ builds going though QB every day and cleanup takes quite a while and interfere with user activities. We also have engineers from various geographic locations accessing the system to no specific time in day is good for us, For now I have disabled auto-cleanup and checking performance. I will update.

Thanks
-Puneet
robinshen ADMIN ·
Hi Puneet,

Please upgrade to 3.1.50, and then you can define a system property in "conf/wrapper.conf" as below for example to change build cleanup schedule:
wrapper.java.additional.4=-Dcleanup_schedule="0 0/5 14,18 * * ?"

This is a temporal work around as we will redesign the cleanup mechanism in QB4.

Regards
Robin
puneetg ·
Hi Robin

thanks so much for your prompt fix, really appreciated.

I hope all these issues will prepare QB to work in high capacity and demanding engineering environment. We are using QB as a frontend to build system and Automated test harness built around LabManager4. Its pretty useful as it is, and I hope additional upgrades in the s/w will enhance productivity across the board by reducing errors.

One of the areas to consider is support for large number of logins and refreshes, this is seen to slow down the server. One option is to implement multiple read only processors for client updates.

Best regards
-Puneet
robinshen ADMIN ·
Hi Puneet,

Thanks for your suggestion. We will investigate to see if this can be improved in QB4.

Regards
Robin