Script not working after 13-14 upgrade #4577

drdt ·

In my build grid I define a resource identifying machines with insufficient disk space to run a job, using the 'node.nodeService.evalGroovyScript' method. I can't find this method in the documentation, so I presume I got it from you folks. Alas, after upgrading to v14, it no longer works, with the following error stack:

com.pmease.quickbuild.resource.NodeResourceType - Error calcualting resource count (resource: .NEEDS_CLEANUP_UNIX, node: unknown:0)
java.lang.RuntimeException: Failed to evaluate below expression: (script replicated at the end)
com.pmease.quickbuild.QuickbuildException: com.caucho.hessian.client.HessianRuntimeException:
com.caucho.hessian.client.HessianRuntimeException: Error connecting 'http://0.0.0.0:0/service/node'
java.net.ConnectException: Connection refused
(full stack omitted for brevity)

Here is the code; it works like a dream on QB 13:

groovy:
   def freeScript = """
     return new File(
       '/var'
     ).getFreeSpace();
   """
   
   // divide by 1gb for simple math
   def freeSpace =
     node.nodeService.evalGroovyScript( 
       freeScript
     ).intdiv( 1024**3 );

   // omit machines with less than 10gb free
   return (freeSpace < 10);

What do I need to do to make it better?
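
The threshold arithmetic in the script can be checked locally, without a grid, with a small sketch like the one below. The helper names `toWholeGb` and `belowThreshold` are hypothetical and are not part of QuickBuild; they only mirror the `intdiv( 1024**3 )` and `< 10` steps from the script.

```groovy
// Hypothetical helpers for local testing; not part of the QuickBuild API.

// Convert a raw byte count to whole gigabytes, truncating any fraction.
long toWholeGb(long bytes) {
    return bytes.intdiv(1024**3) as long
}

// Same test as the resource script: true means "less than thresholdGb free".
boolean belowThreshold(long bytes, long thresholdGb = 10) {
    return toWholeGb(bytes) < thresholdGb
}

// Free space on the local machine; '/var' is the path used in the script.
long localFree = new File('/var').getFreeSpace()
println "free: ${toWholeGb(localFree)} GB, needs cleanup: ${belowThreshold(localFree)}"
```

Because `intdiv` truncates, a machine with 10.9 GB free counts as 10 and is not flagged, while one with 9.99 GB free counts as 9 and is.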

robinshen ADMIN ·

This works fine on my side on QB 14, except that I had to change the last statement as below:

return (freeSpace < 10);

drdt ·

Yes, sorry, that was a copy error. For me the script doesn't even get that far; it fails during the call to node.nodeService.evalGroovyScript.

Are you testing it as a groovy step, or as a grid resource?

Oh, I forgot to mention, we are using JDK 21 on the upgraded server as well, but the agents are running JDK 8. Maybe that is a factor?

robinshen ADMIN ·

I am running Java 21 on both server and agent (also tested with Java 8 on both), and I am using the script when defining the node provider of a grid resource. The error "Error connecting 'http://0.0.0.0:0/service/node'" seems odd to me. The IP address should never be "0.0.0.0", or have you just masked out the true IP?

drdt ·

No, that is taken directly from the log. Maybe QB is masking the true IP? Here is the full stack dump:

java.lang.RuntimeException: Failed to evaluate below expression in configuration 'root/containers/SAM_UI/latest':
    (script code omitted for brevity)
        at com.pmease.quickbuild.util.ExceptionUtils.wrapException(ExceptionUtils.java:87)
        at com.pmease.quickbuild.DefaultScriptEngine.evaluate(DefaultScriptEngine.java:332)
        at com.pmease.quickbuild.DefaultScriptEngine.evaluate(DefaultScriptEngine.java:75)
        at com.pmease.quickbuild.DefaultScriptEngine.evaluate(DefaultScriptEngine.java:80)
        at com.pmease.quickbuild.resource.nodeselection.ScriptNodeSelection.matches(ScriptNodeSelection.java:44)
        at com.pmease.quickbuild.resource.nodeselection.AndSelection.matches(AndSelection.java:29)
        at com.pmease.quickbuild.resource.nodeselection.ResourceNodeSelection.matches(ResourceNodeSelection.java:72)
        at com.pmease.quickbuild.resource.nodeselection.ResourceNodeSelection.matches(ResourceNodeSelection.java:72)
        at com.pmease.quickbuild.resource.nodeselection.OrSelection.matches(OrSelection.java:29)
        at com.pmease.quickbuild.resource.nodeselection.ResourceNodeSelection.matches(ResourceNodeSelection.java:72)
        at com.pmease.quickbuild.resource.nodeselection.NotSelection.matches(NotSelection.java:28)
        at com.pmease.quickbuild.resource.nodeselection.AndSelection.matches(AndSelection.java:29)
        at com.pmease.quickbuild.resource.NodeResourceType.getCount(NodeResourceType.java:44)
        at com.pmease.quickbuild.DefaultBuildEngine.findCloudProfile(DefaultBuildEngine.java:1581)
        at com.pmease.quickbuild.DefaultBuildEngine.run(DefaultBuildEngine.java:1405)
        at java.base/java.lang.Thread.run(Thread.java:1583)
 Caused by: com.pmease.quickbuild.QuickbuildException: com.caucho.hessian.client.HessianRuntimeException: 
    com.caucho.hessian.client.HessianRuntimeException: Error connecting 'http://0.0.0.0:0/service/node'
        at com.caucho.hessian.client.HessianProxy.sendRequest(HessianProxy.java:285)
        at com.caucho.hessian.client.HessianProxy.invoke(HessianProxy.java:171)
        at jdk.proxy3/jdk.proxy3.$Proxy94.evalGroovyScript(Unknown Source)
        at com.pmease.quickbuild.grid.GridNode$1.evalGroovyScript(GridNode.java:239)
        at com.pmease.quickbuild.grid.NodeService$evalGroovyScript.call(Unknown Source)
        at script1718640878246216807463.run(script1718640878246216807463.groovy:10)
        at com.pmease.quickbuild.plugin.basis.BasisPlugin$34.evaluate(BasisPlugin.java:406)
        at com.pmease.quickbuild.DefaultScriptEngine.evaluate(DefaultScriptEngine.java:316)
        at com.pmease.quickbuild.DefaultScriptEngine.evaluate(DefaultScriptEngine.java:75)
        at com.pmease.quickbuild.DefaultScriptEngine.evaluate(DefaultScriptEngine.java:80)
        at com.pmease.quickbuild.resource.nodeselection.ScriptNodeSelection.matches(ScriptNodeSelection.java:44)
        at com.pmease.quickbuild.resource.nodeselection.AndSelection.matches(AndSelection.java:29)
        at com.pmease.quickbuild.resource.nodeselection.ResourceNodeSelection.matches(ResourceNodeSelection.java:72)
        at com.pmease.quickbuild.resource.nodeselection.ResourceNodeSelection.matches(ResourceNodeSelection.java:72)
        at com.pmease.quickbuild.resource.nodeselection.OrSelection.matches(OrSelection.java:29)
        at com.pmease.quickbuild.resource.nodeselection.ResourceNodeSelection.matches(ResourceNodeSelection.java:72)
        at com.pmease.quickbuild.resource.nodeselection.NotSelection.matches(NotSelection.java:28)
        at com.pmease.quickbuild.resource.nodeselection.AndSelection.matches(AndSelection.java:29)
        at com.pmease.quickbuild.resource.NodeResourceType.getCount(NodeResourceType.java:44)
        at com.pmease.quickbuild.DefaultBuildEngine.findCloudProfile(DefaultBuildEngine.java:1581)
        at com.pmease.quickbuild.DefaultBuildEngine.run(DefaultBuildEngine.java:1405)
        at java.base/java.lang.Thread.run(Thread.java:1583)
 Caused by: com.caucho.hessian.client.HessianRuntimeException: Error connecting 'http://0.0.0.0:0/service/node'
        at com.caucho.hessian.client.HessianURLConnection.getOutputStream(HessianURLConnection.java:101)
        at com.caucho.hessian.client.HessianProxy.sendRequest(HessianProxy.java:283)
        ... 21 more
 Caused by: java.net.ConnectException: Connection refused
        at java.base/sun.nio.ch.Net.pollConnect(Native Method)
        at java.base/sun.nio.ch.Net.pollConnectNow(Net.java:682)
        at java.base/sun.nio.ch.NioSocketImpl.timedFinishConnect(NioSocketImpl.java:542)
        at java.base/sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:592)
        at java.base/java.net.Socket.connect(Socket.java:751)
        at java.base/sun.net.NetworkClient.doConnect(NetworkClient.java:178)
        at java.base/sun.net.www.http.HttpClient.openServer(HttpClient.java:531)
        at java.base/sun.net.www.http.HttpClient.openServer(HttpClient.java:636)
        at java.base/sun.net.www.http.HttpClient.<init>(HttpClient.java:280)
        at java.base/sun.net.www.http.HttpClient.New(HttpClient.java:386)
        at java.base/sun.net.www.http.HttpClient.New(HttpClient.java:408)
        at java.base/sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1304)
        at java.base/sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1237)
        at java.base/sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1123)
        at java.base/sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:1052)
        at java.base/sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(HttpURLConnection.java:1446)
        at java.base/sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1417)
        at com.caucho.hessian.client.HessianURLConnection.getOutputStream(HessianURLConnection.java:99)
        ... 22 more
        at com.pmease.quickbuild.plugin.basis.BasisPlugin$34.evaluate(BasisPlugin.java:430)
        at com.pmease.quickbuild.DefaultScriptEngine.evaluate(DefaultScriptEngine.java:316)
        ... 14 more
robinshen ADMIN ·

Please switch to the grid tab and check whether the IP address of the agent involved is correct.

drdt ·

Okay, this gave great information. I couldn't figure out which node was causing the problem, so I ended up disabling all of them, and I still get the error! My actual criteria for node selection are:

  • any build agent AND
  • any linux node AND
  • the script returns true (freeSpace < 10)

If I disable all of my linux build agents, I still get the error. I added a logging step and this is the node ID, hostname and port:
[com.pmease.quickbuild.grid.GridNode@b283ef6a] = unknown:0
I get the error about eight times, each with a different GridNode id, all with host:port unknown:0. If I re-enable my agents, they do show up with the correct information in the log:
[com.pmease.quickbuild.grid.GridNode@95b10a74] = rh7arm1:8815

We have some cloud profiles set up, could this be interfering?

drdt ·

I unauthorized all of my agents, so there is nothing but the server available, and I still get the error, dozens at a time, each time preceded by my debugging info:

Linux [com.pmease.quickbuild.grid.GridNode@4012c15b] = unknown:0, 0.0.0.0

Each instance is repeated three times, i.e., it generates output for the same GridNode@NUMBER three times in a row.
It only happens after I request a build; then it appears immediately, and repeats about twice a minute until I cancel the request.

If I remove the offending code, I no longer get the failure, but I continue to see the debugging info. So the problem is not that the script is bad, but rather that some unknown and nonexistent machine is being selected by "All build agents" && "All Linux nodes".

I think it makes sense that the cloud profiles are somehow contributing to this. Is there a way to filter out cloud profiles in this query?

robinshen ADMIN ·

Thanks for the info. This explains the behavior. QB may launch a temporary node from a cloud profile, in which case the IP address is initially "0.0.0.0" until the launched node starts and reports its actual IP. It is impossible to run a Groovy script on the node in that state. To fix the issue, just ignore those temporary nodes, like below:

groovy:
   if (node.ip == "0.0.0.0")
     return false;

   def freeScript = """
     return new File(
       '/var'
     ).getFreeSpace();
   """
   
   // divide by 1gb for simple math
   def freeSpace =
     node.nodeService.evalGroovyScript( 
       freeScript
     ).intdiv( 1024**3 );

   // omit machines with less than 10gb free
   return (freeSpace < 10);
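
The guard above can be exercised outside QuickBuild with stand-in objects, as in the rough sketch below. `StubNode`, `StubNodeService` and `needsCleanup` are hypothetical names made up for this test; they imitate only the two members the script uses (`node.ip` and `node.nodeService.evalGroovyScript`) and say nothing about how the real grid classes are implemented.

```groovy
// Hypothetical stand-ins for the grid node objects, for local testing only.
class StubNodeService {
    Closure handler                       // plays the role of the remote agent
    def evalGroovyScript(String script) { handler.call(script) }
}

class StubNode {
    String ip
    StubNodeService nodeService
}

// Same logic as the resource script, wrapped in a method so it can be tested.
boolean needsCleanup(node, long thresholdGb = 10) {
    // Placeholder nodes from cloud profiles have no real address yet,
    // so skip them before attempting any remote call.
    if (node.ip == '0.0.0.0') {
        return false
    }
    def freeBytes = node.nodeService.evalGroovyScript(
        "return new File('/var').getFreeSpace();")
    return freeBytes.intdiv(1024**3) < thresholdGb
}

// A placeholder node whose remote call would fail, as in the stack trace.
def placeholder = new StubNode(ip: '0.0.0.0',
    nodeService: new StubNodeService(handler: { throw new ConnectException('Connection refused') }))

// A reachable node (made-up address) reporting 5 GB free.
def lowOnDisk = new StubNode(ip: '10.0.0.5',
    nodeService: new StubNodeService(handler: { 5L * 1024**3 }))

println "placeholder: ${needsCleanup(placeholder)}, lowOnDisk: ${needsCleanup(lowOnDisk)}"
```

The point of the check order is that the placeholder node returns before `evalGroovyScript` is reached, so the connection error never occurs.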