In this case I want to use the "resource pool" idea to link three build agents, i.e. before a build agent can execute a step it first has to acquire a "mutex". Once Agent 1 has the mutex, Agents 2 and 3 must wait before they can execute a step. Each build agent specifies the name of the mutex it requires via its user attributes. I have created a variable at the root of our configuration tree with the value 0 to represent the mutex. If the node that is chosen to run a step requires the mutex, it first increments the variable in its node selection script. However, if the variable did not have a value of 0 (the mutex is already held), the script undoes the increment and returns false. In a post-execute script the value is decremented, i.e. the resource is freed.
Node selection script:
groovy:
// First reject any node that lacks a required resource.
String s = vars.getValue("resourceList");
def list = s.tokenize(',');
for (resource in list) {
    if (!node.hasResource(resource.trim())) {
        return false;
    }
}
// Next: does this node require the mutex?
String mutex = node.getAttribute("qb.dbf.mutex");
if (mutex == null) {
    return true;
}
// If so, increment the number of nodes locking the mutex.
def lock = vars.get(mutex).increase(true);
// If the mutex was already locked, remove our reference and return false.
if (lock > 0) {
    vars.get(mutex).decrease(true);
    return false;
}
return true;
The above script seems to work: one node is able to acquire the mutex and every other node has to wait.
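For anyone who wants to reason about the locking logic outside QuickBuild, here is a minimal stand-alone sketch of the same increment-then-check pattern. `CounterMutex`, `tryAcquire`, `release` and `value` are hypothetical names for illustration, not QuickBuild API. The sketch assumes the increment hands back the value from before the increment; that is my reading of why `lock > 0` lets the first node through, not something I have confirmed against the API.

```groovy
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical stand-in for the shared configuration variable; not QuickBuild API.
class CounterMutex {
    private final AtomicInteger holders = new AtomicInteger(0)

    // Increment first, then look at the value from before the increment.
    // Assumption: the caller sees the pre-increment value.
    boolean tryAcquire() {
        int before = holders.getAndIncrement()
        if (before > 0) {
            // Someone else already holds the mutex: undo our increment.
            holders.decrementAndGet()
            return false
        }
        return true
    }

    // Unconditional release, to be called only by the current holder.
    void release() {
        holders.decrementAndGet()
    }

    int value() {
        return holders.get()
    }
}

def mutex = new CounterMutex()
def agent1 = mutex.tryAcquire()   // first agent gets the mutex
def agent2 = mutex.tryAcquire()   // second agent is refused
mutex.release()                   // agent 1 frees the resource
def agent3 = mutex.tryAcquire()   // now another agent can proceed

println "agent1=${agent1} agent2=${agent2} agent3=${agent3} value=${mutex.value()}"
// prints "agent1=true agent2=false agent3=true value=1"
```

Note that the release in this sketch is unconditional, whereas my post-execute script only decrements when it reads a value greater than zero.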
Post-execute script:
groovy:
// Does the node require a mutex? If so, it must have locked the mutex to get here.
String mutex = node.getAttribute("qb.dbf.mutex");
if (mutex != null && vars.get(mutex).getIntValue() > 0) {
    // Release our lock.
    vars.get(mutex).decrease(true);
}
My configuration setup is as follows:
>- Parallel step:
>>- Sequential step:
>>>- Dummy step: acquires mutex (runs node selection script).
>>>- Ant step: runs build.
>>>- Dummy step: releases mutex (runs post-execute script).
In the post-execute code the value of the variable is always zero. Note that the node selection script and the post-execute script run in different steps; however, both steps are contained in the same sequential parent and are guaranteed to run on the same node.
I have added logging to these steps and can see the "resource" being acquired while all other parallel steps wait. I can also see the post-execute code run on the correct node. However, the value of the variable is zero at that point, so decrease() is never called and all the other parallel steps continue to wait until the build times out.
Any help or thoughts on how to achieve the same result in a different way are appreciated.