|
Hudson supports a "master/slave" mode, in which the workload of building projects is delegated to multiple "slave" nodes, allowing a single Hudson installation to host a large number of projects. This document describes this mode and how to use it.

How does this work?

A "master" is an installation of Hudson. Before you used the master/slave support, a master was all you had. Even in master/slave mode, the role of the master remains the same: it serves all HTTP requests, and it can still build projects on its own.

Slaves are computers that are set up to build projects for a master. Hudson runs a separate program called a "slave agent" on each slave. Once slaves are registered with a master, the master starts distributing load to them. The exact delegation behavior depends on the configuration of each project: some projects may choose to "stick" to a particular machine for a build, while others may choose to roam freely between slaves.

For people accessing the Hudson website, things work mostly transparently. You can still browse javadoc, see test results, and download build results from the master without ever noticing that the builds were done by slaves.

Follow the Step by step guide to set up master and slave machines and quickly start using distributed builds.

Requirement for master/slave support

To use the master/slave support, the master needs to be able to run a "slave agent" program on the slave. There are two ways to do this:

Have master launch slave agent

One way is to configure the master to launch a slave agent on the target machine. On Unix, this can be done by SSH, RSH, or other similar means. On Windows, this could be done by the same protocols through cygwin or with tools like psexec. The slave agent program is a simple Java program that can be launched like java -jar slave.jar. A copy of slave.jar can be found inside hudson.war under WEB-INF. Therefore, a typical slave agent launch command would look something like ssh myslave java -jar ~/bin/slave.jar.
This requires additional initial setup on the slaves (especially on Windows, where a remote login mechanism is not available out of the box), but the benefit of this approach is that when the connection goes bad, you can use Hudson's web UI to re-establish the connection.
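As a rough illustration of the SSH launch approach described above, the following shell sketch extracts slave.jar from hudson.war and prints a launch command of the kind the master would run. The function names, the DEFAULT_SLAVE_JAR variable, and the assumption that slave.jar sits directly under WEB-INF are illustrative only and are not part of Hudson.

```shell
#!/bin/sh
# Sketch only: function names and default paths below are hypothetical.

# Default location of slave.jar on the slave. Single quotes keep "~" literal
# so that the remote shell, not the local one, expands it.
DEFAULT_SLAVE_JAR='~/bin/slave.jar'

# Extract slave.jar from hudson.war into a local directory.
# Assumes the jar sits directly under WEB-INF and that unzip is installed.
extract_slave_jar() {
    war="$1"
    dest="$2"
    # -j drops the WEB-INF/ path prefix, -o overwrites an older copy
    unzip -j -o "$war" WEB-INF/slave.jar -d "$dest"
}

# Print the command line a master could use to start the agent over SSH.
slave_launch_command() {
    host="$1"
    jar_path="${2:-$DEFAULT_SLAVE_JAR}"
    printf 'ssh %s java -jar %s\n' "$host" "$jar_path"
}

slave_launch_command myslave
# prints: ssh myslave java -jar ~/bin/slave.jar
```

After extraction, the jar still has to be copied to the slave (for example with scp) before the printed command can work.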
Launch slave agent via Java Web Start

Another way is to start a slave agent through Java Web Start (JNLP). In this approach, you interactively log on to the slave node, open a browser, and open the slave page. You are then presented with the JNLP launch icon. Upon clicking it, Java Web Start kicks in and launches a slave agent on the computer where the browser is running.

Java Web Start provides some means of automatically running the slave agent. For example, instead of manually clicking the icon, you can run the following command from the CLI:

$ javaws http://hudson.acme.org/computer/slave-name/slave-agent.jnlp

See also Installing Hudson as a Windows service

Other Requirements

Also note that the slaves are a kind of cluster, and operating a cluster (especially a large or heterogeneous one) is always a non-trivial task. For example, you need to make sure that all slaves have the JDKs, Ant, CVS, and/or any other tools you need for builds. You need to make sure that the slaves are up and running, etc. Hudson is not clustering middleware, and therefore it doesn't make this any easier.

Example: Configuration on Unix

This section describes the current setup of Hudson slaves that I use inside Sun for my day job. My master Hudson node is running on a SPARC Solaris box, and I have many SPARC Solaris slaves, Opteron Linux slaves, and a few Windows slaves.
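Since Hudson does not check that build tools are present on each slave (see Other Requirements), a small script run on a Unix slave can at least report what is missing. This is a minimal sketch, not something Hudson provides: the missing_tools function name is hypothetical, and the tool list (java, ant, cvs) is only an example to adjust to what your builds need.

```shell
#!/bin/sh
# Sketch: report which build tools are missing from PATH on this machine.
# The function name and the tool list are assumptions for illustration.

missing_tools() {
    missing=""
    for tool in "$@"; do
        # command -v exits non-zero when the tool is not found on PATH
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    # Print the missing tools without the leading space
    echo "${missing# }"
}

result=$(missing_tools java ant cvs)
if [ -n "$result" ]; then
    echo "missing: $result"
else
    echo "all tools found"
fi
```

If the script is saved under a name such as check-tools.sh (a hypothetical file name), it could be run against a slave from the master with something like ssh myslave 'sh -s' < check-tools.sh.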
Scheduling strategy

Some slaves are faster, while others are slow. Some slaves are closer (network-wise) to the master, others are far away. So doing a good build distribution is a challenge. Currently, Hudson employs the following strategy:
If you have interesting ideas (or better yet, implementations), please let me know.

Transition from master-only to master/slave

Typically, you start with a master-only installation and then, much later, add slaves as your projects grow. When you enable the master/slave mode, Hudson automatically configures all your existing projects to stick to the master node. This is a precaution to avoid disturbing existing projects, since most likely you won't be able to configure slaves correctly without trial and error. After you configure the slaves successfully, you need to individually configure projects to let them roam freely. This is tedious, but it allows you to work on one project at a time. Projects that are newly created on a master/slave-enabled Hudson are configured by default to roam freely.

Master on public network, slaves within firewall

One might consider setting up the Hudson master on the public network (so that people can see it), while leaving the build slaves within the firewall (because having a lot of machines on the internet is expensive). This can generally be made to work in two ways:
Note that in both cases, once the master is compromised, all your slaves can be easily compromised (in other words, a malicious master can execute arbitrary programs on the slaves), so both setups leave much to be desired in terms of isolating a security breach. The Build Publisher Plugin (which looks almost ready as of this writing) provides another way of doing this, in a more secure fashion.

Troubleshooting tips
|

Comments (14)
Jun 29, 2007
Anonymous says:
You should consider expanding on the section about launching slaves via Java WebStart. Took me a bit to figure it out. I'll even write it up if you like.
Jun 29, 2007
Kohsuke Kawaguchi says:
Yes, please! Much appreciated.
Aug 01, 2007
Anonymous says:
Can someone give me a hint please how to open the slave page (url)?
thx!
Sep 20, 2007
Anonymous says:
On my master (Linux) node I have added an Ant instance which points to the /opt/ant-1.7.0 directory. Now, some builds can be performed only on Windows, so I've defined a Windows slave spawned via JNLP. But every build will fail, because Ant is not in /opt/ant-1.7.0 but somewhere else (c:\ant or whatever).
Same question about JDK path.
Sep 20, 2007
Anonymous says:
Ok, I have found that if I put Ant into c:\opt\ant-1.7.0 it seems to work. Nevertheless I think that such things could be configurable per slave. Of course plugins could be able to contribute paths to slave configurations.
Oct 24, 2007
Anonymous says:
If the Hudson master is running at
http://hudsonmaster:8080/hudson
then you would log in to the remote slave server, open a browser and enter the above URL. On the left-hand menu, you will see the Build Executor Status section with "Master" and your remote slave listed below. Click the slave name link and on the resulting page you will see the "Launch" button for Java Web Start.
Oct 10, 2007
Daniel Pike says:
Something very useful in corporate networks is the ability to run slaves as a Windows service. After a fair bit of playing I have been able to achieve this using Tanuki Software's Java Service Wrapper. I am happy to write something up and send it through if it would be useful?
Oct 10, 2007
Kohsuke Kawaguchi says:
Yes, by all means!
Oct 18, 2007
Daniel Pike says:
Done, sent through to Kohsuke's address :)
Oct 16, 2007
Jeff Black says:
Second that!
Jan 07, 2008
Anonymous says:
When hudson tries to launch a slave it complains that it cannot find maven-agent.jar. I don't know what maven is. What do I need to do to make hudson happy?
Jan 25
Anonymous says:
Within a *nix system, you might be able to view top or uptime - looking for the load average on that host. Weighting it, then scheduling work based on that value.
Feb 13
Anonymous says:
If I have 2 slave build machines building the same project, is there a way to configure Hudson to utilize the build machines at the same time? For example, if machine 1 is already building and Hudson detects a change in the source code repository, can machine 2 start the build for the new checkin? That way there is no need to wait for machine 1 to finish to get feedback on the last checkin.
May 23
Michael Manz says:
An idea for a further rule for the scheduling strategy:
4. If a build depends on another build, try to build it on the same node that previously built the parent build.