RainforestCluster is a Python program for Amazon EC2 that manages and load-balances
dynamic compute clusters, aiming for maximum workflow flexibility and speed at minimal
cost. It lets you quickly and cheaply create clusters in the cloud that can run
arbitrary computational pipelines. It can also optimize the use of
spot instances: idle machines in Amazon's cloud that are available at
drastically reduced cost (5x-10x cheaper) but can be terminated at any moment
if capacity drops or the spot price rises above the bid. It also comes with
pre-installed features such as a GlusterFS distributed filesystem, the
ThunderstormDistributor queuing system, a RAID 0 /scratch volume, password-less
ssh, and automatic cluster management, for ease of use and maximum processing
speed for computational tools. It was originally developed, in a different
version, for the Wall Lab at Harvard CBMI.
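To make the spot-instance trade-off concrete, here is a minimal Python sketch. It is not RainforestCluster's actual code: the function names and all prices are hypothetical, chosen only to illustrate the rule that a spot instance is usable while the market price stays at or below the bid, and how the savings compare with on-demand pricing.

```python
# Illustrative sketch only; names and prices are assumptions,
# not part of RainforestCluster or the EC2 API.

def spot_is_usable(spot_price: float, bid_price: float) -> bool:
    """A spot instance keeps running only while the market price
    is at or below the bid."""
    return spot_price <= bid_price


def pick_market(spot_price: float, bid_price: float) -> str:
    """Prefer spot capacity when the bid covers the current price,
    otherwise fall back to on-demand."""
    return "spot" if spot_is_usable(spot_price, bid_price) else "on-demand"


def hourly_cluster_cost(num_nodes: int, price_per_node_hour: float) -> float:
    """Total cost of running num_nodes for one hour at the given price."""
    return num_nodes * price_per_node_hour


def savings_factor(on_demand_price: float, spot_price: float) -> float:
    """How many times cheaper spot is than on-demand."""
    return on_demand_price / spot_price


if __name__ == "__main__":
    on_demand = 0.10   # hypothetical $/node-hour
    spot = 0.02        # hypothetical $/node-hour
    bid = 0.05         # hypothetical maximum we are willing to pay
    print(pick_market(spot, bid))
    print(hourly_cluster_cost(10, spot))
    print(savings_factor(on_demand, spot))
```

With these made-up numbers, spot capacity is chosen because the bid covers the current price. If the spot price climbed above the bid, the same check would fall back to on-demand nodes.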
RainforestCluster and ThunderstormDistributor
I have used these two programs together to render a fractal movie based on my fractal art (fullscreen is best):
This is a screenshot of the statistics viewer:
Open-Source Version Download (Beta)
This open-source version of RainforestCluster includes GlusterFS, RAID 0 /scratch, password-less ssh, and automatic instance management, setup, and load balancing. It uses my ThunderstormDistributor job queuing system for improved performance and flexibility. It differs from the edition originally created for the Wall Lab, which used Grid Engine as its queuing system.
RainforestCluster is available for download under the MIT license on its SourceForge project page.
You can view the README, install instructions, and usage guide here.