RainforestCluster - Amazon EC2 Cluster Manager


RainforestCluster is a Python program for Amazon EC2 that manages and load-balances dynamic clusters, aiming for maximum workflow flexibility and speed at minimal cost. It lets you quickly and cheaply create dynamic compute clusters in the cloud, which can then run generic computational pipelines. It can also optimize the use of spot instances: idle computers in Amazon's cloud that are available at drastically reduced cost (5x-10x cheaper) but can be terminated at any moment if capacity drops or the spot price rises above your bid. It also provides pre-installed features such as GlusterFS distributed filesystems, the ThunderstormDistributor queuing system, RAID 0 /scratch, password-less ssh, and automatic cluster management, for ease of use and maximum processing speed for computational tools. It was originally developed, in a different version, for the Wall Lab at Harvard CBMI.
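
The spot-instance trade-off described above can be illustrated with a small sketch. This is not RainforestCluster's actual code: the names SpotOffer, savings_factor, survives, and hours_before_termination are hypothetical, the prices are made-up numbers, and the model only covers termination by price (not by a drop in capacity).

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class SpotOffer:
    """Hypothetical pricing snapshot for one instance type (USD per hour)."""
    on_demand_price: float
    spot_price: float


def savings_factor(offer: SpotOffer) -> float:
    """Return how many times cheaper the spot price is than on-demand."""
    return offer.on_demand_price / offer.spot_price


def survives(bid: float, spot_price: float) -> bool:
    """A spot instance keeps running only while the market price is at or below the bid."""
    return spot_price <= bid


def hours_before_termination(bid: float, price_history: Sequence[float]) -> int:
    """Count the consecutive hours, from the start, that the instance would stay up."""
    hours = 0
    for price in price_history:
        if not survives(bid, price):
            break
        hours += 1
    return hours


if __name__ == "__main__":
    # Illustrative numbers only, not real EC2 prices.
    offer = SpotOffer(on_demand_price=0.50, spot_price=0.10)
    print("savings factor:", round(savings_factor(offer), 2))
    print("hours up:", hours_before_termination(0.12, [0.10, 0.11, 0.15, 0.09]))
```

A higher bid keeps instances alive through more price spikes but raises the worst-case hourly cost, which is the balance a cluster manager has to strike when it relies on spot capacity.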


RainforestCluster and ThunderstormDistributor

I have used these two programs together to render a fractal movie based on my fractal art (fullscreen is best):

This is a screenshot of the statistics viewer:


Open-Source Version Download (Beta)

This open-source version of RainforestCluster includes GlusterFS, RAID 0 /scratch, password-less ssh, and automatic instance management, setup, and load balancing. It uses my ThunderstormDistributor job queuing system for improved performance and flexibility. This version differs from the edition originally created for the Wall Lab, which used Grid Engine as the queuing system.

RainforestCluster is available for download under the MIT license on its SourceForge project page.

You can view the README, install instructions, and usage guide here.