The case for Docker in multicloud-enabled bioinformatics applications
The introduction of next-generation sequencing technologies brought not only huge amounts of biological data but also highly sophisticated and versatile analysis workflows and systems. These new challenges require fast, reliable methods for deploying software on high-performance servers, whether in the local infrastructure or in the cloud. Virtualization technology has provided an efficient way to overcome the complexity of deployment procedures and to offer a safe, personalized execution environment. However, applications running in virtual machines perform worse than those running on the native infrastructure. Docker is a lightweight alternative to conventional virtualization technology that achieves notably better performance. In this paper, we explore use-case scenarios for deploying and executing sophisticated bioinformatics tools and workflows with Docker, focusing on the sequence analysis domain. We also introduce an efficient implementation of the package elasticHPC-Docker, which enables the creation of a Docker-based computer cluster in a private cloud and in commercial clouds such as Amazon and Google. We demonstrate experimentally that elasticHPC-Docker is efficient and reliable in both private and commercial clouds. © Springer International Publishing Switzerland 2016.