Singularity: can it go to infinity?

By: @finchtalk
Singularity: the point at which a function takes an infinite value.

"Eew!" is how high-performance computing (HPC) admins react to Docker, according to Dr. Vanessa Sochat, who described the motivation for developing Singularity [1] at the CyVerse Container Camp. Like Docker, Singularity allows one to package programs and their dependencies so that they can be run as virtual instances with low overhead. Singularity improves on Docker by making it possible to run containers in HPC environments such as supercomputers.

Why another container technology?

A challenge with Docker stems from its origins: it was developed for communities that use local computing, cloud computing, and enterprise environments. In these environments Docker container instances are bounded: they run on a user's local computer, within a virtual machine (VM) on a cloud server, or under tight systems administration control in enterprise settings. That bounding is good because Docker processes can be exploited to give individuals the highest level of systems administration permissions ("root"). Docker container processes are spawned as children of a root-owned Docker daemon. Because users can control the Docker daemon, they can gain escalated "root" privileges. Docker has mechanisms to minimize this risk, but implementing them requires full systems administration control of the environments in which Docker runs. This is at odds with HPC, where researchers must be able to run arbitrary code and systems administrators work to ensure that the system is not compromised by malicious code. Hence the "eew" from HPC administrators.

Despite Docker's popularity, the Docker company has not addressed the above challenge because it is largely a scientific computing problem, and scientific computing is not a market driver for the kinds of large-scale changes Docker would need to make. Compared to the broader computer technology markets, which Docker supports well, research and research-oriented HPC are small. Because the scientific community recognizes the value of containers in enabling software reuse and aiding reproducibility, Singularity was developed to make scientific software portable between local computers, cloud computing systems (academic and commercial), and supercomputers (HPC). Singularity's initial and ongoing development goals are community focused, with the aim of supporting the most comprehensive set of requirements [1].

How does Singularity work?

Singularity is similar to Docker in that a developer creates image files that contain a minimal operating system, programs, supporting files, and dependencies. A significant difference from Docker is that Singularity containers run with the same user inside the container as outside it. That is, a user can only be root in the container if that user has root in the environment where the container is being run. This design solves the root escalation problem that is Docker's Achilles' heel, and it limits users' ability to make arbitrary modifications to running containers, which further improves reproducibility. Singularity also has several HPC capabilities that Docker does not.
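The same-user behavior described above can be checked with a short sketch like the one below. It assumes the `singularity` command is installed and on the PATH. The helper names `whoami_command` and `same_user_inside` are made up for illustration and are not part of Singularity.

```python
import getpass
import shutil
import subprocess


def whoami_command(image):
    """Build the argv that asks a container which user it runs as."""
    # `singularity exec <image> <command>` runs a command inside the container.
    return ["singularity", "exec", image, "whoami"]


def same_user_inside(image):
    """Return True if the user inside the container matches the host user.

    Returns None when Singularity is not installed, so the sketch can
    still be imported on machines without it.
    """
    if shutil.which("singularity") is None:
        return None
    result = subprocess.run(
        whoami_command(image), capture_output=True, text=True, check=True
    )
    # Compare the name reported inside the container with the host login name.
    return result.stdout.strip() == getpass.getuser()
```

On a system with Singularity installed, calling `same_user_inside` with the path to a local image file (any image you have will do) should return True if the container runs as the invoking user, which is the behavior the paragraph above describes.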

Like Docker, Singularity has a hub for sharing containers; according to the Singularity-hub.org pages there are nearly 32,000 images. A good start, but not the millions that are on Docker Hub. As with Docker Hub, it is also hard to find things on Singularity-hub because keywords are not used effectively. A search for "bioinformatics" yields zero results, for example. Yet inspection of the list reveals many bioinformatics-related containers.

Speaking of Docker, if you want to use Singularity, you may ask, "Do I have to rebuild my Docker images?" The answer is, thankfully, no. Singularity's authors recognized Docker's momentum and developed accordingly. One can build a Singularity image from a Docker image, or simply run Docker containers directly.
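As a rough illustration of the two options above, the sketch below assembles the command lines one might use. It relies on Singularity's `docker://` URI scheme together with the `build` and `run` subcommands. The image names, the output file name, and the helper functions are assumptions for the example, and the exact image file extension depends on the Singularity version installed.

```python
def docker_uri(image, tag="latest"):
    """Return a docker:// URI pointing at an image in a Docker registry."""
    return f"docker://{image}:{tag}"


def build_command(output_path, image, tag="latest"):
    """Argv for converting a Docker image into a Singularity image file."""
    # General shape: singularity build <output image> <source>
    return ["singularity", "build", output_path, docker_uri(image, tag)]


def run_command(image, tag="latest"):
    """Argv for running a Docker image directly, without a separate build."""
    return ["singularity", "run", docker_uri(image, tag)]


if __name__ == "__main__":
    # Print the commands rather than executing them, so nothing is downloaded.
    print(" ".join(build_command("ubuntu.sif", "ubuntu", "18.04")))
    print(" ".join(run_command("ubuntu", "18.04")))
```

Printing the commands first makes it easy to review them before handing them to a shell or a job scheduler on an HPC system.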

References:

[1] Kurtzer GM, Sochat V, Bauer MW (2017) Singularity: Scientific containers for mobility of compute. PLoS ONE 12(5): e0177459. https://doi.org/10.1371/journal.pone.0177459

This and previous posts (below) are based on notes from the CyVerse Container Camp hosted at the University of Arizona, Tucson, AZ. I attended as part of my efforts to design the advanced bioinformatics portion of Shoreline Community College's Immuno-biotechnology certificate.

Previous posts:
Solving the Bioinformatics Singularity with Containers  
Docker - can it be contained?
