Emma

Emma is a project to create a platform for developing applications on Spark and Docker Swarm clusters.

4 contributors
568 commits | Last update: April 17, 2018

What Emma can do for you

  • It is designed for users deploying Spark and Docker Swarm clusters on a cloud infrastructure.
  • It helps the user prepare cloud virtual machines.
  • The machines are provisioned with Ansible, an automation tool for IT infrastructure.
  • It gives users command-line access to install the required libraries and systems, configure them, start and stop services, add new modules for Jupyter notebooks, and even update the firewall.

Emma is an open-source project to create a platform for developing applications on Spark and Docker Swarm clusters. The platform runs on an infrastructure composed of virtual machines that must be reachable via SSH. The machines are either cloud virtual machines or Vagrant machines. Vagrant allows the platform to be simulated on a local machine, i.e. in a local development environment.

Once the machines are prepared, the servers are provisioned using Ansible, an automation tool for IT infrastructure. Ansible playbooks are used to create a storage layer, a processing layer, and JupyterHub services. The storage layer offers two flavors of storage: file-based, using GlusterFS and the Hadoop Distributed File System (HDFS), and object-based, using Minio. The processing layer consists of an Apache Spark cluster and a Docker Swarm that share the storage instances.
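
The layered layout described above could be expressed as an Ansible playbook roughly like the sketch below. This is an illustration only, not a file from the Emma repository: the inventory group names (storage, processing, jupyterhub) and all role names are hypothetical, chosen to mirror the three layers named in the text.

```yaml
# Hypothetical sketch: group and role names are illustrative assumptions,
# not taken from the Emma repository.

- name: Storage layer (file-based and object-based)
  hosts: storage            # assumed inventory group
  become: true
  roles:
    - glusterfs             # hypothetical role: file-based storage
    - hdfs                  # hypothetical role: file-based storage
    - minio                 # hypothetical role: object-based storage

- name: Processing layer that shares the storage instances
  hosts: processing         # assumed inventory group
  become: true
  roles:
    - spark                 # hypothetical role: Apache Spark cluster
    - docker_swarm          # hypothetical role: Docker Swarm

- name: JupyterHub services
  hosts: jupyterhub         # assumed inventory group
  become: true
  roles:
    - jupyterhub            # hypothetical role
```

A playbook of this shape is normally applied with ansible-playbook and an inventory that lists the SSH-reachable machines. Because each play only targets an inventory group, the same playbook can in principle be pointed at Vagrant machines or at cloud virtual machines by swapping the inventory.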

With Ansible we are able to deploy a platform with the same features at different locations, such as a local cluster, a national infrastructure, or even a commercial cloud provider. This gives us tool provenance, which makes it easy for scientists to repeat each other's experiments.

Tags
  • Big data
Programming Language
  • YAML
License
  • Apache-2.0

Contributors

  • Niels Drost
    Netherlands eScience Center
  • Stefan Verhoeven
    Netherlands eScience Center
  • Jisk Attema
    Netherlands eScience Center
  • Romulo Gonçalves
    Netherlands eScience Center