The Data Mover Service offers users a simple yet rich connection between HPC file systems and the cloud object storage at the respective Fenix site.
We designed the Data Mover Service based on Nodeum. It integrates with Slurm, includes a sophisticated Bash client, and authenticates users through the central Fenix AAI.
Fenix Research Infrastructure (www.fenix-ri.eu)
Fenix is an e-infrastructure in which data repositories and scalable supercomputing systems are in close proximity and well integrated. The first implementation is being carried out through the ICEI (Interactive Computing E-Infrastructure) project, which is part of the European Human Brain Project (HBP).
Six European supercomputing centers have agreed to align their services to facilitate the creation of this infrastructure:
- Barcelona Supercomputing Center (BSC) - Spain
- Commissariat à l'énergie atomique et aux énergies alternatives (CEA) - France
- CINECA - Italy
- CSC - Finland
- Swiss National Supercomputing Centre (CSCS) - Switzerland
- Jülich Supercomputing Centre (JSC) - Germany
Researchers from or associated with the HBP are the initial prime users of this e-infrastructure.
Today's supercomputing systems are so powerful and scalable that data is generated faster than ever before. Research centers have to store the generated content in “data repositories” that are located close to each other and are well integrated.
Two different categories of data repositories are used as storage tiers:
- Active Data Repositories, which provide the performance needed when supercomputing systems write data
- Archival Data Repositories, which offer the interfaces used in cloud systems and are better suited to data sharing
Research facilities have an automatic data mover engine that interoperates between the two types of data repositories.
This ideal solution gives users the following features:
- Organized movement of data from the Active to the Archival Data Repository
- Direct user access to both the Active and the Archival Data Repositories
- Integration with HPC workload managers such as Slurm
- A public API and SDK to facilitate integration with specific research applications
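Moving data from the Active to the Archival Data Repository is usually driven by some selection rule. As a minimal sketch only, and not the Data Mover's or Nodeum's actual API, the Python below picks the files in an active directory that have not been modified for a given number of days. The function name `select_for_archive` and the age-based rule are assumptions made purely for illustration.

```python
import os
import time
from pathlib import Path


def select_for_archive(active_root, min_age_days, now=None):
    """Return files under active_root last modified at least min_age_days ago.

    Hypothetical policy for illustration; a real data mover may use
    different criteria (project, size, explicit user request, ...).
    """
    now = time.time() if now is None else now
    cutoff = now - min_age_days * 86400  # 86400 seconds per day
    candidates = []
    for dirpath, _dirnames, filenames in os.walk(active_root):
        for name in filenames:
            path = Path(dirpath) / name
            # Keep only files whose modification time is at or before the cutoff
            if path.stat().st_mtime <= cutoff:
                candidates.append(path)
    return sorted(candidates)
```

The returned list could then be handed to whatever transfer mechanism the site provides; the sketch itself only reads file metadata and never moves or deletes anything.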
Federated cloud object stores are used at these locations; through standard Swift and S3 interfaces, they enable researchers to exchange their data. With the Data Mover, researchers can copy their data to the local parallel file systems in order to process it on the fast HPC systems.
In addition, researchers can copy data generated on these supercomputers to the cloud object storage in order to make it accessible to other researchers all over the world.
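The two copy directions described above fit naturally around a compute job: stage data in, process it, stage results out. The following Python is a rough sketch of how such a chain could be submitted with Slurm job dependencies. It does not show the Data Mover's own Slurm integration; the script names `stage_in.sh`, `compute.sh`, and `stage_out.sh` are hypothetical placeholders for whatever copy and compute commands a site provides. Only the standard `sbatch` options `--parsable` and `--dependency=afterok:<jobid>` are used.

```python
import subprocess


def sbatch_command(script, depends_on=None):
    """Build the sbatch argument list for one script, optionally gated on a prior job."""
    cmd = ["sbatch", "--parsable"]
    if depends_on is not None:
        # afterok: start only if the previous job completed successfully
        cmd.append(f"--dependency=afterok:{depends_on}")
    cmd.append(script)
    return cmd


def run_sbatch(cmd):
    """Submit the job and return the job ID printed by --parsable."""
    out = subprocess.run(cmd, check=True, capture_output=True, text=True).stdout
    # --parsable prints "jobid" or "jobid;cluster"
    return out.strip().split(";")[0]


def submit_chain(scripts, submit=run_sbatch):
    """Submit scripts in order, each one depending on the job before it."""
    job_ids = []
    previous = None
    for script in scripts:
        previous = submit(sbatch_command(script, depends_on=previous))
        job_ids.append(previous)
    return job_ids
```

On a Slurm login node, `submit_chain(["stage_in.sh", "compute.sh", "stage_out.sh"])` would queue the three steps so that the compute job starts only after the stage-in succeeds, and the stage-out only after the compute job succeeds. Passing a different `submit` callable makes it possible to inspect the generated commands without a Slurm installation.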