Move & Migrate Unstructured Data

Data management requirements are becoming increasingly important, and even mandatory in some industries, especially for research centers. These organizations need solutions that automate data transfer between different storage systems.

Data Mover provides the following capabilities:

  • Organize data movement between different storage repositories
  • Maintain direct user access to the different tiers of data repositories, whether active or archived
  • Integrate with workload managers to improve the efficiency of HPC/supercomputing resources
  • Provide a public API and SDK to facilitate integration with specific search applications
  • Include end-user interfaces (HTML GUI and bash client) so users can manage their own data movements

Researchers can copy data locally to parallel file systems to process them on the fastest HPC resources.


Copy data generated on supercomputers to cloud object storage to make it accessible to researchers all over the world.

Key Benefits of Automated Workflow

Policy Levels

Define tasks with a specific policy level, such as Copy or Move. This allows organizations to better control their data.

Extended Data Set Selector

An unlimited data set selector brings granularity to the data you select. Movement can be executed between any type of storage, from storage holding hot data to cold storage.


The internal scheduler can execute tasks automatically, manually, or on a schedule. This allows organizations to optimize their data movement tasks by scheduling them during off-peak hours or when resources are available.

Basic and advanced filtering options are available. Advanced filtering provides an embedded scripting editor for even greater customization.
Workflow Manager

Policy-Based Workflow Manager

A Data Mover Designed to Control Any Movement

Data Movement Workflows

The platform offers robust task management capabilities that allow users to control and monitor data movement workflows. Users can easily create and manage tasks for migrating data between different storage tiers. A centralized dashboard lets users monitor task progress and get real-time updates on the status of their data.

Monitoring and reporting capabilities

These capabilities enable users to track the progress of their data movement workflows over time. Detailed reporting and analytics allow users to gain insights into workflow performance and identify areas for optimization.


Scalable - Safe - Optimized Data Movement


The solution is designed to optimize file data management processing for different types of files. Three concepts help explain how:

  • The system automates the processing order.
  • The solution is built around multi-threaded file batching technology. Each movement task is defined by three stages: preparation, execution, and finalization. Tasks run in parallel via a batch system that creates jobs; each job contains a list of files, and the number of files in a job is determined either by a maximum file count or by a total capacity threshold.
  • It includes different queuing systems that allow the different types of actions required during data movement processing to execute in parallel.

The benefit is that no time is lost in the discovery phase: files start to be copied as soon as the first batch of data has been received, while the discovery process continues to run during the copy.
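As a rough sketch of the batching idea described above — a job closes when it hits either a file-count limit or a capacity limit, so copying can begin on the first job while discovery continues — consider the following Python snippet. The thresholds and file list are illustrative, not the product's actual defaults:

```python
from typing import Iterator, List, Tuple

# Assumed thresholds for illustration; the real limits are configurable.
MAX_FILES_PER_JOB = 4
MAX_BYTES_PER_JOB = 1_000_000

def batch_files(discovered: Iterator[Tuple[str, int]]) -> Iterator[List[Tuple[str, int]]]:
    """Group (path, size) pairs into jobs: a job closes as soon as it
    reaches MAX_FILES_PER_JOB files or MAX_BYTES_PER_JOB bytes, so the
    first job can start copying while discovery is still running."""
    job: List[Tuple[str, int]] = []
    job_bytes = 0
    for path, size in discovered:
        job.append((path, size))
        job_bytes += size
        if len(job) >= MAX_FILES_PER_JOB or job_bytes >= MAX_BYTES_PER_JOB:
            yield job
            job, job_bytes = [], 0
    if job:
        yield job  # flush the last, partially filled job

files = [("/data/a", 100), ("/data/b", 999_950),
         ("/data/c", 10), ("/data/d", 10), ("/data/e", 10), ("/data/f", 10)]
jobs = list(batch_files(iter(files)))
# The first job is closed by the byte threshold, the second by the file-count limit.
```

Because `batch_files` is a generator, a consumer can start copying the first yielded job immediately, without waiting for the full file listing.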


The Workflow Manager includes a scheduler that allows the processing of different tasks launched from different inputs.

Execution is done automatically, manually, or on a schedule.


The Workflow Manager includes a powerful filtering module. This feature allows users to easily manage and organize their data movement workflows by filtering files based on specific criteria.

Two types of filtering are available: basic and advanced. Advanced filtering provides an embedded scripting editor.

Users can set up filters that automatically exclude or include files. Various criteria are available: file size, creation date, modification date, file type, and more.

Users can also create more complex filter rules that combine multiple criteria to further refine their selection.



  • Filters allow users to easily manage large volumes of data and automate their workflows. By setting up filters, users can ensure that only relevant files are included in their data movement tasks, which can speed up transfers and reduce the risk of errors or data loss.
  • Filtering is highly customizable, allowing users to create filters tailored to their specific needs. This can include filters for specific file types, folders, or directories, as well as custom rules based on metadata or other criteria.
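To make the idea of combining multiple criteria concrete, here is a minimal Python sketch of a filter predicate. It is illustrative only — the product's actual filter engine (basic rules plus a scripting editor) is not reproduced here:

```python
import fnmatch

# Illustrative sketch: build a predicate that combines size,
# modification-time, and file-name-pattern criteria into one rule.
def make_filter(min_size=0, max_mtime=None, patterns=("*",)):
    def include(name, size, mtime):
        if size < min_size:
            return False  # excluded by the size criterion
        if max_mtime is not None and mtime > max_mtime:
            return False  # excluded by the modification-time criterion
        # included only if the name matches at least one pattern
        return any(fnmatch.fnmatch(name, p) for p in patterns)
    return include

# Example rule: keep only video files of at least 1 KiB.
keep = make_filter(min_size=1024, patterns=("*.mov", "*.mxf"))
```

A data movement task would then apply `keep(name, size, mtime)` to each discovered file to decide whether it joins the transfer.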

The Workflow Manager enables priority management.

It ensures that higher-priority tasks do not consume all the available resources, leaving other workflows waiting indefinitely.

Nodeum uses Quality of Service (QoS) techniques to prioritize data movement operations based on their priority level. By assigning different priority levels to different tasks, it ensures that critical demands receive the necessary resources while still providing fair access to non-critical demands.

Additionally, Nodeum uses Fair Queuing techniques to allocate resources to different requests, ensuring that each remaining request gets an equal share of the remaining resources. This method helps prevent one data movement workflow from taking up all the available resources, ensuring that all remaining workflows get fair access.

Overall, the priority management and QoS techniques implemented in Nodeum ensure efficient resource allocation and workflow management, preventing resource contention and ensuring that all workflows receive fair access to resources.
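One simple way to reason about the combination of QoS and fair queuing described above is a toy allocation model: critical tasks get the bandwidth they demand (capped at the total), and whatever remains is split equally among the non-critical tasks. This sketch illustrates the idea only; it is not Nodeum's actual scheduler:

```python
# Toy model: critical tasks receive their demand (capped at the total
# available), and the remainder is shared equally among non-critical
# tasks -- so no single workflow can starve the others.
def allocate(total_bw, critical_demand, n_noncritical):
    critical = min(critical_demand, total_bw)
    fair_share = (total_bw - critical) / n_noncritical if n_noncritical else 0.0
    return critical, fair_share

# 100 units of bandwidth: critical tasks need 40,
# three non-critical tasks split the remaining 60.
critical, share = allocate(100, 40, 3)
```

Note that even when critical demand exceeds capacity, the allocation is capped, so the model degrades gracefully rather than over-committing resources.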


The workflow manager system natively provides mechanisms to prevent task movement overload.

The system checks the access to the storage before transferring any file.

Nodeum always stops processing cleanly if there is any problem with the source or destination storage.

If the source or the destination is stopped or becomes unreachable, the task is tagged as "stopped by system" and the task log is updated with the root cause and error code. The file log is also updated with the status of each file that was not processed correctly, along with the reason.
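A pre-flight check of this kind can be sketched in a few lines of Python. This is a hypothetical model of the behaviour described above (verify both endpoints before moving any file, and report a root cause so the task can be tagged "stopped by system"), not the product's implementation:

```python
import os

# Hypothetical pre-flight check: verify the source is readable and the
# destination is writable before any file is transferred; on failure,
# return the root causes so the task can be stopped cleanly and logged.
def preflight(source, destination):
    errors = []
    if not (os.path.isdir(source) and os.access(source, os.R_OK)):
        errors.append(f"source not readable: {source}")
    if not (os.path.isdir(destination) and os.access(destination, os.W_OK)):
        errors.append(f"destination not writable: {destination}")
    return len(errors) == 0, errors

ok, errors = preflight("/tmp", "/tmp")
```

If `ok` is false, the task would never start copying; the `errors` list carries the root cause for the task log.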


Data Integrity Verification

Data integrity is a crucial aspect of any data management solution. Ensuring that stored data is accurate, complete, and unchanged is essential. This maintains user confidence and protects the organization from potential liability.

The Workflow Manager offers two main mechanisms: non-cryptographic hashes (e.g., xxHash) and cryptographic hashes (e.g., MD5).


The MD5 hash algorithm is a commonly used function for validating data integrity. An MD5 checksum is a 32-digit hexadecimal number representing the hash of a file's contents.

MD5 calculation is an industry standard, so integrity can be checked on any system.
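Because MD5 is standardized, a checksum computed on one system can be verified anywhere. A minimal, chunked implementation using Python's standard library looks like this (the chunk size is an arbitrary illustrative choice):

```python
import hashlib

def md5_checksum(path, chunk_size=1 << 20):
    """Return the 32-character hexadecimal MD5 digest of a file,
    reading it in chunks so large files never need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Comparing this digest against the value recorded at write time (or against `md5sum` output on another system) confirms the file is unchanged.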


xxHash is an extremely fast hash algorithm, so hashes are quicker to generate. It is highly portable, and hashes are identical across all platforms (little/big endian). It is increasingly popular in video devices; Nodeum uses the xxHash64be algorithm, which is compatible with other products and software in the industry.


Hooks & Callbacks

Nodeum's hooks feature allows users to execute custom scripts or commands during specific events within each data movement, such as before or after a data movement task.

These custom scripts or commands can be used to automate additional tasks, integrate with other systems, or perform specific actions based on the event that has occurred.

For example, a user can configure a hook to run a custom script after a data movement task has completed. This script could then perform additional actions such as sending an email notification, updating a metadata database, or triggering an event.

Hooks can be created through the Nodeum Console, the ND Client, or the RESTful API interface. The hook configuration allows users to specify the event that triggers the hook, the script or command to run, and any parameters or variables to be passed to it.
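As a hypothetical example of what a post-task hook script might do, the handler below formats a notification message from an event payload. The field names used here (`task_name`, `status`) are illustrative assumptions, not Nodeum's documented event schema — consult the product documentation for the actual payload passed to hooks:

```python
import json

# Hypothetical post-task hook handler: the payload fields "task_name"
# and "status" are illustrative, not the product's documented schema.
def handle_event(event):
    task = event.get("task_name", "unknown")
    status = event.get("status", "unknown")
    return f"task {task} finished with status {status}"

# A hook runner would typically hand the script a JSON payload.
payload = json.loads('{"task_name": "nightly-archive", "status": "completed"}')
message = handle_event(payload)
```

In practice the returned message could feed an email notification, a metadata database update, or a downstream event trigger, as described above.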

Using Nodeum hooks can greatly enhance the automation and integration capabilities of the solution, allowing users to customize workflows and extend the functionality of the platform to meet their specific needs.

Get Started With Nodeum

Start the download, or compare the features of the different editions.


Searchable Catalog

Catalog, in real time, data stored in any of your storage systems (primary and secondary). Accessing and finding your data has never been easier.

Leverage Cloud ML/AI

Enrich data stored on premises or in the cloud with the best cloud-based ML/AI engines. Get the most out of these platforms to facilitate the organization and classification of your content.

Reporting & Analysis

Storage usage statistics reporting and a data lifecycle overview. Generate reports on current data usage, and control and master your data lifecycle.

TCO Simulator

A toolbox to control the cost of your storage usage. It simulates the cost impact of a new storage data management strategy.


Turn any Linux platform into a Nodeum appliance by easily deploying an Ansible package. Get the most out of your hardware, avoid lock-in, and benefit fully from future releases.

Highly Scalable

Scalability in data volumes, number of files, and overall performance allows the solution to grow with your needs while keeping the same user experience.


Enjoy the capability to customize, tag, and retrieve your files' metadata, making your data easier to organize.

Intuitive Interface

Provides a natural, consistent experience for every user of the solution, helping you keep your focus on business and productivity.

Key Features

Discover the full set of features available in Nodeum. Develop a business-focused data management strategy to unlock the value and potential of your data.