A Novel Hadoop Security Model for Addressing Malicious Collusive Workers

By

Sauber A.M.

Awad A.

Shawish A.F.

El-Kafrawy P.M.

Artificial Intelligence

With the daily increase of data production and collection, Hadoop is a platform for processing big data on a distributed system. A master node globally manages running jobs, whereas worker nodes process partitions of the data locally. Hadoop uses MapReduce as an effective computing model. However, Hadoop experiences a high level of security vulnerability over hybrid and public clouds. Specially, several workers can fake results without actually processing their portions of the data. Several redundancy-based approaches have been proposed to counteract this risk. A replication mechanism is used to duplicate all or some of the tasks over multiple workers (nodes). A drawback of such approaches is that they generate a high overhead over the cluster. Additionally, malicious workers can behave well for a long period of time and attack later. This paper presents a novel model to enhance the security of the cloud environment against untrusted workers. A new component called malicious workers' trap (MWT) is developed to run on the master node to detect malicious (noncollusive and collusive) workers as they convert and attack the system. An implementation to test the proposed model and to analyze the performance of the system shows that the proposed model can accurately detect malicious workers with minor processing overhead compared to vanilla MapReduce and Verifiable MapReduce (V-MR) model [1]. In addition, MWT maintains a balance between the security and usability of the Hadoop cluster. © 2021 Amr M. Sauber et al.