Pipistrello

Pipistrello is a utility for creating and running Map/Reduce jobs that use (almost) any executable or script as the mapper and/or the reducer, in much the same way as hadoop-streaming. The fundamental difference is that Pipistrello is specifically designed to operate on binary files (NetCDF, jpg, png, mp3... you name it!).
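To make the streaming idea concrete, here is a minimal sketch of a mapper and a reducer written in the hadoop-streaming style, where each stage exchanges tab-separated key/value lines. It is not Pipistrello's actual interface: how Pipistrello hands binary files to the mapper is not described here, so the sketch assumes (hypothetically) that the mapper receives one file path per line. The function names `mapper` and `reducer` are illustrative only.

```python
import os
from collections import defaultdict


def mapper(paths):
    """Emit one 'key<TAB>value' line per path: file extension and size in bytes.

    ASSUMPTION: input is one file path per line. This is a hypothetical
    convention chosen for the sketch, not Pipistrello's documented behaviour.
    """
    for path in paths:
        path = path.strip()
        if not path:
            continue
        ext = os.path.splitext(path)[1].lower() or "(none)"
        # Open in binary mode so the payload is never decoded as text.
        with open(path, "rb") as handle:
            size = len(handle.read())
        yield f"{ext}\t{size}"


def reducer(pairs):
    """Sum the values of each key and emit 'key<TAB>total' lines, sorted by key."""
    totals = defaultdict(int)
    for line in pairs:
        key, value = line.rstrip("\n").split("\t")
        totals[key] += int(value)
    for key in sorted(totals):
        yield f"{key}\t{totals[key]}"
```

To turn either function into a standalone executable stage, it would be enough to loop over `sys.stdin` and print each yielded line. Because the stages only communicate through plain lines, the same job could just as well be written as a shell script or a compiled program.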

Background

Pipistrello’s original goal is to bring powerful technologies used in industry (Google, Facebook, Spotify, Amazon...) into scientists’ laboratories. As the amount of data generated by science grows at unprecedented speed, scientists need software that can keep up and harness the power of their data. Unfortunately, the software available in their labs has so far been quite poor compared to the software they use at home to handle their music or photo collections. Science deserves better.

Acknowledgements

This project was developed entirely by Juan Manuel Carmona-Loaiza as part of the Master in High Performance Computing (MHPC) of the International Centre for Theoretical Physics (ICTP) and the Scuola Internazionale Superiore di Studi Avanzati (SISSA), under the supervision of Graziano Giuliani. It was funded by the Istituto Nazionale di Oceanografia e di Geofisica Sperimentale (OGS) and CINECA under the HPC-TRES program, award number 2015-04.