Michael Jing Xu
MIT EECS - Actifio Undergraduate Research and Innovation Scholar
Hardware-Accelerated Map-Reduce on Distributed Flash Storage
2014–2015
Arvind Mithal
BlueDBM is a distributed flash data store that aims to accelerate analytics over large volumes of data. At its core, BlueDBM comprises a network of flash storage devices linked by FPGA controllers. In addition to providing low-latency communication, the FPGAs can implement powerful application-specific accelerators. Unfortunately, to reap the benefits of these FPGA-based accelerators today, users must write code in a hardware description language. Most distributed systems designers don’t know Verilog or Bluespec, but are comfortable with high-level languages like Pig Latin for Hadoop. Our project involves designing a similar high-level framework for defining MapReduce programs on BlueDBM. Our goal is to make BlueDBM as usable as it is powerful.
At Palantir, I spent time rewriting their custom finance language using a newer version of ANTLR. At Google, I improved the reindexer for the storage system responsible for the company’s corporate data. At MIT, Distributed Systems and Multicore Programming got me interested in distributed computing. But it was Computer System Architecture, one of my favorite courses, that led me to this project and hardware in general.