Storm: a fast transactional dataplane for remote data structures

  • Stanko Novakovic
  • Yizhou Shan
  • Aasheesh Kolli
  • Michael Cui
  • Yiying Zhang
  • Haggai Eran
  • Boris Pismenny
  • Liran Liss
  • Michael Wei
  • Dan Tsafrir
  • Marcos Aguilera

12th ACM International Systems and Storage Conference (SYSTOR)

Organized by ACM, USENIX

RDMA technology enables a host to access the memory of a remote host without involving the remote CPU, improving the performance of distributed in-memory storage systems. Previous studies argued that RDMA suffers from scalability issues because the NIC’s limited resources cannot cache the state of all concurrent network streams at once. These concerns led to various software-based proposals that reduce the size of this state at the cost of performance.

We revisit these proposals and show that they no longer apply when using newer RDMA NICs in rack-scale environments. In particular, we find that one-sided remote memory primitives perform better than the previously proposed unreliable-datagram and kernel-based stacks. Based on this observation, we design and implement Storm, a transactional dataplane that uses one-sided read and write-based RPC primitives. We show that Storm outperforms eRPC, FaRM, and LITE by 3.3x, 3.6x, and 17.1x, respectively, on an InfiniBand cluster with Mellanox ConnectX-4 NICs.