Seattle Conference on Scalability: GIGA+: Scalable Directories for Shared File Systems

Posted in Conferences, Companies on December 23, 2008

Traditionally, file system designs have envisioned directories as a means of organizing files for human viewing; that is, directories typically contain from a few tens to thousands of entries. Users of large, fast file systems have begun to put millions of entries in single directories, probably as simple databases. Furthermore, many large-scale applications create a file per compute core in bursts, on clusters with tens to hundreds of thousands of cores.

This talk is about how to build file system directories that contain billions to trillions of entries and grow the number of entries instantly with all cores contributing concurrently. The central tenet of our work is to push the limits of scalability by minimizing serialization, eliminating system-wide synchronization, and using weaker consistency semantics. We build a distributed directory index, called GIGA+, that uses a unique, self-describing bitmap representation that allows the servers to encode all their local state in a compact manner and provides the clients with the hints required to address the correct server. GIGA+ also handles operational realities like client and server failures, addition and removal of servers, and "request storms" that overload any server. I'll describe the implementation of our prototype in the PVFS2 parallel file system and an experimental evaluation that demonstrates a high degree of scalability.
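The abstract does not spell out the bitmap layout, so the following is only a rough sketch of how a self-describing bitmap could let a client with stale information still reach a useful server. Everything here is an assumption made for illustration, not the PVFS2 prototype: the function names (`locate`, `depth_of`, `split`, `server_for`), the rule that partition i at depth r splits into i and i + 2^r, the SHA-1 hash, and the modulo mapping of partitions to servers.

```python
import hashlib

# Bit j of the bitmap is set when partition j of the directory exists.
# Assumed split rule: partition i at depth r keeps the hashes whose low
# (r + 1) bits equal i and hands the rest to a new partition i + 2**r.

def name_hash(name: str) -> int:
    """Hash a file name to a 64-bit integer (SHA-1 is an arbitrary choice)."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest()[:8], "big")

def locate(bitmap: int, h: int) -> int:
    """Return the partition that holds hash h according to this bitmap."""
    i, r = 0, 0
    # While partition i has already split at depth r, descend one level.
    while (bitmap >> (i + (1 << r))) & 1:
        r += 1
        i = h & ((1 << r) - 1)
    return i

def depth_of(bitmap: int, i: int) -> int:
    """Current depth of partition i, derived from the bitmap alone."""
    r = i.bit_length()  # depth at creation; 0 for the root partition
    while (bitmap >> (i + (1 << r))) & 1:
        r += 1
    return r

def split(bitmap: int, i: int) -> tuple[int, int]:
    """Split partition i; return the updated bitmap and the new partition."""
    child = i + (1 << depth_of(bitmap, i))
    return bitmap | (1 << child), child

def server_for(partition: int, num_servers: int) -> int:
    """Hypothetical placement policy: partitions are spread round-robin."""
    return partition % num_servers

# The servers' view after three splits: partitions 0, 1, 2 and 6 exist.
server_bitmap = 1
server_bitmap, _ = split(server_bitmap, 0)   # creates partition 1
server_bitmap, _ = split(server_bitmap, 0)   # creates partition 2
server_bitmap, _ = split(server_bitmap, 2)   # creates partition 6

# A client that only saw the first split addresses an ancestor partition.
# The server that receives the request can reply with its bitmap, and the
# client merges it with a bitwise OR; no global synchronization is needed.
client_bitmap = 0b11
stale_guess = locate(client_bitmap, 6)
client_bitmap |= server_bitmap
fresh_guess = locate(client_bitmap, 6)
```

In this sketch a stale client never addresses a partition that does not exist; at worst it reaches an ancestor of the right partition, which is what makes a lazily updated, weakly consistent client view workable.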

(This is joint work with Garth Gibson at CMU.)

Speaker: Swapnil Patil
Swapnil Patil is a third-year Ph.D. student in CS at Carnegie Mellon University, working with Professor Garth Gibson. For fun (and profit), he likes to think about end-to-end issues in large-scale computer systems, particularly parallelism, reliability, and scalability. At CMU, he is a member of the Parallel Data Lab (PDL) and the Petascale Data Storage Institute (PDSI). He received his Bachelor's degree in Computer Science from Stanford University with honors in 1992.

Google Tech Talks
June 14, 2008

Watch Video

Tags: Techtalks, Google, Conferences, Scalability, engEDU, Education, Google Tech Talks, filesystem, Conference on Scalability 2008, GIGA+, Companies