Dryad and DryadLINQ

Chandu Thekkath, Microsoft Research India, Scientia 196/36, 2nd Main, Sadashivnagar, Bangalore 560 080
Wednesday, 27 Jan 2010 (all day)
This talk describes a set of distributed services developed at Microsoft Research Silicon Valley to enable efficient parallel programming on very large datasets. Parallel programmes arise naturally in scientific, data mining, and business applications. Central to our philosophy is the notion that parallel programmes do not have to be difficult to write, and that the same programme must run seamlessly on a laptop, a desktop, a small cluster, or a large data center without the author having to worry about the details of parallelization, synchronization, or fault tolerance. Dryad and DryadLINQ are two services that embody this belief. The combination is used extensively within Microsoft and is available free of charge to academics, researchers, and non-commercial users. Our goal is to enable users, particularly those who are not computer scientists, to treat a computer cluster as a forensic, diagnostic, or analytic tool. The talk will describe the details of the system and the characteristics of some of the applications that have been run on it.