
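If you have a lot of data and network I/O is a big issue, you'll want to use something like Hadoop (or Disco), because they come with an integrated distributed file system and preserve data locality: the work gets scheduled on the nodes that already hold the data instead of shipping the data across the network.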

If you don't have that much data, MapReduce on Redis is fine.
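
For a sense of what the small-data case looks like, here is a rough word-count sketch of the map and reduce steps. It is only an illustration under assumptions: a plain in-memory dict stands in for Redis so the snippet runs anywhere, and the names `map_phase` and `reduce_phase` are made up for this example. With redis-py, the accumulate step would be roughly a `hincrby` call on a hash.

```python
from collections import defaultdict


def map_phase(docs):
    """Emit (word, 1) pairs for every word in every document."""
    for doc in docs:
        for word in doc.lower().split():
            yield word, 1


def reduce_phase(pairs, store=None):
    """Sum the values per key.

    `store` is a dict standing in for a Redis hash (an assumption for this
    sketch). With redis-py the line below would be roughly:
        r.hincrby("counts", key, value)
    """
    store = defaultdict(int) if store is None else store
    for key, value in pairs:
        store[key] += value
    return store


docs = ["redis is fine", "hadoop is big"]
counts = reduce_phase(map_phase(docs))
print(dict(counts))
```

Everything here funnels through a single store, which is exactly why it only makes sense when the data is small enough that moving it over the network isn't the bottleneck.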


