On the interview for my current job, they asked me what I would do if I needed to download the entire Wikipedia. They expected me to design a scalable system with a separate crawler, parser, download queue, exponential backoff, and stuff. I said that I would just download a tarball. To their credit, they quickly accepted the answer as correct.