Is anyone using Athena on AWS for tasks like this? It goes without saying that the documentation is hit or miss, but SQL-ish queries of flat files on S3 (even gzipped) can be a nice way to get the same result without managing Spark instances.
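For anyone who hasn't tried it, here is a rough sketch of what I mean, in Python with boto3. The table, column, database, and bucket names are all made up for illustration, and I'm assuming tab-delimited files. As far as I know, Athena decompresses `.gz` objects on its own based on the file extension, so the DDL doesn't mention compression at all.

```python
import time


def build_table_ddl(table, columns, s3_prefix, delimiter="\\t"):
    """Return Athena DDL mapping delimited text files under s3_prefix to a table.

    columns is a list of (name, sql_type) pairs; all names here are hypothetical.
    """
    cols = ",\n  ".join(f"`{name}` {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n  {cols}\n)\n"
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY '{delimiter}'\n"
        f"LOCATION '{s3_prefix}'"
    )


def run_query(sql, database, output_location, poll_seconds=2.0):
    """Submit a query to Athena and wait until it reaches a final state."""
    import boto3  # imported here so the DDL helper works without AWS set up

    athena = boto3.client("athena")
    started = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]
    while True:
        execution = athena.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            return query_id, state
        time.sleep(poll_seconds)


# Example usage (needs AWS credentials and an existing Athena database):
# ddl = build_table_ddl(
#     "samples",
#     [("id", "string"), ("score", "double")],
#     "s3://example-bucket/data/",
# )
# run_query(ddl, "example_db", "s3://example-bucket/athena-results/")
# run_query("SELECT id, score FROM samples LIMIT 10",
#           "example_db", "s3://example-bucket/athena-results/")
```

Results land as CSV under the output location, and there is no cluster to start or shut down, which is really the whole appeal.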
Obviously, I can't speak for "anyone", but none of the groups I work with do. There are myriad reasons for this, and reasonable people could argue the merits of each.
It's also important to remember that some of the reasons AWS, Google, and other cloud services often are NOT used are legal in nature. For example, some EU laws prohibit putting any personally identifiable (genetic) data from studies in the cloud. So, even if summary statistics, or data with the PII removed, can be put in the cloud, work has to be done on the data first to strip that PII.