I have been advocating for the longest time that "repeatable read" is just a bad idea, even if implementations were perfect. Even when it works correctly in the database, it is still very tricky to reason about when dealing with complex queries.
I think the only two isolation levels that make sense are:
* read committed
* serializable
You either go all the way to a serializable setup, where there are no surprises. Or you go in the read committed direction, where it is obvious that if you want a consistent view of the data within a transaction, you have to lock the rows before you start reading them.
Read committed is very similar to just regular multi-threaded code and its memory management, so most engineers can get a decent intuitive sense for it.
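To make the analogy concrete, here is a toy Python sketch (hypothetical, not tied to any real database): reading two pieces of shared state without holding a lock across both reads can observe an inconsistent pair, just like two statements inside a read committed transaction; holding the lock across both reads is the analogue of locking the rows up front.

```python
import threading

# Shared "rows": the invariant is that a + b == 100 at all times.
state = {"a": 50, "b": 50}
lock = threading.Lock()

def transfer(amount):
    # Writer preserves the invariant: moves value from a to b atomically.
    with lock:
        state["a"] -= amount
        state["b"] += amount

def read_inconsistent():
    # Like two SELECTs under read committed: each read individually sees
    # committed data, but the pair may straddle a concurrent write.
    a = state["a"]
    # ...a writer thread could commit right here...
    b = state["b"]
    return a, b

def read_consistent():
    # Like locking the rows before reading: the pair is internally consistent.
    with lock:
        return state["a"], state["b"]

a, b = read_consistent()
assert a + b == 100
```

The intuition engineers already have about "grab the lock before reading related values" maps directly onto "lock the rows before reading them".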
Serializable is so strict that it is pretty hard to make very unexpected mistakes.
Anything in-between is a no man's land. And anything less consistent than Read Committed is no longer really a database.
I honestly don't think I've seen people reason about read committed well; especially as an application grows, it becomes very difficult to understand all the cases in which locks are grabbed and data is accessed. (I feel that way about multi-threaded code and locks too, but that's for another time.)
So I see serializable as the only sane isolation model for r/w transactions, and snapshot isolation as a good model for read-only transactions (basically you get a frozen-in-time snapshot of the database to work with). These also happen to be the only modes Spanner gives you: https://cloud.google.com/spanner/docs/transactions
Yeah, there are many use cases where you aren't really using transactions, or where hard accuracy isn't a requirement (say, you're sampling a percentage of millions of rows) but you need fast write performance. It's very nice and can help you put off transitioning to a "real" data warehouse or a cache for a while.
Here is my issue with this. We are assuming you need to read a "consistent snapshot" in some kind of real-time application. Because if it isn't real time, you can always do that snapshot-style querying on replicas, since that is a lot easier to implement correctly without sacrificing performance.
So assume you want to read a "consistent snapshot" in the context of a real-time transaction. If the data you want to read as a "consistent snapshot" is small, locking + reading is good enough in most cases.
If the data to read is too large (i.e. the query takes a long time to execute and pulls a lot of data), you are going to have a ton of scaling issues if you depend on something like "repeatable read". Long-running transactions, long-running queries, etc. are the bane of database scaling and performance.
So you really want to avoid that anyway; you would almost always be much better off changing your application logic to ensure much shorter, time-bounded transactions and queries, and setting up a better application-level consistency scheme. Otherwise you will at some point hit scaling/performance problems, and they will be an absolute nightmare to fix.
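For instance, a bulk update that would otherwise hold one long transaction open can usually be split into short, time-bounded batches. A sketch using Python's sqlite3 (the table, flag column, and batch size are all made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, processed INTEGER DEFAULT 0)")
conn.executemany("INSERT INTO items (id) VALUES (?)", [(i,) for i in range(1000)])
conn.commit()

BATCH = 100  # arbitrary; tune so each transaction stays short

def process_in_batches(conn):
    while True:
        # Each loop iteration is its own short transaction.
        rows = conn.execute(
            "SELECT id FROM items WHERE processed = 0 LIMIT ?", (BATCH,)
        ).fetchall()
        if not rows:
            break
        conn.executemany("UPDATE items SET processed = 1 WHERE id = ?", rows)
        conn.commit()  # release locks before starting the next batch

process_in_batches(conn)
```

Each batch commits quickly, so no transaction stays open long enough to block vacuuming, replication, or other writers the way one giant transaction would.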
Under MVCC this does not require locks, which is a non-trivial overhead saving for use cases that can tolerate slightly stale data but want that data to be internally consistent, i.e. no read skew (and write skew is irrelevant in read-only queries). This combination of strict serializable + read-only snapshots is quite common in recently developed databases.
That is sort of my point. I think "repeatable read" is fool's gold. You think you won't need to do locking, but it is too easy to make incorrect assumptions about what guarantees "repeatable read" provides, and you can make very subtle mistakes which lead to rare, extremely hard-to-diagnose correctness issues.
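The classic subtle mistake here is write skew: two transactions each read a shared constraint, each see it satisfied in their own snapshot, and each write to a *different* row, jointly breaking the constraint. A toy simulation of the textbook on-call example (purely illustrative, not real transaction code):

```python
# Invariant: at least one doctor must remain on call.
on_call = {"alice": True, "bob": True}

def take_self_off_call(me, snapshot):
    # Each "transaction" checks the invariant against its own snapshot...
    others_on_call = sum(v for k, v in snapshot.items() if k != me)
    if others_on_call >= 1:
        on_call[me] = False  # ...then writes only its own row.

# Both transactions start from the same consistent snapshot, so repeatable
# read / snapshot isolation raises no conflict: they touch disjoint rows.
snap = dict(on_call)
take_self_off_call("alice", snap)
take_self_off_call("bob", snap)

# Each check passed in isolation, yet together they broke the invariant.
assert sum(on_call.values()) == 0
```

Serializable isolation would abort one of the two; under repeatable read both commit cleanly, and the bug only surfaces on the rare occasions the two requests actually race.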
Repeatable-read-style setups also make it much easier to accidentally create much longer-running transactions, and long-running transactions, too many concurrent open transactions, etc. can create really unexpected, very hard-to-resolve performance issues in the long run for any database.
I agree that "read committed" makes it clearer what you're getting than "repeatable read" does. The latter can be convenient and workable if you accept that you will still need to do locking to avoid write skew, etc.
I've used both "read committed" and "repeatable read" with MySQL and learned to deal with each in their own way.
The problem I've seen is with large/long-lived transactions that impact performance, where the solution is to divide writes into smaller transactions in the design. "Read committed" does tend to encourage smaller transactions.