That's true. I use MIT myself (although I don't care if people cite me).
I guess this just all seems rather activist to me but people aren't seeing the big picture; our jobs as programmers are about to change in a very big way. It won't be long before more competition enters the space (e.g. Salesforce and Amazon) and ultimately it won't matter if this model saw some GPL/MIT code because the next one will work twice as well without having seen any of it.
People are seeing the big picture. That's all the more reason to make sure that, as new tools get developed, they treat it as a requirement to actually respect Open Source (licensing, credit, provenance, copyleft, patent non-aggression, and all the other reasons people use such licenses), rather than just abusing it as an input.
Perhaps I should have said idealistic. I mean I've seen several people make demands like they're somehow already in a courtroom with Microsoft. No mention made of the fact that this is likely covered under fair use. If it isn't, then basically every deep neural net trained on a web dataset is in violation. This means open works like from University at Heidelberg (Latent Diffusion, VQGAN) or EleutherAI (GPT-J) would be impossible.
Like, maybe you could all get together and learn how machine learning works and train your own clean model? That feels far more positive and in the spirit of progress to me. I guess from my point of view it's clear - no laws will save you from the next wave of large language models. The weights are trivially distributed meaning once it's trained it's more just a fact of life we all have to deal with. So making demands when you have basically zero leverage rather than admitting defeat and working within the new constraints of progress.
I just feel like it's an inherently philosophical position and people are acting like code theft hasn't been common practice for the history of all software.
I guess this just all seems rather activist to me but people aren't seeing the big picture; our jobs as programmers are about to change in a very big way. It won't be long before more competition enters the space (e.g. Salesforce and Amazon) and ultimately it won't matter if this model saw some GPL/MIT code because the next one will work twice as well without having seen any of it.