ai companies are already having trouble training on new data (and are often avoiding it) because so much of it is llm generated; a tiny, tiny fraction of that being grammatically correct nonsense really won't change anything.
also, making enough nonsense for the llms to start reproducing it would require something like a large fraction of the internet being complete (human generated) nonsense. at that point you've ruined the internet in a different, entirely human caused, way.