Coherency requires relating symbolic meanings. AI just uses statistical analysis.
Suppose you were locked in the National Library of Thailand. You don’t speak Siamese, and any pictures or bilingual dictionaries have been removed.
Given a thousand years, you could look at the patterns and produce text similar to what someone who writes Siamese would write, but there’s still no coherency because you cannot connect the meaning behind any of the words.
That doesn’t necessarily mean your outputs are useless, though. Someone who does read Siamese can have you generate outputs until you print out something they can infer a coherent thought from, but you’re fundamentally unable to be trained to do that yourself.
If a human being takes people’s work and pieces it together in a way that resembles other works without using any LLM/AI or automation tool, is the final result content theft too?
We’re getting into ethics territory. IP is a social construct and we live under capitalism, so our model for determining what is and isn’t theft should be chosen by what supports artists and consumers against capitalists.
Given a thousand years, you could look at the patterns and produce text similar to what someone who writes Siamese would write, but there’s still no coherency because you cannot connect the meaning behind any of the words.
That doesn’t necessarily mean your outputs are useless though, someone who does read Siamese can have you generate outputs until you print out something they can infer a coherent thought from, but you’re fundamentally unable to be trained to do that yourself.
You’re comparing an LLM to something like the infinite monkey theorem. In your analogy, you should consider that someone who knows Siamese perfectly is giving me feedback to optimize and improve my outputs, even if I don’t really know the meaning of anything.
While an LLM may not have a consciousness to evaluate whether its output is coherent, it can identify patterns and relationships from its training and can generate text that still appears coherent to human readers.
Ah, the Siamese Room argument.