"Never let the future disturb you. You will meet it, if you have to, with the same weapons of reason which today arm you against the present." — Marcus Aurelius

Today, as we look with amazement at the power of Large Language Models, we have arrived at a moment similar to the past revolutions of Copernicus and Darwin. The capabilities of Large Language Models and, just as significantly, the nature of these models and their training process, have far-reaching implications for our understanding of statistics, science, and the human condition.
Interesting piece, but, in my opinion, a bit too quick to accept "primacy of language" as a foregone conclusion. I find the arguments laid out by Jacob Browning and Yann LeCun here rather persuasive and would be curious to see your response: https://www.noemamag.com/ai-and-the-limits-of-language/. Bottom line is, Moravec's paradox won't go away regardless of how many tokens you include in your context and how much data you train on.
The other thing is Chomsky. By now, I think we can all pretty much agree that it's not very sportsmanlike to beat up on Chomsky, and anyway Chomsky is not the only game in town. In fact, the informational/probabilistic view of language, inspired to a large extent by Shannon, was given by Zellig Harris (who just happened to be Chomsky's advisor). Fernando Pereira had a nice overview of Harris' ideas: https://www.princeton.edu/~wbialek/rome/refs/pereira_00.pdf. Many people, including myself and Cosma Shalizi, have pointed out that LLMs are an instantiation of Harris' informational linguistics. I gave a talk at the Santa Fe Institute in June, where I discussed some of these things; here are the slides if you're interested: https://uofi.app.box.com/s/r32s6wz579astndv1ghcpeyl6ldaej7w.
Overall, though, I do agree that we are witnessing the emergence of a new paradigm of experimental philosophy, so buckle up, everyone!
I don't agree with "Best current LLMs, such as GPT4, have linguistic competence comparable or exceeding that of an average human. Furthermore their ability extends across numerous languages and nearly every domain of human knowledge not requiring manipulation of physical objects"
It is true, I think, that aspects of linguistic competence are suddenly within reach, but this goes too far. Even human beings who are not yet able to formulate complete sentences in their native language have a working grasp of other aspects, such as narrative and relevance, which the LLMs do not reliably command. It's not that one set of abilities is superior to the other; rather, they are incommensurate, in the same way that grandmaster-level chess is neither harder nor easier than making a cup of tea in an unfamiliar kitchen. We are predisposed to be impressed by things that humans find hard.
I like the claim in https://arxiv.org/pdf/2308.16797.pdf: "these models have achieved a proxy of a formal linguistic competence in the most studied languages. That is, its responses follow linguistic conventions and are fluent and grammatical, but they might be inaccurate or even hallucinate ... they also show signs of functional linguistic competence in its responses, i.e., discursive coherence, narrative structure and linguistic knowledge, even if not fully consistent (sometimes they do not consider context or situated information, and fail to adapt to users and domains)."