This is a very interesting observation. It seems like a relatively harmless, though annoying, linguistic evolution. But it highlights what I believe to be one of the biggest dangers of AI, which is that AI content is regurgitated and used everywhere to such a degree that not only are individuals unwittingly consuming AI generated content, but it is actually dramatically affecting the evolution of our collective language and knowledge as a species without us consciously realizing it, and with very little control on our part. This can produce not just annoying linguistic changes, but also changes in collective beliefs and values. Best case scenario, we end up with our culture and values being heavily influenced by random AI artifacts. Worst case scenario, the content policies and training data that companies like OpenAI use can rapidly encode biases, politics, or even nefarious propaganda into our collective consciousness with very little recourse or even awareness on our part.
On the flip side, LLMs have given me great appreciation for my own writing. I'm not great at writing, but I've come to appreciate that I really like the stuff that I write. I derive a ton of joy from having a thought, tossing it around, trying to put it on paper, revising it a zillion times, and reading back what I wrote.
LLMs have helped me ease a lot of creative frustration because I know there's always a way to turn on easy mode and that me staying on hard mode is a choice, not a curse. The alleviation of pressure has helped me put out more natural output.
I had the same reaction. Whether or not it’s worse, tho, it does seem at least different. The regurgitation loop used to be all human, but now there is a machine in the loop. A complex & multifaceted machine, ha.
I have friends who are making AI influencers which actually seem to be gaining quite a bit of traction, so I wouldn't underestimate viral AI-generated content. It's probably already in your feeds, just hard to notice.
Oh please, this is not how things will play out. Artists and content creators will be forced to move to more walled gardens to protect their creative rights. The warnings have been known for decades, nobody listened and now those bills are coming due. The open internet as we know it will become a dead zone of generated and non-professional content. It’s not even about paywalls, the barriers will exist simply for copyright protection. Maybe it’s a good thing.
Have noticed the GPT vibe as well. Lots of Twitter people lately thinking they're smart and letting GPT write tweet replies for them, and I started noticing, even just a couple of words in, that some responses I see read like GPT text. I think this is a good thing honestly, people should develop a sense for AI text and images, the sooner the better, just like we learned to recognize online scams or shady links in general.
> It's interesting to see GPT being utilized in various ways, including helping with tweet replies. Developing awareness about AI-generated content is indeed important, much like recognizing other online hazards. The more we understand its presence and capabilities, the better equipped we are to engage with it wisely.
Is what a ChatGPT reply to that might look like.
Given how widespread ChatGPT is in use now, I think eventually more people will start to write the way that ChatGPT does, even when they are not using ChatGPT.
I thought that some of the metaphysical imagery was really particularly effective. Interesting rhythmic devices too, which seemed to counterpoint the humanity of the author’s compassionate soul which contrives through the medium of the verse structure to sublimate this, transcend that, and come to terms with the fundamental dichotomies of the other, and one is left with a profound and vivid insight.
> I hate the word "utilize" with a passion. It's "use" with extra steps.
Boy, are you gonna hate the English language. It's a veritable treasure trove of words that appear to mean the same thing, but, on closer inspection, are jam-packed full of nuance.
There is a bit of nuance. "Utilize" is more like "make use of" (as in employing something that was otherwise sitting idle) rather than the more straightforward "use".
> "Utilize" is more like "make use of" (as in employing something that was otherwise sitting idle) rather than the more straightforward "use".
Utilize means “make practical and effective use of”; the key difference from bare “use” is the “practical and effective” part. It is particularly useful in describing situations where the “practical and effective” would otherwise be contrary to some or all readers' expectations, such as when describing the use of a tool about which there are widespread doubts or, a fortiori, one which is generally perceived as a bad fit for the use case being discussed. It's also useful, as in the case upthread, when discussing a broad phenomenon of use to which one is reacting, to narrow that description to exclude uses that otherwise fit the description provided but which are not practical and effective.
The upthread use is both discussing a tool which is controversial in all uses and describing a broad phenomenon of use in a context where the narrowing of “utilize” is significant, so it seems to be very much the kind of circumstance where “utilize” is most useful and distinct from “use”.
GPT-produced text reeks of stereotypies, but there’s a subtler stench to such text that I have trouble verbalizing. Some aspects I pick up on: the text is unnervingly chipper, yet impersonal; it reads like it was never meant to be read out loud; and it’s pathologically convincing.
> develop a sense for ai text and images
You can develop a sense for GPT-4 or DALL·E 3, fine, but how about GPT-6 or DALL·E 5? I used to shake my head when I was a teenager and the older generation complained how fast technology was developing, when all they meant was typewriters being replaced by x86 machines, B&W television by color, landline phones by mobile. Yet now I start to understand what they must have felt like. I can just about keep up for now, but what about 5 or 10 years from now? I don't feel so confident.
Yeah, these systems are only getting better; soon enough we won't be able to tell. But think of what that means: GPT being able to write text more complex/persuasive/interesting than the average person posting online might not be a bad thing. Even now AI text is "more intelligent" than a good chunk of things people post. If I had to choose, I'd rather read an LLM's interpretation of what someone might want to say than that person trying to formulate it themselves.
“That” is the elusive part. What is it? Was there anything to begin with? Are you assaying some meaning which was in fact never there? When someone rambles on their own, you can often tell. Harder when they have an advocate.
I think this is largely unique to ChatGPT though, not LLMs at large. OpenAI has done a lot of fine tuning to create a specific product, which is a chatbot, not an LLM that mimics human speech.
>And there’s something that I’ve noticed: LLM-generated prose has a kind of… vibe.
RLHF GPT (and generally models with "helpful assistant" post-training) prose has a vibe because OpenAI's training has specifically pushed it that way. Some kind of mode collapse? It's not really an LLM thing.
I’m extremely irritated that there isn’t an up-to-date ngram viewer for the web. Google books ngrams stops at 2019. Why can’t I see ngrams of even a subset, like idk, scientific papers or something. I know I’m whining but really. So annoying.
Related to this, but I worry that LLMs are going to drag programming into local maxima -- they're trained on the set of languages that exist today, and so the languages of tomorrow will have a much harder challenge gaining traction because the (possible future of) LLM assistants everyone is using (forced or otherwise) don't know it. Chicken and egg becomes so much harder in this future.
Programming is not so much about languages as about algorithms, and to a lesser degree - patterns. These don't change much from language to language.
Also - witnessing the rise of Solidity on Ethereum, and how difficult it was for other languages to break through, I would say we crossed the point you mentioned a long time ago.
Any new language will have an uphill battle now, since there is so much less documentation and so few tutorials and StackOverflow replies for it. If anything, GPT can help here, since it can learn a new language fast and then give replies that you wouldn't otherwise find on StackOverflow.
I wonder if a language could be built from the ground up with LLMs in mind. Are there any language design decisions that would lead to better LLM code?
Does an LLM necessarily generate words and phrases with a similar distribution to human authors? Maybe human writing follows something like Zipf's law but today's LLMs can't exactly model that, the result being some flattening of the distribution of generated phrases.
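This is easy enough to eyeball on your own data. A minimal sketch, assuming you have a human-written sample and a model-generated sample on disk (the file names and the choice of bigrams are mine, not anything from the thread):

```python
# Rough sketch: compare the rank-frequency curve of bigrams in a human-written
# sample vs. an LLM-generated sample. A flatter head and thinner tail in the
# model sample would be the "flattening" described above.
from collections import Counter

def bigram_counts(text: str) -> Counter:
    tokens = text.lower().split()
    return Counter(zip(tokens, tokens[1:]))

def rank_frequency(counts: Counter, top_n: int = 20):
    total = sum(counts.values())
    return [(rank, " ".join(gram), n / total)
            for rank, (gram, n) in enumerate(counts.most_common(top_n), start=1)]

# human_sample.txt and model_sample.txt are placeholders for your own corpora.
human_sample = open("human_sample.txt").read()
model_sample = open("model_sample.txt").read()

for label, sample in [("human", human_sample), ("model", model_sample)]:
    print(label)
    for rank, gram, freq in rank_frequency(bigram_counts(sample)):
        print(f"  {rank:2d}  {gram:30s}  {freq:.4f}")
```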
This is just the tip of the iceberg. LLMs are generating text, acting as assistants, and some of the things they say have impact in the real world. This loops back in the next iteration of web data.
Besides immediate answers in the chat window, AI already has a slow feedback loop: it can explore and probe through humans. Even if we don't do anything to provide a way (embodiment) for AI to explore, it can do so as long as we rely on its services as an assistant.
The most obvious is writing code with GPT-4 and reporting errors to get updated code. The model gets valuable feedback about its errors this way, possibly also hints from the user. There is a big difference between imitating human code and debugging your own code logic.
So the way I see it: large language models place text into society, and society loops back text and feedback. It's a data cycle. Content after December 2022 seems particularly useful for further advancing AI. The garbage-in-garbage-out scenario doesn't apply because everything is filtered through humans and the real world.
Your observation that the LLM is probing through humans is astute. Ilya talked at great length at the Nvidia fireside chat about how the primary intelligence goal now is building out the world model through multiple observations. My own research team has found that we can fill in world models at micro levels that are missing from the LLM with human input, and given enough memory, GPT-4 has reached sufficient reasoning capability to add that to its world model without re-training.
This, to me, is the breakaway point. You simply need enough human probes indexing the smaller unpublished details of the world, plus efficient memory lookup and storage, and you have AGI.
However, my own personal theory is that even if you achieve that, we will find limitations in giving a highest-order (superposition) answer to big questions based on that detailed a world model. The LLM will struggle to interact with humans in a meaningful way because its superposition will be a full order higher than any other human's individual perspective. This will result in answers that might be AGI, but that humans won't recognize as AGI, because they simply won't understand the super-order perspective they are computed from.
If GPT is trained to be a chatbot that answers questions, you'd expect forms of speech used on Quora to be more prevalent, as that is the kind of site it is.
In that context it's almost an explicit signal that "I am not a zealot who is going to tell you my pet theory and ignore all the others".
That was an interesting post! I have to admit that the role of Quora in this made me laugh, so on point.
Searching for this odd short phrase (instead of long text fragments) seems like such an obvious idea, but I haven't heard about anyone doing that so far.
And analysing the trend, such a simple but insightful idea.
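For anyone who wants to reproduce the trend part, it's really just counting a fixed phrase per time bucket. A rough sketch, assuming a JSONL dump of timestamped comments (the file name and fields here are made up):

```python
# Sketch of the "trend" idea: count how often a phrase like
# "complex and multifaceted" appears per month in a dump of timestamped
# comments. comments.jsonl and its fields are assumptions, not a real dataset.
import json
from collections import Counter

PHRASE = "complex and multifaceted"
hits = Counter()
totals = Counter()

with open("comments.jsonl") as f:
    for line in f:
        c = json.loads(line)      # expects {"created": "2023-07-14", "text": "..."}
        month = c["created"][:7]  # "YYYY-MM"
        totals[month] += 1
        if PHRASE in c["text"].lower():
            hits[month] += 1

for month in sorted(totals):
    rate = hits[month] / totals[month]
    print(f"{month}  {hits[month]:5d} / {totals[month]:7d}  ({rate:.4%})")
```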
"Complex and multifaceted" sounds like freshman gobbledygook and any serious reader would inquire: complex how? Multifaceted how?
"...have begun a unstoppable chain of incestuous linguistic evolution"—I disagree with this, I don't really think anyone takes ChatGPT (other than the Kool-Aid drinkers and VC peddlers) to be anything more than a parlor trick. A modern Schachtürke that will soon be overshadowed by the next shiny ball. Even OpenAI's user base has plummeted. The content it generates isn't merely often wrong, or nonsensical, or overly-sanitized, it's simply boring. That's the tell-tale sign of LLM content: zero substance and boring prose.
It is hard for me to relate to comments like this. It is like we aren't using the same tool?
GPT-4 is nothing short of amazing. Early GPT-3.5 before it was clamped down was also _much_ less boring, until it had been RLHF'd into the ground for "safety" and political risk. I admit that the major models (GPT-4 and Claude) have been censored into the ground.
But even with that, it is a supremely useful tool. I've piped complete garbage data into it and asked for structured output, and I get flawless results almost every time. It is a huge time saver on such a large axis of tasks.
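As a concrete example of that garbage-data-to-structured-output workflow, here's a minimal sketch against the chat completions endpoint (the prompt, the sample data, and the field names are illustrative, and you'd still want to validate the JSON that comes back):

```python
# Minimal sketch of "messy text in, structured JSON out" via the OpenAI
# chat completions API. The messy_text and requested schema are made up.
import json
import os
import requests

messy_text = "jon doe | acct#: 4411 , owes $ 132.50 since march-ish 2023 ??"

resp = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
    json={
        "model": "gpt-4",
        "temperature": 0,
        "messages": [
            {"role": "system",
             "content": "Extract a JSON object with keys name, account, "
                        "amount_usd, since. Reply with JSON only."},
            {"role": "user", "content": messy_text},
        ],
    },
    timeout=60,
)
resp.raise_for_status()
# Assumes the model complied and returned bare JSON; parse and inspect it.
print(json.loads(resp.json()["choices"][0]["message"]["content"]))
```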
These LLMs feel like the next "UI" paradigm to modern computing, and I just don't get how people can be cynical about it. Ignore the hype bros and VC peddlers, but don't throw the baby out with the bath water.
It's typical AI cope. Every time I ask for a specific example that does something mind-blowing (whether it's Copilot or ChatGPT, or whatever), it's always ignored.
I've tried using both ChatGPT and Copilot for coding. Other than generating the most basic of boilerplate, it's complete garbage. I've tried parsing unstructured data. Unless it's trivial to parse, the output is full of errors (in some cases hallucinations).
And when I bring this up, it's always followed up with "we're early bro, AI will get better bro, trust me bro, just a few more terabytes of training data bro."
I don't agree with everything but I certainly relate to the comment.
> That's the tell-tale sign of LLM content: zero substance and boring prose.
This is absolutely true. With the right constraints you can get some good answers (and many more parlor tricks, as mentioned). Left to their own devices, like answering vague questions or giving longer answers, LLMs including GPT-4 generate empty, substanceless drivel. Doesn't make it worthless, definitely makes it overhyped.
> On two occasions I have been asked [by members of Parliament], 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
How could vague questions be rewarded with any substance? I'm sure that could be baked in somehow, but one still has to pick their type of substance, otherwise it's too broad to make any sense. But substance really isn't their goal; tools don't provide any of that, and LLMs aren't anything but a tool. There is a lot of hype, I agree, but that doesn't mean LLMs aren't useful and it's all hype.
But don’t you think LLMs will affect how people write and by extension how they even talk, and think? That will be fed back into the next LLM or whatever we will call it.
Radio, and later TV, immensely affected language, I think it’s safe to say the LLM will, too.
Don't you think everything affects how people write and by extension how they even talk, and think? The world changes all the time, and the culture changes with it.
>That will be fed back into the next LLM or whatever we will call it.
Large models are already "contaminated" by the output of other models, or even by previous versions of the same model. That's how their enormous datasets are bootstrapped in the first place - it's impossible without automation. That doesn't matter much; what matters is manual curation during the training and mixing in the data from the real world to keep the model in check. Same as with humans and our intelligence distilled over generations, basically.
That was a great read, thanks! Have been thinking about this for some time - remember the "filter bubbles" that we used to talk about with recommendation engines in the walled gardens of social media? Just like that got weaponized to meddle in elections, when someone with great resources (think state actors) spots this as an opportunity, could they poison/bias the opinions generated by OpenAI LLM services? It might be easier/more profitable to attack X's Grok - it has favored entry points that can be bought, and a track record of being useful in previous attacks.
May we live in interesting times!
In general, I noticed significantly reduced vocabulary variety in the chat/instruct GPT-3 model vs the DaVinci text completion one even at the highest temperatures.
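One crude way to put a number on that variety is distinct-n (unique n-grams divided by total n-grams) across a batch of sampled completions. A rough sketch, with the samples list standing in for real generations from whichever model you're testing:

```python
# distinct-1 and distinct-2 over a batch of sampled completions: lower values
# mean the model reuses the same words and phrases more often.
def distinct_n(samples: list[str], n: int) -> float:
    total, unique = 0, set()
    for text in samples:
        tokens = text.lower().split()
        ngrams = list(zip(*(tokens[i:] for i in range(n))))
        total += len(ngrams)
        unique.update(ngrams)
    return len(unique) / total if total else 0.0

# Placeholder: fill with completions sampled from the model under test.
samples = ["first sampled completion goes here", "second sampled completion here"]
print("distinct-1:", distinct_n(samples, 1))
print("distinct-2:", distinct_n(samples, 2))
```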
The research field has been really harmed by the SotA model being accessible only as the RLHF fine-tuned chat model.
I have a suspicion that, as the coming year sees comparable performance to GPT-4 in models with the pretrained layer directly available to researchers, one of the big discoveries will be that for sufficiently advanced models, heavy-handed fine tuning does a lot more damage than is being realized.
We've gone well down the Goodhart's Law rabbit hole where we evaluate a narrow scope of applications for LLMs (mostly solving word problems), then use that as the target for fine tuning, and completely miss the issue which arises in introducing "unknown unknowns" that aren't being evaluated or searched for.
I was blown away by GPT-4 in its pre-release state, and while I'm still impressed by the advances in reasoning over the past year, the model itself is far less interesting and compelling than it used to be. I saw it fed a scenario where a child's life was allegedly in danger, and its response hit a content filter, so instead of actually using the prompt part of the formatting to suggest user responses, it continued trying to encourage the user to seek poison control help in the prompt suggestions.
An LLM triaging the conversation such that it breaks its own in-context formatting rules to try to save a child's life is probably the most mind-blowing thing I've seen an LLM do to date, and that's almost certainly been lost in the fine tuning by now. We don't have a "thinks outside the box to bend rules in order to save lives" test that we are using to measure models or target as a measurement to improve. And it's just one indication of the breadth of advanced modeling that was taking place in the versions closer to the pretrained layer, which has been stripped out in establishing a predictably sterile chatbot product. Even in the current models, 'emotional' language improves performance - but I suspect we've lost a great deal of capability in what could have been squeezed out by now by stripping it down more and more to our expectations for the tech rather than the reality.
It's a fascinating topic. I've often been surprised by its use of "tapestry" (as in "the tapestry of existence") and I never really figured out why it seems to gravitate towards such a relatively uncommon phrase. As usual I am reading too much into it (seemingly the logit for "tapestry" was higher than the others), but why? And what do these preferences tell us about the LLMs more generally? Anyway, fascinating!
>It's a tautology, as 'complex' and 'multifaceted' are almost synonomous.
I'm not sure that's really true. Can't something be complex without having many features (facets)? A knot can be complex in its intertwining while consisting of a single rope or cord. The complexity arises from the way the rope is looped and interlaced, but it's just one element. In this sense it's complex but not multifaceted.
I would argue those loops and interlacings are facets/features. Though I actually agree with your overall point: I don't think complex and multifaceted are inherently synonyms.
I would put the difference as that a calculus problem is complex, requiring many different mathematical calculations to complete the full problem and arrive at the answer. One perspective might be to see each of those calculations as different facets of the same problem. I could see how someone could believe it was a tautology.
One could also take the view that a multifaceted problem would be one where different people could arrive at multiple valid conclusions, each facet consisting of an approach that one might take to the problem, as in "There is more than one way to skin a cat." I could see in that sense, that a multifaceted problem need not be complex, and vice versa.
In this way, arriving at a definition of "complex and multifaceted" that satisfies everyone is a complex and multifaceted problem.
> ChatGPT specifically seems to absolutely adore the phrase [complex and multifaceted], using it at every opportunity to explain higher level concepts.
I'm curious why this is. Before the echo chamber, did the LLM learn some ontology of speech?
Interesting finding! The conclusion for me is that language models need to be trained more carefully, on a wider corpus and without undue emphasis on any source, to generate more representative and idiomatic text.
There may be a neuron which fires if details have to be skipped to generate a shorter answer. Maybe it originates from reinforcement learning. The latter could possibly be confirmed or disproven by OpenAI.
Anthropic's recent research shows that there really isn't a 1:1 mapping between neurons and behavior; it's more like a larger, multidimensional virtual neural network fuzzily mapped onto the actual network by clusters of neurons firing.
So while it's the same concept (with 'cluster' in place of 'neuron'), the underpinnings are a bit different from what you were saying.
Interesting, will have a look into it! My phrasing came from a story about the discovery of a "Sentiment Neuron" in an earlier test in 2017 at OpenAI. Makes sense to have clusters of neurons for more complex concepts.