The funny thing is that it will be impossible to know, unless you are at Microsoft.
If an artificial intelligence that people can't control ever does get released into the wild, we will never know what is hallucination and what is real when interacting with it.
I think, in this case, this is not so much gatekeeping as a different definition of what it means to be an engineer. Most places outside of the United States share that definition - one that requires certain obligations to be met, in terms of education and demonstrated ability, but also of responsibility for the results of one's efforts - in a way that the US does not. In most places an engineer is responsible for what they have built in a way that software engineers are not, and prompt engineers even less so. Engineers in those places (rightly) feel slighted that the term of art that applies to them does not hold the same weight in other circles, and that others profit off of this difference in definition.
Sometimes the gate is kept because to pass beyond is to take on a responsibility that not everyone is fit or willing to bear.
Prompt engineer is derived from software engineer, which is itself only sometimes justified. It is basically diluting the word with increasingly less meaningful context. An engineer needs to be able to make guarantees and statements about what he engineers. You can do that with software if you are careful, but the vast majority of software development is not engineering, because engineering is too expensive and overkill most of the time.
But now that AI is in the mix and the results are purely stochastic, you've basically thrown out the entire meaning behind the word. The prompt engineer is unable to make guarantees and accurate statements about his prompts. He can't predict what the machine learning model is going to do, otherwise he wouldn't have asked.
> Many people are using it as a digital secretary or assistant
This is the most exciting development to me -- a personal assistant that can keep track of my schedule, find interesting articles, and automate other logistical tasks throughout my day.
It's possible to achieve some of this functionality via prompt engineering, but I'm looking forward to the more structured applications that use GPT for summarization and natural language communication, sitting on top of engineered systems.
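To make that concrete, here's a minimal sketch of the kind of structure I mean. All names here are hypothetical (this is not any real product's API); the point is just that the schedule comes from an ordinary deterministic system, and the model is only asked to phrase it:

```python
# Hypothetical sketch: structured system first, LLM last.
from dataclasses import dataclass
from datetime import datetime

@dataclass
class Event:
    start: datetime
    title: str
    location: str

def render_facts(events: list[Event]) -> str:
    """Turn structured calendar data into plain lines the model can't get wrong."""
    return "\n".join(f"- {e.start:%H:%M} {e.title} @ {e.location}" for e in events)

def call_llm(prompt: str) -> str:
    """Placeholder for whatever completion endpoint you use (hypothetical)."""
    raise NotImplementedError

def morning_briefing(events: list[Event]) -> str:
    # The model only rephrases facts the engineered system already produced.
    prompt = (
        "Rewrite the following schedule as a short, friendly morning briefing. "
        "Do not add, remove, or change any facts:\n" + render_facts(events)
    )
    return call_llm(prompt)
```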
Yes, this. I have a hard time forming words sometimes due to a brain aneurysm. It is such a great tool to form my thoughts. The responses need a little tweaking, but it really helps me write well-structured arguments for why I don't want to implement a piece of technology that would result in me having to do more work.
This is a great illustration of the risks of LLMs. As a user, if I am asking this question to a search engine, I definitely do not expect to need to fact-check the results. That's the whole reason to use the search engine in the first place!
We're about to enter a dark ages of crappy AI products that are touted as game changing, outcompeting each other to be the best chatbot that can compose haiku about how grapes turn into raisins.
We fact check search engine results all the time. But most of the time, such fact checking is in the form of looking at a result, considering whether it seems like a credible source, seeing if multiple credible-seeming results have the same answer, etc.
Getting a completely untrustworthy, unsourced response seems worse than useless. Google has been going this way for a while, with its instant answers or whatever, but at least those try to cite a search result and you can read the surrounding context which Google got the result from.
We're slow-motion signing on to a future with a fundamental shift to receiving information in a completely opaque manner.
A few sources will control the information we get in a much more direct and extreme way than now, in a way that conscious skepticism will no longer be able to defend against. Whatever handwaved promises we get now will be gone ten years from now.
If there wasn't such a gee-whiz coolness factor about conversational search results distracting us, we'd never tolerate that in principle.
> We're slow-motion signing on to a future with a fundamental shift to receiving information in a completely opaque manner.
There's nothing fundamentally new about this. The average person is blissfully unaware of the conversations being had between powerful individuals, PR experts, producers, and so on about how said person should be manipulated.
I pretty much have this ever-present aura of "citation needed" floating around in my head when I listen to people speak. Any kind of news/press thing, politician speaking (lol), or just anyone trying to claim/assert anything -- I can feel nothing but "citation needed" until they provide some kind of supporting evidence. I feel like people expend a shocking amount of energy proclaiming things they literally do not know with even the slightest degree of certainty - they saw it on some guy's sensational YouTube video and repeat it uncritically. The fatigue is real.
Even a citation is insufficient nowadays. They will cite a "reputable" source like NYT, which in turn cites "an anonymous intelligence official", and so on. Unverifiable.
Misinformation and manufactured narratives are omnipresent and all we can do is consume as diverse a media diet as possible and develop a good nose for bullsh*t.
Oh please. As if the reputation of any news outlet even matters anymore. They all fired their real journalists and fact checkers long ago. Everything you read is full of inaccuracies, agenda pushing and misinformation. If you think it doesn’t, you’ve been had.
That's true of (commercial) news outlets. It doesn't apply to "everything you read". There are many ways to check facts that don't rely on news outlets, at least for people who have time and resources.
But relevant to the topic at hand, do any of those sources regularly make it onto the search engine results page? Is the quality of Forbes, Cnet, Reddit, Wikipedia and Quora clearly better than these AI generated responses in some way?
I know there are expert vetted information sources, but you generally have to pay for the quality and they do not get linked to from Bing and Google.
I don't know exactly what he was referencing, but the easiest way to verify points on issues where two sides say mutually incompatible things is to look at the overlap of what they both say is true. That overlap is usually going to be true. And all it takes to find it is to look at sources from both sides.
Expert vetting doesn't even touch the underlying problem, because the pursuit is not expertise, in and of itself, but objectivity. And that's something far scarcer than expertise, and increasingly fleeting in today's world. A Chinese expert is probably going to have a different perspective on e.g. the Uyghurs than an American expert, even if both are in no way trying to mislead but giving their most sincere analysis of the situation.
Even on topics that are not conventionally controversial, you'll find a similar issue. Ask two astrophysicists of different worldviews on dark matter, and you are going to get two very different answers that, in many ways, will be incompatible. Simply "believing" one over the other doesn't really make any sense, nor does randomly polling astrophysicists and taking that as the definitive truth.
This is a great summary of the issues with trying to even determine objective reality. I would say that the simple popular-consensus approach is not even that great, because plenty of things in the past that had consensus were later determined to be objectively false.
My point was even a step before this, that even getting the consensus facts correct is a major challenge when the internet is written by children, bored volunteers, mechanical Turk contributors from across the world, adversarial actors, and content producers churning clickbait. Having a professional journalist investigate a topic, then have a separate professional fact-check, and yet another professional edit all for a publication that is trying hard to maintain a reputation and will publish retractions if necessary is all miles better as a starting point for determining truth, but sadly that cultural activity is nearly dead.
For the record, the "New Bing" AI results will not be unsourced: key facts in sentences are tagged Wikipedia-style, pointing to the source URL. Below the reply there is also a summary of the source domains, for an overview, where each domain name is clickable to get to the respective article on that domain.
In this case, Bing AI will operate very differently from ChatGPT.
This is going to get real nasty when you start getting into science of various sorts, where an increasing number of papers, particularly in the social sciences, tend to make extreme and outlandish claims, but then walk them back extensively in the actual paper itself emphasizing it's just a correlation, or otherwise demonstrating an extremely marginal effect.
Yet the media tends to miss the intentionally understated nuance and run with the claims at face value, which an LLM will then pick up on and state: "Yes, scientists have proven that [x] does cause [y]. [1][2][3][4][5]" That claim then gets repeated elsewhere, and eventually that "elsewhere" goes on to become part of the LLM's new training material, so it's basically training off its own output.
It'd be ironic if something ideally designed to make the breadth of human knowledge more readily available and accessible than ever before ends up just making society vastly more misinformed than ever before.
Instant answers seem like a cautionary example, since Google has gotten a fair amount of flak over cases where it inaccurately summarized content. I think it will be very interesting to study whether the average person finds these services more authoritative because they're branded by a huge corporation, and whether that declines over time as people realize the limitations.
It's weird. I've personally experienced cases where the highlighted instant answer is obviously incorrect and the full context actually claims the opposite of what the excerpt claims, and those kinds of examples circulate around the web pretty frequently, and everyone who has ever tried to ask ChatGPT or similar systems tricky questions should know how AIs just invent stuff when they don't know the real answers.
So why do companies like Microsoft and Google push in this direction? Why are they making the results more and more opaque? You'd hope that they would care enough to be good stewards of the power granted to them through their information monopoly, but barring that, you'd hope that they'd recognize that people want results they can verify, not just random answers.
Or maybe they're hoping that people don't care about verifying results, hoping that people just want an answer that's not necessarily the right answer? It seems like a dangerous gamble.
It really goes back to the toxic incentives of ad sales. Google wants you to stay on Google.com so they get more chances to show ads. They only care about low-quality results if it means you stop using them, which for years wasn't a serious risk but can change relatively quickly if alternatives arise. They should know, given how rapidly they pulled users away from AltaVista, Yahoo!, et al. 20 years ago.
I would definitely fact check search results as much as AI, especially the info snippets that appear at the top of Google's SERPs.
For example, until a few months ago the results for "pork cooked temperature" and "chicken cooked temperature" were returning incorrect values, boldly declaring too low a temperature right at the top of the page (I know these numbers can vary based on how long the meat is held at a certain temperature, but I verified Google was parsing the info incorrectly from the page it was referencing, pulling the temperature for the wrong kind of meat). This was potentially dangerous incorrect info, IMO.
Snippets have become so useless I use a plugin to remove them.
What is ridiculous is when, say, Stack Overflow has a good answer, it is a few lines down or on the next page of the search results, while some page-mill SEO site sits in the snippet up top with a completely wrong or pathetically naive, partially correct answer. It is so annoying it has lowered my opinion of Google a lot in recent times.
> I would definitely fact check search results as much as AI, especially the info snippets that appear at the top of Google's SERPs.
Yes, so would I. And I also double check things like Google Maps -- a tool I find very helpful but don't trust blindly. But... do most people think to take a close look at Google Maps to make sure it makes sense, and trust their own judgement if they disagree with the map? Will most people fact check confident LLM outputs?
The content I write is often half-assedly plagiarised by copywriters or incorrectly interpreted by lazy journalists. This is just an automated version of it. They can use my hard work for their own profit at an unprecedented pace, while still remaining factually incorrect.
Disagree. I think this is akin to Netflix’s Chaos Monkey, which relied on the insight that it is impossible to build infallible systems, so you design failure and recovery in.
Existing Google searches are polluted with false information, and Google has been losing that battle. It's probably not even possible to win.
So rather than saying search engines should always be perfectly accurate and errors are catastrophic, we should accept that search engines are, and have always been, imperfect, and need to give us enough info to validate facts for queries important enough to merit it.
Ever since Google started adding those quick-answer boxes at the top of search results, I've had to double-check everything they say. They're quite often incorrect. I mean, I know that, but does grandma? They've all been conditioned to trust Google.
> I am asking this question to a search engine, I definitely do not expect to need to fact-check the results.
Genuine, honest question: How did you come to the belief that search engines are reliable sources of truth?
I completely agree that search engines provide a valuable service. But in my own work, I find them to very often point to inaccurate information, sometimes greatly so. I don't think this is terribly surprising, given Sturgeon's law, but still.
I can see how someone could extrapolate Google's goal of indexing knowledge (JTB) into being a reliable source of truth. It's simply a matter of taking them at their word on the J and T parts. The B is up to the user.
Google's branding frames itself as the expert in the novice-expert problem. The vast number of users implicitly take on the role of the novice by virtue of using the product. They've already self-identified as a novice which makes both parties complicit in the arrangement.
When I ask ChatGPT a question, it explains its reasoning and gives me concepts I can follow up on by googling to learn more.
When I use Google for research, I get articles written for SEO to push products and often have to refine and refine and refine to get something useful, which I then can follow up by googling to learn more. With difficulty.
Honestly I don't know how much I'd use ChatGPT if I had the internet of 2016 and Google.
Careful: it explains, but both the answer and the explanation are sometimes completely hallucinated. It can look like a plausible answer but actually be completely made up. And this happens way too often for me to take it seriously for now.
So I think the ability of a search engine to say "I don't know" is very important, and most of the current ChatGPT-like models on the market don't have this feature.
This is like encryption: the cat is out of the bag and there's no getting it back in. As computing power increases, control of LLMs will be democratized / made widely available, and no amount of regulation will stop that.
More to the point, it will be incredibly difficult to reset the gen pop's impression of AGI. More structure to LLMs may give more reliability, but in the meantime I have no doubt LLMs will wreak havoc in the world. We're in a mess that it's not clear can be cleaned up.
These "augmented intelligence" applications are so exciting to me. I'm not as interested in autonomous artificial intelligence. Computers are tools to make my life easier, not meant to lead their own lives!
There's a big up-front cost of building a notes database for this application, but it illustrates the point nicely: encode a bunch of data ("memories"), and use an AI like GPT to retrieve information ("remembering"). It's not a fundamentally different process from what we do already, but it replaces the need for me to spend time on an automatable task.
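For anyone curious, here's a rough sketch of that loop as I imagine it (`embed` and `call_llm` are hypothetical placeholders for whatever embedding model and completion endpoint you happen to use, not any specific library's API):

```python
# Hypothetical sketch of the "memories" database: embed notes once up front,
# retrieve the closest ones for a query, and let the LLM answer from those.
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder for any sentence-embedding model (hypothetical)."""
    raise NotImplementedError

def call_llm(prompt: str) -> str:
    """Placeholder for the completion endpoint (hypothetical)."""
    raise NotImplementedError

class NotesDB:
    def __init__(self, notes: list[str]):
        self.notes = notes
        # The up-front cost: every note gets embedded once.
        self.vectors = np.stack([embed(n) for n in notes])

    def recall(self, query: str, k: int = 3) -> list[str]:
        # Cosine similarity between the query and every stored note.
        q = embed(query)
        scores = self.vectors @ q / (
            np.linalg.norm(self.vectors, axis=1) * np.linalg.norm(q)
        )
        return [self.notes[i] for i in np.argsort(-scores)[:k]]

def remember(db: NotesDB, question: str) -> str:
    context = "\n".join(db.recall(question))
    return call_llm(
        f"Answer using only these notes:\n{context}\n\nQuestion: {question}"
    )
```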
I'm excited to see what humans spend our time doing once we've offloaded the boring dirty work to AIs.
In chess we had a tiny window of a few years when humans could use the help of computers to play the world's best chess. By the mid-2000s, computers were far better than humans and the gap has only increased. Chess players are now entertainers, like all us humans are destined to spend our time doing.
The story of what happened with chess deserves a lot more elaboration, because it's fun and interesting (and may also foretell outcomes in other scenarios)!
In chess the first "new" (unplayed in a high level game) move in a game is called the novelty, or theoretical novelty. In times before computers this would not infrequently be an objectively strong move that simply had not been played in a given position before. And this continued for some time after computers became quite strong, with players using computers to find interesting strong ideas in all sorts of positions. Each time these sorts of novelties were sprung, positions would become redefined and our broader knowledge of the game continued to stretch outward.
But then something fun happened - the metagame shifted. Now it's no longer really about finding some really strong move as your novelty, but often about finding a technically mediocre, if not simply bad, move that gives you good practical chances. So you're looking for moves that your opponent probably has not considered because they look bad (and the computer would agree that they're bad), but that you're much more prepared for and comfortable in than he is.
The big difference now also is that instead of a novelty redefining a position in a positive way, it's often something you spring once or maybe twice - and then never touch again. And this is now happening regularly at the absolute highest levels of chess. So rather than having humans just desperately trying to emulate machines, those machines became yet another tool to exploit and improve our practical results with.
It's kind of funny watching a game when this happens: less experienced players will immediately begin shouting "BLUNDER!" when the computer evaluation of a position suddenly drops, without realizing that the player who just "blundered" is still well within his preparation, while the other guy is now probably out of his. Even among the players themselves, there's often a sort of "u srs?" type response. This [1] is a fun one from the always emotive Ian Nepomniachtchi during the most recent world championship candidates event. He is now playing for the world championship. In any case, it's at that point that the game begins!
While reductive, isn't that true of many professional sports, though? I have a wide variety of tools I can use to travel 100m faster than Usain Bolt, but it's incredible to watch him do it on his own.
The way I read this, in a world with machines that can travel at high speeds, people still watch professional runners because they are interesting to watch, and we’re inspired by human achievement.
For similar reasons, it doesn’t really matter that computers are better at chess than us.
Also, I can drive my car to get places quickly, but sometimes I enjoy cycling or walking instead because it allows me to take in my surroundings more fully at the slower pace, and the exercise makes me feel good.
Well, isn't that true for most skills that have been supplanted by technology? In fact, if you look at the handcrafting business, hand-made is now a selling point. Just as some people prefer a product to be handmade, others will prefer watching a game of chess between two humans. Just because machines are better at something doesn't mean that humans have become useless. It's just a question of point of view - IOW, how depressing you want the world to look.
It's interesting that this point is always made from a competitive context, that is to say from a survival point of view. I mean, nobody really wants AI just because it could be fun. As a species we are really inept at moving beyond our survival idioms, I feel.
> like all us humans are destined to spend our time doing
Automation has been here for quite a long time now; if it took people out of the work pool and turned them into entertainers, we'd know about it by now.
It's always the same issue, in fact: replacing workers with machines is good, but if your goal is still a "full employment" society, you have to make them work somewhere else. It's not even a new concept, but we seem to rediscover it every now and then, apparently.
> Automation, the most advanced sector of modern industry as well as the model which perfectly sums up its practice, drives the commodity world toward the following contradiction: the technical equipment which objectively eliminates labor must at the same time preserve labor as a commodity and as the only source of the commodity. If the social labor (time) engaged by the society is not to diminish because of automation (or any other less extreme form of increasing the productivity of labor), then new jobs have to be created. Services, the tertiary sector, swell the ranks of the army of distribution and are a eulogy to the current commodities; the additional forces which are mobilized just happen to be suitable for the organization of redundant labor required by the artificial needs for such commodities.
If you think the people who own the machines will be happy to support everyone else just sitting around "entertaining" themselves, you're in for a rude shock.
This raises huge red flags for privacy. Presumably OpenAI keeps full logs of every interaction with ChatGPT / GPT-3, but this isn't self-evident when you're using them. And it feels intimate--it feels like you're talking to a person--and that builds trust. To say nothing of applications like therapy, or personal coaching...
This makes me feel that open-source LLMs that can be trained efficiently and run on local hardware are an urgent need. Otherwise we will live in a cyberpunk dystopia with a centralized company that knows everything about you.
I couldn't agree more, but I feel so defeated with the already egregious violations of privacy that pervade bleeding edge and big tech companies. What's there to be done?