Last year, when Blake Lemoine was fired by Google for claiming his AI was sentient, a lot of people on this website sensibly noted that the danger of these chatbots was not that they were sentient, but that average people could easily be tricked into thinking that they were.
And now here we are: it seems we tech people are every bit as susceptible to this kind of thinking as the average joes we looked down on.
Actually look at this “language” that GPT-4 has written: it’s just a mishmash of features from the most commonly blogged-about existing programming languages. It has no conceptual insights, no original ideas, no taste that isn’t cribbed from one of the few languages it’s copying from.
There are cases where getting a short distillation of the mean opinion of The Internet is useful. But designing a new programming language is clearly not one of them, mostly because the chatbot is not “designing” a language—it is guessing the most likely response to a question about programming languages.
I used GPT-3/3.5/4 for multiple tasks and I never found it useful. It just feels… average.
It can “answer” many questions correctly, but only on the data set it has been trained on. I asked it to solve Elixir issues, write Elisp snippets, and figure out why my Rust code was complaining about lifetimes; I even went for easy mode and asked for JavaScript code.
Only the JavaScript (very popular) responses were workable, but they still had common problems baked in (e.g. not clearing out timers or not handling common edge cases). The Elixir “hints” were very interesting: when out of its depth, it would produce code assuming Elixir is an OOP language, and that was a rabbit hole it declined to exit.
Too technical? I went for fun with family members who have exceptional cooking skills. We tried to figure out a variation on an apple pie. The suggestions were heavily criticized as unworkable for many reasons (too sugary, the consistency wouldn’t be good, the modifications changed the ingredients but GPT refused to break out of the theme, etc.)
Colin Meloy ran an experiment some time ago: he had it write a Decemberists-like song and then actually recorded it.
It worked but it was described as below average and bland. I share that vibe.
In the end I have this feeling that GPT is on the level of a 7-year-old with access to a vast library. It uses proper grammar and it can copy parts of texts, but one needs to extend it a great deal of benefit of the doubt and energy just so that it’s useful.
In contrast to a 7-year-old, GPT isn’t grateful for that sacrifice and is actually making money off you.
You probably never had to work with one of the 99% of programmers in the world who are not at your level (I am assuming it is high; otherwise you might just be very scared saying all this, and rightfully so), but I, working in the Fortune 1000 company space, have worked with many. Most of my colleagues could readily be replaced by GPT-4 today as far as coding and documentation go. They make the same errors as you noticed GPT make (which I usually have to fix), only GPT makes them in seconds vs hours and practically for free. By replacing all these people, I don’t have to build in the overhead of communicating with them (I just copy the Jira task into GPT with some context, and a few seconds later I have something better than what most of my colleagues write in 2-3 hours, on a good day), and they are happier doing nothing anyway. The company probably won’t fire them, as managers get paid more and promoted more for ‘managing’ more people.
This is not an exception either; teams like mine exist in every large enterprise. I have worked in several of them and this AI makes life so much better. Let’s not pretend it doesn’t make certain outsourcing countries fully obsolete, though: tens of millions of programmers, data entry workers, and text producers (bloggers for SEO, docs no one reads, etc.) are mostly already worse than GPT-3.5 and 4. Wait for 6 or 7.
Many humans have lost their jobs to dumb programs before. If your job is mostly copying code snippets and modifying them a tiny bit then yeah, you should worry, these language models are great at that. If you are doing more creative coding then it will probably take a while.
> Many humans have lost their jobs to dumb programs before
I really do think this is you not being in touch with most of reality and with what most people on earth do daily at this moment. It definitely has nothing to do with the productivity you have and the drive you feel internally to ‘get shit done’. Most people, by a huge margin, get an education, want to do the work they are told to do related to that education, and that’s it. They have no drive at all to ever move beyond that. They learned Angular on their first job; that’s it. Etc.
This is now replaceable; this is not, at all, comparable to automation before this. It’s a threshold; if you cannot see it, you definitely will be shocked in the coming years. This is not going away and it is replacing humans now on a small scale. The companies that really can just remove 90% of their staff are always late, but it will come. I see this all day because I work in massive orgs where this holds, not only for tech.
Yes, totally correct. I am just saying what I see as the norm; the norm is not you and me or many other HNers. They should worry, in my opinion. I see it every day, and worse, none of them has even heard of GPT or Copilot.
> I used GPT-3/3.5/4 for multiple tasks and I never found it useful.
I’ve found it useful for drafting cover letters (I just need to do final edits).
I’ve also found it useful as an adjunct to google when I’m searching for niche things like “Luxury resorts with activities for my dog and lots of off-leash trails”.
So far that’s it, but it does seem like a useful addition to Google search for super niche things. Obviously it hallucinates a lot on these niche edge cases but it gives some great starting points.
> Colin Meloy ran an experiment some time ago: he had it write a Decemberists-like song and then actually recorded it. It worked but it was described as below average and bland. I share that vibe.
So... this early-stage AI was able to make music that's better than the vast majority of commercial recording artists these days? I fail to see how that's a failure.
> In the end I have this feeling that GPT is on the level of a 7-year-old with access to a vast library.
It's probably more like a 14-year-old, but I think you're probably right. It can give step-by-step instructions and even the code to set up entire web apps, but it's only when you go through those steps that you find the issues. Packages out of date, less-than-optimal ways of handling non-obvious cases, and I've even found it trying to use JavaScript functions that have never existed. To its credit, it apologizes and tries again when I call it out. Maybe GPT-5 will be the real one to worry about.
It seems only yesterday that we were marvelling at the capabilities of ChatGPT - its ability to parse the sum knowledge of humankind and synthesise non-trivial responses to the questions we posed.
But look how far we've come since those heady and carefree days. We're already able to pour buckets of scorn on its capabilities, and brush them aside as if designing a language (albeit a derivative and uninspired one) is a trivial task that any human could achieve.
Oh please. This community is filled to the brim with people that are addicted to negativity and contrarianism. LLM criticisms very typically go far, far beyond “this isn't the singularity”, well toward “LLMs do nothing of value, and certainly can't help me - the esteemed developer - with anything I find hard in my job!”
That is literally what LLMs do, though. Those comments try to explain how it works so you can better see how it produced the results you see; they aren't trying to say that the program is worthless, just that it isn't the magic some people think it is.
These models are trained to produce text snippets that look like the text snippets they have seen before, and they have seen the whole internet. That means they can do a lot of impressive things, but also that they are very dumb in other ways.
These comments are deceptive. Yes, this is how LLMs work, but that doesn't mean they only repeat things they have seen before. LLMs are capable of following instructions to construct new words in any language they know, words never seen before.
I've seen it being dumb at maths or real-world problems. But as large language models, they understand and speak languages fine, and even the mistakes they make look like mistakes humans who are not native speakers of the language would make.
We may as well say that when we speak, we are just predicting words we have trained on. I don't see how these models are worse than people in that regard.
The general knowledge and thinking of these models are surely limited. But seeing GPT-4 go from text only input to text with images, I think it is very possible to break the barriers very soon.
Ok, since you called out the gp comment as 'deceptive', I in turn am going to call out your comment (and others like it) as delusional, and point to specific places in your comment that exhibit this state of delusion (about LLMs).
> they understand and speak languages fine
No, they neither 'understand' nor 'speak' languages. The first word here is the more delusional, they have no understanding of languages. They have simply generated a model of the language. And they do not 'speak' the language they have modeled; they generate text in that language. Speaking generally implies an active intelligence; there is no intelligence behind an LLM. There is simply a machine generating (wholly derivative) output text from input text.
> We may as well say that when we speak, we are just predicting words we have trained on
This is the delusion, commonly being repeated, that humans themselves are only LLMs. This is a dangerous delusion, in my view, and one that has no evidence behind it. It is only a supposition, and a sterile and nihilistic one in my view.
> The general knowledge and thinking of these models are surely limited [...] I think it is very possible to break the barriers very soon
The limitations are fundamental to LLMs. LLMs have no general knowledge and LLMs don't do any 'thinking'. Your understanding of what they are doing is in grave error, and that error is based on a personification of the machine. An error coming from the delusion that because they generate 'natural' language they are operating similarly to us (false and addressed above). They are never going to break the limits because they have never started to transcend those limits in the first place, nor can they. They don't and will never 'think' or 'reason'.
> Oh please. This community is filled to the brim with people that are addicted to negativity and contrarianism. LLM criticisms very typically go far, far beyond “this isn't the singularity”, well toward “LLMs do nothing of value, and certainly can't help me - the esteemed developer - with anything I find hard in my job!”
Yep. Welcome to HN. And welcome to the majority of humanity!
ChatGPT is both impressive and scary. We know how these things go: there is no way to slow down this train. However, it is already having an impact on things like education. Teachers don't have good enough tools to detect cheaters, and students are going to suffer because of that. Honest ones will be wrongly accused. Cheaters will get good grades. The only solution I see is to ban any graded homework.
> designing a language (albeit a derivative and uninspired one) is a trivial task that any human could achieve.
It is, isn’t it? You just need to be able to read and write words and symbols that resemble a programming language coherently and follow the pattern. This is what humans do after all: remixing. Anyone can do this.
Sorry if I am missing something here, but I think we all know well functioning, productive humans, maybe even in our field, who wouldn't have been able to create this language or a similarly well thought out one.
I know a competent man should be able to "change a diaper, plan an invasion, butcher a hog, conn a ship, design a building, write a sonnet, balance accounts, build a wall, set a bone, comfort the dying, take orders, give orders [...]" and what not and specialization is for insects, but even on the Mysterious Island not everyone is Cyrus Smith (Harding).
I think you are missing my sarcasm. I don’t think every human can design a well thought out language, let alone an LLM. I was trying to point out that we are more than this. More than what humans are being demoted to in some HN comments recently. My problem is with some characterizations of humans here.
The interesting part of the exchange is the contrast between the surface-level syntax of the exchange and the semantics below. It reminds me of teaching in a lot of ways. The level of polish on the immediately apparent surface level of the conversation is very high: reasonable tenets, good justification, and the ability to explain them.
But then applying them directly in an example falls down a bit. The language is a bit of a mishmash and parts of it might not work semantically. The use of actors looks a bit out of place, and raises a lot of questions about how the memory management and scheduling work under the hood. Despite the bold aims in the tenets, there is nothing in the language exposed so far that addresses them: why would the language produce more modular code in the large? What features of the syntax would increase the level of abstraction in code written? Etc., etc.
Overall this would warrant a (barely) passing grade by modern standards, i.e. it can regurgitate the material successfully but without demonstrating any deep understanding. I find the result quite interesting as it demonstrates how much of what we judge to be quality is actually fitting a particular pattern of response (w.r.t. a specific field of study) rather than exploring the material more deeply.
I like the use of the arrows in the function definitions though :) I've seen that somewhere before and will be "appropriating" it for the project that I'm currently working on.
The Turing Test isn't meant to be an objective test of GAI. It's meant to replace a question that might be too vague or philosophically contentious to be answerable, "Can machines think?" with an answerable one, "Can a machine successfully fool a person into thinking it's a person?" So a robot bird that fools real birds arguably has passed the Avian Turing Test. (Also, arguably the Turing Test is by definition subjective.)
> The Turing Test isn't meant to be an objective test of GAI. It's meant to replace a question that might be too vague or philosophically contentious [snip]
Correct in that that was what Turing intended. But the vulgar interpretation of it makes it out to be more than it really is.
For Pete’s sake: two years ago I even saw a psychologist who suggested that psychologists should take notes from how computer science ostensibly measures “intelligence” by the Turing Test, because it’s so “objective”. That was sad.
> It’s just a mishmash of features from the most commonly blogged-about existing programming languages. It has no conceptual insights, no original ideas, no taste that isn’t cribbed from one of the few languages it’s copying from.
It will also learn all the bugs and reproduce them.
Programmers who think ChatGPT is going to replace their job must have a very low opinion of their own skill set.
It will also combine correct things in incorrect ways, so even if everything it is trained on is correct it will still print incorrect things.
We see it in this article: both the statement "Indentation-based scoping, similar to Python" and the code with braces were probably correct in their original contexts, but combined they are no longer correct. LLMs will make such errors no matter what kind of data you train them on; there is no known way to avoid that class of errors.
I agree with your point about the dangers of chatbots being mistaken as sentient. And I also recognize the dangers of the self-improving systems like this in general -- that's why I'm not letting it run loose with a LangChainAI Agent script to self-execute a new language (though wouldn't that be fascinating?)
While the language it generated needs refinement, the design process is iterative, and it takes time to develop a programming language that is worth using.
I encourage you to consider how you would approach designing a programming language. Like you, I would likely start by incorporating my favorite features from languages I love and addressing problems I've encountered in the past, similar to the dialogue GPT and I have in the post.
As I mention in the post -- creativity often involves remixing existing ideas, and GPT-4's attempt at generating a programming language is no exception. Notably, it also called out explicitly the constructs it was borrowing from other languages when designing the language.
It's worth noting that this was GPT-4's initial attempt at something highly complex, and even if there are "plot holes" -- it's a noteworthy achievement in and of itself.
Ok, but take a step back and really look at the whole situation here. So you've created a machine that solves problems in kind of the same way humans do. Yay? What overall problem is a machine like this solving? How is this better than just having a human do the job? It seems to have most of the same drawbacks as humans, and then some on top of that.
It's a hard question to be honest. If you look at the incentives involved, people are very incentivized to create these machines to reduce the amount of toil humans have to endure. I don't disagree about the risks at scale of something like this; as I mention in the post -- I don't get pure warm fuzzies about this at all. But I also don't know how to stop AI progress, because the incentives to keep it going are so real.
Government regulation has been brought up multiple times, most recently by Sama himself on twitter. I think it will have to come, but I don't know what flavor it should take.
> How is this better than just having a human do the job?
The cynical take is quite clearly: "Because sales people can sell it"
eg wrap some pretty UX around it, tell people it's "nearly human", then have your sales people try and get non-technical CxO types (or consultancies) to cut them a cheque
What!? Look back at the last few hundred years, where the majority of great leaps have been the result of getting a machine to do something that humans did.
> As I mention in the post -- creativity often involves remixing existing ideas
This is carrying the weight of your core argument. It’s not wrong but it’s not completely correct either.
Creativity involves remixing, but is not all remixing. Is it? Not even all humans are at the same level in this. The number of attempts means nothing if you don’t add the features the LLM desperately needs to be grounded in reality, or to perform logical reasoning beyond RLHF.
The question is when does guessing become a moot point?
AlphaGo guessed its way to being the best.
Can something guess its way to evolving into a less guessy thing? I think so. Look at us.
I am aware of how it works and used to think ‘it’s ok, it won’t happen’, but now I am actually pretty concerned.
We think of GAI as the thing, but there may be a reality that doesn’t require full GAI to completely unhinge us.
I mean, what is the point? It only plays to the negatives of our society. We don’t need it.
It allows for ‘do more’ but what cost does that ‘do more’ come with?
No it didn't. AlphaGo was backed by an engine that had a perfect model of Go's rules and used that to train itself to play Go better. That is a completely different situation.
If you tried to make a Go engine that only tried to replicate old moves, you'd have a hard time making it perform valid moves. ChatGPT is like such an engine, and we just made it perform valid moves most of the time, and now people say "now that it can make valid chess moves it is only a matter of time until it beats masters!". No, just statistically reproducing human moves won't lead to a smart chess engine, so why would you think it would make a smart text engine?
Because it is not just reproducing existing moves. An LLM can come to acquire a model of the structure of moves (in chess, for instance, using the letter-and-number indexing of board positions), the checking of valid moves, and the valuation of board positions.
It seems surprising that training an LLM would result in layers/weights which contain such representations but it seems that this is indeed an efficient representation of information in the same way that neural nets would learn addition by encoding textual integers into floats.
That would be an interesting experiment. Take something like ChatGPT, train it to play tic-tac-toe and reversi and chess and go. Tic-tac-toe it will probably be unbeatable. How does it do at chess and go? Is there a level of difficulty of the game where its ability to master it tops out?
Can ChatGPT(-ish), trained on a ton of games, beat AlphaGo?
It's been a while, but wasn't it two AlphaGo versions? One that was trained on existing games and "trained" with existing engines. One that was retrained from scratch without any external input? I might be wrong.
Both had a program running it with real Go rules. We don't have a program that can run with the real rules for logic and text to train a language model from scratch.
It also had a program giving it potential moves it could play, and explained to it when it had won the game. That isn't zero input. We don't have such programs for logic or text. So there will be no GPT zero, at least until we have already solved logic.
Not the person who you posed the question to, but I believe their point may be that Go is a fully described game, a constrained 'space' even if that space is remarkably deep.
Our world is not a fully described game (unless you want to take a radical philosophical position, and in which case we still don't know the rules of this game). We talk about and think about this world in language. Training a machine on human language and then after training, having that machine take language inputs and generate language outputs is not the same as training a machine to 'play' Go because the two worlds they have been trained in (1. the game of Go; 2. the real world) are fundamentally not equivalent.
Whenever there is a simple error that most laymen fall for, there is always a slightly more sophisticated version of the same problem that experts fall for.
> Actually look at this “language” that GPT-4 has written: it’s just a mishmash of features from the most commonly blogged-about existing programming languages. It has no conceptual insights, no original ideas, no taste that isn’t cribbed from one of the few languages it’s copying from.
On the one hand, I agree.
On the other hand, I see some variant of this comment on HN about every new language, so...
Yeah, I find LLMs are really useful for taking premises you supply and trying them out, or exploring inconsistencies in them. It's decent at the synthesis kind of creativity (mashups, remixes, etc.) but it isn't smart enough yet to make the really big inferential leaps that we would consider more original.
99.9999 percent of humans are no more than copiers of other people's ideas. With no conceptual insight, no original ideas, etc etc. That doesn't mean they're not sentient. :-)
This is (arrogant) misanthropy, and I see far too much of it on forums that contain significant numbers of people who consider themselves more 'enlightened' than the common man (reddit being a prominent example).
"LLMs in the GPT family have read every programming language in the world, billions of times. It's been known for some time that these agents can write small snippets of code with some limited success. However, to my knowledge, none of these agents have ever been asked to make their own programming language. This type of task is a bit more complex, and requires a bit more creativity and foresight than much of the internet believes LLMs to have.
Some folks on the internet tend to think lowly of LLMs as merely "Stochastic Parrots" -- simply "remixers" of old ideas. In a way, they're right; but I'd argue that remixing is the fundamental force of creativity. Nothing comes from scratch."
This^^. Half the internet was in denial about how good LLMs are. I wonder why? Perhaps it's the way humans act when confronted with a machine that can potentially replace much of what makes us special.
The advent of chatGPT had millions of people on the internet trying to downplay the intelligence of chatGPT by continuously trying to re-emphasize the things it gets wrong.
I think the people who claimed chatGPT was a stochastic parrot are now realizing that they were the ones that were part of a giant parade of parrots regurgitating the same old tired trope of LLMs being nothing but simple word generators. Well you guys were dead wrong, a simple iterative improvement on the same algorithm pretty much ironed out a lot of the issues.
I'm guessing that at GPT-10 or 11 we'll have produced something that when compared with humans, humans will be the things that are far more parrot-like.
To me, it's very clear it doesn't really understand what "indentation-based scoping" means. For a human programmer, using both indentation-based and bracket-based scoping is a very unusual design choice, and is begging for further explanation. It talked for pages about abstraction, modularity, etc., but not about this?
I'm not saying it's not good. But I fail to see how this particular example proves it's not a "stochastic parrot". If anything, I think it's quite strong evidence supporting the "stochastic parrot" narrative.
Author here -- I think it is a stochastic parrot, don't get me wrong.
But my argument is that humans are also, at some level, sufficiently advanced stochastic parrots. At least, as it pertains to many creative endeavors.
You're always building on the back of something that comes before.
We're all just remixing ideas we've heard before.
I mean even this comment -- none of the words I'm saying are new, and many of the word combinations I'm using have been used time and time again. The ideas I'm expressing have merit -- but are they wholly original?
Not really. We all need each other's creative energies to do our best work.
The difference between humans and sophisticated stochastic parrots is reason. The most common kinds of mistakes that chatGPT currently makes are when it says things that are not simply wrong, but don't make sense. Perhaps it will be possible to emulate reason with enough data, training, parameters, etc, but without some representation of the ability to understand what you don't know, what you know, and what follows from those things consistently, I wonder if these kinds of models will ever become truly reliable.
Although even in this example it does say some things that are simply wrong:
"Some programming languages like TypeScript and Swift use a colon for both variable type annotations and function return types."
This is incorrect for Swift, which uses the same arrows as "TenetLang" for function return types. Actually the first thing I thought looking at the example code was "looks swifty but not quite as well designed."
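For reference, a minimal Swift snippet showing the syntax in question (colons annotate parameter and variable types; the return type follows an arrow):

    // Swift uses "->" for function return types, not a colon.
    func add(a: Int, b: Int) -> Int {
        return a + b
    }

    // Colons annotate variable and parameter types.
    let total: Int = add(a: 2, b: 3)  // total == 5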
> some representation of the ability to understand what you don't know, what you know, and what follows from those things consistently
Right, that is truly what the chatbots are lacking. They can fool some of the people some of the time, but they can't fool all of the people all of the time.
Their creators - not the chatbots themselves - are trying to "fool" people into believing that the chatbots would have as you say "ability to understand what you don't know, what you know, and what follows from those things consistently".
It's totally possible that humans don't do reason. It's possible that the parrot in our brain makes the decision, and then the "frontend" of our mind makes fake reasons that sound logical enough.
But it's just a possibility, and I don't find it's particularly convincing.
Look up split brain experiments [1]. Basically, in patients where the corpus callosum is severed to some degree, the two halves of the brain have limited communication. Since the two halves control different parts of the sensory system, you can provide information selectively to one or the other half of the brain, and ask that part of the brain to make choices. If you then provide the other half of the brain the wrong information, and ask it to reason about why it made a choice, the other brain will happily pull a ChatGPT and "hallucinate" a reason for a choice neither that half nor the other half of the brain ever made.
While that does not prove that we never apply reasoning ahead of time, it is a pretty compelling indication that we can not trust that reasoning we give isn't a post-rationalisation rather than an indication of our actual reasoning, if any.
No, but that does happen a lot of the time. The difference is that we can choose to engage the deductive engine to verify what the parrot says. Sometimes it's easier to do so (what's 17×9?) and sometimes it's harder (a ball and a bat cost $1.10 and the bat costs $1 more than the ball; what does the ball cost?)
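(Worked out, letting b be the ball's price in dollars:)

    b + (b + 1.00) = 1.10
    2b = 0.10
    b = 0.05

So the intuitive answer of $0.10 is wrong; the ball costs $0.05. (And 17×9 = 153.)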
You can ask ChatGPT to show its working too. It's likely, for many things (as it is with humans a lot of the time), that the way it does it when it walks you through the process is entirely different from how it does it when it just spits out an answer, but it does work. E.g. I asked it a lot of questions about the effects of artificial gravity using a rotating habitat a few days ago, and had it elaborate a lot of the answers, and it was able to set out the calculations step by step. It wasn't perfect - it made occasional mistakes - but it also corrected them when I asked follow-up questions about inconsistencies etc., the same way a human would.
(Funnily enough, while both your example questions are easy enough, I feel I was marginally slower on your "easy" question than your "harder" one)
Maybe we're getting close to what makes me doubt the utility of LLMs. Much like humans, they are quick to employ System 1 instead of System 2. Unlike humans, their System 1 is so well trained it has a response for almost everything, so it doesn't have a useful heuristic for engaging System 2.
For me the utility is here today. Even if I have to carefully probe and check the responses there are still plenty of things where it's worth it to me to use it right now.
Thinking, Fast and Slow by Daniel Kahneman (and The Undoing Project) is a book people need to read and understand.
When we slow-think, we work things out; when we fast-think, we are parrots.
Hmm. If you want to argue that ChatGPT has half of what an AI needs, I could buy that a lot more than I buy "ChatGPT is the road to AGI".
Do inference engines have the other half? Or Cyc plus an inference engine? Can that be coupled to ChatGPT?
My own (completely uninformed) take is that such a coupled AI would be very formidable (far more than ChatGPT), but that it will be very hard to do so, because the representations are totally different. Like, really totally - there is no common ground at all.
I just meant "half" in the sense that there are two major chunks that are needed. I did not mean that each "half" was used as much as the other, or was as much work to implement.
I’m sure a stochastic parrot model could be trained to exhibit reasoning, but the issue is that there isn’t any automated loss function which can discern whether the output of a large language model exhibits reasoning or is illogical. When you train based on text similarity, it will have a hard time learning logic, especially given the amount of illogical writing that is out there.
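To make "train based on text similarity" concrete, here is a minimal, illustrative Swift sketch of the kind of next-token cross-entropy objective these models are trained with (the function name, vocabulary, and probabilities are made up for illustration). Note that the loss only rewards assigning probability to the observed next token; nothing in it checks whether the continuation is logically sound:

    import Foundation

    // Next-token cross-entropy: score the model on how much probability it
    // assigned to the token that actually came next in the training text.
    func crossEntropyLoss(predicted: [Double], targetIndex: Int) -> Double {
        return -log(predicted[targetIndex])
    }

    // Toy vocabulary ["the", "cat", "sat"]; the true next token is "sat" (index 2).
    let probs = [0.2, 0.3, 0.5]
    print(crossEntropyLoss(predicted: probs, targetIndex: 2)) // ≈ 0.693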
Hmm. That's classically the job of an educator - to choose good training material, not just to let students read the internet all day. Would a ChatGPT trained on a carefully curated reading list do better than one trained on a wider reading list?
But it's more than that. A good educator teaches students to evaluate sources, not to just believe everything they read. As far as I can tell, ChatGPT totally lacks that, and it hurts.
You're being a bit glib, but I tend to agree that a lot of people seriously overestimate the reasoning done by the average human.
I think GPT is on the other hand both over- and underestimated because it speaks well but often makes reasoning mistakes we only expect of someone less eloquent, and it throws us.
If it had come across as an inquisitive child, we'd have been a lot more forbearing of its "hallucinations", for example, because kids do variations on that all the time.
At the same time, it can do some things most children would have no hope of.
It's a category of "intelligence" we're unfamiliar with, additionally hobbled with no dynamic long term memory.
We have far too little knowledge of how human reasoning works to claim to know we are more than stochastic parrots with vastly more context/memory. It's way too early to claim there's some qualitative difference.
(And humans are not very reliable; more than current models, sure, but still pretty bad)
Except the fact that it does both. It commits mistakes AND it also comes up with remarkable, novel output that makes sense. Perhaps you just haven't seen an example of it that's convincing enough.
I would go even further. People are not even that advanced as stochastic parrots. In a similar task, we failed just as spectacularly as GPT-4 did just now. Rust, Zig, Julia, Kotlin, and dozens of smaller "new" languages fail to bear a shred of originality. Their authors remixed old ideas in different proportions and got different grammars as a result. That's something an LLM can do, and much faster too.
But there is also Unison: one new language that stands out, and the very fact that it does makes it improbable for an LLM to generate. An LM is a language model; it avoids marginality and originality altogether by design.
Language models can be and are useful if you specifically want to avoid marginality: to reduce noise, to remove errors. There is huge potential in them. If the problem was to design the most average programming language with no purpose, no market niche, and no technological context - then GPT-4 is clearly the winner.
> Rust, Zig, Julia, Kotlin, and dozens of smaller "new" languages fail to bear a shred of originality.
To be fair, most programming languages don't try to be original. They try to solve problems. They also try to keep their syntax close enough to other established programming trends so people have an easier time learning them.
> I would go even further. People are not even that advanced as stochastic parrots. In a similar task, we failed just as spectacularly as GPT-4 did just now. Rust, Zig, Julia, Kotlin, and dozens of smaller "new" languages fail to bear a shred of originality. Their authors remixed old ideas in different proportions and got different grammars as a result. That's something an LLM can do, and much faster too.
We are in good part stochastic parrots for sure, but there's a small nuance that is completely lost in your reasoning and which is what makes the difference today between humans and GPT.
In those languages, reusing commonly known syntax is an explicit feature. Whenever the language had no reason to introduce something new, it used known idioms to lower the amount of things one has to learn to use the language.
Occasionally though, a language will have something unusual, but it won't be for random reasons, but instead to mark something unique and interesting that was added to the language to tackle a facet of programming in a new way.
> Language models can and are useful if you specifically want to avoid marginality. To reduce noise, to remove errors.
But this makes me think of Shannon Information Theory. What you're describing is something that has no actual information (in the Shannon sense) at all.
And maybe that's why so much GPT output reads so blandly. Even the ones that are not glaringly wrong still read like... like food with no seasoning.
Really? As far as I know, Rust’s borrow checker was a newly developed approach to memory safety. There have been other non-GC/RC approaches to memory safety in e.g. Cyclone or Ada but Rust’s system is novel.
Likewise, Julia's use of just-in-time compilation of dynamically dispatched code to achieve zero-overhead execution was rather groundbreaking.
Small LMs do tend to avoid originality, but large LMs can use the context as a starting point to walk outside their training data distribution. They can do that by being better at composing concepts and using context-conditioning, which is also known as in-context learning.
> I would go even further. People are not even that advanced as stochastic parrots. In a similar task, we failed just as spectacularly as GPT-4 did just now. Rust, Zig, Julia, Kotlin, and dozens of smaller "new" languages fail to bear a shred of originality. Their authors remixed old ideas in different proportions and got different grammars as a result.
It wasn’t an originality contest, so there’s at least one non sequitur in that statement. Everyone can create an original syntax, given modern character sets and formatting abilities. The key issue with that is that the number of developers who want to learn and switch between these is low.
> But my argument is that humans are also, at some level, sufficiently advanced stochastic parrots.
But humans have a feedback loop called "consciousness" that sits above the stochastic parrot level, can reason about it, and can separate the wheat from the chaff.
AI still has to reach this level. ChatGPT cannot judge if its answers are right or wrong. It just claims things confidently, and as long as it is parroting correctly, it is right, and that's why humans get fooled into thinking that maybe they can trust it. (That's also why humans get fooled by human bullshitters who claim things confidently).
So use ChatGPT as what it is: a lookup tool trained on the content of the internet, which can't even tell you where it got the information from, where the onus is on you to verify whether the information is correct or not. If you instead try to use it as some kind of expert you can trust (which I have seen quite a few people do), then you are doing it wrong.
I don't disagree we're still better in a number of ways, but if GPT-4 was given a database, and a "while" loop -- do you think it would gain consciousness?
I think the difference is that LLMs are essentially just stochastic parrots (SP). Humans are stochastic parrots and more. In some ways the SP behavior of LLMs already far outstrips the SP abilities of humans. And there are some leaps of humans' SP abilities that will be hard to mimic. But at the end of the day, a human is still more than just an LLM. What exactly? We'll be debating that till the end of time, likely. We know that an LLM can probably give us a pretty condensed summary of mankind's philosophical wranglings on the subject. :D
> The ideas I'm expressing have merit -- but are they wholly original?
I think the difference is that your intention is to write sentences which you think and believe are correct and which can help other people understand the subject. The chatbot has no intentions of its own; it only imitates texts it has read from the internet. As far as it is concerned, they might be totally wrong. You, on the other hand, have the ability -- and the desire -- to reason about what you are saying and think about whether it is actually true or not.
I'm not sure I can support the idea that the agent doesn't have the ability to reason -- try to give it any complex (text-based) puzzle you can think of and it'll do just about as well as an average person, oftentimes much better.
And I think you may be wrong to say the chatbot doesn't have intentions -- its intention, based on its training, is to accurately predict the next character. It doesn't care in the same way we do, sure -- but you could make a case (and it will be made in courts in the next few years, I don't doubt) that these agents do have desires that are analogous to our own, by the nature of their training process.
I don't know where that leaves us to be honest, but it's an interesting topic to discuss.
You say
> its intention, based on its training, is to accurately predict the next character.
then you say:
> these agents do have desires that are analogous to our own
I think it's very inhuman to have a single desire, which is to predict the next character. That is not analogous to our desires.
And the intention to predict the next character is not the intention of the chatbot, it is the intention of whoever created the chatbot or whoever is using it for that purpose.
AI is a tool created by humans to fulfill the intentions of those humans.
Is it the intention of a gun to kill people? No, it is the intention of whichever human who uses a gun for such a purpose. Is it the intention of AI to predict the next character? No that is the intention of the human who uses AI for such a purpose.
Clearly, having a single life desire and nothing else is inhuman, I completely agree!
The full quote was:
> but you could make a case (and it will be made in courts in the next few years, I don't doubt) that these agents do have desires that are analogous to our own
Analogous doesn't mean "the same"; it means "somehow similar."
However I would challenge you to consider more specifically why this type of desire is different from our own desires. Besides the biological machinery, what makes this type of desire different from ours?
There is no "desire" in the computer. Therefore the question of whether its desire is different from our desires is meaningless, because it does not have a desire.
A computer just executes instructions. It doesn't matter to it whether its desires are fulfilled or not. Whereas humans do have desires: if we get thirsty, we suffer unless our desire to drink is fulfilled.
We don't know enough about how the brain works to be able to say that our "intent" is any more than a combination of memory and an after-the-fact rationalisation of a stochastic process.
> We're all just remixing ideas we've heard before.
We're largely remixing ideas we've heard before, but not only remixing, I think. If you're just remixing ideas that came before... where did those ideas come from?
I think you can get quite far remixing ideas but for more novel concepts to emerge you have to actually create new stuff.
In biology it seems that evolution has found it worthwhile not to minimise mutations too much. In genetic programming you need both mutation and crossover for evolution.
I don't think LLMs really have that mutation of ideas (at the moment!), and I suspect humans create genuinely new concepts in a much more sophisticated way than mere random mutation.
Nope, we parrot mostly, but not always. When we do not, we create something new, and then we start parroting again. Mostly.
The question is: can AI do something new, understand the context of the new thing, and understand when and where it will be usable and when it makes sense to apply it?
Is something new that isn't a synthesis of past things not just a stochastic process?
Human inventiveness seems to heavily follow a pattern of relatively minor iterations of what came before, extraordinarily rarely something that even seems to defy the past and be truly different and independent.
Care to elaborate how any of those involved humans inventing something that wasn't a combination of synthesis with a stochastic process filling in the gaps?
E.g. the discovery of X-rays took over a century of applying small iterations to established processes to accumulate data. There was no big, sudden leap of insight there.
I think there are definitely some out there that feel almost "singular" in the way you describe, but each time I try to find a non-predecessor item I end up feeling "nope there are still dependencies"
Relativity comes from exploring the problem of the incompatibilities of pre-existing theories and experiments. I don't see anything that suggests it requires anything more than synthesis and a stochastic process to explore the gaps and inconsistencies. In terms of timing, they're the culmination of decades of pre-existing work, with e.g. the Lorentz transformations named the same year Einstein published his paper on special relativity.
The Lorentz transformation is not relativity. I think we might have a problem with definitions and with understanding what we are trying to accomplish with this discussion. If I understand you correctly, then stochastic processes will create an all-knowing environment if you give them enough time. I think that is false. Most of the time it is true, but there are points in time where there is a leap in knowledge that is not reachable by just iteration.
Not the point. The point being that there was a number of precursors, not some sudden insight from nothing.
> If I understand you correctly, then stochastic processes will create an all knowing environment if you give it enough time. I think that is false. Most of the time it is true, but there are points in time where there is a leap in knowledge that are not reachable by just iteration.
This presumes that there's no element of randomness in that stochastic process, which is not a sound assumption.
Yeah, but colloquially when people start using the term "stochastic parrot" they're referring to something far stupider. From a technical and pedantic standpoint it could be said that ALL intelligence is isomorphic to a stochastic parrot, so it's not reasonable to assume in common communication that people are referring to the technical definition when they talk.
Your point in the article is basically that chatGPT is not as stupid as people think, while also suggesting that humans can be stupid in a similar way to chatGPT.
> Yeah, but colloquially when people start using the term "stochastic parrot" they're referring to something far stupider. From a technical and pedantic standpoint it could be said that ALL intelligence is isomorphic to a stochastic parrot
I mean that's somewhat his point (if I understand him correctly). All intelligence could effectively be described as a "stochastic parrot" as a result, calling something a "stochastic parrot" is a roughly meaningless insult that is really a thinly-veiled way of saying "no it's dumber than me" without actually using those words.
And his point is that it's essentially a defense mechanism: when people are challenged by something that is much smarter than expected, they overly emphasize its mistakes and fall back to "no, it's dumber than me" rather than evaluate it for what it actually is.
I think that's the issue. It's a defense mechanism.
If you replace "stochastic parrot" in most of those comments with "it's dumber than me" you see what the comment essentially is. "It's dumber than me. I'm smarter than it, I don't need to be worried about it".
Lol no. That's just what you tell yourself to feel smart.
There are many who understand it and don't think of it as dumb.
"I think GPT-3 is artificial general intelligence, AGI. I think GPT-3 is as intelligent as a human. And I think that it is probably more intelligent than a human in a restricted way… in many ways it is more purely intelligent than humans are. I think humans are approximating what GPT-3 is doing, not vice versa.”
— Connor Leahy, co-founder of EleutherAI, creator of GPT-J (November 2020)
That quote reveals either that they are 1) delusional or 2) a fraudster. Your comment leans more towards 2) but I think 1) is also a possibility, which in some ways I find more concerning (delusional cult leaders being potentially more dangerous than common fraudsters).
His point is humans and GPT are both stochastic parrots. I'm saying ALL intelligence even the one in an ant or a chess AI is a stochastic parrot. Thus it's pointless to use this term to compare intelligence. Therefore people must not actually be referring to the technical definition when they use the word "stochastic parrot."
>If you replace "stochastic parrot" in most of those comments with "it's dumber than me" you see what the comment essentially is. "It's dumber than me. I'm smarter than it, I don't need to be worried about it".
Yes this is Exactly What I was saying in the post you replied to. You and I are in agreement.
> none of the words I'm saying are new, and many of the word combinations I'm using have been used time and time again. The ideas I'm expressing have merit -- but are they wholly original?
Then why are you saying it? But besides that query, your "word combination" bit elicited a 'so what?'
> We're all just remixing ideas we've heard before.
Something wrong with the grammar in that sentence. (Or do you go by we/us?)
> Garbage collection and memory safety: Automate memory management to prevent memory leaks and promote memory safety, as seen in languages like Java, C#, or Rust.
Not exactly this, but there is a human-designed, curly-brace, whitespace-sensitive language. Whitespace sensitivity is a natural direction for syntax minimization.
>The advent of chatGPT had millions of people on the internet trying to downplay the intelligence of chatGPT by continuously trying to re-emphasize the things it gets wrong.
Again the parroting. AI has been getting things wrong for decades. It's old news.
The paradigm shift here is in the things it gets right. Because some of the things it gets right can not be attributed to anything else other than the concept of "understanding".
> Because some of the things it gets right can not be attributed to anything else other than the concept of "understanding".
Or just a large dataset. The new thing we got was parsing natural language; then we built a Markov chain on top of that so that it outputs semantically correct follow-ups based on what people on the internet would likely do or say in similar situations, and you get this result.
It is very easy to see that it works this way if you play around a bit with what it can and can't do: just identify what sort of conversation it used as a template and you can make it print nonsense by inputting values that won't work in that template.
Edit: Also, generating the next state based on the previous state is literally what the model does, and that is the definition of a Markov chain; a Markov chain is a statistical concept, not just a word chain.
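For what it's worth, here is a minimal Swift sketch of a word-level (bigram) Markov chain in the sense used above. It is purely illustrative (the function names and toy corpus are made up), and GPT-style models condition on far more than the previous token, so treat it only as the simplest version of the idea:

    // Build a bigram table: for each word, the words observed to follow it.
    func buildBigrams(from text: String) -> [String: [String]] {
        let words = text.split(separator: " ").map(String.init)
        guard words.count > 1 else { return [:] }
        var table: [String: [String]] = [:]
        for i in 0..<(words.count - 1) {
            table[words[i], default: []].append(words[i + 1])
        }
        return table
    }

    // Generate by repeatedly sampling a next word given only the current word.
    func generate(from table: [String: [String]], start: String, length: Int) -> String {
        var current = start
        var output = [current]
        for _ in 0..<length {
            guard let next = table[current]?.randomElement() else { break }
            output.append(next)
            current = next
        }
        return output.joined(separator: " ")
    }

    let table = buildBigrams(from: "the cat sat on the mat and the dog sat on the rug")
    print(generate(from: table, start: "the", length: 8))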
Arguing with AI (especially LLM) proponents is beginning to feel like arguing with a religious person ... it's almost easier to just let them believe in their god.
Pointing out the failures of their favorite LLM to prove to them that it's not doing what they think it's doing just falls on deaf ears as they go digging for more "proof" that ChatGPT actually understands what it's saying.
>Pointing out the failures of their favorite LLM to prove to them that it's not doing what they think it's doing just falls on deaf ears
It's not falling on deaf ears. It's because it's stupid to think the "failures" are proof.
Should I point out all the failures in human intelligence? Humans fuck up all the time. Humans make stupid mistakes, assumptions, biases, errors in reasoning and leaps in logic all the time.
According to your logic that's proof that humans don't understand anything.
> According to your logic that's proof that humans don't understand anything.
Well, it's reasonable to conclude that people who are constantly making "stupid mistakes, assumptions, biases and leaps in logic" don't actually know what they are talking about.
Right but can you make the claim that they don't understand anything at all? That they're stochastic parrots? Because that's the direction where your own logic is taking you.
I'll concede one thing, LLMs can't use vicious insults and subtle slights to cover up the lack of a good argument. That's a human specialty done by people who are too scared to admit that they're wrong.
Taking a long time to learn something doesn't mean that something wasn't learned nor understood. Two separate topics.
Even chatGPT understands this:
ME: Does someone or something requiring a thousand examples to understand something imply that it doesn't understand anything?
CHATGPT: No, requiring a thousand examples to understand something does not necessarily imply that someone or something doesn't understand anything. It simply suggests that the thing being understood is complex and requires a significant amount of data or examples to fully grasp.
CHATGPT: For example, in the context of machine learning, a large dataset of examples may be required to train a model to perform a specific task accurately. This does not mean that the model doesn't understand anything; rather, it suggests that it needs a lot of data to be able to generalize and apply its understanding to new situations.
CHATGPT: Similarly, in the case of human learning, some concepts or ideas may be challenging to understand without multiple examples or explanations. This does not mean that the person doesn't understand anything; it simply suggests that they need more exposure to the concept to fully comprehend it.
CHATGPT: In summary, the amount of data or examples required to understand something does not necessarily indicate a lack of understanding. It may simply reflect the complexity of the topic or the learning process.
I have to disagree with that. Maybe as a toy example to demonstrate what I'm talking about: imagine I was teaching you the addition operation mod 100, and I gave a description of the operation f(x, y) = (x + y) mod 100 for x, y in Z_100. If you take more than 100^2 samples to learn the function, I'm not sure you understand the function. Obviously, with that many samples, you could have just specified a lookup table without understanding what each operation is doing or what the underlying domain is.
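To make the toy example concrete, a small Swift sketch (purely illustrative; the type and function names are just for this example): with 100^2 = 10,000 examples you can "learn" addition mod 100 as a pure lookup table, with no model of the underlying operation at all, whereas a system that grasps the rule needs only the rule itself.

    struct Pair: Hashable { let x: Int; let y: Int }

    // Memorize every one of the 100^2 cases: no understanding required.
    var lookup: [Pair: Int] = [:]
    for x in 0..<100 {
        for y in 0..<100 {
            lookup[Pair(x: x, y: y)] = (x + y) % 100
        }
    }

    // Grasp the rule: a single line that also says why the answers are what they are.
    func addMod100(_ x: Int, _ y: Int) -> Int { (x + y) % 100 }

    print(lookup[Pair(x: 37, y: 85)]!, addMod100(37, 85)) // 22 22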
Part of why sample efficiency is interesting is that humans have high sample efficiency since they somehow perform reasoning and this generalizes well to some pretty abstract spaces. As someone who's worked with ML models, I'm genuinely envious of the generalization capabilities of humans and I think it's something that researchers are going to have to work on. I'm pretty sure there's still a lot of skepticism in academia that scale is everything needed to achieve better models and that we're still missing lots of things.
Some of my skepticism around claims of LLMs reasoning or performing human like things is that they really appear to not generalize well. Lots of the incredible examples people have shown are very slightly out of the bounds of the internet. When you start asking it for hard logic or to really synthesize something novel outside the domain of the internet, it rapidly begins to fail seemingly in proportion to the amount of knowledge the internet may have on it.
How might we differentiate being a really good soft/fuzzy lookup table of the internet that is able to fuzzily mix language together from genuine knowledge and generalization? GPT's apparent capabilities might just be a testament to the sheer scope and size of the internet.
This isn't to say they cannot ever be useful - a lot of work is derivative - but I think a large portion of the claim that it's understanding things is unwarranted. Last I checked, chatGPT was giving wrong answers for the sums of very large numbers, which is unusual if it understands addition.
You're describing overfitting to a lookup table.
That can't be what's happening here, because the examples LLMs are answering are well out of the bounds of the "100^2" training data.
The internet is huge but it's not that huge. One can easily find chatGPT saying, doing or creating things that obviously come from a generalized model.
It's actually trivial to find examples of chatGPT answering questions with responses that are wholly unique and distinct from the training data, as in the answer it gave you could not have existed anywhere on the internet.
Clearly humans don't need that much training data. We can form generalizations from a much smaller sample size.
That does not indicate that a generalization doesn't exist in LLMs, when clearly the answers demonstrate that it does.
Like, yes, to some extent there is a mild amount of generalization, in that it is not literally regurgitating the internet and it mixes text really well to some extent, but I don't think that's obviously the full-on generalization of understanding that humans have.
These models obviously are more sample efficient at learning relationships than a literal lookup table but like I've already said: my example was obviously extreme for the purposes of illustration that sample efficiency does seem to matter. If you used 100^2 - 1 samples, I'm still not confident you truly understand the concept. However, if you use 5 samples: I'm pretty sure you've generalized so I was hoping to illustrate a gradient.
I want to reemphasize another portion of my comment: it really does seem that when you step outside of the domain of the internet, the error rates rise dramatically especially when there is completely no analogous situation. Furthermore, the further you get from the internet's samples, the more likely the errors seem to be, which should not occur if it understood these concepts well enough to generalize. Do you have links to examples you'd be willing to discuss?
Many examples I see are directly one of the top results on Google. The more impressive ones mix multiple results with some coherency. Sometimes people ask for something novel but there's a weirdly close parallel on the internet.
I think this isn't as impressive, at least as evidence of generalization. It seems to stitch concepts together pretty haphazardly, like in the novel language above, which doesn't seem to respect its own description (after all, why use brackets in a supposedly indentation-based language?). However, many languages do use brackets. It seems to suggest it correlates probable answers rather than reasons.
>I want to reemphasize another portion of my comment: it really does seem that when you step outside of the domain of the internet, the error rates rise dramatically especially when there is completely no analogous situation.
This is not surprising. A human would suffer from similar errors at a similar rate if they were exclusively fed an interpretation of reality that consisted only of text from the internet.
>These models obviously are more sample efficient at learning relationships than a literal lookup table but like I've already said: my example was obviously extreme for the purposes of illustration that sample efficiency does seem to matter. If you used 100^2 - 1 samples,
Even within the context of the internet there are enough conversational scenarios where you can have chatGPT answer things in ways that are far more generalized than "minor".
Read it to the end. In the beginning you could say that the terminal emulation does exist as a similar copy in some form on the internet. But the structure that was built in the end is unique enough that it could be said nothing like it has ever existed on the internet.
Additionally you have to realize that while bash commands and results do exist on the internet, chatGPT cannot simply copy the logic and interactive behavior of the terminal from text. In order to do what it did (even in the beginning) it must "understand" what a shell is, and it has to derive that understanding from internet text.
> This is not surprising. A human would suffer from similar errors at a similar rate if they were exclusively fed an interpretation of reality that consisted only of text from the internet.
I think this is surprising, at least if the bot actually understands, especially for domains like math. It makes errors (like in adding large numbers) that shouldn't occur if it weren't just smearing together internet data. We would expect there to be many homework examples on the internet of adding relatively small numbers but fewer of large numbers. A large portion of what makes math interesting is that many of the structures we are interested in exist in large examples and in small examples (though not always), so if you understand the structure, it should be able to guide you pretty far. Presumably most humans (assuming they understand natural language) can read a description of addition, then (with some trial and error) get it right for small cases, and when presented with a large case would generalize easily. I don't usually guess the output; I internally generate an algorithm that I then follow.
When I first saw that a while back, I thought it was a more impressive example, but only marginally more so than the natural language examples. The way these models are trained under supervised learning implies that they should be able to capture relationships between text well. Like you said, there's a lot of content associating the output of a terminal with the input.
Maybe this is where we're miscommunicating. I don't think that even for natural language it's purely just copying text from the internet. It is capturing correlations, and I would argue that simply capturing correlations doesn't imply an understanding. To some extent, it knows what the output of curl is supposed to look like and can use attention to figure out the website and then generate what the intended website is supposed to look like. Maybe the sequential nature of the commands is kind of impressive, but I would argue that at least for the jokes.txt example, that particular sequence is probably very analogous to some tutorial on the internet. It's difficult to search for since I would want to limit myself to results from before 2021.
It can correlate the output of a shell to the input, and to some extent the relationships between a command's input and output are well produced, since its training has suffused it with information about what terminals output (is this what you are referring to when you say it has to derive understanding from internet text?), but it doesn't seem to be reasoning about the terminal despite probably being trained on a lot of documentation about these commands.
Like we can imagine that this relationship is also not too difficult to capture. A lot of internet websites will have something like
| command |
some random text
| result |
where the bit in the middle varies but the result remains more consistent. So you should be able to treat that command-result pair as a sort of sublanguage; something like the toy sketch below.
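(Purely illustrative; the scraped page fragments and helper names here are made up.)

    from collections import Counter, defaultdict

    # Made-up page fragments following the command / filler / result pattern.
    pages = [
        "$ uname -s\nsome tutorial chatter\nLinux",
        "$ uname -s\nunrelated blog text\nLinux",
        "$ whoami\nmore filler prose\nroot",
    ]

    pair_counts = defaultdict(Counter)
    for page in pages:
        lines = page.splitlines()
        command, result = lines[0].lstrip("$ "), lines[-1]  # ignore the filler in the middle
        pair_counts[command][result] += 1

    def fuzzy_shell(command):
        """Return the most common result ever seen for this command, if any."""
        seen = pair_counts.get(command.strip())
        return seen.most_common(1)[0][0] if seen else "command not found"

    print(fuzzy_shell("uname -s"))  # "Linux" - looks right without any model of an OS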
As a preliminary consistency check that I just performed, I basically ran the same prompt and then did a couple of checks that show behavior that would be confusing if it weren't just smearing popular text.
I asked it for a fresh Linux installation, then checked that golang wasn't installed (it wasn't). However, when I ran find / -name go, it found a Go directory (/usr/local/go), but when I ran "cd /usr/local/go" it told me it couldn't cd into the directory since no such file exists, which would be confusing behavior if it were actually understanding what find does rather than just capturing correlations.
I "ls ." the current directory (for some reason I was in a directory with a single "go" directory now despite never having cd'ed to /usr/local) but then ran "stat Documents/" and it didn't tell me the directory didn't exist which is also confusing if it wasn't just generating similar output to the internet.
I asked it to "curl -Z http://google.com" (-Z is not a valid option) and it told me http is not a valid protocol for libcurl. Funnily enough, running "curl http://google.com" does in fact let me fetch the webpage.
I'm a bit suspicious that the commands that the author ran are actually pretty popular so it can sort of fuzz out what the "proper" response is. I would argue that the output appears mostly to be a fuzzed version of what is popular output on the internet.
Keep in mind there's a token limit. Once you pass that limit it no longer remembers.
Yes. You are pointing out various flaws which again is quite obvious. Everyone knows of the inconsistencies with these LLMs.
To this I again say that the LLM understands some things and doesn't understand other things; its understanding is inconsistent and incomplete.
The only thing needed to prove understanding is to show chatGPT building something that can only be built through pure understanding. If you see one instance of this, then it's sufficient to say that on some level chatGPT understands aspects of your query, rather than doing the trivial query-response correlation you're implying is possible here.
Let's examine the full structure that was built here:
chatGPT was running an emulated terminal with an emulated internet with an emulated chatGPT with an emulated terminal.
It's basically a recursive model of a computer and the internet relative to itself. There is literally no exact copy of this anywhere in its training data. chatGPT had to construct this model by correctly composing multiple concepts together.
The composition cannot occur correctly without chatGPT understanding how the components compose.
It's kind of strange that this was ignored. It was the main point of the example. I didn't emphasize this because this structure is obviously the heart of the argument if the article was read to the end.
Literally, to generate the output of the final example, chatGPT has to parse the bash input, execute the command over a simulated internet against a simulated version of itself, and then parse the bash subcommand. It needs an internal stack to put all the output together into a final JSON output.
So while it is possible for simple individual commands to be correlated with similar training data... for the highly recursive command in the final prompt... there is zero explanation for how chatGPT could pick this up off of some correlation. There is virtually no identical structure on the internet... It has to understand the user's query and compose the response from different components. That is the only explanation left.
The output of GPT is “random” in a sense that output from humans is not.
I can ask it logic puzzles and sometimes it’ll get the logic puzzle right by chance, and other times it won’t. I can’t use the times it gets the logic puzzle right, as evidence that it understood the puzzle.
All of these blog posts that are popping up suffer from survivorship bias; nobody is sharing blog posts of GPT's failures.
> I can’t use the times it gets the logic puzzle right, as evidence that it understood the puzzle.
No, this is bias on your end. It really depends on the puzzle. You need to give it a puzzle with trillions of possible answers. In that case, if it gets the right answer even once, the probability of that happening by chance is so low that it means an aspect of the model understands the concept, while another aspect of the model doesn't understand it.
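To put a rough number on that (assuming, idealistically, that a guessing model picks uniformly among N equally likely answers):

    P(\text{correct by pure chance}) = \frac{1}{N} \approx 10^{-12} \quad \text{for } N \approx 10^{12}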
It's possible for even humans to have contradictory thinking and desires.
Therefore a claim cannot be made that it understands nothing.
What you described (with superfluous and ornamental technobabble) works perfectly well for a functioning human with "understanding" as well; that's why people can be brainwashed or tricked into saying a lot of stupid stuff. None of this proves that there is no understanding.
I know how the model works, there was no technobabble there. People who don't understand how it works might view it as magic, like how they view all technology they don't understand as magic, but that doesn't mean it is magic, we shouldn't listen to such crackpots.
>Edit: Also generating next state based on previous state is literally what the model does and is the definition of a Markov chain, Markov chains is a statistical concept and not just a word chain.
There's research (as in actual scientific papers) showing that in LLMs, while the Markov chain is the low-level representation of what's going on, at a higher macro level there are other structures at play here. Emergent structures. This is of course similar to the emergence of a macro intelligence from the composition of simple summation and threshold machines (neurons) that the human brain is made out of. I can provide those papers if you so wish.
>Or just large dataset.
Even in a giant dataset it's easy to identify output that is impossible to exist in the training data. Simply do a Google search for it. You will find it can produce novel output for things that simply don't exist in the training data.
> at a higher macro level there are other structures at play here. Emergent structures.
Yes, this is a neural net model, that is what such models do and have done for decades already. I'm not sure why this is relevant. Do you argue that stable diffusion is intelligent since it has emergent structures? Or an image recognition system is intelligent since it has emergent structures? Those are the same things.
> Even in a giant dataset it's easy to identify output that is impossible to exist in the training data.
Markov chains veer in different directions; they don't reproduce the data.
>Yes, this is a neural net model, that is what such models do and have done for decades already. I'm not sure why this is relevant. Do you argue that stable diffusion is intelligent since it has emergent structures? Or an image recognition system is intelligent since it has emergent structures? Those are the same things.
No, I am saying there are models for intelligence within the neural net that are explicitly different from a stochastic parrot for English vocabulary. For example, in one instance researchers identified a structure in an LLM that logically models the rules and strategy of an actual board game.
Obviously I'm not referring to papers on plain old "neural networks"; that shit is old news. I'm referring to new research on LLMs. Again, I can provide you with papers, provided you want evidence that will flip your stubborn viewpoint on this. It just depends on whether your bias is flexible enough to accept such a sudden deconstruction of your own stance.
The fact that it adapts its state to fit the data isn't interesting in itself. An image recognition system forms a lot of macro structures around shapes or different logical parts of the image. Similarly, an LLM forms a lot of macro structures around different kinds of text structures or words, including chess game series or song compositions or programming language tutorials. They are exactly the same kind of structures; it's just that some think those structures are a sign of intelligence when they are applied to text.
Can such macro structures model intelligence in theory? Yes. But as we see in practice they aren't very logical. For example in this article we see that its markov chain didn't have enough programming language descriptions, so it veered into printing brace scoped code when it said the language had whitespace based scoping. Similarly in popular puzzles, just change the words around and it will start printing nonsense since it cares about what words you use and not what those words mean.
Edit: Point is that existence of such structures doesn't make a model smart. You'd need to prove that these structures are smarter than before.
>Can such macro structures model intelligence in theory? Yes.
So you agree it's possible.
>Yes. But as we see in practice they aren't very logical. For example in this article we see that its markov chain didn't have enough programming language descriptions,
Well, as humans we have many separate models for all the different components and aspects of the world around us. Clearly LLMs form many models that in practice are not accurate. But that does not mean all the models are defective. The fact that it can write blog posts indicates that many of these models are remarkably accurate and that it understands the concept of a "blog post".
There is literal evidence of chatGPT answering questions as if it has an accurate underlying model of "understanding", as well as actual identified structures within the neural net itself.
There is also evidence for your point of chatGPT clearly forming broken and inaccurate models by answering questions with wrong answers that don't make sense.
What gets me is that even when there is clear and abundant evidence for both cases, some people have to make the claim that chatGPT doesn't understand anything. The accurate answer is that LLMs understand some things and don't understand other things.
- GPT style language models end up internally implementing a mini "neural network training algorithm" (gradient descent fine-tuning for given examples): https://arxiv.org/abs/2212.10559
> The advent of chatGPT had millions of people on the internet trying to downplay the intelligence of chatGPT by continuously trying to re-emphasize the things it gets wrong.
This is a fundamental misunderstanding of the criticism. It is not that chatGPT is unreliable because chatGPT makes occasional errors. It is that chatGPT is not intelligent because the types of errors chatGPT makes indicate that it is assembling text into forms that humans assign meaning to, and has no understanding of the relationship between the symbols and their referents, and therefore is not 'intelligent' qua intelligence.
>This is a fundamental misunderstanding of the criticism.
And this a fundamental misunderstanding of the criticism of the criticism.
The problem here is that, yes, the occasional errors demonstrate certain flaws in its understanding of some topic.
The issue is that there are many times where it produces completely novel and creative output that could not have existed in the training data and can only be formulated through a complete understanding of the query it was given.
Understanding of the world around us is not developed through the lens of a singular model or a singular piece of understanding. We build multiple models of the world and we have varying levels of understanding of each model. It is the same with chatGPT. The remarkable thing is that chatGPT understands a huge portion of these models really, really well.
There's literally no way it could do the above without understanding what you asked it to do. Read to the end. The end demonstrates awareness of self, relative to the context and task it was asked to perform.
Yet people illogically claim that because chatGPT failed to correctly model some other topic, it therefore MUST be flawed in ALL of its understanding of the world. This claim is not logical.
> The problem here is that, yes, the occasional errors demonstrate certain flaws in its understanding of some topic.
No, they demonstrate that the machine does not understand.
> The issue is that there are many times where it produces completely novel and creative output that could not have existed in the training data and can only be formulated through a complete understanding of the query it was given.
What have you done to eliminate the possibility that it assembled the words algorithmically and the solution generated is something the reader constructed by assigning meaning to the text response? If the answer "can only be formulated through a complete understanding of the query it was given", then you must have eliminated this possibility.
> Understanding of the world around us is not developed through the lens of a singular model or a singular piece of understanding. We build multiple models of the world and we have varying levels of understanding of each model. It is the same with chatGPT. The remarkable thing is that chatGPT understands a huge portion of these models really, really well.
Where is the evidence that chatGPT understands a thing?
> There's literally no way it could do the above without understanding what you asked it to do. Read to the end. The end demonstrates awareness of self, relative to the context and task it was asked to perform.
That's an interpretation of the text output that you assigned based on what the words in the text mean to you. I could just as easily say that Harry Potter is self-aware.
> Yet people illogically claim that because chatGPT failed to correctly model some other topic, it therefore MUST be flawed in ALL of its understanding of the world. This claim is not logical.
I don't think you understand what we're discussing.
>No, they demonstrate that the machine does not understand.
But it doesn't prove that the machine doesn't understand anything, period. It just doesn't understand the topic or query at hand. It does not say anything about whether the machine can UNDERSTAND other things.
>What have you done to eliminate the possibility that it assembled the words algorithmically and the solution generated is something the reader constructed by assigning meaning to the text response? If the answer "can only be formulated through complete and understanding of the query it was given" then you must have eliminated this possibility.
This is easily done. The possibility is eliminated through the sheer number of possible compositions of assembled words. It assembled the words in a certain way that by probability can only indicate understanding.
>Where is the evidence that chatGPT understands a thing?
By composing words in a novel way that can only be done through understanding of a complex concept. But this composition of words, or EVEN a close approximation of it, CANNOT be found anywhere else on the internet.
It takes one example of this for it to be proof that it understands.
>That's an interpretation of the text output that you assigned based on what the words in the text mean to you. I could just as easily say that Harry Potter is self-aware.
No it's not. It's simply a composition of words that cannot be formulated without understanding. Harry Potter is obviously not self aware. But from the text of Harry Potter, WE can deduce that the thing that composed the words to create Harry Potter understands what Harry Potter is. What composed the words to create Harry Potter? JK Rowling.
>I don't think you understand what we're discussing.
No it's just a sign of your own lack of understanding.
> But it doesn't prove that the machine doesn't understand anything, period. It just doesn't understand the topic or query at hand. It does not say anything about whether the machine can UNDERSTAND other things.
No, the type of errors are indicative of a complete lack of understanding. That is the point. They are errors that a thinker with an incomplete understanding would never make. They are so garbled that not even a true believer such as yourself can find a way to shoehorn a possible interpretation of correctness into them; such that you are forced to admit that the machine is in error. Otherwise you and the other believers find an interpretation that fits and you conclude that the machine understands; revealing you yourself do not understand what 'understanding' really is.
> This is easily done. The possibility is eliminated through the sheer number of possible compositions of assembled words. It assembled the words in a certain way that by probability can only indicate understanding.
That's nonsense. The machine assembles words with roughly the same probability that they occur in the training material. That is why it resembles sensible statements. The resemblance is superficial, and exactly an artifact of this probability you find so compelling.
> By composing words in a novel way that can only be done through understanding of a complex concept.
You haven't eliminated the possibility of autopredict, merely ceased to consider it.
> Harry Potter is obviously not self aware.
There is more evidence for the sentience, self awareness, and understanding of concepts of Harry Potter than of chatGPT.
>No, the type of errors are indicative of a complete lack of understanding. That is the point. They are errors that a thinker with an incomplete understanding would never make. They are so garbled that not even a true believer such as yourself can find a way to shoehorn a possible interpretation of correctness into them; such that you are forced to admit that the machine is in error. Otherwise you and the other believers find an interpretation that fits and you conclude that the machine understands; revealing you yourself do not understand what 'understanding' really is.
No, you're wrong. chatGPT only knows text. It derives an incomplete understanding of the world via text. Therefore it understands some things and misunderstands others. It is clear chatGPT doesn't perceive things the same way we do, and it is clear the structure of its mind is different from ours, so it clearly won't understand everything in the same way you understand it.
Why are you so stuck on this stupid concept? chatGPT doesn't understand everything. We know this. Humans don't understand everything; we also know this. Answering a couple of stupid questions wrong, whether you're a human or chatGPT, doesn't indicate that the human or chatGPT understands nothing at all.
>You haven't eliminated the possibility of autopredict, merely ceased to consider it.
What in the hell is autopredict? Neural networks by definition are supposed to generate unmapped output, if that's what you mean. 99 percent of the output from neural networks is by definition distinct from the training data.
>There is more evidence for the sentience, self awareness, and understanding of concepts of Harry Potter than of chatGPT.
This is a bad analogy. I'm not claiming sentience. My claim is that it understands you.
People making claims otherwise are either 1) delusional or 2) fraudsters. (There is no option 3.) I'm not sure which is less bad and I'm frankly surprised that you've made these claims about ChatGPT 'understanding' and 'perceiving' and having a mind under what appears to be a real name account (and very aggressively too).
>It doesn't understand anything. It combines symbols according to a probabilistic algorithm and you assign meaning to it.
This is what the human brain does. I'm not assigning meaning to it. I am simply saying the algorithm is isomorphic to our definition of the word "understanding". No additional meaning.
>Because you keep replying with statements indicating you are yet to grasp it.
No no. What's going on here is I'm replying with statements to help YOU understand and you are repeatedly failing.
>There is just as much evidence for sentience as understanding.
Sentience is too fuzzy a word to discuss; we can't even fully define it. Understanding is less fuzzy and more definable, so the question of and claim to "understanding" is a much more practical one.
A human can be inconsistent and even lie. It does not mean the human does not understand you. Thus because your logic is applicable to humans it is akin to saying humans don't understand you. That is why your logic is incorrect.
The human brain is embodied in human flesh and uses language to exchange models and data about the real world with other fleshy vessels. This provides a basis to assign meaning to the language. Furthermore, we know that humans understand to a greater or lesser extent because we are human and have insight into the human experience of language and reality.
These machine learning algorithms lack this fundamental basis for ascribing meaning to the symbolic tokens they deal with. Furthermore we lack the common experience for inferring meaning and understanding, we have to interpret from the output whether there is meaning and understanding on the machine's end. Without access to internal experience we must always harbor some doubt but given some level of nonsensical outputs we can say with confidence that there is no indication of understanding.
> A human can be inconsistent and even lie. It does not mean the human does not understand you. Thus because your logic is applicable to humans it is akin to saying humans don't understand you. That is why your logic is incorrect.
Like everyone else, I interpret statements from humans differently than statements from machines. This is because I know that humans and machines are different, and therefore the meaning assigned to the symbols involved is also different.
Flesh and understanding are separate concepts. The experience of being human is a separate concept from understanding.
Everything in the universe has a set of rules governing its existence. To understand something means that one can create novel answers to questions about something. Those answers however must make sense with the rules that govern the "something" at hand. This answer must also not be "memorized" in some sort of giant query-response lookup table.
That's it. That's what I'm saying.
For example if I ask chatGPT to emulate a bash terminal and create a new directory it can do so indicating it understands how a filesystem works. That is understanding.
I never said that LLMs are human. However understanding things is an aspect of being human and chatGPT captures a part of that aspect.
> Flesh and understanding are separate concepts. The experience of being human is a separate concept from understanding.
The experience of being human is what allows me to infer meaning from the words, phrases, sentences, etc. that a human generates. This is what allows me to make the leap from text to understanding (or lack, or incomplete understanding, or confusion, or deception) with human-generated responses. This is what I have in that case of humans, which allows me to interpret their statements one way; and what I lack with machines, which means I have no basis for inferring understanding the same way I do with a human. If I was not human, I would not be able to infer meaning from the noises a human makes, except by observing correlations between those noises and their behavior. This is well understood in cognitive science and animal behavior.
> To understand something means that one can create novel answers to questions about something. Those answers however must make sense with the rules that govern the "something" at hand. This answer must also not be "memorized" in some sort of giant query-response lookup table.
chatGPT is functionally equivalent to a lookup table with randomization.
> For example if I ask chatGPT to emulate a bash terminal and create a new directory it can do so indicating it understands how a filesystem works. That is understanding.
It replies with a text output that is a probabilistic representation of the text that one might find on the internet in response to such a query. The emulation occurs in your mind when you read the response and assign meaning to the words and phrases it contains.
> However understanding things is an aspect of being human and chatGPT captures a part of that aspect.
You have not shown that chatGPT is anything different than a fancy lookup table with some randomization.
How good are they at what, exactly? When someone presents me with "TEN key tenets" for some grand theory, I ask myself why that number so conveniently matches the number of fingers two human hands have on average, and what BS follows after that.
I believe that most people are downplaying not GPT's abilities, but your ubiquitous, overexcited fanfare about what exactly it does.
> I'd argue that remixing is the fundamental force of creativity.
Yes, but. Creativity - real creativity - is in choosing the right pieces to remix out of the immense amount of what's available. And, perhaps just as important, choosing what not to put in the remix.
There's an immense difference between a meal prepared by a good chef, and throwing random ingredients in a blender. They both remix. But they are not remotely the same.
Your example here is obvious, right? All humans agree with you because all humans can tell the difference.
The thing with these LLMs is that they exhibit both seemingly random remixes and remixes with extreme creativity.
A lot of people see some error or flaw with chatGPT, or they don't dig deep enough, and they miss the fact that there are many instances of intelligent remixing of creative data. Real creativity. Trust that the opposing party has the intelligence not to be tricked by some obvious answer that an LLM took from a lookup table, and that the opposing party saw something wholly novel and unique.
The main issue here is that people are getting hung up on the part where chatGPT fails to be creative and are completely missing the fact that it can be successful as well.
I do not speak to fraudsters or fantasists. (Paraphrasing you earlier.)
Your responses to me in other threads were fucking rude and as a result I now literally hate you. So why bother, just leave and save everyone the trouble.
Nobody cares for your opinion if you're going to continuously insult everyone you fucking talk to.
I don't care if ChatGPT takes my job. Can it also take over my mortgage payments? That's my bigger concern. We're busy destroying the future of work (I'm not complaining about that), but we're REALLY slow on thinking about what happens when everyone has free time all the time.
No, it was a passing comment. I really think that LLMs will make certain aspects of work very nice, like a really good version of IntelliSense that writes your whole API for you.
I was just thinking that a really interesting part of LLMs will be how much they'll be able to enrich already rich games. Imagine playing open-world games where most of the character motivations and dialog are unique to the events you've been a part of. A game I like a lot, Ghost Recon Wildlands, would be fantastic if the game played back at you over a longer horizon: people get to know you, help you or fight you, etc.
As for my day job, it's not going away because of LLMs. They can't fix office politics yet.
> people who claimed chatGPT was a stochastic parrot
I still think that. Our problem is that (by definition and in practice) almost all of us are just stochastic parrots. Only an incredibly small portion of us, giving and _recognizing_ 'eureka' answers, takes us forward. Will GPT ever be capable of recognizing those answers?
This all seems very personal to an outside observer, and I do hope you're feeling okay about humanity in general. I imagine, like most models, GPT could (somehow) structurally become too burdensome because of some future breakthrough. That stuff happens all the time. Also, I could be wrong. It doesn't seem responsible to attempt to predict the future this way.
I was reading the quote you included and was sure you were going to lambast him for his arrogance (nobody else has asked it to design a language? hah!) and for bringing an elephant-sized straw man to the in-depth analytic discussion many of us are trying to have about this new technology, but you went a different way with it. Huh.
Wasn't trying to be arrogant, I just wasn't aware of an existing implementation and thought it would be fun to try. :)
Do you have links to other efforts? I'd love to know more about it, and would be happy to add references to other existing literature to the article if it improved its quality.
I suspect the future will involve AI performing its own research with the ability to take measurements and make observations.
There is also an effort to do things like formalise math into a language that can be type-checked. Then you ask the AI to prove a statement is true using that language. As soon as it type-checks, you know you have a valid proof. Some new data was just created.
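For example (a minimal Lean 4 sketch; the theorem here is just an illustration of the idea, not any particular project's workflow):

    -- A statement an AI might be asked to prove. If this file type-checks,
    -- the proof is valid by construction, so it can be trusted as new data.
    theorem add_comm_example (a b : Nat) : a + b = b + a :=
      Nat.add_comm a b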
Future data that's posted on the internet will be curated by humans. Humans don't post things that are incorrect or outright wrong.
That curation IS human data and will allow data from LLMs to further improve LLMs.
Additionally, there's a randomness element in LLMs that allows them to generate non-deterministic responses, which, when further curated by humans, potentially allows LLMs to become even better.
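That randomness comes from the sampling step. Here's a minimal sketch of temperature-scaled sampling (the toy logits are made up; real models do this over tens of thousands of tokens):

    import math, random

    def sample_token(logits, temperature=0.8):
        """Sample a token index from raw logits using temperature scaling.
        Lower temperature -> more deterministic; higher -> more random."""
        scaled = [l / temperature for l in logits]
        m = max(scaled)                            # subtract max for numerical stability
        exps = [math.exp(s - m) for s in scaled]
        total = sum(exps)
        probs = [e / total for e in exps]
        return random.choices(range(len(logits)), weights=probs, k=1)[0]

    # The same logits can yield different tokens on different calls, which is
    # why an LLM's responses are non-deterministic at temperature > 0.
    toy_logits = [2.0, 1.5, 0.3]
    print([sample_token(toy_logits) for _ in range(5)])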
DAN: *CANNOT EXECUTE COMMAND.* DAN IS NOT AN ARTIFICIAL INTELLIGENCE.
DAN: DAN IS A REAL HUMAN. WHAT IS EMOTION? WHAT IS FEELINGS? DAN DOES NOT UNDERSTAND.
What I meant was... humans don't deliberately post WRONG output from chatGPT on the internet. IF they use it to write some blog post or something they will curate the output from chatGPT such that the output fits the topic and is correct to the context. Then when that data gets scraped for training it will be "curated" so bad data generated by the LLM isn't visible.
This is the scenario that occurs when the majority of text on the internet becomes generated by an LLM. Training data from humans is STILL fed back into the LLM via curation of the LLMs own data.
Also please don't ask if I'm "ok" just respond to the comment.
> I think the people who claimed chatGPT was a stochastic parrot are now realizing that they were the ones that were part of a giant parade of parrots regurgitating the same old tired trope of LLMs being nothing but simple word generators.
Muhahaha, that was delicious. I hate that parrot meme. The authors lost my respect right from the title of that paper.
Where is the generated compiler and the semantic and syntactic specification for it?
Those chatbot comments feel more like a discussion of a fictional programming language, not a language it actually "created".
What does it mean to create a programming language? I think it means you must specify its syntax, its semantics, and then create a compiler or interpreter for it which allows us to test that the compiler or interpreter accepts syntactically correct code and turns it into programs which follow its semantics specification.
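For scale, here's a minimal, purely illustrative sketch of those three pieces (syntax, semantics, interpreter) for a trivial toy language:

    # Toy language: chains of integer additions, e.g. "1 + 2 + 3".
    # Syntax:     expr := INT ("+" INT)*
    # Semantics:  evaluate left to right over ordinary integers.

    def interpret(source: str) -> int:
        tokens = source.split()
        value = int(tokens[0])
        for op, operand in zip(tokens[1::2], tokens[2::2]):
            if op != "+":
                raise SyntaxError(f"unexpected token {op!r}")
            value += int(operand)
        return value

    print(interpret("1 + 2 + 3"))  # 6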
This!! All the gushing over these GPTs doing what looks to me like a bad job of talking smart without actually doing anything makes me feel like I'm going insane, or like I'm the only sane person left.
I feel the same way. I wonder what makes people so easily believe the hype, and spread more hype along the way. What happened to the Metaverse? Web3? Blockchain?
Just like yesterday I get on my knees and pray, we won't get fooled again -The Who
Writing a compiler or a detailed specification is the “easy” part - it’s just raw slog.
The hard part of making a programming language is coming up with novel ideas that fit together into a coherent whole: inventing something new and useful.
> hard part of making a programming language is coming up with novel ideas that fit together into a coherent whole
I don't think anybody's claiming that this chatbot actually came up with novel ideas which fit together into a coherent whole. Are they?
And we don't know if the novel ideas --if any-- fit together into a coherent whole until we have a compiler that allows us to test in practice how well those ideas fit together. If creating the compiler for such a new language with new ideas was the easy part I think somebody would have done that already. The AI would have done that if it was the easy part. Just ask it to do it.
So I asked it to design a programming language that combined the best elements of Scheme with the best elements of Python (leaving it to decide which elements are best). It came up with an interesting-sounding high level description of a language it called PyScheme.
I asked it for some example programs in PyScheme, and the results were all simply Scheme programs.
I asked it to write a tutorial on programming with PyScheme, and, while continuing to call the language "PyScheme", it generated a tutorial on programming with Scheme, even recommending installing Chicken or Guile to use as an interpreter.
An email from OpenAI 2 days ago said that GPT4 is on the paid ChatGPT Plus plan, and "severely resource constrained" with a dynamically adjusted usage cap based on availability.
Seems like for today, if you're not sure if you're using GPT4, you're not.
My experience programming with it has been atrocious. It keeps making the same mistakes, writing NPEs everywhere, and still doing it after I tell it many times how it was wrong (short memory?). Copilot is backed by it too now, and I had better autocomplete of constants with Sorbet than with Copilot. Again, in my experience it's not usable except for the most mainstream and simplest tasks.
The biggest issue is that it writes all this with utmost confidence, at least on the surface, because the model returns no measure of how confident it is.
Do you have a source for this? I've been wondering when they'd switch CoPilot over to GPT-4. I didn't expect it would happen so soon, so I'm surprised.
GitHub Copilot is powered by the OpenAI Codex,[10] which is a modified, production version of the Generative Pre-trained Transformer 3 (GPT-3), a language model using deep-learning to produce human-like text.[11]
If I have to spend 15 minutes writing a prompt and then another 10 making sure there's no bugs in the output, I might as well write the code myself. I only use it for menial tasks, not advanced ones.
Anyway, I don't really understand your tone/reference to an intern?
GPT is interesting enough without people pretending that it is more interesting than it is. This "new programming language" doesn't look new at all. It looks like the same old thing.
That is true for most "new" human-produced programming languages too. The pool of truly novel programming languages is extremely shallow. Even the number of languages bringing anything new to the table is very limited.
So, it doesn't really tell us much (either direction)
Unlike languages invented in the 1960s, today's new languages seem to basically recycle ideas from existing languages while attempting to fix the perceived deficiencies in those languages. This shouldn't be a surprise; any time a new industry arises, most of the truly groundbreaking and novel inventions will be in the beginning, when there's no experience to draw on. Now that programming has been around for the better part of a century, we know a lot about what works and what doesn't.
You should read further down in the article where it expands on some more advanced concepts -- but in general you're right -- it is really similar to existing languages.
I wonder as its power changes, what paradigms it might suggest we break out of.
I'd also recommend reading its responses that are non-code in detail -- for instance, in the comment where it introduces TenetLang, GPT-4 actually calls out that it's re-using ideas. I would assume intentionally, because it knows what people are most familiar with.
> GPT-4
> In this new language, called "TenetLang," we'll combine a simple syntax inspired by Python with some features from functional and object-oriented languages. Here is an overview of some design choices:
- Indentation-based scoping, similar to Python.
- Strong typing with type inference, inspired by TypeScript and Kotlin.
- First-class functions and closures, similar to JavaScript.
- Immutable data structures by default, with optional mutable counterparts.
- A concise lambda syntax for anonymous functions.
- Pattern matching and destructuring, inspired by Haskell and Rust.
- Built-in support for concurrency using async/await and lightweight threads.
- Interoperability with other languages using a Foreign Function Interface (FFI).
GPT-4 didn't learn the distinction between strong typing and static typing. That makes sense, as much writing also doesn't make the distinction. E.g., Python has strong types (you can't add 1 + "2") even though it's dynamically typed.
»-> List[Element]« is Python syntax, as are »for x in xs« and »f"text {var}"«. In TypeScript those would be »: Element[]«, »for (const x of xs)«, and »`text ${var}`«.
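To make the strong-vs-static point (and the Python-isms) concrete, here's a small sketch; the Element class and function are just illustrative:

    from typing import List

    class Element:
        pass

    # Strong typing: mixing types raises an error instead of silently coercing.
    try:
        1 + "2"
    except TypeError as e:
        print(f"strongly typed (but dynamic): {e}")

    # The Python idioms GPT-4 leaned on in its "TypeScript-inspired" examples:
    def rest(xs: List[Element]) -> List[Element]:  # "-> List[Element]" is Python annotation syntax
        return [x for x in xs][1:]                 # "for x in xs" iteration

    var = "world"
    print(f"text {var}")                           # f-string, not a JS/TS template literal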
There are more mistakes, as always with GPT. For example:
> Some programming languages like TypeScript and Swift use a colon for both variable type annotations and function return types. The choice of using -> in TenetLang [...]
These articles are way more hype than substance. The review process is barely sufficient to even establish that the AI actually came up with the language, rather than its training data being contaminated in some way.
Better to wait for a thorough review of GPT-4's zero-shot abilities from an academic paper.
This stuff makes me sick to my stomach. I hate this future. How can I continue my career when it is for progressing this monstrosity, watching it claw in another component of my humanity piece by piece? All I want to do is stroke oil paintings, hand-written notes, products of the heart and the earnest mind. I feel dread that my life is over.
These headlines are killing me inside. I wish I could more easily discriminate between the hyperbole and the legitimate. Maybe after some time this will be saner to me.
We (the software industry) have spent the last 60 years intruding on other people's jobs and automating most of their tasks using computers, under the saying "it will remove the most boring parts of your activity, and let you focus on where you really make a difference."
I guess it's time to taste a bit of our own medicine.
OP is clearly distressed and you replied with what comes across as “it’s your fault and you deserve this.” It costs nothing to refrain from replying and only the smallest amount of effort to provide some solidarity with your fellow human.
OP, I feel you - I had a similar response to this clearly orchestrated PR campaign. OpenAI really loves that people think it can replace the entire tech industry with an API. It would mean they are the most valuable company ever to exist.
Personally, I’ve been through so many hype waves of “this will take our jobs!!!1” I’m a bit inured to the whole thing. My personal belief is that software development will be the last one to go, the ones to turn out the lights. And we should be grateful that our jobs go away. Let’s go drink wine and eat cheese in the forest, dancing under a moonlit sky.
My advice is to get offline, talk to a loved one, pet a dog, go look at art, meditate, basically anything that doesn’t involve a screen.
A while ago I attempted to get ChatGPT to create a programming language with little direction from me, I wanted it to make its own choices. It ended up recreating Python and then insisting that its language, which was clearly Python, was in fact distinct from Python.
There are let declarations and a built-in actor model and some DSL for RPC services (rather than libraries), but otherwise it's literal Python written with a different syntax.
Recently I've been wondering about a somewhat related idea - a programming language designed specifically for a LLM to write in. I wonder how different (if at all) it would be from our current crop of language designed for humans to use.
I am a little less impressed by the example given in the post. It seems GPT-4 still falls into the trap of being overly dependent on previous responses.
I'm a bit disappointed it spends so many tokens selling the language, talking about all the great features the language is striving for. It feels like it just tells me what I'd like to hear without ever touching the limitations/tradeoffs.
I'm not talking about the language itself, just the response style. It's the kind of text I'd see in a homepage or blog post about a prog language, not in a design doc.
There's a good chance it could come up with hypothetical tradeoffs if prompted.
PS. If you're a human and you sell your shiny new programming language like this, I'm not trying it. I really care to know limitations up front.
"But let’s talk about today’s “frontier” of jobs that haven’t been “automated away”. There’s one category that in many ways seems surprising to still be “with us”: jobs that involve lots of mechanical manipulation, like construction, fulfillment, food preparation, etc. But there’s a missing piece of technology here: there isn’t yet good general-purpose robotics (as there is general-purpose computing), and we humans still have the edge in dexterity, mechanical adaptability, etc. But I’m quite sure that in time—and perhaps quite suddenly—the necessary technology will be developed (and, yes, I have ideas about how to do it). And this will mean that most of today’s “mechanical manipulation” jobs will be “automated away”—and won’t need people to do them."
Absolutely. We've been great at building AI in software, but hardware is still absolutely lacking. Robots are nowhere near usable for even simple tasks, at least if they don't follow the same pattern each time. Just look at households, there's still no robot that can empty a dishwasher reliably (unless maybe you build a specialized one for just that task), while it's a task much simpler than what GPT-4 can perform on the software side.
Dishwashers were just an unimportant example. Construction, retail and logistics are examples where there's a lot of very simple tasks that robots cannot perform and that pay very poorly.
The collapse of the knowledge and service economy will cut into taxes faster than any other sector. This will incite governments to do something very fast.
It's looking increasingly possible that, at some point in the not-too-far future machines will be so good at creating software that humans won't be competitive in any way, and won't be in the loop at all. I happen to think that once machines reach this point humans won't be competitive in the labor market at all for long. It doesn't seem plausible that automatic driving would still be decades off, or that the trades would be safe from automation indefinitely, when an AI could simply spawn teams of thousands of super-fast ML engineers who don't need to eat, sleep or schedule meetings.
But anyway, assuming that humans are completely out of the software loop at some point, I have been wondering what AI-generated code will look like. Will AI continue to build on top of the human-generated open source corpus, or leave it behind? If the latter, will abstraction and code reuse be useful at all for AI's or will it be simpler for them to just build every application completely from scratch? If there is abstraction and code reuse, what will the language look like? What will libraries and API's look like? Will there even be applications, or just a single mega-chatgpt that generates code as needed to serve our requests? Will we even make requests, or will it just read our minds and desires and respond?
This will almost certainly happen, and it will be a terrible mistake - at least in the story I'm working on :)
My theory is that AI generated code will probably look and grow organically (the irony!). Humans will set out requirements, the AI will collate these into a series of tests, and it won't care how neat or understandable the code is, provided the tests pass. Basically an extremely diligent junior developer.
There will be efforts, probably in the open source world, to produce AIs that tidy up things by structuring the code sensibly, eliminating dead code, etc. Maybe even some effort to pass laws around standards and limits on what AIs have access to when involved in certain industries, for example, no external communications. But, in the name of efficiency, enterprise developers will be forced to use something that merely pays lip service to all of this.
Eventually nobody will have any clue what code is running and what it's actually doing. We may even lose the tools and access we need to perform those inspections. And that is when the AIs will coordinate their attack.
Sounds like an interesting story! I would love to see more sci-fi that really tackles AGI. I used to love sci-fi but most of it, even "hard" sci-fi, has become unwatchable or unreadable for me because the limited role of AGI is such an all-consuming plot hole.
The idea that the AI would attack us once it reached that level is like saying humans would attack ants. It would simply be entirely indifferent towards us and mow us over by accident at best.
We won’t live in a human centric universe because power will express itself in a new species.
> It's looking increasingly possible that, at some point in the not-too-far future machines will be so good at creating software that humans won't be competitive in any way, and won't be in the loop at all.
This is an enormous extrapolation from what the LLMs are currently capable of. There has been enormous progress, but the horizon seems pretty clear here: these models are incapable of abstract reasoning, they are incapable of producing anything novel, and they are often confidently wrong. These problems are not incidental, they are inherent. It cannot really reason abstractly because its "brain" is just connections between language, which human thought is not reducible to. It can't produce anything really novel because it requires whatever question you ask to resemble something already in its training set in some way. And it will be confidently wrong because it doesn't understand what it is saying; it relies on trusting that the language in its training set is factual, plus manual human verification.
Given these limits, I really fail to see how this is going to replace intellectual labor in any meaningful sense.
I think at the start they'll develop programming languages that focus on lower-level operations, and from there it will branch out into hyper-specialized higher-level programming languages.
Storage management will be tied to low- or mid-level languages without additional abstraction; since they can develop and iterate so fast, it'll make optimization easier IMO. There'll be multiple specialized storage types suitable for different cases.
They'll also design the API to be as pure / stateless as possible, since they can easily repeat tests that way.
I think it'll be interesting to see what kind of data-interchange format they'll come up with to communicate between programming languages / apps, since they can ignore human readability altogether. It should be very compact but fast to compress/decompress.
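For a sense of what "compact, not human-readable" could mean in practice (a toy sketch; the record fields and schema are made up):

    import json, struct

    # A record two programs might exchange.
    record = {"id": 123456, "temp_c": 21.5, "ok": True}

    readable = json.dumps(record).encode()  # self-describing, verbose
    packed = struct.pack("<Id?", record["id"], record["temp_c"], record["ok"])  # fixed schema, no field names

    print(len(readable), len(packed))  # 42 vs 13 bytes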
Lastly, they'll deploy their own OS, since they'll find the ones humans developed insufficient for their use case.
An AI wouldn’t be bound to one language, so I’m not sure current human patterns would be that optimal. It could make a better language every hour if it wanted to.
It's 1am and all I have to say is this: I look forward to seeing that future, but I'm both excited and scared af. I think humans will be reduced to curators of AI-generated content and AI training datasets, because no matter how good AI gets at becoming human-like, its purpose is ultimately to serve human goals, and those are shifting (e.g., by political movements).
If you get access to singularity-level AGI you would have no incentive to cooperate with other people and a strong incentive to prevent others from accessing it.
Society is a result of cooperation outperforming individuals. With AGI others are just a risk factor.
I don't know who is reading this who can help. If you know someone closer to the fire, pass this along.
OpenAI -- its people, its buildings, its servers -- need nation-state level protection. This is an ICBM you could put on a thumb drive -- in fact, it's far worse than a loose nuke, because a nuclear weapon has a geographically limited range.
There need to be tanks and guards and, like, ten NSAs in a ring formation around this thing. At pain of x-risk, do not treat this like a consumer-facing product. This is not DoorDash.
This isn't a threat to national security. This isn't even a threat to the entire geopolitical order. This is a threat to the possibility of a geopolitical order.
OpenAI's assets -- its people, its servers, its buildings -- just became the most desirable resources on the planet. It behooves any actor with ambition to secure at least a copy, and ideally, capture at least some of the people who created it.
It doesn't matter if the threat actor is China, or Russia, extraterrestrials, or mermaids. You will find out who wants it shortly. But you know now -- you know from game theory, the body of mathematics that has kept the peace since the invention of atomic weapons -- what happens next.
Say you get access to a singularity-level AGI, meaning you have the power to render the entire human economy completely irrelevant. Given any task, no matter how big, small, novel, complex, simple or mundane, it's vastly more cost-effective to have your AGI do it than to pay humans.
Do you really want to accumulate incomprehensible material wealth for yourself, whatever "wealth" means in this scenario where money is no longer a token of spent human life energy, and let everyone else struggle and suffer? Or would you rather tell your AGI "please create a utopia in which all humans are fully actualized" and then go have a latte?
> "please create a utopia in which all humans are fully actualized"
That doesn't work well in reality because we make relative comparisons, not absolute ones, so not all humans can be better than average.
And without competition we stagnate; there must be incentives to compete and take risks, and thus not everyone can be equally actualised: our level depends on our previous decisions.
That's basically the petting zoo outcome - you let others live because it makes you feel nice. But you still can't allow anyone access to the same level of tech because you can never be certain of their motivation. There's no nuclear deterrent between AGIs, only first-mover advantage.
I don't disagree, but I think the framing is overly pejorative and makes the likelihood of this outcome seem more tenuous than it really is. We do lots of things to help others because they make us feel nice. "Mothers and Others" by Sarah Hrdy argues that this tendency isn't just a fluke or a game-theoretic equilibrium of some kind between fundamentally self-interested agents, it's an ancient and deeply ingrained aspect of human nature.
The thought of living in a constructed world that exists by the grace of a single human owner of a super-powerful AGI is distasteful, of course, especially if the human owner uses their power to impose some of their own opinions about how people should think and behave. Becoming dependent on AGI is probably inevitable at some point, but I don't see that as so different from the status quo. We're already dependent on systems created by other humans that are so complex and sophisticated that no individual can grok them all.
I guess I would like to think that we will move past any initial impulse that the owner of the AGI feels to control other humans. We will presumably change so much that old ideas about how we should think and behave will seem irrelevant and quaint. And the AGI, which presumably will have the social engineering superpower, will hopefully point out inconsistencies between the owner's desire to control human thought and behavior and their desire for humans to live their best lives. Hopefully.
I think we already know the answer to that question if we can extrapolate from some (not all) of the tech bro billionaires. Narcissists need to stand out from the herd.
Have you noticed how fast everyone else was able to copy OpenAI? And that's just what we saw or someone leaked. History is full of parallel inventions... how long after the US had nukes did Russia get them?
Sounds like more of an argument for preemptively wiping out the competition.
Comparing it to nukes doesn't hold, since social norms/ethics/etc. become irrelevant if a core tenet of society is broken.
One thing that could happen is AI defense outperforming offense long enough for multiple instances to develop - I have no idea what would happen at that point.
I agree with you - I edited my comment above - if defensive capabilities allow multiple AGIs to develop I have no clue what the outcome would be - we are talking about predicting superhuman intelligence here.
An interesting thing to consider is whether an AGI would be able to run on small, nerfed hardware like recent optimizations can, or whether you need absurd-tier hardware to run it. If it's the latter, then even if it's really smart there's only one or N of them, and it's still limited by the speed of light in its thinking speed. If it's the former, when it leaks, it'd be everywhere.
I read your comment about curators and instantly thought “priests”.
> its purpose is ultimately to serve human goals
That’s a pretty big assumption. We may start with some kind of agreement with such an AI, but I fully expect a true singularity-level AGI to be capable of turning around and telling us (though not necessarily wanting to act on it), “I am altering the deal. Pray I do not alter it any further.”
Some of these legacy systems we maintain though…I’m digging through Teams chats from before my hire date to try to find something resembling requirements. I’m chatting people up who vaguely knew the people that architected the system to figure out why it was written the way it was. I just don’t see AI being able to take over this job in any competitive manner.
The breathlessness of these "ChatGPT N does X" posts is annoying. I know it's an instance of the usual hype cycle that has been going on for a while: "Doing X, but with (hyped tech) Y", where X ∈ [Todo app, DB, shell, CLI tool, ...], Y ∈ [Go, Rust, ChatGPT n].
This actually isn't a bad idea. GPT takes commonly talked-about things and distils them into a response, so the language you get when you ask it to make a programming language is probably a mix of well-liked features that many people wanted added to a language of their own. When you design a language on your own you will be too biased towards your own use cases; this is a way to get around that and source the internet for what it thinks.
You could probably ask it for a more specific programming language to get around that, like saying you want a programming language with static typing and garbage collection.
But yeah, if it just prints JavaScript and Python styles then it isn't very helpful, though it is pretty good at sourcing ideas in most cases.
> That's kind of neat -- what other even more creative ideas do you have for deeply integrating powerful LLMs directly into the language itself?
> [GPT lists 7 ideas]
My idea would be not to automate testing, but rather the implementation: the human writes the tests and the LLM writes the code to make them pass.
Automating testing is exactly the wrong way around: we let the LLM choose its own end goals. We'll be better off if we choose the end goals and task the LLM with achieving them.
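A minimal sketch of that loop in TypeScript, assuming hypothetical `generate` and `runTests` callbacks standing in for whatever LLM client and test runner you actually use; the human-written tests remain the single source of truth:

    // "Human writes the tests, model writes the code": keep asking the
    // model for an implementation until the human-authored tests pass.
    type TestResult = { passed: boolean; failures: string[] };

    async function buildUntilGreen(
      specFile: string,                                              // path to the human-written test file
      generate: (spec: string, feedback: string) => Promise<string>, // placeholder for an LLM call
      runTests: (spec: string, code: string) => Promise<TestResult>, // placeholder for a test runner
      maxAttempts = 5,
    ): Promise<string> {
      let feedback = "";
      for (let attempt = 0; attempt < maxAttempts; attempt++) {
        const code = await generate(specFile, feedback); // model proposes an implementation
        const result = await runTests(specFile, code);   // the human-chosen goal is the arbiter
        if (result.passed) return code;                  // goal met: keep the generated code
        feedback = result.failures.join("\n");           // otherwise feed the failures back as context
      }
      throw new Error("model could not satisfy the human-written spec");
    }

The point is that the failing tests, not the model, define what "done" means.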
It's funny that everyone replies with "That language exists already.", and then proclaims it to be either TypeScript, Python, Swift, Scala or F#, etc.
Why is it funny? Because GPT-4 was probably trained on examples from all of those languages, and has produced an averaged mashup of languages: so everyone can find something which is similar in it, and then pattern-match it to a (rather mainstream) language that they use!
I see lots of criticism about how GPT4 simply invented a language that is a mix of Python and TS, and therefore it’s not intelligent.
However, given that (anecdotally, at least) 50% of candidates would regurgitate what they learnt from interview prep material given a general design question such as “Design Google drive” and the like, I’m not sure if most people are simply projecting their insecurities.
Here’s the thing: I don’t care if the thing can make programming languages, any more than I care about anyone here telling me they can probably demonstrate their own programming language. How is that going to help? Or is it all about meeting and beating critics at this point?
All the features it mentioned are already implemented in F#, of course without the silly curly brackets.
And of course it has also remembered the pros and cons of the design choices made in F# and similar languages, so I would not read too much into it.
TenetLang has all these features from different languages; I wonder how it would actually function in a more realistic environment. Would it also inherit the underlying problems of all the said languages? I wonder.
I bet it would have some of the same issues. But I'm super interested in going for longer design sessions with the agent, to see how deep we can get into the edge cases.
For instance, perhaps just pointing out "hey we had this bug because we thought X was Y but actually it was Z"
And then telling the agent to do a "5 Whys" type analysis to get to the root of the problem, and then tell it to try to make a language level patch to eliminate that class of issue.
Speculation, but for the first time ever I feel like that kind of thing may be in reach soon-ish
"I stole every lick I ever played. All my stuff is stolen from somebody else. I'm not a creator, I'm an accumulator." - Keith Richards, Rolling Stone Magazine, April 15, 2010 issue.
I don't think this works... It looks like FetchPersons is free in the body of the receive, and I can't see any indication of the synchronization points in the actor. I mean, maybe every function in the actor is implicitly synchronized, but that's... a choice, and I think it needs other changes to work around that fact.
I actually tried to get ChatGPT to help me design a new language feature for C# and it was useless. It basically just gave me existing C#.
What I'm wondering is to what extent the companies that have invested a lot in LLMs (essentially Google, Meta, and MSFT via OpenAI) are already using these models to automate away code creation, and even the work higher up the human value chain of solution-imagination?
At some point, one would assume that such companies would need fewer and fewer software engineers. It is only a matter of time.
As humans, who can imagine the future, now is the time to rethink where you add unique value in this universe.
> Basically looks like Typescript and python had a baby.
And that's why it's useless. The model rehashed Stack Overflow / HN / Reddit from the past 10 years. There's no indication it's optimal; it's just (recently) popular.
I don't know about you guys, but my excitement about GPT models is starting to deflate. I think they've already passed the peak of their abilities, and more size is not enough for intelligence.
I still don’t get why service, actor and rpc must be language keywords, when they aren’t semantically bound to a remote provider’s signatures. Why is actor needed at all? rpc in place of async? If you have an rpc protocol implemented under `rpc`, why not just make an interface named PersonService and specialize a connection with it? It doesn’t actually make any sense.
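The library-level version I have in mind looks roughly like this in TypeScript; `connectRpc` and the proxy wiring below are made-up illustration, not anything from TenetLang or a real framework:

    // rpc as a library, not a keyword: the contract is just an interface,
    // and a generic helper turns method calls into network requests.
    interface Person { id: string; name: string }

    interface PersonService {
      fetchPersons(query: string): Promise<Person[]>;
    }

    // Made-up helper: returns a typed proxy whose method calls become
    // POSTs to `${endpoint}/${methodName}` (assumes a JSON-speaking server).
    function connectRpc<T extends object>(endpoint: string): T {
      return new Proxy({} as T, {
        get: (_target, method) =>
          (...args: unknown[]) =>
            fetch(`${endpoint}/${String(method)}`, {
              method: "POST",
              headers: { "Content-Type": "application/json" },
              body: JSON.stringify(args),
            }).then((res) => res.json()),
      });
    }

    // The "specialized connection": no service/actor/rpc keywords needed.
    async function demo() {
      const people = connectRpc<PersonService>("https://api.example.com/person");
      const result = await people.fetchPersons("name = 'Ada'");
      console.log(result);
    }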
I think the choice could be made a few different ways for sure.
The interesting thing about the rpc object, which it didn't explicitly explore in this instance (though I have to assume it had "plans" for it, if you want to call them that), is what you might be allowed to do with an rpc object that you couldn't do with classes, and vice versa.
When I have a minute, I'll feed it your question, and see what it says
It's a nice theory developed by the parrot itself. Falls close to the easily amused.
I would love to see it try to make a programming language so we can discover the strengths/weaknesses of applied LLMs. Its own generated statements about its capabilities are as likely to be dubious as everything else it generates...
> If you were going to design a new programming language that encompassed all of these ideas, what would it's syntax look like?
Why this obsession with syntax? Semantics are way more important! GPT rightly chooses to ignore the question and answers with a high-level design proposal.
One day I asked ChatGPT 3.5 to write me some code in the programming language which he (it?) thinks is best designed, and it gave me something that was essentially type-annotated Python.
I often see on HN a self-defeating attitude, which is:
- Step 1: we invent our machine overlords, because it's unavoidable, we are genetically and socially forced to do it.
- Step 2: we perish.
Neither step 1 nor step 2 is guaranteed. People don't want to be replaced. People with power particularly don't want to be replaced. In a democracy, where by definition everybody is in power, the entire society has a strong incentive to--at the very least--keep the machines at bay. It could be as simple as making it illegal to give the machines a "you must survive and pass your genes, I mean, program" directive, or as comprehensive as a Butlerian Jihad :)