I'm curious what exactly you ask here. I consider myself a decent engineer (for practical purposes), but without a CS degree, and I likely would not have passed that question.
I know compilers can do some crazy optimizations, but I wouldn't have guessed they'd transform something from O(n) to O(1). Having said that, I still don't feel this has much relevance to my actual job for the most part. Such performance knowledge seems so abstracted away from actual programming by database systems, or managed offerings like Spark and Snowflake, that unless you intend to work on those systems this knowledge isn't that useful (being aware these optimizations happen can be, though, for sure).
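As I understand the article, the kind of transformation in question is roughly this sort of thing (my sketch, not necessarily the article's exact code):

    // Sum of 0..n-1 written as an O(n) loop.
    #include <cstdint>

    uint64_t sum_to(uint64_t n) {
        uint64_t total = 0;
        for (uint64_t i = 0; i < n; ++i)
            total += i;
        return total;
    }

Apparently Clang at -O2 recognizes the induction variable here and emits a closed form equivalent to n * (n - 1) / 2, so there is no loop in the output at all and the function is effectively O(1). Whether a particular compiler and flag combination actually does this seems to be exactly the sort of thing you confirm by pasting it into godbolt.org.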
He thinks it makes him look clever, or more likely he subtly wants people to think "wow, this guy thinks something is obvious when Matt Godbolt found it surprising".
This kind of question is entirely useless in an interview. It's just a random bit of trivia that a potential hire either happens to have come across or happens to remember from math class.
I guess what's surprising here is that compilers are able to perform those optimizations systematically on arbitrary code, not the optimizations themselves, which should be obvious to a human.
Have you considered that maybe Matt isn’t all that surprised by this optimization, but he is excited about how cool it is, and he wants readers of all backgrounds to also be excited about how cool it is, and is just feigning surprise so that he can share a sense of excitement with his audience?
Whether they get the question exactly right and can pinpoint the specific compiler passes or algebraic properties responsible for reductions like this is totally irrelevant and not what you’re actually looking for or asking about. It’s a very good jumping-off point for a conversation about optimization, and for testing whether they’re the type of developer who has ever looked at the assembly produced in their hot path or not.
Anyone who dumbly suggests that loops in source code will always result in loops in assembly doesn’t have a clue. Anyone who throws their hands up and says, “I have no idea, but I wonder if there’s some loop invariant or algebraic trick that can be used to optimize this, let’s think about it out loud for a bit” has taken a compiler class and gets full marks. Anyone who says, “I dunno, let’s see what godbolt does and look through the llvm-opt pane” gets an explicit, “hire this one” in the feedback to the hiring manager.
It’s less about what they know and more about if they can find out.
So in other words, it isn't "basic and essential optimizations" that you would expect even a junior engineer to know (as your comment implies), but a mechanism to trigger a conversation to see how they think about problems. In fact, it sounds like something you wouldn't expect them to know.
I didn’t write the GP comment. I wouldn’t call this basic and essential, but I would say that compilers have been doing similar loop simplifications for quite some time. I’d expect any mid to senior developer with C/C++ on their resume to at least consider the possibility that the compiler can entirely optimize away a loop.
> In fact, it sounds like something you wouldn't expect them to know.
I’d go a step further, I don’t think anyone, no matter how experienced they are, can confidently claim that optimized assembly will or won’t be produced for a given loop. That’s why the best answer above is, “I dunno”. If performance really matters, you have to investigate and confirm that you’re getting good code. You can have an intuition for what you think might happen, and that’s a useful skill to have on its own, but it’s totally useless if you don’t also know how to confirm your suspicions.
My question is in the context of doing those optimizations yourself, understanding what can be done to make the code more efficient and how to code it up, not the compiler engineering to make that happen.
Yikes, gross. That’s like an option of last resort IMO. I’d rather maintain the clean loop-based code unless I had evidence that the compiler was doing the wrong thing and it was in my critical path.
The compiler can only perform optimizations that don't change observable behaviour.
For example, it can only parallelize code that is inherently parallelizable to begin with, and unless you design your algorithm with that in mind, it's unlikely to be.
My belief is that it's better to be explicit, be it with low-level or high-level abstractions.
My interview aims to assess whether the candidate understands that the dependency of each iteration on the previous one prevents effective utilization of a superscalar processor, knows the ways to overcome that, and knows whether the compiler can apply that optimization automatically, and if so, in which cases it absolutely cannot and why.
I generally focus more on the sum of arbitrary data, but I used to also ask about a formulaic sum (linear to constant time) as an example of something a compiler is unlikely to do.
My thinking is that I expect good engineers to be able to do those optimizations themselves rather than rely on compilers.
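Concretely, for the arbitrary-data case, the kind of rewrite I expect candidates to be able to reason about looks something like this (a sketch; the unroll factor and the choice of double are just for illustration):

    #include <cstddef>

    // Naive sum: every addition depends on the previous one, so a superscalar,
    // out-of-order core is stuck waiting on a single dependency chain.
    double sum_naive(const double* a, std::size_t n) {
        double s = 0.0;
        for (std::size_t i = 0; i < n; ++i)
            s += a[i];
        return s;
    }

    // Four independent accumulators give the core four chains to keep in
    // flight at once, combined at the end. For floating point this changes the
    // order of the additions, which is precisely why the compiler cannot do it
    // automatically without something like -ffast-math.
    double sum_unrolled(const double* a, std::size_t n) {
        double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
        std::size_t i = 0;
        for (; i + 4 <= n; i += 4) {
            s0 += a[i];
            s1 += a[i + 1];
            s2 += a[i + 2];
            s3 += a[i + 3];
        }
        for (; i < n; ++i)  // leftover elements
            s0 += a[i];
        return (s0 + s1) + (s2 + s3);
    }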
Cancer isn't caused by proteins in the way you might think. It's definitely not infectious at the protein level. You could ask whether this disruption spreads the cancer cells themselves, and that would be a fair question. But those cancer cells were already in your body and were likely trying to migrate to other sites anyway.
The success of surgery to remove solid tumors usually hinges on whether there are "clean margins," meaning they were able to remove all the bad tissue and a little good surrounding tissue just to be sure. It's likely that the same principle applies with this new procedure: if you blast the whole thing and trust the body to clean up the mess, hopefully there won't be anything left to worry about.
Isn't it better to put it in an agent loop, with the structured output JSON just specified as a tool? The function call can then just return a summary of the parsed input. We can add a validation step in the system prompt asking the LLM to verify it has provided the inputs correctly. This will allow the LLM itself to self-reflect and correct if needed.
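Roughly the loop shape I have in mind, with the model call stubbed out since the exact client API doesn't matter here (C++ just for concreteness; every name below is hypothetical):

    #include <iostream>
    #include <optional>
    #include <string>
    #include <vector>

    struct Message  { std::string role, content; };
    struct ToolCall { std::string name, arguments_json; };

    // Stand-in for the real model call (OpenAI tool calling or similar);
    // here it always "calls" the structured-output tool once.
    std::optional<ToolCall> call_model(const std::vector<Message>& history) {
        (void)history;
        return ToolCall{"submit_parsed_input", "{\"summary\": \"...\"}"};
    }

    // Hypothetical helpers: check the arguments against the JSON schema, and
    // run the "tool", which just parses the input and returns a short summary.
    bool schema_valid(const std::string& args) { return !args.empty(); }
    std::string run_tool(const std::string& args) { return "parsed: " + args; }

    int main() {
        std::vector<Message> history = {
            {"system", "Emit your result via the submit_parsed_input tool. "
                       "Verify the arguments match the schema before calling it."},
            {"user", "<document to parse>"}};

        for (int turn = 0; turn < 5; ++turn) {   // bound the loop
            auto call = call_model(history);
            if (!call) break;                    // model finished without a tool call
            if (!schema_valid(call->arguments_json)) {
                // Feed the failure back so the model can self-correct next turn.
                history.push_back({"tool", "Arguments failed validation, please retry."});
                continue;
            }
            history.push_back({"tool", run_tool(call->arguments_json)});
            break;                               // accepted a valid structured result
        }
        std::cout << history.back().content << "\n";
    }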
Until this administration there was no mandate to move manufacturing home. And importantly, why would any company forgo significant profit to match an ideological framework, unless the ideology is what they sell or market?
I have built several agents based on OpenAI now that are running real-life business tasks. OpenAI's tool-calling integration still beats everyone else's (in fact it did from the very beginning), which is what actually matters in real-world business applications. And even if some small group of people prefers Anthropic for very specific tasks, the numbers are simply unfathomable. Anthropic's business strategy has zero chance of working long-term.
In writing code, from what I've seen, Anthropic's models are still the most widely used. I would venture that over 50% of vibe-coded apps, garbage though they are, are written by Claude Code. And they capture the most market share in real coding shops as well, from what I've seen.
What data are you basing your assumption on? OpenRouter? That itself is only used by a tiny fraction of people. According to the latest available numbers, OpenAI has ~800x more monthly active users than OpenRouter. So even if only 0.5% of them use it for code, it will dwarf everything that Anthropic's models produce.
As someone who switches between most of the CLIs to compare them: Amp is still on top; it costs more, but has the best results. The librarian and oracle make it leagues ahead of the competition.
I don't understand how people use these tools without a subscription. Unless you are using it very infrequently paying per token gets costly very fast.
Could you please share a little about why it's noticeably better than Claude Code on a subscription (or five? I mean, sometimes you can brute-force a solution with agents)?
I think it’s great but also pricey. Amp, like Claude Code, feels like a product used by the people who build it, and oddly enough that does not seem to be the case for most coding agents out there.
https://www.askmodu.com/rankings independently aggregates traffic from a variety of agents, and Amp consistently has the highest success rate for both small and large tasks.
But my first thought looking at this is that the numbers are probably skewed by the distribution of user skill levels and by which types of users choose which tool.
My hypothesis is that Amp is chosen by people who are VERY highly skilled in agentic development. Meaning these are the people most likely to provide solid context, good prompts, etc. That means these same people would likely get the best results from ANY coding agent. This also tracks with Amp being so expensive -- users or companies are more likely to pay a premium if they can get the most from the tool.
Claude Code on the other hand is used by (I assume) a way larger population. So the percentage of low-skill users is likely to be much higher. Those users may still get value from the tool, but their success rate will be lower by some factor with ANY coding agent. And this issue (if my hypothesis is correct) is likely 10x as true for GitHub Copilot.
Therefore I don't know how much we should read into stats like the total PR merge success percentage, because it's hard to tell the degree of noise caused by this user skill distribution imbalance.
To be honest, I've given it a try a couple of times, but it's so expensive I'm having a hard time even being able to judge it fairly. The first time I spent just $5, the second $10, and the third $20, but they all went by so fast that I'm worried that even if I find it great, it's way too expensive, and having a number tick up and down makes me nervous or something. And I'm the type of person who has ChatGPT Pro, so I'm not exactly stingy about paying for things I find useful, but there is a limit somewhere, and I guess for me Amp is it.
It sounds like you're being temporarily stingy due to having ChatGPT Pro. Might be good to get rid of it if you think the grass might be greener outside of Codex.
No, ChatGPT Pro was an example of how I'm not stingy about paying for things I find useful. I'm also paying for Gemini, Claude, and other types of software to do my job, not even just coding. But even so, I still find Amp too expensive to be able to use for anything useful.
I run every single coding prompt through Codex, Claude Code, Qwen, and Gemini, compare which one gives me the best result, and go ahead with that one. Maybe I go with Codex 60% of the time, Claude 20%, and Qwen/Gemini the remaining 20%; it's not often at all that any of them gets enough right. I've tried integrating Amp into my workflow too, but as mentioned, it's too expensive. I do think the grass is currently greenest with Codex, still.
It depends on your perspective. From a startup perspective, this makes you a less interesting potential customer, to which one might attach the term stingy. From a perspective of willingness to invest in your own productivity it doesn't sound stingy, though.
It was really good in the early stages (this past summer). But that was before Claude Code and Codex took off big time. I would say the biggest downside of Amp is that it’s expensive. Like running Opus the whole time expensive. But they don’t have their own model, so what are you really paying for? Prompts? Not so sure. Amp is for people who are not smart enough to roll their own agents, and in that case, they shouldn’t be using an agentic workflow.
While we can agree that adding AI just to tick a box will win no awards, it's a laughable proposition to suggest that Apple doesn't need to do anything on AI.
If anything, it's laughable, and it points to the unoriginality of product creators, that we haven't fundamentally transformed how we interact with technology given how much functionality AI offers. Anyone who figures this out (I'll bet 20% on Ive) will eat Apple's dinner.
If you're giving 5:1 against Ive, I'll take that in a heartbeat. He has zero track record to show he can somehow capitalize on AI; even his design contributions are overall meh. Apple will have to do "something", but the beauty of mountains of cash and a business that doesn't need AI everywhere is that they can wait and see what that something is, and then execute. They've actually been really good at figuring out implementation after the conceptual heavy lifting is done; it's deep in their DNA.
"trained on our public and internal docs" trained how? Did you mean fine-tuned haiku? Did you actually fine tune correctly? Its not even a recommended architecture.
Or did you just misuse basic terminology about LLMs and are now saying it misbehaved, likely because your org did something very bad with?
It's a two-day project at best to create your own bespoke LLM-as-judge e2e eval framework. That's what we did. Works fine. Not great. You still need someone to write the evals, though.