Grab a classroom of children and ask them all to draw a nine-pointed star. EVERY SINGLE child, irrespective of their artistic proficiency, will have zero issues.
Those children also didn't need millions of training samples of nine-pointed stars. They didn't need to run in a REPL, look at the picture, and say, "Oh darn the luck, it seems I've drawn a star with 8 points. I apologize, you're absolutely right, let me try again!", and lock themselves in a continuous feedback loop until they got it right either. Incidentally, that loop is exactly what a script I put together does to improve the prompt adherence of even SOTA models like Imagen4 and gpt-image-1 (it is painfully slow and expensive).
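For anyone curious what that kind of feedback loop looks like, here is a rough sketch in Python. It is not the actual script: `generate` and `count_points` are hypothetical callables you would have to supply yourself (one wrapping whatever image model you use, one wrapping whatever checker counts the points), and the retry wording and `max_attempts` default are assumptions made purely for illustration.

```python
from typing import Callable, Optional, Tuple


def generate_until_correct(
    prompt: str,
    generate: Callable[[str], bytes],      # hypothetical: prompt -> image bytes
    count_points: Callable[[bytes], int],  # hypothetical: image bytes -> points found
    expected_points: int = 9,
    max_attempts: int = 5,
) -> Tuple[Optional[bytes], int]:
    """Regenerate an image until the checker sees the expected number of points.

    Returns (image, attempts_used), or (None, max_attempts) if every attempt failed.
    """
    current_prompt = prompt
    for attempt in range(1, max_attempts + 1):
        image = generate(current_prompt)
        found = count_points(image)
        if found == expected_points:
            return image, attempt
        # Feed the mistake back into the next prompt, much like the
        # "I've drawn a star with 8 points, let me try again" exchange.
        current_prompt = (
            f"{prompt}\nThe previous attempt had {found} points; "
            f"it must have exactly {expected_points}."
        )
    return None, max_attempts
```

Every failed attempt costs another generation call plus another check, which is where the "painfully slow and expensive" part comes from.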
Lots of kids will get this wrong; I don't know what age you're thinking of here. They need years of direct coaching to learn words, what stars are, how to hold and move a pen, how to count…
Comparing physical drawing to these models is frankly daft as an intelligence test. This is the “count the letters” test in image form.