Brad Ewing @bradleyewing.bsky.social

21h

i agree with your general point but i dislike the vapid "count N letters in this phrase" bench that people use to dismiss LLM capabilities

it's like if i was trying to use a screwdriver to hammer in a nail; it's the wrong way to use the tool!

April 10, 2026 - 21:34 UTC

conputer dipshit @davidcrespo.bsky.social

21h

I sort of agree except that the lack of reasoning capability demonstrated is actually relevant to getting correct answers in more realistic cases

Justin Smith @odd-dimensions.bsky.social

What?!

Buttadeus @thewanderingjew.bsky.social

21h

Yeah - I mean it's also just really bad to put the worst version of your product in front of the most eyes.

Brad Ewing @bradleyewing.bsky.social

21h

better models (esp with thinking) can get the correct answer

but the better way is to leverage the LLM to create a deterministic program that does the job
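
For illustration, the kind of deterministic program meant here might look like the sketch below. It assumes the task is counting one letter in a phrase, case-insensitively; the function name `count_letter` and the sample inputs are hypothetical and not from the thread.

```python
def count_letter(phrase: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a phrase."""
    # Assumption: "letter" means exactly one character.
    if len(letter) != 1:
        raise ValueError("letter must be a single character")
    # Lowercase both sides so "S" and "s" are treated as the same letter.
    return phrase.lower().count(letter.lower())


if __name__ == "__main__":
    print(count_letter("strawberry", "r"))
```

The count comes from a plain string operation, so the same input always gives the same answer, regardless of how the model tokenizes the phrase.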

Ed @ed3d.net

21h

that’s true in a general purpose situation but like, you also don’t want to give that to the AI Overview session, and you would like it to answer this.