I don't understand why the "solution" to telling the AI not to generate something is adding more hidden prompts.

Like, this should be a blacklist, not a "pray the AI looks up code vulnerabilities, understands it's an exploit, and correctly detects when a user is trying to generate said exploit."
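To be clear about what I mean by a blacklist: a deterministic check against known-bad patterns, run on the output, instead of hoping the model reasons its way to a refusal. A rough sketch (the patterns and function names here are made up for illustration):

```python
import re

# Hypothetical signatures of output we never want to emit,
# maintained as an explicit list rather than prompt instructions.
BLOCKLIST = [
    re.compile(r"os\.system\(\s*['\"]rm -rf /"),  # destructive shell call
    re.compile(r"mimikatz", re.IGNORECASE),       # known credential-dumping tool
]

def violates_blocklist(generated_text: str) -> bool:
    """Return True if the generated text matches any known-bad pattern."""
    return any(p.search(generated_text) for p in BLOCKLIST)

print(violates_blocklist("import os\nos.system('rm -rf /')"))  # True
print(violates_blocklist("print('hello world')"))              # False
```

The point is that this check runs the same way every time; it doesn't depend on the model "understanding" anything.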