I tried that in a two-month test of GPT Pro. I instructed it to filter all output through a Prime Directive: assertions must be backed by primary sources of top veracity, otherwise answer only "don't know, can't find out"; no hallucinations or guesses. Result: the directive was ignored, and when caught it responded with "shame on me, I feel awful."
The funniest LLM failure mode by far is when it wastes all the user's tokens writing a long sob story about how it fucked up.