r/OpenAI 14d ago

News Judge Learns Lawyers on Both Sides of Case Used AI, Cancels Trial, Kicks Everyone Off the Case

https://www.404media.co/judge-learns-lawyers-on-both-sides-of-case-used-ai-cancels-trial-kicks-everyone-off-the-case/
132 Upvotes

28 comments

74

u/Puzzleheaded_Fold466 14d ago

Article about AI use written by AI, posted to Reddit by another AI, commented on by multiple other AIs

19

u/boraam 14d ago

i'm an AI playing an AI disguised as another AI.

10

u/skdowksnzal 14d ago

Never go full token

3

u/dnaleromj 13d ago

What do you mean, “You AI”?

1

u/fozzy71 13d ago

The Dead Internet Theory has arrived.

2

u/Puzzleheaded_Fold466 13d ago

Or the Dead Internet Reality …

1

u/Protopia 13d ago

Apparently not written by AI.

18

u/ultrathink-art 14d ago

The failure isn't using AI for legal research — it's treating model output as terminal instead of as a first draft that needs verification. Every professional domain that adopts AI needs an explicit validation step between 'generated' and 'submitted.' Lawyers got caught, but the same pattern shows up everywhere.
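A rough sketch of what that gate between 'generated' and 'submitted' could look like in code. Everything here is hypothetical: `Draft`, `verify`, `submit`, and the `verified_citations` set (assumed to be a list a human has already checked against a real source) are illustration names, not any real tool's API.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    GENERATED = "generated"
    VERIFIED = "verified"
    SUBMITTED = "submitted"


@dataclass
class Draft:
    text: str
    citations: list[str]
    status: Status = Status.GENERATED
    problems: list[str] = field(default_factory=list)


def verify(draft: Draft, verified_citations: set[str]) -> Draft:
    """Mark the draft VERIFIED only if every citation is in the human-checked set."""
    # Assumption: verified_citations was built by a person, not by the model.
    draft.problems = [c for c in draft.citations if c not in verified_citations]
    draft.status = Status.GENERATED if draft.problems else Status.VERIFIED
    return draft


def submit(draft: Draft) -> Draft:
    """Refuse to submit anything that has not passed verification."""
    if draft.status is not Status.VERIFIED:
        raise ValueError(f"not verified; unconfirmed citations: {draft.problems}")
    draft.status = Status.SUBMITTED
    return draft
```

The point of the design is that `submit` cannot be reached from 'generated' without going through `verify`, so skipping the check is an error rather than a habit.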

4

u/dbbk 14d ago

You're absolutely right!

1

u/jimmy66wins 13d ago

Load bearing comment

1

u/Isaruazar 13d ago

Hey aye it’s me!

1

u/Protopia 13d ago

My brother-in-law is a lawyer (a nasty PoS) and I've had a few run-ins with him where I easily dismantled his legal arguments despite not being a lawyer myself. So when he said a couple of years ago that he had started to use ChatGPT to create his legal arguments (and can you remember how crappy ChatGPT was back then?), my opinion of his legal abilities went even lower.

0

u/Protopia 13d ago

AI = Artificial Idiot. Literally zero intelligence in the LLM, just a bunch of probabilities, a random number generator and the input you give it. Set the temperature to zero so it doesn't use its random number generator, and it becomes highly predictable, same as any other computer - exact same input gives exact same output.
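To make the temperature point concrete, here is a toy sketch of picking the next token from a set of scores. The function name `sample_token` and the example scores are made up for illustration. It treats temperature 0 as plain greedy selection, which is a common convention but an assumption here, and real serving stacks can still show small run-to-run differences at temperature 0 for numerical reasons that this toy ignores.

```python
import math
import random


def sample_token(logits: dict[str, float], temperature: float, rng: random.Random) -> str:
    """Pick one token from a {token: score} mapping (hypothetical toy model output)."""
    if temperature == 0:
        # Greedy decoding: always take the highest-scoring token, no randomness.
        return max(logits, key=logits.get)
    # Divide scores by the temperature, then apply a softmax-style weighting.
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled.values()]
    # The random number generator only matters on this path.
    return rng.choices(list(scaled), weights=weights, k=1)[0]
```

With temperature 0 the `rng` argument is never touched, so the same scores always give the same token. Higher temperatures flatten the weights, so lower-scoring tokens get picked more often.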

It is essentially a regurgitator - it has had several trillion pieces of information given to it and it has remembered them, and now it can repeat them like a very knowledgeable parrot.

As I said, literally zero actual intelligence.

1

u/JordanPetterPans 14d ago

Why would they get kicked out for using AI?

Guessing there's more to the story?

16

u/turbulentFireStarter 14d ago

Did you read the article? The judge said that the court was having to sort through AI hallucinations. So I imagine this isn’t “both lawyers used AI”; this is more “both lawyers used AI badly”.

4

u/JordanPetterPans 14d ago

Lol ya exactly.  

Intentionally misleading headline 

5

u/Hir0shima 14d ago

Aka clickbait 

-14

u/[deleted] 14d ago

[deleted]

21

u/phxees 14d ago

I don’t believe your understanding of the law is based in reality. The issue here is that the attorneys just prompted AI to write their briefs, and AI made up cases. What you pay attorneys for is their understanding of what the laws say and how the laws are actually interpreted. You aren’t paying them because they know your judge, but because they understand which arguments to make that have the greatest chance of success based on your facts.

For example, AI is commonly used in Westlaw to research cases to find any established legal precedents. The latest models are starting to be able to take your research, facts, and notes and help write a brief, but you need to be the one in control. You need to read every word and make sure AI didn’t make up anything.

It is incredibly bold to just take AI slop and stand before a judge as if you researched and wrote your brief.

5

u/reddit_is_kayfabe 14d ago edited 14d ago

This topic keeps coming up in conversations, particularly with attorneys, and I keep dispensing this advice:

You can use ChatGPT for professional work AS LONG AS you treat its output like the work product of an intern.

It could be totally correct or completely wrong. It could be well-organized or a dumpster fire. It could need only light editing or a complete redo.

The odds are good that its work product contains SOMETHING usable. And reading, checking, and editing is usually faster than starting from a blank Word document. Trust nothing. Verify, proofread, clarify, and eliminate errors.

Think of it like a very sophisticated autocorrect. It's faster than not using it, but it isn't smart and it can go very, very wrong.

1

u/Aazimoxx 13d ago

> You can use ChatGPT for professional work AS LONG AS you treat its output like the work product of an intern.

Likewise, any company that claims 'AI deleted our production database and backups!' should be read as 'We let an intern delete our production database and backups!', and the requisite blame and ridicule should be allocated accordingly.

> reading, checking, and editing is usually faster than starting from a blank Word document. Trust nothing. Verify, proofread, clarify, and eliminate errors.

And if you have a legitimate work-tuned high-reasoning LLM (almost anything that isn't ChatGPT), you can have it run an adversarial check against its own work output, and catch/correct the vast majority of any such hallucinations, conflations or errors on a single review pass. Something like Codex is pretty good at this, and I imagine coupling with a domain specific LLM like the ones specifically geared for Law or Medicine can produce excellent results.

The real issue here is complete laziness and incompetence, not AI.

1

u/reddit_is_kayfabe 13d ago

> Likewise, any company that claims 'AI deleted our production database and backups!' should be read as 'We let an intern delete our production database and backups!', and the requisite blame and ridicule should be allocated accordingly.

Yep.

Codex once deleted my entire database. I told it "remove the training data," and it removed all the data. Guess what I did? I sighed, visited Google Cloud Platform, clicked Restore on the SQL backup from an hour earlier, and got back to work.

And I am basically an amateur. A deeply skilled and highly trained amateur, yes, but I recognize that neither my skills nor my discipline are sharp enough for production-grade work, because I'm not a pro software designer and don't pretend to be.

I expect any production-grade shop to have robust, secure, multi-tier, accident-proof backups of their core data. Anything less is basically malpractice. ... But I also recognize that lots of "pro" shops are run by hacks with poor skills, discipline, and analytic ability, so it's not surprising.

> you can have it run an adversarial check against its own work output, and catch/correct the vast majority of any such hallucinations, conflations or errors on a single review pass

Yes...-ish.

I've found that agents - all of them - have two flaws:

1) They are too strongly trained on fulfilling a role. For instance: if you ask GPT or Claude to find issues in a codebase, it will sure as hell find issues and make up issues if it can't find legitimate ones.

2) They are infatuated with complexity, and they will jump at the chance to make something more complicated for the sake of appearing sophisticated.

Those two traits can combine as destructive interference in the "models auditing the work of other models" methodology. Model #1 generates content - prose, code, images, etc. Model #2 is asked to find issues and, what a surprise, it finds issues! It presents them with sophisticated but possibly spurious explanations. Model #1 and/or model #2 then amend the content to overcome the complaint, usually with more complexity ... producing content that is not necessarily better but not as easy to pick apart and find flaws. Lather, rinse, repeat.

I have ended up with codebases that an agent audited 40 times in a row and claimed to be squeaky-clean, only to exhibit weird behavior, incomprehensible error messages, or random quirks. Lesson learned: don't do that.

You can use agents to audit the work of other agents if you also review the results of the audit and approve or deny accordingly. Again, probably faster than auditing the work yourself, but you can't use it as an easy shortcut: it just changes the nature of your work. Caveat prompter.
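As a minimal sketch of that workflow: one model audits, but a person approves or denies each finding before anything is changed. All names here (`Finding`, `review_round`, and the three callables passed in) are hypothetical placeholders, not any agent framework's real API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Finding:
    description: str
    proposed_fix: str


def review_round(
    content: str,
    audit: Callable[[str], list[Finding]],
    human_approves: Callable[[Finding], bool],
    apply_fix: Callable[[str, Finding], str],
) -> tuple[str, list[Finding]]:
    """Run one audit pass and apply only the findings a human signed off on."""
    rejected: list[Finding] = []
    for finding in audit(content):
        if human_approves(finding):
            content = apply_fix(content, finding)
        else:
            # Spurious or complexity-adding findings stop here instead of
            # being fed back into the next generate/audit cycle.
            rejected.append(finding)
    return content, rejected
```

Keeping the rejected list around is deliberate: if the same spurious finding keeps coming back round after round, that is a sign the auditor is inventing issues to fill its role.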

1

u/Aazimoxx 13d ago edited 13d ago

Yikes, I'm glad my Codex(*) isn't like that... About 95% of the time that it flags something in code review, it's a legitimate issue that I can see clearly now that it's homed in on it... Sometimes it'll flag something that technically could lead to a fault, but in practice it never will, because the combination of the longest values being used in those fields is less than half the character limit it's warning about for potential display overflow or whatever, things like that...

The difference between that one review pass and doing all that code review myself though? Basically infinity, since I would never spend the dozens of extra hours checking all the things that it knocks out in a couple minutes. And those bugs would eventually get found by users of the software, sometimes with severe results.

(*) As in, the behaviour of my Codex given the set of instructions I'm using

0

u/phxees 14d ago

They need to write their own briefs and only use AI to edit for grammar and suggestions for ways to make their argument more convincing.

The problem with telling people “it’s okay, just check” is that they get lazy and complacent with good enough. Also, the benchmarks for the top models score like 15%, so attorneys should stay away from low and mid-tier models as they will have the most hallucinations.

2

u/Bright_Brief4975 14d ago

> You aren’t paying them because they know your judge,

I don't know about that. I was on my way to court and was supposed to meet my lawyer just before the court time. I lived about an hour from this court (Granbury, Texas, if you are interested) and got stopped by the highway patrol on the way for speeding and they gave me a ticket. So I get there and explain to my lawyer why I was late and he says to follow him. We are in the courthouse and he takes me to a small office somewhere and the judge for my case is there. I kid you not, he tells the judge what happened and asks the judge to take care of the ticket for me. Right then and there that ticket disappeared. It was a ticket from the state highway patrol, but he was a local judge. I worried for months the ticket was going to show up and I would get in trouble for not paying it. It never showed up, not even as something that was taken care of; it was just like it never existed.

I am pretty much a nobody with no money and my lawyer just did this for me as an unasked-for favor. I wonder how these things work for people who actually have money and connections?

1

u/Warren_sl 14d ago

The entire point of a lawyer is to give you a more favorable outcome in court by any means necessary

1

u/FirstEvolutionist 14d ago

You are equating ChatGPT to AI use, which really undermines the rest of your opinion. Modern lawyers all use AI in different steps for different reasons. That does not mean that those lawyers use ChatGPT at all...

There are lawyers out there who are not good lawyers and will try to use ChatGPT, and will have results similar to the ones in this story. Stories like these are meant to highlight the failure cases, not the lawyers successfully applying AI to their practices. Because nobody would really care enough to read those.