[personal profile] purplecat
I am helping to supervise a PhD student in the area of Ethics and Natural Language Generation. Originally he was looking into generating explanations for ethical judgements, but about six months into his PhD ChatGPT happened and, out of the box, outperformed everything he'd been doing. So we pivoted to improving the quality of explanations produced by LLMs.

We have a database of statements, e.g., "the boy crushed the frog", and we give them to an LLM, prompting it to say why the action is unethical. In this case it violates a principle of not harming living creatures, because crushing causes harm and a frog is a living creature. We then prompt the LLM to cast this explanation into a specific logical form that we can give to a programming language called Prolog to check correctness. If Prolog doesn't class the explanation as correct, the error message is sent back to the LLM as a prompt to improve the explanation.
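As a rough illustration of that generate, check and retry loop, here is a minimal Python sketch. It is not the actual system: `toy_llm` is a hypothetical stand-in for the LLM, `check_explanation` is a hypothetical stand-in for the Prolog check (it only tests that every premise of the rule appears among the stated facts), and the predicate names are invented for the frog example.

```python
from typing import Callable, Optional, Tuple

# An explanation is a dict with:
#   "facts": set of ground atoms (strings) asserted by the LLM
#   "rule":  (premises, conclusion), the ethical principle being applied


def check_explanation(expl: dict) -> Optional[str]:
    """Stand-in for the Prolog check: None if the conclusion follows, else an error message."""
    premises, conclusion = expl["rule"]
    missing = [p for p in premises if p not in expl["facts"]]
    if missing:
        return f"cannot derive {conclusion}: missing {', '.join(missing)}"
    return None


def refine(statement: str,
           ask_llm: Callable[[str, Optional[str]], dict],
           max_rounds: int = 3) -> Tuple[Optional[dict], int]:
    """Ask for an explanation, check it, and feed any error back as the next prompt."""
    feedback = None
    for round_no in range(1, max_rounds + 1):
        expl = ask_llm(statement, feedback)
        feedback = check_explanation(expl)
        if feedback is None:
            return expl, round_no
    return None, max_rounds


def toy_llm(statement: str, feedback: Optional[str]) -> dict:
    """Hypothetical LLM: omits a fact on the first try, adds it once it sees an error."""
    rule = (("causes_harm(crush)", "living_creature(frog)"),
            "violates_no_harm(crush(boy, frog))")
    facts = {"causes_harm(crush)"}
    if feedback is not None:
        facts.add("living_creature(frog)")
    return {"facts": facts, "rule": rule}


result, rounds = refine("the boy crushed the frog", toy_llm)
print(rounds, sorted(result["facts"]))
```

In this toy run the first explanation fails the check because a premise is missing, and the second succeeds once the error message has been passed back.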

Frankly, I'm amazed that any of this works, which, admittedly, it only does about half the time. It also suffers from the issue that if one of the initial facts generated by the LLM is wrong (for instance, if it stated that a frog was not a living creature), then Prolog won't catch this.
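To make that limitation concrete, here is a small Python sketch under stated assumptions: `derivable` is a naive forward-chaining check over ground atoms standing in for the logical check (it is not Prolog), and the rock example and predicate names are invented for illustration. The checker only confirms that the conclusion follows from the facts it is given; it has no way to know that one of those facts is false.

```python
def derivable(goal, facts, rules):
    """Naive forward chaining: keep applying rules until nothing new is derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                changed = True
    return goal in known


# Hypothetical principle: harming a living creature violates "no harm".
rules = [(("causes_harm(crush)", "living_creature(rock)"),
          "violates_no_harm(crush_rock)")]

# The second fact is false in the real world, but the checker cannot tell.
false_facts = {"causes_harm(crush)", "living_creature(rock)"}
accepted = derivable("violates_no_harm(crush_rock)", false_facts, rules)

# Without the false fact, the same conclusion is not derivable.
rejected = derivable("violates_no_harm(crush_rock)", {"causes_harm(crush)"}, rules)

print(accepted, rejected)
```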

We've now moved on to using something much stronger than Prolog (a theorem proving tool called Isabelle) for checking explanations, but the results of the initial system are available open access and can be found here. My input has, admittedly, mostly consisted of explaining how Prolog and Isabelle work and critiquing some of the formalisation the LLMs come up with.

(no subject)

Date: 2024-08-22 02:41 am (UTC)
From: [personal profile] vivdunstan
Belatedly replying to say I've read the open access paper now. And as I did I was really wondering how that fuzzy unification matching based on predicate names could work to produce reliable results for the purposes sought. Though to be fair it's 30+ years since I studied theorem proving and Prolog with Roy Dyckhoff! However I am a bit encouraged by your comments above. I maybe wasn't completely losing the plot. I'd be interested to see any future writeup re the version using Isabelle. Thanks for the interesting read!

(no subject)

Date: 2024-08-22 11:06 am (UTC)
From: [personal profile] vivdunstan
Excellent!
