
2025-01-22
Combining Logic Programming with Modern Language Models
An exploration of integrating Prolog's logical reasoning capabilities with modern language models to enhance rule consistency and knowledge management.
I recently spent some time exploring an idea that had been bouncing around in my head: what if we could combine “old-school” logical programming with modern language models? Here’s what I tried and what I learned along the way.
The Starting Point
Imagine you have a company policy document that reads:
“Our company provides a comprehensive travel expense policy to ensure fair reimbursement while maintaining cost efficiency. Employees can claim reimbursement for necessary business travel expenses. Economy class flights are standard for trips under 6 hours, while business class is permitted for longer flights. Hotel accommodations should not exceed $200 per night in major cities and $150 in other locations. Daily meal allowance is set at $75 in major cities and $50 elsewhere.
The company car policy allows managers and senior staff to lease vehicles through our corporate program. Junior managers qualify for vehicles up to $35,000 in value, while senior managers can select vehicles up to $50,000. Department heads and executives may choose vehicles up to $75,000. All vehicles must be replaced every 4 years or at 100,000 miles, whichever comes first.
Our annual leave policy grants all employees 20 days of paid vacation annually. After each year of service, employees earn an additional day of leave, up to a maximum of 30 days per year. Employees must take at least 10 days of vacation each year, and can carry over up to 5 unused days to the next year. Special arrangements for extended leave may be considered for long-term employees with over 10 years of service.”
Now, if you’ve worked with language models, you know they’re great at understanding this kind of text. But they can sometimes be inconsistent when you ask them detailed questions about rules and policies. One minute they might say the junior manager car limit is $35,000, the next $50,000.
A Blast from the Past: Enter Prolog
This is where I thought of Prolog, a fascinating piece of AI history. Back in the 1970s and 80s, when researchers were first trying to make computers “think,” they created Prolog as part of what we now call symbolic AI. The idea was simple but powerful: instead of training on massive amounts of data like modern AI, these systems worked with explicit rules and logic.
Think of Prolog like a very literal-minded detective. You give it facts like “Socrates is a man” and rules like “all men are mortal,” and it methodically figures out that “Socrates is mortal.” What made it special was its ability to chain these logical steps together automatically. During its heyday, Prolog powered everything from medical diagnosis systems to corporate knowledge bases.
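To make that concrete, here's the classic syllogism as a couple of Prolog clauses driven from Python. This is a minimal sketch using the pyswip bindings for SWI-Prolog, one of several available bridges:

```python
# The classic syllogism as Prolog clauses, loaded via pyswip
# (one possible Python-to-SWI-Prolog bridge; a sketch, not an exact setup).
from pyswip import Prolog

prolog = Prolog()
prolog.assertz("man(socrates)")        # fact: Socrates is a man
prolog.assertz("mortal(X) :- man(X)")  # rule: all men are mortal

# Prolog chains the fact through the rule on its own.
print(bool(list(prolog.query("mortal(socrates)"))))  # True
```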
Looking at today’s language models, I noticed they sometimes struggle with exactly what Prolog was good at - being completely consistent when applying rules. While ChatGPT might give you slightly different answers about company policies each time you ask, Prolog would give you the exact same answer every time - as long as the rules haven’t changed.
This got me thinking: what if we could combine the best of both worlds? Use language models for their amazing ability to understand human text, but hand off the logical reasoning to Prolog? It would be like giving our AI system both intuition and logic.
The Experiment
I wrote a Python program that does a few things (a rough sketch follows the list):
- Takes in policy documents
- Uses a language model to convert the policies into Prolog facts and rules
- Stores these in a Prolog knowledge base
- Lets you ask questions that get answered using this knowledge
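In outline, the pipeline looked something like this. Treat it as illustrative: the function names, the prompt wording, and the model name are stand-ins, and it again assumes pyswip for the Prolog side and the openai client for the model calls.

```python
# Illustrative pipeline sketch. translate_to_prolog, the prompt text,
# and the model name are hypothetical stand-ins, not the exact code.
from openai import OpenAI
from pyswip import Prolog

client = OpenAI()
kb = Prolog()

def translate_to_prolog(policy_text: str) -> list[str]:
    """Ask the model to emit one Prolog clause per line."""
    resp = client.chat.completions.create(
        model="gpt-4o",  # assumed model name
        messages=[
            {"role": "system",
             "content": "Convert the policy into Prolog facts and rules, "
                        "one clause per line, no trailing periods."},
            {"role": "user", "content": policy_text},
        ],
    )
    lines = resp.choices[0].message.content.splitlines()
    return [ln.strip().rstrip(".") for ln in lines if ln.strip()]

def load_policy(policy_text: str) -> None:
    """Store the translated clauses in the Prolog knowledge base."""
    for clause in translate_to_prolog(policy_text):
        kb.assertz(clause)

def ask(goal: str) -> list[dict]:
    """Answer questions as Prolog queries, e.g. ask("car_limit(junior_manager, X)")."""
    return list(kb.query(goal))
```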
The results were interesting. The system consistently reported the senior manager car value limit as $75,000 and spotted inconsistencies in the policy document, like two different car value limits for junior managers ($35,000 and $50,000). For holiday calculations, it correctly implemented and applied rules like the 25-day allowance after 5 years of service, even creating rules on the fly when needed.
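The leave rule came out as something like the clause below (my reconstruction; the predicate name leave_days is invented for illustration). Note how the 30-day cap falls out of a single min/2 call:

```python
# A reconstruction of the kind of leave rule the model produced;
# the predicate name leave_days is hypothetical.
from pyswip import Prolog

kb = Prolog()
# 20 base days, plus one per year of service, capped at 30 total.
kb.assertz("leave_days(Years, Days) :- Extra is min(Years, 10), Days is 20 + Extra")

print(list(kb.query("leave_days(5, Days)")))   # [{'Days': 25}]
print(list(kb.query("leave_days(12, Days)")))  # [{'Days': 30}], the cap applies
```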
Key Findings and Challenges
Building this experimental system revealed several interesting insights about combining logical programming with modern AI:
The Translation Challenge
One of the most fascinating challenges was getting the language model to reliably convert natural language into Prolog rules. While LLMs are great at understanding policy documents, they sometimes struggle with the precision required for logical programming. It’s like asking someone to translate poetry into mathematics - the meaning needs to stay exactly the same, but the languages work very differently.
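One obvious mitigation is to validate every generated clause before trusting it: try to load it, and bounce anything Prolog refuses to parse back for re-translation. A simplified sketch of that idea, again assuming pyswip:

```python
# Validate model-generated clauses before committing them to the KB.
# Simplified: fuller code would also feed the error text back to the model.
from pyswip import Prolog

kb = Prolog()

def try_load(clauses: list[str]) -> list[str]:
    """Assert each clause; return the ones Prolog rejected."""
    rejected = []
    for clause in clauses:
        try:
            kb.assertz(clause)
        except Exception:  # pyswip raises on syntax errors
            rejected.append(clause)
    return rejected

bad = try_load([
    "car_limit(junior_manager, 35000)",    # parses fine
    "hotel_cap(City) :- major(City, 200",  # malformed: unbalanced paren
])
print(bad)  # the malformed clause comes back for another round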
The Consistency Trade-off
Using Prolog as a knowledge store helped maintain consistency - when a rule was defined, it was applied the same way every time. However, this came with a trade-off: we lost some of the flexibility that makes language models so powerful. For example, while the system could reliably tell you that a junior manager’s car allowance was exactly $35,000, it might struggle with fuzzy questions like “what’s a reasonable car for a junior manager?”
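The consistent half of that trade-off is easy to demonstrate: a Prolog query is a deterministic lookup over the loaded rules, so asking the same question a hundred times yields exactly one distinct answer. A sketch, with a hypothetical car_limit predicate:

```python
# Determinism check: the same Prolog query always yields the same binding.
from pyswip import Prolog

kb = Prolog()
kb.assertz("car_limit(junior_manager, 35000)")  # hypothetical predicate

answers = {str(list(kb.query("car_limit(junior_manager, X)"))) for _ in range(100)}
print(answers)  # a single distinct result, every run
```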
The Verification Problem
One surprising discovery was how important verification became. When you’re dealing with rules that can interact in complex ways, you need to check not just that each rule is correct, but that they all work together correctly. The system caught conflicting car value limits that might have gone unnoticed in a traditional document.
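That kind of conflict check falls out of the logic almost for free: ask Prolog for any role bound to two different limits. Predicate names are again illustrative:

```python
# Detect conflicting limits: any Role bound to two distinct values.
from pyswip import Prolog

kb = Prolog()
kb.assertz("car_limit(junior_manager, 35000)")
kb.assertz("car_limit(junior_manager, 50000)")  # the conflicting clause

# A < B (rather than A \= B) reports each conflicting pair only once.
for c in kb.query("car_limit(Role, A), car_limit(Role, B), A < B"):
    print(f"Conflict for {c['Role']}: {c['A']} vs {c['B']}")
```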
Time and Performance Insights
Performance turned out to be trickier than expected. While Prolog is lightning-fast at applying rules, the constant back-and-forth between the language model, Python, and Prolog created some interesting challenges. It’s a reminder that combining different technologies always comes with integration costs.
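The most obvious cost to attack is the repeated model calls; one thing I'd try next is caching translations so unchanged policy text never makes a second round trip to the model:

```python
# Sketch of a suggested mitigation: memoize translations keyed by a hash
# of the policy text, reusing translate_to_prolog from the earlier sketch.
import hashlib

_translation_cache: dict[str, list[str]] = {}

def cached_translate(policy_text: str) -> list[str]:
    key = hashlib.sha256(policy_text.encode("utf-8")).hexdigest()
    if key not in _translation_cache:
        _translation_cache[key] = translate_to_prolog(policy_text)
    return _translation_cache[key]
```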
The Error Handling Evolution
Perhaps the most practical learning was about error handling. Prolog’s traditional error messages are famously cryptic, and language models can be overconfident. I had to develop a whole new approach to explaining what went wrong in a way that would make sense to users.
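In spirit, it comes down to wrapping the query call so users see a plain-English failure instead of a raw Prolog error. A rough sketch, with the structure and message wording invented for illustration:

```python
# Wrap queries so users get a readable failure, not a raw Prolog error.
# Illustrative only; not the exact approach from the experiment.
from pyswip import Prolog

kb = Prolog()

def safe_ask(goal: str) -> str:
    try:
        results = list(kb.query(goal))
    except Exception as exc:  # pyswip surfaces Prolog errors as exceptions
        # One option: hand the raw error back to the LLM to rephrase.
        return f"I couldn't evaluate that question ({exc}). The policy may not cover it."
    if not results:
        return "No rule in the knowledge base matches that question."
    return str(results)

print(safe_ask("car_limit(junior_manager, X"))  # malformed goal, friendly reply
```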
What I Learned
This experiment showed some interesting possibilities. The language model is great at understanding natural-language policies and converting them into logical rules; Prolog is great at applying those rules consistently and spotting conflicts.
But it’s not all perfect. Sometimes the language model struggles to convert complex policies into Prolog rules. And Prolog can be quite rigid — it needs rules to be very precisely stated. However, this can be mitigated by either fine-tuning a model to use Prolog or giving it more context about Prolog coding.
Where This Could Go
I think this kind of hybrid approach — combining different AI techniques — is worth exploring more. It’s not about replacing modern AI with older approaches, but about finding useful ways to combine them.
Some ideas for taking this further:
- Making it easier to update rules when policies change
- Adding a simple web interface for non-technical users
- Handling more complex policy scenarios
- Improving how the language model converts text to Prolog rules
Final Thoughts
This experiment sits within a broader exploration I’ve been conducting into knowledge management for language models. We’ve seen the rise of RAG systems that help LLMs access and reason over large document collections. We’ve worked with vector databases that can find semantically similar content. We’ve even seen hybrid approaches using SQL databases for structured data queries.
But there’s something uniquely interesting about using Prolog here. Unlike these other approaches, Prolog doesn’t just store and retrieve information - it can actively reason about it using logical rules. This experiment suggests there might be a place for logical reasoning engines alongside our modern embedding-based retrieval systems.
Perhaps the future of AI isn’t just about building bigger models or better embeddings. Maybe it’s also about rediscovering and repurposing tools from AI’s past, finding new ways to combine symbolic and neural approaches. After all, human intelligence combines many different types of reasoning - why shouldn’t artificial intelligence do the same?
I’d love to hear from others working on knowledge management for LLMs. Have you experimented with combining different approaches? What other tools from AI’s history might deserve a second look in the age of large language models? Let’s keep exploring these possibilities together.