
Neuro-symbolic (NeSy) AI improves deep learning by integrating reasoning, prior knowledge, and constraints, making it, in theory, ideal for high-stakes applications. However, this promise hinges on learning high-quality abstractions, which is challenging due to potential reasoning shortcuts. This project explores this problem in depth.


Causal Abstractions of NeSy models. Where are the concepts?


🚀 Overview

Understanding how neural models encode concepts internally is a central challenge in explainable AI. While Neuro-Symbolic (NeSy) models are designed to improve interpretability, they can still rely on reasoning shortcuts rather than learning meaningful abstractions. In this work, we analyze this phenomenon in DeepProbLog (DPL) models applied to the MNIST-Addition task, which requires both visual perception and reasoning. Using Causal Abstraction theory and Distributed Alignment Search (DAS), we investigate whether these models can be described by a high-level interpretable reasoning process and where they encode abstract concepts. Our findings reveal that architectural choices strongly influence the reliability of internal concept encodings, offering insights into which reasoning shortcuts may occur and into how abstract concept learning can be improved in NeSy models.
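
For intuition, the snippet below sketches the distributed interchange intervention at the core of DAS in plain PyTorch. It is a minimal illustration, not the code in this repository (which wraps the models with pyvene); all names, dimensions, and the choice of subspace are assumptions.

```python
import torch
from torch import nn
from torch.nn.utils.parametrizations import orthogonal

# Illustrative sketch of a distributed interchange intervention, the core DAS
# operation. Names and sizes are hypothetical, not this repository's code.
hidden_dim, subspace_dim = 64, 8

# DAS learns an orthogonal rotation of a hidden layer and asks whether a few
# rotated coordinates behave like a high-level concept variable
# (e.g. "the digit shown in the first image").
rotation = orthogonal(nn.Linear(hidden_dim, hidden_dim, bias=False))

def interchange(h_base: torch.Tensor, h_source: torch.Tensor) -> torch.Tensor:
    """Replace the candidate concept subspace of the base representation
    with the corresponding coordinates of the source representation."""
    r_base, r_source = rotation(h_base), rotation(h_source)
    r_swapped = torch.cat([r_source[:, :subspace_dim],
                           r_base[:, subspace_dim:]], dim=-1)
    # rotation(h) computes h @ W.T with W orthogonal, so multiplying by W
    # rotates the edited representation back into the original basis.
    return r_swapped @ rotation.weight
```

During alignment search, the rotation is trained so that the model's output under such interventions matches the counterfactual prescribed by the high-level causal model (for MNIST-Addition: the predicted sum changes as if the swapped digit concept had changed). High interchange-intervention accuracy is then evidence that the concept is encoded in that subspace; low accuracy suggests it is stored elsewhere or entangled across dimensions.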

📌 Key Takeaways

DeepProbLog models are not immune to reasoning shortcuts (RS).

We observe concept flipping and concept collapse as RS: unintended optima of the learning objective at which a model achieves high accuracy by exploiting spurious correlations rather than meaningful abstractions (see the sketch after this list).

Enforcing disentanglement in the concept encoder helps mitigate RS.

Concept information is not evenly distributed across latent dimensions.
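
To make concept flipping concrete, consider a hypothetical restricted MNIST-Addition setting (not necessarily the one studied here) in which the first digit is always even and the second always odd. A concept extractor that adds one to every even digit and subtracts one from every odd digit is wrong on every input, yet it predicts every observed sum correctly, so the task loss alone cannot rule this shortcut out:

```python
# Hypothetical shortcut: flip every digit concept to a neighbouring value.
flip = lambda d: d + 1 if d % 2 == 0 else d - 1

for first in range(0, 10, 2):       # even first digits: 0, 2, ..., 8
    for second in range(1, 10, 2):  # odd second digits: 1, 3, ..., 9
        assert flip(first) + flip(second) == first + second  # sum is preserved
```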


📂 Directory Structure

.
├── backbones/             # Definition of the image encoders
├── data/                  # Directory to save the data e.g. MNIST
├── datasets/              # Code for the datasets
├── models/                # Definition of the model architecture
├── trained_models/        # Saved trained models
├── utils/                 # Additional helper functionality 
│
├── abstraction_models.py  # Definition of the causal abstraction that is used
├── DAS.py                 # Code to run distributed alignment search
├── main.py                # Training for different models to solve the task
├── visualizer.py          # Visualization of model behavior and DAS alignment
├── wrapped_models.py      # Code to wrap models to work with pyvene and the DAS implementation
│
├── Causal_Abstraction.pdf # Project report summary
└── environment.yml        # Required dependencies

This project is an extension of this implementation of reasoning shortcuts.
