Causal Abstractions of NeSy models. Where are the concepts?

🚀 Overview

Understanding how neural models encode concepts internally is a central challenge in explainable AI. While Neuro-Symbolic (NeSy) models are designed to improve interpretability, they can still rely on reasoning shortcuts rather than learning meaningful abstractions. In this work, we analyze this phenomenon in DeepProbLog (DPL) models applied to the MNIST-Addition task, which requires both visual perception and reasoning. Using Causal Abstraction theory and Distributed Alignment Search (DAS), we investigate whether these models can be described by a high-level interpretable reasoning process and where they encode abstract concepts. Our findings reveal that architectural choices strongly influence the reliability of internal concept encodings, offering insights into which reasoning shortcuts may occur and into how abstract concept learning can be improved in NeSy models.
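As a rough illustration of the kind of high-level reasoning process referred to above, MNIST-Addition can be pictured as a small causal model with two digit concepts feeding a sum. The sketch below is an assumption for illustration only: the names `high_level_model`, `interchange_intervention`, `C1`, and `C2` are hypothetical and are not taken from `abstraction_models.py`. It shows an interchange intervention, the basic operation in causal abstraction analysis: run the model on a base input while one variable takes the value it would have on a source input.

```python
def high_level_model(digit1, digit2, intervention=None):
    """Hypothetical high-level model: two digit concepts feed one sum node."""
    concepts = {"C1": digit1, "C2": digit2}
    if intervention:
        # Overwrite the intervened variables with the supplied values.
        concepts.update(intervention)
    return concepts["C1"] + concepts["C2"]


def interchange_intervention(base, source, variable):
    """Run on `base`, but set `variable` to the value it takes on `source`."""
    source_concepts = {"C1": source[0], "C2": source[1]}
    return high_level_model(*base, intervention={variable: source_concepts[variable]})


base, source = (3, 5), (7, 1)
# C1 is taken from the source (7), C2 stays at its base value (5).
print(interchange_intervention(base, source, "C1"))
```

If a trained network is faithfully abstracted by this model, performing the corresponding swap on its internal representation should change its prediction in the same way.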

📌 Key Takeaways

✅ DeepProbLog models are not immune to reasoning shortcuts (RS).

✅ The reasoning shortcuts we observe are concept flipping and concept collapse. RS are unintended optima of the learning objective: the model achieves high accuracy by exploiting spurious correlations rather than meaningful abstractions.

✅ Enforcing disentanglement in the concept encoder helps mitigate RS.

✅ Concept information is not evenly distributed across latent dimensions.
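To make concept flipping and collapse concrete, here is a toy sketch. It assumes a deliberately restricted set of digit pairs chosen for illustration; it is not the repository's dataset or training setup, and `accuracy`, `flipped`, and `collapsed` are hypothetical names. It shows how a wrong digit-to-concept mapping can still predict every sum in the restricted set correctly.

```python
def accuracy(concept_map, pairs):
    """Fraction of pairs whose predicted sum equals the true sum."""
    correct = sum(concept_map[a] + concept_map[b] == a + b for a, b in pairs)
    return correct / len(pairs)


# Concept flipping: digits 0 and 1 are swapped, yet every sum is still right.
flip_pairs = [(0, 1), (1, 0)]
flipped = {0: 1, 1: 0}

# Concept collapse: digits 0, 1 and 2 all map to the same concept.
collapse_pairs = [(0, 2), (2, 0), (1, 1)]
collapsed = {0: 1, 1: 1, 2: 1}

print(accuracy(flipped, flip_pairs))
print(accuracy(collapsed, collapse_pairs))

# On a pair outside the restricted set, the shortcut is exposed.
print(accuracy(flipped, [(0, 0)]))
```

Both mappings are optimal with respect to the sum labels they were checked against, which is why label accuracy alone cannot reveal whether the learned concepts are the intended ones.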

📂 Directory Structure

.
├── backbones/             # Definition of the image encoders
├── data/                  # Directory where data is saved, e.g. MNIST
├── datasets/              # Code for the datasets
├── models/                # Definition of the model architecture
├── trained_models/        # Saved trained models
├── utils/                 # Additional helper functionality 
│
├── abstraction_models.py  # Definition of the causal abstraction that is used
├── DAS.py                 # Code to run distributed alignment search
├── main.py                # Training for different models to solve the task
├── visualizer.py          # Visualization of model behavior and DAS alignment
├── wrapped_models.py      # Code to wrap models to work with pyvene and the DAS implementation
│
├── Causal_Abstraction_of_NeSy_models.pdf # Project report summary
└── environment.yml        # Required dependencies
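The idea behind distributed alignment search, which `DAS.py` runs, can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: it uses a 2-dimensional hidden vector and a fixed rotation angle, whereas DAS learns the rotation by gradient descent. It does not use pyvene's API, and `rotate` and `distributed_interchange` are hypothetical names. The hidden vectors of a base and a source input are rotated into a new basis, one rotated coordinate is swapped, and the result is rotated back.

```python
import math


def rotate(vec, theta):
    """Rotate a 2D vector by angle theta (radians)."""
    x, y = vec
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y)


def distributed_interchange(base_hidden, source_hidden, theta):
    """Swap the first rotated coordinate of base with that of source."""
    base_rot = rotate(base_hidden, theta)
    source_rot = rotate(source_hidden, theta)
    # Keep the second rotated coordinate from base, take the first from source.
    mixed = (source_rot[0], base_rot[1])
    # Apply the inverse rotation to return to the original basis.
    return rotate(mixed, -theta)


base_hidden, source_hidden = (1.0, 2.0), (3.0, 4.0)
# With theta = 0 the swap acts on a single neuron; other angles mix neurons.
print(distributed_interchange(base_hidden, source_hidden, 0.0))
print(distributed_interchange(base_hidden, source_hidden, math.pi / 4))
```

Searching over rotations rather than individual neurons is what allows the method to find a concept that is spread across several latent dimensions.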

This project extends this implementation of reasoning shortcuts.

Tobi-Tob/CausalAbstraction

Folders and files

Latest commit

History

Repository files navigation

Causal Abstractions of NeSy models. Where are the concepts?

🚀 Overview

📌 Key Takeaways

📂 Directory Structure

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages