This repository contains the framework and experiments discussed in the article "Transformers Meet Relational Databases".
A study on integrating transformer architectures with relational databases via a modular message-passing framework, demonstrating enhanced performance.
The end-to-end nature of the system allows for streamlined integration of deep learning methods in relational database settings. Any relational database can be attached through a simple connection string (via SQLAlchemy). Special care is given to the databases of the CTU Relational repository, which are currently being further integrated with RelBench into a new dataset library. The system then loads data from the database (with Pandas), automatically analyzes its schema structure and column semantics, and efficiently embeds the data into learnable tensor representations (with PyTorch Frame).
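To make the schema-analysis step more concrete, the following is a minimal, dependency-free sketch of what inspecting a database and guessing column semantics might look like. It is not the repository's API: it uses Python's standard-library `sqlite3` as a stand-in for the SQLAlchemy and Pandas stack named above, and the helpers `inspect_schema` and `guess_column_kind`, the demo tables, and the `max_categories` threshold are all hypothetical names and values chosen for illustration.

```python
import sqlite3


def inspect_schema(conn):
    """Collect declared column types and foreign keys for every table.

    Hypothetical helper; the repository itself relies on SQLAlchemy.
    """
    schema = {}
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    for table in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = {row[1]: row[2]
                   for row in conn.execute(f'PRAGMA table_info("{table}")')}
        # PRAGMA foreign_key_list rows: (id, seq, table, from, to, ...)
        foreign_keys = [(row[3], row[2], row[4])
                        for row in conn.execute(f'PRAGMA foreign_key_list("{table}")')]
        schema[table] = {"columns": columns, "foreign_keys": foreign_keys}
    return schema


def guess_column_kind(values, max_categories=10):
    """Crude stand-in for column-semantics analysis (assumed heuristic)."""
    non_null = [v for v in values if v is not None]
    if non_null and all(isinstance(v, (int, float)) for v in non_null):
        return "numerical"
    if len(set(non_null)) <= max_categories:
        return "categorical"
    return "text"


# A tiny in-memory database with one foreign-key relationship.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, segment TEXT);
    CREATE TABLE purchase (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(id),
        amount REAL
    );
    INSERT INTO customer VALUES (1, 'retail'), (2, 'business');
    INSERT INTO purchase VALUES (1, 1, 9.5), (2, 1, 20.0), (3, 2, 7.25);
""")

schema = inspect_schema(conn)

# Guess a semantic kind for every column from its stored values.
kinds = {
    (table, column): guess_column_kind(
        [row[0] for row in conn.execute(f'SELECT "{column}" FROM "{table}"')])
    for table, info in schema.items()
    for column in info["columns"]
}

for table, info in schema.items():
    print(table, info["columns"], info["foreign_keys"])
```

The detected foreign keys are what would later link rows of different tables, while the per-column kinds would decide how each value gets embedded. A real pipeline would need a far more careful analysis than this heuristic, for example to tell identifiers apart from genuine numerical features.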
The subsequent modular neural message-passing scheme operates on top of the (two-level) multi-relational hypergraph representation. Building this representation with PyTorch Geometric makes any of its modules readily usable and, together with the tabular transformers of PyTorch Frame, yields a vast range of combinations for instantiating the presented deep learning blueprint. One such instantiation is the proposed DBFormer model, illustrated below:
For more information, please read the paper and/or feel free to reach out directly to us!
Citation:
@misc{peleška2024transformersmeetrelationaldatabases,
title={Transformers Meet Relational Databases},
author={Jakub Peleška and Gustav Šír},
year={2024},
eprint={2412.05218},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2412.05218},
}
db_transformer - the main module, containing:
- data - loading, analysis, conversion, and embedding
- db - connection, inspection, and schema detection
- nn - deep learning models, layers, training methods
experiments - presented in the paper, including baselines from:
- Tabular models
- Propositionalization
- Statistical Relational Learning
- Neural-symbolic integration
and additionally:
- scripts - some additional helper scripts
PyRelational is a library, currently under development, that integrates datasets from the CTU Relational repository into the representation proposed by the RelBench project, with the goal of further extending the field of Relational Deep Learning.