
Transformers Meet Relational Databases

This repository contains the framework and experiments discussed in the article Transformers Meet Relational Databases.

A study on integrating transformer architectures with relational databases via a modular message-passing framework, demonstrating enhanced performance.

About

The end-to-end nature of the system allows for streamlined integration of deep learning methods in relational database settings. The pipeline can attach any relational database through a simple connection string (via SQLAlchemy). Special care is given to databases from the CTU Relational repository, which are currently being further integrated with RelBench into a new dataset library. Furthermore, the system loads data from the DB (with Pandas), automatically analyzes its schema structure and column semantics, and efficiently loads and embeds the data into learnable tensor representations (with PyTorch Frame).
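
As a rough illustration of the attachment step, a generic SQLAlchemy + Pandas sketch might look as follows. This is not the repository's actual API; the connection string is a hypothetical placeholder for your own database URL:

import pandas as pd
from sqlalchemy import create_engine, inspect

# Hypothetical connection string; substitute your own database URL.
engine = create_engine("postgresql://user:password@host:5432/mydb")

# Inspect the schema: tables, columns, and foreign-key links.
insp = inspect(engine)
for table in insp.get_table_names():
    columns = [c["name"] for c in insp.get_columns(table)]
    foreign_keys = insp.get_foreign_keys(table)
    print(table, columns, foreign_keys)

# Load each table into a DataFrame for downstream analysis and embedding.
frames = {t: pd.read_sql_table(t, engine) for t in insp.get_table_names()}

The framework automates what this sketch does by hand: schema and column-semantics detection happen without user intervention once the connection string is supplied.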

The subsequent modular neural message-passing scheme operates on top of a (two-level) multi-relational hypergraph representation. Building this representation with PyTorch Geometric makes any of its modules readily usable, and together with the tabular transformers of PyTorch Frame this yields a vast space of combinations for instantiating the presented deep learning blueprint. One such instantiation is the proposed DBFormer model, illustrated below:

Figure: the DBFormer architecture (schema.png)
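
To give a flavor of the underlying graph construction, here is a minimal, hypothetical sketch of mapping a relational schema onto a PyTorch Geometric heterogeneous graph, with one node type per table and one edge type per foreign key. The table names, feature sizes, and the SAGEConv layer are illustrative assumptions, not the repository's actual model code:

import torch
from torch_geometric.data import HeteroData
from torch_geometric.nn import HeteroConv, SAGEConv

data = HeteroData()

# Hypothetical schema: each 'orders' row references a 'customers' row.
data["customers"].x = torch.randn(100, 16)  # embedded per-row features
data["orders"].x = torch.randn(500, 16)

# One graph edge per foreign-key pair (order row -> customer row).
fk = torch.randint(0, 100, (500,))
data["orders", "customer_id", "customers"].edge_index = torch.stack(
    [torch.arange(500), fk]
)

# Any PyG message-passing module can now run on this representation.
conv = HeteroConv({("orders", "customer_id", "customers"): SAGEConv(16, 32)})
out = conv(data.x_dict, data.edge_index_dict)  # {'customers': [100, 32]}

Swapping the convolution for an attention-based module over the same hypergraph is, in spirit, how a DBFormer-style instantiation of the blueprint comes together.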

For more information, please read the paper and/or feel free to reach out to us directly!

Citation:

@misc{peleška2024transformersmeetrelationaldatabases,
      title={Transformers Meet Relational Databases}, 
      author={Jakub Peleška and Gustav Šír},
      year={2024},
      eprint={2412.05218},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2412.05218}, 
}

Project Structure

  • db_transformer - the main module, containing:
    • data - loading, analysis, conversion, and embedding
    • db - connection, inspection, and schema detection
    • nn - deep learning models, layers, and training methods
  • experiments - the experiments presented in the paper, including baselines from:
    • Tabular models
    • Propositionalization
    • Statistical Relational Learning
    • Neural-symbolic integration
  • scripts - additional helper scripts

Related

PyRelational is a library under active development that integrates datasets from the CTU Relational repository into the representation proposed by the RelBench project, with the goal of further extending the field of Relational Deep Learning.
