NaturalSQL for Materials Science

Project Overview

NaturalSQL for Materials Science integrates natural language processing with AiiDA (Automated Interactive Infrastructure and Database for Computational Science) so that researchers can query computational materials science data in plain English instead of writing SQL by hand.

It builds on AiiDA's provenance tracking and database layer to make scientific data accessible through natural language queries, so researchers can focus on the science rather than on query syntax.

Features

Natural Language Queries: Query your AiiDA database using plain English
Automatic SQL Generation: Converts natural language to optimized SQL queries
PDF Report Generation: Creates comprehensive reports from query results
AiiDA Integration: Works with AiiDA's provenance graph to provide context-aware results
Materials Science Focus: Tailored for computational materials science terminology and workflows

Requirements

Python 3.9+ (required by AiiDA 2.5 and later)
AiiDA 2.5.0+ (2.6.0 recommended)
PostgreSQL (for production use) or SQLite (for testing)
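
A quick way to confirm the AiiDA requirement before going further is to read the installed package version. The snippet below is a minimal sketch using only the standard library; the helper names (parse_version, installed_aiida_version, meets_minimum) are made up for this example and are not part of this project or of AiiDA.

```python
import re
from importlib import metadata

# Minimum aiida-core version listed under Requirements.
MIN_AIIDA = (2, 5, 0)


def parse_version(text):
    """Return the first three numeric components of a version string as a tuple.

    Missing components are padded with zeros and pre/post-release suffixes
    are ignored, so "2.5" and "2.5.0rc1" both become (2, 5, 0).
    """
    parts = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def installed_aiida_version():
    """Return the installed aiida-core version string, or None if it is absent."""
    try:
        return metadata.version("aiida-core")
    except metadata.PackageNotFoundError:
        return None


def meets_minimum(version_text, minimum=MIN_AIIDA):
    """Check a version string against the minimum required version."""
    return parse_version(version_text) >= minimum


if __name__ == "__main__":
    found = installed_aiida_version()
    if found is None:
        print("aiida-core is not installed")
    elif meets_minimum(found):
        print(f"aiida-core {found} meets the 2.5.0 minimum")
    else:
        print(f"aiida-core {found} is older than 2.5.0")
```

Run it inside the virtual environment you plan to use, since it inspects whichever environment the interpreter belongs to.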

Installation

Clone the repository:

```
git clone https://github.com/your-username/NaturalSQL-for-Material-Science.git
cd NaturalSQL-for-Material-Science
```

Set up a virtual environment:

```
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```

Install AiiDA:

```
pip install aiida-core
```

Install additional dependencies:

```
pip install -r requirements.txt
```

Configure AiiDA (if not already set up):

```
verdi setup
```

Configuration

Setting up AiiDA Profile

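If you're starting from scratch, create and configure an AiiDA profile:

```
verdi quicksetup  # For quick setup with default values
```

Or, for more control:

```
verdi setup
```

On AiiDA 2.6 both commands are deprecated; verdi presto and verdi profile setup are the respective replacements.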

Configuring a Computer

Set up a compute resource:

```
verdi computer setup -L mycomputer -H localhost -T core.local -S core.direct -w /path/to/work/dir
verdi computer configure core.local mycomputer --safe-interval 0
```

Setting up Codes

Register computational codes:

```
verdi code create core.code.installed --label mycode --computer=mycomputer --default-calc-job-plugin plugin.name --filepath-executable=/path/to/executable
```

Usage

Running a Natural Language Query

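Run the demo script through verdi run, which loads the active AiiDA profile before executing it:

```
verdi run run_workflow.py
```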

Available Demo Queries

The system can handle queries like:

"Show me all calculations that failed last week"
"Find structures with more than 50 atoms"
"List all workflows related to band structure calculations"
"Count the number of calculations per computer used"
"What is the average calculation runtime for quantum espresso jobs?"
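
To make the idea concrete, here is a toy sketch of how one of these questions could be turned into a parameterized SQL query. It is not this project's translation logic and it does not use AiiDA's real database schema: the structures table, its columns, the sample rows, the regular expression, and the function question_to_sql are all assumptions invented for illustration, and the query runs against an in-memory SQLite database.

```python
import re
import sqlite3

# Hypothetical pattern for questions such as
# "Find structures with more than 50 atoms".
PATTERN = re.compile(r"structures with (more|fewer) than (\d+) atoms", re.IGNORECASE)


def question_to_sql(question):
    """Map a question to (sql, params); raise ValueError if it is not understood."""
    match = PATTERN.search(question)
    if match is None:
        raise ValueError(f"Unsupported question: {question!r}")
    # The operator is chosen from two fixed values, never taken from user text.
    operator = ">" if match.group(1).lower() == "more" else "<"
    sql = (
        "SELECT label, n_atoms FROM structures "
        f"WHERE n_atoms {operator} ? ORDER BY n_atoms"
    )
    # The threshold is bound as a parameter rather than pasted into the SQL.
    return sql, (int(match.group(2)),)


def demo():
    """Run the toy translation against a made-up table and return the rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE structures (label TEXT, n_atoms INTEGER)")
    # Invented sample rows, only here so the query has something to return.
    conn.executemany(
        "INSERT INTO structures VALUES (?, ?)",
        [("Si-bulk", 2), ("MoS2-supercell", 75), ("perovskite-slab", 120)],
    )
    sql, params = question_to_sql("Find structures with more than 50 atoms")
    rows = conn.execute(sql, params).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for label, n_atoms in demo():
        print(label, n_atoms)
```

Binding the number as a query parameter keeps user-supplied text out of the SQL string, which matters for any system that generates queries from free-form input.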

Generating Reports

Reports are automatically generated when running queries and saved to the nl_query_reports folder with timestamped filenames:

nl_query_report_YYYYMMDD_HHMMSS.pdf
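
If you want to produce or locate report files from your own scripts, the naming scheme above can be reproduced with the standard library. This is a minimal sketch rather than the project's own report code; the helpers report_path and latest_report are hypothetical names, and the sketch assumes the timestamp is local time.

```python
from datetime import datetime
from pathlib import Path

# Folder named in this README; adjust if your reports live elsewhere.
REPORT_DIR = Path("nl_query_reports")


def report_path(now=None):
    """Build a report path stamped with the given (or current) local time."""
    now = now or datetime.now()
    return REPORT_DIR / f"nl_query_report_{now:%Y%m%d_%H%M%S}.pdf"


def latest_report(directory=REPORT_DIR):
    """Return the newest report in the folder, or None if there are none.

    The YYYYMMDD_HHMMSS stamp sorts chronologically as plain text,
    so sorting by file name is enough.
    """
    reports = sorted(Path(directory).glob("nl_query_report_*.pdf"))
    return reports[-1] if reports else None


if __name__ == "__main__":
    print(report_path())
    print(latest_report())
```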

Workflow Development

To create your own custom natural language query workflows:

Extend the base workflow in nl_query_workflow.py
Define your specific query patterns in the workflow
Register your workflow in workflow.py
Execute using nl_query_demo/run_workflow.py

Contribution

Contributions are welcome! Please feel free to submit a Pull Request.

Fork the repository
Create your feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add some amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

Acknowledgements

This project builds upon the AiiDA framework, a workflow manager for computational science with a strong focus on provenance, performance, and extensibility.

Please cite the following when using this project:

S. P. Huber et al., "AiiDA 1.0, a scalable computational infrastructure for automated reproducible workflows and data provenance", Scientific Data 7, 300 (2020); DOI: 10.1038/s41597-020-00638-4
M. Uhrin et al., "Workflows in AiiDA: Engineering a high-throughput, event-based engine for robust and modular computational workflows", Computational Materials Science 187, 110086 (2021); DOI: 10.1016/j.commatsci.2020.110086

Contact

For questions and support, please open an issue in the GitHub repository or contact the development team at mrebaal14@gmail.com.
