fix: move project over to veracity repo
dnvHerman committed Nov 16, 2022
1 parent 02b6956 commit 4abba67
Showing 12 changed files with 1,446 additions and 1 deletion.
131 changes: 131 additions & 0 deletions .gitignore
@@ -0,0 +1,131 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
.python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# PEP 582; used by e.g. github.com/David-OConnor/pyflow
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

/.vscode
58 changes: 57 additions & 1 deletion README.md
@@ -1 +1,57 @@
# KnowledgeGraphGenerator-
# Knowledge Graph Generator

Knowledge Graph Generator is a way to create a node graph that visualizes connections between data. The program is written in Python 3.9.2 and mainly uses pandas for data management. It has a simple UI (currently with some bugs) that lets you define data, nodes, and edges, and then generate the output files. The program itself does not visualize the data; it generates the files needed by a visualization tool like [Gephi](https://gephi.org/).

## Running the program

You can run the program in two ways: run the executable from the releases, or run it with Python.

### Running with Python

First, install the required libraries by running `python -m pip install -r requirements.txt`. After that, you can start the program with `python main.py`.

## How it works

The program is divided into five sections:
* [Data](#data)
* [Data settings](#data-settings)
* [Node settings](#node-settings)
* [Edge settings](#edge-settings)
* [Generate Graph](#generate-graph)

### Data

In this section you define which files to import. You can import multiple files, but the current version only uses the first one. Also, in the current version, removing a file from the data section does not remove the nodes and edges connected to that file.

Rule of thumb: use only one file (as of the beta version).

### Data settings

In this section you define the column types. To set the types for a file's columns, click the file in the data pane on the left. The settings menu is a little buggy, so you might need to click the data file a few times before the column settings behave properly. Only a few types are available right now: *integer*, *float*, *string*, and *boolean*. All columns are read as strings when the program loads the data, so change the column types before you use the data if you want to do more processing on it later.
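As a rough illustration of what changing a column type amounts to, the sketch below converts string columns with pandas. The sample data, the `convert_column` helper, and the mapping from the UI's type names to pandas conversions are all assumptions made for this example; they are not taken from the program's code.

```python
import pandas as pd

# Hypothetical sample data: every column starts out as a string.
df = pd.DataFrame(
    {
        "id": ["1", "2", "3"],
        "weight": ["0.5", "1.25", "2.0"],
        "active": ["true", "false", "true"],
    },
    dtype="string",
)


def convert_column(series: pd.Series, type_name: str) -> pd.Series:
    """Convert a string column to one of the four type names (assumed mapping)."""
    raw = series.astype(object)  # plain Python strings
    if type_name == "integer":
        return pd.to_numeric(raw).astype("int64")
    if type_name == "float":
        return pd.to_numeric(raw).astype("float64")
    if type_name == "boolean":
        # Only "true"/"false" (any case) are recognised in this sketch.
        return raw.str.lower().map({"true": True, "false": False}).astype("boolean")
    return series  # "string": leave the column as it is


column_types = {"id": "integer", "weight": "float", "active": "boolean"}
typed = df.assign(
    **{name: convert_column(df[name], kind) for name, kind in column_types.items()}
)
print(typed.dtypes)
```

Values that cannot be parsed as numbers make `pd.to_numeric` raise an error here, which is one reason to settle the column types before processing the data further.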

### Node settings

In this section you define which nodes you want. Select the data file in the pane on the left, and the nodes you can add show up at the bottom. The select-node pane lists your columns, any of which you can set as a node. When you check a column as a node, it appears in the node pane on the left.
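The `Node` and `NodeDef` classes are not shown in this part of the commit, so the following is only one plausible reading of "setting a column as a node": each distinct value in a checked column becomes one node. The helper name, the sample data, and the `Id`/`Label`/`Type` column names are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame(
    {"author": ["Ann", "Bob", "Ann"], "paper": ["P1", "P1", "P2"]},
    dtype="string",
)


def nodes_from_columns(frame: pd.DataFrame, columns: list) -> pd.DataFrame:
    """Build one node record per distinct value in each selected column."""
    records = []
    for column in columns:
        for value in frame[column].dropna().unique():
            records.append(
                {
                    # Prefix the id with the column name so equal values
                    # in different columns do not collide.
                    "Id": f"{column}:{value}",
                    "Label": str(value),
                    "Type": column,
                }
            )
    return pd.DataFrame.from_records(records)


nodes = nodes_from_columns(df, ["author", "paper"])
print(nodes)
```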

### Edge settings

In this section you define edges between the nodes you have added. This menu is also a little buggy, so you might need to click a node a few times before it behaves properly. The menu sits below the node pane (left) and the edge pane (right). It shows your selected node at the bottom and a box for selecting other nodes on the right. Below that, you can choose whether the edge is directional.
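The `Edge` and `EdgeDef` classes are not part of the files shown here, so this is a hedged sketch of one way an edge definition between two node columns could be turned into edge rows: every row in which both columns have a value links the two corresponding nodes. The helper name, the sample data, and the `Source`/`Target`/`Type` column names are assumptions.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame(
    {"author": ["Ann", "Bob", "Ann"], "paper": ["P1", "P1", "P2"]},
    dtype="string",
)


def edges_from_rows(
    frame: pd.DataFrame, source_col: str, target_col: str, directed: bool
) -> pd.DataFrame:
    """Create one edge per distinct (source, target) pair that shares a row."""
    pairs = (
        frame[[source_col, target_col]]
        .dropna()
        .drop_duplicates()
        .reset_index(drop=True)
    )
    return pd.DataFrame(
        {
            "Source": [f"{source_col}:{v}" for v in pairs[source_col]],
            "Target": [f"{target_col}:{v}" for v in pairs[target_col]],
            "Type": "Directed" if directed else "Undirected",
        }
    )


edges = edges_from_rows(df, "author", "paper", directed=False)
print(edges)
```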

### Generate Graph

In this section you define the output files for the program by setting the paths for the nodeFile and edgeFile. Warning: setting the node or edge file resets that file to 0 bytes, so be sure not to pick a file you want to keep. After setting both paths, hit the *Generate Graph* button to generate the files. This may take some time depending on the size of the data and the number of nodes and edges, but it usually finishes in under 15 seconds.
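Because choosing an output path truncates the file, a guard like the one below could help when scripting around the program. This is a sketch, not something the program does: `write_graph_file` and its `overwrite` flag are hypothetical, and only the `;` separator and `.` decimal settings mirror the `to_csv` calls in `data/dataManager.py`.

```python
from pathlib import Path

import pandas as pd


def write_graph_file(frame: pd.DataFrame, path: str, overwrite: bool = False) -> Path:
    """Write a node or edge table, refusing to clobber a non-empty file."""
    target = Path(path)
    if target.exists() and target.stat().st_size > 0 and not overwrite:
        raise FileExistsError(f"{target} already contains data")
    # Same separator and decimal settings as generateNodeFile/generateEdgeFile.
    frame.to_csv(target, index=False, sep=";", decimal=".")
    return target
```

Calling `write_graph_file(nodes, "nodes.csv")` a second time on the same path raises `FileExistsError` unless `overwrite=True` is passed.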

## Known Bugs

These are the known bugs:
* Settings menus not acting correctly before they have been clicked a few times.
* Edges are only removed visually, but are not actually removed.
* Nodes are only removed visually, but are not actually removed.

## Plan moving forward

The plan is to keep fixing and developing the frontend, hopefully switch over to [eel](https://github.com/python-eel/Eel) (a Python library for using HTML and JS as the GUI of an app), and add more options for storing metadata in the nodes and edges.

## How to contribute

If you find something you want to change, please feel free to create a pull request with your changes. If you do not have the time to implement the changes yourself, you can open an issue so that we can add it to the development plan.
73 changes: 73 additions & 0 deletions data/data.py
@@ -0,0 +1,73 @@
import pandas as pd

class Data:
    __path: str
    __name: str
    __type: str
    __df: pd.DataFrame
    __loaded: bool
    __error: bool

    def __init__(self, path: str) -> None:
        self.__path = path
        self.__name = path.split('/')[-1].split(".")[0]
        self.__type = path.split('/')[-1].split(".")[1]
        self.__loaded = False
        self.__error = False
        pass

    @property
    def path(self) -> str:
        return self.__path

    @property
    def name(self) -> str:
        return self.__name

    @property
    def type(self) -> str:
        return self.__type

    @property
    def df(self) -> pd.DataFrame:
        return self.__df

    @property
    def loaded(self) -> bool:
        return self.__loaded

    @property
    def error(self) -> bool:
        return self.__error

    def __str__(self) -> str:
        return f"path: {self.__path}, name: {self.__name}, type: {self.__type}"

    def __repr__(self) -> str:
        return self.__str__()

    def __eq__(self, __o: object) -> bool:
        if not isinstance(__o, Data): return False
        if (self.__path != __o.path): return False
        return True

    def __hash__(self) -> int:
        return hash(self.__path)

    def loadData(self) -> None:
        self.__error = False
        if self.__type == "csv":
            try:
                self.__df = pd.read_csv(self.__path, delimiter=';', decimal=',', dtype='string')
                if (self.__df.shape[1] == 1):
                    self.__df = pd.read_csv(self.__path, delimiter=',', decimal='.', dtype='string')
            except Exception as e:
                print(e)
                print("could not load")
                self.__error = True
        elif self.__type == "xlsx":
            self.__df = pd.read_excel(self.__path)
            pass
        elif self.__type == "json":
            self.__df = pd.read_json(self.__path)
        return
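The CSV branch of `loadData` first tries a semicolon-delimited read and falls back to a comma-delimited read when only one column comes out. A minimal, self-contained sketch of that fallback is below; it reads from an in-memory string rather than a path so it can run without files, and `read_csv_with_fallback` is a name made up for this example.

```python
import io

import pandas as pd


def read_csv_with_fallback(text: str) -> pd.DataFrame:
    """Try ';' as the delimiter first, then ',' if only one column was found."""
    df = pd.read_csv(io.StringIO(text), delimiter=";", decimal=",", dtype="string")
    if df.shape[1] == 1:
        # A single column usually means the delimiter guess was wrong.
        df = pd.read_csv(io.StringIO(text), delimiter=",", decimal=".", dtype="string")
    return df


semicolon_df = read_csv_with_fallback("name;score\nAnn;1,5\n")
comma_df = read_csv_with_fallback("name,score\nAnn,1.5\n")
print(list(semicolon_df.columns), list(comma_df.columns))
```

One consequence of this heuristic is that a genuinely single-column, semicolon-style file is read a second time with a comma delimiter. Since every column is loaded with `dtype='string'`, the `decimal` argument does not change the stored values.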
88 changes: 88 additions & 0 deletions data/dataManager.py
@@ -0,0 +1,88 @@
from typing import Set, Union
from data.data import Data
from data.edge import Edge
from data.node import Node
from data.edgeDef import EdgeDef
from data.nodeDef import NodeDef

import pandas as pd

class DataManager:
    __nodeDefs: Set[NodeDef]
    __edgeDefs: Set[EdgeDef]
    __nodes: Set[Node]
    __edges: Set[Edge]
    __data: Set[Data]

    def __init__(self) -> None:
        self.__nodeDefs = set()
        self.__edgeDefs = set()
        self.__nodes = set()
        self.__edges = set()
        self.__data = set()
        return

    @property
    def data(self) -> Set[Data]:
        return self.__data

    @property
    def nodeDefs(self) -> Set[NodeDef]:
        return self.__nodeDefs

    @property
    def edgeDefs(self) -> Set[EdgeDef]:
        return self.__edgeDefs

    def addData(self, data: Data) -> None:
        self.__data.add(data)
        return

    def findData(self, path: str, name: str, type: str) -> Union[Data, None]:
        for d in self.__data:
            if d.name == name and d.path == path and d.type == type:
                return d

    def removeData(self, data: Data) -> None:
        self.__data.remove(data)
        return

    def addNodeDef(self, d: NodeDef) -> None:
        self.__nodeDefs.add(d)
        return

    def removeNodeDef(self, d: NodeDef) -> None:
        self.__nodeDefs.remove(d)
        return

    def findNodeDef(self, field: str) -> Union[NodeDef, None]:
        for n in self.__nodeDefs:
            print(f"field: {n.field}, inField: {field}")
            if n.field == field:
                return n
        return None

    def addEdgeDef(self, d: EdgeDef) -> None:
        self.__edgeDefs.add(d)
        return

    def removeEdgeDef(self, d: EdgeDef) -> None:
        self.__edgeDefs.remove(d)
        return

    def generateData(self) -> None:
        [n.createNodes(list(self.__data)[0]) for n in self.__nodeDefs]
        [e.createEdges(list(self.__data)[0]) for e in self.__edgeDefs]
        return

    def generateNodeFile(self, path: str) -> None:
        [self.__nodes.update(d.nodes) for d in self.__nodeDefs]
        df = pd.DataFrame.from_records([n.as_dict for n in self.__nodes])
        df.to_csv(path, index=False, sep=';', decimal='.')
        return

    def generateEdgeFile(self, path: str) -> None:
        [self.__edges.update(d.edges) for d in self.__edgeDefs]
        df = pd.DataFrame.from_records([e.as_dict for e in self.__edges])
        df.to_csv(path, index=False, sep=';', decimal='.')
        return
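`generateNodeFile` collects nodes from every definition into a set, turns each node's `as_dict` into a record, and writes a `;`-separated CSV. The sketch below walks through that pattern in isolation. `SketchNode` is a hypothetical stand-in, since the real `Node` class is not shown in this part of the commit, and the output goes to an in-memory buffer instead of a file.

```python
import io
from dataclasses import dataclass

import pandas as pd


@dataclass(frozen=True)
class SketchNode:
    """Hypothetical stand-in for Node: hashable, with an as_dict property."""

    id: str
    label: str

    @property
    def as_dict(self) -> dict:
        return {"Id": self.id, "Label": self.label}


# Two batches that overlap, as could happen when several definitions yield the same node.
batches = [
    [SketchNode("a", "A"), SketchNode("b", "B")],
    [SketchNode("a", "A")],
]

nodes = set()
for batch in batches:
    nodes.update(batch)  # the set drops the duplicate "a" node

# Sets are unordered, so sort to get a stable file (the original does not sort).
df = pd.DataFrame.from_records(
    [n.as_dict for n in sorted(nodes, key=lambda n: n.id)]
)

buffer = io.StringIO()
df.to_csv(buffer, index=False, sep=";", decimal=".")
print(buffer.getvalue())
```

Deduplication through a set relies on the node type defining equality and hashing consistently, which the frozen dataclass provides in this sketch.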