Expanded row metadata for graph format #130
Key Question
Node files: No comment
Edge files: Not applicable
Types of walks
walks from/to "central/incidental" nodes
[[ node schema (in progress) ]]
[[ Edge schema ]]
[[ Walk schema (in progress) ]]
[[ The walk file ]]
Related issues
Issues #126 #122 #125 #102 #124

@MatthewRalston thinks the path forward towards a graph format is in creating additional structural definitions. If I think through the relationships preserved among the different incomplete and completely self-referential formats, they require associated metadata schemas, plus a utility function that takes a table or metadata schema as input and generates a consistently hashable representation (the metadata header format, its parser, and the table parsing functionality, as in these modules)... and references, i.e. "the format(s)" and associated schemas...

This utility function wouldn't be part of the algorithm per se, but it would be incident to what is produced by virtue of the file-metadata-log (and this version-dataset pairing). That's mostly contained in our and tying that to a git sha256 hash, should be preserved with all nodes of a given walk or path.
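To make the "consistently hashable representation" idea concrete, here is a minimal sketch, not the project's implementation. The function name `metadata_digest` and the example keys (`version`, `k`, `files`) are hypothetical; the only assumption is that the metadata can be expressed as JSON-serializable values.

```python
import hashlib
import json


def metadata_digest(metadata: dict) -> str:
    """Return a sha256 hex digest that does not depend on key order.

    Hypothetical helper: the metadata is serialized to canonical JSON
    (sorted keys, fixed separators) before hashing, so two headers with
    the same content always produce the same digest.
    """
    canonical = json.dumps(metadata, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Illustrative headers only; the real schema is still in planning.
header_a = {"version": "0.0.1", "k": 12, "files": ["example.fa"]}
header_b = {"files": ["example.fa"], "k": 12, "version": "0.0.1"}

# Same content, different key order: the digests match.
assert metadata_digest(header_a) == metadata_digest(header_b)
```

Sorting keys before hashing is one way to keep the digest stable across parsers; list order inside values is still significant in this sketch.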
This issue has been tabled for the time being in favor of a cleaner UI and experience on the user end.

1. Interface overhaul (issue #132)
I want the user to understand the output and even ASCII styling (in the absence of a rich.py dependency, which isn't needed).

output_dir
I want the logfile and output directories (required to collect .kdb, .kdbg, .stats.txt, output.log etc).

usage, steps, and features
I want the expanded help and usage statements, including the 'features' and 'steps', developed further.

minimal STDOUT
And finally, I want the STDOUT to be extremely minimal and/or non-existent in the profile and graph commands. OR the formatting should display the resulting stats clearly apart from the header.

README "2.0" (issue #137)
Finally, a README overhaul.
Okay, I've been working on some other features and needed documentation/UI overhauls. Delays pushed the deadline back a few months. I'm reprioritizing the assembly algorithm and possible numba/Python etc. implementations of D2 metrics, with more odds-ratio stuff on the horizon, more literature review, and the start of a report and lit review on applications of kmer count matrices and distances to metagenomics and microbiomes.
Key Question
[[ walk file ]]
schema concepts
for format versions of course...
[ minimal walks ]
solutional path
[[ solutional path file ]]
The header metadata will carry the source and the parameters, plus a walk id (a sha256 of the walk) for an associated walk file and a walk name (given at "runtime" via the CLI). This may be 0 to represent an unspecific or unqualified walk (origin unclear).
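A rough sketch of what such a header could look like, under stated assumptions: the `WalkHeader` class, its field names, and the choice to hash a comma-joined list of k-mer ids are all hypothetical, and the sketch assumes the "0" placeholder applies to the walk id when no walk is given. The schema itself is still in progress.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class WalkHeader:
    """Hypothetical header record; field names are illustrative only."""
    source: str       # input the walk was derived from
    parameters: dict  # parameters used to produce the walk
    walk_id: str      # sha256 hex digest of the walk, or "0" if unqualified
    walk_name: str    # name supplied at runtime via the CLI


def walk_digest(kmer_ids) -> str:
    """sha256 over the ordered k-mer ids of a walk (assumed serialization)."""
    payload = ",".join(str(int(i)) for i in kmer_ids).encode("ascii")
    return hashlib.sha256(payload).hexdigest()


def make_header(source, parameters, walk_name, kmer_ids=None) -> WalkHeader:
    """Build a header; a missing or empty walk gets the placeholder id "0"."""
    has_walk = kmer_ids is not None and len(kmer_ids) > 0
    walk_id = walk_digest(kmer_ids) if has_walk else "0"
    return WalkHeader(source, parameters, walk_id, walk_name)


qualified = make_header("example.kdbg", {"k": 12}, "demo", [6, 24, 33])
unqualified = make_header("example.kdbg", {"k": 12}, "demo")
```

Because the digest covers the ids in order, reversing a walk yields a different walk id in this sketch.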
#126 #122 #125 #102 #124
sidenote
The neighbor structure 🌪️ is manifested by particular kmer IDs 🌬️, which may be accessed from the kmer arrays loaded alongside the edge list during a path-producing process.
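As an illustration of how neighbors can fall out of the ids alone, here is a small sketch. It assumes a 2-bit encoding (A=0, C=1, G=2, T=3) with the first base in the most significant bits; that encoding and the function name `neighbor_ids` are assumptions for the example, not something this issue specifies.

```python
def neighbor_ids(kmer_id: int, k: int):
    """Return (left, right) neighbor ids of a k-mer under a 2-bit encoding.

    Assumed encoding: A=0, C=1, G=2, T=3, first base most significant.
    Right neighbors share the k-1 suffix; left neighbors share the k-1 prefix.
    """
    mask = (1 << (2 * k)) - 1
    suffix = (kmer_id << 2) & mask  # drop the first base, make room for a new last base
    prefix = kmer_id >> 2           # drop the last base
    right = [suffix | base for base in range(4)]
    left = [prefix | (base << (2 * (k - 1))) for base in range(4)]
    return left, right


# "ACG" with k=3 encodes to 0*16 + 1*4 + 2 = 6 under the assumed encoding.
left, right = neighbor_ids(6, 3)
```

In a path-producing process these candidate ids would still need to be checked against the loaded edge list, since not every possible neighbor is observed.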
A working pipeline would carry all components of the workflow on to the next step, but all commands are currently partial. Schemas are in the planning stage for a future release.