Skip to content
This repository has been archived by the owner on Oct 21, 2024. It is now read-only.

lemousehunter/SC3020-Project-1

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

42 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Introduction

This repository contains the codebase for a disk-based B+ tree index database system. The data is first read from games.txt (if the database file does not exist) into the storage module, after which datablocks mirroring those of physical datablocks are created, each containing the maximum of 149 records (may vary from system to system depending on how large the recordLocations map in the datablock header is. All datablocks are stored in the same database file. Following which, the B+ tree index is created if it does not exist. The range search query function allows users to search for records with fgPctHome that are between 0.5 to 0.8 (inclusive). Based on our own tests on our own machine, the results should be as follows:

--------------- B+ Tree Search Results ---------------
Number of index nodes accessed (internal, non-leaf node): 2
Number of data blocks accessed: 55
Number of results: 6902
Average FG3_PCT_home: 0.420801
Running time: 9400 microseconds
------------------------------------------------------

---------------- Linear Search Results ---------------
Number of data blocks accessed: 209
Number of results: 6902
Average FG3_PCT_home: 0.420801
Running time: 34307 microseconds
------------------------------------------------------

For the following B+ Tree:

----------------- B+ Tree Statistics -----------------
Order (maximum number of keys per node): 100
Height of the tree: 3
Content of root node (keys): 0.393 0.419 0.438 0.453 0.469 0.488 0.506 
B+ Tree Node Count:
Total nodes: 459
Internal nodes: 9
Leaf nodes: 450
------------------------------------------------------

and Storage system:

----------------- Storage Statistics -----------------
Total number of records: 26651
Number of datablocks: 209
Size of record: 26 bytes
Size of record (in memory): 32 bytes
Size of record (with header): 27 bytes
Size of datablock: 4096 bytes
Size of datablock header: 614 bytes
Size of available space in datablock: 3482 bytes
Max Number of Records per Datablock: 128
Max used space in each Datablock: 4070 bytes
Unused space in each Datablock: 26 bytes
------------------------------------------------------

The schema of a datablock stored on disk (in the database file) is as follows:

┌───────────────────────────────────────────────────────┐
│             Datablock Schema (4096 bytes)             │
╞═══════════════════════════════════════════════════════╡
│Header                                 614 bytes       │
├────────────────┬──────────────────────────────────────┤
│id              │      unsigned short  2   bytes       │
│maxSize         │      unsigned short  2   bytes       │
│currentSize     │      unsigned short  2   bytes       │
│recordCount     │      unsigned short  2   bytes       │
│recordLocations │      unordered map   606 bytes       │
╞════════════════╧══════════════════════════════════════╡
│Record                                 27  bytes       │
╞═══════════════════════════════════════════════════════╡
│Record Header                          1   bytes       │
├────────────────┬──────────────────────────────────────┤
│size            │      int             1   bytes       │
╞════════════════╪══════════════════════════════════════╡
│gameDate        │      int             4   bytes       │
│teamId          │      int             4   bytes       │
│ptsHome         │      unsigned char   1   bytes       │
│fgPctHome       │      float           4   bytes       │
│ftPctHome       │      float           4   bytes       │
│fg3PctHome      │      float           4   bytes       │
│astHome         │      unsigned char   1   bytes       │
│rebHome         │      unsigned char   1   bytes       │
│homeTeamWins    │      bool            1   bytes       │
│recordId        │      unsigned short  2   bytes       │
╞════════════════╧══════════════════════════════════════╡
│                          ...                          │
│                          ...                          │
│                      Other Records                    │
│                          ...                          │
│                          ...                          │
╞═══════════════════════════════════════════════════════╡
│               Unused Space (26 bytes)                 │
└───────────────────────────────────────────────────────┘

Compiled with g++:

Apple clang version 15.0.0 (clang-1500.3.9.4)
Target: arm64-apple-darwin23.6.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin

Initially, we faced some discrepancies with the timing (linear search being much faster than B+ tree search), which should not be the case. After some investigations, we found out that this was because of the extra time taken by the file seekg() and read() operations that B+ tree used but linear search did not, as our original implementation of linear search loaded the entire datablock into memory instead of reading it by offsets like B+ tree search was.

Pre-requsites:

  1. Ensure G++ compiler is installed

    For Windows: Reference: Using GCC with MinGW

    a. Installing the MinGW-w64 toolchain here.

    b. In the wizard, choose your desired Installation Folder. Record this directory for later. In most cases, the recommended directory is acceptable. The same applies when you get to setting the start menu shortcuts step. When complete, ensure the Run MSYS2 now box is checked and select Finish. This will open a MSYS2 terminal window for you.

    c. In the MSYS2 UCRT64 environment (not MinGW64 or MinGW32) terminal paste this command to download the MinGW-w64 toolchain.

    pacman -S --needed base-devel mingw-w64-ucrt-x86_64-toolchain
    

    d. Accept the default number of packages in the toolchain by pressing Enter.
    image

    e. Press Y when prompted to proceed with the installation.

    f. Add the path of your MinGW-w64 bin folder to the Windows PATH environment variable by using the following steps:

    • f.1. In the Windows search bar, type Settings to open your Windows Settings.

    • f.2. Search for Edit environment variables for your account.

    • f.3. In your User variables, select the Path variable and then select Edit.

    • f.4. Select New and add the MinGW-w64 destination folder you recorded during the installation process to the list. If you used the default settings above, then this will be the path: C:\msys64\ucrt64\bin.

      f.5. Select OK, and then select OK again in the Environment Variables window to update the PATH environment variable. You have to reopen any console windows for the updated PATH environment variable to be available

    g. To check if g++, GCC, gdb has been installed correctly, open a new command prompt (CMD) and paste in these lines

    gcc --version
    g++ --version
    gdb --version


    For Mac:

    Via MacPorts:

    sudo port install gcc9

    Via Homebrew:

    brew install gcc


    For Linux:

    sudo apt-get update && sudo apt-get install g++

How to Run

Please download the correct version based on your machine OS

File Structure:

.../SC3020-Lab1
│
├─────── Mac
│         ├─────── include
│         │          ├──...
│         │          │
│         │         ...
│         │
│         ├─────── src
│         │          ├──...
│         │          │
│         │         ...
│         │
│         ├─────── Makefile
│         ├─────── games.txt
│
├───────Windows
│         ├─────── include
│         │          ├──...
│         │          │
│         │         ...
│         │
│         ├─────── src
│         │          ├──...
│         │          │
│         │         ...
│         │
│         ├─────── Makefile
│         ├─────── games.txt
  1. cd into the root directory of this git repository SC3020-Lab1

  2. cd into the directory corresponding to your Operating System:

    For Windows

    • First open the MSYS2 UCRT64 environment terminal:
      image

      Note: If you cannot find the file just open the search bar next to the Windows logo on the taskbar and search for MYSYS2 UCRT64
      image

    cd Windows # or path to the corresponding to Windows

    For Mac/Linux:

    cd Mac
  3. Make Clean

    make clean
  4. Make

    make
  5. Run the compiled executable file Assuming you are in the root folder of this repository:

    For Windows:

    bin/bplustree.exe

    For Linux / Mac:

    bin/bplustree

    Disclaimer: No tests was run on a Linux machine and hence no guarantees can be made that the code can be executed on Linux machines.

Gotchas

  1. You will need to cd to the root directory of this github repository before running the bin/plustree unix executable file, otherwise you will get the error Error: Unable to open file: games.txt
  2. Always ensure you make clean before running the executable file for the first time

About

Btree Project

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published