Commit

updated readme for dev section and example for brick which appears to be ready for writeup
bbartling committed Aug 20, 2024
1 parent a5315af commit 9b30920
Showing 12 changed files with 586 additions and 88 deletions.
6 changes: 3 additions & 3 deletions .github/workflows/ci.yml
@@ -14,12 +14,12 @@ jobs:

    steps:
      - name: Checkout code
-       uses: actions/checkout@v4 # Using the latest version of the checkout action
+       uses: actions/checkout@v4

      - name: Set up Python
-       uses: actions/setup-python@v4 # Using the latest version of the setup-python action
+       uses: actions/setup-python@v4
        with:
-         python-version: '3.12' # Specify the Python version you are using
+         python-version: '3.12.3'

      - name: Install dependencies
        run: |
25 changes: 21 additions & 4 deletions README.md
@@ -13,15 +13,13 @@ This is a Python-based Fault Detection and Diagnostics (FDD) tool for running fa


## Getting Setup
- * Some features may be broken or not work as expected while the project is undergoing a significant makeover to become installable from PyPI. The aim is to streamline the reporting processes and make them much easier to use. I appreciate your patience during this transition.

- This project is on PyPI now so get setup with this command using the Python package manager called pip.
+ This project is now available on PyPI, making it easy to set up with the Python package manager, pip. You can install the package using the following command:

```bash
pip install open-fdd
```

- See the `examples` directory for Jupyter notebook tutorials.
+ For running Jupyter notebooks, I recommend using Visual Studio Code with the Jupyter notebook extension, which offers a seamless experience directly in the editor. Be sure to explore the `examples` directory for Jupyter notebook tutorials. If you have your own FDD experiences to share, feel free to contribute a notebook (`.ipynb`). You're welcome to reach out to me directly, and I can push your example to GitHub on your behalf, which may be simpler than submitting a pull request (PR) if you're just sharing an example rather than developing `open-fdd`.

## Project goals
These are some basic project goals to make this into an interactive FDD application.
@@ -35,7 +33,26 @@ These are some basic project goals to make this into an interactive FDD applicat
- [ ] create SQL example to read data from time series db and write back to SQL to then read faults in Grafana.
- [ ] other?


## Contribute

If you have suggestions for improving developer best practices or workflows, please feel free to reach out to me directly or open a Git issue/discussion. I primarily work on Windows with multiple versions of Python installed, with Python 3.12.x as my default. You can download the latest version of Python here:
* https://www.python.org/downloads/

1. **Adding New Faults and Reports:**
Developers will need the dev tools: `> py -3.12 -m pip install black pytest`. When adding new faults and reports, I usually run `> py -3.12 -m pip install .` in the cloned project directory. I continuously uninstall with `> py -3.12 -m pip uninstall open-fdd` and reinstall locally until I'm satisfied with the changes.

2. **Testing Fault Logic:**
All fault logic is rigorously tested using `pytest`. You can run the tests with `> py -m pytest`.

3. **Formatting with Black:**
To ensure code consistency, I use Black for formatting. Run `> py -m black .` to format the code and `> py -m black --check .` to verify it.

4. **Pushing to GitHub:**
After making changes, and once the steps above pass, push them to GitHub in a pull request. The GitHub Actions workflow will automatically run `pytest` and `black` to verify the build. The full loop is sketched after this list.


This project is a community-driven initiative, focusing on the development of free and open-source tools. I believe that Fault Detection and Diagnostics (FDD) should be free and accessible to anyone who wants to try it out, embodying the spirit of open-source philosophy. Additionally, this project aims to serve as an educational resource, empowering individuals to learn about and implement FDD in their own systems. As someone wisely said, `"Knowledge should be shared, not hoarded,"` and this project strives to put that wisdom into practice.

Got any ideas or questions? Submit a Git issue or start a Discussion...
Empty file added brick_timeseries.db
46 changes: 27 additions & 19 deletions examples/brick_model_and_sqlite/1_make_db.py
@@ -48,39 +48,47 @@
)

# Step 4: Load the CSV data
- csv_file = r"C:\Users\bbartling\Documents\WPCRC_Master.csv"
+ csv_file = r"C:\Users\bbartling\Documents\WPCRC_July.csv"
df = pd.read_csv(csv_file)
print("df.columns", df.columns)

+ # Ensure that the 'timestamp' column is properly parsed as a datetime object
+ if "timestamp" in df.columns:
+     df["timestamp"] = pd.to_datetime(df["timestamp"])
+ else:
+     raise ValueError("The CSV file does not contain a 'timestamp' column.")
+
+ print("Starting step 5")

# Step 5: Insert CSV data into the TimeseriesData table
for column in df.columns:
-     for index, row in df.iterrows():
-         cursor.execute(
-             """
-             INSERT INTO TimeseriesData (sensor_name, timestamp, value)
-             VALUES (?, ?, ?)
-             """,
-             (column, index, row[column]),
-         )
-     print(f"Doing {column} in step 5")
+     if column != "timestamp":  # Skip the timestamp column itself
+         for index, row in df.iterrows():
+             cursor.execute(
+                 """
+                 INSERT INTO TimeseriesData (sensor_name, timestamp, value)
+                 VALUES (?, ?, ?)
+                 """,
+                 (column, row["timestamp"].strftime("%Y-%m-%d %H:%M:%S"), row[column]),
+             )
+         print(f"Doing {column} in step 5")

conn.commit()

+ print("Starting step 6")

# Step 6: Insert timeseries references based on sensor names
for column in df.columns:
-     cursor.execute(
-         """
-         INSERT INTO TimeseriesReference (timeseries_id, stored_at)
-         VALUES (?, ?)
-         """,
-         (column, "SQLite Timeseries Storage"),
-     )
-
-     print(f"Doing {column} in step 6")
+     if column != "timestamp":  # Skip the timestamp column itself
+         cursor.execute(
+             """
+             INSERT INTO TimeseriesReference (timeseries_id, stored_at)
+             VALUES (?, ?)
+             """,
+             (column, "SQLite Timeseries Storage"),
+         )
+
+         print(f"Doing {column} in step 6")

conn.commit()
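A side note on the pattern above: per-row `cursor.execute` inside `iterrows` is the slowest way to load a frame into SQLite. A minimal sketch of a batch alternative using `DataFrame.melt` plus `executemany`, assuming the same `TimeseriesData` table and CSV layout as the script above:

```python
import sqlite3

import pandas as pd

# Assumed paths and schema taken from the script above
conn = sqlite3.connect("brick_timeseries.db")
df = pd.read_csv(r"C:\Users\bbartling\Documents\WPCRC_July.csv")
df["timestamp"] = pd.to_datetime(df["timestamp"])

# Reshape wide sensor columns into long (sensor_name, timestamp, value) rows
long_df = df.melt(id_vars=["timestamp"], var_name="sensor_name", value_name="value")
long_df["timestamp"] = long_df["timestamp"].dt.strftime("%Y-%m-%d %H:%M:%S")

# One executemany call replaces the per-row execute loop
conn.executemany(
    "INSERT INTO TimeseriesData (sensor_name, timestamp, value) VALUES (?, ?, ?)",
    long_df[["sensor_name", "timestamp", "value"]].itertuples(index=False, name=None),
)
conn.commit()
conn.close()
```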
39 changes: 39 additions & 0 deletions examples/brick_model_and_sqlite/2_testout_step_1.py
@@ -0,0 +1,39 @@
import sqlite3
import pandas as pd

# Connect to the SQLite database
conn = sqlite3.connect("brick_timeseries.db")

# Query the data
query = """
SELECT sensor_name, timestamp, value
FROM TimeseriesData
WHERE sensor_name = 'HWR_value'
ORDER BY timestamp ASC
"""
df = pd.read_sql_query(query, conn)

# Convert the timestamp column to datetime if needed
df["timestamp"] = pd.to_datetime(df["timestamp"])

# Set the 'timestamp' column as the index
df.set_index("timestamp", inplace=True)

# Pivot the DataFrame to make sensor_name the columns and value the data
df_pivot = df.pivot(columns="sensor_name", values="value")

# Display the DataFrame
print(df_pivot.head())
print()

# Summarize the data pulled from SQL
print("SQL: ", df_pivot.describe())
print()

# Close the connection
conn.close()

# Just for fun, see if the CSV file looks any different
csv_file = r"C:\Users\bbartling\Documents\WPCRC_July.csv"
df = pd.read_csv(csv_file)
print("CSV: ", df["HWR_value"].describe())
@@ -42,13 +42,12 @@
    sensor_uri = URIRef(f"http://example.org/{timeseries_id.replace(' ', '_')}")

    # Adjust sensor type and unit based on sensor name
-     if "SaTempSP" in timeseries_id or "SaStatic" in timeseries_id:
-         if "SPt" in timeseries_id or "SPt" in timeseries_id:  # Adjust setpoint type
-             g.add((sensor_uri, RDF.type, brick.Supply_Air_Static_Pressure_Setpoint))
-             g.add((sensor_uri, brick.hasUnit, unit.Inch_Water_Column))
-         else:
-             g.add((sensor_uri, RDF.type, brick.Supply_Air_Static_Pressure_Sensor))
-             g.add((sensor_uri, brick.hasUnit, unit.Inch_Water_Column))
+     if "SaStaticSPt" in timeseries_id:
+         g.add((sensor_uri, RDF.type, brick.Supply_Air_Static_Pressure_Setpoint))
+         g.add((sensor_uri, brick.hasUnit, unit.Inch_Water_Column))
+     elif "SaStatic" in timeseries_id:
+         g.add((sensor_uri, RDF.type, brick.Supply_Air_Static_Pressure_Sensor))
+         g.add((sensor_uri, brick.hasUnit, unit.Inch_Water_Column))
    elif "Sa_FanSpeed" in timeseries_id:
        g.add((sensor_uri, RDF.type, brick.Supply_Fan_VFD_Speed_Sensor))
        g.add((sensor_uri, brick.hasUnit, unit.Percent))
@@ -58,7 +57,6 @@
        g.add(
            (sensor_uri, brick.hasUnit, unit.DEG_F)
        )  # Assuming degrees Fahrenheit, adjust if needed
-
    timeseries_ref_uri = URIRef(
        f"http://example.org/timeseries_{timeseries_id.replace(' ', '_')}"
    )
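The payoff of typing points with Brick classes, as the change above does, is that downstream tooling can discover points by class rather than by fragile column-name matching. A minimal rdflib sketch; the `brick_model.ttl` filename is an assumption, swap in wherever the script serializes its graph:

```python
from rdflib import Graph

g = Graph()
g.parse("brick_model.ttl", format="turtle")  # assumed serialized output of the model script

# Find every point typed as a static pressure sensor or setpoint
sparql = """
PREFIX brick: <https://brickschema.org/schema/Brick#>
SELECT ?point ?cls WHERE {
    ?point a ?cls .
    FILTER (?cls IN (brick:Supply_Air_Static_Pressure_Sensor,
                     brick:Supply_Air_Static_Pressure_Setpoint))
}
"""
for row in g.query(sparql):
    print(row.point, row.cls)
```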
98 changes: 62 additions & 36 deletions examples/brick_model_and_sqlite/4_mimic_grafana_query.py
@@ -1,13 +1,60 @@
- import sqlite3
- import pandas as pd
import matplotlib.pyplot as plt
+ import pandas as pd
+ import sqlite3
+ import matplotlib.dates as mdates


+ def plot_timeseries(df, filename="mimic_grafana_plot.png"):
+     fig, (ax1, ax2, ax3) = plt.subplots(3, 1, figsize=(25, 10))
+     fig.suptitle("HVAC Timeseries Data and Fault Detection")
+
+     # Plot Static Pressure Sensor and Setpoint on ax1
+     ax1.plot(df.index, df["SaStatic"], label="Static Pressure Sensor")
+     ax1.plot(df.index, df["SaStaticSPt"], label="Static Pressure Setpoint")
+     ax1.legend(loc="best")
+     ax1.set_ylabel("Inch WC")
+     # ax1.grid(True)
+
+     # Improve timestamp formatting for ax1
+     ax1.xaxis.set_major_locator(mdates.AutoDateLocator())
+     ax1.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d %H:%M"))
+     fig.autofmt_xdate(rotation=45)  # Rotate the labels for better readability
+
+     # Plot Fan Speed on ax2
+     ax2.plot(df.index, df["Sa_FanSpeed"], color="g", label="Fan Speed")
+     ax2.legend(loc="best")
+     ax2.set_ylabel("Fan Speed (%)")
+     # ax2.grid(True)
+
+     # Improve timestamp formatting for ax2
+     ax2.xaxis.set_major_locator(mdates.AutoDateLocator())
+     ax2.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d %H:%M"))
+
+     # Plot Fault Flag on ax3
+     ax3.plot(df.index, df["fc1_flag"], label="Fault Detected", color="k")
+     ax3.set_xlabel("Timestamp")
+     ax3.set_ylabel("Fault Flags")
+     ax3.legend(loc="best")
+     # ax3.grid(True)
+
+     # Improve timestamp formatting for ax3
+     ax3.xaxis.set_major_locator(mdates.AutoDateLocator())
+     ax3.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d %H:%M"))
+
+     # Rotate x-axis labels for all subplots to improve readability
+     fig.autofmt_xdate(rotation=45)
+
+     # Save the plot to a file
+     plt.tight_layout(rect=[0, 0.03, 1, 0.95])
+     plt.savefig(filename)
+     plt.close()


def query_timeseries_data(conn, start_time=None, end_time=None):
    query = """
    SELECT timestamp, sensor_name, value, fc1_flag
    FROM TimeseriesData
-     WHERE sensor_name IN ('Supply_Air_Static_Pressure_Sensor', 'Supply_Air_Static_Pressure_Setpoint', 'Supply_Fan_VFD_Speed_Sensor')
+     WHERE sensor_name IN ('Sa_FanSpeed', 'SaStatic', 'SaStaticSPt')
    """

    if start_time and end_time:
@@ -16,42 +63,19 @@ def query_timeseries_data(conn, start_time=None, end_time=None):
    df = pd.read_sql_query(query, conn)
    print(f"Retrieved {len(df)} records from the database.")

-     # Pivot the data to get one column per sensor
-     df_pivot = df.pivot(index="timestamp", columns="sensor_name", values="value")
-     df_pivot["fc1_flag"] = df["fc1_flag"]
-
-     return df_pivot
-
-
- def plot_timeseries(df, output_file=None):
-     plt.figure(figsize=(14, 7))
+     # Convert the 'timestamp' column to datetime
+     df["timestamp"] = pd.to_datetime(df["timestamp"])

-     # Plot each sensor's data
-     for column in df.columns:
-         if column != "fc1_flag":
-             plt.plot(df.index, df[column], label=column)
+     # Pivot the data to get one column per sensor
+     df_pivot = df.pivot_table(index="timestamp", columns="sensor_name", values="value")

-     # Highlight the times when a fault was detected
-     fault_times = df.index[df["fc1_flag"] == 1]
-     plt.scatter(
-         fault_times,
-         [df.loc[time, "Supply_Air_Static_Pressure_Sensor"] for time in fault_times],
-         color="red",
-         label="Fault Detected",
-         zorder=5,
-     )
+     # Add the fc1_flag back to the pivoted DataFrame, aligned with the timestamp
+     df_pivot["fc1_flag"] = df.groupby("timestamp")["fc1_flag"].first()

-     plt.title("HVAC Timeseries Data")
-     plt.xlabel("Timestamp")
-     plt.ylabel("Value")
-     plt.legend()
-     plt.grid(True)
+     # Set the 'timestamp' as the index
+     df_pivot = df_pivot.set_index(df_pivot.index)

-     if output_file:
-         plt.savefig(output_file)
-         print(f"Plot saved as {output_file}")
-     else:
-         plt.show()
+     return df_pivot


def main():
@@ -60,9 +84,11 @@ def main():

    # Step 2: Query the timeseries data
    df = query_timeseries_data(conn)
+     print(df)
+     print(df.columns)

    # Step 3: Plot the data
-     plot_timeseries(df, output_file="timeseries_plot.png")
+     plot_timeseries(df)

# Close the connection
conn.close()
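One detail worth calling out in the rewrite above: `pivot` raises a ValueError when the query returns duplicate (timestamp, sensor_name) pairs, while `pivot_table` aggregates them (mean by default), which is presumably why the commit switches to it; the `set_index(df_pivot.index)` line is effectively a no-op since `pivot_table` already indexes by timestamp. A toy sketch of the difference, with made-up values:

```python
import pandas as pd

# Two readings share the same timestamp/sensor pair
df = pd.DataFrame(
    {
        "timestamp": ["2024-07-01 00:00", "2024-07-01 00:00", "2024-07-01 00:05"],
        "sensor_name": ["SaStatic", "SaStatic", "SaStatic"],
        "value": [1.0, 1.2, 0.9],
    }
)

# pivot refuses to reshape duplicate index/column pairs
try:
    df.pivot(index="timestamp", columns="sensor_name", values="value")
except ValueError as exc:
    print("pivot failed:", exc)

# pivot_table aggregates the duplicates instead (mean by default)
print(df.pivot_table(index="timestamp", columns="sensor_name", values="value"))
```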