Bug fix - to handle "u: list of array-like, shape (n_samples, n_co… #605

Open
wants to merge 1 commit into base: master

Conversation

@YaadR commented Feb 23, 2025

Bug fix - to handle "u: list of array-like, shape (n_samples, n_control_features)" input - I'll elaborate more in the PR comment section

As documented, model.fit() optionally accepts a 'list of array-like, shape' input. When X_train is given this way, all of the associated data must be in sequence (list) form as well, e.g. [t_train], [x_dot] and [u_train]. The problem that arises with u_train is twofold: a reshape step in the code handles the list case poorly, and another section calls X_train.shape, a numpy.array() attribute that does not exist on the Python 'list' type. Both fixes allow the code to run correctly and smoothly, and do not affect other library features.

In the code below, x_train_1 and x_train_2 have different lengths, specifically to demonstrate the use of a Python 'list' rather than a numpy.array(), which is constrained to a uniform (rectangular) shape.
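The PR diff itself is not shown in this thread, so the snippet below is only a rough sketch of the kind of guard described above, using a hypothetical helper name (reshape_control_per_trajectory is not part of PySINDy): check explicitly for the list case and reshape each trajectory's control input on its own, so that .shape is never called on a plain Python list.

import numpy as np

def reshape_control_per_trajectory(u):
    # Hypothetical sketch, not the actual patch: if u is a list of array-like
    # control inputs (one entry per trajectory), reshape each entry separately
    # instead of calling .shape on the list itself.
    if isinstance(u, (list, tuple)):
        return [np.asarray(u_i).reshape(np.asarray(u_i).shape[0], -1) for u_i in u]
    u = np.asarray(u)
    return u.reshape(u.shape[0], -1)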

The code that reproduces the problem

#!/usr/bin/env python3
import numpy as np # numpy==1.26.4
import pysindy as ps # pysindy==1.7.5

def main():
    # 1. Create sample data to mimic your shape conditions.
    #    Two trajectories: (2, 101) and (2, 100).
    t_train_1 = np.linspace(0, 1, 101)
    t_train_2 = np.linspace(0, 1, 100)

    x_train_1 = np.vstack([
        np.sin(2*np.pi*t_train_1),
        np.cos(2*np.pi*t_train_1)
    ])  # shape (2, 101)

    x_train_2 = np.vstack([
        np.sin(2*np.pi*t_train_2),
        np.cos(2*np.pi*t_train_2)
    ])  # shape (2, 100)

    # Create x_dot data for both trajectories with a smoothed finite difference
    sfd = ps.SmoothedFiniteDifference(smoother_kws={'window_length': 25})

    x_dot_1 = np.zeros_like(x_train_1)
    for i in range(x_train_1.shape[0]):
        x_dot_1[i, :] = sfd._differentiate(x_train_1[i, :], t_train_1[1] - t_train_1[0])

    x_dot_2 = np.zeros_like(x_train_2)
    for i in range(x_train_2.shape[0]):
        x_dot_2[i, :] = sfd._differentiate(x_train_2[i, :], t_train_2[1] - t_train_2[0])

    x_dot = [x_dot_1.T, x_dot_2.T]


    # Optional: Control input (u_train) for each trajectory
    #           shape matches time dimension
    u_train_1 = np.zeros_like(t_train_1)
    u_train_2 = np.zeros_like(t_train_2)

    # Combine into lists to represent multiple trajectories
    x_train = [x_train_1, x_train_2]
    u_train = [u_train_1, u_train_2]

    # Simple time step (dt) taken from the first trajectory
    dt = t_train_1[1] - t_train_1[0]

    # Example feature library (you can choose any)
    feature_library = ps.PolynomialLibrary(degree=2)

    # For demonstration, define a single optimizer:
    from pysindy.optimizers import STLSQ
    selected_optimizers = {
        "STLSQ_example": {
            "class": STLSQ,
            "params": {
                "alpha": 0.1,
                "threshold": 0.1,
                "fit_intercept": True
            }
        }
    }

    # Check if x_train is a list => multiple trajectories
    xu_list = isinstance(x_train, list)

    def run_selected_optimizers(selected_opts):
        if not selected_opts:
            print("Please select at least one optimizer.")
            return

        models_scores = {}
        models_errors = {}

        # Example function to compute "prediction error" (stub)
        # pred_state, state_data shapes must match in time dimension.
        def compute_prediction_error(pred_state, state_data):
            # Just a demo for RMS error
            state_data = state_data[:, :pred_state.shape[1]]
            return [
                np.sqrt(np.mean((pred - true) ** 2))
                for (pred, true) in zip(pred_state, state_data)
            ]

        # 3. Loop over each optimizer
        for name, opt_data in selected_opts.items():
            optimizer_class = opt_data["class"]
            optimizer_params = opt_data["params"]

            # 4. Initialize and fit the SINDy model
            model = ps.SINDy(
                optimizer=optimizer_class(**optimizer_params),
                feature_library=feature_library
            )
            model.fit(
                x=x_train,
                t=dt,
                x_dot=x_dot,                # Pre-computed derivatives (passed explicitly here)
                u=u_train,                 # Control inputs
                multiple_trajectories=xu_list,
            )

            # 5. Print model to console
            print(f"\n===== Trained Model: {name} =====")
            model.print()

    # 6. Finally, run the optimizers
    run_selected_optimizers(selected_optimizers)


if __name__ == "__main__":
    main()

The problem:
(Screenshot of the resulting error, captured 2025-02-23.)

@giopapanas

Thank you @YaadR for your fix here. I raised issue #611; do you think it relates to your bug fix? In brief, when I run a toy experiment with model.fit() on a 1D X, the fit runs fine. However, as I explain in the discussion linked above, the model gives me an error when I load a multi-dimensional X.

By the way, do you know if I need to supply the [x_dot] and [u_train] data myself? I think PySINDy loads [x_dot] and [u_train] by default if you specify the differentiation method and the library to use? Thank you in advance.
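For reference, a minimal sketch of a fit call that omits x_dot, assuming the configured differentiation method computes the derivatives internally; control inputs u still have to be supplied explicitly by the caller. The values and parameters here are illustrative only, not taken from the PR.

import numpy as np
import pysindy as ps

t = np.linspace(0, 1, 101)
x = np.column_stack([np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)])  # (101, 2)
u = np.zeros_like(t)                                                 # (101,)

# No x_dot is passed: the differentiation_method computes it from x and t.
model = ps.SINDy(
    differentiation_method=ps.FiniteDifference(),
    feature_library=ps.PolynomialLibrary(degree=2),
    optimizer=ps.STLSQ(threshold=0.1, alpha=0.1),
)
model.fit(x, t=t, u=u)
model.print()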

@YaadR (Author) commented Mar 25, 2025

Hi @giopapanas, to my understanding #611 does not stem from the same bug.
