[FEATURE] Incorporate Time-Series Cross-Validation Support #141

Closed
1 task done
VaishnaviChelagola opened this issue Oct 20, 2024 · 6 comments
Labels
enhancement New feature or request gssoc-ext GSSoC'24 Extended Version hacktoberfest Hacktober Collaboration hacktoberfest-accepted Hacktoberfest 2024 level2 25 Points 🥈(GSSoC)

Comments

@VaishnaviChelagola

Is this a unique feature?

  • I have checked "open" AND "closed" issues and this is not a duplicate

Is your feature request related to a problem/unavailable functionality? Please describe.

The stock price prediction model currently uses "train_test_split" to split the data randomly, which is not well suited to time-series data. This approach ignores the sequential nature of stock prices: later observations can end up in the training set while earlier ones are used for testing, which may result in data leakage and inaccurate model evaluation.
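To make the leakage concern concrete, here is a minimal sketch on a made-up index of 20 trading days (the array `days` is purely illustrative and not taken from the project). It counts how many training days fall after the earliest test day when `train_test_split` shuffles, which is its default behaviour, and contrasts that with `shuffle=False`.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for 20 chronologically ordered trading days.
days = np.arange(20)

# Default behaviour: rows are shuffled before splitting.
train_days, test_days = train_test_split(days, test_size=0.25, random_state=0)

# Training days that come AFTER the earliest test day, i.e. "future" data
# the model would see before being evaluated on the past.
leaky = int((train_days > test_days.min()).sum())
print("training days later than the earliest test day:", leaky)

# With shuffle=False the split is a simple chronological cut.
ordered_train, ordered_test = train_test_split(days, test_size=0.25, shuffle=False)
print("last train day:", ordered_train.max(), "first test day:", ordered_test.min())
```

A single chronological cut avoids the overlap but yields only one evaluation window, which is part of the motivation for a multi-fold time-series splitter.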

Proposed Solution

To let the model split the data sequentially, I would like to add "TimeSeriesSplit" from scikit-learn. This approach preserves the temporal order by ensuring that each fold trains on past data and evaluates on future data.
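As a rough illustration of the proposal, the sketch below runs scikit-learn's `TimeSeriesSplit` over a placeholder array of 12 ordered samples (the name `prices` and the choice of `n_splits=3` are assumptions for the example, not the project's settings) and prints the index ranges of each fold.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Placeholder for 12 chronologically ordered observations (one feature column).
prices = np.arange(12).reshape(-1, 1)

tscv = TimeSeriesSplit(n_splits=3)
folds = list(tscv.split(prices))

for i, (train_idx, test_idx) in enumerate(folds):
    # Each fold trains on an expanding window of earlier rows
    # and tests on the block of rows that immediately follows.
    print(f"fold {i}: train {train_idx[0]}..{train_idx[-1]}, "
          f"test {test_idx[0]}..{test_idx[-1]}")
```

Every test block lies strictly after its training window, so no fold is evaluated on data older than what it was trained on.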

Screenshots

No response

Do you want to work on this issue?

Yes

If "yes" to above, please explain how you would technically implement this (issue will not be assigned if this is skipped)

I'll change the dataset splitting procedure to take advantage of 'TimeSeriesSplit' and adjust the model training to accommodate multiple splits. To demonstrate how 'TimeSeriesSplit' enhances model performance on stock price data, I will also present comprehensive comparison metrics (such as RMSE and MAE) before and after the implementation.
Steps:
1. Modify the data splitting logic to use TimeSeriesSplit.
2. Train the model on each split and calculate evaluation metrics.
3. Compare the results with the current random data split method.
4. Provide detailed documentation on how this feature improves the accuracy of predictions on time-series data.
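Steps 1 to 3 could be sketched roughly as follows. Everything here is an assumption for illustration: the synthetic random-walk `prices`, the single lag feature, the `LinearRegression` model, and the helper `evaluate` are stand-ins, not the project's actual data or model.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import TimeSeriesSplit, train_test_split

# Synthetic random-walk "prices" (hypothetical; replace with the real dataset).
rng = np.random.default_rng(0)
prices = 100 + np.cumsum(rng.normal(0, 1, 300))
X = prices[:-1].reshape(-1, 1)  # feature: previous day's price
y = prices[1:]                  # target: next day's price

def evaluate(X_train, y_train, X_test, y_test):
    """Fit a fresh model and return (RMSE, MAE) on the test rows."""
    model = LinearRegression().fit(X_train, y_train)
    pred = model.predict(X_test)
    rmse = float(np.sqrt(mean_squared_error(y_test, pred)))
    mae = float(mean_absolute_error(y_test, pred))
    return rmse, mae

# Steps 1 and 2: sequential folds, one model per fold, metrics averaged.
ts_scores = [evaluate(X[tr], y[tr], X[te], y[te])
             for tr, te in TimeSeriesSplit(n_splits=5).split(X)]
ts_rmse, ts_mae = np.mean(ts_scores, axis=0)

# Step 3: the current approach, a single shuffled split, for comparison.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
rand_rmse, rand_mae = evaluate(X_tr, y_tr, X_te, y_te)

print(f"TimeSeriesSplit  RMSE={ts_rmse:.3f}  MAE={ts_mae:.3f}")
print(f"Random split     RMSE={rand_rmse:.3f}  MAE={rand_mae:.3f}")
```

The numbers from this toy setup say nothing about the real model; the sketch only shows how both evaluation schemes can share one scoring helper so the before/after comparison is like for like.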

@VaishnaviChelagola VaishnaviChelagola added the enhancement New feature or request label Oct 20, 2024
Contributor

Please make sure this issue is not a duplicate and is not already being worked on. Thanks for your time.

@rohitinu6
Owner

@VaishnaviChelagola, all the best.
Please make sure to star the repo.
Your contribution is highly appreciated.

@rohitinu6 rohitinu6 added gssoc-ext GSSoC'24 Extended Version hacktoberfest-accepted Hacktoberfest 2024 level2 25 Points 🥈(GSSoC) hacktoberfest Hacktober Collaboration labels Oct 20, 2024
@VaishnaviChelagola
Author

Thank You sir!!

@VaishnaviChelagola
Author

VaishnaviChelagola commented Oct 20, 2024 via email

@rohitinu6
Owner

https://www.linkedin.com/in/rohit-dubey-d/

I will be happy to help you
Let's connect 😊

Contributor

✅ This issue has been successfully closed. Thank you for your contribution and helping us improve the project! If you have any more ideas or run into other issues, feel free to open a new one. Happy coding! 🚀

Development

No branches or pull requests

3 participants