Performed EDA: Handled Missing Values and Outliers in Stock Price Data #148
…apped outliers using IQR method for stock price data.
Ensure the PR matches the requirements mentioned in the Contribution guide. The maintainer might get in touch to ensure quality. Thanks for your time.
I’ve handled missing values and outliers and added some documentation.
Looks good to me!
@Mayureshd-18 Thanks for the approval! Please add the gssoc-ext, hacktoberfest, hacktoberfest-accepted, and suitable level labels, and kindly merge it to close the issue. @rohitinu6
@rohitinu6 @Mayureshd-18
@RB137 The assigned level 1 is appropriate given the root objective, which is the refinement of existing code. Higher levels are reserved for new algos and procedures.
🎉🎉 Thank you for your contribution! Your PR #148 has been merged! 🎉🎉
I have handled the data quality issues in the stock price dataset and fixed the missing-value and outlier problems mentioned in issue #137:
Fixes: #137
Handling Missing Values:
Used forward fill (ffill) to fill missing values in the Open, High, Low, Close, Adj Close, and Volume columns, ensuring data continuity for time-series analysis.
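For readers unfamiliar with forward fill, here is a minimal plain-Python sketch of the idea, not the code from this PR. In pandas the same effect usually comes from `DataFrame.ffill()`. The `forward_fill` helper and the sample prices are hypothetical and only illustrate the behaviour.

```python
from typing import List, Optional, Sequence


def forward_fill(values: Sequence[Optional[float]]) -> List[Optional[float]]:
    """Replace each missing value (None) with the most recent known value.

    Leading gaps stay None because there is no earlier value to copy.
    """
    filled: List[Optional[float]] = []
    last_seen: Optional[float] = None
    for value in values:
        if value is not None:
            last_seen = value  # remember the latest real observation
        filled.append(last_seen)
    return filled


# Hypothetical closing prices; None marks a missing trading day.
close = [None, 101.5, None, None, 103.0, None]
filled_close = forward_fill(close)
print(filled_close)  # [None, 101.5, 101.5, 101.5, 103.0, 103.0]
```

Note that a gap at the very start of the series is left unfilled, so it may still need separate handling (for example a backward fill or dropping those rows).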
Outlier Treatment:
Applied the Interquartile Range (IQR) method to detect and cap outliers in the stock price columns, preventing distortion from extreme values and creating a more reliable dataset.
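The capping step can be sketched as follows in plain Python. This is an illustration under stated assumptions rather than the PR's actual code: the fence multiplier `k=1.5` is the conventional choice and is not stated in the description, and `cap_outliers_iqr` plus the sample prices are hypothetical. In pandas the equivalent is typically `Series.quantile` followed by `Series.clip`.

```python
import statistics
from typing import List, Sequence


def cap_outliers_iqr(values: Sequence[float], k: float = 1.5) -> List[float]:
    """Clamp values to the range [Q1 - k*IQR, Q3 + k*IQR].

    k=1.5 is an assumed, conventional multiplier.
    """
    # method="inclusive" interpolates linearly between data points.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    # Values outside the fences are pulled back to the nearest fence.
    return [min(max(v, lower), upper) for v in values]


# Hypothetical prices with one extreme value.
prices = [100.0, 101.0, 102.0, 103.0, 104.0, 250.0]
capped = cap_outliers_iqr(prices)
print(capped)  # [100.0, 101.0, 102.0, 103.0, 104.0, 107.5]
```

Capping keeps every row in place, which matters for time series where dropping rows would leave gaps in the date index.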
These fixes enhance the dataset's quality, making it ready for further analysis and predictive modeling.
Before:
After:
Added documentation to enhance code readability: