omdena-nic-nepal-classroom-c5f4f9-data-science-assignment-2-data-science-libraries-assignment-2 created by GitHub Classroom

ru-bee/data-science-assignment-2-ru-bee

 
 

data-science-libraries-assignment-2

Section 1:

Task 1: Setup and DataFrame Creation

  • Install Pandas (if not already installed).
  • Import Pandas and any other necessary libraries.
  • Create a DataFrame from a dictionary and from a list of dictionaries.
  • Load the provided dataset from the CSV file messed_dataset.csv.
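
One possible starting point for Task 1 is sketched below. Pandas can usually be installed with `pip install pandas`. The sample names and scores are hypothetical and only illustrate the two constructors; the CSV is read only if `messed_dataset.csv` is present in the working directory, and nothing is assumed about its columns.

```python
from pathlib import Path

import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
data_dict = {"name": ["Asha", "Bikram", "Chandra"], "score": [88, 92, 79]}

# From a dictionary: keys become columns, lists become column values.
df_from_dict = pd.DataFrame(data_dict)

# From a list of dictionaries: each dictionary becomes one row.
records = [
    {"name": "Asha", "score": 88},
    {"name": "Bikram", "score": 92},
    {"name": "Chandra", "score": 79},
]
df_from_records = pd.DataFrame(records)

# Load the assignment CSV only if it exists next to the notebook.
csv_path = Path("messed_dataset.csv")
messed_df = pd.read_csv(csv_path) if csv_path.exists() else None

print(df_from_dict)
print(df_from_records)
```

Both constructors produce the same table here; the list-of-dictionaries form tends to be convenient when rows arrive one at a time.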

Task 2: Viewing and Inspecting Data

  • Display the first and last few rows of the DataFrame.
  • Get a summary of the DataFrame, including basic statistics and data types.
  • Display the shape and column names of the DataFrame.
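
The inspection steps in Task 2 could look roughly like the following sketch. It uses a small hypothetical DataFrame (cities and temperatures) rather than the assignment dataset, so the same calls would need to be applied to your own DataFrame.

```python
import pandas as pd

# Hypothetical data standing in for the assignment dataset.
df = pd.DataFrame(
    {
        "city": ["Kathmandu", "Pokhara", "Lalitpur", "Biratnagar", "Butwal"],
        "temp_c": [20.0, 22.0, 24.0, 26.0, 28.0],
    }
)

first_rows = df.head(2)   # first two rows (default is five)
last_rows = df.tail(2)    # last two rows

df.info()                 # prints column dtypes and non-null counts
stats = df.describe()     # basic statistics for numeric columns

shape = df.shape          # (number of rows, number of columns)
columns = list(df.columns)

print(first_rows)
print(last_rows)
print(stats)
print(shape, columns)
```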

Task 3: Selection and Indexing

  • Select a single column and multiple columns.
  • Select rows by index and by label.
  • Select specific rows and columns using loc and iloc.
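
A minimal sketch of the selection patterns in Task 3, assuming a hypothetical DataFrame with string labels as its index. Note that `loc` slices by label and includes the end label, while `iloc` slices by position and excludes the end position.

```python
import pandas as pd

# Hypothetical data with a labelled index.
df = pd.DataFrame(
    {
        "name": ["Asha", "Bikram", "Chandra", "Dipa"],
        "age": [23, 31, 27, 35],
        "city": ["Kathmandu", "Pokhara", "Lalitpur", "Biratnagar"],
    },
    index=["a", "b", "c", "d"],
)

single_column = df["age"]                 # one column -> Series
multiple_columns = df[["name", "city"]]   # list of columns -> DataFrame

row_by_position = df.iloc[0]              # first row, by integer position
row_by_label = df.loc["c"]                # row whose index label is "c"

# Rows "b" through "c" (inclusive) and two named columns.
subset_loc = df.loc["b":"c", ["name", "age"]]

# The same cells by position: rows 1-2 and columns 0-1 (end exclusive).
subset_iloc = df.iloc[1:3, 0:2]

print(subset_loc)
```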

Task 4: Handling Missing Data

  • Identify missing values in the DataFrame.
  • Drop rows with missing values.
  • Fill missing values with a specified value.
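
The three missing-data steps in Task 4 might be sketched as follows. The products and prices are hypothetical, and the fill values ("unknown" and 0.0) are arbitrary choices for illustration; a real analysis may call for a mean, median, or domain-specific default instead.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in each column.
df = pd.DataFrame(
    {
        "product": ["pen", "book", None, "bag"],
        "price": [10.0, np.nan, 5.0, 40.0],
    }
)

# Identify missing values: a boolean mask and a per-column count.
missing_mask = df.isna()
missing_per_column = df.isna().sum()

# Drop every row that contains at least one missing value.
dropped = df.dropna()

# Fill missing values with a specified value per column.
filled = df.fillna({"product": "unknown", "price": 0.0})

print(missing_per_column)
print(dropped)
print(filled)
```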

Task 5: Data Operations

  • Add a new column to the DataFrame.
  • Delete a column from the DataFrame.
  • Rename columns in the DataFrame.
  • Apply a function to a column.
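
As a rough illustration of Task 5, the sketch below walks through the four operations on a hypothetical item table. The column names and the choice of `str.upper` as the applied function are assumptions made for the example.

```python
import pandas as pd

# Hypothetical data.
df = pd.DataFrame(
    {
        "item": ["pen", "book", "bag"],
        "price": [10.0, 25.0, 40.0],
        "qty": [3, 2, 1],
    }
)

# Add a new column computed from existing columns.
df["total"] = df["price"] * df["qty"]

# Delete a column; drop returns a new DataFrame.
df = df.drop(columns=["qty"])

# Rename a column with an old-name -> new-name mapping.
df = df.rename(columns={"item": "product"})

# Apply a function to every value in a column.
df["product"] = df["product"].apply(str.upper)

print(df)
```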

Task 6: GroupBy Operations

  • Group the DataFrame by a column and calculate summary statistics.
  • Iterate over groups and display the group names and data.
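
Task 6 could be approached along these lines. The regions and sales figures are hypothetical; `groupby` on a single column name yields each group's key as a plain value when iterated.

```python
import pandas as pd

# Hypothetical sales data.
df = pd.DataFrame(
    {
        "region": ["East", "West", "East", "West", "East"],
        "sales": [100, 200, 150, 250, 50],
    }
)

# Group by one column and compute several summary statistics at once.
summary = df.groupby("region")["sales"].agg(["count", "sum", "mean"])
print(summary)

# Iterate over the groups: each step yields the group name and its rows.
group_sizes = {}
for name, group in df.groupby("region"):
    print(f"Group: {name}")
    print(group)
    group_sizes[name] = len(group)
```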

Task 7: Merging and Joining DataFrames

  • Merge two DataFrames on a common column.
  • Join two DataFrames using their indices.

Task 8: Working with Dates and Times

  • Create a datetime index for the DataFrame.
  • Convert a column to datetime and extract date components.
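
A small sketch for Task 8, assuming a hypothetical column of ISO-formatted date strings. Dates in other formats may need the `format` argument of `pd.to_datetime`.

```python
import pandas as pd

# Hypothetical data with dates stored as strings.
df = pd.DataFrame(
    {
        "date": ["2024-01-15", "2024-02-20", "2024-03-05"],
        "value": [10, 20, 30],
    }
)

# Convert the string column to datetime values.
df["date"] = pd.to_datetime(df["date"])

# Extract date components through the .dt accessor.
df["year"] = df["date"].dt.year
df["month"] = df["date"].dt.month
df["weekday"] = df["date"].dt.day_name()

# Use the datetime column as the index (a DatetimeIndex).
indexed = df.set_index("date")

# A DatetimeIndex allows partial-string selection, e.g. one month.
february = indexed.loc["2024-02"]

print(indexed)
print(february)
```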

Task 9: Input and Output

  • Read data from a CSV file into a DataFrame.
  • Write the DataFrame to a CSV file.
  • Read data from an Excel file into a DataFrame. The provided file is SaleData.xlsx.
  • Write the DataFrame to an Excel file.
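
The read and write calls in Task 9 can be tried out with a round trip through temporary files, as in this sketch. The data and the file names `example.csv` and `example.xlsx` are hypothetical; for the assignment you would pass the provided file paths instead. Writing and reading `.xlsx` files with pandas typically relies on the `openpyxl` package, so the Excel part is skipped here if it is not installed.

```python
import importlib.util
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical data.
df = pd.DataFrame({"region": ["East", "West"], "units": [5, 7]})

with tempfile.TemporaryDirectory() as tmp:
    # CSV: write without the index column, then read it back.
    csv_path = Path(tmp) / "example.csv"
    df.to_csv(csv_path, index=False)
    csv_roundtrip = pd.read_csv(csv_path)

    # Excel: only attempted when openpyxl is available.
    excel_roundtrip = None
    if importlib.util.find_spec("openpyxl") is not None:
        xlsx_path = Path(tmp) / "example.xlsx"
        df.to_excel(xlsx_path, index=False)
        excel_roundtrip = pd.read_excel(xlsx_path)

print(csv_roundtrip)
print(excel_roundtrip)
```

Passing `index=False` avoids writing the row index as an extra unnamed column, which otherwise shows up when the file is read back.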

Section 2: Reference to Friday Class on EDA

Task 10: Visualization

  • Create a simple plot using Matplotlib.
  • Create a bar plot using Seaborn.
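
For Task 10, one way to produce both plots side by side is sketched below. The monthly sales values are hypothetical, and placing the two plots in a single figure is a layout choice for the example rather than a requirement of the task. In a Jupyter notebook the figure is normally rendered at the end of the cell.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data.
df = pd.DataFrame({"month": ["Jan", "Feb", "Mar"], "sales": [120, 150, 90]})

# One figure with two axes: a Matplotlib line plot and a Seaborn bar plot.
fig, (ax_line, ax_bar) = plt.subplots(1, 2, figsize=(10, 4))

# Simple Matplotlib plot.
ax_line.plot(df["month"], df["sales"], marker="o")
ax_line.set_title("Sales by month (line)")
ax_line.set_xlabel("month")
ax_line.set_ylabel("sales")

# Seaborn bar plot drawn on the second axes.
sns.barplot(data=df, x="month", y="sales", ax=ax_bar)
ax_bar.set_title("Sales by month (bar)")

fig.tight_layout()
```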

Instructions

  1. Fork this repository.
  2. Complete the tasks in assignment.ipynb.
  3. Commit and push your changes to your forked repository.
  4. Submit the link to your repository.

Tasks

  • Task 1: Load the provided dataset into a Pandas DataFrame.
  • Task 2: Perform basic data cleaning.
  • Task 3: Perform data analysis to answer the provided questions.
  • Task 4: Visualize the results using Matplotlib/Seaborn.
