Course sequence listings #16

Open
4 tasks
wllmwu opened this issue Sep 9, 2022 · 1 comment
Assignees
Labels
maintenance (Issues related to developer experience), scraping

Comments

@wllmwu
Owner

wllmwu commented Sep 9, 2022

Description

  • Properly handle listings for course sequences when scraping the catalog (multiple courses put together in one listing)

Tasks

  • Detect multi-course listings in postprocessor
  • Extract separate course codes from title
  • Extract separate units from title if possible
  • Store an entry for each course code - duplicate all other course information
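
The tasks above could be sketched roughly as follows. This is only an illustration, not the project's postprocessor: the title shape (`DEPT 200A-B-C. Name (4-4-4)`), the dictionary keys `title`, `code`, and `units`, and the function name `split_sequence_listing` are all assumptions made for the example.

```python
import re
from copy import deepcopy

# Assumed (hypothetical) title shape: "<DEPT> <NUM><A>-<B>-<C>. <Name> (<u1>-<u2>-<u3>)"
# The real catalog format may differ; adjust the pattern accordingly.
SEQUENCE_TITLE = re.compile(
    r"^(?P<dept>[A-Z]+) (?P<num>\d+)(?P<suffixes>[A-Z](?:-[A-Z])+)\. "
    r"(?P<name>.+?)(?: \((?P<units>\d+(?:-\d+)*)\))?$"
)


def split_sequence_listing(listing: dict) -> list[dict]:
    """Return one entry per course code found in a multi-course listing."""
    match = SEQUENCE_TITLE.match(listing["title"])
    if match is None:
        # Not detected as a multi-course listing: leave it unchanged.
        return [listing]

    suffixes = match["suffixes"].split("-")
    units = match["units"].split("-") if match["units"] else []

    entries = []
    for index, suffix in enumerate(suffixes):
        # Duplicate all other course information for each course code.
        entry = deepcopy(listing)
        entry["code"] = f"{match['dept']} {match['num']}{suffix}"
        # Only assign per-course units when the title gives one value per course.
        if len(units) == len(suffixes):
            entry["units"] = units[index]
        entries.append(entry)
    return entries
```

Listings whose titles do not match the assumed pattern pass through untouched, so a step like this could be run over every scraped listing.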
@wllmwu wllmwu added the scraping and maintenance (Issues related to developer experience) labels Sep 9, 2022
@wllmwu wllmwu self-assigned this Sep 9, 2022
@wllmwu wllmwu moved this to Triage in William Wu's projects Sep 9, 2022
@wllmwu wllmwu added this to the v1.2.0 milestone Dec 22, 2022
@wllmwu wllmwu moved this from Triage to Todo in William Wu's projects Dec 22, 2022
@wllmwu wllmwu moved this from Todo to In Progress in William Wu's projects Dec 26, 2022
@wllmwu wllmwu moved this from In Progress to Triage in William Wu's projects Dec 26, 2022
@wllmwu wllmwu removed this from the v1.2.0 milestone Dec 26, 2022
@wllmwu
Owner Author

wllmwu commented Dec 26, 2022

This currently doesn't seem to be worth the effort. There are fewer than 70 multi-course listings in the catalog, most of them are graduate courses with little or no connection to other courses, and the catalog descriptions tend to refer to e.g. "these courses" or "this sequence", which could become confusing if the courses were separated. Removing from the v1.2.0 milestone but leaving the issue open for now.

Projects
Status: Triage