Introduction to Natural Language Processing

This course covers the fundamentals of natural language processing (NLP), including text representation, language modeling, and NLP tasks and paradigms. Case studies on open-source tools are used to illustrate techniques and trade-offs.

Objectives

The main goal of this basic-level course is to provide an excursion into research in NLP, emphasizing text processing, language modeling, text embeddings, and NLP tasks through programming projects. Upon successful completion of this course, the student should be able to:

  • Model text as vectors.
  • Use text processing algorithms.
  • Design learning approaches for NLP tasks.

Grading Scheme

The final grade will be determined by the grades in assignments, exams, and participation as follows:

  • 65%: Assignments
    • 15%: Homeworks
    • 50%: Projects
  • 25%: Final Exam
  • 10%: Participation
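
As a rough illustration of how these weights combine, the sketch below computes a final grade on a 0-100 scale. The function and variable names are hypothetical, and it assumes every component is scored out of 100 and that homeworks and projects are averaged with equal weight within their group.

```python
# Weights from the grading scheme above; they sum to 1.0.
WEIGHTS = {
    "homeworks": 0.15,
    "projects": 0.50,
    "final_exam": 0.25,
    "participation": 0.10,
}


def mean(scores):
    """Average of equally weighted scores, each out of 100."""
    return sum(scores) / len(scores)


def final_grade(homeworks, projects, final_exam, participation):
    """Weighted final grade on a 0-100 scale (illustrative helper)."""
    components = {
        "homeworks": mean(homeworks),
        "projects": mean(projects),
        "final_exam": final_exam,
        "participation": participation,
    }
    return sum(WEIGHTS[name] * score for name, score in components.items())


# Hypothetical student: three homeworks, one project, exam, participation.
print(final_grade([90, 80, 70], [85], 60, 100))
```

For this hypothetical student the components contribute 12, 42.5, 15, and 10 points respectively, for a total of 79.5.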

Homeworks

Students will complete multiple homeworks during the course. These assignments are designed to reinforce the lectures and reading materials, and their due dates are posted on this website. Each homework will be graded out of a total of 100 points, and all homeworks count equally toward the homeworks portion of the final grade.

Projects

Students will complete programming projects during the course. The projects are cumulative, i.e., students need to successfully complete each project in order to complete the next one, and we will not release solutions for the projects. Each project will be graded out of a total of 100 points, and all projects count equally toward the projects portion of the final grade.

Exams

There will be one in-class exam during the course: a final exam at the end of the semester. The exam will be based on the mandatory readings and topics discussed in class.

Participation

Students must be actively involved in classes and perform the required activities. There are several ways of earning participation credit, including attending lectures, being punctual, completing feedback surveys, and behaving considerately, such as not disturbing other students. Participation will be graded out of a total of 100 points, which determines the participation portion of the final grade.

Rules and Honor Code

Assignments Policies

Assignments and exams should be completed independently by each student, and any program code should always be appropriately commented. Students will be held responsible for all information presented in the assignments, and they must strictly follow the instructions provided for each assignment. Students should hand in assignments on time and should not wait until the last minute to begin. Starting early will give students ample time to ask questions and obtain assistance. We recommend reserving late days for legitimate emergencies only.

Late submissions lose a percentage of the original overall points for the assignment: each late day costs 25%, so after 4 days the grade will be zero. Each late day constitutes a 24-hour extension, including all weekend days and holidays. Students cannot split late days into smaller increments. For instance, a submission that is 1 minute late will count as one day late.
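
The late-day rule can be sketched as follows. This is a minimal illustration rather than the official grading script: the function names are hypothetical, and it assumes the 25% per day is taken from the score the submission originally earned.

```python
import math
from datetime import datetime

PENALTY_PER_DAY = 0.25  # 25% of the original points per late day
SECONDS_PER_DAY = 24 * 60 * 60


def late_days(due, submitted):
    """Whole late days; any fraction of a day counts as a full day."""
    seconds_late = (submitted - due).total_seconds()
    if seconds_late <= 0:
        return 0
    return math.ceil(seconds_late / SECONDS_PER_DAY)


def penalized_score(score, due, submitted):
    """Score after the late penalty, never dropping below zero."""
    factor = max(0.0, 1.0 - PENALTY_PER_DAY * late_days(due, submitted))
    return score * factor


# Hypothetical deadline and a submission one minute past it.
due = datetime(2019, 3, 31, 23, 59)
one_minute_late = datetime(2019, 4, 1, 0, 0)
print(late_days(due, one_minute_late))
print(penalized_score(80, due, one_minute_late))
```

Rounding up with `math.ceil` is what makes a submission that is one minute late count as a full late day, and the `max(0.0, ...)` clamp keeps the score at zero from the fourth late day onward.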

In extreme circumstances, such as medical emergencies, we will grant no-penalty extensions. Please be prepared to provide written documentation, e.g., a doctor's note.

Honesty Policies

Students should not look for assignment answers elsewhere. The use of pre-existing code is allowed as long as it is properly acknowledged. Students who demonstrably violate the Academic Honesty policy presented in the PUC Minas Student Guide will receive a failing grade in the course, and the case will be reported to the Administrative Board of the School, which could require suspension from all future work. Prohibited behaviors include:

  • copying all or part of another person's work, even if you subsequently modify it
  • viewing all or part of another student's work
  • showing all or part of your work to another student
  • consulting solutions from past semesters, or those found in books or on the Web

Not knowing or misunderstanding the rules, running out of time, submitting "the wrong version", or being overwhelmed with multiple demands are not acceptable excuses. There are no excuses for failure to uphold policies. Plagiarism checkers, such as the Moss system, may be used to screen submitted programs for plagiarism. Over the years, we have unfortunately had to fail students for copying on assignments. To avoid problems, limit any discussion of assignments with other students to clarification of the requirements or definitions of the problems, or to understanding the existing programs or general course material. Never discuss issues directly relevant to problem solutions.

Schedule

This is a tentative schedule that lets students see what is coming up or what they will miss if absent, but changes can happen. Details of the schedule, materials, and reading lists will be updated as the course progresses. Periodic reading assignments from recent research articles will be given and should be read before the corresponding lecture. Most lectures will use slides so that students can focus on understanding the material during class and have less need to take notes. However, simply reading the slides is no substitute for attending class, where additional explanation and discussion are presented.

Lectures

| # | Date | Topic or Activity | Material |
|---|------|-------------------|----------|
| 1 | Mar 3 | First Lecture: course goals, activities and schedule | slide |
| 2 | Mar 10 | Introduction to NLP | slide, videos |
| 3 | Mar 17 | Text representation and vector space | slide, videos |
| 4 | Mar 24 | Text preprocessing | slide, videos |
| 5 | Mar 31 | Language modeling | slide, videos |
| 6 | Apr 7 | Language modeling | slide, videos |
| 7 | Apr 14 | Text classification | slide, videos |
| 8 | Apr 28 | NLP tasks and paradigms | slide, videos |
| 9 | May 5 | Tagging | slide, videos |
| 10 | May 12 | Sequence labeling | slide, videos |
| 11 | May 19 | Language generation | slide, videos |
| 12 | May 26 | Semantic similarity | slide, videos |
| 13 | Jun 2 | Introduction to neural networks | slide, videos |
| 14 | Jun 9 | Neural NLP | slide, videos |
| 15 | Jun 16 | Word embeddings | slide, videos |
| 16 | Jun 23 | Sentence embeddings | slide, videos |
| 17 | Jun 30 | Seminars | slide, videos |
| 18 | Jul 7 | Final exam | slide, videos |

Assignments

| ID | Assignment | Type | Release Date | Due Date | Solution |
|----|------------|------|--------------|----------|----------|
| AS01 | Text preprocessing | homework | Mar 10, 2019 | Mar 31, 2019 @ 23:59 | Solution |
| AS02 | Text classification | homework | Mar 31, 2019 | Apr 28, 2019 @ 23:59 | Solution |
| AS03 | Entity recognition | homework | Apr 28, 2019 | May 19, 2019 @ 23:59 | Solution |
| AS04 | Final project | project | Mar 3, 2019 | Jun 29, 2019 @ 23:59 | - |