Showing 8 changed files with 40 additions and 15 deletions.
@@ -1,10 +1,12 @@
 ---
-title: "Chapter 7.4: Tasks & Performance"
+title: "Chapter 07.04: Tasks & Performance"
 weight: 7004
 ---

 GPT-3 has X-shot (zero-, one- and few-shot) abilities, meaning it is able to perform tasks with minimal or even no task-specific training data. This chapter provides an overview of various tasks and illustrates the X-shot capabilities of GPT-3. Additionally, you will be introduced to relevant benchmarks.

 <!--more-->

+### Lecture Slides
+
 {{< pdfjs file="https://github.com/slds-lmu/lecture_dl4nlp/blob/main/slides/chapter07-gpt/slides-74-tasks.pdf" >}}
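To make the X-shot idea in the chapter summary above concrete, here is a minimal Python sketch of how zero-shot and few-shot prompts differ: the task description stays fixed and only the number of in-context demonstrations changes. The sentiment task, labels, and demonstrations are invented for illustration; a real setup would send the resulting prompt to a language model and read off its completion.

```python
# Minimal sketch: zero-shot vs. few-shot prompts for a toy sentiment task.
# The demonstrations and the query are hypothetical placeholders.

DEMONSTRATIONS = [
    ("The movie was a delight from start to finish.", "positive"),
    ("I regret buying this phone.", "negative"),
]

def build_prompt(query: str, k: int = 0) -> str:
    """Build a k-shot prompt: k in-context examples followed by the query."""
    lines = ["Classify the sentiment of the review as positive or negative.", ""]
    for review, label in DEMONSTRATIONS[:k]:
        lines += [f"Review: {review}", f"Sentiment: {label}", ""]
    lines += [f"Review: {query}", "Sentiment:"]
    return "\n".join(lines)

print(build_prompt("The plot was predictable.", k=0))  # zero-shot
print(build_prompt("The plot was predictable.", k=2))  # few-shot (two-shot)
```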
@@ -1,12 +1,14 @@
 ---
-title: "Chapter 7.5: Discussion: Ethics and Cost"
+title: "Chapter 07.05: Discussion: Ethics and Cost"
 weight: 7005
 ---

-In discussing GPT-3's ethical implications, it's crucial to consider its potential societal impact, including issues surrounding bias, misinformation, and data privacy. With its vast language generation capabilities, GPT-3 has the potential to disseminate misinformation at scale, posing risks to public trust and safety. Additionally, the model's reliance on large-scale pretraining data raises concerns about reinforcing existing biases present in the data, perpetuating societal inequalities. Furthermore, the use of GPT-3 in sensitive applications such as content generation, automated customer service, and decision-making systems raises questions about accountability, transparency, and unintended consequences. As such, responsible deployment of GPT-3 requires careful consideration of ethical guidelines, regulatory frameworks, and robust mitigation strategies to address these challenges and ensure the model's ethical use in society.
+In discussing GPT-3's ethical implications, it is crucial to consider its potential societal impact, including issues surrounding bias, misinformation, and data privacy. With its vast language generation capabilities, GPT-3 has the potential to disseminate misinformation at scale, posing risks to public trust and safety. Additionally, the model's reliance on large-scale pretraining data raises concerns about reinforcing existing biases present in the data, perpetuating societal inequalities. Furthermore, the use of GPT-3 in sensitive applications such as content generation, automated customer service, and decision-making systems raises questions about accountability, transparency, and unintended consequences. As such, responsible deployment of GPT-3 requires careful consideration of ethical guidelines, regulatory frameworks, and robust mitigation strategies to address these challenges and ensure the model's ethical use in society.


 <!--more-->
-{{< video id="TfrSKiOecWI" >}}
+<!--{{< video id="TfrSKiOecWI" >}}-->

 ### Lecture Slides

 {{< pdfjs file="https://github.com/slds-lmu/lecture_dl4nlp/blob/main/slides/chapter07-gpt/slides-75-discussion.pdf" >}}
@@ -1,10 +1,17 @@
 ---
-title: "Chapter 8.1: Instruction Fine-Tuning"
+title: "Chapter 08.01: Instruction Fine-Tuning"
 weight: 8001
 ---

-In this chapter we introduce instruction-tuning, which is a technique that allows us to adapt the models to follow instructions.
+Instruction fine-tuning aims to enhance the adaptability of large language models (LLMs) by providing explicit instructions or task descriptions, enabling more precise control over model behavior and adaptation to diverse contexts.
+This approach involves fine-tuning LLMs on task-specific instructions or prompts, guiding the model to generate outputs that align with the given instructions. By conditioning the model on explicit instructions, instruction fine-tuning facilitates more accurate and tailored responses, making LLMs more versatile and effective in various applications such as language translation, text summarization, and question answering.

 <!--more-->

-{{< pdfjs file="https://github.com/slds-lmu/lecture_dl4nlp/blob/main/slides/chapter8-multilinguality/slides-81-why_multilingual.pdf" >}}
+### Lecture Slides
+
+{{< pdfjs file="https://github.com/slds-lmu/lecture_dl4nlp/blob/main/slides/chapter08-llm/slides-81-instruction-tuning.pdf" >}}
+
+### Additional Resources
+
+- [Blog about Instruction Fine-Tuning](https://heidloff.net/article/instruct-tuning-large-language-models/)
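As a rough illustration of what "fine-tuning LLMs on task-specific instructions or prompts" looks like on the data side, the sketch below formats instruction-input-output triples into training strings with a simple prompt template. The Alpaca-style template and the example triple are assumptions chosen for illustration, not the specific format used in the lecture.

```python
# Sketch: turning (instruction, input, output) triples into supervised
# fine-tuning text. The Alpaca-style template is one common, assumed choice.

from dataclasses import dataclass

TEMPLATE = (
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)

@dataclass
class Example:
    instruction: str
    input: str
    output: str

def to_training_text(ex: Example) -> str:
    """Concatenate the templated prompt and the target response."""
    return TEMPLATE.format(instruction=ex.instruction, input=ex.input) + ex.output

example = Example(
    instruction="Summarize the text in one sentence.",
    input="Instruction fine-tuning adapts a pretrained LLM to follow natural-language task descriptions.",
    output="Instruction fine-tuning teaches a pretrained LLM to follow task descriptions.",
)
print(to_training_text(example))
```

In practice the language-modeling loss is often computed only on the response tokens rather than on the whole string, but the data layout stays the same.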
@@ -1,9 +1,16 @@
 ---
-title: "Chapter 8.2: Chain-of-thought Prompting"
+title: "Chapter 08.02: Chain-of-thought Prompting"
 weight: 8002
 ---

-In this session we cover Chain-of-thoght Prompting, which is a technique to improve the performance of models without requiring additional training.
+Chain of thought (CoT) prompting [1] is a prompting method that encourages Large Language Models (LLMs) to explain their reasoning. This method contrasts with standard prompting by not only seeking an answer but also requiring the model to explain its steps to arrive at that answer. By guiding the model through a logical chain of thought, CoT prompting encourages the generation of more structured and cohesive text, enabling LLMs to produce more accurate and informative outputs across various tasks and domains.
 <!--more-->

-{{< pdfjs file="https://github.com/slds-lmu/lecture_dl4nlp/blob/main/slides/chapter8-multilinguality/slides-82-multilingual-wordembs.pdf" >}}
+### Lecture Slides
+
+{{< pdfjs file="https://github.com/slds-lmu/lecture_dl4nlp/blob/main/slides/chapter08-llm/slides-82-chain-of-thought.pdf" >}}
+
+### References
+
+- [1] [Wei et al., 2022](https://arxiv.org/abs/2201.11903)
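To illustrate the contrast described in the new chapter intro, here is a small sketch of a standard few-shot prompt versus a chain-of-thought few-shot prompt for the same arithmetic word problem. The exemplar, its worked reasoning, and the query are invented for illustration and only mimic the prompt style studied in Wei et al., 2022.

```python
# Sketch: standard few-shot prompt vs. chain-of-thought (CoT) few-shot prompt.
# In the CoT version the exemplar spells out intermediate reasoning steps
# before the final answer, which the model is then encouraged to imitate.

EXEMPLAR_Q = "A farmer has 3 crates of 8 apples and gives away 5 apples. How many apples are left?"
EXEMPLAR_ANSWER = "19"
EXEMPLAR_COT = ("3 crates of 8 apples is 3 * 8 = 24 apples. "
                "Giving away 5 leaves 24 - 5 = 19. The answer is 19.")
QUERY = "A library has 4 shelves of 12 books and lends out 9 books. How many books remain?"

def standard_prompt(query: str) -> str:
    return f"Q: {EXEMPLAR_Q}\nA: {EXEMPLAR_ANSWER}\n\nQ: {query}\nA:"

def cot_prompt(query: str) -> str:
    return f"Q: {EXEMPLAR_Q}\nA: {EXEMPLAR_COT}\n\nQ: {query}\nA:"

print(standard_prompt(QUERY))
print()
print(cot_prompt(QUERY))
```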
@@ -1,9 +1,15 @@
 ---
-title: "Chapter 8.3: Emergent Abilities"
+title: "Chapter 08.03: Emergent Abilities"
 weight: 8003
 ---
-Various researchers have reported that LLMs seem to have emergent abilities. In this section we discuss the concept of emergence in LLMs.
+Various researchers have reported that large language models (LLMs) seem to have emergent abilities: new abilities that appear suddenly as the models are scaled up. In this section we introduce the concept of emergent abilities and discuss a potential counterargument to the concept of emergence.

 <!--more-->

-{{< pdfjs file="https://github.com/slds-lmu/lecture_dl4nlp/blob/main/slides/chapter8-multilinguality/slides-83-multilingual-transformers.pdf" >}}
+### Lecture Slides
+
+{{< pdfjs file="https://github.com/slds-lmu/lecture_dl4nlp/blob/main/slides/chapter08-llm/slides-83-emergent-abilities.pdf" >}}
+
+### Additional Resources
+
+- [Article: Large Language Models' Emergent Abilities Are a Mirage](https://www.wired.com/story/how-quickly-do-large-language-models-learn-unexpected-skills/)
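The counterargument alluded to above, and covered in the linked article, is roughly that apparent emergence can be an artifact of discontinuous evaluation metrics rather than of the model itself. The toy calculation below, with invented numbers, illustrates the point: per-token accuracy improves smoothly with scale, yet exact-match accuracy on a 10-token answer (per-token accuracy raised to the 10th power, assuming independent token errors) looks like a sudden jump.

```python
# Toy illustration (invented numbers): a smooth gain in per-token accuracy can
# look like an "emergent" jump under a strict exact-match metric that requires
# every token of the answer to be correct.

ANSWER_LENGTH = 10  # tokens that must all be correct for an exact match

# Hypothetical per-token accuracies for models of increasing scale.
per_token_accuracy = [0.50, 0.60, 0.70, 0.80, 0.90, 0.95, 0.99]

for p in per_token_accuracy:
    exact_match = p ** ANSWER_LENGTH  # assumes token errors are independent
    print(f"per-token accuracy {p:.2f} -> exact-match accuracy {exact_match:.3f}")

# Exact match stays near zero (0.001, 0.006, 0.028, 0.107) and then shoots up
# (0.349, 0.599, 0.904), even though per-token accuracy improved smoothly.
```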