Showing 31 changed files with 12,023 additions and 1 deletion.
@@ -5,4 +5,5 @@
 _site
 .sass-cache
 .jekyll-metadata
-Gemfile.lock
+Gemfile.lock
+_posts/_contents/.Rhistory
@@ -0,0 +1,111 @@
---
toc: true
layout: category
permalink: /categories/FMEfficient/
taxonomy: FMEfficient
entries_layout: list
classes: wide
title: FMEfficient
desc: "Recent Readings for Efficiency of Foundation Models (since 2022)"
order: "0"
author_profile: true
sidebar:
  title: "Reviews Indexed"
  nav: sidebar-sample
---

<p><a name="topPage"></a></p>

<hr>
<h1 class="page-title">{{ page.desc }} (Index of Posts):</h1>

<table id="datatab3" summary="Table of Readings" border="1">
  <tr>
    <th>No.</th>
    <th>Read Date</th>
    <th>Title and Information</th>
    <th>We Read @</th>
  </tr>

  {% comment %} Index table: list this category's posts, newest first. {% endcomment %}
  {% assign counter = 0 %}
  {% assign sortedp = site.posts | sort: 'date' | reverse %}
  {% for post in sortedp %}
    {% if post.categories contains page.title %}
      {% assign counter = counter | plus: 1 %}
      <tr>
        <td>{{ counter }}</td>
        <td><span class="date">{{ post.date | date: "%Y, %-b, %-d" }}</span></td>
        <td><a href="{{ site.baseurl }}{{ post.url }}">{{ post.title }}</a></td>
        <td>{{ post.desc }}</td>
      </tr>
    {% endif %}
  {% endfor %}
</table>

<!-- present this category's posts in order -->

<hr>
<br>
<h1>Here is a detailed list of posts!</h1>
<br>

{% comment %} Full post bodies (truncated at the excerpt marker when present), newest first. {% endcomment %}
{% assign counter = 0 %}
{% assign sorted = site.posts | sort: 'date' | reverse %}
{% for post in sorted %}
  {% if post.categories contains page.title %}
    {% assign counter = counter | plus: 1 %}
    <div class="posts">
      <hr>
      <div class="post">
        <h2 class="post-title">[{{ counter }}]:
          <a href="{{ site.baseurl }}{{ post.url }}">
            {{ post.title }}
          </a>
        </h2>

        {% if post.date %}
          <span class="post-date">read on: {{ post.date | date_to_string }}</span> <br>
        {% endif %}

        {% if post.tags %}
          {% for word in post.tags %}
            {% assign wordd = word | downcase %}
            <a class="button" href="{{ site.baseurl }}/aReadingsIndexByTags/#{{ wordd | replace: " ", "-" }}">{{ word }}</a>
          {% endfor %}
        {% endif %}

        {% if post.content contains '<!--excerpt.start-->' %}
          {{ post.content | split: '<!--excerpt.start-->' | first }}
        {% else %}
          {{ post.content }}
        {% endif %}
      </div>
    </div>
  {% endif %}
{% endfor %}

<hr>
<hr>
<br>
<h1>Here is a name list of posts!</h1>
<br>

<div style="position: fixed; bottom: 39px; right: 10px; width: 129px; height: 58px; background-color: #FFCF79;">
  <a style="position: fixed; bottom: 40px; right: 10px;" href="#topPage" title="Back to Top">BackTop</a>
</div>
@@ -0,0 +1,38 @@
---
layout: post
title: Introduction
lecture:
lectureVersion: current
extraContent:
notes: instructor
video: on nlp basics
tags:
  - BasicLLM
desc: 2024-S0
term: 2024-seminarRead
categories:
  - FMBasic
---

## Readings:

#### Basics of ML and DL:
- [URL](https://qiyanjun.github.io/2022sp-UVA-CS-MachineLearningDeep/)

#### Basics of NLP
- [URL](https://qiyanjun.github.io/2022sp-UVA-CS-MachineLearningDeep//Lectures/S3-deepNNtext.pdf)
- Typical NLP tasks / challenges / pipeline
- f() on natural language
  + Before deep NLP (pre-2012): BOW / LSI / topic modeling (LDA)
  + Word2Vec (2013-2016): GloVe / FastText
  + Recurrent NNs (2014-2016): LSTM
  + Seq2Seq
  + Attention
  + Self-Attention (2016 - now)
  + Transformer (attention-only Seq2Seq)
  + BERT / RoBERTa / XLNet / GPT / ...
- A good code walk-through of the Transformer at [URL](https://nlp.seas.harvard.edu/annotated-transformer/); a minimal self-attention sketch also follows below.
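
To make the self-attention step in the list above concrete, here is a minimal sketch of single-head scaled dot-product self-attention in plain NumPy. It is an illustration only (the function name, dimensions, and toy data are ours), not the annotated-transformer implementation linked above.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention.

    X: (seq_len, d_model) token embeddings.
    Wq, Wk, Wv: (d_model, d_k) projection matrices.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv              # project tokens into queries/keys/values
    scores = Q @ K.T / np.sqrt(K.shape[-1])        # every token scores every other token
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability before softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # softmax over the key dimension
    return weights @ V                             # outputs are attention-weighted mixes of values

# Toy usage: 4 tokens, d_model = d_k = 8 (dimensions are arbitrary).
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)  # shape (4, 8)
```

A Transformer stacks many such heads (with masking, residual connections, and feed-forward layers), which is exactly what the annotated-transformer walk-through builds up step by step.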
@@ -0,0 +1,40 @@
---
layout: post
title: LLM basics
lecture: S0-Intro
lectureVersion: current
extraContent:
notes: instructor
video: on llm basics
tags:
  - BasicLLM
desc: 2024-S1
term: 2024-seminarRead
categories:
  - FMBasic
---

## Required Readings:

#### Emergent Abilities of Large Language Models
+ [URL](https://arxiv.org/abs/2206.07682)
+ "We consider an ability to be emergent if it is not present in smaller models but is present in larger models. Thus, emergent abilities cannot be predicted simply by extrapolating the performance of smaller models."

#### Language Models are Few-Shot Learners
+ [URL](https://arxiv.org/abs/2005.14165)
+ "GPT-3, a 175B autoregressive LLM; shows that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches." (See the sketch below.)

## Extra Readings:

#### A survey of Generative AI Applications
+ [URL](https://arxiv.org/abs/2306.02781)
+ Generative AI has experienced remarkable growth in recent years, leading to a wide array of applications across diverse domains. In this paper, we present a comprehensive survey of more than 350 generative AI applications, providing a structured taxonomy and concise descriptions of various unimodal and even multimodal generative AIs. The survey is organized into sections covering a wide range of unimodal generative AI applications such as text, images, video, gaming, and brain information. Our survey aims to serve as a valuable resource for researchers and practitioners to navigate the rapidly expanding landscape of generative AI, facilitating a better understanding of the current state of the art and fostering further innovation in the field.

#### Generative AI: Perspectives from Stanford HAI
+ [URL](https://hai.stanford.edu/generative-ai-perspectives-stanford-hai)
@@ -0,0 +1,38 @@
---
layout: post
title: Survey LLMs and Multimodal FMs
lecture: S1-LLM
lectureVersion: current
extraContent:
notes: instructor
video: on FM list
tags:
  - BasicLLM
desc: 2024-S2
term: 2024-seminarRead
categories:
  - FMMulti
---

In this session, our readings cover:

## Readings:

#### ChatGPT is not all you need. A State of the Art Review of large Generative AI models
+ Roberto Gozalo-Brizuela, Eduardo C. Garrido-Merchan
+ [URL](https://arxiv.org/abs/2301.04655)
+ During the last two years, a plethora of large generative models such as ChatGPT and Stable Diffusion have been published. These models can serve as general question-answering systems or automatically create artistic images, and they are revolutionizing several sectors. Consequently, their implications for industry and society are enormous, as several job positions may be transformed. For example, generative AI can effectively and creatively transform text to images (the DALLE-2 model); text to 3D images (the Dreamfusion model); images to text (the Flamingo model); text to video (the Phenaki model); text to audio (the AudioLM model); text to other text (ChatGPT); text to code (the Codex model); text to scientific text (the Galactica model); and can even create algorithms (AlphaTensor). This work attempts to concisely describe the main models and sectors affected by generative AI and to provide a taxonomy of the main generative models published recently.

#### A Survey of Large Language Models
+ [URL](https://arxiv.org/abs/2303.18223)
+ Language is essentially a complex, intricate system of human expressions governed by grammatical rules. It poses a significant challenge to develop capable AI algorithms for comprehending and grasping a language. As a major approach, language modeling has been widely studied for language understanding and generation over the past two decades, evolving from statistical language models to neural language models. Recently, pre-trained language models (PLMs) have been proposed by pre-training Transformer models over large-scale corpora, showing strong capabilities in solving various NLP tasks. Since researchers have found that model scaling can lead to performance improvement, they have further studied the scaling effect by increasing the model size even further. Interestingly, when the parameter scale exceeds a certain level, these enlarged language models not only achieve a significant performance improvement but also show special abilities that are not present in small-scale language models. To mark this difference in parameter scale, the research community has coined the term large language models (LLMs) for PLMs of significant size. Recently, research on LLMs has been greatly advanced by both academia and industry, and a remarkable milestone is the launch of ChatGPT, which has attracted widespread attention from society. The technical evolution of LLMs has been making an important impact on the entire AI community and could revolutionize the way we develop and use AI algorithms. In this survey, we review the recent advances in LLMs by introducing the background, key findings, and mainstream techniques. In particular, we focus on four major aspects of LLMs: pre-training, adaptation tuning, utilization, and capacity evaluation. We also summarize the available resources for developing LLMs and discuss remaining issues for future directions.

#### On the Opportunities and Risks of Foundation Models
+ [URL](https://arxiv.org/abs/2108.07258)
+ "a thorough account of the opportunities and risks of foundation models, ranging from their capabilities (e.g., language, vision, robotics, reasoning, human interaction) and technical principles (e.g., model architectures, training procedures, data, systems, security, evaluation, theory) to their applications (e.g., law, healthcare, education) and societal impact (e.g., inequity, misuse, economic and environmental impact, legal and ethical considerations)."