e.g. http://localhost/api/v2/annotator/annotate
- POST /upload
  - Req: "corpus" or "original_text", not both
  - Res: "corpus_id"
- POST /parser/divide
  - Req: "corpus_id", "divide_options"
  - Res: "task_id"
- POST /parser/parse
  - Req: "corpus_id", "parse_options"
  - Res: "task_id"
- POST /annotator/annotate
  - Req: "corpus_id", "annotate_options"
  - Res: "task_id"
- POST /annotator/reannotate
  - Req: "corpus_id", "annotate_options", "reannotate_options"
  - Res: "task_id"
- GET /tasks/
  - Res: [{"task_id", "status", "target_corpus_id"}, ...]
- GET /tasks/<id>
  - Res: "status", "target_corpus_id"
- GET /tasks/<id>/abort
  - Res: "success": true
  - Note: Only effective when task.status is in [READY, RUNNING]
- GET /corpuses/<id>
  - Res: "corpuses_history"
- GET /corpuses/
  - Res: [{"corpus_id", "corpuses_history"}, ...]
- GET /user/available-openai-tokens
  - Res: "available-openai-tokens"
- GET /user/get-temp-user
  - Res: "success", "key"
  - Note: This also logs the session in. The email will end with @example.com.
- GET /user/key
  - Res: "key"
  - Note: "key" is not needed for this request, since the endpoint also uses the session.
- GET /user/logout
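
The request shapes above can be sketched as a small client helper. This is only a sketch under assumptions: `build_upload_body`, `upload_and_divide`, and the `post(path, body)` callable are hypothetical names (any HTTP client that returns the decoded JSON response can be plugged in), and it assumes that sending neither "corpus" nor "original_text" is also invalid.

```python
from typing import Any, Callable, Optional


def build_upload_body(corpus: Optional[dict] = None,
                      original_text: Optional[str] = None) -> dict:
    """Build the JSON body for POST /upload: "corpus" or "original_text", not both."""
    # Assumption: omitting both fields is rejected as well.
    if (corpus is None) == (original_text is None):
        raise ValueError('pass exactly one of "corpus" or "original_text"')
    if corpus is not None:
        return {"corpus": corpus}
    return {"original_text": original_text}


def upload_and_divide(post: Callable[[str, dict], Any],
                      original_text: str,
                      divide_options: dict) -> tuple:
    """Upload raw text, then request paragraph division.

    `post` is a hypothetical transport: it takes (path, json_body) and
    returns the decoded JSON response as a dict.
    """
    corpus_id = post("/upload", build_upload_body(original_text=original_text))["corpus_id"]
    task_id = post(
        "/parser/divide",
        {"corpus_id": corpus_id, "divide_options": divide_options},
    )["task_id"]
    return corpus_id, task_id
```

The returned "task_id" can then be looked up through GET /tasks/<id>.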
- See dj-rest-auth Doc
- POST /login/ (that is, /api/v4/rest-auth/login/)
  - Req: "email", "password"
  - Res: "key"
  - Note: username is not required
- POST /logout/
- POST /registration/
  - Req: "email", "password1", "password2" (= "password1")
  - Note: username is not required
- etc. (See the doc above)
- Note: For every POST endpoint, the csrftoken saved in the client-side cookie must be included in the request header under the name X-CSRFToken. Test the backend index page to see this behavior. (Implemented here)
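
As an illustration of the CSRF note, the header can be derived from the cookie string with the standard library. A minimal sketch, assuming the client has the raw `Cookie` header value at hand; `csrf_headers` is a hypothetical helper name, not part of the backend.

```python
from http.cookies import SimpleCookie


def csrf_headers(cookie_header: str) -> dict:
    """Return the extra header a POST request needs, read from the cookie string."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)  # parses "name=value; name2=value2"
    if "csrftoken" not in cookie:
        raise KeyError("csrftoken cookie not found")
    return {"X-CSRFToken": cookie["csrftoken"].value}
```

The returned dict is meant to be merged into the headers of each POST request.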
- Reserved Token.gloss: "!UNKNOWN"
- For AnnotateOptions.annotator_name
  - null: defaults to "dummy"
  - "dummy": for test purposes
  - "chatpgt_ft0": uses the pretrained ChatGPT model (default fallback)
  - "chatgpt_gpt-3.5-turbo-untrained_0": Using untrained
  - "chatgpt_gpt-3.5-turbo-pretrained_0": Using pretrained
  - "chatgpt_gpt-4o-mini-untrained_0"
  - "chatgpt_gpt-4o-mini-pretrained_0"
- On /annotator/reannotate, the fields of AnnotateOptions:
  - .target_paragraphs: only the first element is used.
  - The other fields can be null (will use the previous one)
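
The fallback rules for reannotation can be sketched as a merge of the previous options with the requested ones. This is an illustration only: `resolve_reannotate_options` is a hypothetical name, the options are modeled as plain dicts, and `lang_from` / `lang_to` are assumed to be AnnotateOptions fields because they appear as arguments of `Annotator.annotate`.

```python
def resolve_reannotate_options(previous: dict, requested: dict) -> dict:
    """Combine the previous AnnotateOptions with the ones sent to /annotator/reannotate."""
    merged = dict(previous)
    for name, value in requested.items():
        if value is not None:  # null means "use the previous one"
            merged[name] = value
    # Only the first element of target_paragraphs is used.
    targets = requested.get("target_paragraphs") or []
    merged["target_paragraphs"] = targets[:1]
    # A null annotator_name defaults to "dummy".
    if merged.get("annotator_name") is None:
        merged["annotator_name"] = "dummy"
    return merged
```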
class Parser:
    # Default values are elided in the original notes; `...` marks them.
    def divide_into_paragraphs(c: Corpus, paragraph_delimiters: list[str] = ..., **kwargs):
        # Offsets at which c.original_text is cut into paragraphs.
        c.p_div_locs = [p0, p1, p2, ...]
        c.paragraphs = [
            Paragraph(original_text=c.original_text[p:q], pstate="DIVIDED")
            # Consecutive pairs of offsets: (p0, p1), (p1, p2), ...
            for p, q in zip(c.p_div_locs, c.p_div_locs[1:])
        ]

    def parse_paragraph(p: Paragraph, token_delimiters: str = ..., **kwargs):
        p.tokens = [Token(), ...]
        p.pstate = "PARSED"
        p.token_delimiters = token_delimiters

class Annotator:
    def annotate(p: Paragraph, lang_from: str, lang_to: str, **kwargs):
        # Delimiter paragraphs and delimiter tokens receive no gloss.
        if not p.is_delimiter:
            for token in p.tokens:
                if token.is_delimiter:
                    continue
                token.gloss = ...
        p.pstate = "ANNOTATED"
        p.annotator_info = ...
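
The divide, parse, and annotate steps can be tried end to end with a small runnable sketch. It is not the project's implementation: it splits with `re.split` instead of recording `p_div_locs`, the default delimiters are assumptions, and `annotate` takes a hypothetical `glossary` dict in place of a real annotator, falling back to the reserved "!UNKNOWN" gloss.

```python
import re
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Token:
    txt: str
    is_delimiter: bool
    gloss: Optional[str] = None


@dataclass
class Paragraph:
    original_text: str
    is_delimiter: bool
    pstate: str = "DIVIDED"
    tokens: list = field(default_factory=list)


def split_keeping(text, delimiters):
    """Split text on any delimiter, keeping delimiters as (piece, True) items."""
    ordered = sorted(delimiters, key=len, reverse=True)  # longest delimiter first
    pattern = "(" + "|".join(re.escape(d) for d in ordered) + ")"
    known = set(delimiters)
    return [(piece, piece in known) for piece in re.split(pattern, text) if piece]


def divide(text, paragraph_delimiters=("\n",)):
    return [Paragraph(t, d) for t, d in split_keeping(text, paragraph_delimiters)]


def parse(p, token_delimiters=" .,"):
    p.tokens = [Token(t, d) for t, d in split_keeping(p.original_text, token_delimiters)]
    p.pstate = "PARSED"


def annotate(p, glossary):
    # Stand-in annotator: look each word up, or mark it "!UNKNOWN".
    if not p.is_delimiter:
        for token in p.tokens:
            if token.is_delimiter:
                continue
            token.gloss = glossary.get(token.txt, "!UNKNOWN")
    p.pstate = "ANNOTATED"
```

Each paragraph moves through the states DIVIDED, PARSED, and ANNOTATED as the three functions are applied in order.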
- Note: Parallel execution is not yet implemented.

TaskInfo.status is one of [READY, RUNNING, FINISHED, ERROR, ABORTED].
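
A client will typically poll a task until it leaves the READY and RUNNING states. The sketch below assumes that FINISHED, ERROR, and ABORTED are final states (the notes do not say so explicitly); `wait_for_task`, `can_abort`, and the `get_status` callable are hypothetical names, with `get_status(task_id)` standing in for a call to GET /tasks/<id>.

```python
import time

ABORTABLE = {"READY", "RUNNING"}  # abort is only effective in these states
FINAL = {"FINISHED", "ERROR", "ABORTED"}  # assumption: no transitions after these


def can_abort(status: str) -> bool:
    return status in ABORTABLE


def wait_for_task(get_status, task_id, interval=1.0, max_polls=60, sleep=time.sleep):
    """Poll get_status(task_id) until a final status is returned."""
    for _ in range(max_polls):
        status = get_status(task_id)
        if status in FINAL:
            return status
        sleep(interval)
    raise TimeoutError(f"task {task_id} did not finish after {max_polls} polls")
```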
- password
- available_openai_tokens
- Served at https://parkchamchi.github.io/GlossySnake/samples/v1/
- The index.json serves {"filenames": [...]}
- GET /user/check
  - Res: "is_auth", "email", "key"
  - Note: No authentication required.
  - Not used anymore as the session auth is deprecated
- GET /user/openai_api_key
  - Res: "openai_api_key"
  - Note: can be null (if not set)
- PUT /user/openai_api_key
- DELETE /user/openai_api_key
- User.personal_openai_key (take care of this sensitive data)
class Task:
    def run(self, func, data):
        self.status = "RUNNING"
        uploaded_corpus = ...  # looked up from self.target_corpus_id
        uploaded_corpus.current_task = self.task_id

        # Run the task
        func(data)
        # newcorpus.task_ids.append(task_id)
        # uploaded_corpus.corpuses_history.append(newcorpus)

        # After the task is completed:
        uploaded_corpus.current_task = None
        self.status = "FINISHED"
- TODO: Change the model and separate the array fields (in the next iteration)
See Week 2.
- paragraph_delimiters: list of strings: the strings that divide paragraphs
- paragraphs: list of objects
  - is_delimiter: boolean: whether the paragraph is a paragraph-dividing string (e.g. newline)
  - token_delimiters: string (of chars): the characters that divide tokens
  - tokens: list of objects
    - txt: string: the word
    - is_delimiter: boolean: whether the token is a delimiter
    - gloss: string or null: the translation
      - "!UNKNOWN": failed to provide a translation
      - "!CONTINUED": continues from the previous gloss (e.g. ["fishing", "rod"] ~ ["낚시대", "!CONTINUED"])
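
To show how "!CONTINUED" might be consumed, the sketch below folds continued tokens into the preceding gloss. It assumes tokens are plain dicts with the fields listed above and that delimiter tokens between the joined words are kept in the merged text; `group_glosses` is a hypothetical helper name.

```python
def group_glosses(tokens):
    """Return (text, gloss) pairs, folding "!CONTINUED" tokens into the previous pair."""
    groups = []
    pending = ""  # delimiter text seen since the last word token
    for tok in tokens:
        if tok["is_delimiter"]:
            pending += tok["txt"]
        elif tok["gloss"] == "!CONTINUED" and groups:
            text, gloss = groups[-1]
            groups[-1] = (text + pending + tok["txt"], gloss)
            pending = ""
        else:
            groups.append((tok["txt"], tok["gloss"]))
            pending = ""
    return groups
```

With the "fishing" / "rod" example, the two words end up in a single pair that carries the gloss of the first one.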