Final dh #5

Open
wants to merge 65 commits into
base: master
f610ed7
Create README.md
sangki930 May 25, 2021
97f4701
add feature
tofulim May 27, 2021
d2521e1
models commit
sangki930 May 28, 2021
b0e15a1
model recommit
sangki930 May 28, 2021
4913de4
[dh] new feature commit
tofulim May 28, 2021
1357646
[dh] k-fold commit
tofulim Jun 2, 2021
4ab9a96
[dh] make time2global feature
tofulim Jun 3, 2021
312c880
commit test
sangki930 Jun 3, 2021
1160816
[sangki930] branch commit
sangki930 Jun 3, 2021
af9892a
[dh] commit to merge
tofulim Jun 4, 2021
e527d8c
Update README.md
sangki930 Jun 4, 2021
23bf831
Merge pull request #2 from bcaitech1/new_branch_01
sangki930 Jun 4, 2021
5a4307a
pplz
tofulim Jun 4, 2021
4831d39
[dh] test
tofulim Jun 4, 2021
0d933bd
[dh] oh yeah
tofulim Jun 4, 2021
7a99180
add master
PrimeOfMine Jun 4, 2021
0f34d9b
dh explane
tofulim Jun 6, 2021
b110420
Merge branch 'comb_main' into sangki930
tofulim Jun 6, 2021
cf7a7e5
Merge pull request #3 from bcaitech1/sangki930
tofulim Jun 6, 2021
85449dd
test_sangki
sangki930 Jun 6, 2021
8cbe13d
sangki commit
sangki930 Jun 6, 2021
401ef43
cm
tofulim Jun 6, 2021
c55eb22
Merge branch 'comb_main' of https://github.com/bcaitech1/p4-dkt-olleh…
tofulim Jun 6, 2021
4f73e48
[dh] cont fix..
tofulim Jun 7, 2021
c813636
cm
tofulim Jun 7, 2021
02ab294
[dh] continuous fix
tofulim Jun 7, 2021
eff3dea
cm
tofulim Jun 7, 2021
088158b
[dh] submit fix, lstmattn fix
tofulim Jun 7, 2021
db91d72
[dh] change setting and split model.py to each architecture
tofulim Jun 9, 2021
03b01a9
[dh] change setting
tofulim Jun 10, 2021
b091795
[dh] make new branch to merge feat
tofulim Jun 10, 2021
324463c
cm
tofulim Jun 10, 2021
975b8f6
[dh] add presicion,recall,f1 metric
tofulim Jun 11, 2021
e68165a
[dh] cont/cate mid check
tofulim Jun 11, 2021
0f8e947
[dh] mid check
tofulim Jun 11, 2021
19428b2
[dh] push for compare
tofulim Jun 12, 2021
40fb49c
[dh] apply on model
tofulim Jun 12, 2021
047c851
fixed untracked files
tofulim Jun 13, 2021
aaab502
[dh] model final fix
tofulim Jun 13, 2021
105a1d3
[dh] final model fix
tofulim Jun 13, 2021
edee0f0
[dh] lgbm change
tofulim Jun 14, 2021
c8a7246
[dh] lgbm change
tofulim Jun 14, 2021
f3d3de8
[dh] cm
tofulim Jun 14, 2021
e701e79
[dh] cm
tofulim Jun 14, 2021
9a1fe4b
edit for k-fold
PrimeOfMine Jun 14, 2021
731b63e
add comments
PrimeOfMine Jun 14, 2021
8c0194c
debugging
PrimeOfMine Jun 14, 2021
c5f44df
[dh] fix & pull
tofulim Jun 14, 2021
0be294a
Merge branch 'final_dh' of https://github.com/bcaitech1/p4-dkt-ollehd…
tofulim Jun 14, 2021
f610a72
[dh] use test file
tofulim Jun 15, 2021
8f39e38
[dh] final push
tofulim Jun 15, 2021
0488c36
[dh] push
tofulim Jun 15, 2021
d6c3d6e
[dh] ffffinal commit
tofulim Jun 15, 2021
552e5ce
Update README.md
tofulim Jun 20, 2021
5eccb79
Update README.md
tofulim Jun 20, 2021
cba42b1
Update README.md
tofulim Jul 20, 2021
12a85d2
Update README.md
tofulim Jul 20, 2021
0a26104
Update README.md
tofulim Jul 24, 2021
0b1e351
Update README.md
tofulim Jul 24, 2021
6bd2e44
Update README.md
tofulim Jul 24, 2021
1512d6d
Update README.md
tofulim Jul 25, 2021
c98b8ae
Update README.md
tofulim Jul 25, 2021
ca4a390
Create README.md
tofulim Jul 25, 2021
e9c5699
Update README.md
tofulim Jul 26, 2021
7b94b5f
Update README.md
tofulim Jul 27, 2021
add feature
tofulim committed May 27, 2021

commit 97f4701de01c93c8db0da1dfd0411a118377f14f
2 changes: 2 additions & 0 deletions README.md
@@ -8,6 +8,8 @@

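 ### 3. $ python3 whole-in-one.py
 Run training and inference in a single step.
+Note: for lgbm, inference does not need to be run separately; the training step handles everything.
+At run time, the hyperparameters and features used for training are saved to the folder as JSON.

 ### 4. $ python3 submit.py
 Given a key and a file path, submit directly from the server with no need to download the file.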
2 changes: 1 addition & 1 deletion conf.yml
@@ -8,7 +8,7 @@ wandb :
 - baseline

 ##main params
-task_name: lgbm_test
+task_name: lgbm_test_featureconf
 seed: 42
 device: cuda

5 changes: 4 additions & 1 deletion dkt/trainer.py
@@ -24,12 +24,15 @@ def run(args, train_data, valid_data):
     train_loader, valid_loader = get_loaders(args, train_data, valid_data)

     if args.model=='lgbm':
-        # training
+        # training (train_dataset + test_dataset, excluding the last row)
+
         model,auc,acc=lgbm_train(args,train_data,valid_data)
         wandb.log({"valid_auc":auc, "valid_acc":acc})
         # prepare for inference
+
         csv_file_path = os.path.join(args.data_dir, args.test_file_name)
         test_df = pd.read_csv(csv_file_path)#, nrows=100000)
+
         test_df = make_lgbm_feature(test_df)
         # sort as below so that each user's sequence is taken into account
         test_df.sort_values(by=['userID','Timestamp'], inplace=True)
18 changes: 16 additions & 2 deletions lgbm_utils.py
@@ -21,7 +21,11 @@ def make_lgbm_feature(df):
     correct_t.columns = ["test_mean", 'test_sum']
     correct_k = df.groupby(['KnowledgeTag'])['answerCode'].agg(['mean', 'sum'])
     correct_k.columns = ["tag_mean", 'tag_sum']
-
+    # Determine each student's grade level and sum the grade levels of the tests they solved
+    df['test_level']=df['assessmentItemID'].apply(lambda x:int(x[2]))
+    correct_l = df.groupby(['userID'])['test_level'].agg(['mean', 'sum'])
+    correct_l.columns = ["level_mean", 'level_sum']
+    df = pd.merge(df, correct_l, on=['userID'], how="left")

     df = pd.merge(df, correct_t, on=['testId'], how="left")
     df = pd.merge(df, correct_k, on=['KnowledgeTag'], how="left")
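
Review note: a minimal sketch of the grade-level feature added in this hunk, run on toy data. It assumes, as the `int(x[2])` lookup implies, that the third character of `assessmentItemID` is a single grade digit. The helper name `add_level_features` and the sample IDs are made up for illustration and are not part of the PR.

```python
import pandas as pd


def add_level_features(df: pd.DataFrame) -> pd.DataFrame:
    """Attach per-user mean and sum of the grade digit found in assessmentItemID."""
    out = df.copy()
    # Assumption: the character at index 2 of the ID is the grade level digit.
    out["test_level"] = out["assessmentItemID"].str[2].astype(int)
    per_user = out.groupby("userID")["test_level"].agg(["mean", "sum"])
    per_user.columns = ["level_mean", "level_sum"]
    # Join the aggregates back onto every row of the same user.
    return out.merge(per_user, left_on="userID", right_index=True, how="left")


# Hypothetical toy data: user 1 solved grade-3 and grade-5 items, user 2 a grade-7 item.
toy = pd.DataFrame(
    {
        "userID": [1, 1, 2],
        "assessmentItemID": ["A030001001", "A050002001", "A070003001"],
    }
)
features = add_level_features(toy)
```

Because the aggregates are computed over whatever frame is passed in, calling this separately on train and test data gives each user statistics from that split only, which is what the diff does when it calls `make_lgbm_feature(test_df)`.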
@@ -32,6 +36,7 @@ def make_lgbm_feature(df):

 def lgbm_split_data(data,ratio):
     random.seed(42)
+
     users = list(zip(data['userID'].value_counts().index, data['userID'].value_counts()))
     random.shuffle(users)
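
Review note: the rest of `lgbm_split_data` is not shown in this hunk, so the following is only a hedged illustration of why users, rather than rows, are shuffled. It sketches one way a user-level split could be completed: whole users go to the training side until roughly `ratio` of the rows is used. The function name `split_users` and the row-budget rule are assumptions, not the PR's code. It uses a private `random.Random` instance so the global seed is left alone.

```python
import random
from collections import Counter


def split_users(user_ids, ratio, seed=42):
    """Split users (not rows) so that train holds at most `ratio` of all rows."""
    counts = list(Counter(user_ids).items())  # [(userID, number of rows), ...]
    rng = random.Random(seed)                 # local RNG: reproducible, no global side effect
    rng.shuffle(counts)

    budget = ratio * len(user_ids)
    train_users, used = set(), 0
    for user, n_rows in counts:
        if used + n_rows > budget:
            break
        train_users.add(user)
        used += n_rows

    valid_users = {user for user, _ in counts} - train_users
    return train_users, valid_users


# Hypothetical toy input: one entry per interaction row.
rows = [1] * 5 + [2] * 3 + [3] * 2
train_users, valid_users = split_users(rows, ratio=0.7)
```

Keeping every row of a user on one side of the split avoids leaking a user's later answers into training while validating on their earlier ones.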

@@ -94,6 +99,7 @@ def lgbm_train(args,train_data,valid_data):

     _ = lgb.plot_importance(model)

+
     return model,auc,acc

 def lgbm_inference(args,model, test_data):
@@ -114,4 +120,12 @@ def lgbm_inference(args,model, test_data):
         for id, p in enumerate(answer):
             w.write('{},{}\n'.format(id,p))

-    print(f"The lgbm prediction file was saved to {new_output_path}/{args.task_name}.csv")
+    print(f"The lgbm prediction file was saved to {new_output_path}/{args.task_name}.csv")
+
+    save_path=f"{args.output_dir}{args.task_name}/feature{len(FEATS)}_config.json"
+    json.dump(
+        FEATS,
+        open(save_path, "w"),
+        indent=2,
+        ensure_ascii=False,
+    )
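
Review note: the `json.dump(..., open(save_path, "w"), ...)` call never closes the file explicitly, and the f-string path relies on `args.output_dir` ending with a path separator. A possible variant is sketched below. The helper name `save_feature_config` and its parameters are hypothetical; only the file-name pattern `feature{N}_config.json` and the `indent=2, ensure_ascii=False` options come from the diff.

```python
import json
import os


def save_feature_config(feats, output_dir, task_name):
    """Write the feature list to <output_dir>/<task_name>/feature<N>_config.json."""
    run_dir = os.path.join(output_dir, task_name)  # works with or without a trailing slash
    os.makedirs(run_dir, exist_ok=True)            # create the run folder if it is missing
    save_path = os.path.join(run_dir, f"feature{len(feats)}_config.json")
    # The context manager flushes and closes the file even if json.dump raises.
    with open(save_path, "w", encoding="utf-8") as f:
        json.dump(feats, f, indent=2, ensure_ascii=False)
    return save_path
```

Called as `save_feature_config(FEATS, args.output_dir, args.task_name)`, this would write the same JSON content as the code in the diff.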
4 changes: 2 additions & 2 deletions submit.py
@@ -25,8 +25,8 @@ def submit(user_key='', file_path = ''):
     requests.post(url=submit_url, data=body, files={'file': open(file_path, 'rb')})

 if __name__ == "__main__":
-    test_dir='/opt/ml'#prediction folder path
+    test_dir='/opt/ml/code/output/lgbm_add_test_data'#prediction folder path

     # Find your own key via the post below and fill it in
     # http://boostcamp.stages.ai/competitions/3/discussion/post/110
-    submit("Bearer 15bdf505e0902975b2e6f578148d22136b2f7717", os.path.join(test_dir, 'answer.csv'))
+    submit("Bearer 15bdf505e0902975b2e6f578148d22136b2f7717", os.path.join(test_dir, 'output.csv'))
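
Review note: the submission key is committed in plain text here. One common alternative, sketched below, is to read it from the environment at run time. The variable name `DKT_SUBMIT_KEY` and the helper `load_submit_key` are assumptions made for this example; the `Bearer ` prefix follows the key format used in the diff.

```python
import os


def load_submit_key(env_var="DKT_SUBMIT_KEY"):
    """Return the submission key from the environment, formatted as 'Bearer <token>'."""
    token = os.environ.get(env_var)
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable before submitting.")
    # Accept the key with or without the 'Bearer ' prefix.
    return token if token.startswith("Bearer ") else f"Bearer {token}"
```

With this in place the last line could become `submit(load_submit_key(), os.path.join(test_dir, 'output.csv'))`, which keeps the key out of the repository history.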
2 changes: 1 addition & 1 deletion train.py
@@ -20,7 +20,7 @@ def main(args):
preprocess = Preprocess(args)
preprocess.load_train_data(args.file_name)
train_data = preprocess.get_train_data()

train_data, valid_data = preprocess.split_data(train_data)