Merge pull request #114 from lean-dojo/peiyang
Fix string UTF-8 misformatting PANIC
Peiyang-Song authored Aug 14, 2024
2 parents d7ad9f5 + 7e4910c commit 88e50cb
Showing 1 changed file with 3 additions and 1 deletion.
4 changes: 3 additions & 1 deletion LeanCopilot/Models/ByT5.lean
@@ -281,7 +281,9 @@ def tokenize (text : String) : Array String :=


 def detokenize (tokens : Array String) : String :=
-  String.fromUTF8! ⟨tokens.map tokenToByte!⟩
+  match (String.fromUTF8? ⟨tokens.map tokenToByte!⟩) with
+  | some s => s
+  | none => ""


 def eosToken := "</s>"
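
The change above swaps the panicking decoder `String.fromUTF8!` for the `Option`-returning `String.fromUTF8?` and falls back to an empty string when the bytes are not valid UTF-8 (for example, when a byte sequence stops partway through a multi-byte character). A minimal standalone sketch of the same pattern follows; `decodeOrEmpty` is a hypothetical name used only for illustration and is not part of LeanCopilot.

```lean
-- Hypothetical helper mirroring the patched `detokenize`:
-- decode a byte array as UTF-8, returning "" if the bytes are invalid.
def decodeOrEmpty (bytes : ByteArray) : String :=
  match String.fromUTF8? bytes with
  | some s => s   -- valid UTF-8: return the decoded string
  | none => ""    -- invalid UTF-8: fall back instead of panicking

-- Valid input: the UTF-8 encoding of "hi" decodes back to the same string.
#eval decodeOrEmpty "hi".toUTF8

-- Invalid input: the byte 0xFF never occurs in well-formed UTF-8,
-- so `String.fromUTF8?` returns `none` and the fallback "" is used.
#eval decodeOrEmpty ⟨#[0xFF]⟩
```

Note that the fallback discards the fact that decoding failed, so a caller cannot distinguish a genuinely empty result from malformed bytes. That trade-off appears acceptable here, since the stated goal of the commit is only to avoid the PANIC.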