Releases: HPMLL/BurstGPT

BurstGPT v1.1

13 Jun 14:14

Main Characteristics

  • Duration: 121 consecutive days in 4 consecutive months.
  • Dataset size: ~5.29M lines, ~188MB. Comprises 4.81 million ChatGPT traces and 0.24 million GPT-4 traces from API services, plus 0.16 million ChatGPT traces and 0.08 million GPT-4 traces from conversational services.

Files

  • BurstGPT_1.csv contains all traces from the first 2 months, including failed requests (rows whose Response tokens value is 0). 1429.7k lines in total.

  • BurstGPT_without_fails_1.csv contains all traces from the first 2 months with failed requests removed. 1404.3k lines in total.

  • BurstGPT_2.csv contains all traces from the second 2 months, including failed requests (rows whose Response tokens value is 0). 3858.4k lines in total.

  • BurstGPT_without_fails_2.csv contains all traces from the second 2 months with failed requests removed. 3784.2k lines in total.
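
The relationship between the full files and the without_fails files can be sketched as a filter on the Response tokens column. This is a minimal illustration, not the script used to produce the released files: it assumes the CSV header names match the Schema field names, and the sample rows are made up for the example.

```python
import csv
import io

# Hypothetical sample in the layout of BurstGPT_1.csv. The header names are
# assumed to match the Schema field names, and the values are invented.
SAMPLE = """Timestamp,Model,Request tokens,Response tokens,Total tokens,Log Type
5,ChatGPT,472,18,490,Conversation log
45,ChatGPT,1087,0,1087,API log
118,GPT-4,417,276,693,Conversation log
"""


def drop_failures(rows):
    """Keep only rows whose 'Response tokens' value is non-zero."""
    return [row for row in rows if int(row["Response tokens"]) != 0]


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
kept = drop_failures(rows)
print(f"{len(rows)} rows read, {len(kept)} kept, {len(rows) - len(kept)} dropped as failures")
```

To try this on a real file, replace `io.StringIO(SAMPLE)` with `open("BurstGPT_1.csv", newline="")`, after checking that the header row matches the names assumed here.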

Full Changelog: v1.0...v1.1

BurstGPT v1.0

28 Apr 08:48

This release is the initial version of our dataset. We will continue to update and improve it, and we thank you for your continued support and feedback.

Main Characteristics

  • Duration: 61 consecutive days in 2 consecutive months.
  • Dataset size: ~1.4M lines, ~50MB.

Schema

  • Timestamp: request submission time, in seconds since 0:00:00 on the first day of the trace.
  • Model: the model called, either ChatGPT or GPT-4.
  • Request tokens: number of tokens in the request.
  • Response tokens: number of tokens in the response.
  • Total tokens: request tokens plus response tokens.
  • Log Type: how the user called the model, either Conversation log (conversation mode) or API log (API call).
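
As a rough sketch of how these fields might be read, the snippet below parses rows into a typed record, derives a day index from Timestamp, and checks that Total tokens equals Request tokens plus Response tokens. The `Trace` class, the `parse_row` helper, and the sample rows are hypothetical, and the CSV header names are assumed to match the field names above.

```python
import csv
import io
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400

# Hypothetical rows; header names are assumed to match the Schema fields.
SAMPLE = """Timestamp,Model,Request tokens,Response tokens,Total tokens,Log Type
5,ChatGPT,472,18,490,Conversation log
45,ChatGPT,1087,230,1317,API log
90000,GPT-4,417,276,693,Conversation log
"""


@dataclass
class Trace:
    timestamp: float      # seconds since 0:00:00 on the first day
    model: str            # "ChatGPT" or "GPT-4"
    request_tokens: int
    response_tokens: int
    total_tokens: int
    log_type: str         # "Conversation log" or "API log"

    @property
    def day(self) -> int:
        """Zero-based day index derived from the timestamp."""
        return int(self.timestamp // SECONDS_PER_DAY)

    @property
    def is_consistent(self) -> bool:
        """True when Total tokens equals request plus response tokens."""
        return self.total_tokens == self.request_tokens + self.response_tokens


def parse_row(row: dict) -> Trace:
    """Convert one csv.DictReader row into a Trace (illustrative helper)."""
    return Trace(
        timestamp=float(row["Timestamp"]),
        model=row["Model"],
        request_tokens=int(row["Request tokens"]),
        response_tokens=int(row["Response tokens"]),
        total_tokens=int(row["Total tokens"]),
        log_type=row["Log Type"],
    )


traces = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
for t in traces:
    print(t.day, t.model, t.log_type, t.total_tokens, t.is_consistent)
```

Timestamp is parsed as a float here so that both integer and fractional second values are accepted; adjust if the files store it differently.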