Evaluating the Impact of Chain-of-Thought Length in LLMs on Stock Price Movement Predictions

Objective: Assess whether large language models (LLMs) with extended chain-of-thought reasoning improve predictions of stock price movements following earnings announcements.

Background: Publicly traded companies on NASDAQ are required to publish quarterly earnings reports, which significantly impact stock prices. These earnings are typically announced either after market close (around 4:05 PM ET) or before market open (around 9:00 AM ET).

For earnings announced after market close, stock prices react almost instantly in after-hours trading, depending on whether earnings exceed or miss analysts' expectations. Overnight, extended-hours trading further adjusts the price, influenced by investor sentiment, options market signals, and liquidity.

At the 9:30 AM market open, another sharp movement occurs. The key question is whether this movement continues the after-hours trend or reverses, and whether longer chain-of-thought reasoning in LLMs can improve the accuracy of such predictions.
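
The continuation-or-reversal question can be made concrete with a small labeling function. This is a minimal sketch, not the repository's implementation: the function name `classifyOpenMove` and its three price inputs (the regular-session close before the announcement, the last after-hours price, and a price shortly after the 9:30 AM open) are assumptions chosen for illustration.

```javascript
// Hypothetical helper: label the move at the open relative to the after-hours move.
// All three inputs are prices in the same currency; the names are illustrative only.
function classifyOpenMove(closePrice, afterHoursPrice, openPrice) {
  // Return of the after-hours reaction, relative to the regular-session close.
  const afterHoursReturn = (afterHoursPrice - closePrice) / closePrice;
  // Return of the move at the open, relative to the last after-hours price.
  const openReturn = (openPrice - afterHoursPrice) / afterHoursPrice;

  if (afterHoursReturn === 0 || openReturn === 0) {
    return { afterHoursReturn, openReturn, label: "flat" };
  }
  // Same sign: the open extends the after-hours trend. Opposite sign: it reverses.
  const sameDirection = Math.sign(afterHoursReturn) === Math.sign(openReturn);
  return {
    afterHoursReturn,
    openReturn,
    label: sameDirection ? "continuation" : "reversal",
  };
}

console.log(classifyOpenMove(100, 105, 107).label); // continuation
console.log(classifyOpenMove(100, 105, 103).label); // reversal
```

A label of this kind is one possible ground truth against which an LLM's prediction could be scored.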

Methodology:

- Data Collection: Gather historical earnings reports, corresponding stock price data, and analyst estimates.
- LLM Configuration: Configure multiple LLMs with varying chain-of-thought lengths.
- Input Preparation: Convert financial data into a textual format suitable for LLM processing.
- Prediction and Evaluation: Analyze each model's predictive accuracy regarding stock price movements after earnings announcements.
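
As an illustration of the evaluation step, the sketch below computes directional accuracy per model. It is not taken from the scripts in this repository: the record shape (`model`, `predicted`, `actual`) and the model names `short-cot` and `long-cot` are hypothetical.

```javascript
// Hypothetical records: one per (model, earnings event) pair.
// `predicted` and `actual` are "up" or "down"; the field names are assumptions.
function accuracyByModel(records) {
  const tally = new Map();
  for (const { model, predicted, actual } of records) {
    const entry = tally.get(model) ?? { correct: 0, total: 0 };
    entry.total += 1;
    if (predicted === actual) entry.correct += 1;
    tally.set(model, entry);
  }
  // Convert the counts into the fraction of correct predictions per model.
  const result = {};
  for (const [model, { correct, total }] of tally) {
    result[model] = correct / total;
  }
  return result;
}

const sample = [
  { model: "short-cot", predicted: "up", actual: "up" },
  { model: "short-cot", predicted: "down", actual: "up" },
  { model: "long-cot", predicted: "up", actual: "up" },
  { model: "long-cot", predicted: "down", actual: "down" },
];
console.log(accuracyByModel(sample)); // { 'short-cot': 0.5, 'long-cot': 1 }
```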

References:

Mackintosh, Phil. "Earnings Announcements Sliced and Diced." nasdaq.com.

Steps to reproduce the results

Install npm packages and pip packages:

```
npm ci
virtualenv venv
. ./venv/bin/activate
pip install jupyter
```

Set the following environment variables in a .env file:
- FINNHUB_API_KEY
- MONGO_URL
- DATABENTO_API_KEY
- GEMINI_API_KEY
- GROQ_API_KEY
- OLLAMA_URL
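
Before starting the pipeline, it can help to confirm that all six variables are present. The check below is a minimal sketch: the helper name `missingEnvVars` is hypothetical, and it only inspects `process.env`. How the scripts themselves load the .env file is not covered here.

```javascript
// Variable names taken from the list above.
const REQUIRED_ENV_VARS = [
  "FINNHUB_API_KEY",
  "MONGO_URL",
  "DATABENTO_API_KEY",
  "GEMINI_API_KEY",
  "GROQ_API_KEY",
  "OLLAMA_URL",
];

// Hypothetical helper: return the names that are unset or blank in `env`.
function missingEnvVars(env = process.env) {
  return REQUIRED_ENV_VARS.filter(
    (name) => !env[name] || env[name].trim() === ""
  );
}

const missing = missingEnvVars();
if (missing.length > 0) {
  console.warn(`Missing environment variables: ${missing.join(", ")}`);
}
```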
Run the JavaScript scripts with Node.js in numeric order. On Linux, you can run them all with:
```
run-parts --regex '\.js$' scripts
```
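On platforms without run-parts, a small Node.js runner can do the same thing. This is a sketch under assumptions, not a file in this repository: the names `listScripts` and `runAll` are hypothetical, and every script is run without arguments. Script 11 reads a file named on the command line, so it would need separate handling.

```javascript
const fs = require("node:fs");
const path = require("node:path");
const { execFileSync } = require("node:child_process");

// List the *.js files in `dir`, sorted by name so that 01-... runs before 02-...
function listScripts(dir) {
  return fs
    .readdirSync(dir)
    .filter((name) => name.endsWith(".js"))
    .sort()
    .map((name) => path.join(dir, name));
}

// Run each script with the current Node.js binary, stopping at the first failure
// (execFileSync throws when a script exits with a non-zero code).
function runAll(dir) {
  for (const script of listScripts(dir)) {
    console.log(`Running ${script}`);
    execFileSync(process.execPath, [script], { stdio: "inherit" });
  }
}

// Usage (not executed here): runAll("scripts");
```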
To run them manually, invoke:
1. node scripts/01-download-earnings.js: Download 1-month company earnings data from Finnhub
  - writes to MongoDB collection earnings.earnings
2. node scripts/02-download-index.js: Download 1-month stock index data from Databento
  - writes to MongoDB timeseries earnings.stock_indexes
3. node scripts/03-download-symbols.js: Download stock symbol data from Databento
  - writes to MongoDB collection earnings.symbols
4. node scripts/04-download-ohlcv.js: Download historical stock price (bid, ask, trade) data from Databento, including extended-hours trading
  - writes to MongoDB collection earnings.prices
5. node scripts/05-unify-symbols.js: Transform downloaded stock symbol data, keeping only actively traded U.S. stocks
  - reads from MongoDB collection earnings.symbols
  - writes to MongoDB collection earnings.symbol_ids
6. node scripts/06-transform-price.js: Transform downloaded stock price data into MongoDB timeseries for faster, easier processing
  - reads from MongoDB collection earnings.prices
  - writes to MongoDB timeseries earnings.prices_cleaned
7. node scripts/07-transform-earnings.js: Combine earnings data with stock price data, computing key stock metrics
  - reads from MongoDB collection earnings.earnings
  - reads from MongoDB timeseries earnings.stock_indexes
  - reads from MongoDB timeseries earnings.prices_cleaned
  - writes to MongoDB collection earnings.earnings_cleaned
8. node scripts/08-generate-descriptions.js: For each earnings event, generate a comprehensive textual report summarizing the historical stock price movement and the intraday, after-market, and pre-market trading activity before and after the earnings release
  - reads from MongoDB collection earnings.earnings_cleaned
  - writes to MongoDB collection earnings.earnings_cleaned
9. node scripts/09-combine-descriptions.js: Partition all valid earnings data into examples (n=3) and test cases (n=120), then compile an LLM prompt for predicting each test case
  - reads from MongoDB collection earnings.earnings_cleaned
  - writes to files in desc/<symbol>_<quarter>*.txt
10. node scripts/10-query-llms.js: For each LLM prompt, query several LLM APIs (Gemini, GroqCloud, Ollama) for an answer
  - reads from files in desc/<symbol>_<quarter>*.txt
  - writes to MongoDB collection earnings.llm_outputs
11. node scripts/11-import-output.js: Some LLMs have no open API or are too expensive to query programmatically; collect their outputs manually into a *.tsv file, then import it into MongoDB
  - reads from the file specified by the command line arguments
  - writes to MongoDB collection earnings.llm_outputs
12. node scripts/12-parse-order.js: For each LLM output, parse the requested trade order and compute the net profit of that trade
  - reads from MongoDB collection earnings.llm_outputs
  - writes to MongoDB collection earnings.llm_outputs
13. node scripts/13-visualize-timeline.js: For each LLM, organize profit/loss into a timeline for easier visualization
  - reads from MongoDB collection earnings.llm_outputs
  - writes to file visual/timeline.html
14. node scripts/14-export-csv.js: Export data to CSV for easier processing in Python
  - reads from MongoDB collection earnings.llm_outputs
  - writes to file visual/data.csv
  - writes to file visual/data.json
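
To illustrate the idea behind step 12, the sketch below parses an order from an LLM answer and computes a per-share profit. Everything specific in it is an assumption: the `ORDER: BUY|SELL|HOLD` line format, the function names, and the choice of entry and exit prices are hypothetical, and the actual format used by scripts/12-parse-order.js is not documented here.

```javascript
// Hypothetical order format: a line such as "ORDER: BUY", "ORDER: SELL", or "ORDER: HOLD".
// The real prompt and output format used by the scripts may differ.
function parseOrder(llmOutput) {
  const match = /^\s*ORDER:\s*(BUY|SELL|HOLD)\s*$/im.exec(llmOutput);
  return match ? match[1].toUpperCase() : null; // null when no order line is found
}

// Per-share profit of entering at `entryPrice` and exiting at `exitPrice`.
// BUY profits when the price rises, SELL (short) when it falls; fees are ignored.
function netProfit(order, entryPrice, exitPrice) {
  switch (order) {
    case "BUY":
      return exitPrice - entryPrice;
    case "SELL":
      return entryPrice - exitPrice;
    default:
      return 0; // HOLD or an unparseable answer: no trade, no profit
  }
}

const output = "Earnings beat estimates and guidance was raised.\nORDER: BUY";
const order = parseOrder(output);
console.log(order, netProfit(order, 105, 107)); // BUY 2
```

Treating an unparseable answer as "no trade" is one design choice among several; such outputs could also be counted separately as failures.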
Finally, start JupyterLab, open the notebook scripts/15-data-visualizations.ipynb, and follow the directions in it:
```
jupyter lab --ip 0.0.0.0 --no-browser
```
