# FinRL-DeepSeek: LLM-Infused Risk-Sensitive Reinforcement Learning for Trading Agents

![Visitors](https://api.visitorbadge.io/api/VisitorHit?user=AI4Finance-Foundation&repo=FinRL_DeepSeek&countColor=%23B17A)
[![](https://dcbadge.vercel.app/api/server/trsr8SXpW5)](https://discord.gg/trsr8SXpW5)

Update 1: The project has been integrated into the original FinRL project by [AI4Finance](https://github.com/AI4Finance-Foundation/FinRL_DeepSeek)!

Update 2: The project is the basis of Task 1 in the [FinRL Contest 2025](https://open-finance-lab.github.io/FinRL_Contest_2025/)!

Installation script: `installation_script.sh`

Data: https://huggingface.co/datasets/benstaf/nasdaq_2013_2023/tree/main

Trading agents: https://huggingface.co/benstaf/Trading_agents/tree/main

# Results

![Results](https://github.com/benstaf/FinRL_DeepSeek/blob/main/IMG_20250207_175434_001.jpg)

# Preliminary conclusion

- Bull market -> PPO
- Bear market -> CPPO-DeepSeek

## More details on installation of dependencies

Run `installation_script.sh` on an Ubuntu server (a CPU instance with 128 GB of RAM is recommended).

## Datasets and data preprocessing

The base dataset is FNSPID:

- https://huggingface.co/datasets/Zihan1004/FNSPID (the relevant file is `Stock_news/nasdaq_exteral_data.csv`)
- https://github.com/Zdong104/FNSPID_Financial_News_Dataset
- https://arxiv.org/abs/2402.06698

LLM signals are added by running `sentiment_deepseek_deepinfra.py` and `risk_deepseek_deepinfra.py` (a hedged sketch of this scoring step is given at the end of this README), which produce:

- https://huggingface.co/datasets/benstaf/nasdaq_news_sentiment
- https://huggingface.co/datasets/benstaf/risk_nasdaq

This data is then processed by `train_trade_data_deepseek_sentiment.py` and `train_trade_data_deepseek_risk.py` to generate agent-ready datasets. For plain PPO and CPPO, `train_trade_data.py` is used.

## Training and Environments

- For training PPO, run: `nohup mpirun --allow-run-as-root -np 8 python train_ppo.py > output_ppo.log 2>&1 &`
- For CPPO: `train_cppo.py`
- For PPO-DeepSeek: `train_ppo_llm.py`
- For CPPO-DeepSeek: `train_cppo_llm_risk.py`

Environment files are:

- `env_stocktrading.py` for PPO and CPPO, the same as in the original FinRL
- `env_stocktrading_llm.py` or `env_stocktrading_llm_01.py` for PPO-DeepSeek (depending on the desired strength of the LLM influence; more tweaking would be interesting)
- `env_stocktrading_llm_risk.py` or `env_stocktrading_llm_risk_01.py` for CPPO-DeepSeek

Log files such as `output_ppo.log` should be monitored during training, especially:

- `AverageEpRet`
- `KL`
- `ClipFrac`

## Evaluation

Evaluation over the trading period (2019-2023) is performed in the `FinRL_DeepSeek_backtest.ipynb` Colab notebook.

The metrics used are `Information Ratio`, `CVaR`, and `Rachev Ratio`; adding others such as `Outperformance frequency` would be nice.
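
Since the backtest notebook is not reproduced here, the snippet below is a minimal sketch of how these three metrics can be computed from daily returns. The 5% tail level, the annualization factor of 252, and the toy data are assumptions for illustration, not details taken from `FinRL_DeepSeek_backtest.ipynb`.

```python
# Hedged sketch: Information Ratio, CVaR, and Rachev Ratio from daily returns.
import numpy as np
import pandas as pd


def information_ratio(returns: pd.Series, benchmark: pd.Series, periods: int = 252) -> float:
    """Annualized mean active return divided by its tracking error."""
    active = returns - benchmark
    return np.sqrt(periods) * active.mean() / active.std()


def cvar(returns: pd.Series, alpha: float = 0.05) -> float:
    """Mean return over the worst alpha fraction of days (left tail)."""
    var = returns.quantile(alpha)
    return returns[returns <= var].mean()


def rachev_ratio(returns: pd.Series, alpha: float = 0.05) -> float:
    """Expected gain in the right tail divided by expected loss in the left tail."""
    right_tail = returns[returns >= returns.quantile(1 - alpha)].mean()
    left_tail = -returns[returns <= returns.quantile(alpha)].mean()
    return right_tail / left_tail


if __name__ == "__main__":
    # Toy random returns; replace with the agent's and the benchmark's daily
    # returns exported from the backtest notebook.
    rng = np.random.default_rng(0)
    agent = pd.Series(rng.normal(5e-4, 0.01, 1260))      # roughly 5 trading years
    benchmark = pd.Series(rng.normal(3e-4, 0.01, 1260))
    print("Information Ratio:", information_ratio(agent, benchmark))
    print("CVaR (5%):", cvar(agent))
    print("Rachev Ratio (5%):", rachev_ratio(agent))
```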
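
For the data-preprocessing step described above, the sketch below illustrates the general shape of an LLM scoring call such as the one in `sentiment_deepseek_deepinfra.py`: each news headline is sent to a DeepSeek model served through DeepInfra's OpenAI-compatible API and mapped to a small integer score. The endpoint URL, model name, prompt, and 1-5 scale are assumptions made for illustration; the repository scripts remain the reference.

```python
# Hedged sketch of an LLM sentiment-scoring call (assumed DeepInfra setup).
import os
import re

from openai import OpenAI

client = OpenAI(
    base_url="https://api.deepinfra.com/v1/openai",  # assumed OpenAI-compatible endpoint
    api_key=os.environ["DEEPINFRA_API_KEY"],
)

PROMPT = (
    "You are a financial analyst. Rate the sentiment of this news headline for "
    "the mentioned stock on a scale from 1 (very negative) to 5 (very positive). "
    "Answer with a single digit.\n\nHeadline: {headline}"
)


def sentiment_score(headline: str, model: str = "deepseek-ai/DeepSeek-V3") -> int | None:
    """Return a 1-5 sentiment score, or None if the reply cannot be parsed."""
    reply = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": PROMPT.format(headline=headline)}],
        max_tokens=4,
        temperature=0.0,
    )
    match = re.search(r"[1-5]", reply.choices[0].message.content)
    return int(match.group()) if match else None


if __name__ == "__main__":
    print(sentiment_score("Nvidia beats earnings expectations, raises guidance"))
```

The risk signal in `risk_deepseek_deepinfra.py` follows the same pattern, with a prompt asking for a risk level instead of a sentiment score.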