ChatGPT Micro-Cap Live Trading: AI-managed micro-cap portfolio experiment

A public live micro-cap trading experiment where ChatGPT makes daily trading decisions and records reproducible trade logs and performance visualizations; useful for research and teaching but constrained by legal, capital, and licensing considerations.

GitHub LuckyOne7777/ChatGPT-Micro-Cap-Experiment Updated 2026-01-07 Branch main Stars 7.1K Forks 1.5K

Python algorithmic-trading LLM-driven live-backtest-and-visualization

💡 Deep Analysis

What core problem does this project solve, and how does it achieve that through technology and process?

Core Analysis

Project Positioning: This project addresses whether a general-purpose LLM (e.g., ChatGPT) can serve as a reproducible decision engine for live micro-cap trading. The aim is not only performance measurement but also to build an auditable experimental pipeline.

Technical Features

Separation of data and decision: yfinance/Stooq for market data, trading_script.py for execution and stop-loss, ChatGPT for strategy suggestions; clear separation facilitates replacement and reproduction.
Reproducibility: ASOF_DATE enables historical replay/backtest, and daily CSVs plus chat logs are saved for reconstructing each decision point.
Rule-based risk control: Automated stop-losses and position limits act as hard constraints, creating a hybrid “LLM + rules” governance.
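
The hybrid "LLM + rules" idea above can be sketched as a pair of deterministic checks that run regardless of what the model suggests. This is a minimal illustration, not the code in trading_script.py: the names Position, triggered_stops, and cap_buy are hypothetical, and the 30% position cap is an assumed placeholder value.

```python
from dataclasses import dataclass


@dataclass
class Position:
    ticker: str
    shares: int
    stop_loss: float  # price at or below which the position is force-sold


def triggered_stops(positions, day_lows):
    """Return tickers whose daily low touched or breached the stop-loss."""
    return [
        p.ticker
        for p in positions
        if p.ticker in day_lows and day_lows[p.ticker] <= p.stop_loss
    ]


def cap_buy(shares_wanted, price, cash, equity, max_weight=0.30):
    """Shrink an LLM-suggested buy so it respects cash and a position cap.

    max_weight is an assumed limit on one position's share of total equity.
    """
    max_by_weight = int((equity * max_weight) // price)
    max_by_cash = int(cash // price)
    return max(0, min(shares_wanted, max_by_weight, max_by_cash))
```

Because both functions are pure and take plain numbers, they can be applied to every model suggestion before it is logged, which keeps the hard constraints independent of the prompt.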

Usage Recommendations

Start with historical replay and simulation: Use ASOF_DATE to backtest and verify outputs against the CSVs in the repo.
Retain raw chat and execution logs for post-hoc analysis of model errors or unusual market events.
Progress from paper trading to small live capital to observe slippage and latency impacts.

Important Notice: The README mentions ChatGPT-5; model versions, API behavior, and costs will affect reproducibility. Record model version and call settings.

Summary: The project is a research-oriented, modular framework that places LLM decision-making into a live micro-cap context with rule-based controls and logging, suitable for technically proficient users validating LLM-driven trading hypotheses.

What are the key strengths and potential weaknesses of the project's technical architecture? Why were these technologies (`yfinance`, `pandas`, ChatGPT, etc.) chosen?

Core Analysis

Architecture pros and cons overview: The project uses a lightweight Python stack (yfinance, pandas, Matplotlib) with ChatGPT as the decision engine. This lowers the barrier to entry and accelerates hypothesis testing but introduces external dependencies and uncertainties around data and execution.

Technical Features and Strengths

Rapid prototyping & replaceability: pandas-based processing and modular scripts make it easy to swap data sources or replace ChatGPT with another model.
Low deployment requirements: Only Python 3.11+ and internet access are required, suitable for researchers running experiments locally or on cloud instances.
Transparent logging: CSV exports and saved chat logs enable auditing and reproducibility.

Potential Weaknesses

Data quality risk: yfinance can have delays or missing data; reliable open/market prices are critical for trading decisions.
Missing execution layer: The repo does not include a broker API integration; handling real fills, slippage, and order failures is left to the user.
Model externalities: Reliance on ChatGPT API (versioning, latency, cost, behavioral drift) affects long-term reproducibility and cost profile.

Practical Recommendations

Replace data source before production: For larger capital or long-term live trading, connect to exchange-grade or paid data providers to ensure data integrity.
Implement an execution-simulation layer: Model slippage, fees, and partial fills during backtests to increase real-world fidelity.
Record model metadata: Save model version, prompts, temperature, and other call parameters to reproduce experiments.
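
An execution-simulation layer of the kind recommended above could start as a single function that turns an intended order into an estimated fill. The sketch below is an assumption-laden illustration, not part of the repo: simulate_fill is a hypothetical name, and the default slippage (25 basis points), per-share fee, and volume participation cap are placeholder values that should be replaced with figures measured for the actual broker and tickers.

```python
def simulate_fill(side, shares, ref_price, day_volume,
                  slippage_bps=25.0, fee_per_share=0.005,
                  max_participation=0.05):
    """Estimate a fill for one order under simple, assumed cost rules.

    side: "buy" or "sell"; ref_price: a reference price such as the open.
    The order fills at most max_participation of the day's volume (which
    yields partial fills), the price moves against the trader by
    slippage_bps, and a flat per-share fee is charged.
    """
    if side not in ("buy", "sell"):
        raise ValueError("side must be 'buy' or 'sell'")
    filled = min(shares, int(day_volume * max_participation))
    direction = 1 if side == "buy" else -1
    fill_price = ref_price * (1 + direction * slippage_bps / 10_000)
    fees = filled * fee_per_share
    # Buys reduce cash, sells increase it; fees always reduce it.
    cash_delta = -direction * filled * fill_price - fees
    return {"filled": filled, "unfilled": shares - filled,
            "fill_price": fill_price, "fees": fees, "cash_delta": cash_delta}
```

Running a replay once with these costs and once with them set to zero gives a rough sense of how much of the reported performance depends on frictionless fills.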

Important Notice: Without explicit fee and slippage assumptions, historical results in the repo should be treated as research references, not directly transferable to live results.

Summary: The stack is well-suited for research and rapid prototyping, but production-grade deployment requires strengthening data, execution, and model governance.

How should one move from the repo's simulated suggestions to real broker execution? How should slippage, partial fills, and risk control be handled in live execution?

Core Analysis

Key gap from suggestion to execution: The repo contains trade suggestions and logic for MOO/limit orders, but lacks a real broker execution layer, order lifecycle management, and reconciliation. Running these scripts live without modifications exposes you to slippage, partial fills, and compliance risks.

Required changes for live execution

Broker/execution integration: Extend trading_script.py to support broker SDKs (e.g., alpaca-trade-api, ib_insync), implement idempotent order placement, status polling, and retry logic.
Order lifecycle management: Handle submitted, partial_fill, filled, rejected states and persist raw fills for reconciliation.
Slippage & fee modeling: Incorporate slippage and commission models into backtests based on volume or historical spreads to avoid optimistic performance.
Risk checks pre-order: Apply deterministic checks (max position, per-trade risk, exposure limits, stop-loss thresholds) and block or downgrade orders that violate rules.
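
The order lifecycle and idempotency requirements above can be sketched without committing to any particular broker SDK. In this illustration the state names follow the list above, while TRANSITIONS, client_order_id, and Order are hypothetical names; a real integration would map the broker's own status values onto these states and would likely add a canceled state.

```python
import hashlib

# Allowed transitions between order states (an assumed, simplified model).
TRANSITIONS = {
    "new": {"submitted", "rejected"},
    "submitted": {"partial_fill", "filled", "rejected"},
    "partial_fill": {"partial_fill", "filled"},
    "filled": set(),
    "rejected": set(),
}


def client_order_id(run_date, ticker, side, shares):
    """Deterministic ID: retrying the same decision yields the same ID."""
    key = f"{run_date}|{ticker}|{side}|{shares}"
    return hashlib.sha256(key.encode()).hexdigest()[:16]


class Order:
    def __init__(self, run_date, ticker, side, shares):
        self.id = client_order_id(run_date, ticker, side, shares)
        self.shares = shares
        self.state = "new"
        self.fills = []  # raw (shares, price) tuples kept for reconciliation

    def advance(self, new_state, fill=None):
        """Move to new_state, rejecting transitions the model does not allow."""
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        if fill is not None:
            self.fills.append(fill)
        self.state = new_state

    @property
    def filled_shares(self):
        return sum(qty for qty, _ in self.fills)
```

Deriving the ID from the decision itself means a retry after a timeout can be detected as a duplicate, provided the broker accepts a client-supplied order ID.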

Practical execution workflow

Connect to a sandbox: Test order placement, cancellation, and receipt handling in a broker sandbox.
Small live pilot: Execute with minimal capital to measure real fills, slippage, and differences from backtest.
Automated reconciliation & alerts: Reconcile daily CSVs with broker fills; lock further orders if anomalies occur.
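
The reconciliation step above might look like the following sketch, which compares the trades recorded in the daily log with the fills reported by the broker. The function names, the dictionary layout keyed by order ID, and the one-cent price tolerance are all assumptions for illustration; the repo's actual CSV columns would need to be mapped into this shape first.

```python
def reconcile(logged, broker, price_tol=0.01):
    """Compare logged trades with broker fills, both keyed by order ID.

    Each value is a (shares, price) tuple. Returns a list of anomaly
    strings; an empty list means the records agree within price_tol.
    """
    issues = []
    for oid in sorted(set(logged) | set(broker)):
        if oid not in broker:
            issues.append(f"{oid}: logged but no broker fill")
        elif oid not in logged:
            issues.append(f"{oid}: broker fill missing from log")
        else:
            (l_qty, l_px), (b_qty, b_px) = logged[oid], broker[oid]
            if l_qty != b_qty:
                issues.append(f"{oid}: shares {l_qty} != {b_qty}")
            if abs(l_px - b_px) > price_tol:
                issues.append(f"{oid}: price {l_px} vs {b_px}")
    return issues


def trading_locked(issues):
    """Block new orders whenever any anomaly is outstanding."""
    return len(issues) > 0
```

Checking trading_locked before each day's order submission implements the "lock further orders" rule as a simple gate rather than a manual review step.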

Important Notice: Persisting trade receipts and LLM chat logs is crucial for dispute resolution and reproducibility. Use encryption and access controls for sensitive logs.

Summary: Converting the repo into a reliable live system requires broker integration, order lifecycle handling, slippage/fee modeling, and enforced risk/reconciliation processes. Progress gradually with small-scale pilots.

How can the experiment's reproducibility and research rigor be improved? What concrete enhancements would increase the credibility of its conclusions?

Core Analysis

Current reproducibility status: The repo provides several reproducibility foundations (CSV logs, chat records, ASOF_DATE replay), but lacks standardized recording of model metadata, slippage/fee assumptions, data snapshots, and environment specifications—weakening long-term reproducibility.

Key concrete improvements

Record full metadata: Capture model version (exact API version), full prompts, temperature/random seeds, API call timestamps, data source versions and query params.
Data snapshots & versioning: Produce downloadable snapshots or hashes of key daily market data so others can obtain identical inputs.
Explicit transaction-cost modeling: Incorporate commissions, bid/ask spread, and volume-based slippage into backtests and publish those assumptions.
Statistical robustness checks: Run multiple randomized replays, bootstrap resampling, or controlled baselines (e.g., random portfolios) to assess whether results are statistically distinguishable from noise.
Environment/dependency encapsulation: Provide requirements.txt/poetry.lock or Dockerfile to fix Python and library versions, reducing environment drift.
Clarify licensing & compliance: State repo license and data usage constraints to enable lawful reproduction and citation.
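
As one example of the robustness checks listed above, a percentile bootstrap gives a rough confidence interval for the portfolio's mean daily return. This is a generic statistical sketch, not a method taken from the repo: bootstrap_mean_ci is a hypothetical name, and it assumes daily returns are roughly independent, which is a strong assumption for a short micro-cap track record.

```python
import random
import statistics


def bootstrap_mean_ci(daily_returns, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean daily return."""
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    n = len(daily_returns)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(daily_returns, k=n))
        for _ in range(n_resamples)
    )
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi
```

If the resulting interval comfortably contains zero, or overlaps the interval for a benchmark or random-portfolio baseline, the observed outperformance may not be distinguishable from noise.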

Practical steps

Add a meta.json to each run that writes model metadata and data hashes.
Attach data snapshot links or hash values when publishing research results.
Define standard backtest scenarios (baseline, conservative, stressed) and publish results with confidence intervals.
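
The meta.json idea in the first step can be sketched with the standard library alone. The field names, the write_run_meta helper, and the choice of SHA-256 are assumptions for illustration; the repo does not define this file, and a real run would likely record more (API timestamps, data-source query parameters).

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path):
    """Hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_run_meta(run_dir, model, prompt, temperature, data_files):
    """Write meta.json describing one run; returns the metadata dict."""
    meta = {
        "created_utc": datetime.now(timezone.utc).isoformat(),
        "model": model,
        "temperature": temperature,
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "data_sha256": {Path(p).name: sha256_of(p) for p in data_files},
    }
    out = Path(run_dir) / "meta.json"
    out.write_text(json.dumps(meta, indent=2, sort_keys=True), encoding="utf-8")
    return meta
```

Publishing the hashes alongside results lets a reader confirm they are replaying against byte-identical inputs, even when the data files themselves cannot be redistributed.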

Important Notice: Preserve raw chat logs and trade receipts long-term; model and data providers may change behavior over time.

Summary: By systematizing metadata capture, versioned data snapshots, explicit cost modeling, and statistical robustness tests, the project can raise reproducibility and research credibility from an experimental level to an auditable academic standard.

✨ Highlights

Live experiment: ChatGPT manages a micro-cap portfolio daily and publishes data
Includes reproducible trading scripts and performance visualization tools
Repository lacks a clear license and shows minimal contributor/release activity
Real-money trading carries actual financial and regulatory risks — exercise caution

🔧 Engineering

LLM-driven trading engine with automated stop-loss, performance tracking and visualization
Supports historical backtest (ASOF_DATE), market-on-open and limit order simulation

⚠️ Risks

The missing license and limited community maintenance hamper long-term reproducibility and collaboration
Relies on proprietary LLMs and third-party data; live trading risks include real losses and compliance exposure

👥 For who?

Quant researchers, algorithmic trading enthusiasts, and educators looking for a sample experiment
Suitable for micro-cap live testing, strategy prototyping, and studying AI decision transparency