Analyze Custom CSV Data with GPT-4 using LangChain

TL;DR

LangChain’s create_csv_agent connects an LLM to a CSV file by giving it a Python/pandas execution tool, so answers are computed rather than guessed.

Briefing

A LangChain “CSV agent” can turn a custom Bitcoin price spreadsheet into a question-answering system that writes and runs pandas code on the fly, then returns numeric answers and even a bullish/bearish read. The dataset is a preprocessed BTC daily price CSV of about 458 rows, with a date column, open/high/low/close prices, and derived fields for day-of-week, month, and year. Connecting GPT-style models to it means queries like “average price during February 2023” or “best day to buy” become executable data analysis rather than static text responses.

The setup starts with LangChain’s create_csv_agent, which gives the model a tool: a Python/pandas execution environment. The agent is prompted to plan steps, generate pandas code, run it, and then return the final answer. The transcript demonstrates this with two model configurations: an older text-davinci-003 setup and a GPT-4 setup (via a ChatOpenAI wrapper). Even basic sanity checks work: asking for the number of rows and columns triggers code using df.shape, and the returned dimensions match the loaded dataset.
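A minimal sketch of that setup, assuming a recent LangChain release in which create_csv_agent is imported from langchain_experimental and ChatOpenAI from langchain_openai (the transcript's older version likely imported both from langchain itself). The helper name build_csv_agent is hypothetical, and the allow_dangerous_code flag is an assumption about newer releases, which ask for an explicit opt-in before running generated Python.

```python
def build_csv_agent(csv_path: str, model_name: str = "gpt-4"):
    """Return a LangChain agent that answers questions by running pandas code.

    Imports are deferred so this module still loads where LangChain is absent.
    Calling the function requires OPENAI_API_KEY in the environment.
    """
    from langchain_experimental.agents.agent_toolkits import create_csv_agent
    from langchain_openai import ChatOpenAI

    # temperature=0 keeps the generated pandas code as repeatable as possible.
    llm = ChatOpenAI(model=model_name, temperature=0)

    return create_csv_agent(
        llm,
        csv_path,
        verbose=True,               # print the thought/action/observation trace
        allow_dangerous_code=True,  # assumption: opt-in needed on newer releases
    )
```

With a CSV on disk, something like build_csv_agent("btc_daily.csv").invoke({"input": "How many rows and columns are there?"}) should return a dict whose "output" entry holds the answer; the file name here is a placeholder.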

From there, the analysis becomes increasingly specific. The agent calculates the average closing price for February 2023 by filtering rows to month=February and year=2023, then taking the mean of the price column. It also computes differences between months (mean February vs. mean March 2023), percentage changes across years (comparing average 2022 vs. average 2023), and intra-year comparisons like the percentage change between January and February 2023. For more time-series style questions, GPT-4 handles rolling computations: it applies a pandas rolling window and takes its mean to estimate a one-week moving average during 2023.
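The filter-then-aggregate pattern behind these answers can be sketched in plain pandas. The column names (date, close, year, month) and every price below are invented for illustration; the transcript's CSV may be laid out differently, and the code the agent actually generates will vary.

```python
import pandas as pd

# Synthetic stand-in for the BTC CSV. Column names and prices are illustrative
# assumptions, not values from the transcript's dataset.
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2022-12-30", "2022-12-31",
        "2023-01-01", "2023-01-02",
        "2023-02-01", "2023-02-02",
        "2023-03-01", "2023-03-02",
    ]),
    "close": [16000.0, 17000.0, 17000.0, 19000.0,
              23000.0, 25000.0, 26000.0, 28000.0],
})
df["year"] = df["date"].dt.year
df["month"] = df["date"].dt.month_name()


def monthly_mean(frame: pd.DataFrame, month: str, year: int) -> float:
    """Mean close for one calendar month: boolean filter, then aggregate."""
    mask = (frame["month"] == month) & (frame["year"] == year)
    return float(frame.loc[mask, "close"].mean())


def pct_change(old: float, new: float) -> float:
    """Percentage change going from old to new."""
    return (new - old) / old * 100


jan_mean = monthly_mean(df, "January", 2023)
feb_mean = monthly_mean(df, "February", 2023)
mar_mean = monthly_mean(df, "March", 2023)

feb_to_mar_diff = mar_mean - feb_mean            # difference between months
jan_to_feb_pct = pct_change(jan_mean, feb_mean)  # intra-year percentage change

# Year-over-year comparison: average per year, then percentage change.
yearly_mean = df.groupby("year")["close"].mean()
year_pct = pct_change(yearly_mean.loc[2022], yearly_mean.loc[2023])

print(feb_mean, feb_to_mar_diff, round(jan_to_feb_pct, 2), round(year_pct, 2))
```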

Where the system gets most interesting is in “decision-style” prompts. The transcript asks which day of the week the price dropped the most and which day is best to buy. The text-davinci-003 agent tends to rely on simpler averages, while GPT-4 introduces a more data-driven approach by computing day-to-day price differences (current minus previous day), then grouping by day-of-week and selecting the most negative change. For “best to buy,” the GPT-4 approach effectively finds the day-of-week with the lowest average price.
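To make the contrast concrete, here is a small sketch of both strategies on invented prices. The column names are assumptions, and so is the choice to take the single most negative change per weekday: the transcript does not say whether GPT-4 aggregated the differences by minimum or by mean.

```python
import pandas as pd

# Two synthetic weeks starting on a Monday; prices are invented.
dates = pd.date_range("2023-01-02", periods=14, freq="D")
close = [100, 102, 101, 105, 95, 94, 97,
         98, 99, 103, 104, 90, 88, 93]
df = pd.DataFrame({"date": dates, "close": close})
df["day_of_week"] = df["date"].dt.day_name()

# Day-to-day difference: current close minus the previous day's close.
df["daily_change"] = df["close"].diff()

# "Dropped the most": most negative single-day change, grouped by weekday.
worst_drop_day = df.groupby("day_of_week")["daily_change"].min().idxmin()

# "Best day to buy" under a cheapest-price criterion: lowest average close.
cheapest_day = df.groupby("day_of_week")["close"].mean().idxmin()

print(worst_drop_day, cheapest_day)
```

In this toy data the two answers differ: the sharpest falls land on Friday, while Saturday has the lowest average price. That gap is why averaging prices alone can miss the day with the largest declines.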

The agent also identifies extrema: it finds the minimum Bitcoin price date (November 21, 2022) and the maximum price date in 2023 (with a follow-up confirming the result as April 1, 2023). A tougher query—how much the price increased since the minimum—shows model-dependent failure modes: text-davinci-003 initially compares the minimum to an incorrect “current” value, while GPT-4 correctly uses the last available date in the dataset (April 3, 2023) and reports a percentage increase of 76.24%.
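The "increase since the low" calculation hinges on what counts as the current price. A hedged sketch follows, using the dates quoted above but invented prices, so the resulting percentage is not the transcript's 76.24%. Sorting by date first is an added safeguard, not something the transcript describes.

```python
import pandas as pd

# Rows are deliberately out of order; prices are invented for illustration.
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2023-04-03", "2022-11-20", "2022-11-21", "2023-04-02", "2022-11-22"]
    ),
    "close": [280.0, 200.0, 160.0, 290.0, 180.0],
})

# Sort so that "last row" really means "latest date in the dataset".
df = df.sort_values("date").reset_index(drop=True)

low_row = df.loc[df["close"].idxmin()]  # row holding the minimum close
last_row = df.iloc[-1]                  # most recent available row

pct_increase = (last_row["close"] - low_row["close"]) / low_row["close"] * 100

print(low_row["date"].date(), last_row["date"].date(), pct_increase)
```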

Finally, the transcript frames an overall “price strength” question. text-davinci-003 returns a bearish conclusion based on its internal trend read. GPT-4 instead computes a simple first-to-last change across the dataset, concluding the trend remains bearish because the price drops by about $20,000 from the first day to the last day. A bonus attempt at forecasting BTC for May 31, 2023 fails when the dataset lacks that date, but GPT-4 proposes a pragmatic workaround: estimate average daily change and project forward from the most recent available price.

Cornell Notes

LangChain’s create_csv_agent can connect GPT-style models to a custom Bitcoin CSV so the model generates and runs pandas code, then returns computed answers. With a BTC daily price dataset (roughly 458 rows) containing date-derived fields (month, year, day-of-week) and OHLC data, the agent successfully performs filtering, averaging, month-to-month comparisons, percentage-change calculations, and rolling one-week moving averages. GPT-4 tends to handle more complex “decision” questions better by using day-to-day price differences rather than only averages. Results vary by query: some prompts produce correct extrema (e.g., minimum on November 21, 2022 and maximum in 2023 on April 1, 2023), while other “current price” comparisons can fail unless the model uses the last available row in the dataset. The overall takeaway is that LLMs can act as an interactive analytics layer over CSV data, but prompt precision and model choice matter.

How does the LangChain CSV agent turn natural-language questions into actual calculations on the CSV?

It uses create_csv_agent to wrap a language model with a tool that can execute pandas/Python code against the loaded CSV. The default prompt instructs the model to follow a ReAct-style loop: take the question, think about what to do, choose an action (a Python/pandas execution tool), provide code, observe the output, and then produce a final answer. In the transcript, a query like “how many rows and columns” triggers code using df.shape, and the returned tuple matches the dataset dimensions.
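The "action" and "observation" steps of that loop can be imitated with a toy tool. This is an illustration of the idea only, not LangChain's implementation: run_pandas_tool is a hypothetical helper, and the code string is hard-coded where a real agent would use model output.

```python
import pandas as pd


def run_pandas_tool(code: str, df: pd.DataFrame):
    """Evaluate one pandas expression against df and return the result.

    Toy stand-in for the agent's Python tool. Using eval on model-written
    code is unsafe outside a sandbox.
    """
    return eval(code, {"pd": pd, "df": df})


# Tiny synthetic frame; the real CSV has roughly 458 rows.
df = pd.DataFrame({"close": [100.0, 101.5, 99.0], "year": [2023, 2023, 2023]})

# Action: the model emits code. Observation: the tool returns its value.
observation = run_pandas_tool("df.shape", df)
print(observation)  # (3, 2)
```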

What kinds of computations work reliably on the BTC daily price CSV?

Filtering by time windows (e.g., average closing price during February 2023), comparing aggregates across periods (mean February vs. mean March 2023), and computing percentage changes across years (average 2022 vs. average 2023) all work. The agent groups by year or month, selects the price column, and applies the mean, then a difference or percentage-change formula. The transcript also shows a one-week moving average for 2023, computed with a pandas rolling window and its mean.
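A one-week moving average restricted to 2023 might look like the sketch below. The data is synthetic and the column names are assumptions. Filtering to 2023 before rolling is one possible ordering; rolling over the full series first and filtering afterwards would let late-2022 prices fill the first windows.

```python
import pandas as pd

# Ten synthetic days straddling the year boundary; prices are invented.
dates = pd.date_range("2022-12-29", periods=10, freq="D")
close = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0]
df = pd.DataFrame({"date": dates, "close": close})

# Keep only 2023 rows, then average over a sliding 7-row window.
in_2023 = df[df["date"].dt.year == 2023].copy()
in_2023["ma_7"] = in_2023["close"].rolling(window=7).mean()

# The first six rows are NaN because their window is not yet full.
print(in_2023[["date", "close", "ma_7"]])
```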

Why do GPT-4 and text-davinci-003 differ on “best day” and “biggest drop” questions?

The transcript suggests text-davinci-003 often answers “which day dropped the most” using simpler averages by day-of-week, which may not capture the magnitude of day-to-day declines. GPT-4 instead computes a price difference between each day and the previous day (current minus previous day), then groups those differences by day-of-week and selects the minimum change. For “best to buy,” GPT-4 effectively finds the day-of-week with the lowest average price, which aligns with the cheapest-price criterion.

How are extrema (minimum/maximum) identified, and what pitfalls appear?

For the minimum price, the agent finds November 21, 2022 as the date of the lowest Bitcoin price in the dataset. For the maximum in 2023, it identifies April 1, 2023 as the highest-price date (confirmed via follow-up). A pitfall appears in “increase since November 21, 2022”: text-davinci-003 initially uses an incorrect “current” value, while GPT-4 correctly compares the minimum to the last available date in the dataset (April 3, 2023), yielding a 76.24% increase.
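Finding the dates of extrema typically comes down to idxmin and idxmax, with a year filter for the 2023 maximum. In this sketch the dates mirror the ones quoted above, but the prices are invented and the column names are assumptions.

```python
import pandas as pd

# The highest invented price sits in 2022, so the year filter matters.
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2022-06-01", "2022-11-21", "2023-01-15", "2023-04-01", "2023-04-03"]
    ),
    "close": [300.0, 160.0, 210.0, 290.0, 280.0],
})

# Date of the lowest close across the whole dataset.
min_date = df.loc[df["close"].idxmin(), "date"]

# Date of the highest close within 2023 only.
rows_2023 = df[df["date"].dt.year == 2023]
max_date_2023 = df.loc[rows_2023["close"].idxmax(), "date"]

print(min_date.date(), max_date_2023.date())
```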

What does the “overall price strength” question reduce to in the transcript’s results?

text-davinci-003 returns a bearish conclusion based on its internal trend read. GPT-4 uses a straightforward first-to-last change approach across the dataset: it computes the price drop from the first day to the last day and concludes the trend remains bearish, citing an approximate $20,000 decline. The transcript also notes that GPT-4’s approach is more explicitly tied to a concrete calculation rather than an unclear chart interpretation.
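A first-to-last heuristic of this kind fits in a few lines. The function name trend_label and the prices are hypothetical, and the bullish/bearish/flat thresholds are an assumption about how such a label could be assigned.

```python
import pandas as pd


def trend_label(close: pd.Series) -> tuple[float, str]:
    """Label a series by the sign of (last value - first value)."""
    change = float(close.iloc[-1] - close.iloc[0])
    if change > 0:
        return change, "bullish"
    if change < 0:
        return change, "bearish"
    return change, "flat"


# Invented prices: a long decline followed by a partial recovery.
close = pd.Series([480.0, 300.0, 160.0, 280.0])
change, label = trend_label(close)
print(change, label)
```

The toy series has rebounded well off its low yet still gets a bearish label. The heuristic looks only at the two endpoints and ignores everything in between.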

Why does forecasting BTC for May 31, 2023 fail, and what workaround does GPT-4 propose?

The dataset ends before May 31, 2023, so the agent can’t directly retrieve a row for that date. When asked for a single number for May 31, it initially misinterprets the request and falls back to unrelated values (like mean values from available years). GPT-4 then proposes a workaround: estimate the average daily price increase/decrease, multiply by the number of days from the most recent dataset date to May 31, and add the projected change to the most recent price.
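That workaround is a straight-line projection and can be sketched without pandas. The function name project_price and the prices are hypothetical; April 3, 2023 is taken as the last dataset date, as stated elsewhere in this summary. It is a naive extrapolation, not a forecasting model.

```python
from datetime import date


def project_price(closes, last_date, target_date):
    """Project forward: last price + average daily change * days ahead."""
    if len(closes) < 2:
        raise ValueError("need at least two prices to estimate a daily change")
    # The mean of consecutive daily differences telescopes to this expression.
    avg_daily_change = (closes[-1] - closes[0]) / (len(closes) - 1)
    days_ahead = (target_date - last_date).days
    return closes[-1] + avg_daily_change * days_ahead


# Invented daily closes ending on the dataset's last date.
closes = [100.0, 104.0, 102.0, 110.0, 112.0]
estimate = project_price(closes, date(2023, 4, 3), date(2023, 5, 31))
print(estimate)  # 112 + 3.0 * 58 days = 286.0
```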

Review Questions

When the agent answers “average price during February 2023,” what filtering and aggregation steps are being performed on the CSV?
What calculation strategy does GPT-4 use for “which day of the week the price dropped the most,” and how does it differ from using only day-of-week averages?
In the “increase since November 21, 2022” query, what mistake does the text-davinci-003 approach make, and how does GPT-4 correct it using the dataset’s last available date?

Key Points

1. LangChain’s create_csv_agent connects an LLM to a CSV file by giving it a Python/pandas execution tool, so answers are computed rather than guessed.
2. A preprocessed BTC daily price CSV with derived date fields (month, year, day-of-week) makes time-based questions easier for the agent to execute.
3. Setting temperature to 0 reduces randomness and helps the model produce more consistent code and outputs for data queries.
4. GPT-4 tends to handle complex “drop magnitude” questions better by computing day-to-day price differences and then grouping by day-of-week.
5. Some prompts fail when they require values outside the dataset’s date range; the agent can’t retrieve May 31, 2023 if the CSV doesn’t include it.
6. Correct “current price” comparisons depend on using the last available row in the dataset; otherwise percentage-change results can be wrong.
7. For forecasting-style requests, a practical fallback is projection from the most recent available date using average daily change rather than direct lookup.

Highlights

The CSV agent answers questions by generating and running pandas code (e.g., df.shape for dataset dimensions), then returning the computed result.

GPT-4 improves “biggest drop” analysis by calculating day-to-day differences and grouping those differences by day-of-week, not just averaging prices.

The minimum Bitcoin price date in the dataset is identified as November 21, 2022, and the maximum in 2023 is confirmed as April 1, 2023.

A “since the low” percentage increase works only when the model compares against the dataset’s last available date (April 3, 2023), producing 76.24%.

Forecasting May 31, 2023 fails as a direct lookup because the dataset doesn’t include that date, but GPT-4 suggests projecting using average daily change.

Topics

LangChain CSV Agent
GPT-4 Data Analysis
Bitcoin Price CSV
Pandas Rolling Average
LLM Tool Use

Mentioned

LangChain
OpenAI
ChatOpenAI
GPT-4
OHLC
CSV