Fine-tuning Alpaca: Train Alpaca LoRa for Sentiment Analysis on a Custom Dataset
Based on Venelin Valkov's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Briefing
Fine-tuning Llama 7B with LoRA on a custom Bitcoin-tweet sentiment dataset can produce a practical sentiment classifier that labels new tweets as positive, neutral, or negative. The workflow hinges on converting Kaggle tweet data into Alpaca-style instruction records, then training only a small fraction of parameters via low-rank adapters—making the process feasible on a single GPU using 8-bit model loading.
The dataset starts from “BTC tweets sentiment” on Kaggle, with roughly 50k scraped tweets and sentiment labels. Before training, the pipeline removes retweets and drops any tweets containing links to reduce noisy or non-original text. Sentiment labels are then normalized into three classes: the script maps numeric sentiment scores to strings (“positive” when the score is above 1, “negative” when below 0, otherwise “neutral”). Each training example is reformatted into the Alpaca LoRA schema: a constant instruction (“detect the sentiment of the tweet”), the tweet text as the input, and the sentiment class as the output. The resulting JSON file becomes the direct training input for the fine-tuning step.
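The conversion step above can be sketched in a few lines. This is a minimal illustration, not the tutorial's verbatim code: the column names ("text", "score") and the exact filter conditions are assumptions, while the thresholds and the output schema follow the description above.

```python
import json


def score_to_sentiment(score: float) -> str:
    # Mapping described above: >1 positive, <0 negative, otherwise neutral.
    if score > 1:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def to_alpaca_records(rows: list[dict]) -> list[dict]:
    # rows would typically come from csv.DictReader over the Kaggle CSV;
    # "text" and "score" are assumed column names.
    records = []
    for row in rows:
        text = row["text"]
        # Drop retweets and tweets containing links to reduce noise.
        if text.startswith("RT") or "http" in text:
            continue
        records.append({
            "instruction": "Detect the sentiment of the tweet.",
            "input": text,
            "output": score_to_sentiment(float(row["score"])),
        })
    return records


if __name__ == "__main__":
    rows = [
        {"text": "BTC to the moon!", "score": "2.0"},
        {"text": "RT great coin", "score": "1.5"},
        {"text": "see https://x.co", "score": "0.5"},
        {"text": "flat market today", "score": "-1.0"},
    ]
    with open("alpaca-bitcoin-sentiment-dataset.json", "w") as f:
        json.dump(to_alpaca_records(rows), f, indent=2)
```

The resulting JSON list of {instruction, input, output} objects is exactly the shape the Alpaca LoRA training script expects.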
On the modeling side, the base is Llama 7B loaded from Hugging Face, using 8-bit weights to save memory and speed up training. The tokenizer is initialized from the same base model, with padding configured by setting the pad token ID to 0 and using left padding. The training dataset is loaded from the JSON and wrapped with the Alpaca prompt template (“Write a response that appropriately completes the request…”), then tokenized with a hard cutoff length of 256 tokens. Labels are constructed so the model learns to generate the correct response portion of each prompt.
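A sketch of the prompt construction step: the template wording below is the standard Alpaca format, and the tokenizer settings shown in comments (including the `decapoda-research/llama-7b-hf` checkpoint name) are assumptions about this particular run, not confirmed details.

```python
CUTOFF_LEN = 256  # hard truncation length used during tokenization


def generate_prompt(example: dict) -> str:
    # Standard Alpaca instruction/input/response template; the response
    # (the sentiment label) is appended so the model learns to generate it.
    return (
        "Below is an instruction that describes a task, paired with an input "
        "that provides further context. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{example['instruction']}\n\n"
        f"### Input:\n{example['input']}\n\n"
        f"### Response:\n{example['output']}"
    )


# With a Hugging Face tokenizer, each prompt would then be tokenized with
# truncation at CUTOFF_LEN, roughly:
#   tokenizer = LlamaTokenizer.from_pretrained("decapoda-research/llama-7b-hf")
#   tokenizer.pad_token_id = 0      # pad token set to 0, as described above
#   tokenizer.padding_side = "left"
#   tokens = tokenizer(generate_prompt(ex), truncation=True,
#                      max_length=CUTOFF_LEN)
```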
Instead of updating all model weights, the process uses PEFT’s LoRA (low-rank adaptation). LoRA is configured to target the query and value projection layers, with rank r=8, alpha scaling, and a 5% dropout, and it’s applied for causal language modeling. Training updates about 0.06% of parameters—small enough to be efficient while still steering the model toward the sentiment task. The run uses micro-batch size 4, mixed precision (float16), Adam with weight decay, and trains for 300 steps with evaluation and checkpointing every 50 steps. After training, the fine-tuned model is saved and published to Hugging Face under the name “Alpaca Bitcoin Tweets Sentiment.”
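The LoRA and trainer configuration described above might look roughly like the following. This is a hedged configuration sketch using the PEFT/transformers APIs of that era; values the summary does not state (the alpha scaling factor of 16, the optimizer string, the output directory) are assumptions.

```python
from peft import LoraConfig, get_peft_model, prepare_model_for_int8_training
from transformers import LlamaForCausalLM, TrainingArguments

# Load the base model in 8-bit to fit a single GPU (checkpoint name assumed).
model = LlamaForCausalLM.from_pretrained(
    "decapoda-research/llama-7b-hf",
    load_in_8bit=True,
    device_map="auto",
)
model = prepare_model_for_int8_training(model)

lora_config = LoraConfig(
    r=8,                                  # low-rank dimension
    lora_alpha=16,                        # scaling factor (assumed value)
    target_modules=["q_proj", "v_proj"],  # query and value projections
    lora_dropout=0.05,                    # 5% dropout
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # reports roughly 0.06% trainable

training_args = TrainingArguments(
    per_device_train_batch_size=4,   # micro-batch size 4
    max_steps=300,                   # short run: 300 steps total
    evaluation_strategy="steps",
    eval_steps=50,                   # evaluate every 50 steps
    save_steps=50,                   # checkpoint every 50 steps
    fp16=True,                       # float16 mixed precision
    optim="adamw_torch",             # Adam with weight decay
    output_dir="alpaca-bitcoin-tweets-sentiment",
)
```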
To validate performance, the workflow switches from training to inference: it clones the Alpaca LoRA repo at a fixed commit, then runs generate.py with the base Llama model plus the custom LoRA weights. A Gradio interface lets users paste tweets and receive sentiment predictions. Example inputs include bullish phrasing (“A project with great prospects and opportunities”) yielding “positive,” neutral market commentary (“Get ready to take short positions”) yielding “neutral,” and bearish signals (“If you think the run of BTC is over…”) yielding “negative.” The end result is a reproducible recipe for turning a general LLM into a domain-specific sentiment tool using instruction tuning plus LoRA.
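Since the model emits a full Alpaca-style completion, the predicted class has to be read out of the text after the "### Response:" marker. The helper below is hypothetical (generate.py handles this internally), but it shows the parsing step under that assumption.

```python
def extract_sentiment(completion: str) -> str:
    """Pull the predicted label out of an Alpaca-style completion.

    Hypothetical helper: assumes the model's answer follows the
    "### Response:" marker, as in the standard Alpaca prompt template.
    """
    marker = "### Response:"
    answer = completion.split(marker, 1)[-1]
    # Keep only the first word of the answer, normalized to lowercase.
    return answer.strip().split()[0].rstrip(".").lower()
```

For example, a completion ending in "### Response:\nPositive" would be reduced to the label "positive" before being shown in the Gradio interface.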
Cornell Notes
The project fine-tunes Llama 7B for Bitcoin tweet sentiment by converting Kaggle-labeled tweets into Alpaca-style instruction data. Tweets are cleaned by removing retweets and link-containing posts, then sentiment scores are mapped into three labels: positive, neutral, and negative. Training loads the base model in 8-bit and uses PEFT LoRA to update only a small slice of parameters (about 0.06%), targeting the query and value projections for causal language modeling. Data is wrapped in the Alpaca prompt template, tokenized with a 256-token cutoff, and trained for 300 steps with periodic evaluation. A Gradio-based generate.py script then uses the base model plus the LoRA weights to classify new tweets.
How does the pipeline turn raw tweet sentiment data into something Alpaca LoRA can train on?
Why load Llama 7B in 8-bit, and what does that enable in practice?
What role does LoRA play, and which parts of the model get trained?
How is the prompt and label construction handled during training?
What training settings determine how the model learns and how progress is monitored?
How does inference work after fine-tuning?
Review Questions
- What preprocessing steps are applied to the Kaggle tweets before converting them into Alpaca-style JSON records?
- Which model components are targeted by LoRA in this setup, and roughly what fraction of parameters becomes trainable?
- How do prompt formatting and the 256-token cutoff affect what the model is trained to generate?
Key Points
1. Convert tweet sentiment CSV data into Alpaca LoRA JSON with fields for instruction, input (tweet text), and output (positive/neutral/negative).
2. Filter training data by removing retweets and tweets containing links to reduce noise in sentiment learning.
3. Load Llama 7B in 8-bit and use float16 mixed precision to make fine-tuning practical on a single GPU.
4. Use PEFT LoRA to train only low-rank adapters, targeting query and value projection layers for causal language modeling.
5. Wrap each example in the Alpaca prompt template and tokenize with truncation at a 256-token cutoff to control context length.
6. Train for a limited number of steps (300) with periodic evaluation/checkpointing, then save and reuse the LoRA weights for inference.
7. Run generate.py with the base model plus the custom LoRA weights and serve predictions through a Gradio interface for real tweet inputs.