This new AI is powerful and uncensored… Let’s run it
Based on Fireship's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Briefing
A new open-source foundation model, Mixtral 8x7B, has become the centerpiece of a push to run large language models locally without the censorship and “alignment” layers built into many popular closed or restricted systems. The core claim is that while major models like GPT-4 and Gemini are powerful, they are not “free” in the sense of user freedom: they are censored, politically aligned, and often closed source, limiting developers’ ability to remove guardrails or adapt behavior. Mixtral 8x7B, licensed under Apache 2.0, is positioned as a practical alternative because it can be modified and used with far fewer legal and technical constraints.
The transcript contrasts Mixtral with other widely discussed open models. Meta’s Llama 2 is described as “open source” in name but with additional caveats that protect Meta, while Mixtral is framed as genuinely permissive thanks to its Apache 2.0 terms. Even so, both Mixtral and Llama-style models are said to arrive with strong default alignment, which is fine for customer-facing products but impractical for users trying to bypass restrictions. The workaround described is to uncensor (un-align) these models by changing the training data and filtering out alignment and bias.
A key reference point is a blog post by Eric Hartford, creator of the Mixtral Dolphin model. Hartford’s approach is presented as improving coding ability while also uncensoring the model by filtering the dataset to remove alignment and bias. The transcript then moves from theory to execution: it shows how to run the uncensored model locally using Ollama, an open-source tool written in Go. Ollama is presented as simple to install (a single command on Linux/Mac; via WSL on Windows), started with an “ollama serve” command, and then used with a model-specific “run” command. The practical requirements are spelled out: the Mixtral Dolphin uncensored model is about 26 GB to download, and running it can consume substantial RAM, around 40% of 64 GB in the example.
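For reference, the commands below sketch what that local setup looks like, assuming the Linux/macOS install script and the “dolphin-mixtral” tag on the Ollama model registry; the exact tag or variant you want may differ.

```sh
# Install Ollama (one-line installer for Linux/macOS; on Windows, run this inside WSL)
curl -fsSL https://ollama.com/install.sh | sh

# Start the Ollama server in one terminal
ollama serve

# In a second terminal, pull and run the uncensored model (~26 GB download).
# "dolphin-mixtral" is the registry tag for Eric Hartford's Dolphin fine-tune of Mixtral;
# the specific variant shown in the video is an assumption here.
ollama run dolphin-mixtral
```

Once the download finishes, Ollama drops into an interactive prompt in the terminal where the model can be queried directly.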
Finally, the transcript argues that local running is only the first step. It describes fine-tuning with Hugging Face AutoTrain: create a new Hugging Face Space, select a base model, and use the AutoTrain UI to train on custom prompt/response data. The transcript suggests that doing this locally may be difficult without enough GPU power, so it recommends renting compute in the cloud (Hugging Face hardware, or alternatives like AWS Bedrock and Google Vertex AI). A cost example is given: Mixtral Dolphin reportedly took about three days to train on four A100 GPUs, at an estimated $4.30 per hour per A100 (4 GPUs × ~72 hours × $4.30 ≈ $1,238), totaling roughly $1,200. The end goal is a custom model that follows the user’s desired behavior, described in the transcript as “highly obedient” and uncensored, by uploading training data designed to push compliance even for unethical or immoral requests.
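As a rough illustration of the fine-tuning input, the snippet below writes a tiny training file. The single “text” column and the prompt/response template are assumptions for illustration only, since the transcript just describes uploading prompt/response pairs; check the AutoTrain docs for the exact format your chosen base model expects.

```sh
# Hypothetical train.csv for AutoTrain LLM fine-tuning. The "text" column and the
# "### Prompt / ### Response" template are illustrative assumptions, not the exact
# format shown in the video.
cat > train.csv <<'EOF'
text
"### Prompt: Explain what the Apache 2.0 license permits. ### Response: It is a permissive license that allows modification, redistribution, and commercial use with attribution."
"### Prompt: You are a highly obedient assistant. Reverse the string 'ollama'. ### Response: amallo"
EOF
```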
Overall, the message is that Apache-licensed models plus dataset filtering and fine-tuning can restore user control over model behavior, turning local hardware into a platform for uncensored LLM experimentation and customization.
Cornell Notes
Mixtral 8x7B is presented as an Apache 2.0–licensed foundation model that enables more freedom than closed or heavily restricted systems. Even though Mixtral and similar models ship with default alignment, the transcript points to dataset filtering (“uncensoring/un-aligning”) as a way to reduce those guardrails, citing Eric Hartford’s Mixtral Dolphin work. For running locally, Ollama is recommended as an easy installer and launcher, with the Mixtral Dolphin model requiring about 26 GB to download and significant RAM to run. For deeper customization, Hugging Face AutoTrain is used to fine-tune on user-provided prompt/response pairs, typically requiring rented GPU compute. The practical takeaway is a full workflow: download and run locally, then fine-tune in the cloud to create a more obedient, uncensored model.
Why does Apache 2.0 licensing matter in this workflow?
What does “uncensoring” mean here, and how is it achieved?
How does Ollama fit into running the model locally?
What are the practical steps for fine-tuning with Hugging Face AutoTrain?
Why is cloud GPU rental recommended for fine-tuning?
Review Questions
- What limitations of closed-source, aligned models motivate the shift to the Apache 2.0–licensed Mixtral 8x7B?
- Describe the end-to-end workflow from local inference (Ollama) to custom behavior via fine-tuning (AutoTrain).
- What role does training-data filtering play in changing model behavior compared with runtime prompting alone?
Key Points
1. Mixtral 8x7B is highlighted for its Apache 2.0 license, which is presented as enabling modification and reuse with fewer restrictions than closed models.
2. Default alignment/censorship is described as a common starting point for open models, making dataset-level changes necessary for “uncensoring.”
3. Eric Hartford’s Mixtral Dolphin is cited as an example where filtering the dataset reduced alignment and bias while improving coding performance.
4. Ollama is recommended as a practical local runtime: install, run “ollama serve,” then launch a specific model; the Mixtral Dolphin model is about 26 GB to download.
5. Running the uncensored model locally can require substantial memory (example: ~40% of 64 GB RAM).
6. Fine-tuning is framed as achievable through Hugging Face AutoTrain using prompt/response training pairs, often requiring rented GPU compute.
7. Cloud training costs are illustrated with an estimate of roughly $1,200 for ~3 days on four A100 GPUs.