A Natural Language AI (LLM) SQL Database - Could this work?
Based on All About AI's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Briefing
Turning a SQL database into a natural-language hiring and skills assistant is feasible by combining local embeddings, RAG, and hybrid search. The core setup takes structured employee records—names, ages, professions, specialties, and achievements—then chunks that data, generates embeddings locally, and stores them for retrieval. Instead of writing SQL queries, a user can ask questions in plain language, and the system retrieves relevant rows using both semantic similarity (embeddings) and keyword matching, then feeds the retrieved context to a local LLM to produce a natural-language answer.
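As a concrete illustration of the chunking step, here is a minimal Python sketch that flattens one employee record into a text passage ready for embedding. The field names and the `row_to_chunk` helper are assumptions based on the profile attributes described above, not code from the video:

```python
# Minimal sketch: flatten a structured employee record into a text chunk
# that an embeddings model can encode. Field names are illustrative.

def row_to_chunk(row: dict) -> str:
    """Render one employee row as a single descriptive text passage."""
    return (
        f"{row['name']}, age {row['age']}, works as a {row['profession']}. "
        f"Specialty: {row['specialty']}. Achievements: {row['achievements']}."
    )

# Example profile drawn from the demo's retrieved evidence.
example = {
    "name": "Fiona Davis",
    "age": 34,  # illustrative value
    "profession": "Mobile Developer",
    "specialty": "Cross-platform mobile apps",
    "achievements": "Shipped a mobile app with over 100k downloads",
}
print(row_to_chunk(example))
```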
The workflow starts with a PostgreSQL database hosted on Heroku. The creator populates it with synthetic JSON profiles (about 30 employee-like entries), then verifies the inserts by reading the data back from the database. Next comes the “SQL to embeddings” step: a Python script pulls the SQL records, chunks them, and uses an embeddings model running locally through Ollama to generate vector representations. Those embeddings are saved to a file (vault_embeddings.json), along with an intermediate text/chunk file used for embedding. A separate “SQL RAG” script then performs retrieval and generation.
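A sketch of what that “SQL to embeddings” script could look like, assuming a simple `employees` table, a placeholder connection string, and the `nomic-embed-text` model served through Ollama's local `/api/embeddings` endpoint (the video does not specify these details):

```python
# Sketch of the "SQL to embeddings" step: pull rows from PostgreSQL,
# chunk them, embed each chunk with a locally running Ollama model,
# and save the results for the RAG script. Table/column names, the
# connection string, and the embedding model are assumptions.
import json

import psycopg2
import requests

OLLAMA_URL = "http://localhost:11434/api/embeddings"
EMBED_MODEL = "nomic-embed-text"  # any local Ollama embeddings model works here

def embed(text: str) -> list[float]:
    """Request an embedding vector from the local Ollama server."""
    resp = requests.post(OLLAMA_URL, json={"model": EMBED_MODEL, "prompt": text})
    resp.raise_for_status()
    return resp.json()["embedding"]

conn = psycopg2.connect("postgresql://user:pass@host:5432/dbname")  # placeholder DSN
with conn, conn.cursor() as cur:
    cur.execute("SELECT name, age, profession, specialty, achievements FROM employees")
    rows = cur.fetchall()
conn.close()

chunks, vectors = [], []
for name, age, profession, specialty, achievements in rows:
    chunk = (f"{name}, age {age}, works as a {profession}. "
             f"Specialty: {specialty}. Achievements: {achievements}.")
    chunks.append(chunk)
    vectors.append(embed(chunk))

# Persist the intermediate chunk file and the embeddings for retrieval.
with open("vault.txt", "w") as f:
    f.write("\n".join(chunks))
with open("vault_embeddings.json", "w") as f:
    json.dump({"chunks": chunks, "embeddings": vectors}, f)
```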
A key design choice is hybrid search. Retrieval doesn’t rely purely on embedding similarity; it also incorporates keyword search so exact terms and semantic meaning both influence which records are selected as context. That retrieved context is then passed to a local LLM—configured to run Mistral—so the system can answer in natural language rather than returning structured SQL output.
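One way to implement that hybrid scoring is to blend cosine similarity over the stored vectors with a simple keyword-overlap score. The 0.7/0.3 weighting and the scoring functions below are illustrative choices, not taken from the video:

```python
# Hybrid retrieval sketch: combine cosine similarity over the stored
# embeddings with keyword overlap, so exact terms and semantic meaning
# both influence which chunks are selected as context.
import json

import numpy as np
import requests

def embed(text: str) -> list[float]:
    """Embed the query with the same local Ollama model used for the store."""
    resp = requests.post("http://localhost:11434/api/embeddings",
                         json={"model": "nomic-embed-text", "prompt": text})
    resp.raise_for_status()
    return resp.json()["embedding"]

def keyword_score(query: str, chunk: str) -> float:
    """Fraction of query terms that appear verbatim in the chunk."""
    terms = set(query.lower().split())
    hits = sum(1 for t in terms if t in chunk.lower())
    return hits / max(len(terms), 1)

def hybrid_search(query: str, top_k: int = 3) -> list[str]:
    with open("vault_embeddings.json") as f:
        store = json.load(f)
    chunks = store["chunks"]
    matrix = np.array(store["embeddings"])        # shape: (n_chunks, dim)
    q = np.array(embed(query))                    # shape: (dim,)
    cosine = matrix @ q / (np.linalg.norm(matrix, axis=1) * np.linalg.norm(q))
    scores = [0.7 * c + 0.3 * keyword_score(query, ch)  # illustrative weights
              for c, ch in zip(cosine, chunks)]
    ranked = sorted(zip(scores, chunks), reverse=True)
    return [chunk for _, chunk in ranked[:top_k]]
```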
In the demo, the assistant is asked: “I need a developer to create our next mobile app—who is the best candidate?” The hybrid retrieval surfaces candidates whose profiles mention mobile-app development and measurable outcomes. The response names Fiona Davis and Wendy Clark, citing evidence such as a mobile app with over 100k downloads for Davis and product-development experience for Clark, while also distinguishing “intermediate” versus “advanced/leadership” experience based on the retrieved achievements.
A second query tests a different domain: “We have cyber security challenges and need a team of two people—who are the best candidates?” The system again retrieves relevant profiles and returns Ian Clark and Alice Cooper, supported by details such as expert-led red teaming and high-profile cyber security exercises. The results suggest the approach works well for small datasets and targeted questions, though scaling to hundreds of entries wasn’t fully tested.
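Both demo queries follow the same retrieve-then-generate pattern. A sketch of the generation step, assuming Ollama's local `/api/generate` endpoint with Mistral and an illustrative prompt template (`hybrid_search` is the retrieval sketch above):

```python
# Generation sketch: pass the hybrid-search hits to a local Mistral model
# via Ollama and return a natural-language, evidence-backed answer.
# The prompt wording here is an assumption, not the video's prompt.
import requests

def answer(query: str) -> str:
    context = "\n".join(hybrid_search(query))
    prompt = (
        "Answer the question using only the employee records below.\n\n"
        f"Records:\n{context}\n\nQuestion: {query}\nAnswer:"
    )
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "mistral", "prompt": prompt, "stream": False},
    )
    resp.raise_for_status()
    return resp.json()["response"]

print(answer("I need a developer to create our next mobile app - who is the best candidate?"))
```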
Overall, the experiment frames natural-language SQL access as a practical RAG pipeline: structured data becomes embeddings, retrieval becomes hybrid (semantic + keyword), and generation becomes a local LLM response. The remaining unknown is performance and quality at scale, but the initial tests indicate the concept can produce useful, evidence-backed answers without writing SQL.
Cornell Notes
The experiment demonstrates a way to query a PostgreSQL employee database using natural language instead of SQL. Records are chunked, embedded locally with an Ollama embeddings model, and saved to a vector-store file. A RAG pipeline then retrieves relevant records using hybrid search—combining semantic similarity from embeddings with keyword matching—before sending the retrieved context to a local Mistral model. The LLM returns natural-language answers that cite the retrieved employee achievements and specialties. The approach appears to work well on a small synthetic dataset (~30 profiles), but scaling to larger databases wasn’t thoroughly tested.
- How does structured SQL data turn into something an LLM can query in natural language?
- What role does hybrid search play compared with using embeddings alone?
- What does the system return when asked for a “best candidate” for a task?
- How does the approach handle a different type of question, like building a team for cyber security?
- What limitations were acknowledged in the experiment?
Review Questions
- If you removed keyword search and used only embeddings, what kinds of retrieval failures might you expect for domain-specific queries?
- Why is chunking important when converting SQL rows into embeddings for RAG?
- What additional experiments would you run to validate performance when the database grows from ~30 profiles to hundreds or thousands?
Key Points
1. Convert SQL rows into chunked text, then generate and store embeddings locally to enable semantic retrieval.
2. Use a RAG pipeline so natural-language questions trigger retrieval of relevant database context before generation.
3. Hybrid search improves retrieval by combining semantic similarity with keyword matching.
4. Local model execution is part of the design: embeddings come from an Ollama embeddings model and generation uses Mistral.
5. The demo’s answers are evidence-based, citing achievements and specialties pulled from retrieved profiles.
6. Initial results look promising on a small synthetic dataset, but scaling to larger databases remains an open question.