
A Natural Language AI (LLM) SQL Database - Could this work?

All About AI · 4 min read

Based on All About AI's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

Convert SQL rows into chunked text, then generate and store embeddings locally to enable semantic retrieval.

Briefing

Turning a SQL database into a natural-language hiring and skills assistant is feasible by combining local embeddings, RAG, and hybrid search. The core setup takes structured employee records—names, ages, professions, specialties, and achievements—then chunks that data, generates embeddings locally, and stores them for retrieval. Instead of writing SQL queries, a user can ask questions in plain language, and the system retrieves relevant rows using both semantic similarity (embeddings) and keyword matching, then feeds the retrieved context to a local LLM to produce a natural-language answer.

The workflow starts with a PostgreSQL database hosted on a platform referred to as Hoku. The creator populates it with synthetic JSON profiles (about 30 employee-like entries), then verifies the data inserted successfully by reading it back from the database. Next comes the “SQL to embeddings” step: a Python script pulls the SQL records, chunks them, and uses an embeddings model run locally through Ollama to generate vector representations. Those embeddings are saved to a file (vault_embeddings.json), along with an intermediate text/chunk file used for embedding. A separate “SQL RAG” script then performs retrieval and generation.
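
The “SQL to embeddings” step can be sketched roughly as follows. This is an illustrative reconstruction, not the creator’s actual script: sqlite3 stands in for the hosted PostgreSQL database so the example is self-contained, a toy character-frequency function stands in for the Ollama embeddings call, and the column names and output file name are assumptions.

```python
import json
import sqlite3

# Stand-in for the hosted PostgreSQL database; an in-memory sqlite3
# table keeps the sketch self-contained. Columns are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, profession TEXT, achievement TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Fiona Davis", "Mobile Developer", "Shipped an app with 100k+ downloads"),
        ("Ian Clark", "Security Engineer", "Led red-team exercises"),
    ],
)

def row_to_chunk(row):
    """Flatten one SQL row into a single text chunk for embedding."""
    name, profession, achievement = row
    return f"{name} | {profession} | {achievement}"

def embed(text):
    """Toy embedding: a 26-dim character-frequency vector. A real
    pipeline would call a local embeddings model (e.g. via Ollama)."""
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec

# Pull rows, chunk them, embed each chunk, and persist the "vault".
chunks = [row_to_chunk(r) for r in conn.execute("SELECT * FROM employees")]
vault = [{"chunk": c, "embedding": embed(c)} for c in chunks]

with open("vault_embeddings.json", "w") as f:
    json.dump(vault, f)
```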

A key design choice is hybrid search. Retrieval doesn’t rely purely on embedding similarity; it also incorporates keyword search so exact terms and semantic meaning both influence which records are selected as context. That retrieved context is then passed to a local LLM—configured to run Mistral—so the system can answer in natural language rather than returning structured SQL output.
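
The retrieve-then-generate flow might look like the minimal sketch below, under stated assumptions: retrieval here is keyword-only for brevity (the real system also ranks by embedding similarity), and the call to the local Mistral model is stubbed out as prompt assembly, since the video’s exact scripts aren’t shown.

```python
def keyword_score(query, chunk):
    """Fraction of query words that appear verbatim in the chunk."""
    words = query.lower().split()
    return sum(w in chunk.lower() for w in words) / len(words)

def retrieve(query, chunks, top_k=2):
    """Keyword-only retrieval for this sketch; a hybrid system would
    also score each chunk by embedding similarity."""
    ranked = sorted(chunks, key=lambda c: keyword_score(query, c), reverse=True)
    return ranked[:top_k]

def build_prompt(query, context_chunks):
    """Assemble the prompt a local LLM (Mistral in the video) would
    answer; a real setup would send this to the model, e.g. via Ollama."""
    context = "\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

chunks = [
    "Fiona Davis | Mobile Developer | app with 100k+ downloads",
    "Ian Clark | Security Engineer | led red-team exercises",
]
prompt = build_prompt(
    "Who should build our mobile app?",
    retrieve("mobile app developer", chunks),
)
```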

In the demo, the assistant is asked: “I need a developer to create our next mobile app—who is the best candidate?” The hybrid retrieval surfaces candidates whose profiles mention mobile-app development and measurable outcomes. The response names Fiona Davis and Wendy Clark, citing evidence such as a mobile app with over 100k downloads for Davis and product-development experience for Clark, while also distinguishing “intermediate” versus “advanced/leadership” experience based on the retrieved achievements.

A second query tests a different domain: “We have cyber security challenges and need a team of two people—who are the best candidates?” The system again retrieves relevant profiles and returns Ian Clark and Alice Cooper, supported by details like expert-level red-teaming leadership and high-profile cyber security exercises. The results suggest the approach works well for small datasets and targeted questions, though scaling to hundreds of entries wasn’t fully tested.

Overall, the experiment frames natural-language SQL access as a practical RAG pipeline: structured data becomes embeddings, retrieval becomes hybrid (semantic + keyword), and generation becomes a local LLM response. The remaining unknown is performance and quality at scale, but the initial tests indicate the concept can produce useful, evidence-backed answers without writing SQL.

Cornell Notes

The experiment demonstrates a way to query a PostgreSQL employee database using natural language instead of SQL. Records are chunked, embedded locally with an Ollama-served embeddings model, and saved to a vector store file. A RAG pipeline then retrieves relevant records using hybrid search—combining semantic similarity from embeddings with keyword matching—before sending the retrieved context to a local Mistral model. The LLM returns natural-language answers that cite the retrieved employee achievements and specialties. The approach appears to work well on a small synthetic dataset (~30 profiles), but scaling to larger databases wasn’t thoroughly tested.

How does structured SQL data turn into something an LLM can query in natural language?

The pipeline pulls rows from a PostgreSQL database, chunks the text representation of each profile, generates embeddings locally using an Ollama-served embeddings model, and saves those vectors to vault_embeddings.json. A separate text/chunks file is written before embedding, then the RAG step uses those embeddings to retrieve relevant chunks based on a user’s question.
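
The retrieval side of that step can be illustrated with cosine similarity over the stored vectors. The vectors below are toy three-dimensional values for demonstration; real ones would come from the local embeddings model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# Toy "vault" of chunk/embedding pairs, as loaded from the JSON file.
vault = [
    {"chunk": "mobile app developer, 100k downloads", "embedding": [0.9, 0.1, 0.0]},
    {"chunk": "red-team security lead", "embedding": [0.1, 0.9, 0.2]},
]

# In the real pipeline this vector would come from embedding the question.
query_embedding = [0.8, 0.2, 0.1]

ranked = sorted(vault, key=lambda e: cosine(query_embedding, e["embedding"]), reverse=True)
top_chunk = ranked[0]["chunk"]
```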

What role does hybrid search play compared with using embeddings alone?

Hybrid search combines semantic search (embedding similarity) with keyword search. That means retrieval can match both meaning (e.g., “mobile app development”) and exact or near-exact terms (e.g., domain-specific keywords). In the demo, this helps pull context that contains both the right skills and the right supporting achievements.
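
One common way to combine the two signals is a weighted sum of a semantic score and a keyword score. The 0.5/0.5 weighting below is an assumption for illustration; the video doesn’t specify how its hybrid search blends the two.

```python
def hybrid_score(semantic, keyword, alpha=0.5):
    """Blend a semantic score (e.g. embedding cosine similarity) with a
    keyword-overlap score. alpha is an assumed weighting, not taken
    from the video."""
    return alpha * semantic + (1 - alpha) * keyword

# A chunk with an exact keyword hit (0.70 semantic, 1.00 keyword) can
# outrank one that is only semantically close (0.80 semantic, 0.20 keyword):
assert hybrid_score(0.70, 1.00) > hybrid_score(0.80, 0.20)
```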

What does the system return when asked for a “best candidate” for a task?

It returns a natural-language recommendation rather than SQL rows. For the mobile app query, it names Fiona Davis and Wendy Clark and justifies the choice using retrieved profile details—such as Davis’s mobile app with over 100k downloads and Clark’s product-development experience—then differentiates experience levels based on the retrieved context.

How does the approach handle a different type of question, like building a team for cyber security?

The same retrieval-and-generation flow applies. For the cyber security team-of-two question, the system retrieves profiles matching cyber security expertise and returns Ian Clark and Alice Cooper, with supporting evidence from their achievements (e.g., leading red-team engagements and high-profile cyber security exercises).

What limitations were acknowledged in the experiment?

The creator notes that the database could eventually contain hundreds of entries, but scaling wasn’t tested. The results are described as working “pretty good” for initial tests, with the main uncertainty being performance and answer quality as the dataset grows.

Review Questions

  1. If you removed keyword search and used only embeddings, what kinds of retrieval failures might you expect for domain-specific queries?
  2. Why is chunking important when converting SQL rows into embeddings for RAG?
  3. What additional experiments would you run to validate performance when the database grows from ~30 profiles to hundreds or thousands?

Key Points

  1. Convert SQL rows into chunked text, then generate and store embeddings locally to enable semantic retrieval.

  2. Use a RAG pipeline so natural-language questions trigger retrieval of relevant database context before generation.

  3. Hybrid search improves retrieval by combining semantic similarity with keyword matching.

  4. Local model execution is part of the design: embeddings come from an Ollama-served embeddings model and generation uses Mistral.

  5. The demo’s answers are evidence-based, citing achievements and specialties pulled from retrieved profiles.

  6. Initial results look promising on a small synthetic dataset, but scaling to larger databases remains an open question.

Highlights

A PostgreSQL employee database can be queried in plain language by embedding its contents and using RAG with hybrid retrieval.
Hybrid search (semantic + keyword) helps the system pull context that contains both the right skills and the right supporting details.
Natural-language outputs can recommend candidates (e.g., for mobile apps or cyber security teams) while grounding responses in retrieved achievements.

Topics

Mentioned

  • LLM
  • RAG
  • Ollama