FREE Phone Calls with Claude Code
Based on NetworkChuck's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
SIP signaling connects a phone system to Claude Code, while a separate media pipeline handles audio capture, transcription, and speech synthesis.
Briefing
A hobby VoIP setup can be wired into Claude Code so phone calls—down to an analog payphone—can trigger AI workflows, keep conversational context, and even run business actions like creating ClickUp tasks or generating Slack messages. The core breakthrough is treating SIP (the call-signaling protocol behind most VoIP systems) as the bridge between a phone system and Claude Code, then adding a separate “media” layer for audio and speech-to-text/text-to-speech.
The build starts with 3CX’s AI receptionist and transcription features, which the creator sets up in about 10 seconds by adding an API key and configuring an AI agent (“Dolores Umbridge”) with call-handling context. The receptionist can handle both normal requests and edge cases like abusive or frustrated callers, and it supports transcription via OpenAI, Google, or local 3CX options. That quick success sparks the bigger idea: if a phone system can talk to Claude Code, then Claude Code becomes reachable from any phone, without requiring the caller to understand anything about APIs.
To make calls work end-to-end, the project leans on SIP for signaling and a separate media server for the actual audio stream. The creator initially considers a commercial all-in-one SIP/media solution (jambonz), but balks at the cost of about $1,000 per month for a single node. Instead, the setup uses free, open-source components: FreeSWITCH as the SIP stack and an additional media/processing path built around voice activity detection, Whisper for speech-to-text, and ElevenLabs for text-to-speech. A wrapper server on a Mac handles the handoff: it detects when the caller has finished speaking, transcribes the audio, sends the text to Claude Code, converts Claude Code’s text reply to speech, and plays that audio back through the VoIP pipeline.
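The handoff described above can be sketched as a single loop. This is a minimal illustration, not the creator's code: the energy-threshold voice activity detector is a simple stand-in for whatever detector the project uses, and `transcribe`, `ask_agent`, and `synthesize` are hypothetical callables standing in for Whisper, Claude Code, and ElevenLabs. Audio is assumed to arrive as fixed-size frames of 16-bit signed mono PCM in the machine's native byte order.

```python
from array import array
from typing import Callable, Iterable, List


def frame_energy(frame: bytes) -> float:
    """Mean absolute amplitude of one 16-bit signed mono PCM frame."""
    samples = array("h")
    samples.frombytes(frame)
    if not samples:
        return 0.0
    return sum(abs(s) for s in samples) / len(samples)


def split_utterances(
    frames: Iterable[bytes],
    threshold: float = 500.0,   # assumed loudness cutoff, tune per setup
    hangover_frames: int = 10,  # assumed silence length that ends a turn
) -> List[bytes]:
    """Group frames into utterances using a simple energy threshold."""
    utterances: List[bytes] = []
    current: List[bytes] = []
    silence = 0
    for frame in frames:
        if frame_energy(frame) >= threshold:
            current.append(frame)
            silence = 0
        elif current:
            # Quiet frame inside an utterance: keep it until we know
            # whether the speaker has really stopped.
            current.append(frame)
            silence += 1
            if silence >= hangover_frames:
                utterances.append(b"".join(current[: len(current) - silence]))
                current, silence = [], 0
    if current:
        utterances.append(b"".join(current[: len(current) - silence]))
    return utterances


def handle_call(
    frames: Iterable[bytes],
    transcribe: Callable[[bytes], str],   # hypothetical: wraps Whisper
    ask_agent: Callable[[str], str],      # hypothetical: wraps Claude Code
    synthesize: Callable[[str], bytes],   # hypothetical: wraps ElevenLabs
) -> List[bytes]:
    """Run detect -> transcribe -> ask -> speak for each caller utterance."""
    replies: List[bytes] = []
    for utterance in split_utterances(frames):
        text = transcribe(utterance)
        if not text.strip():
            continue  # nothing intelligible was said
        replies.append(synthesize(ask_agent(text)))
    return replies
```

Keeping the three services behind plain callables means the loop can be exercised with stubs before any API keys or telephony are involved. A real deployment would stream audio and play each reply as soon as it is ready rather than collecting replies in a list.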
A key constraint is that 3CX’s free tier doesn’t allow custom SIP trunks. The workaround is to register Claude Code as if it were a phone endpoint inside the 3CX system, so it appears as an extension with a ready status and can receive calls directly. From there, the creator defines “call skills” that let Claude Code perform actions during a live call, while an agent persona named Morpheus acts as an executive assistant with access to those Claude Code skills.
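Registering as an extension means the bridge sends the PBX a SIP REGISTER request, just as a desk phone would. The sketch below only builds the text of such a request in the general shape defined by RFC 3261; the extension number, domain, and IP address are hypothetical, and it is not taken from the project. A real registrar normally answers the first attempt with a 401 challenge, so a working client must resend the request with digest credentials, which is one reason to rely on an existing SIP stack rather than hand-built messages.

```python
import uuid


def build_register(
    extension: str,
    domain: str,
    local_ip: str,
    local_port: int = 5060,
    expires: int = 3600,
    cseq: int = 1,
) -> str:
    """Build an unauthenticated SIP REGISTER request as a string."""
    # Branch values must start with the RFC 3261 magic cookie "z9hG4bK".
    branch = "z9hG4bK" + uuid.uuid4().hex
    tag = uuid.uuid4().hex[:8]
    call_id = uuid.uuid4().hex
    address_of_record = f"sip:{extension}@{domain}"
    lines = [
        f"REGISTER sip:{domain} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_ip}:{local_port};branch={branch}",
        "Max-Forwards: 70",
        f"From: <{address_of_record}>;tag={tag}",
        f"To: <{address_of_record}>",
        f"Call-ID: {call_id}",
        f"CSeq: {cseq} REGISTER",
        # Contact tells the PBX where to deliver calls for this extension.
        f"Contact: <sip:{extension}@{local_ip}:{local_port}>",
        f"Expires: {expires}",
        "Content-Length: 0",
    ]
    # SIP uses CRLF line endings and a blank line to end the headers.
    return "\r\n".join(lines) + "\r\n\r\n"


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    print(build_register("1001", "pbx.example.com", "192.0.2.10"))
```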
The practical payoff shows up in demos: Morpheus can create ClickUp tasks and send Slack messages with the task link, while maintaining context across the same call session. It also supports “fire and forget” workflows—calling Morpheus, triggering a job (like generating hyper-realistic thumbnails), then hanging up while results arrive later via Slack.
Finally, the project moves from entertainment to operations with an n8n workflow that checks storage cluster health (e.g., SSD pool capacity thresholds). When conditions are met, an HTTP request triggers the Claude Code-backed phone agent (“Stephanie”) to call the creator, ask for details, and then send a Slack update using its Slack skill. The result is a proof-of-concept for AI-driven phone-based monitoring and response: an interface that can reach the creator even when they’re away from the dashboard.
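The alerting logic amounts to a threshold check followed by an HTTP POST. In the video this lives in an n8n workflow; the Python sketch below only mirrors the idea. The 80% threshold, the endpoint URL, and the payload fields (`extension`, `message`) are all assumptions made for illustration, not the project's actual API.

```python
import json
from urllib import request


def pool_needs_attention(used_bytes: int, total_bytes: int,
                         threshold: float = 0.80) -> bool:
    """Return True when pool usage reaches the (assumed) alert threshold."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return used_bytes / total_bytes >= threshold


def build_call_request(url: str, extension: str, message: str) -> request.Request:
    """Prepare (but do not send) the POST that asks the agent to place a call."""
    body = json.dumps({"extension": extension, "message": message}).encode("utf-8")
    return request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    used, total = 850, 1000  # made-up numbers for the example
    if pool_needs_attention(used, total):
        req = build_call_request(
            "http://localhost:3000/call",  # hypothetical endpoint
            "1001",                        # hypothetical extension to ring
            f"SSD pool is at {used / total:.0%} capacity.",
        )
        # request.urlopen(req) would send it; left out so the sketch
        # runs without a server.
        print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```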
The creator frames it as a janky but free proof of concept, with separate installation documentation and a follow-up video promised. The bigger message is that once SIP signaling and Claude Code are connected, phone access becomes a control surface for AI, capable of running real workflows and not just answering questions.
Cornell Notes
The project connects a phone system to Claude Code by using SIP for call signaling and a separate audio pipeline for speech. After setting up 3CX’s AI receptionist and transcription features, the creator builds a bridge where calls to a registered Claude Code extension can trigger Claude Code skills and keep conversational context. A wrapper server performs voice activity detection, transcribes speech with Whisper, and speaks Claude Code’s replies using ElevenLabs text-to-speech. The system then runs real actions, such as creating ClickUp tasks, sending Slack messages, and monitoring storage cluster health via n8n, and calls or messages the user when thresholds are crossed. It matters because it turns “phone calls” into an interface for AI workflows, usable even from places with no internet access (e.g., payphones).
How does SIP fit into the Claude Code phone-call bridge?
Why was a commercial all-in-one SIP/media solution avoided, and what replaced it?
What does the wrapper server do during a live call?
How does the setup work without custom SIP trunks on 3CX’s free tier?
What kinds of real tasks can the AI perform during or after a call?
How is monitoring and alerting implemented using phone-based AI?
Review Questions
- What roles do SIP signaling and the media pipeline play in making an AI phone agent work?
- Why does registering Claude Code as an extension matter for compatibility with 3CX’s free tier?
- Describe one demo workflow (ClickUp/Slack or storage monitoring) and how the system delivers the result to the user.
Key Points
1. SIP signaling connects a phone system to Claude Code, while a separate media pipeline handles audio capture, transcription, and speech synthesis.
2. 3CX’s AI receptionist and transcription features are used as a starting point, but the main leap is turning phone calls into Claude Code-triggered workflows.
3. FreeSWITCH is used as the SIP stack, avoiding a costly all-in-one SIP/media option (jambonz) by relying on open-source components.
4. A local wrapper server performs voice activity detection, Whisper speech-to-text, and ElevenLabs text-to-speech to complete the call loop.
5. Because the 3CX free tier blocks custom SIP trunks, Claude Code is registered as an extension so it can receive calls directly.
6. Claude Code “call skills” let the system run real actions like creating ClickUp tasks and sending Slack messages, with context preserved across a call.
7. n8n can monitor infrastructure thresholds and trigger phone-based AI troubleshooting and follow-up notifications via Slack.