How I cloned myself using PyVoIP and ElevenLabs
… and taught my grandmas a lesson!
One day, while clicking through my internet router's interface, I noticed that it supports connecting internet-based phone systems (IP PBX). I also noticed that I have three landline phone numbers, but I was using none of them. What a waste! So naturally, I looked daringly at my unused Raspberry Pis and decided to have some fun!
So what kind of fun are we having?
Initially, I wanted to create a personal assistant for myself, which I can call and, say, ask to put something on my TODO list, or ask whether I am free on a given date, or to do some home automation thing. But I figured I’d never really use it.
So I pivoted to creating a clone of myself - it talks like me, knows the basics about my life, and gives cheerful answers to anything you say to it. And if I call myself, it can detect the self-referential silliness of it all, which makes for some really weird conversations. I also configured it to swoon about my girlfriend whenever it is given the chance - just like real me!
Why do this?
- It taught my grandmas a lesson: if my voice gives them a call to ask for money, that doesn’t necessarily mean it’s me. (Note: I didn’t actually give them a grandparent scam call, I just showed them what is possible. No grandmas were harmed during this project.)
- It allowed me to convince my mom and some close friends to give “me” a call on my landline phone number, and for me to listen in to hear the penny drop. Most often, this realization was followed by laughter, disbelief, or interesting queries to the clone, e.g. trying to make it spill some tea about me.
- When I can’t decide what to eat, wear, or blog about, I can now just call myself!
- I wanted to confirm that it is indeed possible to use a Raspberry Pi as an overpowered phone connecting to my router - this might come in handy for future projects.
Why write this blogpost?
The project itself is nothing fancy, but it did require some tinkering with PyVoIP to get it to work correctly with my Telekom Speedport router. That is my main reason for sharing: I hope to spare others some hours of pain.
Used puzzle pieces
Before we can wire things together, we need to know what we’re dealing with.
PyVoIP
PyVoIP is an open-source VoIP/SIP/RTP library. Basically, it runs on my Raspberry Pi and allows it to register itself as a “phone” with the router, then answer incoming calls and receive/send audio streams from/to the caller.
ElevenLabs
ElevenLabs handles the AI stuff - specifically, its conversational AI platform. You can clone a voice, and then use it to create an Agent with a system prompt - this is where I fed in some basic personal information. You can then use a websocket connection to tunnel audio streams to and from the platform.
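To make the "tunnel audio over a websocket" part a bit more concrete, here is a minimal sketch of how audio chunks might be wrapped into and unwrapped from JSON messages. The field names (user_audio_chunk, audio_event, audio_base_64) are assumptions based on the ElevenLabs conversational AI websocket documentation and should be checked against the current docs; the helper names are made up for illustration, and no network connection is opened here.

```python
import base64
import json
from typing import Optional


def encode_user_audio(pcm_16khz: bytes) -> str:
    """Wrap raw caller audio in a JSON message (field name is an assumption)."""
    encoded = base64.b64encode(pcm_16khz).decode("ascii")
    return json.dumps({"user_audio_chunk": encoded})


def decode_agent_audio(message: str) -> Optional[bytes]:
    """Return raw agent audio from an incoming message, or None for other events."""
    event = json.loads(message)
    if event.get("type") != "audio":
        # Pings, transcripts, and other events carry no audio payload.
        return None
    # Nested field names are assumptions as well.
    return base64.b64decode(event["audio_event"]["audio_base_64"])
```

Each message produced by encode_user_audio would be sent as a text frame over the websocket, and every received frame would be passed through decode_agent_audio before being handed to the phone side.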
Telegram bot for notifications
My go-to for notifications in all my personal projects - you create a bot by chatting with @BotFather, get your newborn bot’s token, and to send a message, you just make HTTP requests to the Bot API URL - bish bash bosh, done. In this project, I send transcripts and audio recordings of calls as soon as they finish.
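Sending a text notification can be sketched with just the standard library. The sendMessage method belongs to the Telegram Bot API; the helper names, the token, and the chat ID below are placeholders for illustration, and sending audio recordings would need a different method and a file upload, which is left out here.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.telegram.org"


def build_send_message_request(token: str, chat_id: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a sendMessage request for the Bot API."""
    url = f"{API_BASE}/bot{token}/sendMessage"
    payload = urllib.parse.urlencode({"chat_id": chat_id, "text": text}).encode("utf-8")
    return urllib.request.Request(url, data=payload, method="POST")


def send_message(token: str, chat_id: str, text: str, timeout: float = 10.0) -> dict:
    """Send the message and return the decoded JSON response."""
    request = build_send_message_request(token, chat_id, text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call like send_message(token, chat_id, "Call finished: ...") at the end of each call would then be enough for transcript notifications.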
How to get PyVoIP to work with a Telekom router
This is the ‘tinkering’ part I mentioned above. I can only test this using my Speedport Smart 4 router, but I figure issues with similar routers might be solved by the same steps.
Registration issues
Using the PyVoIP repo on the main branch, trying to register a phone with the router was met with a 423 Interval Too Brief error response from the router. Well, actually, it failed silently, and only after intercepting and inspecting the packet stream was I able to see this. Anyways..
This error indicates that the registration expires interval parameter sent with the registration request is too low - the router requires at least 300 seconds, while PyVoIP sends 120. This value cannot be passed as a parameter to the VoIPPhone constructor; instead, you have to set self.phone.sip.default_expires = 300 after instantiating but before .start()ing the phone.
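As a rough sketch of that fix: a 423 response carries a Min-Expires header naming the smallest interval the registrar accepts, so it can be read from a captured response and applied to the phone object. The captured text below is an illustrative example rather than an actual capture from the router, the helper names are made up, and a SimpleNamespace stands in for the real VoIPPhone instance so the snippet runs without a router.

```python
import re
from types import SimpleNamespace
from typing import Optional


def min_expires_from_response(raw_response: str) -> Optional[int]:
    """Return the Min-Expires value of a SIP 423 response, or None."""
    lines = raw_response.splitlines()
    if not lines or not re.match(r"SIP/2\.0\s+423\b", lines[0]):
        return None
    for line in lines[1:]:
        if not line.strip():
            break  # blank line marks the end of the headers
        name, _, value = line.partition(":")
        if name.strip().lower() == "min-expires":
            return int(value.strip())
    return None


def raise_registration_expiry(phone, expires: int = 300) -> None:
    """Set the interval on the phone's SIP client; call this before phone.start()."""
    phone.sip.default_expires = expires


# Illustrative response text, not a real capture.
captured = "SIP/2.0 423 Interval Too Brief\r\nMin-Expires: 300\r\nContent-Length: 0\r\n\r\n"
# Stand-in for a VoIPPhone instance with the library default of 120.
phone = SimpleNamespace(sip=SimpleNamespace(default_expires=120))

required = min_expires_from_response(captured)
if required is not None and required > phone.sip.default_expires:
    raise_registration_expiry(phone, required)
```

With a real phone object, the only line that matters is the assignment to phone.sip.default_expires; the parsing part is just a way to find out which value a given router expects.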
Answering calls
While this change made registration possible, I couldn’t get the library to actually handle the ‘incoming call’ events and invoke my callback. On the development branch, a future version 2.0 of PyVoIP is being developed, but that didn’t help either, and the development also seemed to have halted halfway through, which isn’t very reassuring.
So I went and checked the forks on GitHub and indeed found a promising one: BRIDGE-AI/pyVoIP (remember to switch to the development branch there too!). With this fork, things worked out as expected. The phone can now correctly register, and accept incoming calls, receiving an audio stream and sending one back!
How to properly connect PyVoIP with ElevenLabs
So now I called my landline, the Pi picked up the call and then… nothing. Dead silence, and the recording on the ElevenLabs platform just contained a bunch of really loud noise. Clearly, something about the audio format was wrong. In the settings of the Agent, you can actually specify input and output formats - both were set to ‘PCM 16kHz’. After some digging, I found that what I got from my calls was actually 8-bit linear PCM at 8kHz. I vibecoded two functions to transform the audio formats in both directions:
    import logging

    import numpy as np
    from scipy import signal

    logger = logging.getLogger(__name__)


    def telephony_to_elevenlabs(audio_data: bytes) -> bytes:
        """
        Convert 8kHz PCM audio from PyVoIP read_audio() to 16kHz PCM for ElevenLabs.
        """
        try:
            if not audio_data:
                return b""
            audio_array = np.frombuffer(audio_data, dtype=np.uint8)
            if len(audio_array) == 0:
                return b""
            # Convert from unsigned 8-bit to signed 16-bit
            # uint8 range [0, 255] -> int16 values in [-32768, 32512]
            audio_16bit = (audio_array.astype(np.int32) - 128) * 256
            audio_16bit = np.clip(audio_16bit, -32768, 32767).astype(np.int16)
            # Upsample from 8kHz to 16kHz using proper resampling
            target_length = len(audio_16bit) * 2  # 8kHz -> 16kHz = 2x
            upsampled = signal.resample(audio_16bit, target_length)
            # Ensure we stay within int16 range and convert back
            upsampled = np.clip(upsampled, -32768, 32767)
            return upsampled.astype(np.int16).tobytes()
        except Exception as e:
            logger.error(f"Error converting telephony audio to ElevenLabs: {e}")
            return b""

and

    def elevenlabs_to_telephony(audio_data: bytes) -> bytes:
        """
        Convert 16kHz PCM ElevenLabs audio to 8kHz PCM for PyVoIP write_audio()
        """
        try:
            if not audio_data:
                return b""
            # Convert bytes to numpy array (16-bit)
            if len(audio_data) % 2 != 0:
                # Ensure even number of bytes for 16-bit samples
                audio_data = audio_data[:-1]
            audio_array = np.frombuffer(audio_data, dtype=np.int16)
            if len(audio_array) == 0:
                return b""
            # Downsample from 16kHz to 8kHz
            target_length = len(audio_array) // 2  # 16kHz -> 8kHz = 1/2x
            downsampled = signal.resample(audio_array, target_length)
            # Apply bad quality filter if enabled
            # (audio_config and AudioConverter live elsewhere in the project)
            if audio_config.add_phone_effect:
                downsampled = AudioConverter._apply_telephone_filter(downsampled)
            # Ensure we stay within int16 range first
            downsampled = np.clip(downsampled, -32768, 32767).astype(np.int16)
            # Convert from signed 16-bit to unsigned 8-bit for PyVoIP
            # int16 range [-32768, 32767] -> uint8 range [0, 255]
            audio_8bit = (downsampled.astype(np.int32) // 256) + 128
            audio_8bit = np.clip(audio_8bit, 0, 255).astype(np.uint8)
            # Return 8-bit PCM
            return audio_8bit.tobytes()
        except Exception as e:
            logger.error(f"Error converting ElevenLabs audio to telephony: {e}")
            return b""
where, as you can see, I also added an optional ‘low phone call quality’ filter for that extra bit of realism. This is done by applying a Butterworth bandpass filter.
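For reference, a minimal sketch of what such a bandpass filter could look like with SciPy. The function name telephone_bandpass, the 300-3400 Hz passband, and the filter order are assumptions chosen for illustration (a classic narrowband telephone range), not the values used in the helper referenced above.

```python
import numpy as np
from scipy import signal


def telephone_bandpass(
    samples: np.ndarray,
    sample_rate: int = 8000,
    low_hz: float = 300.0,
    high_hz: float = 3400.0,
    order: int = 4,
) -> np.ndarray:
    """Keep only the assumed telephone band of a mono signal."""
    # Second-order sections are numerically more robust than (b, a) coefficients.
    sos = signal.butter(
        order, [low_hz, high_hz], btype="bandpass", fs=sample_rate, output="sos"
    )
    return signal.sosfilt(sos, samples)
```

Applied to the downsampled 8kHz samples before they are clipped and converted to 8-bit, this strips the deep lows and the highest frequencies, which is what gives the voice that slightly thin landline sound.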
How well does it work?
For obvious reasons, I’m not posting the number to call here. But I must say, the cloned voice is pretty convincing! During voice cloning, I went with only ~10 minutes of input audio recording from my podcasting microphone, and the tonality of my voice was matched almost to a tee. However, the rhythm of generated speech is a bit off, so even if you didn’t know my real voice, you could tell it was AI.
Latency is also really good: answers come almost instantly, and ElevenLabs has some knobs that let you find a sweet spot between latency, text quality, and voice quality. I highly recommend the service for similar projects!
That’s all!
I hope you go and do something fun, too :-)