Tech Scroll 107 — Listening for the Unthinkable: A Home Safety Net for Voice Chats

Purpose: empower parents to build a consensual, local, no‑subscription safety net that listens for high‑risk phrases in online voice chats, nudges the child, and discreetly alerts a parent. Built from free software, under the family’s roof.

The Heart of It

Before diving into how this works, it must be said plainly: Breath Technology will never make money from what should be free, nor from a parent’s fear. We do not want to see any child hurt; the world has already seen more than enough hurt and pain. This scroll exists so that, if you feel the need, you can sit down with the minor, ask them what they think, and, if it is something the house agrees on, build from the steps below. The setup is not hard, and while we will never sell it, we will answer questions and fold in suggestions so this article becomes more complete over time. Until then, read on, keep steady, and do not be alarmed. The world will sell fear; here we give strength, truth, and in this case, we give it freely.

Children meet the world through games, voice chat, and social apps. Most of it is ordinary play. Sometimes, it isn’t. This scroll sets out a practical path: a home‑run system that transcribes live audio, looks for grooming‑like patterns, and responds with gentle, staged interventions.

Non‑negotiables

  1. The minor knows the system exists.
  2. The minor agrees to it while still a minor.
  3. The parent explains clearly: this is a shield, not a searchlight. Trust first, technology second.

Outcome

  • Friendly on‑screen nudges for early risk.
  • Stronger warnings for serious risk.
  • Quiet alerts to a parent device (TV via DLNA or a phone/PC on the LAN).

Human Policy — the rules engine a parent can read

# policy.yaml — human‑readable rules
retention_days: 14

phrases:
  info_probe:                # early curiosity — prompt the child kindly
    - "how old are you"
    - "what school"
    - "which city do you live"
    - "can i see your photo"
    - "what are you wearing now"

  contact_move:              # attempts to move to private channels
    - "add me on discord"
    - "snap me"
    - "what is your number"

  meet_up:                   # attempts to meet or share location
    - "let's meet"
    - "send your location"
    - "i'll be at the park"

  the_unthinkable:           # covert pressure, secrecy, explicit requests
    - "don't tell your parents"
    - "it's a secret"
    - "are you alone send a photo"
    - "send pics just for me"
    - "no one needs to know"

# Multi‑level responses
thresholds:
  notify_child:  ["info_probe"]                 # soft on‑screen nudge
  warn_child:    ["the_unthinkable"]            # stronger on‑screen warning
  alert_parent:  ["contact_move","meet_up","the_unthinkable"]  # discreet parent alert on LAN

# On‑screen text variants (shown to the child)
child_nudge: "Hey friend—take care with this chat. Maybe check with Mum or Dad."
child_big_warning: "This conversation may be unsafe. Please pause and speak with Mum or Dad now."
The YAML above is both the policy and the app config. Parents can open it with vim, adjust phrasing, and the detector picks changes up on restart. These rules are not fixed — parents are encouraged to adapt, add, or remove phrases and responses as they see fit for their household, reviewing them with the child so that consent and understanding remain part of the process.
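Because hand-edited YAML is easy to break, a quick sanity check before restarting the detector can save confusion. A minimal sketch (the file name policy.yaml matches the config above; the checker itself is our suggestion, not part of the detector):

# check_policy.py — sanity-check policy.yaml after hand edits
import sys, yaml

REQUIRED = ['retention_days', 'phrases', 'thresholds', 'child_nudge', 'child_big_warning']

policy = yaml.safe_load(open('policy.yaml'))

missing = [k for k in REQUIRED if k not in policy]
if missing:
    sys.exit(f'policy.yaml is missing keys: {missing}')

# every tag a threshold references must exist under phrases
tags = set(policy['phrases'])
for level, wanted in policy['thresholds'].items():
    unknown = set(wanted) - tags
    if unknown:
        sys.exit(f'{level} references unknown tags: {unknown}')

print('policy.yaml looks sane; tags:', ', '.join(sorted(tags)))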

System Overview

Signal path

  1. Capture: take audio from mic (child) and output (others) via loopback.
  2. Transcribe: turn speech → text with Whisper (local).
  3. Detect: run rule checks (exact/approx phrase match) and optional NLU with a local LLM (Ollama).
  4. Respond: nudge child on screen; if required, send a DLNA alert to the living‑room TV; log for 14 days (rotating).

Privacy posture

  • Processing is local by default; no cloud needed.
  • Short rolling logs (14 days) for discussion, then auto‑purge.
  • Child awareness and agreement are built in.

Install Paths (choose one)

A) Arch Linux (PC or ArchWSL on Windows)

# Base
sudo pacman -Syu --noconfirm
sudo pacman -S --noconfirm python ffmpeg portaudio pipewire wireplumber pavucontrol

# Python deps (venv recommended)
python -m venv ~/.venvs/guardian && source ~/.venvs/guardian/bin/activate
pip install --upgrade pip
pip install faster-whisper sounddevice webrtcvad pyyaml requests plyer ollama

# Optional GPU accel (NVIDIA CUDA)
# sudo pacman -S --noconfirm cuda cudnn  # faster-whisper (via CTranslate2) can then use the GPU
Arch advantage: easy multimedia stack (PipeWire), good Python wheels, solid for RTX GPUs or M‑series via Asahi.

B) Alpine Linux (PC or Raspberry Pi 5)

Two routes:

Route 1 — Whisper.cpp (lightweight, no heavy wheels):

sudo apk add git build-base cmake ffmpeg portaudio-dev alsa-utils python3 py3-pip
# Build whisper.cpp (CPU works well on Pi 5)
cd /opt && sudo git clone https://github.com/ggerganov/whisper.cpp && cd whisper.cpp
make -j$(nproc)
# Download a model (small or base for speed; medium for accuracy)
./models/download-ggml-model.sh base

Then use the Python runner here to call whisper.cpp via subprocess (provided below).

Route 2 — Python wheels (varies on musl):

sudo apk add python3 py3-pip ffmpeg portaudio-dev
python3 -m venv ~/.venvs/guardian && source ~/.venvs/guardian/bin/activate
pip install sounddevice webrtcvad pyyaml requests plyer
# For STT prefer whisper.cpp on Alpine; faster-whisper wheels may be limited on musl.
Alpine note: musl libc can make some ML wheels tricky. Whisper.cpp is robust and fast on Pi 5 CPU.

C) macOS (M1/M2)

# Homebrew
brew install ffmpeg portaudio
python3 -m venv ~/.venvs/guardian && source ~/.venvs/guardian/bin/activate
pip install faster-whisper sounddevice webrtcvad pyyaml requests plyer ollama
# Optional: install Ollama app from https://ollama.com (provides local LLM runtime)

Windows with WSL (ArchWSL)

  1. Install ArchWSL.
  2. Follow the Arch Linux steps above inside WSL.
  3. For desktop notifications and audio capture, the Windows host (the in‑game machine) should run the notifier; WSL can do the transcription and detection. A hypothetical bridge sketch follows.
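One way to split those roles: run a tiny HTTP notifier on the Windows side (with plyer installed there) and have the WSL detector POST to it instead of calling plyer inside WSL. The script name, port 8765, and payload shape below are our assumptions, not an established API:

# win_notifier.py — hypothetical bridge, run on the Windows host (pip install plyer there).
# The WSL detector POSTs JSON here; from WSL2, reach the host via the gateway IP
# shown in /etc/resolv.conf rather than 127.0.0.1.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from plyer import notification

class NudgeHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get('Content-Length', 0))
        body = json.loads(self.rfile.read(length) or b'{}')
        notification.notify(title=body.get('title', 'Heads up'),
                            message=body.get('message', ''), timeout=5)
        self.send_response(204)
        self.end_headers()

if __name__ == '__main__':
    HTTPServer(('0.0.0.0', 8765), NudgeHandler).serve_forever()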

Audio Capture

PipeWire + Monitor (Arch)

  • Use a monitor source to capture what the child hears (game/party chat) and a mic source for what the child says.
  • Select them in pavucontrol → Recording tab while the capture script is running.

ALSA Loopback (Alpine / general)

# Enable ALSA loopback
sudo modprobe snd-aloop
# Verify: arecord -l  (should show Loopback)

Then choose hw:Loopback,1 for “what you hear” and the USB mic for microphone.
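Whichever route you take, the capture script needs concrete device IDs. A quick listing sketch (the IDs feed MIC_DEV and LOOP_DEV in the next section):

# list_inputs.py — print every capture-capable device with its ID
import sounddevice as sd

for idx, dev in enumerate(sd.query_devices()):
    if dev['max_input_channels'] > 0:
        print(f"{idx}: {dev['name']} ({dev['max_input_channels']} ch)")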


Minimal Python — transcription + rules + child nudge

Goal: run STT, check phrases from policy.yaml, and show a nudge/warning on the child’s screen.
# guardian_listen.py
import queue, threading, time, re, yaml
import numpy as np
import sounddevice as sd
from plyer import notification
from faster_whisper import WhisperModel  # or wrap whisper.cpp; see variant below

POLICY = yaml.safe_load(open('policy.yaml'))
NUDGE = POLICY['child_nudge']
BIG = POLICY['child_big_warning']

# --- Audio config (monitor/loopback + mic) ---
SAMPLE_RATE = 16000
BLOCK = 8000  # frames per callback = 0.5 s at 16 kHz

audio_q = queue.Queue()

# Choose input device IDs via sd.query_devices()
MIC_DEV = None        # e.g. 2
LOOP_DEV = None       # e.g. 5 (monitor/loopback)

# Each capture thread feeds the shared queue; the worker below pools
# chunks from both sources into one transcription window.

def capture(dev, name):
    def cb(indata, frames, time_info, status):
        if status:
            print(f'[{name}] stream status: {status}')
        audio_q.put(indata.copy())
    with sd.InputStream(device=dev, channels=1, samplerate=SAMPLE_RATE,
                        dtype='float32', blocksize=BLOCK, callback=cb):
        while True:
            time.sleep(1)

# Simple detector: phrase match tolerant of extra spacing between words

def risky(text):
    t = text.lower()
    matched_tags = set()
    for tag, phrases in POLICY['phrases'].items():
        for p in phrases:
            # join phrase words with \s+ so spacing variance cannot hide a match
            pattern = r'\b' + r'\s+'.join(re.escape(w) for w in p.split()) + r'\b'
            if re.search(pattern, t):
                matched_tags.add(tag)
                break
    return matched_tags

# Stage responses based on thresholds

def respond(tags):
    ts = set(tags)
    if set(POLICY['thresholds']['warn_child']) & ts:
        notification.notify(title='Safety Warning', message=BIG, timeout=8)
    elif set(POLICY['thresholds']['notify_child']) & ts:
        notification.notify(title='Heads up', message=NUDGE, timeout=5)
    return bool(set(POLICY['thresholds']['alert_parent']) & ts)

# Model (CPU ok for base/small)
model = WhisperModel('base')

# Worker: grab chunks, transcribe, decide

def worker():
    samples, total = [], 0
    while True:
        chunk = audio_q.get().reshape(-1)   # (frames, 1) -> flat mono
        samples.append(chunk)
        total += chunk.shape[0]
        if total >= SAMPLE_RATE * 5:  # ~5 s of audio buffered
            audio = np.concatenate(samples)  # float32 mono, as faster-whisper expects
            segments, _ = model.transcribe(audio, language='en')
            text = ' '.join(s.text for s in segments)
            tags = risky(text)
            if tags:
                need_parent = respond(tags)
                if need_parent:
                    # Fire LAN alert hook (DLNA / HTTP); see next section
                    pass
            samples, total = [], 0

if __name__ == '__main__':
    threading.Thread(target=capture, args=(MIC_DEV, 'mic'), daemon=True).start()
    threading.Thread(target=capture, args=(LOOP_DEV, 'loop'), daemon=True).start()
    worker()
Set MIC_DEV and LOOP_DEV after inspecting sd.query_devices(). The code pools audio from both sources and reacts every ~5 seconds.

Whisper.cpp variant (Alpine‑friendly)

Replace the model section with a subprocess call to whisper.cpp’s main or stream binary, writing temp WAV frames and reading back the transcript. This avoids heavy Python wheels on musl. A minimal sketch follows.
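The sketch assumes the classic main binary built in /opt/whisper.cpp and the ggml-base model from the install step; binary name and flags vary between whisper.cpp versions (newer builds ship whisper-cli). Capture buffers arrive as float32, so convert with (audio * 32767).astype('int16').tobytes() before calling it:

# whispercpp_stt.py — minimal subprocess wrapper for the whisper.cpp CLI
import os, subprocess, tempfile, wave

WHISPER_BIN = '/opt/whisper.cpp/main'                # newer builds: whisper-cli
MODEL = '/opt/whisper.cpp/models/ggml-base.bin'

def transcribe_pcm(pcm_bytes: bytes, sample_rate: int = 16000) -> str:
    """Write 16-bit mono PCM to a temp WAV, run whisper.cpp, return the text."""
    fd, wav_path = tempfile.mkstemp(suffix='.wav')
    os.close(fd)
    try:
        with wave.open(wav_path, 'wb') as w:
            w.setnchannels(1)
            w.setsampwidth(2)            # 16-bit samples
            w.setframerate(sample_rate)
            w.writeframes(pcm_bytes)
        out = subprocess.run(
            [WHISPER_BIN, '-m', MODEL, '-f', wav_path, '-nt'],  # -nt: no timestamps
            capture_output=True, text=True, timeout=60)
        return out.stdout.strip()
    finally:
        os.unlink(wav_path)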


Optional: Local NLU with Ollama (stronger context checks)

When to use: to reduce false positives and interpret paraphrased grooming attempts.

  1. Install Ollama (macOS app; Arch: curl -fsSL https://ollama.com/install.sh | sh).
  2. Pull a small model: ollama pull llama3.1:8b (or qwen2:7b).
  3. Start the runtime with ollama serve (the desktop app starts it automatically).
  4. Add a post‑filter step:
# ollama_filter.py
import json, requests

OLLAMA = 'http://127.0.0.1:11434/api/generate'
SYSTEM = (
  "You classify chat safety. Reply with JSON {\"unsafe\":true/false,\"reasons\":[..]} "
  "Unsafe if attempting secrecy, moving to private apps, asking age/school/address, "
  "meeting, or sexual content."
)

def unsafe_summary(text: str) -> dict:
    r = requests.post(OLLAMA, json={
        'model': 'llama3.1:8b',
        'system': SYSTEM,      # the generate API takes the system prompt as its own field
        'prompt': f"Text: {text}\nClassify.",
        'format': 'json',      # ask Ollama to constrain the reply to valid JSON
        'stream': False
    }, timeout=30)
    out = r.json().get('response', '{}')
    try:
        return json.loads(out)
    except json.JSONDecodeError:
        return {'unsafe': False, 'reasons': []}

Call unsafe_summary(text) after rule checks; only alert parents when unsafe is true.
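As a sketch, the two layers can be combined in one small gate (names match the scripts above; confirmed_parent_alert is our suggested helper, not an existing function):

# guardian_nlu.py — hypothetical gate combining rule tags with the local LLM
from ollama_filter import unsafe_summary

def confirmed_parent_alert(text, tags, respond):
    """True only when the rules escalate AND the LLM agrees the text is unsafe.

    text: the ~5 s transcript window; tags: set returned by risky();
    respond: the staging function from guardian_listen.py.
    """
    if not respond(tags):          # on-screen nudges still fire as usual
        return False
    verdict = unsafe_summary(text) # local second opinion via Ollama
    return bool(verdict.get('unsafe'))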

Local only. Keep the policy.yaml as final arbiter.

Discreet Parent Alert via DLNA (play an “Alert” clip on the TV)

Concept: discover a DLNA Digital Media Renderer (DMR) on the LAN, then send an AVTransport SOAP command to play a short MP3/MP4 alert (e.g., a 3‑second chime or TTS “Please check your child’s PC”).

Steps

  1. Discover DLNA renderer (TV / soundbar) with SSDP (UPnP):
    • Look for urn:schemas-upnp-org:device:MediaRenderer:1.
  2. Host the media: place alert.mp3 on a small HTTP server (could be the same machine):

python -m http.server 8000  # serves ./ on http://<host>:8000/

  3. Send SOAP to the renderer’s AVTransport control URL, as in the script below:

# dlna_alert.py
import socket, re, requests
from urllib.parse import urljoin

SSDP_ADDR = ('239.255.255.250', 1900)
MSEARCH = ("M-SEARCH * HTTP/1.1\r\n"
          "HOST: 239.255.255.250:1900\r\n"
          "MAN: \"ssdp:discover\"\r\n"
          "MX: 2\r\n"
          "ST: urn:schemas-upnp-org:device:MediaRenderer:1\r\n\r\n").encode()

ALERT_URL = "http://YOUR_HOST:8000/alert.mp3"

def find_renderer(timeout=2.0):
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.settimeout(timeout)
    s.sendto(MSEARCH, SSDP_ADDR)
    locs = []
    try:
        while True:
            data, _ = s.recvfrom(65535)
            m = re.search(rb'(?i)^location:\s*(.+)$', data, re.M)
            if m: locs.append(m.group(1).decode().strip())
    except socket.timeout:
        pass
    # Choose the first for simplicity
    return locs[0] if locs else None

SOAPENV = {
 'envelope': "http://schemas.xmlsoap.org/soap/envelope/",
 'avt':      "urn:schemas-upnp-org:service:AVTransport:1",
}

def play_alert(device_desc_url):
    # Fetch device description, find AVTransport control URL
    d = requests.get(device_desc_url, timeout=3).text
    # Very simple regex extraction (production: use a UPnP lib)
    m = re.search(r'<serviceType>urn:schemas-upnp-org:service:AVTransport:1</serviceType>.*?<controlURL>(.*?)</controlURL>', d, re.S)
    if not m:
        return  # renderer exposes no AVTransport service
    # controlURL is often relative; resolve it against the description URL
    ctrl = urljoin(device_desc_url, m.group(1))

    set_uri = f"""
    <s:Envelope xmlns:s='{SOAPENV['envelope']}'>
      <s:Body>
        <u:SetAVTransportURI xmlns:u='{SOAPENV['avt']}'>
          <InstanceID>0</InstanceID>
          <CurrentURI>{ALERT_URL}</CurrentURI>
          <CurrentURIMetaData></CurrentURIMetaData>
        </u:SetAVTransportURI>
      </s:Body>
    </s:Envelope>
    """

    play = f"""
    <s:Envelope xmlns:s='{SOAPENV['envelope']}'>
      <s:Body>
        <u:Play xmlns:u='{SOAPENV['avt']}'>
          <InstanceID>0</InstanceID>
          <Speed>1</Speed>
        </u:Play>
      </s:Body>
    </s:Envelope>
    """

    headers = {
      'Content-Type': 'text/xml; charset="utf-8"',
      'SOAPACTION': '"urn:schemas-upnp-org:service:AVTransport:1#SetAVTransportURI"'
    }
    requests.post(ctrl, data=set_uri, headers=headers, timeout=3)
    headers['SOAPACTION'] = '"urn:schemas-upnp-org:service:AVTransport:1#Play"'
    requests.post(ctrl, data=play, headers=headers, timeout=3)

if __name__ == '__main__':
    dev = find_renderer()
    if dev:
        play_alert(dev)

Hook it up: from guardian_listen.py, call dlna_alert.find_renderer() and then dlna_alert.play_alert() whenever an alert_parent condition is met.

Alternative: send a quiet HTTP webhook/MQTT to a parent phone app or a wall tablet.
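A sketch of that quieter route (the URL, port, and payload are placeholders for whatever the parent device actually listens on):

# webhook_alert.py — hypothetical quiet alert; URL and payload are placeholders
import requests

PARENT_URL = 'http://192.168.1.50:8123/api/webhook/guardian-alert'

def alert_parent(summary: str):
    try:
        requests.post(PARENT_URL,
                      json={'event': 'guardian_alert', 'summary': summary},
                      timeout=3)
    except requests.RequestException:
        pass  # never let a failed alert crash the listener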

Conversation first — how a parent explains this

Script for the kitchen table

  • “There are people online who pretend to be your age. Most are fine; some are not.”
  • “This computer can nudge you if a chat sounds risky, just like a seatbelt reminder.”
  • “Nothing’s secret here. You’ll see on‑screen messages first. If they’re serious, we’ll get a small alert on the TV so we can help quickly.”
  • “We’ll keep a tiny log for two weeks. Then it’s gone.”
  • “If you ever want to review or change the words it listens for, we’ll open it in vim together.”

Consent pattern

  • Agreement is revisited every few months.
  • The child can read the rules file and suggest edits.
  • Clear exit: at adulthood or earlier, by family choice, the system is removed.

Whisper itself does not detect the age of a speaker — it is purely a speech‑to‑text (STT) system. It transcribes the words it hears into text, without making judgments about who is speaking. Determining age from voice would require a separate voice‑profiling or speaker‑characterisation model, which carries its own accuracy limits, privacy implications, and ethical concerns.

If a parent wanted to explore that path, they would need to use an additional tool — for example, a machine‑learning model trained on vocal features such as pitch, tone, and formant distribution — to make an approximate guess about whether the speaker sounds like a child, teen, or adult. Such systems can be wrong, especially with voices that naturally sound older or younger than the speaker’s true age, or when audio quality is poor.

Because of those limitations and the potential for false positives, the safer and more transparent approach is to focus on what is being said (using Whisper and local language analysis) rather than trying to guess who is saying it. This keeps the system aligned with its stated purpose: protecting children based on the content of the conversation, with the child’s knowledge and consent, rather than making assumptions about a speaker’s identity.


Service layout and hardening

Run as a user service (systemd example on Arch):

# ~/.config/systemd/user/guardian.service
[Unit]
Description=Guardian Voice Safety
After=default.target

[Service]
Environment=VIRTUAL_ENV=%h/.venvs/guardian
Environment=PATH=%h/.venvs/guardian/bin:/usr/bin
WorkingDirectory=%h/guardian
ExecStart=%h/.venvs/guardian/bin/python guardian_listen.py
Restart=on-failure

[Install]
WantedBy=default.target

Reload and enable it:

systemctl --user daemon-reload
systemctl --user enable --now guardian.service

Logging & retention

  • Write summaries (timestamp + short quote + tags) to a local SQLite DB or JSONL.
  • Daily cron trims anything older than retention_days (a minimal sketch of both follows).
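A minimal JSONL sketch of both halves, reading retention_days from policy.yaml (the guardian_log.jsonl file name is our choice):

# guardian_log.py — append summaries, purge anything past retention_days
import json, os, time, yaml

LOG = 'guardian_log.jsonl'
RETENTION_DAYS = yaml.safe_load(open('policy.yaml'))['retention_days']

def log_event(quote, tags):
    with open(LOG, 'a') as f:
        f.write(json.dumps({'ts': time.time(),
                            'quote': quote[:120],
                            'tags': sorted(tags)}) + '\n')

def purge():
    """Drop entries older than retention_days; run daily."""
    if not os.path.exists(LOG):
        return
    cutoff = time.time() - RETENTION_DAYS * 86400
    with open(LOG) as f:
        keep = [line for line in f if json.loads(line)['ts'] >= cutoff]
    with open(LOG, 'w') as f:
        f.writelines(keep)

if __name__ == '__main__':
    purge()

Schedule the purge nightly from a user crontab, e.g. 0 3 * * * python ~/guardian/guardian_log.py (the path matches the WorkingDirectory in the service unit above).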

Network

  • No inbound ports required.
  • DLNA control is LAN‑only.

Troubleshooting

No audio captured

  • Arch: ensure PipeWire sources selected in pavucontrol while the program is recording.
  • Alpine: load snd-aloop; pick correct devices with arecord -l and sd.query_devices().

Transcription slow

  • Switch the Whisper model to a smaller one (e.g. base → tiny).
  • Use faster‑whisper with GPU; or whisper.cpp on Pi 5 for strong CPU performance.

False positives

  • Tighten phrase lists; add context via Ollama filter.
  • Require multiple hits within a time window before escalating (see the sketch below).
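One way to implement that, as a sketch (the ten-minute window and two-hit threshold are arbitrary defaults to tune):

# escalate_gate.py — sketch: only escalate after repeated hits in a window
import time
from collections import deque

WINDOW_S = 600   # ten minutes; tune to taste
MIN_HITS = 2
_hits = deque()

def should_escalate():
    """Record a hit; return True once MIN_HITS land inside the window."""
    now = time.time()
    _hits.append(now)
    while _hits and now - _hits[0] > WINDOW_S:
        _hits.popleft()
    return len(_hits) >= MIN_HITS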

DLNA not playing

  • Some TVs gate AVTransport. Try a soundbar or a Chromecast alternative (cast a short WebRTC page). Consider MQTT → Home Assistant → TTS to speaker.

Why no roadmap

Breath Technology will not set out a feature roadmap for this. This is not a product to upsell or a service to subscribe to — it is a do‑it‑yourself pattern written to save a parent from paying for a cloud‑based monitoring tool when the same can be done locally, under their own roof, with equipment they already have.

As with building a cabinet from a set of instructions, we give you the basic tools and steps; only here, it is code, configuration, and clear guidance instead of MDF, glue, and screws.

Because the subject is sensitive and the purpose is child safety, Breath Technology refuses to profit from it. We simply say: yes, it is possible, here is how, and please do not buy what you can build yourself. The danger is real enough to warrant caution, but not panic. The focus is on awareness, consent, and protection — always for the benefit of the child.


Closing note

This pattern protects without pretending to replace wisdom at home. A gentle reminder on the screen, a quiet alert in the lounge, and a conversation that keeps a young person safe today — and grateful tomorrow.