slaytheaccent

● LIVE · SR 16 000 Hz · 23 dialects · wav2vec2-xlsr
v0.1-beta · node: sta-mtl-01 · ■ rec ready
01 system · slaytheaccent · 2026-04-20
the mouth
is an instrument.
we tune it.

slaytheaccent is a phoneme-level pronunciation coach for L2 English speakers. Three acoustic models align your recording against a native reference. You see exactly which sounds to fix, and which to leave alone.
sample · “water” · /w ɔ ɾ ɚ/ · 00:00.54
02 phoneme · matrix · 36/44
consonants: p b t d k g f v θ ð s z ʃ ʒ h m n ŋ l ɫ ɹ w j ɾ
vowels: i ɪ e ɛ æ ʌ ə ɚ u ʊ o ɔ
phonemes · ipa: 44 (trained on mixed corpus)
dialects: 23 (en-us · en-gb · ...)
assess · median: 540 ms (per 3-sec utterance)
learners · beta: 1 402 (from 41 L1 backgrounds)
phase 01 · 100 ms latency
capture
Record a phrase. The signal is segmented by Silero VAD, then windowed to match the reference length.
audio → vad → window
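
A minimal sketch of the windowing step, under stated assumptions: the VAD has already produced speech segments as dicts with "start" and "end" sample offsets (the shape Silero VAD's timestamp helper is understood to return), and "windowed to match the reference length" is read here as trimming to the voiced span, then zero-padding or cutting to the reference's sample count. The function name `window_to_reference` is hypothetical and not part of the product.

```python
from typing import Dict, List

SAMPLE_RATE = 16_000  # sample rate stated on the page


def window_to_reference(
    samples: List[float],
    speech: List[Dict[str, int]],
    ref_len: int,
) -> List[float]:
    """Trim to the voiced span, then pad or cut to exactly ref_len samples.

    `speech` is a list of {"start": int, "end": int} sample offsets, assumed
    to come from a VAD pass. This is an illustrative interpretation only.
    """
    if not speech:
        # No speech detected: return silence of the reference length.
        return [0.0] * ref_len

    start = speech[0]["start"]   # first voiced sample
    end = speech[-1]["end"]      # one past the last voiced sample
    voiced = samples[start:end]

    if len(voiced) >= ref_len:
        return voiced[:ref_len]                          # cut the tail
    return voiced + [0.0] * (ref_len - len(voiced))      # zero-pad


# Usage: 50 voiced samples surrounded by silence, reference of 80 samples.
signal = [0.0] * 100 + [1.0] * 50 + [0.0] * 100
windowed = window_to_reference(signal, [{"start": 100, "end": 150}], 80)
```

Zero-padding is only one option; a real pipeline might instead time-stretch or let the aligner absorb length differences.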
phase 02 · 3-model vote
score
Three wav2vec2 variants emit per-frame phoneme distributions. A DP alignment against the reference phoneme sequence yields a score per phoneme.
frames → align → score
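
The page does not say how the three models are combined or which DP is used, so the following is only a sketch under assumptions: the "vote" is a plain average of the per-frame distributions, the alignment is a Viterbi-style monotonic forced alignment (each frame belongs to one reference phoneme, and the path either stays or advances by one), and a phoneme's score is the mean probability of the expected phoneme over its aligned frames. All function names are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Frame = Dict[str, float]  # phoneme -> probability for one frame


def average_models(models: List[List[Frame]]) -> List[Frame]:
    """Average per-frame distributions across models (assumed 'vote')."""
    out: List[Frame] = []
    for t in range(len(models[0])):
        keys = set().union(*(m[t].keys() for m in models))
        out.append(
            {k: sum(m[t].get(k, 0.0) for m in models) / len(models) for k in keys}
        )
    return out


def align_and_score(frames: List[Frame], reference: List[str]) -> List[Tuple[str, float]]:
    """Monotonic forced alignment by dynamic programming, then per-phoneme scores."""
    T, N = len(frames), len(reference)
    if N == 0 or T < N:
        raise ValueError("need at least one frame per reference phoneme")

    NEG = float("-inf")

    def logp(t: int, j: int) -> float:
        # Floor the probability so log() never sees zero.
        return math.log(max(frames[t].get(reference[j], 0.0), 1e-8))

    best = [[NEG] * N for _ in range(T)]   # best log-score ending at (t, j)
    back = [[0] * N for _ in range(T)]     # predecessor phoneme index
    best[0][0] = logp(0, 0)                # path must start at the first phoneme

    for t in range(1, T):
        for j in range(min(t + 1, N)):
            stay = best[t - 1][j]
            advance = best[t - 1][j - 1] if j > 0 else NEG
            if advance > stay:
                best[t][j], back[t][j] = advance + logp(t, j), j - 1
            else:
                best[t][j], back[t][j] = stay + logp(t, j), j

    # Backtrace from the last phoneme at the last frame.
    assign = [0] * T
    j = N - 1
    for t in range(T - 1, -1, -1):
        assign[t] = j
        j = back[t][j]

    scores = []
    for j, ph in enumerate(reference):
        probs = [frames[t].get(ph, 0.0) for t in range(T) if assign[t] == j]
        scores.append((ph, sum(probs) / len(probs)))
    return scores


# Usage with toy distributions over two phonemes.
toy = [{"a": 0.9, "b": 0.1}, {"a": 0.8, "b": 0.2}, {"a": 0.1, "b": 0.9}]
toy_scores = align_and_score(toy, ["a", "b"])
```

Scoring by mean posterior is a simple stand-in; goodness-of-pronunciation style likelihood ratios are a common alternative.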
phase 03 · adaptive
drill
The weakest phonemes surface in the drill queue. You shadow the native speaker, re-record, and iterate until the score moves.
delta → queue → repeat
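
One way the drill queue could be built, as a sketch only: average each phoneme's score across recent attempts and surface the lowest few below a cutoff. The queue size `k`, the `threshold` of 0.8, and the function name `drill_queue` are all assumptions for illustration; the page does not describe the actual selection rule.

```python
import heapq
from collections import defaultdict
from typing import Dict, List, Tuple


def drill_queue(
    attempts: List[List[Tuple[str, float]]],
    k: int = 3,
    threshold: float = 0.8,
) -> List[str]:
    """Return up to k phonemes with the lowest mean score below threshold."""
    history: Dict[str, List[float]] = defaultdict(list)
    for phoneme_scores in attempts:
        for phoneme, score in phoneme_scores:
            history[phoneme].append(score)

    # Mean score per phoneme across all attempts.
    means = {ph: sum(v) / len(v) for ph, v in history.items()}

    # Keep only phonemes that still need work, weakest first.
    weak = [(score, ph) for ph, score in means.items() if score < threshold]
    return [ph for _, ph in heapq.nsmallest(k, weak)]


# Usage with the per-phoneme scores from the sample response for "water".
queue = drill_queue([[("w", 1.0), ("ɾ", 0.12), ("ɚ", 0.55)]])
```

Re-running the function after each new recording gives the "delta → queue → repeat" loop: a phoneme drops out once its mean clears the cutoff.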
output · sample
the feedback is
a diagnosis,
not a grade.

We don’t tell you “80%”. We tell you the /ɾ/ in “water” was realized as /t/, the /ɚ/ lost its rhoticity, and the /w/ was perfect. Actionable. Repeatable.
// POST /v0/assess
{
  "word": "water",
  "assess_score": 0.74,
  "phonemes": [
    { "expected": "w", "observed": "w", "score": 1.0 },
    { "expected": "ɾ", "observed": "t", "score": 0.12 },
    { "expected": "ɚ", "observed": "ə", "score": 0.55 }
  ]
}
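
To show how a client might turn such a response into the kind of diagnosis described above, here is a small sketch. It embeds the sample payload as strict JSON (quoted keys, no comment line) and assumes only the three fields shown in the sample; the `diagnose` helper and its wording are hypothetical.

```python
import json
from typing import Any, Dict, List

# The sample response from the page, written as strict JSON.
SAMPLE = """
{
  "word": "water",
  "assess_score": 0.74,
  "phonemes": [
    { "expected": "w", "observed": "w", "score": 1.0 },
    { "expected": "ɾ", "observed": "t", "score": 0.12 },
    { "expected": "ɚ", "observed": "ə", "score": 0.55 }
  ]
}
"""


def diagnose(response: Dict[str, Any]) -> List[str]:
    """Produce one human-readable line per phoneme in the response."""
    lines = []
    for p in response["phonemes"]:
        if p["expected"] == p["observed"]:
            lines.append(f"/{p['expected']}/ ok ({p['score']:.2f})")
        else:
            lines.append(
                f"/{p['expected']}/ realized as /{p['observed']}/ ({p['score']:.2f})"
            )
    return lines


report = diagnose(json.loads(SAMPLE))
for line in report:
    print(line)
```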
boot the coach.

slaytheaccent · phonetic lab · 2026
wav2vec2 · whisper · silero-vad