Sign language, made shareable

A bridge between signed and spoken.

BridgeTalk fuses hand tracking, body pose, and facial landmarks to recognize location-anchored signs — touching the chin for thank you, the chest for me, the ear for hear. Real signing is more than gestures in the air; we read where the hand meets the body.

~90

Curated signs

543

Landmarks / frame

Alphabet letters (ML)

Servers, accounts, uploads

01 — Pipeline

Holistic landmarks

21 hand × 2, 33 body, and 468 face points per frame from MediaPipe Holistic, fused into a single feature record.
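As a rough illustration of the fusion step, the sketch below flattens one frame into a fixed-length record. It is written in Python for brevity and is not BridgeTalk's actual code. The (x, y, z) layout per landmark, the zero-fill for undetected parts, and the name `fuse_frame` are all assumptions.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float, float]  # assumed (x, y, z) layout per landmark

HAND_POINTS = 21
POSE_POINTS = 33
FACE_POINTS = 468
TOTAL_POINTS = 2 * HAND_POINTS + POSE_POINTS + FACE_POINTS  # 543


def _block(points: Optional[Sequence[Point]], expected: int) -> List[Point]:
    """Return `expected` points, zero-filled when the part was not detected."""
    if points is None:
        return [(0.0, 0.0, 0.0)] * expected
    if len(points) != expected:
        raise ValueError(f"expected {expected} points, got {len(points)}")
    return list(points)


def fuse_frame(
    left_hand: Optional[Sequence[Point]],
    right_hand: Optional[Sequence[Point]],
    pose: Optional[Sequence[Point]],
    face: Optional[Sequence[Point]],
) -> List[float]:
    """Flatten one frame into a single record of 543 points x 3 coordinates."""
    points = (
        _block(left_hand, HAND_POINTS)
        + _block(right_hand, HAND_POINTS)
        + _block(pose, POSE_POINTS)
        + _block(face, FACE_POINTS)
    )
    return [coord for point in points for coord in point]
```

Keeping the record length fixed, even when a hand is out of view, means downstream code can index into it by position.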

02 — Anchors

14 body regions

Chin, forehead, mouth, nose, ears, cheeks, temples, chest, shoulders, neutral space, and lap. Each region is sized relative to the person's face width.
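One way to size regions by face width is sketched below. The circular region shape, the `scale` factor, and the names `Region`, `face_width`, and `make_region` are illustrative assumptions, not the app's real geometry.

```python
import math
from dataclasses import dataclass
from typing import Tuple

Point2D = Tuple[float, float]  # normalized image coordinates (assumed)


@dataclass(frozen=True)
class Region:
    name: str
    center: Point2D
    radius: float


def face_width(left_edge: Point2D, right_edge: Point2D) -> float:
    """Distance between two face-edge landmarks, used as the unit of scale."""
    return math.dist(left_edge, right_edge)


def make_region(name: str, center: Point2D, width: float, scale: float) -> Region:
    """Build a circular region whose radius is a fraction of the face width."""
    return Region(name, center, scale * width)
```

Because the radius is a multiple of face width, the same `scale` gives a larger chin region for a signer close to the camera and a smaller one for a signer farther away.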

03 — Contact

Fingertip ↔ region

Fingertip-to-region proximity with dwell tracking, which distinguishes a sustained touch from an accidental pass-through.
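Dwell tracking can be sketched as a small state machine: start a timer when the fingertip enters a region, reset it when the fingertip leaves, and report a contact once the hold time is reached. The class name, the 0.3 s default, and the circular distance test are assumptions for illustration.

```python
import math
from typing import Optional, Tuple

Point2D = Tuple[float, float]


class DwellTracker:
    """Reports a contact once a fingertip has stayed inside a region long enough."""

    def __init__(self, hold_seconds: float = 0.3) -> None:  # hypothetical default
        self.hold_seconds = hold_seconds
        self._entered_at: Optional[float] = None
        self._fired = False

    def update(self, fingertip: Point2D, center: Point2D, radius: float, now: float) -> bool:
        """Feed one frame; returns True exactly once per sustained touch."""
        inside = math.dist(fingertip, center) <= radius
        if not inside:
            # Leaving the region resets the timer, so a pass-through never fires.
            self._entered_at = None
            self._fired = False
            return False
        if self._entered_at is None:
            self._entered_at = now
        held = now - self._entered_at
        if held >= self.hold_seconds and not self._fired:
            self._fired = True
            return True
        return False
```

A fingertip that crosses the chin region for two frames and moves on never accumulates enough dwell time, while one that rests there triggers a single contact event.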

04 — Vocabulary

~90 signs

Location, motion, and shape signs combined: thank you, eat, hear, think, sorry, me, you, plus numbers and a fingerspelling fallback.

05 — Sentence

Real-time text

Auto-spaced output with capitalization, punctuation, undo and clear. Read aloud with the device's best voice. History saved locally.
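A minimal sketch of auto-spacing, capitalization, undo, and clear follows. The token-list design and the `SentenceBuilder` name are assumptions; speech output and history are left out.

```python
from typing import List

SENTENCE_END = {".", "?", "!"}
PUNCTUATION = SENTENCE_END | {","}


class SentenceBuilder:
    """Collects recognized signs and renders them as spaced, capitalized text."""

    def __init__(self) -> None:
        self._tokens: List[str] = []

    def add(self, token: str) -> None:
        self._tokens.append(token)

    def undo(self) -> None:
        """Drop the most recent token, if any."""
        if self._tokens:
            self._tokens.pop()

    def clear(self) -> None:
        self._tokens.clear()

    def text(self) -> str:
        out = ""
        capitalize_next = True  # the first word starts a sentence
        for token in self._tokens:
            if token in PUNCTUATION:
                out += token  # punctuation attaches without a leading space
                if token in SENTENCE_END:
                    capitalize_next = True
                continue
            word = token[:1].upper() + token[1:] if capitalize_next else token
            out += word if not out else " " + word
            capitalize_next = False
        return out
```

Storing tokens rather than a finished string keeps undo simple: removing the last token and re-rendering restores the correct spacing and capitalization.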

06 — Privacy

Stays in your browser

No server uploads, no account. Holistic runs locally in the browser; the optional alphabet model is the only part that calls out, and it talks to a Python backend running on your own machine.

07 — Tunable

Tweakable in-app

The settings panel lets you adjust confidence, hold time, cooldown, overlay layers, TTS speed, and audio feedback live.
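To show how a confidence threshold and a cooldown might interact, here is a hedged sketch. The `Settings` fields, their default values, and the `EmitGate` class are hypothetical. The gate reads the settings on every call, so a change takes effect on the next frame.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Settings:
    min_confidence: float = 0.7    # hypothetical default
    cooldown_seconds: float = 1.0  # hypothetical default


class EmitGate:
    """Decides whether a recognized sign should be added to the sentence."""

    def __init__(self, settings: Settings) -> None:
        self.settings = settings
        self._last_emit: Dict[str, float] = {}

    def should_emit(self, sign: str, confidence: float, now: float) -> bool:
        if confidence < self.settings.min_confidence:
            return False
        last = self._last_emit.get(sign)
        # The cooldown blocks the same sign from repeating while it is still held.
        if last is not None and now - last < self.settings.cooldown_seconds:
            return False
        self._last_emit[sign] = now
        return True
```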

08 — Open

Retrainable model

The alphabet model is a small RandomForest. Train it on real keypoint data with one script and drop the pickle into models/.

Honest scope

This is a real recognition engine, not a research demo of full sign-language translation. There is no pretrained, browser-ready model in 2026 that recognizes general ASL or PSL conversation. The state of the art needs server-side GPUs and still tops out around 60–70% accuracy on a few thousand isolated signs.

What you get here: ~90 signs that work reliably because their multimodal signature (handshape × location × motion) is unambiguous. The recognizer is built so a trained model can be plugged in later via a one-line hook. See the README for the roadmap.
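The idea of a multimodal signature with a pluggable model can be sketched as a lookup table plus an optional fallback. The label strings, the `Recognizer` class, and the `model_hook` attribute are illustrative assumptions. The README describes the actual hook.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple

Signature = Tuple[str, str, str]  # (handshape, location, motion)
ModelHook = Callable[[Sequence[float]], Optional[str]]

# Hypothetical labels; the location for "yes" is an assumption.
SIGNS: Dict[Signature, str] = {
    ("open_palm", "chin", "outward"): "thank you",
    ("index", "chest", "none"): "me",
    ("open_palm", "temple", "outward"): "hello",
    ("fist", "neutral", "down"): "yes",
}


class Recognizer:
    def __init__(self) -> None:
        # Assign a callable here to plug in a trained model later.
        self.model_hook: Optional[ModelHook] = None

    def recognize(self, signature: Signature, features: Sequence[float] = ()) -> Optional[str]:
        """Try the curated table first, then fall back to the model if one is set."""
        sign = SIGNS.get(signature)
        if sign is None and self.model_hook is not None:
            sign = self.model_hook(features)
        return sign
```

With this shape, plugging in a model is a single assignment such as `recognizer.model_hook = my_model.predict_one`, where `my_model.predict_one` stands for whatever callable the trained model exposes.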

Try these first

thank you

Open palm, fingertips touch chin, then move outward.

me

Index finger pointing at your own chest.

hello

Open palm at temple/forehead, small wave outward.

yes

Closed fist, knuckles facing the camera, nodding downward.