Lip Reading and Visemes


Lip Reading

Lip reading is the process of watching how the lips, teeth, and tongue move, then inferring from these movements which phoneme is being articulated. It is used to reconstruct words from visual cues alone.


A viseme is the most basic unit of mouth and facial movement that accompanies the production of a phoneme; it is the phoneme's visual counterpart. Thirty-two visemes are required in order to produce all possible phonemes with the human face.

Visemes can help with understanding speech: if a phoneme is distorted or muffled, the viseme accompanying it can help to clarify what the sound actually was. Thus, the visual and auditory components work together in spoken communication.

We are just on the cusp of an age of visemes in virtual environments. Lip-synching technologies have advanced to the point where, given an input text stream and knowledge of its language (and therefore of how its phonemes are pronounced), a computer program can animate a virtual face with a compatible muscle structure at the same time as it converts the text to speech. Likewise, given an audio stream, once the phonemes have been identified from the audio track alone, the matching visemes can be reproduced on a virtual avatar.
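The text-driven half of that pipeline can be sketched in a few lines. This is a minimal illustration, not how any particular lip-synching product works: the lexicon, the informal phoneme labels, the viseme names, and the function `text_to_visemes` are all hypothetical and heavily abbreviated. The one grounded detail is that several phonemes can share a viseme; the sounds p, b, and m are all made with closed lips and look alike on the face.

```python
# Hypothetical, heavily abbreviated tables for illustration only.
# A real system would use a full pronunciation lexicon for each language
# and a complete viseme inventory.
PRONUNCIATIONS = {
    "en": {
        "map": ["m", "ae", "p"],
        "bat": ["b", "ae", "t"],
    },
}

# Many-to-one mapping: several phonemes share the same visible mouth shape.
PHONEME_TO_VISEME = {
    "p": "lips_closed",
    "b": "lips_closed",
    "m": "lips_closed",
    "ae": "jaw_open",
    "t": "tongue_alveolar",
}


def text_to_visemes(text, language):
    """Convert text to a flat list of viseme names for the given language."""
    lexicon = PRONUNCIATIONS[language]
    visemes = []
    for word in text.lower().split():
        # Text -> phonemes depends on the language; phonemes -> visemes
        # is then a simple table lookup.
        for phoneme in lexicon[word]:
            visemes.append(PHONEME_TO_VISEME[phoneme])
    return visemes


if __name__ == "__main__":
    print(text_to_visemes("map bat", "en"))
```

The resulting list is what an animation layer would consume, one mouth shape at a time, while the speech synthesiser plays the matching phonemes.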

This puts us in a unique position, where spoken and written language can be represented equally within a virtual environment, at least as far as visual appearance is concerned, if not yet audibly.

However, this is why knowing the language is important, especially for written text. Different languages use different phonemes, and those sounds require different muscle movements to pronounce. One study found, for example, 'lip-rounding' in French speakers and prominent tongue movements in Arabic speakers. Thus, with text especially, it is necessary to know the language used before beginning the translation into visemes.

There are several ways to do this. One of the simplest is to tell the system which language is being used before typing begins. Another is to have the system analyse each word before it acts. This is fraught with difficulty, as words that look identical as text are often pronounced with different phonemes, and thus with different visemes, in different languages.
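The two approaches, and the difficulty with the second, can be sketched as follows. Everything here is an assumption made for illustration: the tiny lexicons, the informal phoneme labels, and the functions `candidate_languages` and `pronounce` are hypothetical. The word "chat" serves as the example because it is spelled identically in English and French but pronounced quite differently.

```python
# Hypothetical per-language lexicons with informal, approximate phoneme labels.
LEXICONS = {
    "en": {
        "chat": ["ch", "ae", "t"],
        "dog": ["d", "o", "g"],
    },
    "fr": {
        "chat": ["sh", "a"],
        "chien": ["sh", "y", "en"],
    },
}


def candidate_languages(word):
    """Return every language whose lexicon contains this spelling."""
    return sorted(lang for lang, lexicon in LEXICONS.items() if word in lexicon)


def pronounce(word, declared_language=None):
    """Look up phonemes, preferring a language declared up front."""
    if declared_language is not None:
        # Simplest approach: the user told the system the language.
        return LEXICONS[declared_language][word]
    # Word-by-word analysis: only safe when exactly one language matches.
    candidates = candidate_languages(word)
    if len(candidates) != 1:
        raise ValueError(f"cannot decide language for {word!r}: {candidates}")
    return LEXICONS[candidates[0]][word]


if __name__ == "__main__":
    print(candidate_languages("chat"))   # both lexicons contain this spelling
    print(pronounce("chat", "fr"))       # declared language resolves it
    print(pronounce("chien"))            # unambiguous, so analysis works
```

Calling `pronounce("chat")` with no declared language raises an error in this sketch, which is exactly the ambiguity described above: the spelling alone does not say which phonemes, and so which visemes, to use.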


Lip-reading computer picks out your language
