Pronunciation of Key Sounds Influences Perception of Avatar's Gender
In a discovery with a great deal of potential for creating wholly artificial virtual voices, rather than voices derived from actors and actresses playing a part, Lal Zimman, a researcher at the University of Colorado Boulder, has found that the style of speech, rather than simply the pitch or the voice used, is key to how we interpret a speaker's gender. Getting the style wrong while getting the other two right will confuse the listener.
His research was based on extensive studies of transgender individuals as they underwent the transition from female to male.
As part of the process of transitioning from female to male, participants in Zimman’s study were treated with the hormone testosterone, which causes a number of physical changes including the lowering of a person’s voice. Zimman was interested in whether the style of a person’s speech had any impact on how low a voice needed to drop before it was perceived as male.
What he found was that a voice could have a higher pitch and still be perceived as male if the speaker pronounced “s” sounds at a lower frequency, which is achieved by moving the tongue farther away from the teeth.
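To make "the frequency of an s sound" concrete, here is a minimal sketch of one way to summarise it: the spectral centroid, or power-weighted average frequency, of a short slice of audio. This is an assumption for illustration only; the study's own measure is not described here. The function names, the 16 kHz sample rate and the pure tones standing in for real "s" recordings are all hypothetical.

```python
import cmath
import math


def spectral_centroid(samples, sample_rate):
    """Return the power-weighted mean frequency (Hz) of a short signal."""
    n = len(samples)
    weighted = 0.0
    total = 0.0
    # Naive DFT over the non-negative frequency bins; fine for short frames.
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(samples)
        )
        power = abs(coeff) ** 2
        weighted += (k * sample_rate / n) * power
        total += power
    if total == 0.0:
        raise ValueError("signal is silent")
    return weighted / total


def tone(freq_hz, sample_rate=16000, n=256):
    """Hypothetical stand-in for an 's' sound: a pure tone at freq_hz."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]


# A "lower" and a "higher" s, compared by their centroids.
lower_s = spectral_centroid(tone(4000), 16000)
higher_s = spectral_centroid(tone(6000), 16000)
print(lower_s < higher_s)
```

A real "s" is noisy rather than a single tone, so its centroid is a summary of where most of the energy sits, not an exact pitch.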
“A high-frequency ‘s’ has long been stereotypically associated with women’s speech, as well as gay men’s speech, yet there is no biological correlate to this association,” said CU-Boulder linguistics and anthropology Associate Professor Kira Hall, who served as Zimman’s doctoral adviser. “The project illustrates the socio-biological complexity of pitch: the designation of a voice as more masculine or more feminine is importantly influenced by other ideologically charged speech traits that are socially, not biologically, driven.”
For his study, Zimman recorded the voices of 15 transgender men, all of whom live in the San Francisco Bay Area. To determine the frequency of the “s” sounds each participant made, Zimman used software developed by fellow linguists. Then, to see how the “s” sounds affected perception, Zimman digitally manipulated the recording of each participant’s voice, sliding the pitch from higher to lower, and asked a group of 10 listeners to identify the gender of the speaker. Using the recordings, Zimman was able to pinpoint how low each individual’s voice had to drop before the majority of the group perceived the speaker to be male.
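The last step of that procedure, finding the point at which most listeners hear a voice as male, can be sketched in a few lines. This is an illustrative reading of the description above, not the study's actual analysis. The function name, the data layout and the listener responses are all invented for the example.

```python
def male_majority_threshold(responses_by_pitch):
    """Return the highest pitch (Hz) at which more than half of the listeners
    answered "male", scanning from high to low, or None if none do."""
    for pitch in sorted(responses_by_pitch, reverse=True):
        answers = responses_by_pitch[pitch]
        if sum(a == "male" for a in answers) > len(answers) / 2:
            return pitch
    return None


# Invented responses from 10 listeners to one speaker at four pitch settings.
trial = {
    180: ["female"] * 8 + ["male"] * 2,
    160: ["female"] * 5 + ["male"] * 5,
    140: ["female"] * 3 + ["male"] * 7,
    120: ["male"] * 10,
}

print(male_majority_threshold(trial))  # prints 140
```

Comparing this threshold across speakers with lower and higher “s” sounds is what would show whether pronunciation shifts the point at which a voice is heard as male.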
For our purposes, this work has many uses. As stated above, it is of immediate benefit in creating wholly artificial voices with the correct gender cues for use by virtual actors and by virtually embodied AI-driven sales avatars, whose voices must be modulated to suit each individual customer for maximum... well, sales.
It also, of course, has uses far closer to the line of work it originated in: separating the physical constraints of a user of a social virtual environment from the virtual body the environment gives them. Specifically, it can help create a virtual voice far closer to the user's preferred virtual form than the voice their physical form possesses. There is a perhaps unsurprising amount of experimentation by users in this area, and any development that offers a fuller, more complete embodiment in the virtual form, away from the limitations of the physical, is welcome.
Users who would benefit from such vocal refinements and a better virtual voice include, of course, the transgender community (transitioning in either direction), those who cannot speak their desired language, and those whose speech is impaired or absent altogether.
We are still a long way from a perfected virtual voice, but refinements such as this can only help the cause.