Vocaloid (ボーカロイド Bōkaroido) is a singing voice synthesizer. Its signal processing part was developed through a joint research project led by Kenmochi Hideki at the Pompeu Fabra University in Barcelona, Catalonia, Spain, in 2000 (the same team later founded Voctro Labs), and it was not originally intended to be a full commercial project. With the backing of the Yamaha Corporation, the software was developed into the commercial product "Vocaloid".
The software enables users to synthesize singing by typing in lyrics and a melody. It uses synthesizing technology with specially recorded vocals of voice actors or singers. To create a song, the user enters the melody in a piano-roll interface and types the lyrics onto each note. The software can change the stress of the pronunciation, add effects such as vibrato, and change the dynamics and tone of the voice. Each Vocaloid is sold as "a singer in a box" designed to act as a replacement for an actual singer.
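The note-and-lyric input described above can be pictured as a small data model. The following is only an illustrative sketch, not Vocaloid's actual file format or API: the `Note` and `Track` classes, their field names, the use of MIDI note numbers, and the 0 to 1 scales for vibrato and dynamics are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Note:
    """One piano-roll note: a pitch, a time span, and the syllable sung on it."""
    pitch: int                   # MIDI note number (60 = middle C); assumed convention
    start_beats: float           # where the note begins on the piano roll
    length_beats: float          # how long the note is held
    lyric: str                   # syllable entered on this note
    vibrato_depth: float = 0.0   # hypothetical 0..1 scale, 0.0 = no vibrato
    dynamics: float = 0.5        # hypothetical 0..1 loudness scale


@dataclass
class Track:
    """A sequence of notes sung by one voice."""
    singer: str
    notes: list = field(default_factory=list)

    def add(self, note):
        # Keep notes ordered by start time, as they would appear on a piano roll.
        self.notes.append(note)
        self.notes.sort(key=lambda n: n.start_beats)

    def lyrics(self):
        # Join the per-note syllables in playing order.
        return " ".join(n.lyric for n in self.notes)


track = Track(singer="example-voicebank")  # placeholder name, not a real product
track.add(Note(pitch=62, start_beats=1.0, length_beats=1.0, lyric="la"))
track.add(Note(pitch=60, start_beats=0.0, length_beats=1.0, lyric="do"))
track.add(Note(pitch=64, start_beats=2.0, length_beats=2.0, lyric="mi", vibrato_depth=0.3))
print(track.lyrics())
```

Storing the lyric on the note itself mirrors the way lyrics are entered on each note in the interface, so expression settings such as vibrato can be attached to the same unit.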
The software was originally available only in English, starting with the first Vocaloids Leon, Lola and Miriam, and in Japanese, with Meiko and Kaito. Vocaloid 3 added support for Spanish with the Vocaloids Bruno, Clara and Maika; Chinese with Luo Tianyi and Yanhe; and Korean with SeeU.
The software is intended for professional musicians as well as casual computer music users, and has so far been sold on the idea that the only limits are the users' own skills. The Japanese musical groups Livetune of Toy's Factory and Supercell of Sony Music Entertainment Japan have released songs featuring Vocaloid vocals. The Japanese record label Exit Tunes of Quake Inc. has also released compilation albums featuring Vocaloids. Artists such as Mike Oldfield have also used Vocaloids in their work for backup vocals and sound samples.
Vocaloid's singing synthesis technology is generally categorized as concatenative synthesis in the frequency domain, which splices and processes vocal fragments extracted from human singing voices in the form of a time-frequency representation. The Vocaloid system can produce realistic voices by adding vocal expressions such as vibrato to the score information. Vocaloid's synthesis technology was initially called "Frequency-domain Singing Articulation Splicing and Shaping" (周波数ドメイン歌唱アーティキュレーション接続法 Shūhasū-domain Kashō Articulation Setsuzoku-hō) when Vocaloid 1 was released in 2004, although Yamaha has not used this name since the release of Vocaloid 2 in 2007. "Singing Articulation" is explained as the "vocal expressions", such as vibrato, and the vocal fragments necessary for singing. The Vocaloid and Vocaloid 2 synthesis engines are designed for singing, not for reading text aloud, though software such as Vocaloid-flex and Voiceroid has been developed for that purpose. The engines cannot naturally replicate singing expressions such as hoarse voices or shouts.
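As a rough illustration of adding an expression such as vibrato to score information, the sketch below turns a single note into a frame-by-frame pitch contour with a sinusoidal wobble. This is a generic textbook model, not Yamaha's algorithm: the function names, the 5.5 Hz rate, the 50-cent depth, and the 100 frames-per-second resolution are assumptions chosen for the example.

```python
import math


def midi_to_hz(pitch):
    """Convert a MIDI note number to frequency (A4 = note 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((pitch - 69) / 12.0)


def vibrato_contour(pitch, duration_s, rate_hz=5.5, depth_cents=50.0, frame_rate=100):
    """Return one frequency value per frame for a note sung with vibrato.

    The pitch deviates sinusoidally around the note's base frequency by at
    most `depth_cents` (hundredths of a semitone), `rate_hz` times per second.
    """
    base = midi_to_hz(pitch)
    n_frames = int(round(duration_s * frame_rate))
    contour = []
    for i in range(n_frames):
        t = i / frame_rate
        cents = depth_cents * math.sin(2.0 * math.pi * rate_hz * t)
        contour.append(base * 2.0 ** (cents / 1200.0))
    return contour


# One second of A4 with the assumed default vibrato settings.
contour = vibrato_contour(pitch=69, duration_s=1.0)
print(len(contour), round(min(contour), 2), round(max(contour), 2))
```

A real synthesizer would apply a contour like this while splicing recorded vocal fragments; the sketch covers only the pitch curve derived from the score, not the frequency-domain splicing itself.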