The application ViChRR (Virtual Choir Rehearsal Room) provides
- low-latency lossless audio calls between tens of users,
- a surround effect that makes individual voices easier to distinguish, each coming from a different direction,
- a built-in metronome synchronized according to the current delays of individual participants,
- the possibility of changing mutual delays for better synchronization when singing with a leader,
- recording and other features,
- everything for free and open-source.
On the other hand, do not expect
- cancellation of echo and long tones,
- volume normalization canceling song dynamics,
- sound compression lowering its quality,
- encryption (so far): sing as if somebody were listening outside the window.
The application requires
- a computer with Windows, Linux, or macOS,
- a headset (using loudspeakers is not possible),
- a stable internet connection with a minimum bandwidth of 1.7 Mbps download and 1 Mbps upload,
- a server with Linux as a central point for sound mixing.
This website is primarily a guide to installing the application, connecting (including possible issues and their solutions), and controlling it.
More technical details and the source code are available on my GitHub, along with downloadable executables in the releases section.
Teaser
The audio track of the following clip was recorded directly in the application, without a leading track or any postprocessing, by singers in different places in Czechia. The video tracks were recorded separately and synchronized with the audio later.
What delay to expect?
Latency may have different meanings. Here we consider the total time it takes sound to travel
- from microphone to the application,
- from the application to server,
- from server back to the application, and
- from the application to headphones.
This is the so-called round-trip delay. We further distinguish the latency of the computer's sound system (the time from the microphone to the application and back to the headphones) from the remaining network latency (the time from the application to the server and back). You will see these two values separately, in this order, e.g. 30+69 ms.
The former value, i.e. the sound system delay, is usually 20–30 ms. It is measured before connecting by playing beeps and recording them back. To lower it, see Choosing audio device.
The latter value consists of the network transmission delay (so-called ping) and the delayed playback of received sound, which compensates for variations in the transmission delay (so-called jitter). With a cable connection and a server near the backbone network, it is possible to achieve around 50 ms over distances of hundreds of km. Using WiFi may have a significant negative impact on this delay; see WiFi connection optimization if you cannot avoid it.
The length of the buffer (holding received but not yet played sound) is kept very low, which causes occasional glitches when adjustments are needed. With a stable connection, the buffer size may be as low as 5 ms on each side, while in other applications (without glitches) it may reach hundreds of ms.
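The trade-off between buffer length and glitches can be illustrated with a deliberately simplified model. This is not how ViChRR manages its buffer; the function name and the sample transit times are hypothetical, and the only assumption is that a packet arriving later than the fastest observed transit time plus the buffer length cannot be played in time.

```python
def late_packets(delays_ms, buffer_ms):
    """Count packets that arrive too late to be played (simplified model)."""
    baseline = min(delays_ms)        # fastest observed transit time
    deadline = baseline + buffer_ms  # latest arrival that can still be played
    return sum(1 for d in delays_ms if d > deadline)


# Hypothetical one-way transit times in ms, with two jitter spikes.
delays = [20, 21, 22, 20, 31, 20, 23, 48, 21, 20]

for buffer_ms in (5, 15, 40):
    print(f"{buffer_ms:>2} ms buffer: {late_packets(delays, buffer_ms)} late packet(s)")
```

In this toy model a short buffer adds very little delay but lets every spike through as a glitch, while a long buffer hides the spikes at the cost of a permanently higher delay.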
The time it takes sound to travel from the microphone on one side to the headphones on the other is then approximately the average of the total latencies of both participants: the path from one participant to the server and from the server to the other. Under the aforementioned conditions it is about 70–80 ms.
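The arithmetic behind these numbers can be written down as a short sketch. The class and function names are hypothetical, and the two example participants are made-up values chosen to match the typical figures quoted in this section (20–30 ms sound system delay, around 50 ms network delay).

```python
from dataclasses import dataclass


@dataclass
class Latency:
    """Round-trip delay of one participant, split as shown in the application."""
    sound_system_ms: float  # microphone -> application -> headphones
    network_ms: float       # application -> server -> application

    @property
    def total_ms(self) -> float:
        return self.sound_system_ms + self.network_ms

    def __str__(self) -> str:
        return f"{self.sound_system_ms:g}+{self.network_ms:g} ms"


def one_way_delay_ms(sender: Latency, receiver: Latency) -> float:
    """Approximate mouth-to-ear delay: the average of both round-trip totals."""
    return (sender.total_ms + receiver.total_ms) / 2


alice = Latency(sound_system_ms=30, network_ms=50)
bob = Latency(sound_system_ms=25, network_ms=45)
print(f"{alice} and {bob} -> about {one_way_delay_ms(alice, bob):g} ms between them")
```

Each participant contributes roughly half of their own round trip, which is why the result is an average rather than a sum.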
Surround effect
Every participant sends mono sound to the server, where it is mixed with the other voices separately for the left and right ear, with a small phase shift and volume difference. This creates the illusion that each voice comes from a different direction.
In particular, all the voices are arranged in a semi-circle around the listener at a distance of 2 m, ordered as shown in the application. This order is the same for everybody (they all listen from the same place); only the listener's own voice is missing, so that place in the semi-circle is left unoccupied.
The order of voices may be changed, but their volume is not affected by your distance from them in the list.
Using a phase shift between the left and right channels allows better directional resolution than using different volumes alone; the effect is, however, audible only with headphones.
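The idea can be sketched as follows. This is only an illustration under stated assumptions, not the mixing code used by the server: the semi-circle is assumed to span from hard left to hard right in front of the listener, the ears are assumed to be 18 cm apart, the speed of sound is taken as 343 m/s, and all names are hypothetical. Only the 2 m radius and the empty place for the listener's own voice come from the description above.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s (assumed)
EAR_OFFSET = 0.09       # m, half the distance between the ears (assumed)
RADIUS = 2.0            # m, radius of the semi-circle (from the text)


def voice_angles(n):
    """Spread n places evenly from -90 degrees (left) to +90 degrees (right)."""
    if n == 1:
        return [0.0]
    return [math.radians(-90 + 180 * i / (n - 1)) for i in range(n)]


def ear_params(angle):
    """Return ((delay_s, gain) for the left ear, (delay_s, gain) for the right ear)."""
    x = RADIUS * math.sin(angle)  # positive = to the right of the listener
    y = RADIUS * math.cos(angle)  # positive = in front of the listener
    params = []
    for ear_x in (-EAR_OFFSET, EAR_OFFSET):
        dist = math.hypot(x - ear_x, y)
        params.append((dist / SPEED_OF_SOUND, RADIUS / dist))
    return tuple(params)


def mix_for_listener(voices, listener, sample_rate=48000):
    """Mix equal-length mono sample lists into (left, right) for one listener."""
    n = len(voices[0])
    left, right = [0.0] * n, [0.0] * n
    for idx, (angle, samples) in enumerate(zip(voice_angles(len(voices)), voices)):
        if idx == listener:
            continue  # the listener's own place in the semi-circle stays empty
        (dl, gl), (dr, gr) = ear_params(angle)
        base = min(dl, dr)  # keep only the difference between the two ears
        for out, delay, gain in ((left, dl, gl), (right, dr, gr)):
            shift = round((delay - base) * sample_rate)
            for t in range(shift, n):
                out[t] += gain * samples[t - shift]
    return left, right


# Three places; only the leftmost voice produces a single click.
click = [1.0] + [0.0] * 99
silence = [0.0] * 100
left, right = mix_for_listener([click, silence, silence], listener=1)
first_right = next(t for t, s in enumerate(right) if s != 0.0)
print(f"click reaches the right ear {first_right} samples later, "
      f"gains L={left[0]:.3f} R={right[first_right]:.3f}")
```

With these assumed dimensions, the largest difference between the ears is about half a millisecond, which is far too small to be perceived as an echo but enough for the brain to locate the voice.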
Conditions of use
The application is free software distributed under the GNU GPLv3 license. Briefly, this means:
- you can use it freely,
- you have access to its source code,
- you can modify it,
- you can share both original and modified versions as long as you keep the original license.
The limitation is that if you base your application on this one and want to sell it, then the first buyer must also receive the source code and the right to freely redistribute the application without paying.