Final step in producing music scores
As mentioned in the previous post, at Soundnotation the output of the automatic transcription process is checked and reviewed by humans. What actually happens in this last step of creating a publication-ready product?
Comparison with the audio file
Firstly, the transcription is compared with the audio. Audio recognition is a highly complex task: the relevant pitches have to be separated from frequencies that are irrelevant to the transcription, such as overtones and noise (e.g. percussion or the consonants of the lyrics). All of these are merged into a single, complex waveform that has to be decoded. If wrong notes or chords appear in the transcription, they are corrected. These corrections in turn help the audio recognition technology to improve.
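To give a feeling for why overtones get in the way, here is a deliberately naive Python sketch. It is not Soundnotation's method. It assumes the spectral peaks have already been found, and it treats any peak lying close to an integer multiple of a lower peak as an overtone. The function name `drop_overtones` and the `tolerance` parameter are made up for this illustration.

```python
def drop_overtones(peaks, tolerance=0.03):
    """Return the peak frequencies (Hz) that are not near an integer
    multiple of a lower peak that was already kept.

    Hypothetical heuristic for illustration only: it would also discard
    a genuinely played note that sits an octave above another note.
    """
    fundamentals = []
    for freq in sorted(peaks):
        is_overtone = False
        for f0 in fundamentals:
            ratio = freq / f0
            nearest = round(ratio)
            # Overtones lie at roughly 2x, 3x, 4x ... the fundamental.
            if nearest >= 2 and abs(ratio - nearest) <= tolerance * nearest:
                is_overtone = True
                break
        if not is_overtone:
            fundamentals.append(freq)
    return fundamentals


# Two sounding notes (220 Hz and 277.2 Hz) plus some of their overtones.
print(drop_overtones([220.0, 440.0, 660.0, 277.2, 554.4]))  # [220.0, 277.2]
```

The weakness noted in the docstring is exactly the kind of case where a human check against the audio is needed.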
Adaptation of the notation
Secondly, the score sometimes has to be adjusted to the conventions of notation. Examples include writing rhythms so that they reveal the beats of the basic metre (which mostly requires dividing a note into two tied notes), notating swing with even note values and the corresponding interpretation mark rather than with triplets, and respelling accidentals enharmonically to fit the tonality of the song.
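The first of these conventions, dividing a note into two tied notes so that the metre stays visible, can be sketched in a few lines of Python. This is only an illustration under assumptions: positions and durations are measured in quarter notes from the start of the bar, and `grid` (a name chosen here) is the spacing of the metrical boundaries that should stay visible, for example 2 for the middle of a 4/4 bar.

```python
from fractions import Fraction


def split_at_boundaries(onset, duration, grid):
    """Split a note into segments that do not cross a metrical boundary.

    Returns a list of (onset, duration) pairs; consecutive segments
    are meant to be joined by ties in the score.
    """
    onset, duration, grid = Fraction(onset), Fraction(duration), Fraction(grid)
    end = onset + duration
    segments = []
    start = onset
    while start < end:
        # First boundary strictly after the current start position.
        next_boundary = (start // grid + 1) * grid
        stop = min(end, next_boundary)
        segments.append((start, stop - start))
        start = stop
    return segments


# A quarter note starting on the "and" of beat 2 in 4/4 crosses the
# middle of the bar, so it becomes two tied eighth notes.
print(split_at_boundaries(Fraction(3, 2), 1, 2))
```

Using `Fraction` instead of floating-point numbers keeps note values such as triplets exact. Which boundaries have to be shown depends on the metre and the style, so in practice `grid` is a judgment call.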
Playability adjustment
Layout customization
Finally, the layout is adjusted. Every note should have enough space to be easily readable, a system should not contain too few or too many bars, and a page should not comprise too few or too many systems. Furthermore, every score should comply with Soundnotation’s score design.
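The rule of thumb about bars per system can be illustrated with a small Python sketch. It assumes that every bar needs about the same width, which is rarely true in real scores, and it is not a description of Soundnotation's layout process. The names `distribute_bars` and `max_per_system` are invented for this example.

```python
import math


def distribute_bars(total_bars, max_per_system):
    """Spread bars over as few systems as possible, as evenly as possible.

    Returns the number of bars in each system, from top to bottom.
    """
    if max_per_system < 1:
        raise ValueError("max_per_system must be at least 1")
    if total_bars <= 0:
        return []
    systems = math.ceil(total_bars / max_per_system)
    base, extra = divmod(total_bars, systems)
    # The first `extra` systems take one additional bar each.
    return [base + 1 if i < extra else base for i in range(systems)]


print(distribute_bars(14, 4))  # [4, 4, 3, 3]
```

Simply filling each system to the maximum would give 4, 4, 4 and 2 bars, leaving a sparse last system. The even split avoids that, and the same idea applies to distributing systems over pages.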
Thus, the process of creating a score from an audio file involves both automatic processes and human expertise. The former handle the otherwise time-consuming transcription work, while the latter ensures a high-quality product.