Nicolas Jonason, Bob L.T. Sturm
February 2022
https://github.com/erl-j/neural-instrument-cloning
We have combined techniques from neural voice cloning with musical instrument synthesis. This makes it possible to produce neural instrument synthesisers from just seconds of target instrument audio.
If you want to try it, we have released a Colab notebook with pretrained models.
There will be a paper out soon explaining more details.
Note: All audio examples are rendered at 48 kHz.
Here is a real 16-second saxophone recording:
recording nr: 3 training data.wav
We then train a neural instrument clone on this passage.
Here is the resulting clone synthesising a musical passage it has not seen before. The inputs to the synthesiser are pitch, loudness and pitch confidence contours.
recording nr: 3 unseen estimate.wav
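To make the idea of a control contour concrete, here is a minimal sketch of one of the three inputs, a frame-wise loudness contour in dB. It is an illustration only: the function name `rms_loudness_db`, the frame and hop sizes, and the use of plain RMS level are all assumptions, not the feature extraction used for the clones above. Pitch and pitch confidence would come from a pitch tracker, which is not shown.

```python
import numpy as np


def rms_loudness_db(audio, frame_size=1024, hop_size=256, eps=1e-7):
    """Frame-wise RMS level in dB (hypothetical stand-in for a loudness contour)."""
    # Number of full frames that fit in the signal.
    n_frames = 1 + max(0, len(audio) - frame_size) // hop_size
    loudness = np.empty(n_frames)
    for i in range(n_frames):
        frame = audio[i * hop_size : i * hop_size + frame_size]
        rms = np.sqrt(np.mean(frame ** 2))
        # eps avoids log(0) on silent frames.
        loudness[i] = 20.0 * np.log10(max(rms, eps))
    return loudness


# One second of a 440 Hz sine at 48 kHz with peak amplitude 0.5.
sample_rate = 48_000
t = np.arange(sample_rate) / sample_rate
tone = 0.5 * np.sin(2 * np.pi * 440.0 * t)

loudness_db = rms_loudness_db(tone)  # one value per frame, roughly -9 dB
```

Each contour has one value per analysis frame, so a synthesiser driven this way receives three time-aligned sequences rather than raw audio.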
This clone can also be used to modify an excerpt. Consider the following excerpt from the training data:
recording nr: 3 training target.wav
Let’s use the clone to synthesise a quieter version.
recording nr: 3 loudness -12 db.wav
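One way to picture this edit, assuming the loudness contour is expressed in dB, is as a constant offset added to that contour before synthesis, with the pitch and pitch confidence contours left untouched. The sketch below uses hypothetical names (`shift_loudness`, `db_to_amplitude_ratio`) and made-up contour values; it is not taken from the released code.

```python
import numpy as np


def shift_loudness(loudness_db, offset_db):
    """Return a copy of a dB loudness contour shifted by offset_db."""
    return np.asarray(loudness_db, dtype=float) + offset_db


def db_to_amplitude_ratio(offset_db):
    """Linear amplitude ratio that corresponds to a level change in dB."""
    return 10.0 ** (offset_db / 20.0)


# Made-up loudness contour in dB, one value per frame.
contour = np.array([-30.0, -24.0, -18.0, -24.0])

quieter = shift_loudness(contour, -12.0)  # negative offset: quieter
louder = shift_loudness(contour, 12.0)    # positive offset: louder
```

A -12 dB offset corresponds to an amplitude ratio of about 0.25. Because the shifted contour is fed to the synthesiser rather than applied as a gain on the audio, the result can differ in timbre as well as in level.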
Or make it louder!