In contemporary music production, producers and arrangers make extensive use of samples from sound libraries. This is particularly true of drum tracks.
Sample libraries can be huge. Because they can contain thousands of samples, browsing through them takes a lot of time. As a result, producers typically end up using the same five kick drum samples, which they know and enjoy. Everybody has heard the same Roland TR samples in trap music!
Some audio companies address this exploration problem with tools that help you navigate sound libraries. By analyzing relevant acoustic features with music information retrieval (MIR) techniques, these tools organize your sample library. The Atlas plug-in, from Algonaut, is one such tool. Other examples include Sononym and Samplism. But the enormous sound library is still there.
What if an A.I. could learn what a drum sample is, and let producers generate and manipulate samples without using audio libraries at all?
This is where the A.I. model family called Generative Adversarial Networks (GANs) comes in.
GANs employ two competing models to learn the statistics of a dataset. They perform very well on real-world, high-dimensional content. In particular, they have produced astonishing results in image synthesis, such as the generation of fake celebrity faces. GANs can learn what a celebrity face looks like, and therefore generate new faces from scratch.
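To make the "two competing models" idea concrete, here is a minimal sketch of the losses the two networks optimize in a textbook GAN. It assumes the standard binary cross-entropy formulation with a non-saturating generator loss; the function names are hypothetical, and nothing here is claimed to be the loss actually used in Impact Drums.

```python
import math


def sigmoid(x):
    """Map a raw discriminator score (logit) to a probability of 'real'."""
    return 1.0 / (1.0 + math.exp(-x))


def discriminator_loss(real_logits, fake_logits):
    """Cross-entropy: real samples should score 1, generated samples 0."""
    real_term = sum(-math.log(sigmoid(s)) for s in real_logits) / len(real_logits)
    fake_term = sum(-math.log(1.0 - sigmoid(s)) for s in fake_logits) / len(fake_logits)
    return real_term + fake_term


def generator_loss(fake_logits):
    """Non-saturating loss: the generator wants its samples scored as real."""
    return sum(-math.log(sigmoid(s)) for s in fake_logits) / len(fake_logits)


# Hypothetical scores: the discriminator is confident and correct here,
# so its own loss is low while the generator's loss is high.
d_loss = discriminator_loss(real_logits=[5.0], fake_logits=[-5.0])
g_loss = generator_loss(fake_logits=[-5.0])
```

Training alternates between lowering the first loss (a better critic) and lowering the second (more convincing samples), which is what pushes the generator toward the statistics of the real data.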
Sony’s Impact Drums plug-in, developed at CSL Paris by Stéphane Rivaud along with Matthias Demoucron and Cyran Aouameur, uses a GAN-based model to generate and manipulate kick drum samples from scratch.
A notable difference from similar approaches is that the samples are generated at a sampling rate of 44.1 kHz, leading to better sound quality. We also incorporate a priori hypotheses on human audio perception in the formulation of the model. This allows for further optimization of the perceptual quality of the generated samples. In particular, we can eliminate the notorious checkerboard artifacts while still using transposed convolutions in the generator, which results in much lighter models than other approaches.
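For readers unfamiliar with checkerboard artifacts, the sketch below illustrates one commonly cited cause: when the kernel size of a transposed convolution is not a multiple of its stride, neighboring output samples receive a different number of contributions. The toy one-dimensional function and the kernel sizes are assumptions chosen for illustration; this shows where the artifact comes from, not how Impact Drums removes it.

```python
def transposed_conv1d(signal, kernel, stride):
    """Naive 1D transposed convolution.

    Each input value stamps a scaled copy of the kernel into the output,
    with consecutive stamps placed `stride` samples apart.
    """
    out = [0.0] * ((len(signal) - 1) * stride + len(kernel))
    for i, x in enumerate(signal):
        for k, w in enumerate(kernel):
            out[i * stride + k] += x * w
    return out


flat_input = [1.0] * 6  # a perfectly flat input signal

# Kernel size 3, stride 2: overlaps are uneven, so the interior of the
# output alternates between two levels even though the input is flat.
uneven = transposed_conv1d(flat_input, [1.0, 1.0, 1.0], stride=2)

# Kernel size 4, stride 2: every interior sample gets the same number of
# contributions, so the interior of the output stays flat.
even = transposed_conv1d(flat_input, [1.0, 1.0, 1.0, 1.0], stride=2)

print(uneven[2:11])
print(even[2:12])
```

In audio, this periodic pattern is the one-dimensional counterpart of the checkerboard seen in generated images.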
Details will appear in Rivaud’s forthcoming Ph.D. thesis.
The trained model reflects non-trivial expert knowledge of music production. In particular, it lets you modify the perceptual loudness of a sample by various means, such as making it longer or more compact, or adding distortion. This shows huge potential for creative content generation applications.
The video above demonstrates how Impact Drums can generate realistic kick drum samples. Note that all audio in the video is synthesized on the fly by the plug-in, in real time and without the use of sound libraries. Impact Drums is able to deliver countless kick drum samples using A.I. on a normal Mac computer. Because it is a VST, you can plug it into Ableton Live, Logic, Reaper, Pro Tools…
With Impact Drums and similar CSL technologies, producers will have quicker access to a wider variety of samples, resulting in more varied drum tracks in music. This is an example of how CSL is committed to using A.I. to help musicians make better music.
Join Sony music labels to take advantage of exclusive CSL technology!