Philippe Esling

Ircam laboratory and Sorbonne Université

Philippe Esling received a B.Sc in mathematics and computer science in 2007, a M.Sc in acoustics and signal processing in 2009 and a PhD on data mining and machine learning in 2012. He was a post-doctoral fellow in the department of Genetics and Evolution at the University of Geneva in 2012. He is now an associate professor with tenure at Ircam laboratory and Sorbonne Université since 2013. In this short time span, he authored and co-authored over 20 peer-reviewed journal papers in prestigious journals. He received a young researcher award for his work in audio querying in 2011, a PhD award for his work in multiobjective time series data mining in 2013 and several best paper awards since 2014. In applied research, he developed and released the first computer-aided orchestration software called Orchids, commercialized in fall 2014, which already has a worldwide community of thousands users and led to musical pieces from renowned composers played at international venues. He is the lead investigator of machine learning applied to music generation and orchestration, and directs the recently created Artificial Creative Intelligence and Data Science (ACIDS) group at IRCAM

AI in 64Kb: can we do more with less?

The research project led by the ACIDS group at IRCAM aims to model musical creativity by extending probabilistic learning approaches to the use of multivariate and multimodal time series. Our main object of study lies in the properties and perception of musical synthesis and artificial creativity. In this context, we experiment with deep AI models applied to creative materials, aiming to develop artificial creative intelligence. Over the past years, we developed several objects aiming to embed these researches directly as real-time object usable in MaxMSP. Our team has produced many prototypes of innovative instruments and musical pieces in collaborations with renowned composers. However, The often overlooked downside of deep models is their massive complexity and tremendous computation cost. This aspect is especially critical in audio applications, which heavily relies on specialized embedded hardware with real-time constraints. Hence, the lack of work on efficient lightweight deep models is a significant limitation for the real-life use of deep models on resource-constrained hardware. We show how we can attain these objectives through different recent theories (the lottery ticket hypothesis (Frankle and Carbin, 2018), mode connectivity (Garipov et al. 2018) and information bottleneck theory) and demonstrate how our research led to lightweight and embedded deep audio models, namely 1/ Neurorack // the first deep AI-based eurorack synthesizer 2/ FlowSynth // a learning-based device that allows to travel auditory spaces of synthesizers, simply by moving your hand 3/ RAVE in Raspberry Pi // 48kHz real-time embedded deep synthesis