Dec 1, 2020 · 5 min read

Immersive Music Experience Prototype for the Smart Home

Like many of you, I’ve been going through live music event withdrawals — craving the bright lights, lasers, and stunning visuals. This prompted the questions “how much ‘stuff’ do I really need to simulate the live music experience at home?” and “how expensive is this ‘stuff’ anyway?”

After experimenting with various setups for a few months, I’ve learned that you don’t need a lot of expensive hardware to create a basic experience; going more “pro” mostly gets you to sensory overload faster. The most valuable ingredient for creating the ultimate immersive music experience at home isn’t hardware at all: it’s live event lighting data and music analysis data.

Here is how it could work: combine live-event lighting data with track analysis data, then transform the result into a low-latency data stream that smart lighting devices can consume in sync with the music.

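To make that concrete, here is a minimal sketch of what a single frame of such a cue stream might look like once the lighting and analysis data has been transformed for smart devices. The field names are my own assumptions for illustration, not an existing format.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class LightingCue:
    """One hypothetical frame of a low-latency lighting cue stream.

    The field names are illustrative assumptions, not an existing standard.
    """
    track_id: str        # streaming-service track identifier
    timestamp_ms: int    # position in the track this cue applies to
    section: str         # e.g. "intro", "build", "drop", "breakdown"
    palette: List[str]   # hex colors derived from the track's key and mood
    intensity: float     # 0.0-1.0, drives brightness and strobe rate
    devices: List[str]   # logical targets: "led_strip", "laser", "moving_head"

# A cue emitted slightly ahead of the drop, so devices have time to pre-stage the scene.
cue = LightingCue(
    track_id="placeholder-track-id",
    timestamp_ms=95_000,
    section="drop",
    palette=["#FF2D95", "#00E5FF"],
    intensity=0.9,
    devices=["laser", "moving_head", "led_strip"],
)
```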

I wanted to share my findings and thoughts on how and why event lighting design and music track analysis data should be combined with smart consumer electronics to enhance the at-home listening experience. Here is a video of the prototype.

 

Live Event Lighting Design & Motion Data

I took a plunge into the world of lighting design for live events to get a baseline understanding of the underlying technologies and how it all works. The short version is this:

  • The performing artists, lighting designers, stage craftsmen, and promoters collaborate for months to establish the desired moods and feelings by mixing LED lights, lasers, fog machines, pyrotechnics, and video screens.
  • The key moments and sequences, the blending of colors, strobe patterns, lasers, and spotlights, are saved as “scenes” in a central system for easy recall during the live event.
  • The technology used to control devices such as lights, lasers, and fog machines is the DMX (Digital Multiplex) protocol. The signal is unidirectional: it travels in one direction only, from the controller (or first light) down the chain to the last fixture. In its most basic form, DMX is just a control protocol for lights, much like MIDI is for keyboards and DAW controllers (see the sketch after this list).
  • During the show itself, a human operator improvises in real time, triggering and blending those saved scenes to match the performance.
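To make the protocol part concrete: DMX itself runs over XLR cabling, but the same 512-channel frame can be carried over the network using Art-Net, which is the easiest way to poke at it from a laptop. Below is a minimal Python sketch that sends one DMX frame to an Art-Net node; the node address and the channel-to-fixture mapping are assumptions for illustration, not part of any setup described in this post.

```python
import socket

def send_artnet_dmx(channels: bytes, node_ip: str, universe: int = 0) -> None:
    """Send one DMX frame (up to 512 channel levels) to an Art-Net node over UDP."""
    data = channels.ljust(512, b"\x00")[:512]           # pad to a full 512-channel frame
    packet = (
        b"Art-Net\x00"                                  # Art-Net packet ID
        + (0x5000).to_bytes(2, "little")                # OpCode: ArtDmx
        + (14).to_bytes(2, "big")                       # protocol version
        + bytes([0, 0])                                 # sequence, physical (unused here)
        + universe.to_bytes(2, "little")                # target universe
        + len(data).to_bytes(2, "big")                  # channel count
        + data                                          # one byte (0-255) per channel
    )
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(packet, (node_ip, 6454))            # 6454 is the standard Art-Net port

# Hypothetical patch: channels 1-3 are the laser's RGB levels, channel 4 is the
# moving head's strobe rate. 192.168.1.50 is a placeholder node address.
send_artnet_dmx(bytes([255, 0, 128, 40]), node_ip="192.168.1.50")
```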

Without getting into the legal issues, let’s assume for a moment that live music event lighting design, motion, and sequence data exists on laptops around the world. Let’s also assume that the incentives are there to monetize the data by either feeding it to ML algorithms to train visualization AIs, or transforming it for use by smart home music visualization devices.

Consumer Grade Software and Hardware

My wife wasn’t going to allow a big DMX control board in our living room, so I settled for a software-only solution when prototyping in my home office. I was happy to discover that my Pioneer Rekordbox license already included a DMX-compatible lighting design module.


Exploring the software nudged me to look into how much a DMX-compatible laser and a moving-head spotlight would cost. I was able to find a unit with three RGB lasers and strobe lights for $100, and a moving-head spotlight with 12 unique stencils (gobos) for $140. Both were very bright but not overpowering, though they get hot quickly and their fans are rather loud.


While testing, I found myself wishing I could upload my own stencils to the spotlight and laser, such as the artist’s name or a visualization of the lyrics.


I also wanted to test non-DMX solutions, such as mic- and rhythm-based lighting. I used 6 Hue lights, 1 Hue LED strip, 1 Hue Bloom, Nanoleaf Rhythm Edition panels, a mini disco ball lightbulb, and a few WeMo smart plugs to switch the spotlight and lasers on and off.


I connected all the devices through Apple HomeKit, which gave me the most granular control over the Nanoleaf panels and a central control surface for prototyping the experience.
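For the Hue side of that setup, the bridge exposes a simple local REST API, which is enough to script beat-triggered color changes by hand. Here is a rough sketch; the bridge IP, API username, and light ID are placeholders, and the beat detection itself is assumed to happen elsewhere.

```python
import requests

BRIDGE_IP = "192.168.1.20"          # placeholder: your Hue bridge address
API_USERNAME = "your-api-username"  # created by pressing the bridge's link button
LIGHT_ID = 3                        # placeholder light id

def flash_on_beat(hue: int, brightness: int) -> None:
    """Push a quick color/brightness change to a single Hue light."""
    url = f"http://{BRIDGE_IP}/api/{API_USERNAME}/lights/{LIGHT_ID}/state"
    state = {
        "on": True,
        "hue": hue,            # 0-65535 position around the color wheel
        "sat": 254,            # full saturation
        "bri": brightness,     # 1-254
        "transitiontime": 1,   # tenths of a second; keep transitions snappy
    }
    requests.put(url, json=state, timeout=1)

# Example: a magenta hit when a kick drum is detected upstream.
flash_on_beat(hue=56100, brightness=254)
```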

Learnings from the Piecemeal Approach

The first learning was that rhythm detection alone is not enough; without a human touch, or data describing the music’s structure, key, and phrase progressions, the magic is missing. I was able to create a decent atmosphere by controlling the lighting modules manually, but it disappeared as soon as I put the iPad down. Using a mic as an input device was responsive enough to trigger events across the Nanoleaf panels, spotlight, and lasers, but the devices had no understanding of song structure, key, or genre, which limited their ability to “choose” the most relevant colors and patterns for the mood of the music. The listener quickly notices the repetitive visuals, sequences, and patterns, revealing that the visuals and the music were not thoughtfully woven together the way they would be at a live event.

The second learning was that piecing together a solution from multiple brands and technologies is not as straightforward as it sounds. It required linking multiple control apps, creating home automation scenes, and understanding how DMX works, and consumers are not interested in having ugly XLR cables sprawled across their floors. The ideal solution would be a suite of well-designed, compatible music experience products (LED lights, lasers, moving spotlights) with low-latency wireless connectivity that consumers would be proud to mount on their walls and ceilings.

Filling the Data Gap for Multi-Sensory Music Experiences

The core music analysis technology for extracting key, BPM, genre, and beat grid metadata already exists in most DJ apps, as well as in the Spotify APIs, and it can be used to generate a dynamic visual experience for the listener. To get a feel for the data, check out the Spotify API documentation:

Get Audio Features for a Track | Spotify for Developers

Get Audio Analysis for a Track | Spotify for Developers
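As a quick sketch of what those two endpoints return, here is how you might pull them with Python’s requests library; it assumes you already have a valid OAuth access token and a track ID (both placeholders below).

```python
import requests

ACCESS_TOKEN = "YOUR_OAUTH_TOKEN"   # obtained through Spotify's standard OAuth flow
TRACK_ID = "your-track-id"          # placeholder Spotify track id
HEADERS = {"Authorization": f"Bearer {ACCESS_TOKEN}"}

# Track-level summary: tempo, key, mode, energy, danceability, and so on.
features = requests.get(
    f"https://api.spotify.com/v1/audio-features/{TRACK_ID}",
    headers=HEADERS, timeout=5,
).json()

# Fine-grained structure: bars, beats, and sections with their own tempo,
# key, and loudness -- the pieces a lighting engine would actually cue from.
analysis = requests.get(
    f"https://api.spotify.com/v1/audio-analysis/{TRACK_ID}",
    headers=HEADERS, timeout=5,
).json()

print(features["tempo"], features["key"], features["energy"])
print(len(analysis["sections"]), "sections,", len(analysis["beats"]), "beats")
```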

Pioneer’s Rekordbox is excellent at identifying beat grids and phrases and then suggesting visualizations based on the genre, frequencies, and intensity of a track.


Mixed In Key assigns each track a key on the Camelot wheel and an “energy” level. The software was originally intended to help DJs identify two tracks that will mix in harmony, but the same data and logic can be used to assign color palettes and light motion patterns.
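One illustrative way that mapping could work (this is my own sketch, not anything the software exposes): spread the 12 Camelot wheel positions around the color wheel, use the A/B suffix (minor/major) to adjust saturation, and let the 1–10 energy rating drive brightness.

```python
def camelot_to_light_state(camelot_key: str, energy: int) -> dict:
    """Map a Camelot key (e.g. "8A") and a 1-10 energy rating to a
    Hue-style light state. The mapping itself is an arbitrary sketch."""
    position = int(camelot_key[:-1])               # 1-12 around the Camelot wheel
    is_minor = camelot_key[-1].upper() == "A"      # "A" keys are minor, "B" are major
    return {
        "hue": int((position - 1) / 12 * 65535),   # spread keys evenly around the color wheel
        "sat": 254 if is_minor else 180,           # minor keys get deeper, more saturated colors
        "bri": int(energy / 10 * 254),             # energy rating drives brightness
    }

print(camelot_to_light_state("8A", energy=7))
```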


New technologies, such as Algoriddim’s Neural Mix, can separate the vocal, instrument, and drum channels in real time. This makes it possible to dynamically assign each channel to its own device: for example, the drums to the strobe light and lasers, the vocals to the LED lights, and the instruments to the Nanoleaf panels.
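Once you have per-stem levels, the routing logic itself is simple. The sketch below assumes a hypothetical upstream process (Neural Mix has no public API) that hands you smoothed 0–1 levels per stem many times per second; the routing table mirrors the example above.

```python
from typing import Dict

# Hypothetical routing table: which stem drives which group of devices.
STEM_ROUTING = {
    "drums": ["strobe", "laser"],
    "vocals": ["led_lights"],
    "instruments": ["nanoleaf_panels"],
}

def route_stems(levels: Dict[str, float]) -> Dict[str, float]:
    """Turn per-stem levels (0.0-1.0) into per-device intensity targets."""
    targets: Dict[str, float] = {}
    for stem, devices in STEM_ROUTING.items():
        for device in devices:
            targets[device] = max(targets.get(device, 0.0), levels.get(stem, 0.0))
    return targets

# One frame of hypothetical analysis output during a drop:
print(route_stems({"drums": 0.95, "vocals": 0.2, "instruments": 0.6}))
```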

Magic Wand vs. Voice Commands

There is something very unnatural about shouting commands at Alexa when you’re relaxing and listening to music. When testing the prototype, I found myself wanting to control the intensity and patterns like a conductor, via a Bluetooth wand with a 3-axis gyro, made from the same graphite materials as the Apple Pencil.


With Neural Mix or other stem extraction technologies, the user could use the wand to amplify or mute the vocal, drum, or instrument layers, for example focusing on Adele’s voice for a more intimate artist experience.
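The interaction model is easy to sketch even without the hardware. Everything below is hypothetical (there is no such wand): one gyro axis picks which stem is in focus, another sets how loud that stem is.

```python
def wand_to_stem_gains(roll_deg: float, pitch_deg: float) -> dict:
    """Map a hypothetical wand orientation to per-stem gains.

    Roll (-90..+90 degrees) picks the stem in focus; pitch (-45..+45 degrees,
    pointing down vs. up) sets that stem's gain. The other stems sit at 0.5.
    """
    stems = ["vocals", "drums", "instruments"]
    focus = stems[min(2, int((roll_deg + 90) / 60))]   # split the roll range into 3 zones
    gain = max(0.0, min(1.0, (pitch_deg + 45) / 90))   # clamp pitch into a 0..1 gain
    gains = {stem: 0.5 for stem in stems}
    gains[focus] = gain
    return gains

# Tilt the wand right and point it up: bring the instruments layer forward.
print(wand_to_stem_gains(roll_deg=70, pitch_deg=40))
```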


My team and I developed a similar interaction model for the Adobe MAX conference back in 2007 using hacked Nintendo Wii controllers. The video is below:


Closing Thoughts

To summarize, there are at least 3 ways to fill the visualization data gap:

  • Live event lighting design and motion data (transformed for device use or as AI training data)
  • Track analysis data (how DJ software and Spotify analyze music tracks)
  • Real-time separation of beats, instruments, and vocals (mapping their frequencies to colors and motion patterns on smart devices)

Summary Diagram of the System

 

The major music streaming services (Amazon, Apple, and Spotify) are in the best position to innovate in this space and enable a visualization data layer on top of audio streams. They already have the digital tracks and the infrastructure to reindex their music libraries, and their players already support a similar feature: time-coded lyrics displayed in sync with the music.

I haven’t done the market research to validate some of these hypotheses, but from a bill of materials (BOM) perspective the margins are there. A product like this would be the ultimate toy for audiophiles, further driving HD subscription upgrades. Genre bundles, paired with AR pre-visualization tools, could let customers preview the experience in their own space before buying.

Applying Rogers’ five factors that influence the adoption of an innovation, the concept scores well across all five:

  • Relative advantage — the degree to which a product is better than the product it replaces
  • Compatibility — the degree to which a product is consistent with existing values and experiences
  • Complexity — the degree to which a product is difficult to understand and use
  • Trialability — the degree to which a product may be experimented with on a limited basis
  • Observability — the degree to which product usage and impact are visible to others


If you made it to the end of this post, I’d love to know whether you would consider purchasing such a product (imagine seeing a live demo in Best Buy’s Magnolia hi-res audio section), and if so, what you would expect the price to be.