

Jay Long
Software Engineer & Founder
Published March 6, 2026
Right now, rZNjN is just a word. I use it to invoke an idea that will become the foundational technology of a system, a set of products, under the banner of CyberWorld Media Studios.
At its core, rZNjN is about collaboration. It's about fostering a hybrid intelligence that emerges from collaboration between human musical performances and generative AI. The goal is to form a deep, meaningful partnership with AI when it comes to creating music.
The state of technology has factored in largely to rZNjN actually becoming a thing. Now that I've coined a term for it, it's transformed my thought process, my inspiration. It's changed everything, and it's been really powerful.
It all traces back to the N96 project. N96 is a personal obsession of mine. Back in 1996 through 1999, Nine Inch Nails was supposed to be working on a follow-up album to The Downward Spiral. Instead, Trent Reznor worked on other projects: soundtracks, scoring, the Quake soundtrack, sound effects. Marilyn Manson's Antichrist Superstar record was effectively another Nine Inch Nails album, but not really. We were all anticipating that follow-up album that never came. All these different projects Reznor was working on at the time give you an idea of what that album would have been like in 1996. I've been obsessed with that ever since. I was about 11, 12, 13 years old at the time.
You could think of N96 as an attempt to generate what-if scenarios. And in order to bring N96 into existence, there are certain technological capabilities I needed. That need is what spawned rZNjN. N96 will still be a source of inspiration to test functionality along the way, but the technology that has emerged out of it goes way beyond one project.
DrumNote is the capability to generate standard musical notation for drum tracks. Take a music track, a song or a sample of a song, separate it into stems so we can isolate the drum track, then analyze just the drums and find the kick, snare, and cymbals. Ideally we want it all, toms, whatever, but different genres and different producers use different kinds of percussion, so let's keep it simple. Start with what's viable, meaningful, and useful. Get the core rhythm of the beat. Kick drum, snare, cymbals. Those three things. That's a huge win and a viable solution right there.
We can iterate as the code matures, as we gather more data, and as the foundational AI models advance. But the starting point is those three elements.
At its core, though, DrumNote is really just CRUD functionality. Create, read, update, and delete notation objects. That's the foundation. The ability to generate notation from an audio file is more like an import feature. There'd probably be part of the interface that can access your microphone and sample on the spot, but the bedrock is CRUD on notation objects.
NotePulse is the reverse of DrumNote. It takes drum notation and generates a beat. So where DrumNote can export to NotePulse, NotePulse has its own CRUD functionality around a different data object: an actual audio file, a rhythm track that you can play and listen to. I don't have a clean word for that data object yet, but that's what it is.
This is the bigger idea. We want to take classical standard music notation as a bedrock and expand it into a rich, fine-grained data object that explains everything about the music. Take notation and blow it up into something deeper.
It feels like this breaks into two main categories: analyzing music from a technical perspective, and analyzing music from a more human, emotional, expressive storytelling perspective. The latter becomes more relevant as we move beyond drums into other parts of the music.
One product that inspires the specification work is SoundCloud's user engagement feature. SoundCloud represents a song in waveform format graphically, and users can click on part of the song to like or leave a comment at a specific timestamp. That same concept could work for creators. The creator should be able to explain what they were trying to communicate at different parts of a song, give backstory, annotate intent. This kind of tool could help inform the Music as Code specification, especially later as we get into sentiment analysis and the more human, emotive class of notation.
Here's a good example of the kind of technical data Music as Code needs to capture. I've seen Breakcore producers demonstrate this. Breakcore is an interesting genre because it's totally computer-created music that's trying to emulate a lot of rock, metal, and punk drumming. A producer will generate a blast beat, play it back, and say, "See, that sounds really rigid and robotic because it's so perfectly timed." Then they'll go in and move the snare hits just a little bit off. And suddenly it sounds more human.
When a metal drummer in a grindcore band, or especially a punk band, performs a blast beat, their internal clock is keeping perfect time. But they're not performing perfectly on time. It might sound on time to the human ear because you can feel their rhythm and their performance, but if you measure it with enough precision, the human performance is a little bit off. A computer-programmed beat is perfectly, precisely on time. You can't really hear the difference so much as you can feel the difference. There's an uncanny valley that doesn't get crossed.
So one piece of technical data in the Music as Code specification would be the level of precision when measuring drum beats. The basic import feature, where we take audio samples and generate notation, is initially going to snap to the nearest perfect beat. But capturing that human imprecision is where the real value lives.
If I'm being honest, it's because drumming is badass. My personal tastes in music tend to lean towards more rhythm-oriented stuff. There's more energy behind my productivity when I'm focusing on the drums.
But there are strategic reasons too. For the N96 project, the ability to analyze and manipulate rhythms is critical. And from a technology standpoint, rhythm is the smartest starting point. The current state of LLMs is that they're most useful when you can very specifically and clearly instruct them. When you get into melodies, lyrics, vocals, the more expressive parts of music, it's really difficult to be specific about what you mean when you say "that doesn't feel right" or "I don't feel anything when I hear that."
Rhythm is a lot more technical in nature. A lot of times drum parts are fairly simple. You can usually communicate clearly what the rhythm is doing without having to explain what the human is feeling. We don't have to do a deep analysis on sentiment and storytelling to get meaningful results. We can start to experiment with that stuff, but it's not a requirement to ship something useful.
In conjunction with DrumNote and NotePulse, there's a third piece that rounds out a minimum viable product: some kind of sampler instrument or drum machine. Now that we've got the ability to turn audio files into notation and generate audio files from notation, we need a way to directly interact from a performance standpoint.
This is another reason drums are a strategically advantageous place to start. You can build this feature on a phone. You can render drum pads from a drum machine onto the face of a phone and just tap with your fingers. That's a pretty effective way of playing an instrument on a mobile device. No extra hardware needed, no physical instrument needed. It's an instrument built entirely in software.
For reference, BandLab has a good sampler. There are probably other solutions we can emulate, but looking at BandLab's sampler and drum machine, that's basically all we need to reverse engineer for this third piece.
Compare that to vocals. Yeah, you can sing into a phone, all you need is a microphone. But everything voice-related, singing melodies, lyrics, is going to be extremely challenging to analyze and generate. Drumming has the right balance: we get meaningful value out of purely technical information as a prompt, and it's reasonable to perform on a mobile device.
There are other projects beyond N96 that have to do with scoring, sampling, generating music and sound from movies. One specific use case: when I'm working, I like to listen to ambience tracks. A lot of creative people have built one-to-two-hour-long YouTube videos that are just, like, Terminator ambience, or Alien ambience. You can listen to the music, the scoring, and effects from the Nostromo for two hours in the background. It makes you feel like you're there. I also like to have movies playing in the background while I work, but sometimes the way the movie flows is distracting. It pulls you in.
I don't see anyone with a good product for separating effects from music and voice in movies. The stem separation technology for music, separating drums, bass, voice, guitar, piano, that's standard functionality that several companies do really well. You can take that and do a pretty good job of separating voice out of a movie soundtrack. But separating music from effects? That's a different problem. If there's a scene with a lot of action, people talking, and the score playing, you've got this mixture of effects that no one has separated into stems well. And if the soundtrack isn't orchestral, if it's a rock song or a hip hop track with someone singing or rapping, that gets confused with actors delivering vocal performances. That may be a really hard problem to solve. But it's another capability that would be very useful.
All of these data objects are going to go under the same schema. I'll probably call it CyberWorld Media Studios. It'll use CyberWorld login for identities, have its own profile. Each one of these products could be standalone, but I want to first develop them as scripts that I can run and define a manual workflow before I start building any software at all.
I'll probably build out landing pages for each one: DrumNote, NotePulse, whatever I end up calling the instrument. The landing pages give me a way to envision the future UI and communicate the user experience. But what I'll really be building behind the scenes is workflows. What can I already do with existing products? What's the closest workflow using different solutions, whether it's capabilities of my computer, my phone, existing products like BandLab?
We can iterate on all of this as the foundational models improve and as we gather more data. The point is to start with something viable, start with drums, and build from there.