Using an NPU/TPU to produce new techniques for manipulating vintage audio ICs

Hi all! I'm currently building a unique synthesizer using a bank of antiquated PSGs; as I'm sure it won't be terribly difficult to guess: YM2149s. One of the YM2149s is reserved for routing 9-bit vocals in real time (figuratively speaking, of course; there is a slight delay, but the Cortex-M7's math/DSP capabilities are so extraordinary that I can take audio in, convert it to the appropriate register values, and return it well below Reason's delay compensation threshold), with its channels summed prior to the ADC. The rest of the synth is meant to be used in two different ways:


  • Outputting analog audio with zero post-processing aside from low- and high-pass filtering to remove the DC offset and ugly harmonics at higher frequencies.
  • Returning each individual channel via ADC to Reason so every individual channel can be "wired" to any Reason Rack Extension.

Each chip/group of chips serves a different purpose, for which I'm developing custom Rack Extensions: the aforementioned vocal-routing chip, a drum machine, 2 chips that are being used to produce sounds well beyond the constraints imposed by the Atari STe, and three chips used within those constraints, with one exception: channels A, B, and C will be combined into 3 groups, each group taking one channel from each of the three YM2149s, so every channel in a group can use a different envelope.
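For the vocal-routing chip, the core of the real-time conversion is mapping each incoming PCM sample to the trio of channel volume registers whose summed DAC output lands closest. A minimal sketch in Python, assuming an idealized ~1.5 dB-per-step logarithmic DAC (the real YM2149 curve differs, and the register numbers are illustrative only):

```python
# Sketch: playing samples through the volume DAC by mapping each PCM
# sample to the nearest (vol_A, vol_B, vol_C) triple. Assumes an
# exponential ~1.5 dB-per-step DAC, which is only an approximation of
# the real chip's curve.
import bisect

STEP_DB = 1.5

def level_amp(v):
    """Amplitude of a 4-bit software volume level: 0 is silence."""
    return 0.0 if v == 0 else 10 ** (-(15 - v) * STEP_DB / 20)

# Precompute every volume triple and its summed amplitude, sorted.
triples = sorted(
    (level_amp(a) + level_amp(b) + level_amp(c), (a, b, c))
    for a in range(16) for b in range(16) for c in range(16)
)
amps = [t[0] for t in triples]
max_amp = amps[-1]

def sample_to_registers(sample):
    """Map an unsigned 8-bit PCM sample to the nearest volume triple,
    destined for the three channel amplitude registers (R8..R10)."""
    target = (sample / 255.0) * max_amp
    i = bisect.bisect_left(amps, target)
    # Pick whichever neighbor is closer to the target amplitude.
    if i > 0 and (i == len(amps) or amps[i] - target > target - amps[i - 1]):
        i -= 1
    return triples[i][1]
```

On the Cortex-M7 this lookup table would of course be precomputed and queried in fixed point, but the idea is the same.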

While this isn't what this post is about, and I've no doubt many of you are familiar with the AY-3-8910 and YM2149, to assuage curiosity for those who aren't: the YM2149 produces only square waves, but simulates a few other envelopes by adjusting a 5-bit, logarithmic volume scale during each square-wave cycle. When you select an "envelope", it applies to all channels, though each channel can be set to a different frequency. As you have very precise control over a given envelope's start time and length, you can create custom envelopes for each channel by rapidly switching envelope types before a duty cycle has completed. This is just one of myriad tricks developed by demoscene artists, a huge number of which have been conceived over the last 10 years. If you're interested in the basics, here is a link to the datasheet of this extremely rudimentary PSG:
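To make the envelope-splicing trick concrete: writing the envelope shape register (R13 on the YM2149) restarts the envelope generator, so interleaving shape writes mid-cycle stitches together waveforms none of the built-in shapes provide. A toy model in Python, covering just two simplified shapes, with time measured in envelope-clock steps:

```python
# Sketch: splicing a custom envelope by re-triggering the shape register.
# Only two (simplified) shapes are modeled here; the real chip has more,
# and real timing is driven by the envelope clock, not a Python loop.
def envelope(shape, steps):
    """Yield 5-bit envelope levels for a simplified shape."""
    if shape == "saw_down":          # repeating 31..0 ramp
        for i in range(steps):
            yield 31 - (i % 32)
    elif shape == "triangle":        # repeating 0..31..0 ramp
        for i in range(steps):
            p = i % 64
            yield p if p < 32 else 63 - p

# 16 steps of falling saw, then a rewrite of R13 restarts the generator
# with a triangle attack before the first cycle ever completes.
spliced = list(envelope("saw_down", 16)) + list(envelope("triangle", 16))
```

The resulting 32-step shape (a half saw glued to a rising ramp) is exactly the kind of hybrid the hardware alone never emits.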
http://www.ym2149.com/ym2149.pdf

With the lengthy preamble out of the way: I've already devised several means of doing much more with these chips than what has been accomplished on the ST and STe. I don't consider this a terribly impressive feat given how much more control I have over the chip: not only can I change the registers several orders of magnitude more frequently, I can also dynamically adjust the clock speed and duty cycle. But that's no matter; I'm simply interested in creating a unique synthesizer for an EP I'd nearly completed last November, then realized I wanted to completely rework all of the tracks, as the idea for the synth had been percolating for some time.

My question is thus: I recently purchased a TPU, and it struck me that I can program it to control all of the PSG's registers, then feed it sounds and let it manipulate the chip to approximate them as closely as possible, potentially yielding new techniques never before conceived. I don't have any technical questions; rather, I'm simply curious whether any of you have ever attempted IC hacking and manipulation of vintage chips with a neural network. I've searched extensively to see if others have attempted this, and have found no evidence of anyone trying anything other than breaking cryptographic keys/locking mechanisms of modern chips.

I'm very interested in folks' thoughts on this, whether any of you have attempted similar projects yourselves, and whether anyone here has come across others' work doing the same. I felt (and hope!) it might interest many of you as much as it interests me, and at the very least, it would be great discussion fodder. Any thoughts, ideas, and questions whatsoever are welcome, even if they're not specific to audio ICs!


(Probably worth noting: I have acquired over 100 YM2149s, primarily so I can ensure I can use the synth for decades, and because pretty much every lot comes with a fritzed chip or two—they all have to be tested before use, not just to ensure they're in good condition, but also to ensure they aren't AY-3-8910s that have been re-wrapped to appear as if they're YM2149s. This also means I can seriously stress test some of the chips without concern if I destroy a few while undertaking this endeavor)
 
Hi,
no, I have not tried to use an NPU to drive a synthesizer. I have done some musical sound processing. (One project was about analysing/visualizing the amount and type of harmonic distortion.)
I am a little bit curious. As far as I understand, a neural network can produce similar output for similar combinations of input values; before use it has to be trained to "know" what is meant by "similar". So let me ask: what kind of inputs do you want to use for your network?
I have got a https://www.thomann.de/de/digitech_trio_band_creator_trio+_trio_plu.htm — this unit analyses a sequence of chords and a rhythm and will then add a bass line and drums. I assume that something like this could be done with a neural network. (?) In this case the input would be pre-processed sound.

Edit: Let me add another thought. During the last few weeks I tried to find a way for a computer to "create" graphic patterns which "look nice". (I wanted this to decorate some Easter eggs.) Meanwhile I think that almost any pattern "looks nice" if you first create a sub-pattern, mirror it for symmetry, and then make variations of it. So this would be "variation on a theme", as you can find it in classical music or in blues schemes.
 
Ah, after reading your post again, I see that you are seeking to reproduce sounds, and the AI shall work out the right register settings as output. So you will then feed the network a new sound and record the calculated register values for later use?
 
Ah, after reading your post again, I see that you are seeking to reproduce sounds, and the AI shall work out the right register settings as output. So you will then feed the network a new sound and record the calculated register values for later use?

Precisely. I would feed the IC (or several ICs; despite the short length of the samples, it would be kind of ridiculous not to take advantage of the ability to run the tests in parallel) random register data within all known constraints, as well as bit-bang a couple of pins that haven't been thoroughly tested, compare the recorded audio output to the corresponding samples, then review the register data for the most successful matches, finally using that register data to write new routines for programming the chip. As one has control of the duty cycle within a 20% range, I'm inclined to toy with that as well.
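That loop might be sketched like this, with the hardware capture step stubbed out as a placeholder and the register ranges illustrative rather than the full constraint set:

```python
# Sketch of the proposed search: throw random (constraint-legal) register
# frames at the chip, score the captured audio against a target sample,
# and keep the best-scoring frame. capture_fn stands in for the real
# program-chip-and-record-ADC step.
import random

REGISTER_RANGES = {          # illustrative constraints, not exhaustive
    0: 256, 1: 16,           # channel A tone period (fine / coarse)
    8: 16,                   # channel A volume
    11: 256, 12: 256, 13: 16 # envelope period and shape
}

def random_frame(rng):
    """One random register frame within the declared ranges."""
    return {reg: rng.randrange(span) for reg, span in REGISTER_RANGES.items()}

def score(captured, target):
    """Lower is better: mean squared error between equal-length buffers."""
    return sum((c - t) ** 2 for c, t in zip(captured, target)) / len(target)

def search(capture_fn, target, trials, seed=0):
    """capture_fn(frame) -> audio buffer; returns (best_score, best_frame)."""
    rng = random.Random(seed)
    best = (float("inf"), None)
    for _ in range(trials):
        frame = random_frame(rng)
        s = score(capture_fn(frame), target)
        if s < best[0]:
            best = (s, frame)
    return best
```

A pure random search is only the baseline, of course; the TPU's job would be to learn the register-to-sound mapping so later guesses stop being random.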

To increase efficiency, when programmatically comparing the input samples to the output samples, I would provide rules for discarding output samples that have very obvious flaws, such as clicks/pops, large amplitude spikes, and skips (or perhaps save them in a discard directory in case they're otherwise exceptionally successful, so I can review the register data in those cases to see whether the flaws are avoidable).
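The click/pop rule could be as simple as flagging any large sample-to-sample discontinuity; the threshold below is a guess that would need tuning against real captures:

```python
# Sketch of the "obvious flaw" pre-filter: clicks, pops, and amplitude
# spikes all show up as large adjacent-sample jumps in normalized audio.
def has_click(buf, max_jump=0.5):
    """True if any adjacent-sample jump exceeds max_jump."""
    return any(abs(b - a) > max_jump for a, b in zip(buf, buf[1:]))
```

Skips would need a second rule (e.g. an implausibly long run of identical samples), but the shape of the filter is the same.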

The largest issue is that in so-called "envelope" mode the volume scale is 5-bit, while in software-control mode you only have 4-bit control over volume values, so a custom envelope (though it can be as long as you wish, since you have complete control of volume manipulation over time) offers 16 rather than 32 individual volume levels. As all three channels must use the same envelope in envelope mode, you have very little control over matching values across all 3 tone channels, and thus cannot easily exploit the logarithmic volume scale in sync to create higher-resolution envelopes by mixing channels; this obviously leaves orders of magnitude fewer possible volume combinations for creating multi-channel instruments. With 32 values available, as the output of each channel is simply summed, you can achieve roughly "9.7-bit" audio using all 3 tone channels. Because of the logarithmic scale, however, some values are so close to others that they should be removed, more realistically leaving you with somewhat unusual-sounding 9-bit audio.
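As a sanity check on that arithmetic, one can count the distinct summed amplitudes three 5-bit channels produce under an assumed exponential DAC (~1.5 dB per step; the real YM2149 curve differs), then merge levels sitting within a small epsilon of each other, which is where the drop from the raw count toward ~9 usable bits comes from:

```python
# Worked check of the effective-resolution argument, under an assumed
# ~1.5 dB-per-step exponential DAC (an approximation, not the measured
# chip curve).
import math

def amp(v, bits=5, step_db=1.5):
    """Amplitude of one channel at 5-bit level v; level 0 is silence."""
    top = (1 << bits) - 1
    return 0.0 if v == 0 else 10 ** (-(top - v) * step_db / 20)

# Every distinct summed amplitude reachable with three channels.
levels = {round(amp(a) + amp(b) + amp(c), 9)
          for a in range(32) for b in range(32) for c in range(32)}
raw_bits = math.log2(len(levels))  # raw (pre-merge) resolution estimate

# Merge near-duplicates: on a log scale many sums land almost on top of
# one another, so the usable resolution is lower than the raw count.
merged, last = 0, -1.0
for v in sorted(levels):
    if v - last > 1e-3:
        merged += 1
        last = v
```

The exact merged count depends heavily on the epsilon and on the real DAC curve, so treat the numbers as an order-of-magnitude check rather than a measurement.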

I haven’t been receiving email notifications for threads I’m subscribed to for some reason, hence the delayed response (I thought this post had gone unanswered), but this weekend I’ll provide a bit more information about some tricks that can easily be pulled off, with accompanying ASCII-art representations and audio snippets, to lend a little more clarity.
 
Hi,
I have got a https://www.thomann.de/de/digitech_trio_band_creator_trio+_trio_plu.htm — this unit analyses a sequence of chords and a rhythm and will then add a bass line and drums. I assume that something like this could be done with a neural network. (?) In this case the input would be pre-processed sound.

By the bye: the Trio+ looks pretty interesting (“im Stompboxformat” is such a hilarious construction), and something like this could definitely be accomplished with a neural network. It’s reminiscent of a little trick I like to play with sampled noise leading into a beat kicking in: using Reason’s built-in Alligator rack extension, which is a triple-band gater (ugh…), filter, and envelope shaper, to bend the sample into an approximation of the drum-machine rhythm that’s about to kick in, ramping up the amplitude of the bass drum(s) from silence over the last quarter of the preceding measure.

Regarding neural-network audio synthesis, I’ve actually done some work in this area to decent success. I’ve been playing around with neural network-based computing since it became possible to do so affordably, and I’ve developed an image-enlargement application that, to my eye at least, yields better results than anything else I’ve seen thus far, including Adobe’s implementation (I’m seriously considering commercializing it). I’d be totally interested in discussing a collaboration on such a project if it’s an area of serious interest for you.
 