Request for opinions on sound library functionality

ZeroByte · Post by **ZeroByte** » Wed Nov 17, 2021 5:09 pm

As I've hinted at in other posts, I'm currently developing a sound API and tool suite, Zsound. I want to release it soon, but before doing so, I'd like to have the base features working well and in a state where adding features to the library later does not break previously-published API behavior. Currently, the library only has music playback support, and the tools import music and FM instruments from various sources - mostly VGM, but VGMs for more than just the YM2151 are supported, i.e. you can import Sega Megadrive tunes or YM2203 tunes with the tool.

The goals of this library are speed and simplicity.

Keeping that in mind, consider this:

For the playback routine, it currently just loops indefinitely on tunes with loops, or stops playback when the tune ends. The process is fairly opaque to the program using the library, and I think it would be useful to have some external control over the player's behavior - for instance you may wish to have a tune loop only a certain number of times, or synchronize events to the music ending, etc.

Currently, it is possible to know whether music is playing or not based on the frame delay counter - 0 = not playing, nonzero = playing. There is no way to know how many times the tune has looped, if any.

The question is: What makes more sense for controlling the looping behavior?

Ideas I've had:

1. Have a variable like "num_loops" which simply increments every time a tune loops

2. Make the start_music() function have options to define the behavior and leave it up to the player to behave accordingly

3. Have the player do callbacks whenever the end of the tune / loop point is reached

Each one has pros and cons. I'm inclined to go with the first option, and leave it up to the program to determine how to send signals to the music player based on its state - such as triggering a fadeout() whenever num_loops > X.
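To make the first option concrete, here is a minimal sketch in C. It is not Zsound's API: `player_t`, `step_music`, `fadeout` and `game_frame` are hypothetical names, and the player is simulated with a fixed-length tune so the example stands alone. It only shows the shape of "the library counts, the program decides".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical player state; none of these names come from Zsound. */
typedef struct {
    uint16_t pos;        /* current frame within the simulated tune */
    uint16_t length;     /* tune length in frames */
    uint8_t  num_loops;  /* incremented by the player each time the tune loops */
    bool     fading;     /* set once a fadeout has been requested */
} player_t;

/* Stand-in for the per-frame player tick: advance, wrap at the end. */
static void step_music(player_t *p) {
    if (++p->pos >= p->length) {
        p->pos = 0;
        p->num_loops++;  /* a uint8_t wraps back to 0 after 255 */
    }
}

/* Stand-in for the library's fadeout request. */
static void fadeout(player_t *p) {
    p->fading = true;
}

/* Application side: poll the counter once per frame and react to it. */
static void game_frame(player_t *p, uint8_t max_loops) {
    step_music(p);
    if (!p->fading && p->num_loops > max_loops) {
        fadeout(p);
    }
}
```

The library side stays tiny (one increment), and all policy lives in the program's own frame loop.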

So - what do you think makes the most sense for this?

Also, what sort of controls would you expect to see for a music player routine beyond: start, stop, fadeout?

Note - I don't currently have volume control implemented, but I'm planning on doing it in a way that wouldn't interfere with the current functionality, so this is not a showstopper for the initial "alpha" release.

Scott Robison · Post by **Scott Robison** » Wed Nov 17, 2021 5:36 pm

I think the problem with idea 1 is you don't know how many maximum loops a given application might care about. If it doesn't care you don't need it. If it does care, it might only need one byte, or two, or three ... I would be inclined to omit that default counter since it might not be used or useful depending on the consumer of the library.

Idea #2 is okay, but it also assumes that you can always know in advance how many loops you will require, and that it will always fit into a byte / word / more.

Idea #3 is best because presumably the application consuming the library can do whatever makes sense for itself. It can omit the callback in which case the library doesn't callback at all. If the application needs more or less space for such a variable, it will know how much room it needs. Or maybe it doesn't need to count, but some event might allow it to determine "now is the time to stop after this loop, which I could not possibly have known about before now". Presumably the callback could return a value that the library would use to know "I am looping indefinitely and shall continue to do so" or "I am looping but this should be the last loop" or "I am not looping, but I've reached the end, so the application can tell me to restart even though this isn't a loop" or even "I am not looping, but the user can tell me to start a brand new tune randomly or by some other criteria".
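A rough sketch of what such a callback could look like in C, assuming a simulated player and entirely hypothetical names (`music_action_t`, `music_cb_t`, `step_music`, `stop_after_three`). The return value tells the player whether to keep looping, make one last pass, or stop, and a NULL callback means the library never calls back at all.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical return codes: what the player should do after reaching the end. */
typedef enum {
    MUSIC_CONTINUE,   /* keep looping */
    MUSIC_LAST_LOOP,  /* loop once more, then stop without asking again */
    MUSIC_STOP        /* stop now */
} music_action_t;

typedef music_action_t (*music_cb_t)(void *user);

typedef struct {
    uint16_t   pos;         /* simulated position in frames */
    uint16_t   length;      /* simulated tune length in frames */
    bool       playing;
    bool       final_pass;  /* set after the callback returned MUSIC_LAST_LOOP */
    music_cb_t on_end;      /* NULL: loop forever, no callback is made */
    void      *user;        /* opaque pointer handed back to the callback */
} player_t;

static void step_music(player_t *p) {
    if (!p->playing || ++p->pos < p->length) return;
    p->pos = 0;  /* end of tune / loop point reached */
    if (p->final_pass) { p->playing = false; return; }
    if (p->on_end == NULL) return;
    switch (p->on_end(p->user)) {
    case MUSIC_STOP:      p->playing = false;   break;
    case MUSIC_LAST_LOOP: p->final_pass = true; break;
    case MUSIC_CONTINUE:  break;
    }
}

/* Example application callback: the app picks its own counter width. */
static music_action_t stop_after_three(void *user) {
    uint32_t *loops = user;
    return (++*loops >= 3) ? MUSIC_STOP : MUSIC_CONTINUE;
}
```

The counter lives in the application, so the library does not have to guess whether a byte, a word, or nothing at all is the right size.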

Of course, this is easy for me to suggest, since I'm not writing the library. I just think one callback routine can provide infinite flexibility that isn't available for #1 or #2.

If you *did* want to allow #1 or #2, it would be trivial for the library to provide built in callback functions that could be registered instead of a user defined callback that provided the easy functionality for applications that don't want to worry about it.

desertfish · Post by **desertfish** » Wed Nov 17, 2021 6:57 pm

I think the callback idea is best. Even more so because I'm thinking of a callback that triggers after a "pattern" or "bar" so that you might be able to sync up some behavior with say, a drum beat or the rhythm of the music?

would that be possible? Then we could create music games or puzzles or interesting visual effects (strobe light, whatnot)?

ZeroByte · Post by **ZeroByte** » Wed Nov 17, 2021 7:01 pm

On 11/17/2021 at 11:36 AM, Scott Robison said:

If you *did* want to allow #1 or #2, it would be trivial for the library to provide built in callback functions that could be registered instead of a user defined callback that provided the easy functionality for applications that don't want to worry about it.

I basically figure a byte is enough, and if any program wants more than that, they can implement whatever size variable is required - just occasionally check the n_loops and update your own counter accordingly. I mean, does a song really need to have a finite number of loops > 255? Even if the music was just 1 second, that's about 4 minutes, 15 seconds. If the music is 15 sec per loop, that's an hour and 4 minutes.
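As an illustration of "check the n_loops and update your own counter", here is one way a program could widen a wrapping one-byte counter. The names `loop_tracker_t` and `loop_tracker_poll` are made up for this sketch, and it assumes the program polls at least once every 255 loops so it never misses a full wrap.

```c
#include <stdint.h>

/* Application-side state for widening the library's one-byte loop counter. */
typedef struct {
    uint8_t  last_seen;  /* library counter value at the previous poll */
    uint32_t total;      /* the application's own, wider loop count */
} loop_tracker_t;

/* Call with the library's current one-byte counter value. */
static void loop_tracker_poll(loop_tracker_t *t, uint8_t n_loops) {
    /* Modulo-256 difference, so a wrap from 255 back to 0 still counts as +1. */
    uint8_t delta = (uint8_t)(n_loops - t->last_seen);
    t->total += delta;
    t->last_seen = n_loops;
}
```

For example, polling with 250 and later with 4 (after the byte has wrapped) adds 250 and then 10, for a total of 260 loops.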

I agree that callbacks give the most flexibility, but they also introduce complexity. Maybe I should just make the step_music routine return a boolean "reached_eof" which is true whether it looped or halted.
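A sketch of that simpler alternative, again with a simulated player and hypothetical names: the step routine just reports "reached the end this frame" and leaves any counting or reacting to the caller.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simulated player state. */
typedef struct {
    uint16_t pos;      /* current frame */
    uint16_t length;   /* tune length in frames */
    bool     loops;    /* true if the tune has a loop point */
    bool     playing;
} player_t;

/* Returns true on the frame where the end of the data was reached,
   whether the tune then looped or halted. */
static bool step_music(player_t *p) {
    if (!p->playing) return false;
    if (++p->pos < p->length) return false;
    p->pos = 0;
    if (!p->loops) p->playing = false;  /* no loop point: halt */
    return true;
}
```

A caller that wants a loop count can simply do `if (step_music(&player)) my_count++;` in its frame handler.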

Scott Robison · Post by **Scott Robison** » Wed Nov 17, 2021 7:39 pm

And I will not be critical of you for any decision you make. I was just answering the question as posed and trying to think of "ultimate flexibility" realizing that all engineering is about analyzing the problem and deciding which things are good to have and which are too much. Given my lack of musical / audio programming experience, I am mainly able to help answer questions in the "generic library" context.

ZeroByte · Post by **ZeroByte** » Wed Nov 17, 2021 7:46 pm

On 11/17/2021 at 12:57 PM, desertfish said:

I'm thinking of a callback that triggers after a "pattern" or "bar" so that you might be able to sync up some behavior with say, a drum beat or the rhythm of the music?

would that be possible?

What we're talking about would only happen whenever a tune ends/loops, so it wouldn't be useful for this sort of functionality.

However, the music data format does allow for such a thing to be done, potentially. Currently there is an undefined 4-byte command which I wrote into the spec as a place holder for doing the PCM digi sample track. Currently, the player just NOPs these and skips 4 bytes if they're found in a data stream. (I haven't figured out what sort of PCM commands might be needed). Essentially, this is kind of an "event" command, i.e. start/stop playback of a digi sample. It would be possible to define a generic "trigger" as one case of these 4-byte commands. The trigger might be defined as "send 3 bytes to the event callback routine" - and then it would be up to whatever the program decides to do with those 3 bytes.
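To picture the "trigger" idea, here is a small C sketch of a stream walker that hands the payload of such a command to a callback, or skips it like a NOP when no callback is registered. The opcode value `CMD_TRIGGER`, the layout (one command byte followed by three payload bytes) and the one-byte size of every other command are assumptions made for the example; the actual ZSM encoding is not described here.

```c
#include <stddef.h>
#include <stdint.h>

#define CMD_TRIGGER 0x40  /* hypothetical opcode for the 4-byte command */

typedef void (*event_cb_t)(const uint8_t payload[3]);

/* Process the command at data[i]; returns the index of the next command. */
static size_t handle_command(const uint8_t *data, size_t len, size_t i,
                             event_cb_t cb) {
    if (data[i] == CMD_TRIGGER) {
        if (i + 4 > len) return len;   /* truncated command: stop parsing */
        if (cb) cb(&data[i + 1]);      /* pass the 3 payload bytes along */
        return i + 4;                  /* no callback: behaves like a NOP */
    }
    return i + 1;  /* placeholder: every other command is 1 byte in this sketch */
}

/* Example callback that just records what it was given. */
static uint8_t last_event[3];
static int     event_count;

static void record_event(const uint8_t payload[3]) {
    last_event[0] = payload[0];
    last_event[1] = payload[1];
    last_event[2] = payload[2];
    event_count++;
}
```

What the three bytes mean is entirely up to the program, which is exactly where the portability concern comes from.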

The danger of such a thing is that it introduces "compatibility" issues - i.e. a ZSM file containing triggers would immediately become non-portable. Suppose one program decides to use triggers, and decides to use the 3 bytes for whatever behavior, and another program does something different. ZSMs for project A would cause unpredictable behavior if loaded and played back in project B.

So, I guess I'd prefer to have a set of standardized event types over a generic "send these three bytes to a callback routine", or maybe even just call it a "sync" frame that doesn't pass any data, and doesn't possess any particular meaning - kind of like "Send IRQ" - why? I dunno - just send it.

Or - maybe it should be "sync: track#" - so here's a trigger that means "something interesting just happened in track #" - do whatever you want or not. K, thx, bai.

Honestly, though, this is starting to sound like a job for MIDI. A MIDI playback library that lets you load a sound font, etc and play back MIDI files would be pretty useful for the community, and you could definitely embed events in there for sync-to-music stuff - like some rail shmup where there're enemies that emit pulses of energy in sync with the bass line of the music, and the lighting in the corridor changes when the music goes into the bridge vs the chorus, etc.

Fabio · Post by **Fabio** » Wed Nov 17, 2021 8:59 pm

This is an interesting topic, but I have a question: can a software IRQ be generated in this system?

maybe just jumping to the IRQ address?

Scott Robison · Post by **Scott Robison** » Wed Nov 17, 2021 9:24 pm

On 11/17/2021 at 1:59 PM, Fabio said:

This is an interesting topic, but I have a question: can a software IRQ be generated in this system?

maybe just jumping to the IRQ address?

On a 6502, BRK generates a software IRQ. It pushes the return address and status register onto the stack much as a hardware interrupt does, so RTI does the right thing.

Tatwi · Post by **Tatwi** » Fri Nov 19, 2021 3:33 am

Here are some general functions that might be helpful.

playFromTo(FILENAME, START, END)

- Plays part of a file.

loopFromTo(FILENAME, START, END, ITERATIONS)

- Plays part of a file for a set number of iterations.

newSound = genEffect(FILENAME, START, END, EFFECT)

- Creates a new sound file from part of a file and applies one of a number of sound effects.

applyEffect(SOUND, EFFECT)

- Applies a commonly used sound effect, such as reverb, reverse, pitch shifting, fade in, fade out, etc.

newSound = join(SOUND, SOUND)

- Joins two sounds into one.

The from-to format to play different sounds stored in a single file is a concept used by other sound libraries. I don't know how helpful or purposeful it is, but it is "a thing". Being able to rip portions of songs/sounds and using them with joins and effects would allow for nifty stuff such as simulated record scratching or scene based ambience.

ZeroByte · Post by **ZeroByte** » Fri Nov 19, 2021 5:44 pm

I hadn't considered "utility" functions from the perspective of an editor utility as candidates for inclusion in the library. It's definitely food for thought - many of the functions mentioned (such as join(sound1, sound2)) are pretty much the type of thing that should be left up to the application to implement for whatever purpose - but that does bring up the point that the library should expose the necessary ingredients to facilitate such things.

I think seek() / rewind() / advance() make a lot of sense. These would require some functionality that's also needed for the ability to preempt music playback on one or more channels with SFX and then resume the music once the FX is done.

My estimation is that it will require 2 pages of RAM (512 bytes) for the YM cache and 64 bytes for the PSG cache. The YM, being write-only, requires a page of memory to shadow it, and another page to cache the state for "ghost writes" while a voice is suspended. This is why I'm strongly leaning towards the library using a bank of HIRAM for its workspace, in order to minimize the main memory footprint...
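Roughly, the shadow / ghost-write scheme could look like the C sketch below. All names are hypothetical and the chip is simulated with an array so the example stands alone. It assumes the YM2151 layout where registers 0x20 and up belong to channel `reg & 7`; global registers and key-on (0x08) are simply passed through here, which a real player would need to handle more carefully.

```c
#include <stdint.h>

static uint8_t ym_chip_sim[256];  /* stand-in for the real, write-only chip */
static uint8_t ym_shadow[256];    /* page 1: last value sent to each register */
static uint8_t ym_ghost[256];     /* page 2: music state while a voice is suspended */
static uint8_t suspended;         /* bit n set: channel n is preempted by SFX */

/* Every real write goes through here so the shadow always matches the chip. */
static void ym_hw_write(uint8_t reg, uint8_t val) {
    ym_chip_sim[reg] = val;
    ym_shadow[reg] = val;
}

/* Music writes: diverted to the ghost page while the channel is suspended. */
static void music_write(uint8_t reg, uint8_t val) {
    if (reg >= 0x20 && (suspended & (1u << (reg & 7)))) {
        ym_ghost[reg] = val;  /* "ghost write": remembered, not sent */
        return;
    }
    ym_hw_write(reg, val);
}

/* SFX takes over a channel: snapshot its current state into the ghost page. */
static void suspend_channel(uint8_t ch) {
    for (unsigned r = 0x20u + ch; r < 0x100u; r += 8) {
        ym_ghost[r] = ym_shadow[r];
    }
    suspended = (uint8_t)(suspended | (1u << ch));
}

/* SFX is done: replay the ghost state so the music picks up where it should be. */
static void resume_channel(uint8_t ch) {
    suspended = (uint8_t)(suspended & ~(1u << ch));
    for (unsigned r = 0x20u + ch; r < 0x100u; r += 8) {
        ym_hw_write((uint8_t)r, ym_ghost[r]);
    }
}
```

Snapshotting on suspend and replaying every register of the channel on resume avoids per-register "dirty" flags, which keeps the footprint at the two pages mentioned above.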

So my takeaways are that it would be good for the player to have the ability to seek a certain spot in the music, apply effects to the playback, and that I need to consider some non-playback-related things like functions to compute the duration of a tune, the duration offset of the loop start and end, etc.

Effects would need to be limited to fairly simple things like volume adjustments and pitch transposition.