Hey
for the game I'm currently developing I need sound recording support. In its current state orx does not support this; however, it is based on OpenAL, so it should be rather easy to add. I would like to implement this feature in the engine.
In order to be useful for others too, I would like to discuss possible orx API extensions that are independent of OpenAL, so that it could possibly work with other underlying frameworks, too.
I'm not that familiar with either orx or OpenAL.
What I thought of:
We need an additional object, different from orxSOUND, like orxSOUND_CAPTURE.
This object would have properties like the audio format (bits + mono/stereo) and frequency.
On the other hand, it might be pointless to have capture objects: you usually don't capture from multiple sources at the same time.
It might be better to just have a function orxSOUND_START_CAPTURE(FORMAT, FREQUENCY, BUFFERSIZE)
and
orxSOUND_STOP_CAPTURE(FORMAT, FREQUENCY)
After that, orxSOUND_CAPTURE_EVENTs would be created; the payload would contain the audio data in the specified FORMAT (BUFFERSIZE bytes at maximum).
For the actual capturing, the default device on the system would be used.
Then it might be useful to add some functionality to play back those samples directly, without saving them to a file.
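To make the proposal concrete, here is a rough sketch of what such an API could look like as C declarations; every name and signature here is hypothetical, just to illustrate the idea, and none of it exists in orx:

```c
/* Hypothetical sketch only -- none of these names are actual orx API */
typedef enum
{
  orxSOUND_CAPTURE_FORMAT_MONO8,
  orxSOUND_CAPTURE_FORMAT_MONO16,
  orxSOUND_CAPTURE_FORMAT_STEREO8,
  orxSOUND_CAPTURE_FORMAT_STEREO16
} orxSOUND_CAPTURE_FORMAT;

/* Starts capturing from the system's default device */
orxSTATUS orxSound_StartCapture(orxSOUND_CAPTURE_FORMAT _eFormat,
                                orxU32 _u32Frequency,
                                orxU32 _u32BufferSize);

/* Stops capturing; while active, orxSOUND_CAPTURE events would carry
   up to _u32BufferSize bytes of audio data in their payload */
orxSTATUS orxSound_StopCapture();
```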
Any suggestions?
Comments
Sorry not to have replied sooner but I wanted to take some time to check orx's current sound API and see how this feature could be integrated at best.
I think your proposition makes sense and I would just like to discuss the actual function names. Let me know if what I say makes sense to you.
Right now, a sound buffer in memory is handled by the orxSoundSystem API and is called orxSOUNDSYSTEM_SAMPLE. When you're done capturing your data, that's what we should end up with. We should also add an orxSoundSystem_Save(orxSOUNDSYSTEM_SAMPLE *, const orxSTRING) to save it on disk.
Based on the orxSOUNDSYSTEM_SAMPLE, we can create an orxSOUND (a unified structure that can be played through the high-level orxSOUND API). Right now the creation of an orxSOUND from an orxSOUNDSYSTEM_SAMPLE is private and hidden in orxSound.c, but I can expose something like orxSound_CreateFromSample(const orxSOUNDSYSTEM_SAMPLE *) that would do the trick.
Now for the recording itself, I think you're right, we need two functions, something like:
orxSOUNDSYSTEM_SAMPLE * orxSoundSystem_StartSampleRecording(const orxSTRING _zName, orxSOUNDSYSTEM_FORMAT *_pstFormat)
orxSTATUS orxSoundSystem_StopSampleRecording(orxSOUNDSYSTEM_SAMPLE *).
In this case orxSOUNDSYSTEM_FORMAT would be a public structure with direct access to rate, channels, etc.
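As a sketch, such a public structure might look like the following; the field names are my guesses in orx's naming style, not the final API:

```c
/* Hypothetical sketch -- actual names/fields to be decided */
typedef struct __orxSOUNDSYSTEM_FORMAT_t
{
  orxU32 u32SampleRate;     /* e.g. 44100 (Hz)      */
  orxU32 u32ChannelNumber;  /* 1 = mono, 2 = stereo */
  orxU32 u32BitsPerSample;  /* usually 8 or 16      */
} orxSOUNDSYSTEM_FORMAT;
```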
My only concern is that the orxSOUNDSYSTEM_SAMPLE * should be internally flagged so as not to be usable by any other function as long as StopSampleRecording hasn't been called on it. We'd possibly also have to add a query function such as orxSoundSystem_IsSampleRecording(const orxSOUNDSYSTEM_SAMPLE *). To solve this particular issue, we could go with another structure such as your orxSOUND_CAPTURE, but that means we'd then have to provide something to delete it, and it wouldn't be explicit that StopSampleRecording deletes such an intermediate structure. I tend to prefer the first solution (keeping the number of structures low so as to make it as simple as possible) but it's not a strong preference either.
To sum it up, I'd like the capture/record interface at the orxSoundSystem level (including the events) and only the possibility to create a playable orxSOUND from the result at the orxSound level. What do you think?
If you agree to go this way, I can easily take care of creating the orxSOUND bridge and all the plugin hooks, simply leaving empty functions for you to fill up in the plugin itself. Let me know what you think!
EDIT: Sorry, I made a few post-send edits, so you might have to re-read this message as it may differ from the email notification you received.
I'm now going to have a deeper look at the current implementation of the orx sound system.
I use CodeLite as an IDE on a Windows 7 machine.
I'm a bit confused because you have both an SFML and an OpenAL sound plugin. Which one is actually used? Where can I see this?
I'd guess the SFML one is used. As far as I know, SFML itself uses OpenAL for audio-related stuff.
The CodeLite workspace only has a project for the SFML sound system; however, this one is not built when I build the whole workspace.
Yet in the end everything works... I'm a bit confused right now.
...maybe it will be clearer tomorrow^^
You can read the README file packed with orx that explains a bit what's going on (especially the "Versions" section).
Using Codelite you have the option to compile for 2 platforms (win & linux), 2 link types (static & dynamic), 2 modes (embedded & non-embedded) and 2 levels of optimization (debug & release).
That's a lot of combinations! ^^
I'm assuming the only one that doesn't look obvious is the mode (embedded/non-embedded), am I right?
Well, in non-embedded mode, you'll end up with a library for orx that only contains the core but no platform-dependent code (mainly I/O). The plugins will then be loaded at runtime and you can decide which ones you want to load from the command line or config (SFML, SDL, GLFW, etc.).
In embedded mode, you lose that flexibility but you gain in ease of use and speed: a given set of plugins is compiled and embedded into the orx library at compile time. The default plugins are currently the GLFW ones (orx 1.2+) for computers, and there's only one set available for iPhone/iPad.
The list of embedded plugins is defined in src/plugins/orxPlugin_EmbeddedList.cpp
Hope this helps a bit!
It was confusing that the sources were not part of the CodeLite project. I didn't expect them to be #included.
So basically the non-embedded OpenAL plugin is missing from the CodeLite projects.
Now I'll have a deeper look at the sound system code.
I'll fix that when I get my computer back. I don't package the non-embedded versions for releases, so I don't test them often (if at all for over a year! ^^).
So one orxSOUNDSYSTEM_SAMPLE contains a single chunk of captured audio data, right? (the size depending on some predefined buffer size)
So the sample that orxSoundSystem_StartSampleRecording returns will be filled over and over again?
How would that work with just a single one?
I thought of something like this: orxSoundSystem_StartSampleRecording just initiates the capturing and doesn't return anything.
Each time a new sample is captured, either a SOUND_CAPTURE event is triggered or some callback function is called.
The user then does whatever he wants with the sample and deletes it afterwards.
So what are possible use cases for the sample data?
Maybe I understood orxSOUNDSYSTEM_SAMPLE wrong, but saving just one sample to a file is not really useful, is it?
What do you think?
That's right. More precisely, orxSOUNDSYSTEM_SAMPLE is simply a wrapper for the actual structure used by the underlying library. For the OpenAL plugin it holds the buffer ID (which works the same way as texture IDs in OpenGL, for example) as well as the duration, which can't easily be retrieved through OpenAL.
I'm not sure how OpenAL works for recording, but when the recording is done, your whole sample should be in memory. If OpenAL allows recording directly to disk, we can add an additional method such as orxSoundSystem_StartStreamRecording.
When the recording is over, we simply create an OpenAL buffer based on the sample, wrap it in an orxSOUNDSYSTEM_SAMPLE and can then use it the same way we use all the others. Does that sound ok to you?
Oh yes, I don't know why I assumed we might want more than one recording at a time, which is totally silly, my bad! This option is far better. In this case I'd rather have something like:
We can put in the format whether we want to record to memory or to disc, if this is supported by OpenAL of course. I have no idea.
I like the event approach as it'll stay consistent with the rest of the orx API. The events would be, I guess, orxSOUND_EVENT_RECORD_START and orxSOUND_EVENT_RECORD_END, and the orxSOUND_EVENT_PAYLOAD will have to be updated to contain the relevant info.
Well, if OpenAL allows recording a stream to a file, the user can choose that at the beginning. If the user chooses to record to memory, he can then either save it or play it directly. Again, I'm not sure what's available for saving as I didn't look too much into that. I guess libsndfile has save support, but I'd be surprised if stb_vorbis had any.
Or for now we could simply focus on recording to memory, get a sample out of it when it's done and see later about saving/streaming to disc?
Just to sum it up, the orxSOUNDSYSTEM_SAMPLE would be created at the very end from the memory buffer that would have been recorded. But again, I need to check OpenAL as I don't know if it can do dynamic recording or if we have to give it a max size before starting to record.
However, for recording to memory, we can actually support unbounded recording and allocate new buffers on the fly when needed. OpenAL can then create a playable buffer from all those recorded memory buffers.
I'm not sure if I'll have time tonight, but I'll try to prepare placeholders in orx's API for you to fill. I'll create a branch in the svn repository and can give you write access if you give me your SourceForge user name. Let me know.
What do you mean by "when the recording is over"? After orxSoundSystem_StopSampleRecording is called?
After giving the whole thing some thought, I think it's better to do it like the music playback.
This means introducing some generic stream interface. The sound system would then be able to fetch its audio data from this stream; one possible implementation would be a file stream.
On the other hand, the capturing code would be able to write the audio data continuously to an output stream. This output stream could either be a file stream, forward the data to an audio library, or add the captured samples to a FIFO; the playback code could then access this FIFO via an input stream interface.
I hope it's somehow clear what I mean and that it wouldn't be like using a sledgehammer to crack a nut oO
Let's step back a bit: what are the features you need? I'd like to keep the API as simple and portable as possible, as orx will eventually have to support this on other platforms such as the iPhone.
As I see it, here's a list of the needed features:
- record for a variable length of time (i.e. the duration doesn't have to be specified at the beginning of the recording)
- reuse that recorded sound either as a sample (everything is in memory) or as a stream (from disc). This is generally known before recording.
(Also note that a sample can be created from disc if needed.)
So my guess would be to have functions for requesting a recording (either to memory (sample) or to disc (stream)), stopping the recording (sample -> an orxSOUNDSYSTEM_SAMPLE is created and the memory freed; disc -> the file is finalized) and making the sound accessible to the rest of the API (play, pause, pitch, etc.). Do we need anything else?
Do we need general streaming, i.e. playing the sound that we are currently recording? If so, we can send events for every recorded packet, which allows the user to inspect its content, stop recording if it's blank or even alter it. This is very close to what I understood of your streaming view, with a non-intrusive approach, I guess.
I do not need to save the audio data to a file or play it back. However, I thought someone else might find this useful.
I don't need that either. And as orx does not support networking, I can't think of a use case right now. However, if networking is something to be added in the future, it might become relevant.
Having an event for each captured sample would fulfill my needs.
It would also be fine for writing it to a file.
However, it would not be the best option if someone wants to play back the captured samples directly.
In this case, I guess we can start recording with 3 options: discard, store in memory, store on disc. You get notified of captured packets through events anyway, the packet size and capture mode being defined in the orxSOUNDSYSTEM_FORMAT when starting the recording (which should probably be renamed to orxSOUNDSYSTEM_RECORD_INFO?).
The packet length can of course differ from the internal buffer used for recording.
As for VoIP, we'll have time to see how it works, I'd be more concerned about object replication first.
This way the API stays minimal, what do you think?
So how about this: a start call only initiates the capturing. The format/info does not contain any information on what to do with the captured data, i.e. whether it should be stored in memory or written to a file.
An orxSOUND_EVENT_RECORD_START will then be generated, followed by continuous orxSOUND_EVENT_RECORD_SAMPLE events carrying the audio data in their payload.
Once the stop function is called, orxSOUND_EVENT_RECORD_END will be fired.
Ok, this would be the whole, minimalistic capturing API.
Now, based on this API, we could add a sound recorder which catches the events, opens a file (in the case of a file recorder), writes the data to it and closes it after the recording has ended.
Additionally, we could add an in-memory recorder which does the same, but writes the data to some place in memory.
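As a self-contained sketch of that recorder idea, in plain C: a hypothetical packet struct stands in for the real event payload (which was still being designed at this point), and plain function calls stand in for event dispatch:

```c
#include <string.h>

/* Hypothetical packet shape -- not the actual orx event payload */
typedef struct
{
  const short *data;
  unsigned int sampleCount;
} RecordPacket;

/* A tiny in-memory recorder driven by the three proposed events */
#define RECORDER_CAPACITY 4096
static short sRecorded[RECORDER_CAPACITY];
static unsigned int sRecordedCount = 0;

static void OnRecordStart(void)
{
  sRecordedCount = 0;
}

static void OnRecordPacket(const RecordPacket *packet)
{
  /* No bounds check for brevity; a real recorder would grow its buffer */
  memcpy(sRecorded + sRecordedCount, packet->data,
         packet->sampleCount * sizeof(short));
  sRecordedCount += packet->sampleCount;
}

static void OnRecordEnd(void)
{
  /* Recording complete: sRecorded now holds sRecordedCount samples */
}
```

A file recorder would be identical in shape, except OnRecordStart would open a file, OnRecordPacket would fwrite the packet and OnRecordEnd would close the file.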
What do you think about that?
I would simply rename the "sample" part as it might be confused with actual samples (which are in-memory sounds meant to be played).
I'd go with:
Should we merge the events of orxSOUNDSYSTEM and orxSOUND together, and maybe add a convenience wrapper in the orxSOUND module so that users don't have to call orxSOUNDSYSTEM_* functions directly?
I'm not sure about that; it was just to get more consistency.
Yep, that would probably be good.
I'll try to work a bit on the API right now so that you won't need to learn how the plugin module works.
I think we need another function that checks whether recording is available. It might not be, either because the plugin doesn't support it or because there is no capture hardware:
Currently I'm thinking about how to handle the buffers that contain the audio samples in the event payload.
It's probably a bad idea to allocate them dynamically and let the user take care of deletion: the user might not have an event handler defined, which would result in a memory leak. And constantly allocating memory is not that great performance-wise either.
We could use double/triple buffering. Then the user has to make sure he has processed/copied the sample before the next event occurs.
First, the orxSOUNDSYSTEM_RECORD_INFO contains the frequency at which audio data will be fetched from the audio device. Furthermore, it contains the maximum buffer size.
I think it's best to do double buffering: once the first buffer has been filled, it will not be touched until the user calls some orxSOUNDSYSTEM_RECORD_SWAP_BUFFERS method. This way the data will not be messed up while the user is still processing/copying it.
If the user can't process the audio data fast enough, i.e. orxSOUNDSYSTEM_RECORD_SWAP_BUFFERS has not been called when the next fetch is triggered, we will just drop some audio samples.
The downside is that only *one* sound-capture event listener should be attached.
Pros:
- the buffer stays consistent while being processed by the user
- if the user can't cope with the update frequency, samples are just dropped -> no delay
Cons:
- only one listener
Concerning the buffers, my guess is that you simply need one buffer that will be used both to communicate with OpenAL (in addition to OpenAL's own buffer) and to be presented to the user through an event. Its content will then either get discarded or saved to disc after the event has been sent.
As events are synchronous, I don't think you need to worry about buffer swapping.
Also, as orx is single-threaded, you won't be able to fill your buffer while the user is still processing the event anyway, so I don't think double buffering is necessary. What do you think?
You can also have more than one listener for the event; as long as the processing time is less than the capacity of OpenAL's internal buffer, you should be fine.
At first I put the buffer size in the RECORD_INFO structure, but I removed it as I don't think most users will need it, and it could simply be a #define constant based on time (this way it'll adapt to the number of channels and the sample rate).
The OpenAL internal buffer size will then be based on this value.
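A duration-based constant boils down to something like this; the names here are made up, only the arithmetic matters:

```c
/* Hypothetical packet duration; a real plugin would #define
   something similar */
#define RECORDING_PACKET_TIME_MS 20

/* Samples per packet for a given sample rate and channel count;
   automatically adapts to both, as described above */
static unsigned int GetPacketSampleCount(unsigned int sampleRate,
                                         unsigned int channelNumber)
{
  return (sampleRate * channelNumber * RECORDING_PACKET_TIME_MS) / 1000;
}
```

For example, 20 ms at 44100 Hz stereo yields 1764 samples per packet, while 20 ms at 22050 Hz mono yields 441.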
Do you think you need to specify a different buffer duration for different recording?
So they can't actually interfere with a clock function?
As soon as my clock function is called, I can be sure that no event handler is still processing/using the data?
It might become a problem with SFML though, as this library does the sound capturing in a custom thread. OpenAL should work fine if I can make this assumption.
Yes, it would probably be possible to derive the buffer size from the polling frequency, so the user won't have to provide it and we won't need it inside RECORD_INFO.
I lack the experience in audio programming to evaluate whether there is any situation in which the user might want a custom buffer size.
Maybe we could still have it in RECORD_INFO, but if it's unspecified or negative, we set a decent value ourselves.
The event handler will be called as soon as the event is fired, probably from a timer callback in the sound system module, after filling the local buffer from the OpenAL one.
When your handler is called, you can either process the buffer right away or make a local copy for later use.
So yes, when you're in your own clock callback, no other part of orx is currently being executed. OpenAL might still record on its side, but it'll do so in its own private buffer, not the one presented by orx.
I think it should also work with SFML: even if SFML is still running, it'll use its own buffer, not the one the plugin presents through the event.
That's a valid option. I'm still of the opinion that we shouldn't add it at first, and only do so if there's an actual need for it.
I assumed the event system was non-blocking/asynchronous.
(I had overlooked that bit in your post, sorry >.<)
Then this will indeed not be a problem.
Ok, I think I'm ready to start hacking... finally^^
Excellent news. Let me know if you face any difficulty. I almost started on the OpenAL implementation yesterday night as I was getting carried away; then I remembered sleep was what I really needed.
While everything works fine in the SFML plugin, I have a problem with the OpenAL one:
alcCaptureOpenDevice always returns NULL; however, OpenAL does not set an error flag. I don't get it.
It even works if I issue alcCaptureOpenDevice in the SFML plugin directly. So those two plugins might use different versions of OpenAL.
I don't know yet where the problem lies exactly... trying to figure it out.
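For reference, this is roughly what a capture-device open with explicit error checking looks like in plain OpenAL. These are the standard ALC capture calls; note that failures of alcCaptureOpenDevice are queried via alcGetError(NULL), which is easy to miss, and this of course needs an OpenAL implementation with a working capture backend:

```c
#include <stdio.h>
#include <AL/al.h>
#include <AL/alc.h>

ALCdevice *OpenDefaultCaptureDevice(void)
{
  /* NULL name -> default capture device; 44100 Hz, 16-bit mono,
     4096-sample internal ring buffer */
  ALCdevice *device = alcCaptureOpenDevice(NULL, 44100,
                                           AL_FORMAT_MONO16, 4096);
  if(device == NULL)
  {
    /* Diagnostic only; with no device, errors are queried on NULL */
    printf("alcCaptureOpenDevice failed, ALC error: 0x%X\n",
           alcGetError(NULL));
  }
  return device;
}
```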
I'm going to be on a trip until Saturday. I hope, or rather insist, that you won't have implemented everything yourself by then
(I'm now using VS2008)
Also, does it happen on Windows, Linux and Mac, or only on one platform? Linux and Mac use the OpenAL that ships with the OS, whereas the Windows version uses the free OpenAL Soft implementation.
No worries about the code, I have much else to do in my life right now. However, if you could check in what you did, I might try to fiddle with the failing call and maybe be able to give you a pointer when you come back.
As it's in its own branch, don't worry about stability when submitting code.
It's a one-liner that should definitely work:
I'd better not upload my changes, as I messed things up pretty badly while trying to find out what's going wrong.
I linked the OpenAL plugin against sfml-audio-d.lib (which contains the OpenAL version of SFML).
After doing so, everything works as it should.
I don't know which version of OpenAL is used by SFML.
I was not able to compile the OpenAL library from the extern folder of orx, as the file "dsound.h" is missing.
Looking into alcCaptureOpenDevice 2 days ago, I stumbled upon a forum where someone was asking for help because the call never succeeded. Someone told him it needed to be paired with an alcDeviceContext (from memory) in order to work, but that wasn't written anywhere. I'm not sure if the same device context used for playback can also work for recording; I wanted to toy with that.
As for SFML, they're using the official implementation from Creative Labs, which comes with no source and can only be linked dynamically. As I wanted the option of linking statically, I had to use OpenAL Soft for Windows. That might be the problem. On Linux/Mac the official OpenAL library is part of the distributions, so I didn't need to use OpenAL Soft there.
If you need dsound.h, it means you have to install the DirectX SDK, as OpenAL uses DirectSound (-> dsound.h) for the low-level handling.
http://objectmix.com/java-games/130542-openal-capture-problem.html
Nevertheless, opening a default device and setting up the context does not work.
However, I think I get it now:
OpenAL Soft uses different backends. For orx, only the wave-file and DirectSound backends are used/compiled; unfortunately, the DirectSound backend does not support capturing audio data.
The readme of the OpenAL Soft binary archive recommends combining it with the Creative library: if you link against the OpenAL from Creative and additionally have soft_oal.dll in the application directory, the devices of both implementations will be available:
http://connect.creativelabs.com/openal/OpenAL Wiki/Enumeration with OpenAL on Windows.aspx
Why is the static linking important?
I have not yet built a version of OpenAL Soft with PortAudio support, but theoretically this should be the solution to the problem I was having.
Nope, but the content was similar. Don't remember if it was on a BlitzBasic or Haskell forum.
Well, in this case one might as well directly use Creative's library.
Some people don't like depending on external libraries for redistribution, and I totally understand that point of view. But if there's no choice...
Would using PortAudio mean having the option of statically linked libraries? If so, I like it.
So I checked it out via git. The newest version now supports capturing via WinMM, so there's no need for PortAudio.
Which is good, because I had some problems with the PortAudio implementation: I was not able to compile it.
Anyway, I added the "dev" branch of OpenAL Soft to the SVN and compiled a static library from it. With this library it is possible to capture audio through WinMM without the library from Creative.
I linked this static library to the orx project. Now everything works fine... playback (I didn't break it^^) and capturing. The only weird thing is that when including the headers of the dev branch (in extern/OpenAL-dev/include) I get several errors like "error LNK2019: unresolved external symbol __imp__alcCloseDevice in function @orxSoundSystem_OpenAL_Init@0".
However, if I use the old header files (extern/OpenAL-1.12.854/include) together with my new library, everything works like it should. That's totally weird and I don't get why.
I like the possibility of using both the Creative lib and the open-source version in combination, but I think this is a personal preference. I'm thinking of linking OpenAL Soft via DLL for my project, but I will leave the SVN version with the static linking.
Excellent news!
Mmh, I'll check the source, but it's possible they don't compile the library statically very often and might have forgotten a declspec somewhere. I remember I had to fix something like this, though I'm not sure whether it was for OpenAL or libsndfile. It shouldn't be too hard to patch anyway.
Great. Why don't you use the Creative DLL in your project in this case? They should be compatible as they share the same API.
There are still some things that need to be done, but the basic stuff is working.
What are the sender and the recipient of orxEVENT_SEND?
What should I set them to?
Well, in your case you have no sender or recipient, so you can just set both to orxNULL. There are orxEVENT_* macros to help initialize & send events.
Furthermore, I was not completely sure about creating the clock:
Which orxCLOCK_TYPE is appropriate? I chose orxCLOCK_TYPE_CORE.
You shouldn't actually need to create a clock if you use the global timer (which registers on the core clock too). I think that's how I did streaming in the same plugin.
I implemented the capturing in the OpenAL plugin. Additionally, I implemented recording to a file.
Directly playing back recorded data is a bit more tricky: we would have to create an interface to append buffers to the playback queue.
But I don't even know if such a feature is actually needed.
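Such an interface would essentially wrap OpenAL's streaming pattern: reclaim buffers the source has finished playing, refill them with freshly captured samples and queue them again. A sketch using standard OpenAL calls, assuming 16-bit mono data:

```c
#include <AL/al.h>

/* Append captured samples to a streaming source's playback queue,
   reusing a buffer the source has already finished playing */
void QueueCapturedSamples(ALuint source, const short *samples,
                          int sampleCount, int frequency)
{
  ALint processed = 0;
  alGetSourcei(source, AL_BUFFERS_PROCESSED, &processed);
  if(processed > 0)
  {
    ALuint buffer;
    alSourceUnqueueBuffers(source, 1, &buffer);
    alBufferData(buffer, AL_FORMAT_MONO16, samples,
                 sampleCount * (int)sizeof(short), frequency);
    alSourceQueueBuffers(source, 1, &buffer);
  }
}
```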
Great! I'll look at this tonight or tomorrow.
I wouldn't bother with that right now, as neither of us can see any use for it in the current network-less state of orx. (And people use Vent/Mumble anyway.)
So any tutorial is welcome, including one on sound recording.
http://orx-project.org/wiki/en/orx/tutorials/community/tdomhan/sound-recording
I've fixed a couple of issues with other platforms. I haven't tested on Linux/Mac yet, but I'm sure it'll work fine.
I took the liberty of simplifying the API a bit while still granting the user the same degree of freedom you did. I hope you don't mind.
You can now decide on a per-packet basis whether you want to save packets to a file or not. You can also modify them prior to saving (I postponed the event handler registration to ensure the user has time to register his own first).
Custom buffers can now be provided for modified packets; the user is responsible for the life/death of such buffers.
There's a small example of simple sound processing in the bounce demo on the svn. While Shift is pressed, packets are saved and their frequency is doubled; no packets are saved if Shift isn't pressed.
I'll update your tutorial in the coming days. Thanks again for your contribution!
I haven't looked into the changes you made to the API, but I will do so later. I suppose they are sane, though.
You're welcome. Btw, I'm much more thankful for all the effort you put into this project! Kudos to you!
I hope so! ^^
I've just updated your tutorial so you can judge by yourself.
Ahah, thanks! I hope orx still fits your need then.
Regarding the capturing:
This won't work for me, as I definitely need the fixed block size.
But I guess this is a feature not many can make use of.
Simply accumulate the info in your own buffer. If you need to write the data to a file, just give the buffer back through the payload when it's full.
In the same way, the polling frequency can be achieved by using an orx timer to collect your locally stored data.
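The accumulation idea fits in a few lines of plain C; the packet sizes are arbitrary here, and the block-full "callback" is just a counter standing in for whatever processing the user wants:

```c
#include <string.h>

#define BLOCK_SIZE 1024  /* fixed block size wanted (in samples) */

typedef struct
{
  short buffer[BLOCK_SIZE];
  unsigned int used;
  unsigned int blocksEmitted; /* stands in for a "block full" callback */
} Accumulator;

/* Feed a variable-sized captured packet; "emits" (counts) a block
   every time BLOCK_SIZE samples have been accumulated */
static void Accumulate(Accumulator *acc, const short *samples,
                       unsigned int count)
{
  while(count > 0)
  {
    unsigned int space = BLOCK_SIZE - acc->used;
    unsigned int n = (count < space) ? count : space;
    memcpy(acc->buffer + acc->used, samples, n * sizeof(short));
    acc->used += n;
    samples += n;
    count -= n;
    if(acc->used == BLOCK_SIZE)
    {
      acc->blocksEmitted++; /* a real handler would process the block here */
      acc->used = 0;
    }
  }
}
```

Feeding it three 700-sample packets yields two full 1024-sample blocks, with 52 samples left waiting in the buffer.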
It's actually the same code you wrote, except it's not internal anymore. If you think it's too much of a hassle, I can put the code back.
However, using a struct to initialize the capture is too error-prone, as I found when first trying your branch (the booleans were not set in your example, resulting in a crash when starting the capture with my VS build: the booleans had the 0xCDCDCDCD value). I guess I can add two extra parameters to the function if you'd rather have the feature internally.
Leave it the way it is and I will figure something out.
Maybe I will completely switch to PortAudio after all, in order to keep everything in sync.
I was thinking I could simply write a tutorial on how to get fixed-size blocks for recording. Though it might not be of any use to you, as it'd be based on your initial method, it could help any newcomers with requirements similar to yours.
I completely switched to PortAudio now. All of the above is doable with this library; things just got a bit more complicated than "orxSound_Play(...)"^^
I guess the current API is sufficient for most games.