Sunday, November 30, 2008

The Great Audio 2.0 Potential Customer Survey With Cream And Cherry On Top

Thank you, everybody, for your participation. The survey is closed; you can find the results here.

Friday, November 28, 2008

Halebopp Alpha Feature List


It is about time to prepare the alpha launch of Halebopp, although there are still a few big features missing. I was asked to reveal which features were going to be included in the alpha release, so first of all, here is a list of major features already implemented, followed by a list of features to come:
  • Zoomable User Interface. Just like Google Maps, your project is arranged on a virtually infinitely large surface that can be panned and zoomed with the mouse, to get from a broad overview to a tiny detail and back in no time.
  • Audio Tracks. Drag samples or longer recordings from your collection into the project and place them on the timeline to make them part of the project. WAV and FLAC formats are currently supported.
  • Unlimited Loop Sections. By dragging in the editor, you can define loop sections which trap the playing cursor infinitely. You may add as many loop sections as you require.
  • Sample accuracy. Projects use an internal sample rate of 48 kHz, enabling you to place tracks, envelopes and effects with a minimum resolution of about 21 µs (microseconds).
  • Tempo Independence. On the top level, Halebopp does not measure the speed of music in beats per minute, but in seconds. This means that segments of different speed also look different in the editor, giving you a better understanding of speed changes.
  • Perfect Persistence. Your project is transparently stored in an SQLite database on disk. Any action that you perform on your project will directly be written to the database. Should Halebopp ever crash, you will not lose hours of work. Neither can you forget to save, as no Save action is required.
  • Transparent offline rendering. A mixdown of the entire project is constantly held in memory, and partially recalculated whenever the project changes. This means that when you are merely listening to your project, there is little to no CPU consumption, allowing you to stack as many tracks and effects as you like, without stuttering or interruptions.
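To make the sample accuracy figure from the list above concrete, here is a small back-of-the-envelope sketch. The function names are made up for illustration and are not part of Halebopp; only the 48 kHz rate comes from the feature list.

```python
SAMPLE_RATE_HZ = 48_000  # internal project sample rate named in the feature list


def sample_period_us(sample_rate_hz: int) -> float:
    """Duration of a single sample in microseconds."""
    return 1_000_000 / sample_rate_hz


def seconds_to_sample(t_seconds: float, sample_rate_hz: int = SAMPLE_RATE_HZ) -> int:
    """Snap a timeline position in seconds to the nearest sample index (hypothetical helper)."""
    return round(t_seconds * sample_rate_hz)


if __name__ == "__main__":
    # One sample at 48 kHz lasts a little under 21 microseconds.
    print(f"{sample_period_us(SAMPLE_RATE_HZ):.2f} us per sample")
    print(seconds_to_sample(1.5), "samples at 1.5 s")
```

Placing anything on the timeline then comes down to rounding its position to a whole sample index, which is where the roughly 21 µs resolution comes from.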
As promised, here are the major features I would like to include before we launch the service:
  • Infinite Persistent Undo. Undo all your actions as far back as you like, even between sessions. Nothing is lost.
  • Q-Grids. Draw a Q-Grid in your project to split a time segment into beats and bars. Using Q-Grids, you can quantize track positions and lengths over only a part of your song, allowing you to smoothly change between different time signatures and BPM speeds within one song when building your rhythms.
  • Autorepeat. Drag the right side of a track to repeat its content. This saves you from copy-pasting the track over and over again.
  • Powerful Timestretching. Squeeze and stretch your tracks in time to fit speeds. Transpose pitch to harmonize. Thanks to the Rubberband timestretching library, your results will always sound perfect.
  • Sleek Interface. We know that creative juices flow best when the application's visual design follows an appealing aesthetic. This is why the interface has been deliberately designed to be simple and clutter-free, impressing with fluid motions and an attractive finish.
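Since Q-Grids are not implemented yet, the following is only a guess at how quantizing against one could work: a grid covers a time segment in seconds, splits it into bars and beats, and snaps positions inside the segment to the nearest beat line. The class name, fields and methods are all assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class QGrid:
    """Hypothetical Q-Grid: a time segment split evenly into bars and beats."""
    start: float        # segment start in seconds
    end: float          # segment end in seconds
    bars: int           # number of bars in the segment
    beats_per_bar: int  # e.g. 4 for 4/4, 3 for 3/4

    @property
    def beat_length(self) -> float:
        """Length of one beat in seconds."""
        return (self.end - self.start) / (self.bars * self.beats_per_bar)

    @property
    def bpm(self) -> float:
        """Tempo implied by the grid, derived from seconds rather than stored."""
        return 60.0 / self.beat_length

    def quantize(self, t: float) -> float:
        """Snap t to the nearest beat line; positions outside the grid stay untouched."""
        if not (self.start <= t <= self.end):
            return t
        steps = round((t - self.start) / self.beat_length)
        return self.start + steps * self.beat_length


if __name__ == "__main__":
    grid = QGrid(start=10.0, end=18.0, bars=4, beats_per_bar=4)
    print(grid.bpm)             # tempo of this segment only
    print(grid.quantize(11.2))  # inside the grid: snapped to a beat
    print(grid.quantize(5.0))   # outside the grid: unchanged
```

Because each grid only affects its own segment, two grids with different bar counts or lengths can sit next to each other in one song, which matches the idea of changing time signature and BPM along the way.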
There are of course a few minor features that have to be added as well before the release:
  • Handles for tracks, play cursor and loop borders
  • Colorizing and naming for tracks
  • Clone track
  • Cut/join tracks
  • Save mixdown
  • Select active project
I think that's about it.

Tuesday, November 25, 2008

Super Mighty Morphin' Power Halebopp

After a heated discussion yesterday on the KVR audio forum about a functional programming approach to updating audio, with the outcome that traditional DSP methods still remain unmatched in speed, I decided to bite the sour apple and follow the one principle which has proven to work for me most of the time: "come on, M*therf*er, let's do this shit!". In this case, the maternal copulator I am referring to would be my computer.

So, as it looks, I invented a novel approach to generating audio on the fly. It doesn't work well for live input, which is, however, not the primary focus of the application. My approach is more suited for handling audio which may require intensive processing power, such as timestretching or other FFT effects. It also works very well for doing non-linear audio effects such as reversing a segment in time.

This is how it works: whenever the song changes, e.g. when a track is dragged around, the space that the track occupies is marked as invalid - it needs to be updated. The final mixdown is split up into blocks of 64 KB. A thread in the background generates a job for each block which needs updating (I know what you are thinking - we need a background thread for the economy!).

On each iteration of the thread, a few blocks are processed. Blocks ahead of the currently playing part of the song are preferred, so the change in the song is immediately audible to the artist. In the screenshot above, the darker parts mark invalidated blocks which are in the process of updating.
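For the curious, the block bookkeeping described above could be sketched roughly like this. It is a simplified illustration and not the actual Halebopp code: the function names are invented, and the block size is assumed to mean 64 KiB of mixdown data.

```python
BLOCK_SIZE = 64 * 1024  # bytes per mixdown block (assumption: "64 KB" means 64 KiB)


def blocks_for_range(start_byte: int, end_byte: int) -> range:
    """Indices of all blocks touched by the half-open byte range [start_byte, end_byte)."""
    if end_byte <= start_byte:
        return range(0)
    return range(start_byte // BLOCK_SIZE, (end_byte - 1) // BLOCK_SIZE + 1)


def order_jobs(invalid: set[int], playing_block: int) -> list[int]:
    """Invalid blocks at or ahead of the play cursor first, then the ones behind it."""
    ahead = sorted(b for b in invalid if b >= playing_block)
    behind = sorted(b for b in invalid if b < playing_block)
    return ahead + behind


def next_batch(invalid: set[int], playing_block: int, count: int) -> list[int]:
    """The few blocks one iteration of the background thread would work on."""
    return order_jobs(invalid, playing_block)[:count]


if __name__ == "__main__":
    invalid: set[int] = set()
    # A dragged track used to occupy one region and now occupies another.
    invalid.update(blocks_for_range(70_000, 200_000))
    invalid.update(blocks_for_range(400_000, 470_000))
    print(sorted(invalid))
    print(next_batch(invalid, playing_block=4, count=2))
```

The ordering is the only clever part: whatever the artist is about to hear gets rendered first, and everything behind the cursor can wait.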

The computer uses full CPU power when mixing audio, and thus finishes as fast as possible. When all data has been updated, no CPU power is being used. So just listening to your audio will never require much power.

Block dependencies are not implemented yet, because I have no feature demanding them. Whenever a block that depends on previous or future blocks needs updating, those blocks have to be rendered first. But that shouldn't be hard to do once the requirement pops up.
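One way such dependencies might be handled, purely as a thought experiment since none of this exists yet: walk the dependencies of a block depth-first and emit each invalid block only after the blocks it depends on. The dependency table and the function name are hypothetical, and the sketch assumes dependencies do not form cycles.

```python
def render_order(block: int, deps: dict[int, list[int]], invalid: set[int]) -> list[int]:
    """Order in which to render `block` so that its invalid dependencies come first.

    deps maps a block index to the blocks it reads from (hypothetical table).
    Blocks that are still valid are skipped, since their data can be reused.
    """
    order: list[int] = []
    seen: set[int] = set()

    def visit(b: int) -> None:
        if b in seen or b not in invalid:
            return
        seen.add(b)  # also guards against revisiting shared dependencies
        for dependency in deps.get(b, []):
            visit(dependency)
        order.append(b)  # emitted only after everything it depends on

    visit(block)
    return order


if __name__ == "__main__":
    # Block 3 reads from its neighbours 2 and 4; block 4 reads from block 5.
    deps = {3: [2, 4], 4: [5]}
    print(render_order(3, deps, invalid={2, 3, 4, 5}))
    print(render_order(3, deps, invalid={3, 4}))
```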

On a sidenote, I added wallpaper support. The wallpaper scrolls and zooms with the viewport, and makes editing a bit more creative and interesting.

Wednesday, November 19, 2008

Teenage Mutant Hero Halebopp

Just as I was about to write this article, I began to feel awfully weak, which certainly can be attributed to the late hour. Nevertheless, a rescue plan suggests the consumption of an invigorating beverage: Earl Grey, hot. I shall return shortly to finish this blog post which I so cunningly began.

Alas, as the divine beverage takes its well deserved time to draw, I shall lay out here, once and for all, my Halebopp master plan.

That shitty way of talking tires me, so I'm going to regress to my usual lobotomized way of talking English, or what I mistake for it.

Halebopp is coming along nicely. In a steady rhythm, I add new features as a proof of concept, code myself into a corner, and then rewrite them nicely to go on with my work. You wouldn't believe how much pondering and office chair squeaking such a serious endeavor requires. Coding a new approach to audio processing is like tiptoeing through a trap-ridden Inca temple, without actually triggering any traps. Still, one wrong step, and you find yourself punctured by poison darts.

Okay, I like to pretend that it is that way, but the only thing my voyage has in common with the Indiana Jones experience described above is the pained moaning as I calculate the pros and cons of different approaches, their inflicted meanings and future connotations, the impending destruction of all mankind, blah blah, the usual. I am still learning to fly this thing, and I hope that I don't end up in a commerce tower - of shitty spaghetti code.

So, finally, here is the plan. Halebopp's first release is going to be for both Windows and Linux. It will be a minimal functional version, allowing you to import and record waves, arrange them neatly and mix down a final result. This should give you something useful enough to consider supporting its development in the future.

The Windows version is pretty much a release for musicians who just want to get going. Install it and you are set. This comfy Halebopp release will be available for a low budget price, which also includes 30 days of support.

The Linux version is going to be a libre, GPL-licensed source code release which may or may not run on your favorite distribution - mine is Ubuntu 8.10 *wink* *wink* *nudge* *nudge*. A 6-month support package for the Linux version will be made available, but it's going to be a bit more expensive than the Windows 30-day option. You also have the option to contribute in kind, which includes patches, extension proposals, implementations or selling your virgin nerd bodies.

It hereby be proclaimed that both versions shall be comparable not only in features but also in performance. User and development documentation will be made available online.

It will certainly be a few weeks until Halebopp launches, along with its own tiny website and demonstration videos, so don't get too excited yet. It's quite a path ahead, especially because I have to do everything on my own. If any of you fabulous people see yourselves fit to help out with anything in particular, please do not hesitate to force your labor upon me.

Alakazamm!

Wednesday, November 5, 2008

Halebopp Progress Report

Time has passed. A new president has been elected. I need to shave. And Halebopp has progressed since the last time I wrote about it.

On the front end side, panning with the middle mouse button and zooming with the scroll wheel are now possible. Audio tracks sport a nifty amplitude preview. Tracks can be dragged around.

Regarding the backend, Halebopp now fully supports stereo recordings. There is a method that supports loading FLAC and WAV files via libsndfile, but there is no GUI for it yet.

On a sidenote, I rewrote the sound engine. I'm trying something nobody has tried before, which is applying a data model for editing large images to working on a piece of music. Audio sequencers usually focus on mixing audio in "real time", that is: the data is being prepared, processed and finalized on the fly, as the sound is being output. The advantages are low-latency processing of recorded input and low memory consumption. The disadvantages are high CPU consumption and the inability to apply time-domain effects like reversed reverbs or timestretching.

Rather than generating audio data in real time, I render all data "offline" in the main thread into a large chunk of memory (which maps the complete song), and determine what needs to be rerendered based on invalidated ranges in that buffer. This is a model that is stupidly easy to handle from a programmer's perspective. It limits CPU consumption to the situations where data needs to be rerendered. If the song does not change, there is virtually no CPU consumption, as the completely cached song is directly streamed. Skipping around in the track is equivalent to scrubbing in an MP3 file: there is no restarting of effects, everything sounds exactly the same every time you play it. Mixing down the song to disk simply means dumping the contents from memory, which takes next to no time. And the rendering process is multicore friendly.
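To give an idea of how little code the invalidated-range model needs, here is a toy sketch. It uses a plain Python list as the song buffer and made-up function names, so take it as an illustration of the idea rather than the engine itself.

```python
from typing import Callable

Range = tuple[int, int]  # half-open sample range [start, end)


def invalidate(ranges: list[Range], start: int, end: int) -> list[Range]:
    """Add [start, end) to the invalid ranges, merging anything that overlaps or touches."""
    merged: list[Range] = []
    for s, e in sorted(ranges + [(start, end)]):
        if merged and s <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], e))
        else:
            merged.append((s, e))
    return merged


def rerender(buffer: list[float], ranges: list[Range],
             render_sample: Callable[[int], float]) -> None:
    """Recompute only the invalid samples; the rest of the cached song stays as it is."""
    for s, e in ranges:
        for i in range(s, e):
            buffer[i] = render_sample(i)


if __name__ == "__main__":
    song = [0.0] * 8                 # stand-in for the buffer mapping the complete song
    dirty: list[Range] = []
    dirty = invalidate(dirty, 2, 4)  # a track was moved away from here...
    dirty = invalidate(dirty, 3, 6)  # ...to here, overlapping its old position
    rerender(song, dirty, render_sample=lambda i: float(i))
    print(dirty)
    print(song)
```

Playback just reads from the buffer, and a mixdown is the same buffer written to disk; only the `rerender` step ever costs CPU time.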

The challenge is still to keep rendering times short, well distributed and well segmented, while not using too much memory - the usual issues. However, handling offline processing is much easier than generating audio on the fly, since you can access data randomly across the entire timeline, which gives you a bigger range of possibilities and allows you to write code in a fairly straightforward manner.