Mstation.org

Linux Audio Programmer Paul Davis talks about his collection of apps ...

Paul Davis has put a lot into the Linux audio scene. He has written a number of apps, libraries, and drivers such as the ALSA driver for the RME Hammerfall. Here we chat to him about aspects of Linux audio programming.

related links:

lad
las
quasimodo
ardour
softwerk
Philadelphia

M station: Having said that I'd ask about Ardour and Quasimodo, perhaps we could lead in with advice to kids who think they'd like to do audio programming in a comprehensive sort of way. I'd say first learn a language (I'd go for C or C++), then read some books and some code. Do you agree? Which books would you recommend? I guess you'd probably side with C++ as the language. Do you think object hierarchies are easier to suss out than function libraries?

Paul: Well, it depends a little bit on what kind of audio programming. If you want to write plugins (LADSPA, VST, whatever), then it's probably enough to learn a language (and yes, C and C++ are the obvious choices because of speed and integration into existing systems), read any of the textbooks on DSP (most of them have something good to offer), and get coding. The way plugins work means that you can focus almost 100% on your DSP algorithm(s) and experiment much more easily than if you were writing an entire application.

On the other hand, if you want to design more elaborate software (by analogy with plugins, let's call them "hosts"), then knowing a language is only half the challenge. You need good system design skills. I've never thought of myself as a particularly brilliant low-level coder - I still have to use bc(1) to figure out binary<->hex conversions - but I think that the skills I have acquired designing and then implementing medium to large scale systems are extremely useful. You can't get that from a book, but if you could, then the bible of contemporary OOP programmers, "Design Patterns", is a good place to start.

For host systems, I (obviously) do prefer C++. I find it very useful to have the compiler take care of a lot of the stuff I would have to do by hand in C, and I also like the enforced encapsulation and public/private features that the language brings to my code. I don't think that OOP necessarily makes it easy for a non-author to understand the code, but I do feel that it makes it easier for me (the author) to keep the structure in my head and manipulate it. I find class hierarchies more plastic and malleable; function libraries work for me when the semantics of each function are simple (such as the basic POSIX API), but not as a way of wrapping up complex internals.

Finally, you also need to really understand what real-time programming means, grounded in the details of whatever OS you're working with. For Linux, that means having a firm grasp of the relationship between the kernel and an application, POSIX RT scheduling, the poll(2) system call, how virtual memory and malloc work, and many other "advanced" topics.

> On the plugin side, beginners might be confused at first about where a host ends and their plugin begins ... the definition of a plugin. I guess the best place to start there is the docs for LADSPA or whatever.

In the context we're discussing, a plugin is a piece of code that either generates or processes data (i.e. audio samples). It doesn't talk to an audio interface, or any of that complex stuff, and it's not an entire application. A host is a complete application that *does* talk to an audio interface. It can load plugins, organize them into processing networks, and cause them to do their thing at the right times, feeding them data from the audio interface and passing their results back to it. The host is a *vastly* more complex program than a plugin, at least in practice.
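To make the plugin/host division concrete, here is a minimal sketch of a gain plugin in the shape of the LADSPA C API. The descriptor fields and callbacks follow ladspa.h, but the UniqueID is made up and port range hints, activate/deactivate, and error handling are all omitted for brevity - an illustration, not production code.

    /* Minimal LADSPA-style gain plugin: the host connects buffers to our
       ports and calls run(); the plugin never touches the audio interface. */
    #include <ladspa.h>
    #include <stdlib.h>

    enum { PORT_GAIN, PORT_INPUT, PORT_OUTPUT, NUM_PORTS };

    typedef struct {
        LADSPA_Data *ports[NUM_PORTS];   /* buffers supplied by the host */
    } Gain;

    static LADSPA_Handle instantiate(const LADSPA_Descriptor *d,
                                     unsigned long sample_rate)
    {
        (void)d; (void)sample_rate;
        return calloc(1, sizeof(Gain));
    }

    static void connect_port(LADSPA_Handle h, unsigned long port,
                             LADSPA_Data *buf)
    {
        ((Gain *)h)->ports[port] = buf;
    }

    /* The DSP itself - everything else is plumbing for the host. */
    static void run(LADSPA_Handle h, unsigned long nframes)
    {
        Gain *g = (Gain *)h;
        LADSPA_Data gain = *g->ports[PORT_GAIN];
        for (unsigned long i = 0; i < nframes; i++)
            g->ports[PORT_OUTPUT][i] = g->ports[PORT_INPUT][i] * gain;
    }

    static void cleanup(LADSPA_Handle h) { free(h); }

    static const LADSPA_PortDescriptor port_descriptors[NUM_PORTS] = {
        LADSPA_PORT_INPUT  | LADSPA_PORT_CONTROL,  /* gain */
        LADSPA_PORT_INPUT  | LADSPA_PORT_AUDIO,    /* in   */
        LADSPA_PORT_OUTPUT | LADSPA_PORT_AUDIO,    /* out  */
    };
    static const char * const port_names[NUM_PORTS] =
        { "Gain", "Input", "Output" };

    static const LADSPA_Descriptor descriptor = {
        .UniqueID = 9999,            /* hypothetical, not a registered ID */
        .Label = "simple_gain",
        .Name = "Simple Gain (sketch)",
        .Maker = "example",
        .Copyright = "None",
        .PortCount = NUM_PORTS,
        .PortDescriptors = port_descriptors,
        .PortNames = port_names,
        /* PortRangeHints omitted for brevity; real plugins provide them */
        .instantiate = instantiate,
        .connect_port = connect_port,
        .run = run,
        .cleanup = cleanup,
    };

    /* The single entry point a LADSPA plugin library exports. */
    const LADSPA_Descriptor *ladspa_descriptor(unsigned long index)
    {
        return index == 0 ? &descriptor : NULL;
    }

A host would typically dlopen(3) the shared object, look up ladspa_descriptor(), connect a buffer to each port, and call run() once per processing block - which is why, as Paul says, the plugin author can stay focused on the DSP.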
Theoretically, a plugin could do anything, but in the real world they implement things like delay lines, waveshapers, companders, multiband EQs and so on.

> What I was trying to get at was that where a person has limited knowledge and thinks writing a plugin is a good idea, they might have some difficulty working out what they can go ahead with ... but I guess it would be true to say that if you don't know yet, you need to read more.

> Ardour and Quasimodo - could you say a little about their genesis?

Quasimodo: I would best refer the reader to quasimodo.org/intro.html, which describes where Quasimodo came from.

Ardour: I started hearing about the RME Hammerfall on various mailing lists I was on. It sounded pretty amazing. In November of 1999, I decided to buy one, and proceeded to write the ALSA low-level driver for it. Once that was working, it became clear that no existing Linux audio software could really use the card, and moreover, no existing Linux audio software was really up to the task of use in a professional/commercial recording studio. In fact, the situation was even bleaker than that: it was hard to find any Linux program that could even handle the idea of a multichannel card, or a card with more than 16 bits of sample precision. So I decided that it was time to fill this gap.

At the same time, I was getting into an association with a friend who owns a commercial studio, and we wanted to try to use Linux and open source software there. We have 3 Alesis M20 ADAT recorders (about $5K each), and the first goal was to produce some software that could replace the tape machines. That goal was satisfied relatively quickly, but it became clear that doing so was actually fairly useless for our purposes, since you couldn't do anything with the result except play it back as-is: no existing Linux audio editor could handle 24 channels at 500MB+ per channel. After Bill Schottstaedt did some preliminary and excellent work to see if his editor "snd" could do this, I reluctantly came to the conclusion that it was impossible to retrofit this kind of operation into a program based around 16 bit stereo audio interfaces.

Ardour's scope then expanded to the point where its "only" goal is to do everything that systems like Digidesign's ProTools, Sek'd's Samplitude and Emagic's Logic can do, and preferably more.

> What do you think of MPEG4 and such things as SAOL? Have you had a play with it at all?

I have no opinion on most of MPEG4. MPEG4-SA (Structured Audio) is the part that involves SAOL. I think that Eric (Scheirer) did two great things with SAOL: first, he did some rationalization of the Csound orchestra language, and second, he worked very hard at getting SAOL included as part of the MPEG4 standard. However, I regret that the language that has been included is so old-fashioned - both SAOL and Csound (examples of each are sometimes indistinguishable from one another) are based on the kinds of programming languages that were in use in the 1970s and early 1980s. In addition to the general syntactic structure of the language, Eric retained a feature of Csound that I think is a real problem - the distinction between audio signals and control signals. In most hardware-based systems I can think of, the really cool stuff tends to be the systems that allow you to mix and match such signals to see what they do.
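For readers who haven't met the distinction Paul objects to: in Csound and SAOL, control-rate ("k-rate") signals update once per control period, while audio-rate ("a-rate") signals update every sample, and the language's type system keeps the two apart. A rough C sketch of the mechanism (all names here are illustrative, not from any real API):

    /* Sketch of the audio-rate / control-rate split in Music N style
       languages. Control values change once per control period; audio
       samples change on every sample. */
    #define KSMPS 64  /* audio samples per control period */

    /* k-rate: computed once per KSMPS samples, e.g. an envelope */
    static float env_step(float value, float target, float rate)
    {
        return value + (target - value) * rate;
    }

    /* a-rate: computed for every sample in the period; the control
       value is frozen for the whole block */
    static void gain_block(const float *in, float *out, float kgain)
    {
        for (int i = 0; i < KSMPS; i++)
            out[i] = in[i] * kgain;
    }

    /* One control cycle: update controls once, then render audio */
    static void kcycle(const float *in, float *out, float *kenv,
                       float target, float rate)
    {
        *kenv = env_step(*kenv, target, rate);
        gain_block(in, out, *kenv);
    }

Feeding an audio-rate signal where a control-rate one is expected (or vice versa) is exactly the "mix and match" the type distinction forbids.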
This may seem like a nitpick, but this is a language that by several accounts is going to end up being poured into silicon and distributed as part of mass-market consumer items. Barry Vercoe's original introduction of the audio/control distinction was a brilliant addition to the Music N languages, but I firmly believe that it's wrong to retain it. To try to be a little more positive, I would have been much happier to have seen something like James McCartney's SuperCollider language be adopted. However, SC is not "close to the machine" in any way, and that would have violated one of Eric's main goals in designing SAOL.

I've tried using the sfront SAOL-to-C compiler that John Lazzaro and friends have been working on. It's an impressive piece of work, though my opinion is sullied by the fundamental problems I have with the language.

> In relation to writing Quasimodo to use a dual CPU setup, is there a huge difference in approach in writing for multiprocessor, or is God in the details? Do you actually code at the processor level or does the compiler handle most of that?

God is mostly in the details. There are several key things to take into account. The most significant is that more or less all X Window based GUI toolkits are not naturally multithreaded. This means that you must choose between an approach that uses explicit locks around every call to a toolkit function and one that ensures that only a single thread is responsible for making those calls. The second most important issue is that the thread that handles data movement to and from the audio interface must be "real time" (in the sense that it has to meet deadlines based on an external clock ticking). In practical terms, this means that this thread should avoid, more or less 100% of the time, any system calls, any calls to malloc or free, any thread synchronization operations (mutexes, condition variables) and so on.

> Is Quasimodo being used mostly to produce 'academic' computer music? As an analogue synth it strikes me that it could be used in all kinds of music.

To be fair, Quasimodo isn't being used to produce any music right now. But its design is not intended to be limited to any particular musical form. I generally use it as an FX processor.

> Those of us on l-a-d were witness to just how hard it was to get those streaming rates up high enough to cope with all the data. I gather it's worked out very well in the end ... did you end up using any of the low-latency hacks to help achieve this or are you using a standard kernel?

The low latency hacks aren't necessary for the data streaming, but they are absolutely vital to get the thread that handles the i/o to the audio interface to work reliably. I always run a low latency kernel - right now, I use 2.4.0 patched with Andrew Morton's (excellent) "lowish" patch, and I keep meaning to move on to 2.4.1, which appears to be even better. The only thing that matters for the data streaming is the overall performance of the disk subsystem. The thing that I am happiest about is that I managed to get the desired performance without using a special filesystem and with standard audio files (Ardour currently uses RIFF/WAVE format audio files with IEEE 32 bit floating point data). Despite the fact that this is Linux, many of the guidelines for audio programs on Windows and the Mac still apply: shut down unnecessary applications, use a dedicated disk, etc.
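A minimal sketch of what making that audio i/o thread "real time" typically involves on Linux: POSIX SCHED_FIFO scheduling plus locked memory, so that neither the scheduler nor a page fault can make the thread miss a deadline. Here audio_io_thread is a hypothetical stand-in for the code that services the audio interface; a real program would handle errors and privileges more carefully.

    /* Sketch: give the audio i/o thread POSIX real-time scheduling and
       pin the process's memory, per the constraints described above. */
    #include <pthread.h>
    #include <sched.h>
    #include <sys/mman.h>
    #include <stdio.h>

    /* Hypothetical thread body that services the audio interface.
       Inside it: no malloc/free, no mutexes, almost no system calls. */
    extern void *audio_io_thread(void *arg);

    int start_audio_thread(pthread_t *thread)
    {
        /* Lock all current and future pages: a page fault in the
           audio thread would mean a missed deadline. */
        if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
            perror("mlockall");

        pthread_attr_t attr;
        struct sched_param param;

        pthread_attr_init(&attr);
        pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);
        pthread_attr_setschedpolicy(&attr, SCHED_FIFO);
        param.sched_priority = sched_get_priority_max(SCHED_FIFO) - 1;
        pthread_attr_setschedparam(&attr, &param);

        /* SCHED_FIFO needs root on kernels of this era; fall back to
           an ordinary thread rather than refusing to run at all. */
        if (pthread_create(thread, &attr, audio_io_thread, NULL) != 0)
            return pthread_create(thread, NULL, audio_io_thread, NULL);
        return 0;
    }

Even with this in place, the stock kernel can hold the CPU for too long inside system calls, which is why the low-latency patches Paul mentions matter for reliability.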
That said, I tend to run Ardour with Netscape, several rxvt's, xosview, and a huge emacs process all active, and it's fine (though I have a dual CPU system).

> What is the status of the inbuilt editor now?

Well, it's there. But I think the design needs a drastic redoing. When I started, I was working from models like snd and sweep and Sound Forge, programs that work fundamentally at the waveform level. It turns out that this is inappropriate for "arranging" pieces of music. I was very resistant to the ProTools "Region" model (known as "Objects" in Samplitude), but as time has gone on, I have come to understand why PT (and most other similar tools) use it. It won't take a big reworking of the internals to switch to such a model, but the GUI will need quite a lot of changes. Being able to drag rectangular regions around is non-trivial in X, for example.

> Pro Tools is certainly a nice functionality set to aim at. There's a few man-years of work in there! How far along the road do you think you are with Ardour?

It's hard to say precisely. We have no MIDI recording/playback at this point at all, though the plan is to incorporate another GPL project, probably MidiMountain, into Ardour itself. I would say that if the target is ProTools 5.0 (TDM-based) without MIDI or automation, Ardour is about 70% of the way there. The recording/playback aspects of the program are more like 99% there, the mixer parts are about 85% there, and the editing parts are more like 50% or less. We also have a few features that PT doesn't have, such as per-track looping, and an implementation of Bill Gribble's excellent idea for metering clipping. Automation work has just started recently, though it turns out that it started off as a meander down the wrong track.

> What's the future for Ardour?

I am having to consider ways of making money from what I do. The way that my divorce settlement has worked out doesn't leave me in any immediate problems, but combined with the recent stock market situation in the US, my original thoughts of "never having to work for money again" have changed to something more like "should maybe consult from time to time to avoid drawing down capital". This is a challenge. I have no desire to get involved in "support", and I don't know of any business models for GPL'ed software. It's an interesting problem. Once Ardour gets closer to ProTools functionality, I think that some heads will start turning (perhaps even rolling), and it might be easier to convince h/w manufacturers to start paying attention to what I and the rest of the Linux audio+MIDI software community are doing. That might be an interesting avenue to take towards a revenue model for this kind of stuff. Imagine that you buy an audio interface and a control surface and get Ardour "for free". But that, and other possibilities, really do hinge on getting Ardour "finished".

> Thanks for your time, Paul.
