Saturday, January 30, 2010

Quadratic Bézier Envelopes with Linear X

While researching for ways to provide a nice way to create curved animation envelopes, I produced a classic f(x) -> y function that can resolve a quadratic bézier curve.

Here's the code:

def bezier3lx(p0,p1,p2,x):
    """
    For each X-coordinate of a quadratic bezier curve with end points
    [p0,p2] and control point [p1], return the Y-coordinate.
    
    The classic quadratic bezier function is two-dimensional, and resolves
    a non-linear t scalar into the resulting point X,Y. This method, however,
    restructures the original function so that X is the input, t is an
    intermediate value resolved from X and all points (using a binomic
    formula, IIRC), and Y is then produced from t.
    
    This makes this function suitable for being used in envelopes with
    fixed time indices, given that all points are horizontally ordered, 
    and do not share the same X position.
    """
    
    # points must be in order
    assert p1[0] >= p0[0]
    assert p2[0] >= p1[0]
    
    # clamp x to range
    x = min(max(x, p0[0]), p2[0])
    
    # can rarely fail because of floating point irregularities
    #assert (x >= p0[0]) and (x <= p2[0]), (x,p0,p2)
    
    # now for some math
    
    A = p0[0] + p2[0] - (2.0*p1[0]) + 0.0001
    
    B = 2.0*(p1[0] - p0[0])
    C = p0[0] - x
    N = (B*B) - (4.0*A*C)
    if N < 0:
        print "N < 0 for",x
        t = 0.5
    else:
        t0 = (math.sqrt(N) - B) / (2.0*A)
        t1 = (- math.sqrt(N) - B) / (2.0*A)
        t = t0
    
    s = (1.0-t)
    f0 = s*s
    f1 = 2.0*s*t
    f2 = t*t
    
    return (f0*p0[1]) + (f1*p1[1]) + (f2*p2[1])

Tuesday, January 19, 2010

Some Thoughts on a Visual Programming System

Some aspects that i want to see reflected in the system:

  • flat, 2D
  • intuitive, graphical thinking
  • teamwork: work can be shared and collaboratively worked on
  • game-like
  • real world metaphors
  • target group: artists, game designers
  • looks cool so it can be used for performances
  • linux-centric


I want to make use of physics simulation to indicate relations and to let the system appear alive and natural.

I want to find out if basing relations on proximity (wave propagation) and "puzzle pieces" (emitters & receptors) is more efficient and helpful than having to indicate each relation explicitly through a flow graph.

To make international collaboration easier, I want to avoid the use of text, as it always comes with language, which often differs between geographical locations. E.g. not everyone speaks english, which is strongly the case in asian countries. A symbolic language has to be found that can be commonly understood in all countries.

A second case against the use of text is the complexities of text rendering and formatting, which I want to avoid for practical reasons.

A third argument is against the use of text for documentation. Having the opportunity to explain a concept simply by textual description might allow the designer to compromise in areas where images would have been clearer, by falling back to textual descriptions.

A fourth argument is against the use of text for labels because it pulls the programmer out of her intuitive mental model of the problem - when labels have to be made up while a solution to a problem has to be found, programmers often resort to abbreviations and foobar names to get it done, which poorly describe the program.

Thursday, May 14, 2009

Small update...

...I'm not dead. I'm still working on the whole thing.

The current version of Halebopp is dead, but I learned a lot:
  • Clutter is not as extraordinary it seems. Sure, it renders everything in GL, but it doesn't give you much freedom over its widgets. All coordinates are stored as integers to support embedded targets better, which mades a zoomable UI a pain to implement.
  • Writing your own alternative to Clutter is no fun, especially when writing a music editor is your main goal.
  • I hate traditional widgets, windowed layouts, text and cursors. I love multi-touch, images, zoomable interfaces, animations. I want to be able to control everything with my fingers. I need to reinvent the interface here. What I dream of does not exist yet. The subprojects name is "Mjoo" (A revived old project of mine).
  • Offline audio rendering is an optimization, not a paradigm. It gets too complicated, especially if you want (some) live input support. I'm better off rendering DSP in the traditional way, with a bit of mix-ahead cache magic to use up free CPU cycles.
  • I hate writing UI code directly. I love to set up simple databindings and static/functional structure. I explored that concept in Holygrail (another subproject), and it went pretty beautiful. So I actually have an idea what I'm doing here.
  • Nobody needs components in Python. Simple enumeration of modules and auto-import is fine. Anything else becomes self-serving rather quickly.
  • Function decorators in Python are mighty.
  • The logging module is good enough for any log output job.
  • SQLite is too low-level to be used directly. Writing your own wrapper doesn't cut it. SQLAlchemy does the trick - for any kind of SQL RDBMS.
  • SQL blobs suck for asynchronous R/W. Use files and store only references in your DB.
  • I hate trees. I love flat lists and dictionaries.
  • Python can't do DSP work efficiently and never will.
  • Python can't do multithreading efficiently and never will. Use multiple processes and RPC.
  • Boost.Python is a friend to keep, but never wrap anything not especially written for it.
  • SCons remains a great build system, as long as you don't get creative; stick to best practice.
  • It's impossible to track DB changes directly from multiple processes using the DB file alone. I assume best is some kind of RPC document server doing all transactions and providing changesets to clients - like Verse for 3D editors.
  • C++ is a dish best served stupid. Stick to the STL, no clever tricks, especially if you plan to expose stuff to Python.
  • Twisted is a great framework for writing anything socket related.
  • SndObj has nice processing plugins and Python bindings.
  • Python SWIG bindings can be understood by Boost.Python.
  • Don't do anything nobody asked for. No docs, no tests, no website, no mailing list until there is actually something that you use yourself productively. Community building is for later.
  • Mercurial is fun to work with, so is Eclipse and it's numerous plugins, including PyDev.
  • Top down development doesn't work for science and experiments. Don't make big plans.

I'm currently working on the second revision of Halebopp. Right now the (small) plan is (in this order) to get the document server working, write a player client, write an editing frontend.

I'm not trying to get revolutionary with the DSP technology this time. I want to support a DSP graph for SndObj objects and, later, a sequencer to script events. Also, something to easily import and record waves.

So long.

Friday, March 27, 2009

User Interface 101: UI Data Binding

The traditional method of writing UI logic is hard. It requires writing an enourmous amount of housekeeping code, such as event handlers and update methods. Non-interactive testing is almost impossible to realize.

I propose a method to bind an user interface to data using only a set of declarations, describing data constraints and UI/data relations. This set is called "binding".

Here are a few applications for bindings:
  • A property view, allowing to select items from a list and change their attributes.
  • A spatial view, allowing to position and resize items on a surface.
  • A complete application made of cascading spatial and property views.
A binding connects a model and a graphical UI in both directions. It can be described in an XML document using a dedicated DTD scheme, obsoleting imperative micromanagement of dialogs.

Both model and UI should be reflective: their properties and elements can be enumerated and queried by the binding. If either model or UI is not reflective, additional XML documents are required, providing information about structure and constraints.

The model can be of any of these following format types:
  • SQL database
  • XML DOM
  • JSON tree
  • Python class tree
  • C structures
The binding will initially support following UI frameworks:
  • GTK+
  • Glade
  • Some canvas (Clutter, GooCanvas?)
A binding supports following model datatypes:
  • Numbers (integers, floats)
  • Strings
  • Lists of mixed types
  • Compounds (structures, classes)
  • References
  • Functions
On the UI side, a binding supports following display and editing methods:
  • Text/numeric entries
  • Combo/choice boxes
  • Radio buttons
  • Check boxes
  • Buttons
  • List views
  • Tree views
  • Sliders
  • Scroll bars
  • Progress bars
  • Canvas items
More will follow.

Sunday, March 15, 2009

User Interface 101: Device Independent User Input Processing

This is the first post of a new series called "User Interface 101", where I try to invent a functional, description-based approach to writing graphical user interfaces bottom-up, devoid of any imperative baggage. The first post deals with one of the basic building blocks of human-computer interaction, user input.

Users operate computers only through input devices. An input device, no matter what shape, can be divided into separate elements, of which each element carries a value and an activity state.

Internally, we are going to represent each input device element as an abstract "input source", which is designed to yield any information a computer program would require to successfully interpret user input, while burdening programmers with device specific interface details only if necessary.

An input source describes a single input device element such as a key on a computer/MIDI keyboard, a button on a mouse or joypad, an IR pointing device, a joystick or a finger on a touch pad. For multi-dimensional input devices (e.g. mouse, pen, touch pad), each input source describes a single dimension.

An input source yields one single (signed integer) value at a time contained within a hardware dependent range r0..r1, where r1 > r0. This enables tracking of both digital key/button strokes/presses/clicks and analog stick/mouse/pen/finger movements for a single dimension.

An input source may also yield an activity state, if supported by the device, which indicates if the input device element is enabled/alive/in use. This enables tracking of caps/scroll/num lock key states, cursor dead zones and retraction of pen/finger/device.

An input source may optionally point to a descriptor containing information that may help to identify the source and interpret the values.

An input source may be tagged as relative or absolute in its descriptor. Relative axes yield delta values. Adding delta values subsequently, one may acquire an absolute value, contained within an unspecified range. This enables tracking of rotary wheels and mouse movements while repositioning the device.

An input source may describe a glyph or command it represents in its descriptor. This enables querying a scan code or button index.

An input source may describe an one to three dimensional position vector in its descriptor, indicating the sources spatial position on the device in meters, relative to either the center or a corner of the device. This enables to map tasks to input sources based on their adjacency and location.

An input source may describe the dimension it operates on in its descriptor, if it is part of a composite source. Zero indicates no specific dimension. This aids with building multi-dimensional values from multiple input sources.

An input source may describe a default resting point or range in its descriptor that the control returns to when not operated by the user. This enables to determine whether a key/mod wheel/button/axis is resting.

Imperative GUI Programming Sucks

I have recently had my eyes on Haskell, and I liked what I saw. I spent about 20 years of my life writing programs in imperative tongue, and the more I knew, the more trivial the solving of problems became, the more I found the language to be obstrusive and low level.

Why do I need to be bothered with rolling out the exact details of a routine, when I have exact logic relationships in my head? Can't I put my understanding of the abstracts in letters and the compiler figures out all the implications and details?

Turns out, functional programming, even if it looks a bit ugly at times, does exactly what I want. Haskell is a functional programming language, where the specification is the program. With such a concept, you theoretically would write even less code than you would in Python, while making even less mistakes.

But this world is not quite ready for functional approaches. All that old bullshit comes back through Haskells imperative back door of monads and I/O operations, and if you do a lot of interfacing, there is right now virtually no difference between Haskell and Python or C++. I had a look at the GTK+ binding for Haskell and it outright sucks. There I am again, writing event handlers and filling GUI's sequentially. The specification is again not in the program, but in my head.

As a GUI programmer, how can I get closer to writing a GUI cleanly from specification, without muddling much with the details?

Since the web is said to be the future, I had a look at different web technologies. There is a functional language in use on the web for quite a time, and it is called XSLT. This XML based descriptive language specifies how an XML can be translated into something else, either another XML or a completely different format. No imperative I/O. Just pattern matching and templates, and the XSL transform does the rest. You can even transform the XSL document itself using XSLT.

You could build a GUI from this. One part of the job, at least, in a primitive way. You could turn the XML into an HTML form using pure XSLT, but there is still the problem of turning user input back into XML. You would also want to update only parts of the source DOM on user input, along with various validation mechanisms. Source data and front end would have to be connected in both ways based on a formal specification. I searched the web up and down for this "two-way XSLT", but found nothing yet.

Is it really possible that in the year 2009 there is no need for a functional approach to binding data and front end? Or is it simply too hard, or silly to think of? I can't believe that. From my perspective, all these different approaches to editing data - buttons, edit fields, lists, trees, knobs, sliders, graphs - boil down to about at most 30 primitive concepts in combination, reimplemented over and over, requesting each and every GUI programmer ever coming along to stupidly assemble every dialog as if it was the first he had ever written. The technology we use today does not scale. It should become easier the more interfaces you do, instead it becomes tedious. Refactoring is a nightmare because GUI tests are way too hard to write.

If I only were able to write down the specifications of an GUI and get my working program right away, testing steps would be self-evident, they could even be automatically determined. Changing a GUI flow would become a lot easier. I would literally simply change connections instead of rewriting dialogs and tests. Somebody told me: everything that is hard to do is worth doing. Screw this guy. This stuff should be trivial. Figuring out the specification should be the hard work, not working like a code monkey.

Imagine how easier experimentation with new paradigms would become if we had enough time to test different ideas for user interfaces. How utterly convenient and elegant these applications would be, regarding both code and usage.

If you know of such a project in the works, please let me know. I am desperate.

Monday, February 9, 2009

The Rendering Graph: What Renders When?

Using the current pipeline for mixing tracks, I tried to write a metronome synth for QGrids, but that was actually way too mindbending to be fun.

To explain: traditional DSP works with sequential rendering, where each samples value kind of depends on the previous one, and intermediate states need to be saved. Since my concept renders audio offline, and only subportions of a clip may change, the architecture requires that any clip can render one buffer of its range (8192 samples) independent of location.

Given this limitation, it is a bit tricky to render e.g. a metronome sound from somewhere in the middle of the clip. I need to calculate what kind of click I currently am at (to determine pitch) and how far I am away from the clicks location (to determine phase and volume). Right now, this is possible, but kind of impractical, given that most of the time, I still know pitch, phase and volume from the previous location.

The concept breaks apart as soon as I employ delay lines and reverbs, which strongly rely on previous states and additional delay buffers. It is still possible to calculate values independently, but it requires to be able to look up input sample values from other locations. Chances are usually big that these locations have been rendered before, and so I would often recalculate stuff that I already know. Another issue is that I would not be able to use traditional DSP code, because it is usually written for linear signal processing.

Three questions emerge, a.) when I have already rendered a part of the pipeline, how can I prevent to render it again? b.) how can I simulate linear signal processing to simplify algorithms? - and since we are at it, c.) how can I neatly parallelize all this?

a.) Transparent bouncing. Whenever a clip renders, its rendered output is usually written to a cache on disk. When the output is needed again, I just read it from the cache. Since disk space is far from being sparse these days, that should usually not be a problem. DONE

b.) Bounce linearly. Render a clip from start to finish in usual block sizes, and store intermediate states like phase, volume, voices, if applicable. DONE

c.) Invalidate dependencies. Fragments of a clip are dependent on other fragments. Root fragments need to be rendered first, then dependent fragments can be rendered. Most of the time, unrelated clips can render in parallel.