Archive for January, 2010

Musical Gem of the Week #1

Posted in Humanities on January 29th, 2010 by Noldorin

Due to the great irregularity of my post frequency, I have decided to bring upon myself the task of a weekly series of posts. Having considered a few potential topics, I came to the conclusion that some posts on particular (extraordinary) pieces of music in my (sizable) collection would be fitting. By extraordinary, I mean both compositions of great quality and those which are well outside of the repertoire of a layman, or even a fan of the genre. As anyone who knows me well enough might guess, I will inevitably be focusing on classical (largely common practice period) music. In fact, now that I consider it, I rather fancy doing this series chronologically.

Philistines, proceed with caution… Experts equally so, perhaps, since I am but a dilettante here!

Unsurprisingly, my first “gem” of the series will be a Baroque piece. I have chosen a little-known piece, the Viola Concerto in G major, written by one Georg Philipp Telemann. Telemann, a contemporary of Johann Sebastian Bach and resident of Germany (the Holy Roman Empire at that time), was in fact much more highly regarded in his day than Bach (whose music was considered to be turgid and old-fashioned), and yet has been far less so since, a fact that rather surprised me upon reading it. He was, however, a most prolific composer, rivalling his other contemporary in Italy, Antonio Vivaldi, in this respect. Despite his fame being somewhat diminished by time, he was without doubt a musician of great talent, the Viola Concerto being among his finest works.

Since I am neither inclined nor qualified to launch into a theoretical discussion of this piece, I hope you will simply enjoy hearing its beauty. Hearing and admiring such a work is, in my opinion, a very personal thing that should only be done through oneself, gradually, and in a holistic way.

1st Movement, Largo

2nd Movement, Allegro

3rd Movement, Andante

4th Movement, Presto

Note: This recording, while quite decent, is not my preferred one, and I have included it mainly because it is Open Audio. My favourite recording of this concerto is actually available free from the Lancashire Sinfonietta website; you just need a quick registration before you can download the complete work.

I still remember the sensation of hearing for the first time a particular phrase near the end of the Allegro (you’ll know which one I mean when you listen). A feeling of recognition, that is, but alas, I cannot remember the source. If anyone could suggest where I might have heard this before, that would much alleviate this ongoing irritation!

Expect (nay, rely upon) my second post in the series next week. Until then, I hope I have given some of you enjoyment in sharing this first piece. Comments and suggestions welcome, as always.

WPF on IRC

Posted in Programming, Software on January 29th, 2010 by Noldorin

Being a long-time member of the ##wpf channel over on the Freenode IRC network, I thought I would bring it to the attention of any readers who are interested in this wonderful technology. Since the post is really only of interest to developers who already know WPF (or are at least keen to learn it), I shall not say anything about the technology except that it is a superb (and very modern) user interface/presentation framework for Windows applications – I strongly recommend it to anyone who develops complex (or even simple) user interfaces as a replacement for whatever library/toolkit they are already using.

The channel is currently the most popular WPF-related channel on any IRC network (of which I know), and sees regular activity, though not quite as much as we hope, hence this post! While the channel averages around 20-30 users at any time, it has not quite seen the growth it deserves as WPF gains more and more popularity. Hence, I urge anyone interested in the subject to at least pop in and check out the channel, perhaps idle around for a while. We’re a friendly lot, I promise, and will be glad to help out any newcomers!

Designs for a Computer Algebra System

Posted in Maths & Science, Programming, Projects, Software on January 8th, 2010 by Noldorin

Creating a modern computer algebra system from the ground up, as has been a plan of mine for some months now, is no trivial goal, as anyone who has even a vague conceptual understanding of computational algebra must surely know. My efforts in this area, grouped under the title of “the Syracuse Project”, have involved mostly research and large amounts of contemplation so far, but I feel that I have finally managed to formulate enough solid ideas that they are worth presenting in a short article. I was also fortunate enough to find someone on IRC (in #math-software on Freenode) with whom I could discuss and refine many of our mutual ideas. The ideas discussed in this post are the culmination of my own thoughts and much of the knowledge I gained from my conversations with Robert Smith (nickname ‘Quadrescence’) online, which has served as the basis for some of my own investigation and explanations here, in a modified form.

What I am going to focus on here falls mainly under the domain of my Euclid.NET project, which is essentially the core of Archimedes, which one might describe as the “CAS kernel”.  Think of Euclid.NET as the framework for symbolic mathematics upon which Archimedes is built. Its primary purpose is to handle such tasks as expression evaluation, simplification, differentiation, series expansion, and so on. (Bonus points if you can spot the connection between the names of Archimedes and Syracuse.)

TeX.NET, which is the primary parser (expression tree builder) for the project at this moment, has already seen a stable release, and will soon see the second with any luck. As you might guess from the name, it takes input in TeX syntax (well, actually LaTeX, with a few extensions).

To begin, I thought I would tackle one of the tougher aspects of computer algebra systems, namely, simplification. More trivial features such as expression evaluation and differentiation are essential to a CAS, yet are hardly worth an in-depth discussion, at least not for now.

So what is simplification?

Simplification, to state the banal, is concerned with making an expression more “simple”. Unlike many everyday terms that have been borrowed by mathematics, “simple” or “simplification” truly does not have a more rigorous definition. The problem here largely stems from the fact that the definition of what is simple is somewhat fluid; it depends to a great extent on the context. To illustrate this nature more concretely, here are a few examples.

  1. (x+1)^2 or x^2 + 2x + 1?
  2. y^{-3.5} or 1/y^{3.5} or 1/(y^3 \sqrt{y})?
  3. \tan x \sin x or \dfrac{\sin^2 x}{\cos x}?
  4. e^{A+(2x-y)} or e^A\left[\dfrac{e^{2x}}{e^{y}}\right]?
  5. 4 \sqrt{x^3 + 6x^2 + 9x} or 4(x + 3) \sqrt{x} ?

Figure: A visualisation of a very simple expression tree. The equivalent infix expression is (2 + 2) + (2 + 2) + (3 + 3). Note that in general such trees are not restricted to be binary.
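To make the picture concrete, here is a minimal sketch in Python of how such a tree might be represented. This is purely illustrative and is not the Euclid.NET data structure; the names `Node` and `evaluate` are my own assumptions, and only `+` and `*` are supported. Note that the root `+` takes three children, showing that nodes need not be binary.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Node:
    """An operator node with any number of children (hypothetical type)."""
    op: str
    children: Tuple["Expr", ...]


# A leaf is a plain number; an inner node is an operator over sub-expressions.
Expr = Union[int, float, Node]


def evaluate(expr: Expr) -> float:
    """Evaluate a tree built only from '+' and '*' nodes and numeric leaves."""
    if not isinstance(expr, Node):
        return expr
    values = [evaluate(child) for child in expr.children]
    if expr.op == "+":
        return sum(values)
    if expr.op == "*":
        result = 1
        for value in values:
            result *= value
        return result
    raise ValueError(f"unsupported operator: {expr.op}")


# (2 + 2) + (2 + 2) + (3 + 3): one ternary '+' over three binary '+' nodes.
tree = Node("+", (Node("+", (2, 2)), Node("+", (2, 2)), Node("+", (3, 3))))
total = evaluate(tree)  # 4 + 4 + 6 = 14
```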

I think that anyone who has had sufficient experience using maths and manipulating many formulae and expressions will realise that one of the given equivalent expressions in 1 to 5 is desired in a certain scenario, while another is desired in a second situation. Hence, any conceivable simplification algorithm cannot be treated as a rigid mechanical process, but rather must adjust itself depending on the parameters it is given, which hint at the sort of result that is desired.

When humans perform simplification of mathematical expressions, they often use so-called intuition, developed from much prior experience, along with trial and error, to (in most cases) quickly and accurately simplify maths. It is inevitable that even the best mathematicians hit dead ends when trying to simplify complicated expressions. A computer, perhaps unsurprisingly, cannot do any better. Moreover, mathematical simplification is, in my view, one of the few aspects of mathematical methodology that overall better suits a holistic intelligence, rather than the traditional sequential one that is most often associated with maths (theorem proving being a notable exception). In fact, mathematics is not quite so logical and all-encompassing as even mathematicians thought not long ago, thanks to the magnificent Incompleteness Theorems proved by Kurt Gödel in the 1930s; this is, however, a subject for another day.

How can we measure simplicity?

Fortunately, and perhaps surprisingly, the field of evolutionary computation presents a rather handy way of treating the “simplicity” of expressions, that is, a fitness function – in its most abstract sense, something that measures the absolute “fitness” of any given solution for a certain optimisation problem (most commonly genetic algorithms, which lent the term “fitness”). The “fitness” may be thought of qualitatively as the value, worth, or suitability of a particular solution. The solutions, in this case, are of course expression trees.

To begin, it is greatly helpful to reduce the problem by extracting a certain (small) number of parameters from the expression tree, rather than trying to analyse the entire thing holistically. This reduces the parameter space (the set of all possible parameters) dramatically, which is typically highly beneficial in optimisation problems. The fitness function itself is allowed to take any form in general, though we shall see shortly that one or two classes of function in particular are desirable. For now, let us just focus on the set of parameters. After some consideration, I came to the conclusion that any effective parameter space must consist of the following variables:

  • Size of the expression tree (i.e. the number of nodes)
  • Height of the expression tree (i.e. the number of layers)
  • Width of the expression tree (i.e. the number of nodes in the bottom layer)
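As a rough sketch of how these three parameters could be extracted, the Python below walks a tree layer by layer. The encoding is an assumption made only for this example: an inner node is a tuple `(operator, child, child, ...)` and anything else is a leaf. The function names `layers` and `measure` are likewise hypothetical.

```python
def layers(expr):
    """Return the tree as a list of layers (lists of nodes), root layer first."""
    result, current = [], [expr]
    while current:
        result.append(current)
        # The children of a tuple node are everything after the operator name.
        current = [child
                   for node in current if isinstance(node, tuple)
                   for child in node[1:]]
    return result


def measure(expr):
    """Return (size, height, width) as defined in the list above."""
    tree_layers = layers(expr)
    size = sum(len(layer) for layer in tree_layers)  # total number of nodes
    height = len(tree_layers)                        # number of layers
    width = len(tree_layers[-1])                     # nodes in the bottom layer
    return size, height, width


# (2 + 2) + (2 + 2) + (3 + 3), with a ternary '+' at the root.
figure_tree = ("+", ("+", 2, 2), ("+", 2, 2), ("+", 3, 3))
print(measure(figure_tree))                      # (10, 3, 6)

# Example 1 from earlier: (x+1)^2 versus x^2 + 2x + 1.
print(measure(("^", ("+", "x", 1), 2)))          # (5, 3, 2)
print(measure(("+", ("^", "x", 2), ("*", 2, "x"), 1)))  # (8, 3, 4)
```

Under this encoding the factored and expanded forms of example 1 share the same height but differ in size and width, which is exactly the kind of difference a fitness function can exploit.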

These are all of course integer values, and thus the value of the function is restricted to lie within the set of integers. After building up an image that measuring simplicity is a tricky thing, this may seem like a rather straightforward framework; indeed, it is in some ways, though it is worth noting that the apparent problems arise when we decide what function to use and when it should be applied. The reasons for the choice of these parameters should become apparent soon.

Let us first consider a basic fitness function that simply weights each parameter individually in a linear combination. In other words, the fitness function, F(S, H, W) may be defined as the following, where S, H, and W correspond to the size, height, and width of the expression tree respectively.

F(S, H, W) = aS + bH + cW

The constants a, b, c may be any integers (positive or negative), and should be passed to the simplifier routine rather than predefined, according to the desired output. It should be quite evident that by choosing different magnitudes as well as signs for these constants, each of the three parameters may be independently rewarded or penalised to greater or lesser extents. The question you might then ask is: why not a more complicated function depending on S, H, and W? My answer is a straightforward one: there is no need. A linear combination of terms gives enough control over the desired function that extending the function to add higher-order or even exponential terms would be quite pointless and arbitrary. I have, however, far from closed my mind on this matter; as I design and test the system progressively, certain discoveries may be made that suggest a slightly different approach.
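The effect of the weights can be sketched in a few lines of Python. Everything here is an illustrative assumption rather than the planned implementation: the function names are hypothetical, higher fitness is taken to mean "preferred", and the (S, H, W) triples were counted by hand for one possible tree encoding of example 1.

```python
def linear_fitness(size, height, width, a, b, c):
    """F(S, H, W) = aS + bH + cW, with integer weights chosen by the caller."""
    return a * size + b * height + c * width


def fittest(candidates, a, b, c):
    """Return the name of the candidate with the highest fitness."""
    return max(candidates,
               key=lambda name: linear_fitness(*candidates[name], a, b, c))


# (S, H, W) counted by hand for one possible tree encoding of example 1.
candidates = {
    "(x+1)^2": (5, 3, 2),
    "x^2 + 2x + 1": (8, 3, 4),
}

compact = fittest(candidates, a=-1, b=-1, c=-1)  # penalise all three parameters
spread = fittest(candidates, a=0, b=-1, c=1)     # reward width, penalise height

print(compact)  # (x+1)^2
print(spread)   # x^2 + 2x + 1
```

With every parameter penalised the factored form wins (fitness -10 against -15), while rewarding width selects the expanded form (fitness 1 against -1). The same pair of expressions thus yields either answer depending only on the constants passed in.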

Indeed, my only other real consideration for a fitness function thus far is one of even more basic form. Suppose, for example, that the fitness function only depended on two things: a) which parameter should be prioritised, b) whether this parameter should be promoted or demoted. The other two parameters would simply be minimised in the case that the first shows no preference between two trees. I have not finalised the implementation of this, but hopefully this brief description should give you an idea.
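One way this second idea might look, sketched in Python under my own assumptions: the prioritised parameter is compared first (promoted or demoted), and ties are broken by minimising the remaining two in a fixed order (size, then height, then width). The tie-break order and all names below are hypothetical, since the design is not finalised.

```python
PARAMS = ("size", "height", "width")


def priority_key(measures, primary, promote):
    """Build a comparison key for one tree; larger keys are 'fitter'."""
    first = measures[primary] if promote else -measures[primary]
    # The other two parameters are always minimised, hence the negation.
    rest = tuple(-measures[p] for p in PARAMS if p != primary)
    return (first,) + rest


def fittest(trees, primary, promote):
    """Return the name of the tree preferred under the given priority."""
    return max(trees,
               key=lambda name: priority_key(trees[name], primary, promote))


# Hand-counted measures for one possible tree encoding of example 1.
trees = {
    "(x+1)^2": {"size": 5, "height": 3, "width": 2},
    "x^2 + 2x + 1": {"size": 8, "height": 3, "width": 4},
}

by_height = fittest(trees, "height", promote=False)  # heights tie, size decides
by_width = fittest(trees, "width", promote=True)

print(by_height)  # (x+1)^2
print(by_width)   # x^2 + 2x + 1
```

Python compares tuples element by element, so the key gives exactly the "first parameter, then the others" ordering without any weights to tune.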

I now leave it as an exercise to the reader to take the five example expressions given in the previous section and consider how any desired result (simplified form) can be achieved through the selection of the appropriate fitness function (using either of the two I have just proposed). Of course, feel free to post a comment regarding any queries or findings you have regarding this matter.

An effective algorithm for simplification

So far, I have discussed how simplicity can be measured in absolute terms, and how in this way the most “simple” of a set of solutions (expression trees) can be chosen as the result of the algorithm. What I have not really mentioned, however, is how an algorithm might actually search out the possible solutions. Although the nature of the algorithm is mainly independent of the fitness function and the evaluation of expression trees, it is helpful to discuss this second so as to give a clearer image of the approach as a whole.

Simplification when done by a computer, as when done by humans, involves at its heart the application of a large number of mathematical rules that transform expressions. To give some examples of a few of the more basic rules:

  • x + 0 \rightarrow x
  • 1 * x \rightarrow x
  • x * x \rightarrow x^2
  • x * (y / x) \rightarrow y
  • x^2 - y^2 \rightarrow (x + y)(x - y)
  • \sin^2 x + \cos^2 x \rightarrow 1
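The first three rules above can be sketched as a small bottom-up rewriter in Python. This is a deliberately naive illustration, not the planned rule engine: it assumes every operator node is a binary tuple `(op, left, right)`, applies each rule in the simplifying direction only, and matches only the exact operand order written in the rules (so `0 + x` is not handled). The function name is hypothetical.

```python
def rewrite_once(expr):
    """Rewrite the children first, then try each rule at this node."""
    if not isinstance(expr, tuple):
        return expr  # leaves (numbers, variable names) are left alone
    op = expr[0]
    left, right = rewrite_once(expr[1]), rewrite_once(expr[2])
    if op == "+" and right == 0:
        return left                # x + 0 -> x
    if op == "*" and left == 1:
        return right               # 1 * x -> x
    if op == "*" and left == right:
        return ("^", left, 2)      # x * x -> x^2
    return (op, left, right)


# (x + 0) * (1 * x) reduces to x^2 in a single bottom-up pass.
result = rewrite_once(("*", ("+", "x", 0), ("*", 1, "x")))
print(result)  # ('^', 'x', 2)
```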

Given a complete set of all simplification rules, we can find a path (or derivation) between any two equivalent expressions. (In theory, this is no issue, since the set is finite, though rather large. The practicality of implementing all the required rules is another issue that I will not go into here.) Note that the rules are bidirectional; they allow you to take a simple expression and sequentially transform it into an arbitrarily complex, yet still equivalent one. (For example, x \rightarrow x + 0 - 0 + 0 - 0 + 0.)

Assuming (quite reasonably) that this assertion that a “finite number of applications of simplification rules can derive any equivalent expression from the original” is valid, we must then consider how the search should proceed. Under the utterly naive brute force approach, the algorithm is clearly non-terminating, but we can do a lot better than this.

The search algorithm is, in essence, all about compromise. If we search every possible derived expression (up to a certain size), then it could take an unreasonable length of time to simplify even relatively tiny expressions. On the other hand, if we make too many assumptions and cut off many branches in the search tree prematurely, the algorithm may terminate quickly, though not necessarily with the simplest solution (or even anything close). Hence, the idea of using genetic algorithms, albeit initially appealing, is in my view too restrictive for a problem that requires a “perfect” answer in most cases, and should not have any stochastic nature.

My currently planned approach is one that does not differ greatly from a simple brute force evaluation of all the simplification paths. The main improvement is one that falls out rather naturally from using a fitness function. The idea is that each node of the search tree is evaluated by the fitness function upon its creation, and if the fitness is below a certain (specified) threshold, the search from that particular node terminates. For a start, this prevents originally small/simple expression trees being transformed into ones that are absurdly large, while still allowing some limited expansion of expression trees in the hope that they may later be simplified very effectively. Many nonsensical (what we might call counter-intuitive) paths of reduction of the expression would also be eliminated in a similar way. There is also one practical problem of particular note here: many disparate simplification paths along the tree do converge to the same solution during a search (some quite quickly), so it would be quite foolish to branch twice from identical nodes of the search tree. Instead, we really want to cache any nodes (expression trees) already visited during the search process, and not compute their descendants (derived children) more than once. A simple hash table (set, in fact) would seem like the most effective way of accomplishing this. Creating an efficient and relatively collision-free hash function for an arbitrary tree structure is, however, no trivial task. I did get a number of quite sensible and useful responses when I asked the question on StackOverflow.
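A toy version of this pruned search can be sketched in Python. It is only a sketch under several assumptions of mine: expressions are binary tuples `(op, left, right)` or leaves, only three rules are included (with `x + 0` usable in both directions so that the search space really can grow), the fitness is simply the negated tree size (a = -1, b = c = 0), and Python's built-in structural hashing of tuples stands in for the custom tree hash function discussed above. All function names are hypothetical.

```python
from collections import deque


def root_rewrites(e):
    """Expressions reachable by applying one rule at the root of e."""
    out = [("+", e, 0)]                  # x -> x + 0 (the expanding direction)
    if isinstance(e, tuple):
        op, left, right = e
        if op == "+" and right == 0:
            out.append(left)             # x + 0 -> x
        if op == "*" and left == 1:
            out.append(right)            # 1 * x -> x
        if op == "*" and left == right:
            out.append(("^", left, 2))   # x * x -> x^2
    return out


def neighbours(e):
    """Apply one rule at the root, or at one position inside a child."""
    yield from root_rewrites(e)
    if isinstance(e, tuple):
        op, left, right = e
        for new_left in neighbours(left):
            yield (op, new_left, right)
        for new_right in neighbours(right):
            yield (op, left, new_right)


def size(e):
    """Number of nodes in the expression tree."""
    return 1 + size(e[1]) + size(e[2]) if isinstance(e, tuple) else 1


def simplify(start, threshold):
    """Breadth-first search that never expands nodes below the threshold."""
    fitness = lambda e: -size(e)
    seen, queue, best = {start}, deque([start]), start
    while queue:
        current = queue.popleft()
        if fitness(current) > fitness(best):
            best = current
        for candidate in neighbours(current):
            # Discard below-threshold nodes and anything already visited.
            if candidate not in seen and fitness(candidate) >= threshold:
                seen.add(candidate)
                queue.append(candidate)
    return best


# (1 * x) + 0 with trees allowed to grow to at most 7 nodes.
print(simplify(("+", ("*", 1, "x"), 0), threshold=-7))  # x
```

Without the threshold the `x -> x + 0` rule would let the search run forever; with it, only finitely many trees qualify, and the `seen` set ensures each is expanded once however many derivations lead to it.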

Apologies if this discussion of “search trees” and “expression trees” and their corresponding nodes has led to some confusion regarding what is what. It is most important to recognise that the node of a search tree is itself an expression tree (what I sometimes call a “solution” to the “simplification problem”). Due to the risk of losing reader interest at the cost of an even longer post, I shall stop my ramblings here, and leave further elaboration of the search algorithm for another post.

The future of the project

What has been discussed so far is largely theoretical, yet I have tried to present it in such a way that the method of implementation is for the most part self-evident. Work on this project will likely proceed slowly in the short-term future, though as it advances, the features will surely solidify. I am hoping that at least by summer there should be some tangible results to these efforts. Regardless, I shall try to give status updates along the way.