A severe lack of sleep and the "too many irons in the fire" syndrome has left me a tad frazzled, so instead of a well-written (I see you laughing!) post, you get a collection of random thoughts and observations:
Too much lambda?
I wonder if I’m relying too heavily on lambda expressions. What do you think, is this going too far (extracted from one of my unit tests):
1: /// <summary>
2: /// Verifies that the correct number of graphs are returned
3: /// for various sizes.
4: /// </summary>
6: public void EnumerateGraphs_ReturnsCorrectNumberOfGraphs()
8: //The number of possible graphs is 2^((n*(n-1)/2) where n is the number of edges.
9: //This expression will calculate it.
10: Func<int, int> NumGraphs = (n) => (int)Math.Pow(2, (n*(n - 1)/2));
12: Assert.AreEqual(NumGraphs(1), Graph.EnumerateGraphs(1).Count());
14: Assert.AreEqual(NumGraphs(2), Graph.EnumerateGraphs(2).Count());
16: Assert.AreEqual(NumGraphs(3), Graph.EnumerateGraphs(3).Count());
18: Assert.AreEqual(NumGraphs(4), Graph.EnumerateGraphs(4).Count());
20: Assert.AreEqual(NumGraphs(5), Graph.EnumerateGraphs(5).Count());
22: Assert.AreEqual(NumGraphs(6), Graph.EnumerateGraphs(6).Count());
See line 10 for what I mean. The formula for counting the number of possible undirected graphs with a given number of vertices is quite simple, and it just didn’t "feel" like it needed to be a method; I just wanted to alias the expression so that I could easily reuse it. Is this a bad thing? Should I have created a private helper method instead (complete with comments and all that jazz)?
Parallel Extensions == Awesome
One of my assignments for grad school was to create a graph coloring application (I won’t get in to details, and no, I won’t be posting the code for that), but if you’ve ever messed with graph coloring, you know that the time taken to find a valid coloring increases greatly as the complexity of your graph increases. Well, I wrote my solution in C#, and I wasn’t satisfied with the runtime, so I used the Parallel Extensions June CTP release to parallelize it. The improvements were impressive. Sequentially, calculating the average number of colors needed to "colorize" all graphs of size <= 8 took about 1.5 hours (in DEBUG) on a dual-dual core Opteron workstation, with only one core saturated (a huge waste). I applied Parallel.ForEach in the problem, and the run-time dropped to 28 minutes. That’s impressive. 🙂
Parallel Extensions == Ugly
Despite the huge performance gains that are available by leveraging multiple cores, I still think the API has a long way to go. Here’s the sequential work-horse of my algorithm:
1: foreach (Graph graph in Graph.EnumerateGraphs(numVertices))
3: //Keep a running count of how many graphs there are.
6: //Tally how many colors this graph took.
7: totalColors += FindMinimumColors(graph);
And here’s the parallel version:
3: () => (new Pair<int, int>()),
4: (graph, index, state) =>
6: int minColors = FindMinimumColors(graph);
7: //The left half of the pair is the count, the right half is the color sum
9: state.ThreadLocalState.Right += minColors;
11: partial =>
13: Interlocked.Add(ref numGraphs, partial.Left);
14: Interlocked.Add(ref totalColors, partial.Right);
As you can see, it looks like the unreadable code fairy sprinkled pixie dust all over the sequential version, then kicked it in the knees and took its wallet. There needs to be a better, more readable way to do aggregations like this.
Yeah, for this week anyway. Next week I hope to write more about DeepCrawler.NET (there’s a chance that my current employer may even adopt the code, which means I could actually get paid to work on it!), but if anyone would rather see/hear about something else, let me know.