Yearly Archives: 2007

What I Learned From X That Makes Me a Better Programmer in Y

Reginald Braithwaite says he’d love to hear stories about how programmers learned concepts from one language that made them better in another. This pretty neatly coincides with a post I’ve been meaning to make for months, so I might as well just get on with it and write something (because, as CHart reminded me, I haven’t posted in months).

Sometime around late 2004 – early 2005 I heard about Ruby on Rails for the first time. I’d never really programmed in any languages but Java/C#/PHP before, but I’d read posts by guys like Sam Ruby and Martin Fowler about how Ruby the language was really expressive and compact. However, it wasn’t until Rails started getting some buzz that I really looked at any Ruby code and tried to decipher what it was doing. Rails put Ruby within a frame of reference that I was very familiar with (web development), allowing me to easily contrast the “Ruby Way” with the .NET/Java way I already knew.

The first thing that really caught my eye was the extensive usage of blocks, or anonymous methods. Coming from Java/C#, I had a hard time deciphering what was really going on when I saw something like this in Ruby code:

list.find_all{ |item| item.is_interesting }

It was pretty easy to see what the end result should be, but how does it actually work? All I knew was that a simple one-liner in Ruby seemed to balloon into this in Java:

List<Item> interesting = new ArrayList<Item>();
for(Item item : list){
  if(item.isInteresting()){
    interesting.add(item);
  }
}

Sometime later, another developer introduced a pattern into the Java project I’m currently working on. This pattern accomplished roughly the same thing as the Ruby example (conceptually, at least; there was still a lot of code in the Java version).

new Finder(list, new InterestingItemSpecification()).find();

Astute readers might recognize this as a variation on the Specification Pattern I’d written about almost a year ago. The point of this pattern is to let the developer specify how to filter a list of items, rather than manually iterating over the list themselves. Never mind the fact that doing this in Java requires as many lines as the standard for-loop example… It’s the concept of telling the list what you want, rather than looping through manually to take what you want, that’s interesting here.
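For anyone who hasn’t seen the pattern, here’s a minimal sketch of how that Finder/Specification pair might be built. The post doesn’t show the internals, so the generics and method bodies here are my guesses, not the original code:

```java
import java.util.ArrayList;
import java.util.List;

// A Specification answers one yes/no question about one item.
interface Specification<T> {
    boolean isSatisfiedBy(T item);
}

// The Finder owns the boring iteration; the Specification supplies the "if".
class Finder<T> {
    private final List<T> list;
    private final Specification<T> spec;

    Finder(List<T> list, Specification<T> spec) {
        this.list = list;
        this.spec = spec;
    }

    List<T> find() {
        List<T> matches = new ArrayList<T>();
        for (T item : list) {
            if (spec.isSatisfiedBy(item)) {
                matches.add(item);
            }
        }
        return matches;
    }
}
```

An InterestingItemSpecification would then just implement isSatisfiedBy with `return item.isInteresting();`, and that loop never has to be written again.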

I eventually created a subclass of Java’s ArrayList that allowed it to be filtered directly, just like Ruby arrays and C#’s generic List class. Now the code ended up looking like this:

list.where(new InterestingItemSpecification());

Once I got this far, things really started to fall into place. I started to see duplication everywhere. Hundreds of methods (it’s a pretty large project) that selected slightly different things from the same lists, the only difference lying in a little if clause. I started deleting entire methods and replacing them with Specifications. Booyah. Then I started seeing other patterns.
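The filterable subclass itself is only a few lines. Here’s a sketch of what mine might have looked like (the Specification interface is repeated so the snippet stands alone, and the method name is a guess):

```java
import java.util.ArrayList;

// Same Specification idea as before: one yes/no question per item.
interface Specification<T> {
    boolean isSatisfiedBy(T item);
}

// An ArrayList that can filter itself, like Ruby's find_all
// or C#'s List<T>.FindAll.
class FilterableList<T> extends ArrayList<T> {
    public FilterableList<T> where(Specification<T> spec) {
        FilterableList<T> matches = new FilterableList<T>();
        for (T item : this) {
            if (spec.isSatisfiedBy(item)) {
                matches.add(item);
            }
        }
        return matches;
    }
}
```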

Accumulation/Reduction/Folding:

public BigDecimal getTotal(){
  BigDecimal total = BigDecimal.ZERO;
  for(Item item : getItems()){
    total = total.add(item.getSubTotal());
  }
  return total;
}

Mapping/Conversion:

public List<Thing> getConvertedList(){
  List<Thing> converted = new ArrayList<Thing>();
  for(Item item : items){
    converted.add(item.getAnotherObject());
  }
  return converted;
}

Applying actions/commands to each item:

public void calculate(){
  for(Item item : items){
    item.calculate();
  }
}

For each of these common informal patterns I was able to create a formal method for accomplishing the same thing. The goal became to distill each method down to just the part that made it different from another method. The act of iterating a list is boring, boilerplate noise that just doesn’t have to be there. Here’s the end result:

Accumulate

list.reduce(new SumItemCommand());

Map

list.convert(new ItemToThingConverter());

Actions

list.forEach(new DoSomethingToItemCommand());

There’s still the overhead of creating a class for each action/command/converter, etc., but the main goal was reached. (I realize C# doesn’t have this problem, but once again, it’s the concept that was important to my learning.)
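For the curious, here’s roughly how those three methods might be implemented on a list subclass. One caveat: my reduce takes an explicit seed value, where the version shown above presumably buries it inside the command object; the interface names here are also invented:

```java
import java.util.ArrayList;
import java.util.List;

// Each interface captures just the part that varies between methods;
// the list owns the loop.
interface Reducer<T, R>   { R reduce(R accumulated, T item); }
interface Converter<T, R> { R convert(T item); }
interface Action<T>       { void perform(T item); }

class RichList<T> extends ArrayList<T> {
    // Accumulate: fold every item into a single result.
    public <R> R reduce(R seed, Reducer<T, R> reducer) {
        R result = seed;
        for (T item : this) {
            result = reducer.reduce(result, item);
        }
        return result;
    }

    // Map: build a new list by converting each item.
    public <R> List<R> convert(Converter<T, R> converter) {
        List<R> converted = new ArrayList<R>();
        for (T item : this) {
            converted.add(converter.convert(item));
        }
        return converted;
    }

    // Action: perform a command against each item.
    public void forEach(Action<T> action) {
        for (T item : this) {
            action.perform(item);
        }
    }
}
```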

I eventually started to get really good at seeing these patterns in code, even though a method might combine several of the above concepts. It really is amazing how many different ways a method can be written, but how easy it becomes to distill it down to accumulation, conversion, filtering, and just basic actions once you’ve had this “revelation.”

Over the last few months I started seeing some other, more specific examples of the above patterns. Summing was just a version of accumulation that acted on numbers. SelectMany (which I stole from C# 3.0) was simply accumulating into a list. By the time I got around to almost implementing GroupBy, I just stopped. Whoah. I was well on my way to implementing SQL on in-memory objects. Maybe I should just stop the madness and write a SQL query to get what I want in the first place.

It’s amazing when I think back on it, but simply being exposed to another language (Ruby) because the code looked so pretty caused me to learn the hows and whys of basic functional programming techniques. I also gained a new respect for SQL, as I completely stumbled upon most of its basic concepts in my quest to remove needless duplication from Java code. It’s funny to think that Lisp has been around for ages, yet most programmers either aren’t even taught the basic building blocks of functional programming (I wasn’t), or else forgot about it. The sad part is, it’s all just basic fucking Math.

Arguments You’ll Almost Never Hear

Here’s a bit of a cheeky question…

You know all those "conversations" us nerds have about scalability and performance where we endlessly debate about where to put business logic and whether scaling the database is easier than scaling the application servers? Well, how come we never end up talking about how to make arguably the most costly (in terms of both time and $$$) operation of our applications perform better?

The costly operation I’m talking about is the journey our markup makes from the web server to the browser. It’s funny, because we’ll architect fantastic applications, and then shove absolutely bloated junk markup across the vast, unreliable Internet without a second thought. That shit costs money too… (I’m talking about bandwidth). And it’s code that’s visible to the world.

Obscuring HTTP

Ayende has tried to explain why he doesn’t like ASP.NET Webforms many times, but based on the comments that pop up on his posts I’m not sure if he’s successfully getting his point across. I’ll try to help him out in this instance, as I think the same way about not just Webforms, but most other view technologies. This will take more than one post, however, so hopefully I can convince myself to increase my stunning post frequency of the past year in order to properly delve into this issue.

First off, let’s take a paragraph or two for a brief refresher on HTTP, the protocol that drives the Web as we know it. This will be quick, and I guarantee it will be dirty…

The HTTP protocol

HTTP is based on a request/response model between a client and a server. The client is assumed to be a web browser in this instance (but can be anything, really), and the server is a central location (IP address, domain, URI, etc.) on the Internet that responds to requests made by the client(s). Responses are generally sent back as HTML documents, but can also be XML, JSON, or anything else, really. Each response tells the client what format it is sending via the Content-Type response header. There are many other response headers that provide clues to each client as to what it should do with the body of the response.

When a client makes a request to an endpoint, it specifies a verb that provides a clue to the server as to what the client wants it to do. These verbs are as follows:

  • GET – Means that the client is simply requesting to retrieve whatever information resides at the endpoint URI.
  • POST – Used to send information to the server, which the server can then use to do things like update a domain object. When a POST request is completed by the server, it can send a response back in the form of a 200 (OK) or 204 (No Content), or more likely a redirect to another URI.
  • PUT – Rarely used and not very well supported, this verb is similar to POST in that the client sends information in the request that it expects the server to act on.
  • DELETE – Also rarely used, this one is pretty self explanatory. The expectation of the client is that the requested resource will be deleted.
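To make that concrete, here’s roughly what a GET exchange looks like on the wire (the host, path, and body are placeholders I made up):

```http
GET /posts/42 HTTP/1.1
Host: example.com

HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8

<html>...the requested document...</html>
```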

The modern web generally just uses the first two verbs (GET and POST) to get things done, although the latest version of Rails fakes out the PUT and DELETE verbs to more closely match the intended spirit of HTTP. One thing that you may notice is that GET, POST, PUT and DELETE look an awful lot like CREATE, READ, UPDATE and DELETE, but that’s a "coincidence" for another post.

The way this stuff all gets mashed together to create a usable application on the web is only slightly complicated at the lowest level. In a common use case, a user makes a GET request (through a browser) to a URI that returns an HTML response. The browser then displays the HTML to the user. If the HTML response contains a FORM element, well that’s an invitation to the user to change the state of data on the server in some way (maybe by adding a new post to a blog via a few text boxes). When the user clicks the submit button, a POST request is sent to the server that contains all the text the user entered in the HTML textboxes. Once the server receives the request, it’s up to the application that drives it to figure out what to do with the data sent by the client.
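As a sketch of that use case, here’s a minimal form (the action and field names are invented for illustration):

```html
<form action="/posts" method="post">
  <input type="text" name="title">
  <textarea name="body"></textarea>
  <input type="submit" value="Save">
</form>
```

and roughly the POST request the browser sends when the user submits it:

```http
POST /posts HTTP/1.1
Host: example.com
Content-Type: application/x-www-form-urlencoded

title=Hello&body=First+post
```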

I hope I haven’t lost everyone yet, because I swear there’s some sort of profound punchline to be found here.

The Quest For a Simpler Rube Goldberg Machine

Now, I’m sure we can all agree at this point that HTTP is pretty simple. Clients make requests using a verb that may or may not contain data, and the server responds back to the client in whatever way it deems appropriate. The issue that Ayende and I have with Webforms (and Struts, and other view frameworks) is that they take something simple and try to make it different. In the case of Webforms, Microsoft has tried to create an event-driven, stateful paradigm out of something that is resource-driven at its core.

The result of this is that Webforms has become a layer of indirection that sits on top of HTTP. Indirection in and of itself is not bad, as any guy who uses ORMs to abstract away the database will tell you. The problem is that I think it’s gone a little further away from the underlying model than it should.

Witness the ASP.NET page lifecycle.

Webforms is an attempt to make web programming look like desktop programming. As a guy who learned about web programming via ASP.NET, I found it pretty intuitive. The problem came when I ran into leaks in the abstraction that I couldn’t deal with without knowing what was really going on under the hood in the HTTP pipeline.

If You Only Read One Paragraph, Read This One

Now the first problem with Webforms is not that it’s an abstraction, or even that it’s a leaky one (they all are). The problem is that what Webforms attempts to abstract away is actually simpler than the abstraction!

The second "problem" with Webforms is that not very many people know the first problem. I know I didn’t, until I saw how Rails, Monorail, and other frameworks are able to work with the underlying model of the Web, while still being terribly simple to understand and develop on top of. Making it easier to program for the Web is a laudable goal, I’m just not so sure that abstracting the technology that it’s built on top of to the point where it’s unrecognizable is the way to go about doing it.

Revelation

Shit. And here I thought I was a Mac guy because I was simply a chump for superficial special effects and had a hate on for virii. Turns out I’m actually just sexy, charming and funny. Too bad I got married before I figured this out.

Hosting Awesomiality ™

Not sure if anyone noticed, but my blog was down for almost 24 hours after an utterly botched “urgent maintenance upgrade” by WebHost4Life. Now I’ve been with them for around 3-4 years with only a few outages, but this one was just utterly rad.

I put in a help desk ticket this morning at about eight o’clock, and got a response within about a half-hour saying that it had been forwarded to the "server guys." Great, should be back up in no time. Or not.

After having no response for 10 more hours, I finally started poking around in the hosting control panel. After a bit of devious delving, I noticed that IIS was being pointed at a directory on the E: drive, while the file manager was telling me all my files were on the C: drive. So I changed the mapping, and I’m back up.

I have to say, I’ve had few complaints up until now with my service. But this may finally give me the motivation to move over to Text Drive, where I’ve got a lifetime account. Up until now I’ve just been too lazy. Speaking of which… I deserve a beer.

Persistence Ignorance

People are talking about Microsoft’s Entity Framework and how it does not currently allow persistence-ignorant domain objects.

I’ve been torn about this issue for a while now. On the one hand, having an O/R mapper that is persistence ignorant essentially means that it has to support XML mapping files. The downside to this approach is duplication of each entity’s properties (which leads to managing them in multiple places), having to edit and maintain these files, and not being able to see mapping information all in one place. This price is often worth it, though.

On the other hand, using attributes to specify mapping information leads to less "code" to manage, and the advantage of having your domain class and mapping information all in one location. The price is that your domain objects have to know about the persistence framework.

The one thing I’ve observed recently is that most of the Java developers I’ve talked to who’ve used Hibernate in the past are excited and relieved that the latest versions support annotations (attributes in .NET) for specifying mapping information. Most of them seem to dislike mapping via XML files, and feel that the price of using annotations is worth it.
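To make the trade-off concrete, here’s roughly what the two styles look like. The first snippet is a Hibernate-style XML mapping file and the second uses JPA-style annotations; the class and column names are invented for illustration:

```xml
<!-- Item.hbm.xml: the mapping lives in a separate file, duplicating
     the property names from the class. -->
<class name="Item" table="items">
  <id name="id" column="id"/>
  <property name="subTotal" column="sub_total"/>
</class>
```

```java
// With annotations, the mapping sits right on the domain class --
// but the class now references the persistence framework's types.
@Entity
@Table(name = "items")
public class Item {
    @Id private Long id;
    @Column(name = "sub_total") private BigDecimal subTotal;
}
```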

It’s too bad for Microsoft that NHibernate already supports both methods, so they’ll have to as well if they want to keep up.

For Posterity’s Sake

It’s March 10th, and I just sat outside in a t-shirt, without socks, drinking a Newcastle Ale (in a delightfully diminutive stubby bottle) for about an hour. WX says it’s only 12 degrees centigrade, but I don’t believe it.

God, I can’t wait for summer.

Timesheet Released

Several years ago, while I was still working at Kanga, I wrote a timesheet application that was used by the company to track employee hours. Based on ASP.NET and MySQL, the original intent was to have it running on Mono. That never really panned out, but the app was in use for at least a year and a half while I was there. As far as I knew, it was still in use up until the very end of the company.

The code languished on my hard drive for the last two years. During that period, I’d get an average of one or two inquiries each month via email from various souls across teh Intarweb who were interested in taking it for a spin. Unfortunately, the code didn’t even compile anymore and I didn’t really feel like getting it to a working state. That all changed this past week, for whatever reason.

I spent a few nights whipping the codebase into a somewhat decent state. It now compiles, and has an updated SQL script to get the database up and running. It seems to work, so I thought I’d upload it to Google code for anybody who’s interested. The code has a staggering test coverage stat of 0%, but everything used to work 2 years ago, so what the heck. It’s no longer something I’m terribly proud of, but it works and I might commit to making it kick some sort of ass over the next few months.