Category Archives: Programming

Snippets

After a spirited bout with rubygems… I have these:

gem list -d gem_name
gem uninstall --install-dir ~/.gem/ruby/1.8 gem_name -a

The first one describes a gem, including the location in which each version is installed. The second will uninstall all versions of a gem from the specified location.

I fought hard for those.

Object Mother Testing Pattern in Rails

During the years I worked in Java at CGI, I became used to the Object Mother pattern of test setup. It can break down when objects and their relationships get complicated, but it works quite well on smaller projects. At CGI we implemented it like this:

  • One factory per type of object being created
  • Each factory has a createNewValid method that creates a valid object that can be saved to the database. This method does not persist the object.
  • Each factory also has a createAnonymous method that calls createNewValid and then saves the object.

Rails has a number of plugins that do something similar, but object_daddy is the closest to what I was looking for. It follows the same pattern, calling the creation methods spawn and generate instead. It also allows you to customize the look of a valid object by letting you pass in a block that can modify the object’s state. This eliminates a huge problem with our implementation at CGI, where we had hundreds of slightly different createNewValid and createAnonymous methods depending on the scenario we were trying to set up.

Unfortunately I had a couple of problems with object_daddy. The first is that the generate method saves the object before it yields to a passed-in block. Thus, when you call User.generate { |u| u.name = 'Fred' } you get an object with a name of Fred, but that name is not yet persisted. That was a minor issue, though.

I also had a problem when I tried creating multiple objects with object_daddy within the same setup. I would get an error stating that attributes had already been added for a class, and couldn’t figure out what the heck was going on. It could be just me (probably is), but I found the source code very hard to follow for something that should be pretty simple. Since I’m new to Ruby, I figured I’d just try doing it myself from scratch and maybe learn something in the process.

Enter object_mother. Right now it's just about as basic as can be. I can think of numerous ways to improve it, but for now I'm pretty happy. Here's the source in its entirety:

module ObjectMother
  class Factory
    @@index = 0
    def self.spawn
      raise 'No create method specified'
    end
    def self.populate(model)
      raise 'No populate method specified for ' + model.to_s
    end
    def self.to_hash(model)
      raise 'No to_hash method specified'
    end
    def self.unique(val)
      @@index += 1
      val.to_s + @@index.to_s
    end
    def self.unique_email(val)
      user, domain = val.split('@')
      unique(user) + '@' + domain
    end
    def self.create
      obj = spawn
      populate(obj)
      yield obj if block_given?
      obj
    end
    def self.create!(&block)
      obj = create(&block)
      obj.save!
      obj
    end
    def self.create_as_hash(&block)
      obj = create(&block)
      to_hash obj
    end
  end
end

All the major concepts are there; create equals createNewValid and create! is the same as createAnonymous. There's also a create_as_hash method for converting a valid object to a params hash when you're doing post :create type stuff in functional tests. Currently, you use the Factory by subclassing it somewhere in your Rails app (I'm just using test_helper.rb for now). Here's an example:

class AreaFactory < ObjectMother::Factory
  def self.spawn
    Area.new
  end
  def self.populate(model)
    model.name = unique('SW')
    model.city = CityFactory.create!
  end
end

The spawn method just creates a blank instance of the class each individual factory deals with and could definitely be inferred (baby steps, I’ll get to that once the annoyance factor gets higher). Overriding populate is where the attributes of a valid model object are set.
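
To give a feel for how this reads, here's a quick test sketch (the assertions and attribute values are hypothetical, and it assumes the AreaFactory above lives in test_helper.rb). Note that create! runs the block before calling save!, so customized attributes do get persisted:

class AreaTest < Test::Unit::TestCase
  def test_factories_build_valid_areas
    # create is the createNewValid equivalent: valid, but not yet saved
    unsaved = AreaFactory.create { |a| a.name = 'Downtown' }
    assert unsaved.new_record?
    assert_equal 'Downtown', unsaved.name

    # create! is the createAnonymous equivalent: the block runs first,
    # then save! is called, so the customized name ends up in the database
    saved = AreaFactory.create! { |a| a.name = 'Uptown' }
    assert !saved.new_record?
  end
end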

Overall, this is really just what I wanted, and the basic code is only a few simple lines of Ruby. I'll add some smarts in eventually (i.e., inferring the class to spawn, creating hashes automatically, giving the user a default place to put Factories, making sure that passing blocks actually works), but for the time being I'm pretty happy.

Just Be Honest and Tell the Truth

I just read Ron Jeffries' latest post entitled My Named Cloud is Better Than Your Named Cloud and it got me riled up enough to post something I've been meaning to write about for at least a couple of years. His post touches on the point I'd wanted to make, but doesn't quite say it as simply as I think it can be said. Here's what I'm thinking:

If we could just always be honest with ourselves, as people, team members and organizations, software development wouldn’t be that hard.

There. Simple. Think about it for a second. All this stuff that we label and group together under methodologies and processes is really there so that we can do One Thing. This one thing, I’m assuming, is usually to create software that fulfills a need.

Let’s pretend that a software team has been assembled to create an application that fulfills such a need. They’ve chosen to use the XP methodology while writing this application. I’ll go through some of the tenets of XP and run them through my honesty filter:

  1. Pair programming: The team realizes that people will come and go on a project for good (quit, fired, etc.) or for short periods of time (maternity leave, vacation, illness, etc.). They want to minimize the risk, so they make sure everyone is familiar with every piece of the code base. They also realize that sometimes even the best developers do stupid shit for no good reason, and believe that having two sets of eyes on the screen at all times will lessen the likelihood of this happening.
  2. TDD: The team realizes that writing software is hard, and that stuff that worked last week will have to be changed this week. Since they want to ensure that stuff they worked on earlier still works when they make these changes, they write tests, and ensure that no code gets checked in unless all the tests are passing. They also hope that these tests provide some form of documentation to any developers (and perhaps users/clients) that may come later or who never worked on the feature originally.
  3. Incremental design: The team realizes that there are unforeseen forces that may threaten a project at any given time. A big project entails a large amount of risk for both clients and developers. If features can be developed in incremental fashion, hopefully in an order representing the importance of each feature, then risks can be mitigated. The team is always working with a usable application, so that if something comes up that stops the project, at least the client will have something to work with.

Now obviously there is more to XP than what I've listed. The point is that those three things exist to handle needs that most teams, if they are really honest with themselves, have:

  1. Knowledge transfer. Nobody wants to have to rely on one person to get something done. Sooner or later, this always bites us in the ass. Pairing is one way to handle this.
  2. Developed features continue working. Nothing is more frustrating (to users and developers) than having something that used to work perfectly stop working. Testing is one way to handle this.
  3. The need to guarantee that something will come out of all the money we’re spending. What happens if the development shop you’ve contracted goes bankrupt before they’ve finished your application? Incremental design is one way to handle this.

Listen. We have methodologies and processes for a reason (I hope). Some of these processes may work for you. However, maybe only some parts of a process work for you. The point is, it’s not about the process. If you can understand why you’re using a particular piece of a process, you can assess whether it’s useful for your team. Who cares whether you’re doing XP, Scrum, Lean, Kanban, Waterfall, or Hack ‘n’ Slash. Those are just names. Identify your pain points, problems, worries, etc, and try to fix them. Honestly.

What I Learned From X That Makes Me a Better Programmer in Y

Reginald Braithwaite says he’d love to hear stories about how programmers learned concepts from one language that made them better in another. This pretty neatly coincides with a post I’ve been meaning to make for months, so I might as well just get on with it and write something (because as CHart reminded me, I haven’t even posted for months).

Sometime around late 2004 – early 2005 I heard about Ruby on Rails for the first time. I'd never really programmed in any languages but Java/C#/PHP before, but I'd read posts by guys like Sam Ruby and Martin Fowler about how Ruby the language was really expressive and compact. However, it wasn't until Rails started getting some buzz that I really looked at any Ruby code and tried to decipher what it was doing. Rails put Ruby within a frame of reference that I was very familiar with (web development), allowing me to easily contrast the “Ruby Way” with the .NET/Java way I already knew.

The first thing that really caught my eye was the extensive usage of blocks, or anonymous methods. Coming from Java/C#, I had a hard time deciphering what was really going on when I saw something like this in Ruby code:

list.find_all{ |item| item.is_interesting }

It was pretty easy to see what the end result should be, but how does it actually work? All I knew was that a simple one-liner in Ruby seemed to balloon into this in Java:

List interesting = new ArrayList();
for(Item item : items){
  if(item.isInteresting()){
    interesting.add(item);
  }
}

Sometime later, a pattern was introduced into the Java project I’m currently working on by another developer. This pattern seemed to accomplish roughly the same thing as the Ruby example (conceptually, there was still a lot of code in the Java version).

new Finder(list, new InterestingItemSpecification()).find();

Astute readers might recognize this as a variation on the Specification Pattern I’d written about almost a year ago. The point of this pattern is to allow the developer to specify how to filter a list of items, rather than manually iterating over the list by themselves. Never mind the fact that doing this in Java requires as many lines as the standard for-loop example… It’s the concept of telling the list what you want, rather than looping through manually to take what you want that’s interesting here.

I eventually created a sub-class of Java’s ArrayList that allowed it to be filtered directly, just like Ruby arrays and C#’s generic list class. Now the code ended up looking like this:

list.Where(new InterestingItemSpecification());

Once I got this far, things really started to fall into place. I started to see duplication everywhere. Hundreds of methods (it’s a pretty large project) that selected slightly different things from the same lists, the only difference lying in a little if clause. I started deleting entire methods and replacing them with Specifications. Booyah. Then I started seeing other patterns.

Accumulation/Reduction/Folding:

public BigDecimal getTotal(){
  BigDecimal total = BigDecimal.ZERO;
  for(Item item : getItems()){
    total = total.add(item.getSubTotal());
  }
  return total;
}

Mapping/Conversion

public List getConvertedList(){
  List converted = new ArrayList();
  for(Item item : items){
    converted.add(item.getAnotherObject());
  }
  return converted;
}

Applying actions/commands to each item

public void calculate(){
  for(Item item : items){
    item.calculate();
  }
}

For each of these common informal patterns I was able to create a formal method for accomplishing the same thing. The goal became to distill each method down to just the part that made it different from another method. The act of iterating a list is boring, boilerplate noise that just doesn’t have to be there. Here’s the end result:

Accumulate

list.reduce(new SumItemCommand());

Map

list.convert(new ItemToThingConverter());

Actions

list.forEach(new DoSomethingToItemCommand());

There’s still the overhead of creating a class for each action/command/converter, etc, but the main goal was reached. (I realize C# doesn’t have this problem, but once again, it’s the concept that was important to my learning).

I eventually started to get really good at seeing these patterns in code, even though a method might combine several of the above concepts. It really is amazing how many different ways a method can be written, but how easy it becomes to distill it down to accumulation, conversion, filtering, and just basic actions once you’ve had this “revelation.”
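
For comparison, here's roughly what those same shapes look like as Ruby one-liners (a sketch assuming an items collection whose members respond to the obvious methods); this is the kind of code that kicked the whole exercise off:

items.find_all { |item| item.is_interesting }             # filtering
items.inject(0) { |total, item| total + item.sub_total }  # accumulation/reduction
items.map { |item| item.another_object }                  # mapping/conversion
items.each { |item| item.calculate }                      # applying an action to each item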

Over the last few months I started seeing some other, more specific examples of the above patterns. Summing was just a version of accumulation that acted on numbers. SelectMany (which I stole from C# 3.0) was simply accumulating into a list. By the time I got around to almost implementing GroupBy, I just stopped. Whoah. I was well on my way to implementing SQL on in-memory objects. Maybe I should just stop this madness and write a SQL query to get what I want in the first place.
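
Those more specific patterns turn out to be built-ins in Ruby too (and in SQL, of course). A rough sketch, again with hypothetical order objects:

orders.inject(0) { |sum, o| sum + o.total }  # SUM(total)
orders.map { |o| o.line_items }.flatten      # SelectMany: accumulate child lists into one flat list
orders.group_by { |o| o.customer }           # GROUP BY customer (Enumerable#group_by, Ruby 1.8.7+)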

It’s amazing when I think back on it, but simply being exposed to another language (Ruby) because the code looked so pretty caused me to learn the hows and whys of basic functional programming techniques. I also gained a new respect for SQL, as I completely stumbled upon most of its basic concepts in my quest to remove needless duplication from Java code. It’s funny to think that Lisp has been around for ages, yet most programmers either aren’t even taught the basic building blocks of functional programming (I wasn’t), or else forgot about it. The sad part is, it’s all just basic fucking Math.

Arguments You’ll Almost Never Hear

Here’s a bit of a cheeky question…

You know all those "conversations" us nerds have about scalability and performance where we endlessly debate about where to put business logic and whether scaling the database is easier than scaling the application servers? Well, how come we never end up talking about how to make arguably the most costly (in terms of both time and $$$) operation of our applications perform better?

The costly operation I’m talking about is the journey our markup makes from the web server to the browser. It’s funny, because we’ll architect fantastic applications, and then shove absolutely bloated junk markup across the vast, unreliable Internet without a second thought. That shit costs money too… (I’m talking about bandwidth). And it’s code that’s visible to the world.

Obscuring HTTP

Ayende has tried to explain why he doesn’t like ASP.NET Webforms many times, but based on the comments that pop up on his posts I’m not sure if he’s successfully getting his point across. I’ll try to help him out in this instance, as I think the same way about not just Webforms, but most other view technologies. This will take more than one post, however, so hopefully I can convince myself to increase my stunning post frequency of the past year in order to properly delve into this issue.

First off, let’s take a paragraph or two for a brief refresher on HTTP, the protocol that drives the Web as we know it. This will be quick, and I guarantee it will be dirty…

The HTTP protocol

HTTP is based on a request/response model between a client and a server. The client is assumed to be a web browser in this instance (but can be anything really), and the server is a central location (IP address, domain, URI etc) on the Internet that responds to requests made by the client(s). Responses are generally sent back as HTML documents, but can also be XML, JSON or anything else, really. Each response tells the client what format it is sending via the Content-Type response header. There are many other response headers that provide clues to each client as to what it should do with the body of the response.
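
To make that concrete, here's a tiny sketch using Ruby's standard Net::HTTP library against a made-up URL. The point is simply that the client gets back a status code, headers like Content-Type, and a body:

require 'net/http'
require 'uri'

uri = URI.parse('http://example.com/posts/1')
response = Net::HTTP.get_response(uri)  # a plain GET request

puts response.code             # e.g. "200"
puts response['Content-Type']  # e.g. "text/html; charset=utf-8"
puts response.body             # the HTML (or XML, JSON, ...) itself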

When a client makes a request to an endpoint, it specifies a verb that provides a clue to the server as to what the client wants it to do. These verbs are as follows:

  • GET – Means that the client is simply requesting to retrieve whatever information resides at the endpoint URI.
  • POST – Used to send information to the server, which the server can then use to do things like update a domain object. When a POST request is completed by the server, it can send a response back in the form of a 200 (OK) or 204 (No Content), or more likely a redirect to another URI.
  • PUT – Rarely used and not very well supported, this verb is similar to POST in that the client sends information in the request that it expects the server to act on.
  • DELETE – Also rarely used, this one is pretty self explanatory. The expectation of the client is that the requested resource will be deleted.

The modern web generally just uses the first two verbs (GET and POST) to get things done, although the latest version of Rails fakes out the PUT and DELETE verbs to more closely match the intended spirit of HTTP. One thing that you may notice is that GET, POST, PUT and DELETE look an awful lot like CREATE, READ, UPDATE and DELETE, but that’s a "coincidence" for another post.

The way this stuff all gets mashed together to create a usable application on the web is only slightly complicated at the lowest level. In a common use case, a user makes a GET request (through a browser) to a URI that returns an HTML response. The browser then displays the HTML to the user. If the HTML response contains a FORM element, well that’s an invitation to the user to change the state of data on the server in some way (maybe by adding a new post to a blog via a few text boxes). When the user clicks the submit button, a POST request is sent to the server that contains all the text the user entered in the HTML textboxes. Once the server receives the request, it’s up to the application that drives it to figure out what to do with the data sent by the client.
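
Here's that same round trip from a script's point of view (again a sketch with a made-up URL and form fields; a browser builds the POST body from the form's text boxes for you):

require 'net/http'
require 'uri'

uri = URI.parse('http://example.com/posts')

# What the browser does when the user clicks submit: a POST carrying the form fields
response = Net::HTTP.post_form(uri, 'post[title]' => 'Hello', 'post[body]' => 'First!')

# A well-behaved app usually answers with a redirect to the newly created resource
puts response['Location'] if response.is_a?(Net::HTTPRedirection)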

I hope I haven’t lost everyone yet, because I swear there’s some sort of profound punchline to be found here.

The Quest For a Simpler Rube Goldberg Machine

Now, I’m sure we can all agree at this point that HTTP is pretty simple. Clients make requests using a verb that may or may not contain data, and the server responds back to the client in whatever way it deems appropriate. The issue that Ayende and I have with Webforms (and Struts, and other view frameworks) is that they take something simple and try to make it different. In the case of Webforms, Microsoft has tried to create an event-driven, stateful paradigm out of something that is resource-driven at its core.

The result of this is that Webforms has become a layer of indirection that sits on top of HTTP. Indirection in and of itself is not bad, as any guy who uses ORMs to abstract the database will tell you. The problem is that I think it’s gone a little further away from the underlying model than it should.

Witness the ASP.NET page lifecycle.

Webforms is an attempt to make web programming look like desktop programming. As a guy who learned about web programming via ASP.NET, I found it pretty intuitive. The problem came when I ran into leaks in the abstraction that I couldn’t deal with without knowing what was really going on under the hood in the HTTP pipeline.

If You Only Read One Paragraph, Read This One

Now the first problem with Webforms is not that it’s an abstraction, or even that it’s a leaky one (they all are). The problem is that what Webforms attempts to abstract away is actually simpler than the abstraction!

The second "problem" with Webforms is that not very many people know the first problem. I know I didn’t, until I saw how Rails, Monorail, and other frameworks are able to work with the underlying model of the Web, while still being terribly simple to understand and develop on top of. Making it easier to program for the Web is a laudable goal, I’m just not so sure that abstracting the technology that it’s built on top of to the point where it’s unrecognizable is the way to go about doing it.

Persistence Ignorance

People are talking about Microsoft’s Entity Framework and how it does not currently allow persistence-ignorant domain objects.

I’ve been torn about this issue for a while now. On the one hand, having an O/R mapper that is persistence ignorant essentially means that it has to support XML mapping files. The downside to this approach is duplication of each entity’s properties (which leads to managing them in multiple places), having to edit and maintain these files, and not being able to see mapping information all in one place. This price is often worth it, though.

On the other hand, using attributes to specify mapping information leads to less "code" to manage, and the advantage of having your domain class and mapping information all in one location. The price is that your domain objects have to know about the persistence framework.

The one thing I’ve observed recently is that most of the Java developers I’ve talked to who’ve used Hibernate in the past are excited and relieved that the latest versions support annotations (attributes in .NET) for specifying mapping information. Most of them seem to dislike mapping via XML files, and feel that the price of using annotations is worth it.

It’s too bad for Microsoft that nHibernate already supports both methods, so they’ll have to as well if they want to keep up.

Timesheet Released

Several years ago, while I was still working at Kanga, I wrote a timesheet application that was used by the company to track employee hours. Based on ASP.NET and MySQL, the original intent was to have it running on Mono. That never really panned out, but the app was in use for at least a year and a half while I was there. As far as I knew, it was still in use up until the very end of the company.

The code languished on my hard drive for the last two years. During that period, I’d get an average of one or two inquiries each month via email from various souls across teh Intarweb who were interested in taking it for a spin. Unfortunately, the code didn’t even compile anymore and I didn’t really feel like getting it to a working state. That all changed this past week, for whatever reason.

I spent a few nights whipping the codebase into a somewhat decent state. It now compiles, and has an updated SQL script to get the database up and running. It seems to work, so I thought I’d upload it to Google code for anybody who’s interested. The code has a staggering test coverage stat of 0%, but everything used to work 2 years ago, so what the heck. It’s no longer something I’m terribly proud of, but it works and I might commit to making it kick some sort of ass over the next few months.

Why I Prefer C# to Java (The Language)

The first language I really learned to program in was Java. The first language I actually delivered a product with was C#. It wasn’t hard to move from one to the other, as in most aspects the language was exactly the same. There are some significant differences, though. The most striking difference in my mind is how the C# language has received so many nice little features that just make the code cleaner. You could write Java code in C# if you wanted to, but that would just be plain silly. Here’s a sample of what I’m talking about:

Java and C# both utilize a finally concept that lets the programmer clean up resources. However, C# takes this a step further with the using statement. Here’s the Java way…


Timer timer = new Timer();
try {
  timer.start();
  doStuffWeWantToTime();
} finally {
  timer.stop();
}

…And now the C# way…


using (Timer timer = new Timer()) {
  timer.Start();
  doStuffWeWantToTime();
}

The using statement will implicitly call the Timer’s Dispose method once the block goes out of scope. The compiler actually generates the same code as the try/finally block, so it’s all just syntactic sugar. But sugar is so sweet.

Next up, we have the new generic Collections namespace in C#. On my current project (Java), we implemented a class called the Finder, which takes a collection and a specification. It uses the specification object(s) to filter the collection like so:


public List getProductsForSale(){
  return new Finder(getProducts(),
      new ProductsOnSaleSpecification()).find();
}

public class ProductsOnSaleSpecification {
  public boolean isSatisfiedBy(Object obj){
    Product p = (Product)obj;
    return p.isOnSale();
  }
}

The Finder abstracts out the looping, while the ProductsOnSaleSpecification tells the Finder which products we’re interested in. It’s pretty sweet, that is, until I realized that this is actually built into C#’s generic collection classes:


public List<Product> GetProductsForSale(){
  return Products.FindAll(IsProductOnSale);
}

private bool IsProductOnSale(Product p){
  return p.IsOnSale;
}

It’s worth noting that the C# collection classes have more than just the FindAll concept… You can also call Exists, Find(one), ConvertAll, FindIndex, FindLast, ForEach, Remove, RemoveAll and TrueForAll in much the same way. You can also pass in an anonymous block of code, which is based on .NET’s support of delegates. I’ve written about this in more detail before.

A basic language feature that exists in C# is the notion of a Property, which allows you to present the internal state of an object in a cleaner form than the Java standard of using getters and setters. Again, this is just syntactic sugar, but it’s nice to be able to visually tell whether you’re operating directly on an object’s state. Here’s the Java code and its C# equivalent.


public class Dude {
  private int age;
  public int getAge(){
    return age;
  }
  public void setAge(int age){
    this.age = age;
  }
}


public class Dude {
  private int age;
  public int Age {
    get { return age; }
    set { age = value; }
  }
}

My next C# feature isn’t really a feature, and it’s perhaps the most contentious of my points. The feature is the lack of checked exceptions in C#. My current project has very few points in the code where we actually handle exceptions (the Facades, and various points in the UI, but almost nowhere in the Domain). Yet we’re consistently forced by the Java compiler to stick throws ValidationException on almost all of our methods. It quickly just becomes unwanted noise in the code base.

That’s about it for my little comparison. I still have a few more points I could make, specifically around delegates and events in C#, and how stupid it is that the Java foreach statement equivalent only operates on generic lists without needing a cast, but I’ll save those for another post.