Ruby vs. Go... FIGHT!

Note this was written 9 years ago, with correspondingly old versions of both Go and Ruby. It’s probably interesting only for PL historians.

Only sorta-kinda. I’ve been trying to use Go for some tasks for which I’d normally reach for Ruby; the most recent was grabbing some date elements from a large-ish XML file. I know, I know… Ruby has the best XML library ever built into it, but I’m more aware than anybody of the performance issues it has, so I tend to use it only for very small files. So when I needed to extract some information out of this fat XML, I thought I’d try Go.

Warning: Micro-Benchmarks ahead! Keep this in mind as you read. Usually, micro-benchmarks are looked at with skepticism; however, I’m claiming this is useful information because it’s a real-world application, solving a real-world need… even if it is only a very tiny little program in a very large world, after all.

The file wasn’t so big that I was worried about memory use, but since I only needed a leaf from each branch of the tree, I chose the SAX-ish API anyway. That’s going to make any code more bloated, but it wasn’t too bad; the results are in Version 1.

After I got everything working, I got to wondering about the performance, so I wrote it again in Ruby. I did not try to use the same logic; instead, I did it in what, for me, is more idiomatic Ruby code. I did not properly benchmark these, but I did run each program several times and then grabbed the middlin’ looking time. The timings, which may or may not be surprising, look like this:

Version	Language	Total time (s)	CPU usage
Version 1	Go	0.113	96%
Version R	Ruby	0.099	66%

“Hmmm”, I hear you say. Well, the Go version is actually parsing the XML, and we all know XML for the bloated, expensive-to-parse format that it is. OTOH, Ruby is doing regexp on every line, and is additionally reading the entire file into memory first and splitting it into an array on line endings. Hmmm. Well, let’s try a Go version that is a little more like the Ruby version. That’s Version 2:

Version	Language	Total time (s)	CPU usage
Version 2	Go	0.284	103%

Yowsa! That’s going in the wrong direction. Interestingly, it’s now using more than one core of my CPU, so it’s doing something thready underneath. Maybe it’s because I’m reading the file line-by-line off the disk? Let’s make it even more like the Ruby version; Version 3:

Version	Language	Total time (s)	CPU usage
Version 3	Go	0.292	105%

Definitely going in the wrong direction. Maybe it’s the sre2 library? Let’s try Version 4:

Version	Language	Total time (s)	CPU usage
Version 4	Go	0.037	89%

Ok, that’s better. Armed with this, I went back to not reading the file entirely into memory in Version 5:

Version	Language	Total time (s)	CPU usage
Version 3	Go	0.292	105%
Version 2	Go	0.284	103%
Version 1	Go	0.113	96%
Version R	Ruby	0.099	96%
Version 4	Go	0.037	89%
Version 5	Go	0.034	91%

Not a lot of difference; I did see a couple of runs where the CPU use dropped to 89% without affecting the total time, but these are pretty small numbers and we could be seeing actual run time being overwhelmed by the program initialization and what-not.

Anyway, I thought it was interesting. Ruby is slow as all get-out, but for micro-tasks where most of the heavy lifting is running in native C (regexp in Ruby is native, as is IO), it’s more than capable enough. It’s also worth noticing that this was with ca. 30 lines of Go code, vs. 8 lines of Ruby code.