Go and Java

I recently had to write something in Java again, and was struck by the fact that – with Java – absence does not make the heart grow fonder.

The little test application just read in an HTML file and broke it into three pieces: a prefix containing everything up to and including the opening body tag, the inner body HTML, and a suffix containing the closing body tag and everything after it. It was just a sanity test for a regular expression in Java. Here’s the source:

import java.util.regex.*;
import java.io.*;

public class Main {
  public static final void main(String[] args) {
    try {
      BufferedReader reader = new BufferedReader(new FileReader("testdoc.html"));
      StringBuilder builder = new StringBuilder();
      String line = null;

      String eol = System.getProperty("line.separator");
      while ((line = reader.readLine()) != null) {
        builder.append(line).append(eol);
      }   
      // Three groups: (BEFORE)(MIDDLE)(AFTER)                        
      Pattern p = Pattern.compile("^(.*<body.*?>)(.*)(</body>.*)$", 
         Pattern.MULTILINE | Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
      Matcher m = p.matcher(builder.toString());
      if (m.find()) System.out.println("BODY:\n"+m.group(2)+"\n========================\n");
    } catch (Exception e) {
      e.printStackTrace();
    }
  }
}

Simple enough. There are a number of ways to accomplish the same goal: java.util.Scanner might be shorter (albeit slower), and Java 7 adds readAllBytes() in java.nio.file.Files. Here’s the same program in Go:

package main

import ( "regexp"; "io/ioutil"; "fmt" )

func main() {
  text, err1 := ioutil.ReadFile("testdoc.html")
  re, err2 := regexp.Compile("(?msi:^(.*<body.*?>)(.*)(</body>.*)$)")
  if err1 != nil || err2 != nil {
    fmt.Println(err1, err2)
  } else {
    groups := re.FindSubmatch(text)
    fmt.Printf("BODY:\n%s\n========================\n", groups[2])
  }
}
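
The Go listing above indexes groups[2] directly. FindSubmatch returns nil when the pattern does not match, so a file without a body element would panic there. Here is a minimal sketch of a guard, using the same pattern on an inline string instead of testdoc.html. The helper name splitBody is hypothetical and not part of the original program.

```go
package main

import (
	"fmt"
	"regexp"
)

// bodyRe is the same three-group pattern as the listing above: (BEFORE)(MIDDLE)(AFTER).
var bodyRe = regexp.MustCompile(`(?msi:^(.*<body.*?>)(.*)(</body>.*)$)`)

// splitBody is a hypothetical helper for illustration. It reports ok=false
// instead of panicking when the input has no body element.
func splitBody(html []byte) (prefix, body, suffix []byte, ok bool) {
	groups := bodyRe.FindSubmatch(html)
	if groups == nil {
		return nil, nil, nil, false
	}
	return groups[1], groups[2], groups[3], true
}

func main() {
	doc := []byte("<html><body class=\"x\">hi</body></html>")
	if _, body, _, ok := splitBody(doc); ok {
		fmt.Printf("BODY:\n%s\n========================\n", body)
	} else {
		fmt.Println("no body element found")
	}
}
```

The guard costs a few extra lines, which is worth keeping in mind when comparing line counts between the two versions.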

Functionally, the two programs do the same thing. However, excluding blank lines, the Java version has 75% more lines of code and 108% more characters (about twice as much typing).

The thing that struck me the most, though – beyond the annoying boilerplate in Java – was the fact that although Java 6 has around 203 packages and some 3800 classes, it doesn’t provide a utility function for reading the contents of a file into memory. Go, for comparison, has 170 packages (not counting those that are purely containers) and fewer than 1900 public types; it isn’t as all-inclusive in utility functions as Ruby, but it’s much better than Java.

It’s just easier to kick out code in Go.  There’s less to type, the packages and functions are self-consistent, and utility functions are really that – not just layers of objects to wrap crap up in to get at what you want to do, eventually.