Erlang + TokyoTyrant / TokyoCabinet / medici
My first significant observation about Erlang is a piece of advice for new Erlangers. Repeat this mantra:
Mailboxes are not queues!
More experienced devs than I get this wrong. In any case, I’ve been working with Erlang and TokyoCabinet/Tyrant at work (I’ve been using MongoDB, too; it isn’t clear yet which will best fit our needs), and I’ve been using medici by Jim McCoy, a set of libraries for interfacing with TokyoTyrant/Cabinet. I found a bug today in the principe module, which the following code demonstrates:
-module(principe_test).
-export([runtest/0, loop/0]).

mongod() -> "mongod". % Where is your mongod executable?

runtest() ->
    start_mongo(),
    Inserter = spawn_link( fun ?MODULE:loop/0 ),
    %% quit is an unrelated message; it lands in the mailbox ahead of go
    Inserter ! quit,
    Inserter ! go,
    timer:sleep(500),
    %% Read the pid file and shut the server down again
    {ok, Fd} = file:open("mongod.pid",[read]),
    {ok, Data} = file:read_line( Fd ),
    os:cmd(io_lib:format("kill -HUP ~p",[string:strip(Data,both,10)])).

start_mongo() ->
    filelib:ensure_dir("test/file"),
    Cmd = io_lib:format(
        "~s --dbpath ~s --port 9999 --fork --logpath ~s --pidfilepath ~s",
        [ mongod(),
          filename:absname("test"),
          filename:absname("mongod.log"),
          filename:absname("mongod.pid") ]),
    os:cmd(Cmd),
    timer:sleep(1000).

loop() ->
    {ok, P} = principe:connect( [{port, 9999}] ),
    receive
        %% Selective receive: quit stays in the mailbox, where principe finds it
        go -> principe:put( P, "key", "value" )
    end.
The problem is that principe:get/3 (and many other functions) use receive to get callbacks from the socket library, and assume that they’re going to be receiving messages from only the socket library. This is a bad assumption. In effect, the functions hijack the mailbox of the calling process and then assume that they’ll only find socket messages in there. Happily, the fix is really small:
diff --git a/src/principe.erl b/src/principe.erl
--- a/src/principe.erl
+++ b/src/principe.erl
@@ -718,7 +718,7 @@
             {error, conn_closed};
         {tcp_error, _, _} ->
             {error, conn_error};
-        Data ->
+        {tcp, _, _} = Data ->
             ResponseHandler(Data)
     after ?TIMEOUT ->
             {error, timeout}
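To see the difference without a database or a socket, here is a minimal sketch that reproduces both receive clauses in isolation. The module name mailbox_demo, the atom fake_socket and the 100 ms timeout are my own stand-ins and are not part of principe; the only thing taken from the patch is the shape of the two patterns.

```erlang
-module(mailbox_demo).
-export([greedy/0, selective/0]).

%% Queue an unrelated message first, then a socket-shaped one.
%% fake_socket is a hypothetical placeholder; no real socket is opened.
seed() ->
    self() ! quit,
    self() ! {tcp, fake_socket, <<"reply">>},
    ok.

%% Old behaviour: a bare variable matches whatever is first in the mailbox.
greedy() ->
    seed(),
    Result = receive
                 Data -> {handled, Data}
             after 100 ->
                 {error, timeout}
             end,
    flush(),
    Result.

%% Patched behaviour: only {tcp, _, _} matches; quit is skipped and stays put.
selective() ->
    seed(),
    Result = receive
                 {tcp, _, _} = Data -> {handled, Data}
             after 100 ->
                 {error, timeout}
             end,
    {Result, flush()}.

%% Drain the caller's mailbox and return what was left in it.
flush() ->
    receive
        M -> [M | flush()]
    after 0 ->
        []
    end.
```

Called from a process with an empty mailbox, mailbox_demo:greedy() returns {handled, quit}: the response handler is handed a message that never came from the socket. mailbox_demo:selective() returns {{handled, {tcp, fake_socket, <<"reply">>}}, [quit]}, so the socket message is handled and quit is still waiting for whoever it was meant for.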
Incidentally, MongoDB is about 4x as fast as Riak, and TokyoTyrant is about 2x as fast as MongoDB. MongoDB and TokyoTyrant are both faster than opening a file directly on disk for each small record; Riak is about as fast as direct filesystem access. That’s almost certainly due to the memory caching and bulk writes used by MongoDB and TokyoTyrant, versus the many, many inode create/write/close calls that one file per record requires.
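If you want to get a feel for the filesystem half of that comparison, something like the following rough sketch would do. It is not the benchmark behind the numbers above; the module name write_bench, the paths bench_files and bench_bulk.dat, and the record contents are all made up for illustration. It times one file per record against a single buffered file, using timer:tc/1 and the delayed_write option of file:open/2.

```erlang
-module(write_bench).
-export([per_file/2, bulk/2, run/1]).

%% One create/write/close per record: N small files under Dir.
per_file(Dir, N) ->
    ok = filelib:ensure_dir(filename:join(Dir, "x")),
    lists:foreach(
      fun(I) ->
              Name = filename:join(Dir, integer_to_list(I)),
              ok = file:write_file(Name, <<"value">>)
      end,
      lists:seq(1, N)).

%% One file, opened once; delayed_write buffers the small writes in memory.
bulk(Path, N) ->
    {ok, Fd} = file:open(Path, [write, raw, binary, delayed_write]),
    lists:foreach(
      fun(I) ->
              ok = file:write(Fd, [integer_to_list(I), $\t, <<"value">>, $\n])
      end,
      lists:seq(1, N)),
    ok = file:close(Fd).

%% Returns {MicrosecondsPerFile, MicrosecondsBulk} for N records.
run(N) ->
    {T1, ok} = timer:tc(fun() -> per_file("bench_files", N) end),
    {T2, ok} = timer:tc(fun() -> bulk("bench_bulk.dat", N) end),
    {T1, T2}.
```

Try write_bench:run(10000) and compare the two numbers. The result depends heavily on the filesystem and the OS cache, so treat it as an illustration of the create/write/close overhead rather than a measurement of any of the databases.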