Erlang + TokyoTyrant / TokyoCabinet / medici


My first real, significant observation about Erlang is a piece of advice for new Erlangers. Repeat this mantra:

Mailboxes are not queues!
More experienced devs than I get this wrong. In any case, I’ve been working with Erlang and TokyoCabinet/Tyrant at work (I’ve been using MongoDB, too; it isn’t clear yet which will best fit our needs), and I’ve been using medici by Jim McCoy, a set of libraries for interfacing with TokyoTyrant/Cabinet. I found a bug today in the principe module, which the following code demonstrates:

-module(principe_test).
-export([runtest/0, loop/0]).

mongod() -> "mongod". % Where's your mongod executable?

runtest() ->
    start_mongo(),
    Inserter = spawn_link( fun ?MODULE:loop/0 ),
    %% loop/0 never matches quit, so it is still in the mailbox
    %% when principe:put/3 runs its own receive.
    Inserter ! quit,
    Inserter ! go,
    timer:sleep(500),
    %% Read the server's pid from its pidfile and signal it.
    {ok, Fd} = file:open("mongod.pid",[read]),
    {ok, Data} = file:read_line( Fd ),
    os:cmd(io_lib:format("kill -HUP ~p",[string:strip(Data,both,10)])).

start_mongo() ->
    filelib:ensure_dir("test/file"),
    Cmd = io_lib:format(
        "~s --dbpath ~s --port 9999 --fork --logpath ~s --pidfilepath ~s",
        [ mongod(),
          filename:absname("test"),
          filename:absname("mongod.log"),
          filename:absname("mongod.pid") ]),
    os:cmd(Cmd),
    timer:sleep(1000). % Give the server a moment to start listening.


loop() ->
    {ok, P} = principe:connect( [{port, 9999}] ),
    receive
        go -> principe:put( P, "key", "value" )
    end.
The problem is that principe:put/3 (and many other functions) use receive to get callbacks from the socket library, and assume that they will receive messages only from the socket library. This is a bad assumption: in the test above, the stray quit message is already sitting in the mailbox when principe:put/3 is called, so quit is what the catch-all receive clause picks up. In effect, the functions hijack the mailbox of the calling process and then assume that they’ll only find socket messages in there. Happily, the fix is really small:
diff --git a/src/principe.erl b/src/principe.erl
--- a/src/principe.erl
+++ b/src/principe.erl
@@ -718,7 +718,7 @@
             {error, conn_closed};
         {tcp_error, _, _} ->
             {error, conn_error};
-        Data ->
+        {tcp, _, _} = Data ->
             ResponseHandler(Data)
     after ?TIMEOUT ->
         {error, timeout}
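To see the difference between the two receive clauses without a TokyoTyrant server, here is a minimal sketch that needs no sockets at all. The module name mailbox_demo, its function names, and the fake_socket atom are made up for illustration and are not part of principe; the only thing borrowed from the real world is the {tcp, Socket, Data} shape of the messages that gen_tcp delivers in active mode.

```erlang
-module(mailbox_demo).
-export([greedy_recv/0, selective_recv/0, run/1]).

%% Catch-all receive: takes whatever message is first in the mailbox,
%% like the `Data ->` clause before the patch.
greedy_recv() ->
    receive
        Data -> Data
    after 100 ->
        {error, timeout}
    end.

%% Selective receive: matches only {tcp, _, _} tuples and leaves every
%% other message in the mailbox, like the patched clause.
selective_recv() ->
    receive
        {tcp, _Sock, _Bytes} = Data -> Data
    after 100 ->
        {error, timeout}
    end.

%% Run RecvFun in a fresh process whose mailbox already holds an
%% unrelated message (quit) ahead of a simulated socket message.
run(RecvFun) ->
    Parent = self(),
    Pid = spawn(fun() ->
        self() ! quit,
        self() ! {tcp, fake_socket, <<"reply">>},
        Parent ! {result, self(), RecvFun()}
    end),
    receive
        {result, Pid, Result} -> Result
    end.
```

Calling mailbox_demo:run(fun mailbox_demo:greedy_recv/0) hands back the stray quit message, while mailbox_demo:run(fun mailbox_demo:selective_recv/0) skips over it and returns the tcp tuple. The skipped message stays in the mailbox for the caller to deal with later, which is exactly why mailboxes are not queues.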
Incidentally, MongoDB is about 4x as fast as Riak, and TokyoTyrant is about 2x as fast as MongoDB. MongoDB and TokyoTyrant are both faster than opening a file directly on disk for each small record; Riak is about as fast as direct filesystem access. That’s almost certainly due to the memory caching and bulk writes used by MongoDB and TokyoTyrant, versus many, many inode create/write/close calls.

Copyright © Sean Elliott Russell
