For a while now I have been following news about Erlang. I’ve always been interested in learning about new languages and paradigms, if only to get some fresh ideas into my day-to-day work using Python. Erlang caught my interest for a couple of reasons: it seems to be a good candidate for writing heavily concurrent software spanning processors and machines, it’s in use by some of the bigger players today, and I’ve always enjoyed the functional paradigm, starting from my early days of programming with Scheme.
Before diving in fully, I wanted to write something simple just to get a feel for the language. Pattern matching is an integral part of Erlang, but what caught my eye was that you can use pattern matching down to the level of individual bits when reading binary data. This gave me the idea of porting the small SWF metadata parser package I wrote some time ago to Erlang to see how it compares with the Python implementation.
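As a minimal sketch of what bit-level matching looks like (the module name bits_demo and the function split_byte are made up for this illustration and are not part of the parser), a single byte can be split into a 5-bit field and a 3-bit field directly in a function head:

```erlang
-module(bits_demo).
-export([split_byte/1]).

%% Split one byte into its upper 5 bits and lower 3 bits.
%% Segment sizes without a type specifier are counted in bits.
split_byte(<<High:5, Low:3>>) ->
    {High, Low}.
```

For example, the byte 16#7B is 01111011 in binary, so bits_demo:split_byte(<<16#7B>>) should return {15, 3}.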
After playing with it for a while I came up with the following implementation for the parser:
    parse(Filename) ->
        {ok, Swf} = file:read_file(Filename),
        {Compression, Version, _, Payload} = parse_signature(Swf),
        {Xmin, Xmax, Ymin, Ymax, Frames, Fps} = parse_payload(Payload),
        {version, Version,
         compression, Compression,
         width, Xmax-Xmin,
         height, Ymax-Ymin,
         bbox, {xmin, Xmin, xmax, Xmax, ymin, Ymin, ymax, Ymax},
         frames, Frames,
         fps, Fps}.

    parse_signature(<<"FWS", Version:8, Size:32/little, Data/binary>>) ->
        {uncompressed, Version, Size, Data};
    parse_signature(<<"CWS", Version:8, Size:32/little, Data/binary>>) ->
        {compressed, Version, Size, zlib:uncompress(Data)}.

    parse_payload(Payload) ->
        % Extract the RECT field size
        <<Nbits:5/little, _:3, _/binary>> = Payload,
        % Calculate the byte alignment
        ByteAlign = (8 - ((5 + 4 * Nbits) rem 8)) rem 8,
        % Pattern match the RECT element
        <<_:5,
          Xmin:Nbits/big-signed,
          Xmax:Nbits/big-signed,
          Ymin:Nbits/big-signed,
          Ymax:Nbits/big-signed,
          _:ByteAlign,
          Frames:16/little-unsigned,
          Fps:16/little-unsigned,
          _/binary>> = Payload,
        % Convert the RECT values from Twips to pixels
        {Xmin / 20, Xmax / 20, Ymin / 20, Ymax / 20, Frames, Fps}.
Bit-level pattern matching really worked to Erlang's advantage in this particular case, and I feel the parser implementation in Erlang is much more elegant than the code I wrote in Python. The two functions parse_signature and parse_payload do all the work in only 9 lines of Erlang code (without pretty printing and comments). I had to split parse_payload into two matching steps because the SWF format contains variable-length bit fields: the width of the RECT fields has to be read before the rest of the header can be matched. A more uniform format using byte-aligned fields would have been even easier to parse. The full code is available on GitHub should anyone be interested in taking a closer look.
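To make the byte alignment calculation in parse_payload more concrete, here is a small worked sketch. The module name rect_demo, its function names, and the sample values (a 550 by 400 pixel stage, i.e. 11000 by 8000 twips, stored in 15-bit fields) are assumptions chosen for illustration; they do not come from a real SWF file. With Nbits = 15 the RECT occupies 5 + 4 * 15 = 65 bits, so 7 padding bits are needed to reach the next byte boundary at 72 bits.

```erlang
-module(rect_demo).
-export([pad_bits/1, sample_rect/0, parse_rect/1]).

%% Padding bits needed after a RECT whose four fields are Nbits wide.
%% Same formula as ByteAlign in parse_payload.
pad_bits(Nbits) ->
    (8 - ((5 + 4 * Nbits) rem 8)) rem 8.

%% Build a hypothetical RECT with the bit syntax:
%% Xmin = 0, Xmax = 11000, Ymin = 0, Ymax = 8000 (all in twips).
sample_rect() ->
    Nbits = 15,
    Pad = pad_bits(Nbits),
    <<Nbits:5, 0:Nbits, 11000:Nbits, 0:Nbits, 8000:Nbits, 0:Pad>>.

%% Two-step match: first read the field width, then use it
%% as the segment size for the four coordinates.
parse_rect(<<Nbits:5, _/bitstring>> = Bin) ->
    Pad = pad_bits(Nbits),
    <<_:5,
      Xmin:Nbits/signed,
      Xmax:Nbits/signed,
      Ymin:Nbits/signed,
      Ymax:Nbits/signed,
      _:Pad,
      Rest/binary>> = Bin,
    {{Xmin, Xmax, Ymin, Ymax}, Rest}.
```

Calling rect_demo:parse_rect(rect_demo:sample_rect()) should give back {{0, 11000, 0, 8000}, <<>>}, and dividing the coordinates by 20 yields the 550 by 400 pixel dimensions.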
All in all, I enjoyed this little exercise and will keep learning more about Erlang in the hope of adding it to my list of useful tools for future projects. If you’re interested in learning more about what you can accomplish with Erlang, there is a pretty cool Erlang-related series of posts about creating a “Million user Comet application with MochiWeb” (part I, part II, part III).
Update 2009-02-03: Fixed a bug in the second parse_signature clause that failed to decompress SWF files correctly.