tpl history

A brief history of tpl

Development of tpl began in 2005. The author was developing a multi-process server. The server had several database "loader" sub-processes that generated over half a gig of data, in the form of C structures. These needed to be read into the parent process.

Rumination

There were boring and interesting ways to approach this problem. The boring way would be to write the C structs to disk and read them back in the parent.

The interesting way

There is nothing wrong with writing C structures to disk, but it'd be far more useful to solve the binary data interchange problem at a more general level.

XML?

Sure, the half gig of binary data (mostly bit vectors encoded as integer arrays) could be stored in XML. But binary data interchange isn't XML's strong suit. XML is inconvenient to parse in C, and it's extremely bulky. Why sacrifice the speed of a modern processor by converting all its native data representation to text and back?

Binary skeptics

Many criticize binary formats because they're often proprietary, hard to inspect, non-editable, and non-extensible. But I had in mind an open binary format that could be easily inspected (via an XML conversion utility) and read or written through a standard API.

Inspiration from D-BUS

A Linux Journal article by Robert Love, entitled "Get on the D-BUS", sustained my interest in binary protocols (D-BUS itself sends typed binary messages). D-BUS also had a clever approach to type safety. Message types are expressed as a format string and stored in the message itself. That idea was too good not to steal. A format string like A(i) tells the recipient, "This is an array of integers." But it also tells the API how to parse the binary payload.

Just add API

I wanted to find an API that would be compact and innocuous: one that didn't look out of place in a C program. I really wanted to avoid the kind of API that requires writing a call for each data item being read or written. ("Now read an int. Now read a double. Now read a char[2].") Mapping the format string to C variables solved that problem. By saying "pack index 1", the caller lets the API do all the work of finding those C variables and copying their current values, in one fell swoop.
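The mapping idea can be sketched in plain C. This is a minimal illustration under stated assumptions, not tpl's actual API or internals: the names mapping, map_vars, pack_vars and unpack_vars are hypothetical, only three format characters (i, d, c) are handled, and there are no arrays or pack indices. The variables are bound to the format string once, and a single call then copies all of their current values.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_FIELDS 8

/* Hypothetical mapping: a format string plus one variable address per character. */
typedef struct {
    char   fmt[MAX_FIELDS + 1];
    void  *addr[MAX_FIELDS];
    size_t nfields;
} mapping;

/* Size in bytes of the type named by a format character; 0 if unknown. */
static size_t field_size(char c) {
    switch (c) {
    case 'i': return sizeof(int32_t);
    case 'd': return sizeof(double);
    case 'c': return sizeof(char);
    default:  return 0;
    }
}

/* Bind each format character to a variable. Callers pass every address as
 * (void *). Returns 0 on success, -1 on a bad or over-long format string. */
static int map_vars(mapping *m, const char *fmt, ...) {
    assert(m != NULL && fmt != NULL);
    size_t n = strlen(fmt);
    if (n > MAX_FIELDS) return -1;
    va_list ap;
    va_start(ap, fmt);
    for (size_t i = 0; i < n; i++) {
        if (field_size(fmt[i]) == 0) { va_end(ap); return -1; }
        m->addr[i] = va_arg(ap, void *);
    }
    va_end(ap);
    memcpy(m->fmt, fmt, n + 1);
    m->nfields = n;
    return 0;
}

/* Copy the current value of every mapped variable into out, in format order.
 * Returns the number of bytes written, or 0 if out is too small. */
static size_t pack_vars(const mapping *m, unsigned char *out, size_t cap) {
    size_t off = 0;
    for (size_t i = 0; i < m->nfields; i++) {
        size_t sz = field_size(m->fmt[i]);
        if (off + sz > cap) return 0;
        memcpy(out + off, m->addr[i], sz);
        off += sz;
    }
    return off;
}

/* Copy values from in back into the mapped variables.
 * Returns the number of bytes consumed, or 0 if in is too short. */
static size_t unpack_vars(const mapping *m, const unsigned char *in, size_t len) {
    size_t off = 0;
    for (size_t i = 0; i < m->nfields; i++) {
        size_t sz = field_size(m->fmt[i]);
        if (off + sz > len) return 0;
        memcpy(m->addr[i], in + off, sz);
        off += sz;
    }
    return off;
}
```

A caller would bind two variables with map_vars(&m, "id", (void *)&id, (void *)&score) and from then on call pack_vars or unpack_vars once per record, with no per-field calls.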

Quanta

I wanted the serialized form to constitute a discrete unit, or packet. A packet encapsulates an entire package of data. If several packets are stored in a file back-to-back, or sent across a socket, I wanted the recipient to be able to discern the byte boundaries of the individual packets. The key would be to have a fixed header section that encoded the overall packet length, along with other things such as the endian order and the format string.
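A length-carrying header is enough to find packet boundaries in a byte stream. The sketch below assumes a hypothetical layout chosen only for illustration (a 3-byte magic "pkt", one flag byte recording the sender's byte order, a 4-byte total length stored low byte first, then a NUL-terminated format string and the payload); it should not be read as tpl's actual wire format. The functions packet_write, packet_length and count_packets are likewise invented names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HDR_LEN 8  /* 3-byte magic, 1 flag byte, 4-byte total length */

/* Returns 1 if this machine stores integers big-endian, else 0. */
static int host_is_big_endian(void) {
    uint16_t x = 1;
    unsigned char first;
    memcpy(&first, &x, 1);
    return first == 0;
}

/* Write header, NUL-terminated format string, then payload into out.
 * Returns the total packet length, or 0 if it does not fit. */
static size_t packet_write(unsigned char *out, size_t cap, const char *fmt,
                           const void *payload, size_t payload_len) {
    assert(out != NULL && fmt != NULL);
    size_t fmt_len = strlen(fmt) + 1;
    size_t total = HDR_LEN + fmt_len + payload_len;
    if (total > cap || total > UINT32_MAX) return 0;
    memcpy(out, "pkt", 3);
    out[3] = (unsigned char)host_is_big_endian();  /* payload byte order */
    out[4] = (unsigned char)(total & 0xff);        /* length, low byte first */
    out[5] = (unsigned char)((total >> 8) & 0xff);
    out[6] = (unsigned char)((total >> 16) & 0xff);
    out[7] = (unsigned char)((total >> 24) & 0xff);
    memcpy(out + HDR_LEN, fmt, fmt_len);
    if (payload_len > 0) memcpy(out + HDR_LEN + fmt_len, payload, payload_len);
    return total;
}

/* Length of the packet starting at buf, or 0 if the bytes available do not
 * hold one complete, well-formed packet. */
static size_t packet_length(const unsigned char *buf, size_t avail) {
    if (avail < HDR_LEN || memcmp(buf, "pkt", 3) != 0) return 0;
    uint32_t total = (uint32_t)buf[4] | (uint32_t)buf[5] << 8 |
                     (uint32_t)buf[6] << 16 | (uint32_t)buf[7] << 24;
    if (total < HDR_LEN + 1 || total > avail) return 0;
    /* The format string must be NUL-terminated inside the packet. */
    if (memchr(buf + HDR_LEN, 0, total - HDR_LEN) == NULL) return 0;
    return total;
}

/* Count the complete packets stored back-to-back in buf. */
static size_t count_packets(const unsigned char *buf, size_t len) {
    size_t n = 0, off = 0;
    while (off < len) {
        size_t plen = packet_length(buf + off, len - off);
        if (plen == 0) break;  /* truncated or malformed: stop here */
        off += plen;
        n++;
    }
    return n;
}
```

Because the length sits at a fixed offset, a reader can skip from one packet to the next without interpreting the payload, and the same checks reject a truncated or corrupted packet before any data is copied out.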

IPC

These packets provide a convenient basis for inter-process messaging (IPC). Sending and receiving whole messages (which is how we construe a packet when it's used for IPC) is much more convenient than using lower-level socket reads/writes. The API would have to let us load or emit a whole packet at a time.
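To show why whole-message calls are more convenient than raw reads and writes, here is a rough sketch for POSIX file descriptors. It assumes a fixed 8-byte header whose bytes 4 to 7 hold the total message length, low byte first; that layout, the "msg" magic, and the names send_message and recv_message are all hypothetical and are not taken from tpl. The helper loops hide short reads and writes, so the caller only ever sees complete messages.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

#define HDR_LEN 8  /* hypothetical: 3-byte magic, 1 flag byte, 4-byte length */

/* Read exactly n bytes, retrying short reads. 0 on success, -1 on EOF/error. */
static int read_full(int fd, unsigned char *buf, size_t n) {
    size_t got = 0;
    while (got < n) {
        ssize_t r = read(fd, buf + got, n - got);
        if (r < 0 && errno == EINTR) continue;
        if (r <= 0) return -1;
        got += (size_t)r;
    }
    return 0;
}

/* Write exactly n bytes, retrying short writes. 0 on success, -1 on error. */
static int write_full(int fd, const unsigned char *buf, size_t n) {
    size_t sent = 0;
    while (sent < n) {
        ssize_t w = write(fd, buf + sent, n - sent);
        if (w < 0 && errno == EINTR) continue;
        if (w <= 0) return -1;
        sent += (size_t)w;
    }
    return 0;
}

/* Emit one whole message: header followed by body. 0 on success, -1 on error. */
static int send_message(int fd, const void *body, uint32_t body_len) {
    unsigned char hdr[HDR_LEN] = { 'm', 's', 'g', 0 };
    if (body_len > UINT32_MAX - HDR_LEN) return -1;
    uint32_t total = HDR_LEN + body_len;
    hdr[4] = (unsigned char)(total & 0xff);
    hdr[5] = (unsigned char)((total >> 8) & 0xff);
    hdr[6] = (unsigned char)((total >> 16) & 0xff);
    hdr[7] = (unsigned char)((total >> 24) & 0xff);
    if (write_full(fd, hdr, HDR_LEN) != 0) return -1;
    return write_full(fd, (const unsigned char *)body, body_len);
}

/* Load one whole message (header included) into buf. Returns its total
 * length, or 0 on EOF, on error, or if the message does not fit in buf. */
static size_t recv_message(int fd, unsigned char *buf, size_t cap) {
    assert(buf != NULL);
    if (cap < HDR_LEN || read_full(fd, buf, HDR_LEN) != 0) return 0;
    uint32_t total = (uint32_t)buf[4] | (uint32_t)buf[5] << 8 |
                     (uint32_t)buf[6] << 16 | (uint32_t)buf[7] << 24;
    if (total < HDR_LEN || total > cap) return 0;
    if (read_full(fd, buf + HDR_LEN, total - HDR_LEN) != 0) return 0;
    return total;
}
```

With a pipe or socket between two processes, each send_message on one side pairs with exactly one recv_message on the other, regardless of how the kernel happens to split the bytes in transit.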

Validity

A strictly-defined binary packet format would also allow the recipient to validate its well-formedness programmatically. No DTD? Great!

Late nights

Writing open-source projects is not for the weakly committed. All the ingredients were coming together, but it took a long time for me to get the internals of the tpl data structure right. In particular, the first version of tpl (Sep 2005) didn't support nested arrays. One lesson I learned in developing tpl and earlier projects is that you'll find more bugs if you compile and test on two (or more) distinct OS/compiler suites. I repeatedly found bugs under Solaris that were invisible under Linux. As for debuggers, gdb is great, but Sun's dbx with its "check -access" is invaluable.

A time to be born

I managed to get tpl ready to release in September 2006. You never really finish a program; you just decide to stop fiddling with it. At least, I have for now.
