Tame v1 is supported in sfslite version 0.8. As of sfslite version 1, we've migrated to Tame v2, documentation for which is available here.
tame is a generic tool for simplifying libasync programming,
useful with OKWS, but also for any other libasync-based
application. It is available under the sfslite distribution.
tame is a simple C++-level rewriter. Input files look a lot like
regular C++ files, with some small modifications. Output files are
standard C++ files that are then compiled and linked against the
libasync libraries to produce your library or application. Files
generated with tame can call legacy libasync applications, and
legacy libasync applications can call tamed functions,
without any glue.
We have a paper under submission about the tame system. If you read the paper and this Wiki, you might notice some disagreements about syntax. We'll be closing those differences with future versions of tame but for now, if in doubt, use the Wiki semantics.
Here is a simple complete program written with tame (code can be
found in the sfslite distribution under tutorial/ex1.T or via
anonymous CVS
here). All of these examples assume that you have installed sfslite. To enable building the tutorial files given below, configure sfslite with the --enable-tutorial flag.
    #include "tame.h"
    #include "parseopt.h"

    TAMED static void try_connect (str h, int port)
    {
      VARS {
        int fd;
      }
      BLOCK { tcpconnect (h, port, @(fd)); }
      if (fd >= 0) {
        warn << "Connection succeeded!\n";
        exit (0);
      } else {
        warn << "Connection failed!\n";
        exit (1);
      }
    }

    int main (int argc, char *argv[])
    {
      int port;
      if (argc != 3 || !convertint (argv[2], &port))
        fatal << "usage: ex1 <hostname> <port>\n";
      try_connect (argv[1], port);
      amain ();
    }
If we put this code into the file ex1.T, then we can compile and run it as follows (of course, depending on where the SFS libraries are installed on your machine):
    % tame -o ex1.C ex1.T
    % g++ -I/usr/local/include/sfslite -g -Wall -Werror -Wno-unused -c ex1.C
    % g++ -g -Wall -Werror -Wno-unused -o ex1 ex1.o -L/usr/local/lib/sfslite -lasync
Now you can run this little program to see if you can connect to various TCP ports around the internet:
    % ./ex1 www.yahoo.com 80
    Connection succeeded!
    % ./ex1 am.lcs.mit.edu 23
    Connection failed!
This is cool to a libasync developer because the function try_connect looks like a single function, even though it's calling tcpconnect and then waiting for a response. Traditionally, another function would be explicitly required to accomplish the same task, as in the following example:
    static void try_connect_cb (int fd)
    {
      if (fd >= 0) {
        warn << "Connection succeeded!\n";
        exit (0);
      } else {
        warn << "Connection failed!\n";
        exit (1);
      }
    }

    static void try_connect (str h, int port)
    {
      // tcpconnect:
      //   Try to connect to host h, on the given port; when the
      //   verdict is in, call the function try_connect_cb,
      //   with either the file descriptor of the new socket,
      //   or <0 if the connection failed.
      tcpconnect (h, port, wrap (try_connect_cb));
    }
Thus, the tame tool has spared the programmer from having to call wrap (which prepares a closure and a continuation) and also from having to explicitly declare and define the callback try_connect_cb. Though doing so in this example would not have been much of an inconvenience, one can imagine that real libasync code can get very hairy, very quickly.
First, note the additional include of the header tame.h, which is installed with sfslite in the include directory:
#include "tame.h"
Next, note the TAMED keyword:
TAMED static void try_connect (str h, int port) { ....
This syntax tells tame that it should rewrite the function
try_connect to use tame-style closures and continuations. In the
case of static functions, as we have here, tame will first make a
static function declaration, then output its internal classes and
functions, and then output the function definition. For non-static
functions (such as class methods), tame will not output the
function declaration, and assumes the programmer has done so
manually. There is one wrinkle to manually writing function
declarations, which is that tame automatically inserts a last
argument to functions that it rewrites, and manually-written function
declarations need to be consistent with this. For instance, the above
code could equivalently be expressed as:
    static void try_connect (str h, int port, CLOSURE);

    TAMED void try_connect (str h, int port) { ....
The preprocessor macro CLOSURE expands to
#define CLOSURE ptr<closure_t> __cls_g = NULL
The point is that the user gave the function try_connect with the
two arguments 'str h' and 'int port.' The tame filter adds
the optional last argument, which it uses internally to make closures
within the function try_connect. You can safely ignore this last
argument (__cls_g), since your code will never reference it.
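To see why a defaulted last argument leaves existing call sites untouched, here is a small sketch in plain C++ (not tame input). The names closure_stub, CLOSURE_SKETCH and try_connect_sketch are made up for illustration, and the idea that a non-null closure signals re-entry is my assumption about how a rewriter could use the argument, not a description of the code tame actually emits.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for tame's closure type; the real closure_t
// lives in the sfslite headers and is not reproduced here.
struct closure_stub {
  int resume_point = 0;
};

// Mirrors the shape of the CLOSURE macro: a defaulted last parameter.
#define CLOSURE_SKETCH std::shared_ptr<closure_stub> cls = nullptr

// Declaration as a programmer would write it by hand.
int try_connect_sketch (const std::string &h, int port, CLOSURE_SKETCH);

// Definition. Assumption for this sketch: a null closure means
// "first entry", a non-null one means "resume at a saved point".
int try_connect_sketch (const std::string &h, int port,
                        std::shared_ptr<closure_stub> cls)
{
  (void) h;
  (void) port;
  return cls ? cls->resume_point : 0;
}
```

Callers that know nothing about closures simply write try_connect_sketch ("localhost", 80); only generated code would ever pass the third argument.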
Within a TAMED function, the tame filter looks for three types of constructs: VARS { … }, BLOCK { … }, and WAIT (…). We discuss the first two here and the last later on in this document. The VARS {…} block is used to declare stack variables. In this manner, tame programming is like old-fashioned C programming: local variables should all be declared at the beginning of the function (or method), and they all occupy the same scope. C++ has some neat scoping rules based on blocks within functions and methods, but tame does not support them.
The code given is of the form:
VARS { int fd; }
Meaning that the programmer only wants one local stack variable, an integer whose name is 'fd.' Simple enough.
The next type of block is the 'BLOCK { … }' block, which is used to launch several asynchronous function calls in parallel and to continue on in the function only after all of them have completed:
    BLOCK { tcpconnect (h, port, @(fd)); }
The idea is that all asynchronous calls within the 'BLOCK { … }' block (such as tcpconnect in this case) are called, and when the last one returns, control will return to the next line after the BLOCK block, with the exact same stack configuration. Of course there is other new syntax here. Recall that in libasync, one interface for connecting to a remote host via TCP is:
tcpconnect_t *tcpconnect (str hostname, u_int16_t port, cbi cb, bool dnssearch = true, str *namep = NULL);
Our code is using the first three arguments of tcpconnect and is ignoring the return value. The new syntax here is @(fd), which will tell tame to generate a wrap of type cbi. Recall that in libasync:
typedef callback<void, int>::ref cbi;
That is, cbi is a callback, that should be called with one argument of type int. By convention, tcpconnect will call cb internally after it has completed its computation, with an integer representing its return value. In general, say that we have variables a,b,c of types A,B,C, respectively. Then, code of the form:
@(a,b,c)
Will compile to a callback of the form:
callback<void, A, B, C>::ref
That is, a callback that must be called with three variables, the first of type A, the second of type B, and the third of type C. To accomplish the generation of such a callback, tame will first generate a function with a signature like:
void anonymous_callback_1 (A,B,C);
And tame will then substitute @(a,b,c) with wrap (anonymous_callback_1).
Returning to our example, the syntax @(fd) also tells tame that when tcpconnect finally returns via 'cbi cb', it should stick that integer result into the local variable 'int fd' in the function try_connect. Under the hood, the logic of the auto-generated callback is simply to load the parameters passed to it into the object that contains the tamed function's stack. This gives the illusion of directly setting a stack variable without switching to a different function call (such as the callback).
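The "load the parameters into the closure" step can be pictured with a few lines of ordinary C++. This is only a sketch: try_connect_closure, fake_tcpconnect and make_fd_callback are hypothetical names, std::function stands in for libasync's callback types, and the fake connect call completes immediately with a made-up descriptor value.

```cpp
#include <cassert>
#include <functional>
#include <memory>

// Hypothetical closure holding what VARS { int fd; } would declare.
struct try_connect_closure {
  int fd = -1;
  bool resumed = false;
};

// Stand-in for an asynchronous API that reports its result via callback.
// Here it "completes" immediately with a made-up descriptor value.
void fake_tcpconnect (std::function<void (int)> cb)
{
  cb (7);
}

// Roughly what @(fd) stands for: a callback that copies its argument
// into the closure slot and then resumes the suspended function.
std::function<void (int)>
make_fd_callback (std::shared_ptr<try_connect_closure> c)
{
  return [c] (int result) {
    c->fd = result;     // load the parameter into the saved "stack"
    c->resumed = true;  // real code would re-enter the function here
  };
}
```

Because the callback holds a counted reference to the closure, the variable fd outlives the original call frame, which is what makes the "stack variable" illusion possible.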
The next example, ex2.T, is slightly more involved but does not require any new syntax or constructs. It starts to get at the convenience that tame can provide libasync programmers. Note that this example uses an RPC protocol (ex_prot.x) and has a corresponding RPC server (exsrv.C), all of which can be found in the sfslite CVS distribution under tutorial/.
As a quick libasync refresher, the object involved with making asynchronous RPC calls is called an aclnt. The most relevant method of the aclnt object is the aclnt::call function, whose signature for our purposes is:
callbase *call (u_int32_t procno, const void *in, void *out, aclnt_cb);
Note there are optional arguments that have been omitted for clarity. The first argument to call() is the RPC procedure number that will be called, which is often represented by an enumerated value in C/C++ code. The next two arguments to call() tell it where to find the argument to the RPC call, and where it should put the response when the RPC completes. Note that the out argument needs to be dynamically allocated and persist past the scope of the caller, since it will be set arbitrarily long in the future (accounting for network delay and so on). The final argument is the callback to call once the RPC completes. aclnt_cb is given by:
typedef callback<void, clnt_stat>::ref aclnt_cb;
The clnt_stat type is the standard RPC enumerated type for reporting the status code of an RPC call. Thus, aclnt_cb is a callback that expects to be called with one argument, of type clnt_stat, that will report how the RPC fared: did it succeed, did it time out, was it rejected due to a protocol mismatch, etc.
The code below makes several RPCs in parallel; this idiom is common in asynchronous network code.
    #include "tame.h"
    #include "parseopt.h"
    #include "ex_prot.h"

    TAMED static void try_rpc (str h, int port, cbb cb)
    {
      VARS {
        bool ret (false);
        int fd, r1;

        // standard libasync class that wraps a TCP connection
        ptr<axprt_stream> x;

        // standard libasync class for an anonymous RPC client
        ptr<aclnt> cli;

        // the ex_* datatypes were generated from the input
        // protocol file ex_prot.x, for demonstration purposes.
        //
        //   ex_str_t    - an RPC string
        //   ex_struct_t - a simple RPC struct, with two fields
        //                 of type ex_str_t and unsigned.
        //
        ex_str_t r2, a2;
        ex_struct_t r3;

        // standard RPC return codes, that will be set when our
        // RPCs complete.
        clnt_stat e1, e2, e3;
      }

      BLOCK { tcpconnect (h, port, @(fd)); }

      if (fd < 0) {
        warn ("%s:%d: connection failed: %m\n", h.cstr(), port);
      } else {
        // great, we have a working TCP connection; now, simply
        // wrap it in a libasync object, so we can send and
        // receive RPCs over it
        x = axprt_stream::alloc (fd);

        // Given the TCP connection x, make an RPC client that
        // will speak the protocol declared in the file ex_prot.x
        cli = aclnt::alloc (x, ex_prog_1);

        // load values into the RPC arguments
        a2 = "go hang a salami i'm a lasagna hog";

        BLOCK {
          //
          // EX_RANDOM, EX_REVERSE and EX_STRUCT are RPC procedure
          // numbers, defined automatically by the RPC compiler.
          //
          cli->call (EX_RANDOM, NULL, &r1, @(e1));
          cli->call (EX_REVERSE, &a2, &r2, @(e2));
          cli->call (EX_STRUCT, NULL, &r3, @(e3));
        }
        if (e1 || e2 || e3) {
          warn << "at least 1 RPC failed!\n";
        } else {
          warn << "the results are in:\n"
               << "\trandom # = " << r1 << "\n"
               << "\treversed string = " << r2 << "\n"
               << "\tstupid struct = { s = " << r3.s << "; u = " << r3.u << " }\n";
          ret = true;
        }
      }
      // report success or failure to the caller
      (*cb) (ret);
    }

    static void finish (bool rc)
    {
      exit (rc ? 0 : -1);
    }

    int main (int argc, char *argv[])
    {
      int port;
      if (argc != 3 || !convertint (argv[2], &port))
        fatal << "usage: ex2 <hostname> <port>\n";
      try_rpc (argv[1], port, wrap (finish));
      amain ();
    }
The thing to note here is that after tcpconnect returns successfully via the BLOCK mechanism discussed above, a volley of three RPCs is launched in parallel. When the last RPC returns, control picks up right after the corresponding BLOCK block. Also, the new function try_rpc does not call exit directly; rather, it returns to the caller via the continuation cbb cb passed into the function.
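The "resume when the last one returns" behavior of BLOCK amounts to a countdown shared by all the callbacks in the block. Below is a sketch in plain C++ with a toy event queue; pending, fake_rpc, block_sketch and run_pending are all hypothetical names, and this is not the code that tame generates.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <memory>
#include <vector>

// Minimal stand-in for an event loop: completions run in FIFO order.
static std::deque<std::function<void ()>> pending;

// Hypothetical asynchronous call: delivers `value` to `cb` later.
void fake_rpc (int value, std::function<void (int)> cb)
{
  pending.push_back ([value, cb] { cb (value); });
}

// Launch one call per input, then run `done` once, after the last reply.
void block_sketch (const std::vector<int> &inputs,
                   std::function<void (std::vector<int>)> done)
{
  auto results = std::make_shared<std::vector<int>> (inputs.size ());
  auto left = std::make_shared<size_t> (inputs.size ());
  if (inputs.empty ()) { done (*results); return; }
  for (size_t i = 0; i < inputs.size (); i++) {
    fake_rpc (inputs[i], [=] (int r) {
      (*results)[i] = r;                   // each reply fills its own slot
      if (--*left == 0) done (*results);   // the last reply resumes
    });
  }
}

// Drain the simulated event loop.
void run_pending ()
{
  while (!pending.empty ()) {
    auto ev = pending.front ();
    pending.pop_front ();
    ev ();
  }
}
```

Nothing after the block runs until the counter reaches zero, which matches the way the code following BLOCK in try_rpc only sees e1, e2 and e3 once all three have been set.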
A major complaint about event programming is that it destroys intuitive control flow constructs, such as while loops. The next example (ex3.T) shows how tame helps programmers keep their loops:
    #include "tame.h"
    #include "parseopt.h"
    #include "ex_prot.h"

    TAMED static void try_rpc (str h, int port, cbb cb)
    {
      VARS {
        bool ret (false);
        int fd, n (5), i;
        ptr<axprt_stream> x;
        ptr<aclnt> cli;
        vec<int> rv;
        vec<clnt_stat> status_codes;
      }

      BLOCK { tcpconnect (h, port, @(fd)); }

      if (fd < 0) {
        warn ("%s:%d: connection failed: %m\n", h.cstr(), port);
      } else {
        x = axprt_stream::alloc (fd);
        cli = aclnt::alloc (x, ex2_prog_1);

        status_codes.setsize (n);
        rv.setsize (n);

        BLOCK {
          for (i = 0; i < n; i++) {
            cli->call (EX_RANDOM, NULL, &rv[i], @(status_codes[i]) );
          }
        }
        ret = true;
        // check for n-fold success as usual
        for (i = 0; i < n; i++) {
          if (status_codes[i]) {
            warn << "A failure: " << status_codes[i] << "\n";
            ret = false;
          } else {
            // print the value returned by the i-th RPC
            warn << "Result " << i << ": " << rv[i] << "\n";
          }
        }
      }
      (*cb) (ret);
    }

    static void finish (bool rc)
    {
      exit (rc ? 0 : -1);
    }

    int main (int argc, char *argv[])
    {
      int port;
      if (argc != 3 || !convertint (argv[2], &port))
        fatal << "usage: ex3 <hostname> <port>\n";
      try_rpc (argv[1], port, wrap (finish));
      amain ();
    }
This example is slightly more involved than the previous example because the results from the RPC calls must be put into different vector slots. However, the tame syntax remains the same.
A really cool result of tame is that parallel constructs (like the one above) are very similar syntactically to serial constructs. Recall that the core of the above example was the following code, which launches a volley of RPCs in parallel:
    BLOCK {
      for (i = 0; i < n; i++) {
        cli->call (EX_RANDOM, NULL, &rv[i], @(status_codes[i]) );
      }
    }
Let's say that for whatever reason, we now wanted this code to execute in serial: one RPC launches only after the previous one finishes. Then, all we have to do is switch the order of the BLOCK and for lines!
    for (i = 0; i < n; i++) {
      BLOCK {
        cli->call (EX_RANDOM, NULL, &rv[i], @(status_codes[i]) );
      }
    }
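What the swap changes at run time is the number of calls in flight at once. The sketch below, in plain C++ with a toy event queue, contrasts the two launch orders; flight_stats, fake_call, launch_parallel, launch_serial and drain are hypothetical names used only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <deque>
#include <functional>
#include <memory>

// Toy event queue standing in for the libasync event loop.
static std::deque<std::function<void ()>> events;

struct flight_stats {
  int in_flight = 0;      // launched but not yet completed
  int max_in_flight = 0;  // high-water mark of in_flight
  int completed = 0;
};

// Hypothetical asynchronous call that completes on a later loop turn.
void fake_call (std::shared_ptr<flight_stats> s, std::function<void ()> cb)
{
  s->in_flight++;
  s->max_in_flight = std::max (s->max_in_flight, s->in_flight);
  events.push_back ([s, cb] { s->in_flight--; s->completed++; cb (); });
}

// BLOCK { for (...) call }: everything is launched before any reply.
void launch_parallel (std::shared_ptr<flight_stats> s, int n)
{
  for (int i = 0; i < n; i++) fake_call (s, [] {});
}

// for (...) { BLOCK { call } }: each call starts from the previous reply.
void launch_serial (std::shared_ptr<flight_stats> s, int n)
{
  if (n <= 0) return;
  fake_call (s, [s, n] { launch_serial (s, n - 1); });
}

// Run queued completions until none are left.
void drain ()
{
  while (!events.empty ()) {
    auto ev = events.front ();
    events.pop_front ();
    ev ();
  }
}
```

With five calls, the parallel form reaches five calls in flight, while the serial form never has more than one; both complete all five.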
Tamed functions can also call existing libasync-style functions whose callbacks take several arguments:
    void my_existing_function (int i, callback<void, ptr<int>, bool>::ref cb)
    {
      (*cb) (New refcounted<int> (i+2), false);
    }

    TAMED void foo_func ()
    {
      VARS {
        ptr<int> ip;
        bool b;
      }
      BLOCK { my_existing_function (1, @(ip, b)); }
    }
Using classes, and setting class members with '@(..)' is exactly the same as dealing with stack variables:
    static void my_existing_function2 (int i, callback<void, ptr<int> >::ref cb)
    {
      (*cb) (New refcounted<int> (i+4));
    }

    class my_class_t {
      void my_method (int i, CLOSURE);
      ptr<int> k;
    };

    TAMED void my_class_t::my_method (int i)
    {
      VARS {
        ptr<int> j;
      }
      BLOCK {
        my_existing_function2 (i, @(j));  /* exactly as before */
        my_existing_function2 (i, @(k));
      }
    }
In this example, we are making two calls to the same asynchronous function, my_existing_function2. In the first case, we ask for the callback to put its result into the stack variable j. In the second case, we ask for the callback to put its result into this->k. Note that tame will work with private members.
The BLOCK mechanism just discussed should be sufficient for a large percentage of control flow in asynchronous networked applications. It provides an intuitive decomposition of a program into parallel and serial components. The parallel components are dispatched from within a BLOCK {…} block. Each component dispatched is a serial component, which may, in turn, contain parallel components. In other words, functions that return via continuation-passing style are the serial components in this model. To make a new unit of serialized computation, one need only declare a new function (or class method). Such a program decomposition seems consistent with intuition. By contrast, in manual event-driven programming, functions lose their intuitive correspondence to a program's design, since separate functions are often needed to embody “the second half of a computation, after the blocking call.”
Though the BLOCK {…} style of programming is simple and sufficient for many applications, other situations demand more complicated control flow, which tame can accommodate:
When programming Web applications, or other distributed systems, I often find myself “RPC Windowing.” That is, I have 1,000 RPCs to make but want to be polite and not blast them all at once. So the obvious solution is to set a window (like 5) and commit to having at most that many RPCs outstanding at a given time.
At first I thought this sort of thing was possible just given the primitives already described, but there is a deep problem. In RPC windowing, when each RPC returns to the caller, a new RPC must be launched. However, with BLOCK {…} as described above, control only returns after all RPCs have returned.
To accomplish tasks with more involved control flow (such as RPC windowing), a more expressive mechanism is required: nonblocking calls combined with tame's WAIT (…) statement. As we will see later, BLOCK {…} is easily implemented in terms of WAIT.
Here is a sample program (ex4.T):
    #include "tame.h"
    #include "arpc.h"
    #include "parseopt.h"
    #include "ex_prot.h"

    TAMED static void dostuff (str h, int port, cbb cb)
    {
      VARS {
        ptr<axprt_stream> x;
        ptr<aclnt> cli;
        vec<int> res;
        vec<clnt_stat> errs;
        int fd, n_tot (40), window_sz (5), i;
        bool err_occurred (false);
        int return_id;

        // Declare a coordination group for each set of RPCs that will be launched
        // asynchronously and later joined synchronously. Supply up to
        // 4 templated parameters to "tag" calls when they are launched.
        // In this case, we "tag" calls with a single integer.
        coordgroup_t<int> window;
      }

      BLOCK { tcpconnect (h, port, @(fd)); }

      if (fd < 0) {
        warn ("%s:%d: connection failed: %m\n", h.cstr(), port);
        err_occurred = true;
      } else {
        res.setsize (n_tot);
        errs.setsize (n_tot);
        x = axprt_stream::alloc (fd);
        cli = aclnt::alloc (x, ex_prog_1);

        for (i = 0; i < window_sz && i < n_tot; i++) {
          // launch an asynchronous function call, but don't
          // block on its return. Associate the call with the
          // coordination group 'window' and the index value 'i'
          cli->call (EX_RANDOM, NULL, &res[i], @[window,i](errs[i]));
        }

        // When the above calls complete, they will send signals
        // to those waiting on a coordgroup. So we only want
        // to keep going as long as we have outstanding signals
        // left
        while (window.n_signals_left ()) {

          // Block until a call has returned; when it has, get
          // the index variable associated with the call, and assign
          // it to 'return_id'.
          WAIT (window, return_id);
          if (errs[return_id]) {
            err_occurred = true;
            warn << "RPC error: " << errs[return_id] << "\n";
          } else {
            warn << "Success " << return_id << ": " << res[return_id] << "\n";

            // in the common case, for every call that returns,
            // another should be launched. but don't bother launching
            // more if an error occurred.
            if (i < n_tot && !err_occurred) {
              cli->call (EX_RANDOM, NULL, &res[i], @[window,i](errs[i]));
              i++;
            }
          }
        }
        warn << "All done...\n";
      }
      (*cb) (!err_occurred);
    }

    static void finish (bool rc)
    {
      exit (rc ? 0 : -1);
    }

    int main (int argc, char *argv[])
    {
      int port;
      if (argc != 3 || !convertint (argv[2], &port))
        fatal << "usage: ex4 <hostname> <port>\n";
      dostuff (argv[1], port, wrap (finish));
      amain ();
    }
The details of establishing an RPC-over-TCP connection are exactly as before. The difference comes with how the asynchronous calls are fired off, and later joined.
    for (i = 0; i < window_sz && i < n_tot; i++) {
      // launch an asynchronous function call, but don't
      // block on its return. Associate the call with the
      // coordination group 'window' and the index value 'i'
      cli->call (EX_RANDOM, NULL, &res[i], @[window,i](errs[i]));
    }
Asynchronous calls launched outside of BLOCK are first associated with a pointer to a coordination group, or a coordgroup. A coordgroup is a simple object that will coordinate launches and joins for a given group of events (see async/tame.h for the class definition of a coordgroup). Each coordgroup is also associated with data types used in distinguishing callbacks from each other when they are eventually joined. In the given case, we are launching window_sz RPCs, labelled by an integer from 0 to window_sz - 1. A distinct value for i is associated with each call, and consequently, with each callback.
As before, the callback argument is given with @(..) syntax, but with an extra phrase in brackets that names the coordination group the new callback belongs to, followed by any bound-in values. When this function eventually “returns” by calling its callback, it will store the status of the RPC in the ith slot of the vector errs.
Once all RPCs in the window are launched, control will continue to the subsequent WAIT statement:
    // a join is needed for every call that returns
    while (window.n_signals_left ()) {

      // Block until a call has returned; when it has, get
      // the index variable associated with the call, and assign
      // it to 'return_id'
      WAIT (window, return_id);
      if (errs[return_id]) {
        err_occurred = true;
        warn << "RPC error: " << errs[return_id] << "\n";
      } else {
        warn << "Success " << return_id << ": " << res[return_id] << "\n";

        // in the common case, for every call that returns,
        // another should be launched. but don't bother launching
        // more if an error occurred.
        if (i < n_tot && !err_occurred) {
          cli->call (EX_RANDOM, NULL, &res[i], @[window,i](errs[i]));
          i++;
        }
      }
    }
The condition in the while loop is true so long as the coordination group has any RPC calls outstanding, or calls that have not yet been joined. Thus, assuming that window_sz and n_tot are greater than 0, control will reach the WAIT statement at least once. As the comments suggest, control will appear to block at the WAIT statement until one of the RPC calls from above completes. When control passes the WAIT statement, the closure variable return_id will be set to the value bound to the joined asynchronous call at call time. If the call was launched when i==2 above, then return_id will have the value 2. In this way, the code following the WAIT statement can respond to the specific RPC that returned.
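The launch-then-join loop can be mimicked in plain C++ to check the windowing logic on its own. This is only a sketch under stated assumptions: coord_sketch, fake_rpc, wait_sketch and windowed are hypothetical names, the "network" is a toy queue, and wait_sketch spins that queue directly, whereas the real WAIT returns control to the event loop.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <vector>

// Hypothetical stand-in for a coordination group tagged with one int:
// callbacks push their tag when they fire; the waiter pops tags.
struct coord_sketch {
  std::deque<int> fired;   // tags of completed calls not yet joined
  size_t outstanding = 0;  // launched but not yet joined
  size_t n_signals_left () const { return outstanding; }
};

static std::deque<std::function<void ()>> net;  // simulated network

// Hypothetical RPC: the reply for slot i is i * 10.
void fake_rpc (coord_sketch &g, std::vector<int> &res, int i)
{
  g.outstanding++;
  net.push_back ([&g, &res, i] { res[i] = i * 10; g.fired.push_back (i); });
}

// Stand-in for WAIT: run events until some call has fired, return its tag.
int wait_sketch (coord_sketch &g)
{
  while (g.fired.empty ()) {
    auto ev = net.front ();
    net.pop_front ();
    ev ();
  }
  int tag = g.fired.front ();
  g.fired.pop_front ();
  g.outstanding--;
  return tag;
}

// Issue n_tot calls, never more than window_sz at a time.
// Returns the largest number of calls outstanding at once.
size_t windowed (int n_tot, int window_sz, std::vector<int> &res)
{
  coord_sketch g;
  size_t peak = 0;
  int i = 0;
  res.assign (static_cast<size_t> (n_tot), -1);
  for (; i < window_sz && i < n_tot; i++) fake_rpc (g, res, i);
  while (g.n_signals_left ()) {
    if (g.n_signals_left () > peak) peak = g.n_signals_left ();
    wait_sketch (g);                         // join one finished call
    if (i < n_tot) fake_rpc (g, res, i++);   // refill the window
  }
  return peak;
}
```

With n_tot = 40 and window_sz = 5, as in ex4.T, the sketch never has more than five calls outstanding, and every result slot is filled by the time the loop exits.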
In the above example, the declaration of the coordgroup_t and all references to it are in the same function, but keeping all references to a coordgroup_t within a single function is not required. New callbacks can be added to coordination groups and can be subtracted via WAIT from anywhere where the given coordination group is in scope. Of course, coordgroup_ts are regular C++ objects, so they can be passed to functions, declared extern, etc.
It is suggested that all coordination variables created with @ calls are later WAITed on, and I have trouble thinking of a case in which one would not want to WAIT on all outstanding calls. However, programs should still work as expected if zombie calls are left over. That is, the coordgroup will only be deallocated once all of its callbacks have been called.
Another example that comes to mind is to make an equivalent request of N replicas, and to continue the computation only with the first to respond. Such a policy is easily implemented with WAIT:
    TAMED static void connect (vec<str> hosts, int port, cbi done)
    {
      VARS {
        u_int i;
        ptr<int> fd;
        coordgroup_t<u_int, ptr<int> > CG;
        bool got_one (false);
      }

      for (i = 0; i < hosts.size (); i++) {
        fd = New refcounted<int>();
        tcpconnect (hosts[i], port, @[CG,i,fd](*fd));
      }

      while (CG.n_signals_left ()) {
        WAIT(CG, i, fd);
        warn << hosts[i] << ":" << port << ": ";
        if (*fd >= 0) {
          warnx << "connection succeeded";
          if (!got_one) {
            (*done) (*fd);
            got_one = true;
          } else {
            warnx << "... but too late!";
            close (*fd);
          }
          warnx << "\n";
        } else {
          warnx << "connection failed\n";
        }
      }
      if (!got_one)
        (*done) (-1);
    }
This code is part of a larger example (ex7.T) that, given N different hostnames, performs DNS lookups and TCP session establishment in parallel. Whichever host is ready to go first is kept, and the others are closed. Then, a Web request is issued, read, and dumped to standard output.
In my short experience using tame, I've done most of my work with the more straightforward BLOCK {…} construct, and I think that most people will have the same experience. For this reason, tame implements BLOCK as a primitive, and not in terms of the more expressive WAIT construct. The real advantage to BLOCK as a primitive is that the code that tame emits is much simpler for BLOCK, and therefore, programmers will find this code easier to debug.
However, as an exercise, one is easily implemented in terms of the other. The following input code:
    BLOCK { /* blocking code */ }
    /* code after the block */
Is equivalent to:
    VARS { coordgroup_t<> __fresh_coordgroup; }
    /* blocking code, in which all callbacks @(a,b,c) are
     * rewritten as @[__fresh_coordgroup](a,b,c) */
    while (__fresh_coordgroup.n_signals_left ()) {
      WAIT (__fresh_coordgroup);
    }
    /* code after the block */
An example to test this is ex3b, which does the same thing as ex3 but with non-blocking flow instead of BLOCK.
The hidden danger in using tame is that it's possible to leak closures. For each function marked with TAMED, the rewriter creates and allocates a function-specific closure, capturing all arguments passed to the tamed function and all variables declared within VARS {…}. We would want these closures to be automagically deallocated when control leaves the tamed function for the last time, and they are, so long as nothing else is referring to the closure.
See the paper version of tame for more information on this.
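The leak can be pictured with reference counting in plain C++. In this sketch std::shared_ptr stands in for sfslite's ptr<>, and closure_sketch, live_closures and make_callback are hypothetical names; it shows the general hazard, not the details of tame's closure management.

```cpp
#include <cassert>
#include <functional>
#include <memory>

// Number of closures currently alive.
static int live_closures = 0;

// Hypothetical closure; construction and destruction are counted.
struct closure_sketch {
  closure_sketch () { live_closures++; }
  ~closure_sketch () { live_closures--; }
};

// A callback keeps its closure alive by holding a counted reference.
std::function<void ()> make_callback ()
{
  auto c = std::make_shared<closure_sketch> ();
  return [c] { /* would resume the function using c */ };
}
```

As long as some object holds on to the callback, for example a table of pending requests that is never cleaned up, the closure stays allocated; it is freed only when the last holder lets go.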