I did a proof of concept in ruby, but it certainly won't scale. After reading the SEDA stuff, and playing with Java NIO a bit, I figured I might actually be able to get away with doing this in Java instead of OCaml or C.
First fruits are paying off. I didn't go for a full fledged SEDA architecture (though I may base a future version on Sandstorm) but a simple worker pool handling requests, and it scales up pretty nicely. I need to put it side by side with Pound and see how it does. I expect it to underperform on a single instance, as Pound is pretty well done, and much closer to the metal, even with NIO. That said, Pound doesn't do failover or clustered spraying, so I am willing to bet that the actual scalability of monica will surpass Pound pretty quickly. Need spare machines. I don't think I am as efficient as possible in the worker threads at the moment, each thread is dealing with a BufferedReader on a Socket. NIO Buffer access may be quicker, may not. May need to hit Borders and nose through O'Reilly's NIO book.
Also, I may look into swapping out to SEDA's NBIO library, the java.nio -- java.net combo seems like something tacked on, despite being the primary benficiary of NIO. We shall see. Need to spend more time on OJB... Argh. Too many itches.
3 writebacks [/src/monica] permanent link
Thu, 21 Aug 2003As I need an arbitrarily scalable load balancer, and I know of know general purpose free/open one... Time to scratch my itch. Too many itches at the moment
Basically monica is a smart sprayer with hooks for the workers to communicate back. It is designed such that all monica instances replicate state across all other monica instances. However, the state is very small (typically two 32 bit integers per session) and they don't do much work. Ideally a low number of machines running monica can service a very large population of workers.
Monica will tag each http session via cookie or uri tagging (ala JSESSIONID) and use said tag for all future spraying. The tag will map to a bucket. The bucket can be X number of IP addresses. The spraying token will name a bucket, and a machine in the bucket will receive the work (any machine in the bucket). Typically it will jsut round-robin within a bucket - eventually behavior of mapping to what ina bucket may be pluggable. Round robin is fast though, and fast is good because monica clusters are not highly scalable (all monica instances share state) whereas the number of buckets should scale up very high.
Now, the construction of the state tokens that monica needs to share. In order to facilitate scaling monica works in broadcast mode. When an instance creates a new state mapping it broadcasts the mapping to the entire network. Now, to avoid mapping key collisions each instance needs to prepend each of its tokens with a unique id for that instance. So... when a monica instance comes live it broadcasts a "hello" and waits 10 seconds for replies. Each monica instance hearing the hello broadcasts its id. The new monica instance then builds a new id that doesn't conflict with any of the existing ones. This should be random. It then broadcasts its new id to double check against race conditions of two monica instances coming up at the same time. If a monica instance receives an id broadcast with its id, it renogiates a new id. Once the new instance is live, it makes a TCP "update me" request to one of the monica instances that replied. It begins logging all broadcasts while this is built. Once it is built it applies new broadcasts to its state table. This allows a monica instance to come up very quickly and maintain accurate state.
When a new session is established monica broadcasts the session identifier (8 bit id + 24 bit sessionid) allowing for each monica instance to have 24 million sessions live at a given time. If this isn't sufficient we can tweak the id size, or the sessionid size to allow for more. Assuming eight monica instances as access points via round-robin dns that gives (8.0 *. 2.0 ** 24.0) 134,217,728concurrent sessions, assuming one quarter are live and nine tenths are idle and not yet timed out, that allows around 3-4 million concurrent users. This solution may not work for google, but anyone else is probably okay - and google maintains all of its state in the uri anyway, so it doesn't need server affinity. A more practical limit is the number of TCP sockets available. ID size is fine.
Buckets are initially configured in the monica configuration, however a bucket can broadcast its existance if it is not in the monica configuration. As adding and naming buckets in the monica configuration is the easy part, the protocol for announcing a new bucket is the interetsing bit. As the API for bucket -> monica communication is not yet defined, lets take a look at it. As this communication shouldn't happen often, and is pretty important to the bucket, we can move to TCP for this. A bucket opens a connection to a known monica instance, transmits its command, gets acknowledged, and closes its socket.
Now, the bucket-announcement message is received by a single monica instance. It prepends its monica-id to a 24 bit number and broadcasts it to the monica network, appending the relevant IP addresses (why not encode them as nice 32 bit integers) to the announcement message. Voila, bucket now starts to get load.
Speaking of the bucket -> monica interface, lets look at what other messages a bucket can send:
Hmm, not bad. A formalized protocol for buckets to talk to sprayers is a nice idea even if monica never takes off.