tef

bad poster & mediocre photographer

  • they/them

sometime over a decade ago, i got badly nerdsniped at work. since then, i've written posts, demo code, and even given talks about doing weird things with rpc. the worst bit is that i'm still learning things.

anyway, here's a quick summary of where my little "rpc toolkit" side project is, and how i got there:


it is 2012 and I am bored at work.

we have a little distributed webcrawler, and it comes with some odd requirements. we need something pure python that handles sending over both unicode strings and raw bytestrings, and json and base64 won't cut it. we end up hacking together something with http, bencode and decorators, and it sorta does the job, but it's clumsy.

herein lies the problem that started the itch. i would like to write code like this:

api = Connect()
workQueue = api.QueueFor(worker_id)
while workQueue.Active():
    job = workQueue.Next()
    result, error = Process(job)
    job.Complete(result, error)

i admit, it's not great code, but it has objects, and those objects carry state around: which queue i'm looking at, which job i'm reporting on. these are things you cannot take for granted when writing rpc code.

this is what you actually have to write in the year of our lord, 2024:

api = Client()
queue_id = api.QueueForWorker(worker_id)
while True:
    result = api.QueueActive(queue_id)
    if not result.Active.Bool():
        break
    result = api.QueueNext(worker_id, queue_id)
    job_id = result.JobId.String()
    jobResult, error = Process(result.JobDescription.Dict())
    api.JobMarkComplete(job_id, worker_id, jobResult, error)

just look at it. a complete fucking nightmare to maintain.

every single method is reduced to a function with a variety of arguments the objects kept hidden from you. now consider adding a new bit of state to the code. in the first example, you barely do anything. in the second example, you dodge and weave through the code to implant a new integer somewhere amongst the function calls, and hope for the best.

it's just really clunky, and neither i nor my coworker was willing to put up with it anymore. we had everything we needed for a side project. we'd both worked in screen scraping, restful was in fashion, and somewhere along the line one of us had suggested "what if we make the api more like a website, with links and forms", and somehow we thought it would be fun.

  • let's define our api code on the server
  • use reflection to generate an admin-style webpage for the object
  • each webpage had links, forms, and data, for each attribute or method on the object
  • clicking on those links took you to other pages, which could contain plain old data, but could also be other object-admin-pages
  • we then wrote a screen scraping library

it's a bit of a mouthful, but it worked.

our code had objects again.

api = Connect()  # Get the Admin webpage
workQueue = api.QueueFor(worker_id)  # Submit the form named 'QueueFor'
while workQueue.Active():  # Submit the form named 'Active' on the workQueue page
    job = workQueue.Next()  # Another form
    result, error = Process(job)
    job.Complete(result, error)  # Yep. Another form.

the objects in question are just wrappers around our webpages. calling foo.bar() scanned for a form, and submitted it. underneath, an object might map to a url like /Queue/?id=123 or /Job/?id=job_id, but the client had no idea. it just looked for forms called "Next", and followed the link it was given.
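
a rough python sketch of that kind of client, not the original code. the `serve` function is a fake stand-in for http, and every url, page layout and name in it (`serve`, `RemoteObject`, `drain`, the `forms` dict) is made up for illustration:

```python
from urllib.parse import urlencode, urlparse, parse_qs

QUEUES = {"7": ["job-a", "job-b"]}  # hypothetical server-side state


def queue_page(queue_id):
    # a "page" is just a dict; forms map a name to the url to submit to
    query = urlencode({"id": queue_id})
    return {"forms": {"Active": "/Queue/Active?" + query,
                      "Next": "/Queue/Next?" + query}}


def serve(url, args):
    """fake server: takes a url plus form arguments, returns a page."""
    parsed = urlparse(url)
    params = {key: values[0] for key, values in parse_qs(parsed.query).items()}
    if parsed.path == "/":
        return {"forms": {"QueueFor": "/QueueFor"}}
    if parsed.path == "/QueueFor":
        return queue_page(args[0])
    if parsed.path == "/Queue/Active":
        return {"value": bool(QUEUES[params["id"]])}
    if parsed.path == "/Queue/Next":
        return {"value": QUEUES[params["id"]].pop(0)}
    raise KeyError(url)


class RemoteObject:
    """client-side wrapper around a page. it only ever knows form names."""

    def __init__(self, page):
        self._page = page

    def __getattr__(self, name):
        # only reached for attributes that aren't found the normal way
        if name.startswith("_") or name not in self._page["forms"]:
            raise AttributeError(name)
        url = self._page["forms"][name]

        def submit(*args):
            result = serve(url, args)
            # plain data comes back as a value, anything else is another object
            return result["value"] if "value" in result else RemoteObject(result)

        return submit


def drain(worker_id):
    api = RemoteObject(serve("/", ()))
    queue = api.QueueFor(worker_id)
    seen = []
    while queue.Active():
        seen.append(queue.Next())
    return seen
```

the client never builds a url itself. `drain("7")` walks from the root page to the queue page purely by form name, so the server is free to change what sits behind `Next`.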

by using names for navigation and urls for state, we could add in new parameters to the urls without changing any details on the client. we could even return objects that didn't exist on the server, hiding enough state in the url to reconstruct it when the next request arrived.
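
the "hide state in the url" trick can be sketched in a few lines of python. this is an illustration under assumptions, not the original code: `job_url` and `rebuild_job` are hypothetical helpers, and the `worker` parameter is an invented example of extra state:

```python
from urllib.parse import urlencode, urlparse, parse_qs


def job_url(job_id, worker_id):
    # everything needed to rebuild the job object later rides along in the url
    return "/Job/?" + urlencode({"id": job_id, "worker": worker_id})


def rebuild_job(url):
    # the server keeps nothing between requests; it reconstructs the
    # object from the query string when the next request arrives
    query = parse_qs(urlparse(url).query)
    return {"job_id": query["id"][0], "worker_id": query["worker"][0]}
```

since the client treats the url as opaque, adding another parameter to `job_url` later on wouldn't need any client changes.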

sure, it wasn't the fastest or nicest rpc system in the world, but it was one of the few that had objects, without needing schemas or code generation. never having to write a client stub to rethread state was the well deserved reward for our fucking around.

that was the point i got nerd sniped.

i was convinced.

to me, this was how rpc should be done, or at least, a lot of people could genuinely benefit from the approach:

  1. we'd built a reflection service for our rpc service. being able to connect to any service with the same library was very useful for debugging.
  2. we'd built something using state-transfer. hiding object state inside urls made for a very pleasant experience.

unfortunately, "a lot of people" probably meant about two or three people, tops. most programmers i'd met hadn't worked on distributed systems, things with rpc calls, or things with enough rpc calls to warrant spending any engineering time on making things better.

... and i hadn't made things better for everyone.

sure, it worked very nicely in python, but as soon as you picked up a more static language, you entered a world of pain: the idea of navigating an api through reflection isn't a pleasant one.

even so, it's a fun idea, and i kept hacking away. i moved away from webpages and started using more "typed" results, hoping to make things nicer in static languages. i wrote experimental command line clients, with things like command line completion, and even some nice text layout too.

it turns out the last ten years have mostly been turd polishing, alas.

fast forward to 2024 and I have bummed out of another golang job, and I start scratching the itch again. this time i'm going to make things work in a static language, no matter what.

... and I might have made it work, ish.

api = Connect()
workQueue = api.Call("QueueFor", worker_id)
for workQueue.Call("Active") {
    job = workQueue.Call("Next")
    result, error = Process(job.Call("Args"))
    job.Call("Complete", result, error)
}

yeah, it sucks.

i won't get that lovely object api without generating stubs, and as soon as you move to a more traditional api.Invoke("method"...) style api, you lose a lot of the magic that charmed me to begin with. there's almost no benefit to building reflection services for static clients.

that said, there are still benefits to the approach. having some api.Call().Call() method chain of requests thread state from one to the next invisibly is kinda neat, and there's barely any reflection or self description involved.

you could do the same by adding a single int to the protocol.

  • instead of sending over call <method_name>, i send over call <object_id>, <method_name>
  • instead of sending back result <value>, i send back result <object_id>, <result>
  • api.Call().Call() will use the object id returned from the first call in the second, linking them together in the protocol
  • now, every rpc call can limit the scope of the ones chained after it, add new state, or offer different methods entirely.
  • you can think of it as namespacing methods, or you can think of it as just having an object address too
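
the bullets above, as a minimal python sketch. it's a toy under assumptions rather than the real toolkit: the tuple message shapes, the `Server` object table and the `Root`, `Queue` and `Handle` names are all invented here:

```python
class Queue:
    def __init__(self, jobs):
        self.jobs = list(jobs)

    def Active(self):
        return bool(self.jobs)

    def Next(self):
        return self.jobs.pop(0)


class Root:
    def QueueFor(self, worker_id):
        return Queue(["job-a", "job-b"])


class Server:
    def __init__(self):
        self.objects = {0: Root()}  # object 0 is the entry point

    def handle(self, request):
        # request is ("call", object_id, method_name, args)
        _, object_id, method, args = request
        # a real server would check the method name against an allow list
        result = getattr(self.objects[object_id], method)(*args)
        if isinstance(result, (Root, Queue)):
            # returning an object means handing back a fresh object id
            new_id = len(self.objects)
            self.objects[new_id] = result
            return ("result", new_id, None)
        return ("result", object_id, result)


class Handle:
    """client side: remembers the object id so the caller doesn't have to."""

    def __init__(self, server, object_id=0, value=None):
        self.server, self.object_id, self.value = server, object_id, value

    def Call(self, method, *args):
        request = ("call", self.object_id, method, args)
        _, object_id, value = self.server.handle(request)
        return Handle(self.server, object_id, value)
```

with that, `Handle(Server()).Call("QueueFor", "w1").Call("Next").value` threads the queue's id from the first call into the second without the caller ever seeing it.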

the actual code in question does something more fancy. it addresses objects as an (int, string list) pair, and returns new addresses as an (int, string list) pair, so it can smuggle state back to the client.
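
a sketch of how an (int, string list) address might smuggle state, in python. the post doesn't spell out what each half means, so this assumes the int picks the kind of object and the strings carry its state. the `ROOT`, `QUEUE` and `JOB` numbering, the `handle` function and the `JOBS` table are all hypothetical:

```python
ROOT, QUEUE, JOB = 0, 1, 2  # assumed numbering of object kinds

JOBS = {"w1": ["job-a", "job-b"]}  # made-up server-side work


def handle(address, method, args):
    """returns (new_address, value). the server rebuilds context from the address."""
    kind, state = address
    if kind == ROOT and method == "QueueFor":
        return (QUEUE, [args[0]]), None
    if kind == QUEUE and method == "Next":
        worker_id = state[0]
        job_id = JOBS[worker_id].pop(0)
        # the new address carries both ids, so the client never handles them
        return (JOB, [worker_id, job_id]), job_id
    if kind == JOB and method == "Complete":
        worker_id, job_id = state
        return address, f"{job_id} completed by {worker_id}"
    raise LookupError((kind, method))


class Handle:
    def __init__(self, address=None, value=None):
        self.address = address if address is not None else (ROOT, [])
        self.value = value

    def Call(self, method, *args):
        address, value = handle(self.address, method, args)
        return Handle(address, value)
```

here `Handle().Call("QueueFor", "w1").Call("Next").Call("Complete")` reports the right job for the right worker, even though the client only passed `worker_id` once.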

again, it's not great, but it does work well enough. i'm not writing stubs, i'm not threading state by hand, and it is still an improvement. i don't know if the other three people will agree with me, though.

it is genuinely weird to spend a decade playing around with hypermedia only to end up going back to something that looks far more like rpc than anything else. that said, by sending over relative addresses, not absolute addresses, the server remains in control of the namespace. i already wrote about that, but it does bear repeating.

the big idea in hypermedia is that although the client operates in terms of addresses, it navigates to them through named links given to it by the server. this is what lets you move things around behind the scenes, and lets you carry state around between requests. it just never occurred to me that i could do it server side, and skip all the navigation.

i've spent a decade asking how to make apis more like websites and it turns out i should have also been asking how to make websites, well, hypermedia systems, more like rpc.

whoops

