tef


here's my problem: i want to have some sort of api, i'd like to have crud operations, and i'd like to be able to browse and search the api.

it's not a lot to ask, kubectl does all these things and more, but it's very tied to kubernetes itself. time for a side quest.

now i could use gRPC, i could hack something up with HTTP, but remote filesystem protocols do a lot of these things already: they're crud, you can browse them, and sometimes you have an HTTP-like POST operation to tell a fileserver to do something magical, or you can fake it with special directories. most importantly? i can use the same file system client for lots of different servers. the old plan9 trick.

that's why i started asking "hey, what if kubernetes-alike spoke 9p" and then "what changes could we make to 9p to make it a better fit for these sorts of services". yes, 9p exists, it's supported, but that's not as fun as playing around with a new design.

i have needs. let's go think about protocols. let's think about 9p.


9p is an inode and file descriptor based protocol. roughly speaking it looks like this:

# First api request gets things like root node id
config = server.Auth(...)

# You can list directories manually
entries = server.ReadDir(config.Root, count=10, offset=100)
# or walk to a specified file from a relative path
inode = server.Walk(config.Root, "/home/rob/foo")

# Then it's a very file-descriptor like API to update things
fd = server.Open(inode)
fd.Write(...)
fd.Read(...)

let's dig in and make some changes.

let's fix ReadDir first. a count, offset pair isn't the best way to paginate a list. it would be better to pass around some opaque wodge of bytes back and forth, so that servers can do more efficient lookups when paginating.

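# an empty token means "start at the beginning", a nil token means "no more pages"
entries, next = server.ReadDir(Root, "")
print(entries)
while next is not nil {
     entries, next = server.ReadDir(Root, next)
     print(entries)
}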
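to make the opaque token idea a bit more concrete, here's a minimal go sketch. everything in it is made up for illustration: the Dir type, the limit argument, and the choice to encode "the last name returned" as the token. a real server could put whatever it likes in there (a b-tree position, a snapshot id), which is the whole point.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"sort"
)

// Dir is a hypothetical in-memory directory holding only entry names.
type Dir struct {
	names []string // kept sorted so a token can mean "resume after this name"
}

// NewDir builds a Dir from an unsorted list of names.
func NewDir(names ...string) *Dir {
	sorted := append([]string(nil), names...)
	sort.Strings(sorted)
	return &Dir{names: sorted}
}

// ReadDir returns up to limit names after the position encoded in token,
// plus a token for the next page. An empty token in means "start at the
// beginning"; an empty token out means "no more entries".
func (d *Dir) ReadDir(token string, limit int) ([]string, string, error) {
	if limit < 1 {
		limit = 1
	}
	start := 0
	if token != "" {
		last, err := base64.URLEncoding.DecodeString(token)
		if err != nil {
			return nil, "", fmt.Errorf("bad token: %w", err)
		}
		// Resume after the last name seen, even if that entry has since gone.
		start = sort.SearchStrings(d.names, string(last))
		if start < len(d.names) && d.names[start] == string(last) {
			start++
		}
	}
	end := start + limit
	if end >= len(d.names) {
		return d.names[start:], "", nil
	}
	page := d.names[start:end]
	next := base64.URLEncoding.EncodeToString([]byte(page[len(page)-1]))
	return page, next, nil
}

func main() {
	dir := NewDir("c", "a", "e", "b", "d")
	token := ""
	for {
		page, next, err := dir.ReadDir(token, 2)
		if err != nil {
			panic(err)
		}
		fmt.Println(page)
		if next == "" {
			break
		}
		token = next
	}
}
```

unlike a count and offset pair, resuming from a name doesn't skip or repeat entries when something earlier in the directory gets added or removed between calls.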

looks nice enough. next, let's remove the file descriptors.

we'll add a Get(inode) and a Put(inode, contents) method onto the server, as our imaginary client is always going to be doing file-at-a-time updates, and it saves us having to keep track of open cursors on the server.

server.Get(Root)
f = server.Walk(Root, "/My file")
server.Put(f, "hi")

hmm.

a little suspiciously http like, but that was our intention. while we're streamlining operations, it's worth noting that a Walk will usually be followed by Get or a Put. what if we changed those operations to take a relative path, as well as an inode:

Root = server.Walk(config.Root, "/home/tef")
server.Get(Root, "/my file.txt")
server.Put(Root, "/my file.txt", "hello")
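here's roughly what "inode plus relative path" could look like on the server side, as a go sketch. the Store, node, and Mkdir names are all invented for this example, and it assumes Put only creates the last path segment rather than any missing parent directories.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// node is a hypothetical inode: a directory if children is non-nil, else a file.
type node struct {
	children map[string]uint
	contents string
}

// Store is an in-memory stand-in for the server; inode 1 is the root.
type Store struct {
	nodes map[uint]*node
	next  uint
}

func NewStore() *Store {
	root := &node{children: map[string]uint{}}
	return &Store{nodes: map[uint]*node{1: root}, next: 2}
}

// parts splits a path on "/" and drops empty segments.
func parts(path string) []string {
	return strings.FieldsFunc(path, func(r rune) bool { return r == '/' })
}

// add links a new node under dir and returns its inode id.
func (s *Store) add(dir *node, name string, n *node) uint {
	id := s.next
	s.next++
	s.nodes[id] = n
	dir.children[name] = id
	return id
}

// Walk resolves path relative to the inode id, one segment at a time.
func (s *Store) Walk(id uint, path string) (uint, error) {
	if s.nodes[id] == nil {
		return 0, errors.New("no such inode")
	}
	for _, name := range parts(path) {
		child, ok := s.nodes[id].children[name]
		if !ok {
			return 0, errors.New("not found: " + name)
		}
		id = child
	}
	return id, nil
}

// Mkdir creates a directory called name inside the directory id.
func (s *Store) Mkdir(id uint, name string) (uint, error) {
	dir := s.nodes[id]
	if dir == nil || dir.children == nil {
		return 0, errors.New("not a directory")
	}
	return s.add(dir, name, &node{children: map[string]uint{}}), nil
}

// Get is a Walk followed by a read, in one call.
func (s *Store) Get(id uint, path string) (string, error) {
	target, err := s.Walk(id, path)
	if err != nil {
		return "", err
	}
	return s.nodes[target].contents, nil
}

// Put walks to the parent, then creates or overwrites the last segment.
func (s *Store) Put(id uint, path, contents string) (uint, error) {
	segs := parts(path)
	if len(segs) == 0 {
		return 0, errors.New("empty path")
	}
	parent, err := s.Walk(id, strings.Join(segs[:len(segs)-1], "/"))
	if err != nil {
		return 0, err
	}
	dir, name := s.nodes[parent], segs[len(segs)-1]
	if dir.children == nil {
		return 0, errors.New("not a directory")
	}
	child, ok := dir.children[name]
	if !ok {
		child = s.add(dir, name, &node{})
	}
	if s.nodes[child].children != nil {
		return 0, errors.New("is a directory")
	}
	s.nodes[child].contents = contents
	return child, nil
}

func main() {
	s := NewStore()
	home, _ := s.Mkdir(1, "home")
	tef, _ := s.Mkdir(home, "tef")
	id, _ := s.Put(tef, "/my file.txt", "hello")
	body, err := s.Get(1, "/home/tef/my file.txt")
	fmt.Println(id, body, err)
}
```

the same file is reachable from the root with a long path or from a closer inode with a short one, and the client never has to do the path arithmetic itself.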

hmm.

hmmmmmm.

let's take a step back. have we just reinvented http? what gives? can we just drop the inodes and call it a day?

let's go back and ask why we used inodes to begin with:

  • inodes are unique: because of hard links, two file names can point to the same file, but every inode points to a different file.
  • inodes are stable: inodes stay the same even when you rename things.
  • storing things as inodes and directories underneath makes renaming things very cheap, and makes lookups very fast, for reasons i won't dive into.
  • this means a 9p client can cache files, even when things are being fucked around with, or even work offline.

offline copies, cached files, these are two polite ways of referring to a horrific problem: replication. inodes provide unique stable keys, and we'd have to add them if they weren't already there.

similarly, directory entries are necessary for replication, too.

forgive me for not going into details, but to replicate deletions, you either need to use tombstone entries, or you need to return lists of keys still considered valid. that second one is a directory entry. a client can pull over whole directory entries to trace which inodes are still considered reachable from the root, and thus prune deleted entries. neat, huh?
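the pruning step is a plain reachability sweep, sketched below in go. the Listing type and the shape of the client cache are assumptions made up for the example, not anything 9p defines.

```go
package main

import (
	"fmt"
	"sort"
)

// Listing is a hypothetical directory listing: entry name -> inode id.
type Listing map[string]uint

// Prune walks directory listings from root, marks every inode it can reach,
// and deletes everything else from the client's cache. It returns the ids
// it dropped, sorted.
func Prune(root uint, dirs map[uint]Listing, cache map[uint][]byte) []uint {
	reachable := map[uint]bool{}
	stack := []uint{root}
	for len(stack) > 0 {
		id := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if reachable[id] {
			continue // hard links can make an inode reachable twice
		}
		reachable[id] = true
		for _, child := range dirs[id] {
			stack = append(stack, child)
		}
	}

	var dropped []uint
	for id := range cache {
		if !reachable[id] {
			dropped = append(dropped, id)
			delete(cache, id)
		}
	}
	sort.Slice(dropped, func(i, j int) bool { return dropped[i] < dropped[j] })
	return dropped
}

func main() {
	// Listings freshly pulled from the server: inode 2 has two names.
	dirs := map[uint]Listing{
		1: {"a": 2, "b": 3},
		3: {"c": 4, "also-a": 2},
	}
	// The client still holds copies of 5 and 6, which were deleted upstream.
	cache := map[uint][]byte{2: nil, 4: nil, 5: nil, 6: nil}
	fmt.Println(Prune(1, dirs, cache))
}
```

no tombstones needed: anything the listings no longer mention is, by definition, gone.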

in the end, "it looks like http" isn't a bad thing, even if we've glued inodes onto the side. let's keep going and see what this clown car looks like. let's go full on HTTP and add a POST method.

it's not the worst idea in the world.

plan9 services do expose "run this thing on the server" to clients, but each one does it in a slightly different way. some expose a file that ad-hoc commands get sent to, others have a ticketing style system where one file returns a path to a directory, and that directory contains control, read, and write files.

it's so much easier on the server to just have a POST-like method. it's even worth asking why plan9 didn't add something themselves. i can't be sure, but there are a few obvious reasons:

  • the plan9 people wanted to avoid implementing syscalls, and POST is a big pick-me sign waving loophole to smuggle them in
  • with a file descriptor based protocol, you already have the means of writing something into a file descriptor, POST doesn't add anything new.
  • similarly, pipelining walk and open wouldn't save much time and only add complexity

and while we're here, absolutely no-one sensible would go and make their own filesystem protocol for a network service, so there's no point in asking why kubernetes or other people haven't done this. the real benefit of this approach is common tooling, and that doesn't help when you're the only consumer.

anyway, there are still a few more touches to add.

  • we could implement conditional get and put, passing in a selector to match against version or etag/content id
  • we could pass in selectors to ReadDir to filter out entries

server.Get(inode, path, Where{Equal{Id: 123}})

again, these aren't reasonable demands of 9p, but reasonable feature requests of a protocol for a kubernetes-like store, and again, we're back to reimplementing http. this time we've added query strings to the requests.
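a conditional put might look something like this go sketch. the Selector interface, the Version field, and ErrPrecondition are all assumptions invented here; the only idea borrowed from above is "pass a selector, and the server refuses if it doesn't match".

```go
package main

import (
	"errors"
	"fmt"
)

// Metadata is a hypothetical per-file record; Version bumps on every write.
type Metadata struct {
	Id      uint
	Version uint
}

// Selector is a hypothetical precondition checked against stored metadata.
type Selector interface {
	Match(m Metadata) bool
}

// Equal matches only when the stored version is exactly Version.
type Equal struct{ Version uint }

func (e Equal) Match(m Metadata) bool { return m.Version == e.Version }

// ErrPrecondition plays the role of http's 412 Precondition Failed.
var ErrPrecondition = errors.New("precondition failed")

type file struct {
	meta     Metadata
	contents string
}

// Store is an in-memory stand-in for the server.
type Store struct {
	files map[uint]*file
}

// Put overwrites the file only if where matches the current metadata.
// A nil selector means "unconditional". The current metadata is returned
// either way, so a losing client can see what it raced against.
func (s *Store) Put(id uint, where Selector, contents string) (Metadata, error) {
	f, ok := s.files[id]
	if !ok {
		return Metadata{}, errors.New("not found")
	}
	if where != nil && !where.Match(f.meta) {
		return f.meta, ErrPrecondition
	}
	f.contents = contents
	f.meta.Version++
	return f.meta, nil
}

func main() {
	s := &Store{files: map[uint]*file{
		123: &file{meta: Metadata{Id: 123, Version: 1}},
	}}
	// Two clients both read version 1, then both try to write.
	m, err := s.Put(123, Equal{Version: 1}, "first")
	fmt.Println(m.Version, err)
	_, err = s.Put(123, Equal{Version: 1}, "second")
	fmt.Println(errors.Is(err, ErrPrecondition))
}
```

the second writer loses instead of silently clobbering the first, which is the same job an etag and If-Match do in http.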

it is time to ask ourselves a deeper question. are we re-designing webdav? kinda. webdav has xml, webdav doesn't have inodes. despite making a grotesque monster of 9p, we're not that cruel and heartless.

and we should also admit, we're not exactly reimplementing http. we're doing an http-like protocol, but with stronger constraints around the format of requests and responses. that, and the server resolves links.

well, kinda. forgive me another tangent.

in a more traditional http crud protocol, clients discover urls through some out-of-band mechanism (a discovery api), or they download directory listings and follow the ids or urls inside of them. if a client finds a relative url, the client is expected to resolve it to an absolute one. here, well, we're doing the opposite. 9p too.

9p's Walk operation saves clients from processing directories to navigate the filesystem, and plugging it into Get and Post means that clients no longer have to resolve relative addresses.

it might not seem like the biggest difference, but it is worth noting that we're moving yet another thing out of the client's hands and into an opaque token, much like we changed ReadDir earlier.

ok, so we're not reimplementing http, but we are copying from the notebook.

let's take a look at the final horror


type Address struct {
        Id uint
        Path []string
        Query Selector // some ast for x = y & z != w operations
}

type Entry struct {
        Id uint
        Name string
}

type Service interface {
        Root() (Address, error)
        Stat(Address) (Address, *Metadata, error)
        List(Address) (Address, []Entry, *Address, error)
        Get(Address) (Address, *Metadata, any, error)
        Put(Address, *Metadata, any) (Address, *Metadata, error)
        Delete(Address) (Address, error)
        Post(Address, any) (Address, any, error)
        Copy(Address, Address) (Address, error)
        Move(Address, Address) (Address, error)
}

let's review some of the neat things:

  • There are no extra arguments to List. It just takes an address, and returns an address to use next.
  • Similarly, Copy and Move don't need string arguments, they pass in a relative address, an inode and path pair.
  • We got rid of Walk, every method takes a relative path. You want an id? use Stat.
  • Every method returns an Address, resolved for the client to use.
  • Aside, i like the idea of separate words for "update, overwrite, patch, or create" rather than one "Put" fits all, but that's just a minor issue.

it isn't just adding inodes in the headers of http responses, or having a standard format for directory listings, the whole protocol has been changed around inodes, metadata, and files. it's still very http like though, and trivial to map to http as a transport.
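as a rough idea of how small the http mapping could be, here's a go sketch that turns an Address into a url path and back. the /inode/ prefix is entirely made up, and the Query field is left out to keep it short; this is one possible layout, not a proposal.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
	"strings"
)

// Address mirrors the post's struct, minus Query.
type Address struct {
	Id   uint
	Path []string
}

// prefix is a hypothetical url namespace for inode-relative addresses.
const prefix = "/inode/"

// URL renders an Address as a path like /inode/12/home/tef,
// escaping each segment so names may contain spaces or slashes.
func (a Address) URL() string {
	var b strings.Builder
	b.WriteString(prefix)
	b.WriteString(strconv.FormatUint(uint64(a.Id), 10))
	for _, seg := range a.Path {
		b.WriteString("/")
		b.WriteString(url.PathEscape(seg))
	}
	return b.String()
}

// ParseAddress is the inverse of URL.
func ParseAddress(p string) (Address, error) {
	if !strings.HasPrefix(p, prefix) {
		return Address{}, fmt.Errorf("not an inode url: %q", p)
	}
	segs := strings.Split(strings.TrimPrefix(p, prefix), "/")
	id, err := strconv.ParseUint(segs[0], 10, 0)
	if err != nil {
		return Address{}, fmt.Errorf("bad inode id: %w", err)
	}
	addr := Address{Id: uint(id)}
	for _, seg := range segs[1:] {
		if seg == "" {
			continue // ignore trailing or doubled slashes
		}
		name, err := url.PathUnescape(seg)
		if err != nil {
			return Address{}, err
		}
		addr.Path = append(addr.Path, name)
	}
	return addr, nil
}

func main() {
	a := Address{Id: 12, Path: []string{"home", "my file.txt"}}
	fmt.Println(a.URL())
	back, err := ParseAddress(a.URL())
	fmt.Println(back.Id, back.Path, err)
}
```

every response carrying a resolved Address then amounts to the server handing back a canonical url for the thing the client just touched.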

hopefully it makes a great fit for a kubernetes-esque store.

  • services can expose a filesystem
  • clients can browse the filesystem, upload and download files
  • clients can call on special files to perform server side actions
  • inodes allow for easy client-side caching, directory entries allow for pruning deleted files
  • replication is possible, but there's still so much more work to be done.

not in this post though, that's the end of the thought experiment. personally, i'm still not entirely convinced that a 9p webdav cut and shut needs to exist, but it doesn't feel that bad overall.

maybe

