boredzo

Also @boredzo@mastodon.social.

Breaker of binaries. Sweary but friendly. See also @TheMatrixDotGIF and @boredzo-kitchen-diary.


posts from @boredzo tagged #software development

also: #coding, #programming

Is there a known hashing algorithm with the following properties?

  • Preferably 32 or 64 bits
  • Does not need to be a cryptographic hash
  • [Added] … but ideally it should uniformly fill out the whole 32/64 bits for any input or length of input
  • Can combine the hashes of multiple messages to get the same hash as the concatenated messages

By that last one, I mean: Let's say that I have a message divided into chunks A, B, C, …. I want to hash each chunk, hash(A), hash(B), etc., and be able to combine the hashes in such a way that hash(A) 🤝 hash(B) 🤝 hash(C) … = hash(A + B + C …).

(It's OK for this to be order-dependent. In fact, it would be preferable for the order of the chunks to change the resulting hash.)

One possible (maybe the only possible) mechanism for this would be that the hash function has no state besides the current value of the hash.

Edit: Changed the operator between the hashes to something wholly fictional to clarify that I'm not proposing concatenating the hash values. Hashes are fixed-length values; it never makes sense to concatenate them.
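
One family that has the combining property is the polynomial (Rabin-Karp-style) hash. The following is a minimal sketch, not a claim that it is the answer to the question above: `MOD`, `BASE`, `poly_hash`, and `combine` are all illustrative names and values chosen for this example. Two caveats: the combine step needs the length of the right-hand chunk, so each "hash" is really a (value, length) pair, and the value fills only 61 bits rather than a full 64. CRCs have the same property; zlib's C API exposes it as `crc32_combine`.

```python
from typing import Tuple

# Assumed parameters for this sketch: a Mersenne prime modulus and an
# arbitrary multiplier. Neither comes from the original post.
MOD = (1 << 61) - 1
BASE = 1_000_003

Digest = Tuple[int, int]  # (hash value, length in bytes)


def poly_hash(data: bytes) -> Digest:
    """Hash a chunk; the only running state is the hash value itself."""
    h = 0
    for byte in data:
        # The +1 keeps zero bytes from vanishing into the hash.
        h = (h * BASE + byte + 1) % MOD
    return h, len(data)


def combine(left: Digest, right: Digest) -> Digest:
    """Return the digest of (left message + right message).

    Shifts the left hash past the right chunk by multiplying with
    BASE ** len(right), then adds the right hash. Order-dependent.
    """
    left_hash, left_len = left
    right_hash, right_len = right
    shifted = left_hash * pow(BASE, right_len, MOD)
    return (shifted + right_hash) % MOD, left_len + right_len


a, b, c = poly_hash(b"abc"), poly_hash(b"def"), poly_hash(b"ghi")
whole = poly_hash(b"abcdefghi")
assert combine(combine(a, b), c) == whole
assert combine(a, combine(b, c)) == whole  # combining is associative
assert combine(a, b) != combine(b, a)      # but not commutative
```

Because the combine step is associative, the chunks can be hashed independently (even in parallel) and merged in any grouping, as long as their order is preserved.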



Inspired by a real situation that isn't interesting. Suffice it to say that I'm both the author and the user of the program, and the errors represent bugs I need to fix. I'm interested in every error, but there may be a lot of them.

Some precepts:

  • An error can be more than one line. Do not conflate “errors” with “lines of error output”. By “error”, I mean “one discrete thing that went wrong”, regardless of how many lines are used to display it.
  • I'm primarily thinking of a command-line program (because that's what I'm writing) and of communicating the breadth of errors without spamming the terminal.
  • The errors in question may be heterogeneous, so the first n are not necessarily representative of the full set of them.
  • The errors are independent; no error is the cause of another error (though they may be caused by the same underlying fault).
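
As an illustration only, here is one way a command-line program might respect these precepts: group errors by kind, then print a count and a sample for each kind, so that breadth is visible without printing everything. This is a sketch under assumptions; the `summarize` function, the (kind, message) representation, and the output format are hypothetical and do not come from the program mentioned above.

```python
from collections import defaultdict


def summarize(errors, examples_per_kind=1):
    """Summarize (kind, message) pairs; a message may span several lines.

    Each kind gets a count plus up to `examples_per_kind` full messages,
    so one error stays one unit no matter how many lines it occupies.
    """
    groups = defaultdict(list)
    for kind, message in errors:
        groups[kind].append(message)

    lines = []
    for kind, messages in groups.items():
        lines.append(f"{kind}: {len(messages)} error(s)")
        for message in messages[:examples_per_kind]:
            for text_line in message.splitlines():
                lines.append(f"    {text_line}")
        hidden = len(messages) - examples_per_kind
        if hidden > 0:
            lines.append(f"    ... and {hidden} more")
    return "\n".join(lines)


# Hypothetical sample data for the sketch.
sample = [
    ("parse", "bad token\n  at line 3"),
    ("io", "missing file"),
    ("parse", "unexpected EOF"),
]
print(summarize(sample))
```

Grouping by kind addresses the heterogeneity precept directly: every kind of error is shown at least once, rather than only the first n errors in the order they happened to occur.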



Having previously argued that programming is literally writing, I think there's something to this idea (even if I do find minimaps helpful sometimes).

What would a table of contents look like for program code? While we're at it, how about an index?

Glossaries also present some attractive possibilities. Think of every codebase you've ever encountered where you were like “OK but what does a _____ actually do? What does it mean to ______ a ______?”.

Figures and tables are already in use in some limited ways. How can we expand this? (Figures presumably would mean images embedded in comments. Does anything support that yet?)

If programming is writing for humans first and the computer second, which it is, then what tools do we use, and what facilities do we provide, when writing for humans only that we ought to bring into programming?


 