
ckolderup
@ckolderup

[now mirrored on my actual blog: https://motd.co/2023/05/archive-stumbler/]

I like to browse the Internet Archive late at night. A lot of the internet is really boring now, and the Archive is super refreshing by comparison, but it lacks a key discovery feature: serendipity.

[image: the archive.org collection listing sort bar, featuring the ability to sort by: views, title, date reviewed, creator]

These are the only options you have for sorting a collection when viewing it, so if you're faced with, say, the full archive of Western Hills Access Television because you want to know what was going on in Western Maine 1-4 years ago, the only way to embrace any kind of serendipity is to kinda scroll for a bit while unfocusing your eyes and then click somewhere on the page. Very unsatisfying!

Luckily the Archive can return data on a few of its main endpoints in JSON format, so you can chain a few HTTP calls together and have a computer use a PRNG to pick for you. So I did!
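For anyone curious what "chain a few HTTP calls together" could look like, here is a minimal sketch in Python. It is not Archive Stumbler's actual code. It assumes the public `advancedsearch.php` endpoint with a `collection:` query, and the function names (`collection_id`, `fetch_json`, `random_item_url`) are made up for illustration. The 10,000 cap is also an assumption about how deep that endpoint will page.

```python
import json
import random
import urllib.parse
import urllib.request

# Assumption: the public search endpoint, not necessarily what the tool uses.
SEARCH_URL = "https://archive.org/advancedsearch.php"
MAX_PAGE = 10_000  # assumed deep-paging limit of the search endpoint


def collection_id(url):
    """Pull the identifier out of an archive.org/details/<id> URL."""
    parts = urllib.parse.urlparse(url).path.strip("/").split("/")
    if len(parts) < 2 or parts[0] != "details" or not parts[1]:
        raise ValueError(f"not a collection URL: {url}")
    return parts[1]


def fetch_json(params):
    """Perform one search request and decode the JSON response."""
    query = urllib.parse.urlencode(params)
    with urllib.request.urlopen(f"{SEARCH_URL}?{query}", timeout=30) as resp:
        return json.load(resp)


def random_item_url(collection_url, fetch=fetch_json, rng=random):
    """Return the URL of one randomly chosen item in the collection."""
    cid = collection_id(collection_url)
    base = {"q": f"collection:{cid}", "fl[]": "identifier",
            "rows": 1, "output": "json"}
    # Call 1: learn how many items the collection holds.
    total = fetch(base)["response"]["numFound"]
    if total == 0:
        raise LookupError(f"no items found in collection {cid}")
    # Call 2: with one row per page, the page number doubles as an item index.
    page = rng.randint(1, min(total, MAX_PAGE))
    docs = fetch({**base, "page": page})["response"]["docs"]
    return f"https://archive.org/details/{docs[0]['identifier']}"
```

Calling `random_item_url("https://archive.org/details/<some-collection>")` makes two requests and hands back a link to one item. The `fetch` and `rng` parameters exist only so the picking logic can be tried without touching the network.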

[image: screenshot of the interface for Archive Stumbler]

Archive Stumbler will accept any Internet Archive collection URL and provide a hyperlink that you can click on (or, if you're settling in for some real crate digging, ctrl-click a bunch to open in new tabs) to send you to a random item in that collection. If you don't know where to start, there's a button you can click that'll pre-fill the field with some sample collections.

Hope you enjoy! It's pretty slapdash, so if you do anything wrong it will probably fall over in bad and confusing ways. EDIT: I think I know what the main failure is and might try to put a workaround in place! In the meantime, if you keep refreshing the tab it'll keep trying again with the same collection.

The one enhancement I'll probably add to it soon is encoding the value of the URL input box into the page URL so that you can make bookmarks/share links that will pre-fill with a specific collection. EDIT: I did do this one! The URL should auto-update any time you have a valid a.o collection URL in the box; you can copy that and share it with someone or make a bookmark/shortcut/etc.
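The bookmark trick boils down to round-tripping the input value through a query parameter. A small sketch of the idea in Python follows; the parameter name `collection`, the page address in the usage note, and both function names are hypothetical, and the real page presumably does this in the browser rather than in Python.

```python
from urllib.parse import parse_qs, urlencode, urlparse, urlunparse


def share_link(page_url, collection_url, param="collection"):
    """Build a shareable page URL that carries the collection URL."""
    parts = urlparse(page_url)
    # urlencode percent-escapes the nested URL so it survives as one value.
    return urlunparse(parts._replace(query=urlencode({param: collection_url})))


def prefill_value(link, param="collection"):
    """Recover the collection URL from a shared link ('' if absent)."""
    values = parse_qs(urlparse(link).query).get(param, [])
    return values[0] if values else ""
```

For example, `share_link("https://example.com/stumbler", "https://archive.org/details/prelinger")` gives a link that `prefill_value` turns back into the original collection URL when the page loads.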



sirocyl
@sirocyl

If you don't know what StumbleUpon is, good. If you don't know what it was: it was a toolbar that you'd install in IE or Firefox, which gave you three major buttons once you signed up in the toolbar: "Stumble!", "I like this", "I did not like this."

What I'd love to see is exactly this model - a mixer of content with a minimal social, democratic component to it - applied to archive dot org specifically. Web Archive as well as the multimedia and reading material.

Below the fold, I get a bit long-winded about what StumbleUpon was, because it is possibly the reason I am Posting here on Cohost now, queer as ever, with the friends I have.


how did it work?

The options were simple:

  • If you found a neat page, "I like this". It'd tack your vote onto the page. If it was a new page, StumbleUpon would add it to a discovery queue, which random people were mixed into, in order to get votes on it.
  • Clicking "Stumble!" would send your browser to a random choice from "hot today/this week/this month" sites with a lot of votes, the discovery queue, or sometimes staff picks and collections (Never Ads, never paid for.)
  • Clicking "I don't like this" also worked anywhere, but only really mattered if someone had voted for the page, or voted for it later on (if you were the first to thumb it down).

There were also some algorithms in place to work against mass-behavioral issues (like the 'new post downvote hell' on Reddit or StackOverflow, where one downvote, if it comes in first, ranks your post straight off the frontpage without anyone seeing it).
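To make the moving parts concrete, here is a toy model of that mixer in Python. It is only a sketch of the behavior described above, not how StumbleUpon was actually built: the class name `StumbleMixer`, the `min_score` threshold, and the even odds between pools are all invented for illustration.

```python
import random


class StumbleMixer:
    """Toy model: likes, dislikes, a discovery queue, and staff picks."""

    def __init__(self, staff_picks=(), rng=None):
        self.likes = {}       # url -> number of "I like this" votes
        self.dislikes = {}    # url -> number of "I don't like this" votes
        self.discovery = []   # newly liked pages waiting for more votes
        self.staff_picks = list(staff_picks)
        self.rng = rng or random.Random()

    def like(self, url):
        # The first like on a page also drops it into the discovery queue.
        if url not in self.likes:
            self.discovery.append(url)
        self.likes[url] = self.likes.get(url, 0) + 1

    def dislike(self, url):
        # Recorded anywhere, but see score() for when it actually counts.
        self.dislikes[url] = self.dislikes.get(url, 0) + 1

    def score(self, url):
        likes = self.likes.get(url, 0)
        # A thumbs-down only matters once somebody has liked the page,
        # so a lone first downvote cannot bury a page on its own.
        return likes - self.dislikes.get(url, 0) if likes else 0

    def hot(self, min_score=2):
        """Pages with enough net votes to count as popular."""
        return [url for url in self.likes if self.score(url) >= min_score]

    def stumble(self):
        """Pick a pool at random, then a page at random from that pool."""
        pools = [p for p in (self.hot(), self.discovery, self.staff_picks) if p]
        if not pools:
            raise LookupError("nothing to stumble onto yet")
        return self.rng.choice(self.rng.choice(pools))
```

A session would be a few `like(...)` and `dislike(...)` calls followed by `stumble()`. The real service weighted things by time window ("hot today/this week/this month") and had review tooling on top, which this sketch leaves out.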

what made it cool?

A lot of what made StumbleUpon cool is a lot of what made reddit or del.icio.us cool - it was a quick and painless way to stick a pin in a random website and say "I was here!" and also "This site is cool! You should try it!"

Except, rather than being a Forum, or a List of bookmarks, it was chaos, distilled.

You would be randomly heaved to a website, any website, from the discovery mixer populated by people's Likes on random websites they visited. Malware was rare - when you clicked "I don't like this!", there was a pop-up asking you to explain why - and if it was because of porn, malware, excessive ads or other things blunting the Stumble experience, staff could review and blacklist the site - but even before then, you likely wouldn't be served it once it got a thumbs-down for a Reason.


where is it now?

I'm not entirely sure how StumbleUpon currently works. The Wikipedia page about it starts with "StumbleUpon was", despite their website still being online (at the time of this writing). It mostly lost its way around 2006, got eaten by SV investors and eBay's rampant 2000s M&A army, and now the URL redirects to something new called "Mix", an "App Website", which is your typical Silicon Valley Sludge. Clicking the ":x:" in the modal, however, boops you back to something called StumbleUpon, but it looks like a very unfinished, skeletal site with an invite-only "login" form. I think they're trying to do something, but either don't feel like it, or fundamentally can't for one reason or another.



in reply to @ckolderup's post:

I don't know that there's a total lack of serendipity in browsing the Internet Archive. The crux is you're not likely to find it in the search bar. To give you an idea of what I mean, I went to check out a Cutie Honey: The Live archive last night, and found out through the related archives at the bottom of the page that the 1973 Cutie Honey is also archived.

I know it's not a glamorous discovery, but I have found shit through this feature I didn't otherwise think I would find.

I mean, sure, serendipity is a broad concept. Some would say that the top item of a sorted-by-title collection being actually of interest to you is serendipity, or that it's great to notice a given uploader and browse all their uploads, only to find that you actually share two niche interests and the OTHER one is a goldmine. And it is! It's why I love digging through that site. I just think more websites that offer listings of large collections of things should embrace the beauty of selecting an index at random.

I like this! I remember way back someone coded a way to randomly access something from everything technically publicly accessible in JSTOR and that was fascinating. I think JSTOR shut that down pretty quickly, if I remember right, due to them being far less cool than IA is.

thanks! I actually asked Jason Scott whether this already existed as some kind of undocumented URL param or API endpoint or something and got a very in-character "make it yourself" from him, so I assume it'll be tolerated at the very least!

this is cool! i'd be interested to see if even more randomness could be injected into the process - maybe by searching for items/collections using random words, or trawling the most recently updated stuff

that'd be a cool way to expand on this for sure-- my use of the search endpoints here is pretty naive, mostly just for collection traversal. At the very least I think I'm going to go through this weekend and add like 100 collections to the autosuggest button to make it a little more useful to people who DON'T already have a bookmarks folder full of a.o links like me!

in reply to @sirocyl's post:
