0xabad1dea
@0xabad1dea

This is a graph of Discord’s algorithmically inferred gender (extracted from “request your data” json; axes are probability and days) for a user whose display name is “Tiffany”, whose bio is “she/her”, whose pfp is a drawing of a girl and whose profile theme color is pink.

Algorithmically inferred gender is worse than useless. Presumably the issue is that she talks about programming, and all the deliberate “I am explicitly telling you I am a girl” signaling in the world can’t convince a computer. I sometimes watch a livecoding streamer whose YouTube stats claim his audience is 99% male even though you can see fem-coded chat participants regularly. Algorithms like this are deleting the women.



in reply to @0xabad1dea's post:

it may not be present in your discord data export to begin with (whether because you revoked consent to use personal data at some point or another, more inscrutable reason) but if you have it, you can find it by searching the json for "predicted_"
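A minimal sketch of that search, assuming the file is too large to open comfortably in an editor and can be scanned as plain text, one line at a time. The function name `find_predicted_lines` is a placeholder for illustration and does not come from the Discord export.

```python
def find_predicted_lines(path, needle="predicted_", limit=5):
    """Return up to `limit` (line_number, snippet) pairs whose text contains `needle`."""
    hits = []
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for line_number, line in enumerate(handle, start=1):
            position = line.find(needle)
            if position == -1:
                continue
            # Keep only a short window around the match so huge lines stay readable.
            start = max(0, position - 40)
            hits.append((line_number, line[start:position + 80].strip()))
            if len(hits) >= limit:
                break
    return hits
```

Printing the returned snippets should be enough to see which keys sit next to the `predicted_` fields in your own export.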

Platform: Most of our users are currently men

Algorithm: That means for any given user the odds of them being a man are pretty high! My job is so easy!

Algorithm: 99% of users are >50% likely male.

Platform: All of our users are men and we should only appeal and advertise to them. Thanks algorithm!
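The failure mode this exchange jokes about can be made concrete with a little arithmetic. The sketch below is an illustration under assumed numbers (a 70% male user base and a hypothetical likelihood ratio for one user's signals), not a description of how Discord or YouTube actually infer gender.

```python
def posterior_male(prior_male, likelihood_ratio):
    """Bayes' rule in odds form.

    likelihood_ratio = P(evidence | male) / P(evidence | female).
    """
    prior_odds = prior_male / (1.0 - prior_male)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


# Assumed: 70% of users are men, and this user's signals are twice as
# likely to come from a woman as from a man (likelihood ratio 0.5).
p = posterior_male(0.7, 0.5)
print(f"{p:.2f}")  # still above 0.5, so a 50% threshold labels her "male"
```

Under these assumptions, evidence has to be quite strong before it outweighs the base rate, so thresholding at 50% can turn a majority into a near-unanimous label.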

ive always always always wondered about this esp when youtubers are like "my audience is predominantly [such and such]" and im like "HOW DO YOU KNOW. YOUTUBE TOLD YOU THAT BUT HOW DO THEY KNOW"

this is honestly fascinating not only because of the obvious bizarre-ness of it but the fact that this is the best argument I have ever seen for gender being a social construct. Bravo!

I finally got my data dump from Discord but the analytics file is about 2.5GB of JSON and I'm having difficulty finding any tools that can actually parse it meaningfully. Do you have any suggestions about how to extract the predicted_gender and associated date information into something that can actually be handled reasonably easily?

jq is capable of streaming the output successfully but I have no idea what the schema is or what query path I should use to pre-digest the data.
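One possible approach, sketched under assumptions: that the analytics file is newline-delimited JSON (one event object per line), which is what would let it be read without loading 2.5GB at once, and that each relevant event carries a top-level key starting with `predicted_` alongside some timestamp field. The default timestamp key `timestamp` is a guess rather than a documented schema, so inspect a few matching lines and adjust `time_keys` to whatever your export actually uses.

```python
import csv
import json


def iter_predictions(lines, prefix="predicted_", time_keys=("timestamp",)):
    """Yield (time_value, key, value) for every top-level key starting with `prefix`."""
    for line in lines:
        # Cheap substring check first, so most lines are never parsed as JSON.
        if prefix not in line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not a complete JSON value
        if not isinstance(event, dict):
            continue
        # Use the first candidate timestamp key that is present, else an empty string.
        time_value = next((event[k] for k in time_keys if k in event), "")
        for key, value in event.items():
            if key.startswith(prefix):
                yield time_value, key, value


def export_predictions(in_path, out_path, **kwargs):
    """Stream `in_path` and write matching rows to a CSV file; return the row count."""
    count = 0
    with open(in_path, "r", encoding="utf-8", errors="replace") as src, \
         open(out_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        writer.writerow(["time", "field", "value"])
        for row in iter_predictions(src, **kwargs):
            writer.writerow(row)
            count += 1
    return count
```

Calling `export_predictions("analytics.json", "predictions.csv")` (both filenames are placeholders) would produce a three-column CSV small enough for a spreadsheet or plotting tool. Only top-level keys are checked, and nested values are written in their Python string form, so if the `predicted_` fields turn out to be nested in your export the loop would need to descend into the event.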