wavebeem
@wavebeem

so i got my discord data export today. there's a massive JSON stream file (mine was 1.7 GB) containing 78 gender prediction events over the last year and a half or so

i wrote a script to parse the JSON file into gender.csv


discord-gender.mjs

import { open, writeFile } from "node:fs/promises";

function compare(a, b) {
  if (a < b) return -1;
  if (a > b) return 1;
  return 0;
}

const filename = process.argv[2];
if (!filename) {
  console.error(`
- Open the folder activity/analytics in your Discord archive.
- You should see at least one JSON file in there.
- Run this script with the JSON filename as its argument.

    node discord-gender.mjs "YOUR_LOG_FILE.json"
`);
  process.exit(1);
}

const rows = [];
const file = await open(filename);
// This file can be quite large, so make sure to use an efficient line-by-line
// reader rather than gobbling the whole thing into memory. My file was 1.7 GB.
for await (const json of file.readLines()) {
  // There are other objects in this stream not about predicting gender. Let's
  // do a quick check without parsing the JSON so we can skip thru the other
  // events faster.
  if (!json.includes("predicted_gender")) {
    continue;
  }
  const obj = JSON.parse(json);
  rows.push([
    // Technically Discord's date strings may not parse correctly since
    // they're not actually ISO 8601 lol, but Node.js seems to parse them
    // fine! Anyways, serialize as ISO 8601 since that's better.
    new Date(obj.day_pt).toISOString(),
    obj.prob_male,
    obj.prob_female,
    obj.prob_non_binary_gender_expansive,
  ]);
}
await file.close();
// Sort by dates since the events aren't in chronological order for some reason.
rows.sort(([dateA], [dateB]) => {
  return compare(dateA, dateB);
});
// Add CSV headers
rows.unshift(["date", "male", "female", "other"]);
// Yeah, I should use a real CSV serializer. No, I'm not going to. Node.js
// doesn't include one, and this data doesn't have "special characters" in it.
const csv = rows.map((row) => row.join(",")).join("\n");
await writeFile("gender.csv", csv, { encoding: "utf-8" });
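The last comment in the script skips a real CSV serializer because this data has no special characters. If you wanted to reuse the script on data that might contain commas or quotes, a minimal sketch of the usual quoting rule could look like the following. The names `csvField` and `toCsv` are hypothetical helpers, not part of the original script, and the sample row is made-up data for illustration only.

```javascript
// Hypothetical helpers, not part of the original script.
// Quote a field only when it contains a comma, a double quote, or a line
// break, and double any embedded quotes (the common CSV convention).
function csvField(value) {
  const text = String(value ?? "");
  if (/[",\r\n]/.test(text)) {
    return `"${text.replaceAll('"', '""')}"`;
  }
  return text;
}

// Serialize an array of rows (arrays of values) into one CSV string.
function toCsv(rows) {
  return rows.map((row) => row.map(csvField).join(",")).join("\n");
}

// Made-up sample data, shaped like the rows the script builds.
const sample = [
  ["date", "male", "female", "other"],
  ["2023-01-01T00:00:00.000Z", 0.25, 0.5, 0.25],
];
console.log(toCsv(sample));
```

In the script above, this would stand in for the `rows.map((row) => row.join(","))` line. For the gender data the output should be identical either way, since none of the fields need quoting.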

in reply to @wavebeem's post:

i assumed this was gearing up for targeted ads, since advertisers want demographics they can target, and they love gender

we're already seeing discord sponsorship ads, though i'm not sure if any have been targeted yet

well, the file is an event log. most entries have a "log type" but the gender one is missing that field. it just contains a bunch of weird gender info!
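If you want to see which kinds of events are in your own log, a rough sketch like this tallies entries by their type field and groups the ones that lack it. Big assumption: the field name `event_type` is a guess, and so is the sample data; check a few lines of your own file and adjust. `tallyByType` is a hypothetical helper name.

```javascript
// ASSUMPTION: "event_type" is a guessed field name, not confirmed by the
// post. Pass a different name as the second argument if your file differs.
function tallyByType(lines, typeField = "event_type") {
  const counts = new Map();
  for (const line of lines) {
    if (!line.trim()) continue; // skip blank lines
    const obj = JSON.parse(line);
    // Entries without the type field get grouped under one label.
    const key = obj[typeField] ?? "(no type field)";
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}

// Made-up sample lines, for illustration only.
const sampleLines = [
  '{"event_type":"session_start"}',
  '{"event_type":"session_start"}',
  '{"predicted_gender":"placeholder","prob_male":0.1}',
];
console.log(tallyByType(sampleLines));
```

For a real export you would feed it lines from `file.readLines()` one at a time instead of an array, since the file can be very large.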

so i came up with that name myself. i bet it's computationally expensive so they don't run it that often. probably has to scan all my messages and shit.

i don't have this because i opted out (yay)

Activity: Contains four folders (Analytics, Modeling, Reporting, Trust & Safety) with information about the actions you've taken on Discord (note: you will not have Analytics or Modeling folders in your data package if you've opted out of those activities)

yeah i mean my existence is neither determined by nor meant to satisfy their disgusting algorithm, but i thought it was fun to look at considering they already did this without my consent

"ummm idK????" is a good answer :p
