catball

Meowdy Pawdner

  • she/they

pictures of my rats: @rats
yiddish folktale bot (currently offline): @Yiddish-Folktales

Seattle area
trans 🏳️‍⚧️ somewhere between 30 and 35


Personal website: catball.dev/
Mastodon (not sure if I'll use this): digipres.club/@cat
Pillowfort (not sure if I'll use this): www.pillowfort.social/catball
Monthly Newsletter (email me to join): newsletter AT computer DOT garden
Monthly Nudesletter (18+ only, email me to join): nudesletter AT computer DOT garden
Rat Pics (placeholder, will update): rats.computer.garden/
Website League main profile: transgender.city/@cat
Website League nudes profile: transgender.city/@hotcat
Website League rat pics: transgender.city/@rats

catball
@catball

but also maybe if syntax pedagogy wasn't so Dirt Bad then people wouldn't be so scared to try modeling language in a way that takes a more supervised approach to relating syntax and semantics
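
(concretely, the kind of thing i mean: a tiny sketch that leans on spaCy's pretrained dependency parser, which is itself supervised syntax trained on annotated treebanks, to pull crude subject-verb-object relations out of a parse as a stand-in for semantics. the model name and the SVO-as-semantics framing are just illustrative)

```python
# minimal sketch: use a supervised syntactic parse to derive shallow
# semantic relations. assumes spaCy and the small English model are
# installed (pip install spacy; python -m spacy download en_core_web_sm).
import spacy

nlp = spacy.load("en_core_web_sm")  # parser trained on annotated treebanks

def svo_triples(text):
    """Extract crude (subject, verb, object) triples from the dependency parse."""
    doc = nlp(text)
    triples = []
    for token in doc:
        if token.pos_ == "VERB":
            subjects = [c for c in token.children if c.dep_ == "nsubj"]
            objects = [c for c in token.children if c.dep_ in ("dobj", "obj")]
            for s in subjects:
                for o in objects:
                    triples.append((s.text, token.lemma_, o.text))
    return triples

print(svo_triples("The annotator labeled the corpus."))
# e.g. [('annotator', 'label', 'corpus')], depending on the model
```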


catball
@catball

I feel like the popularity of current semi/unsupervised approaches to language modeling probably stems more from no one wanting to pay people to annotate data, and from the fact that being an effective annotator takes a lot of skill and training



in reply to @catball's post:

the vibe i have picked up, without digging really deep into any of the literature, is that discriminating between semantic meanings requires so much context that right now nobody is solving it, or even attempting to, by any means other than statistical interpolation. like it's essentially an epistemological problem, and most "ai" researchers would rather fill the internet with sewage and then drown than learn a single new fact about epistemology. is this anything

i wish someone would just go back to the basics and strip all the bullshit off embeddings and try to fashion something usable out of those, they seem like maybe the only genuinely useful component of this whole mess
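
(and honestly the stripped-down version is small: here's a sketch of nearest-neighbor lookup over plain pretrained word vectors, nothing but numpy. the glove filename is a stand-in for whatever vectors you actually have on hand)

```python
# sketch: embeddings with the bullshit stripped off. just a table of word
# vectors plus cosine similarity. assumes a GloVe-style text file where each
# line is "word v1 v2 ..."; the filename below is a stand-in.
import numpy as np

def load_vectors(path):
    words, vecs = [], []
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            words.append(parts[0])
            vecs.append(np.array(parts[1:], dtype=np.float32))
    mat = np.stack(vecs)
    mat /= np.linalg.norm(mat, axis=1, keepdims=True)  # unit-normalize once
    return words, mat

def nearest(word, words, mat, k=5):
    """Top-k cosine neighbors of `word` (dot products of unit vectors)."""
    sims = mat @ mat[words.index(word)]
    return [words[i] for i in np.argsort(-sims)[1 : k + 1]]  # skip the word itself

words, mat = load_vectors("glove.6B.50d.txt")
print(nearest("cat", words, mat))
```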

that's totally valid, machine parsing of useful semantics is extremely hard. and honestly, context in natural language leans on you having a lifetime of experiences and assessing common ground with the other speakers, which seems like something that would be near impossible to capture with great accuracy. even when we speak to each other, we're constantly clarifying and negotiating meaning while we talk

even if you could give a machine enough background that it could establish context with its users and do reasonably well at semantics, you run into the ethical issue of "that machine has a mountain of data about me", and then you start slipping into like, shouldn't we just design UI better so that we can reasonably specify the semantics of a task?

or if you can't specify the semantics of your task to the machine, it starts to feel like you want a machine to do reasoning and decision making for nontrivial things, which is The Scary Zone imo. (this paragraph brought to you by the ghost of joseph weizenbaum)

this is the point where if we were VC funded we would say "damn that's so crazy and interesting i bet we're going to figure it out 100% and solve this problem by making it bigger or something idk i have very little imagination. anyway let's just plow ahead with what we've got and sell this to investors hell yeah bro"

i think the "frontier" of "ai" research right now is disincentivized from even looking into this, this very fundamental problem that sure seems like it cannot possibly be solved by any existing approach. llms cannot and do not process semantics, they can only statistically approximate its leavings; until someone comes up with something fundamentally new that actually has more internality and mind than ELIZA, i don't envision these tools gaining any ability to negotiate meaning like that
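
(for anyone who hasn't met her: ELIZA is a handful of surface rewrite rules in a trench coat. a toy version to underline how little internality it takes to produce something conversation-shaped; these rules are made up for illustration, not weizenbaum's original script)

```python
# toy ELIZA-style responder: pure surface pattern matching, zero semantics.
# these rules are invented for illustration, not Weizenbaum's original script.
import re

RULES = [
    (re.compile(r"\bi feel (.*)", re.I), "why do you feel {0}?"),
    (re.compile(r"\bi am (.*)", re.I),   "how long have you been {0}?"),
    (re.compile(r"\bmy (.*)", re.I),     "tell me more about your {0}."),
]

def respond(utterance):
    for pattern, template in RULES:
        m = pattern.search(utterance)
        if m:
            return template.format(m.group(1))
    return "please go on."  # default when nothing matches

print(respond("i feel like no one understands semantics"))
# -> "why do you feel like no one understands semantics?"
```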

in reply to @catball's post:

After working on this on and off for a decade (as a software engineer and linguist who keeps having to act as SME), I can say the companies I've worked for never want to actually invest the time to annotate, and they always underestimate how much time, effort, and data it requires.

Even people who should probably know better treat it like magic, and don't appreciate that annotating is itself not straightforward.
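
Even the basic question "did two annotators agree" takes real machinery to answer. Here's a quick sketch of Cohen's kappa, the standard chance-corrected agreement statistic, over two hypothetical annotators' labels:

```python
# sketch: Cohen's kappa for two annotators -- raw agreement corrected for
# the agreement you'd expect by chance. the labels below are hypothetical.
from collections import Counter

def cohens_kappa(a, b):
    assert len(a) == len(b)
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    # expected agreement: probability both pick the same label by chance
    ca, cb = Counter(a), Counter(b)
    expected = sum(ca[label] / n * cb[label] / n for label in set(a) | set(b))
    return (observed - expected) / (1 - expected)

ann1 = ["POS", "POS", "NEG", "NEU", "POS", "NEG"]
ann2 = ["POS", "NEG", "NEG", "NEU", "POS", "POS"]
print(round(cohens_kappa(ann1, ann2), 3))  # -> 0.455: "moderate" at best
```

Note that these two annotators agree on 4 of 6 items (67%), yet kappa is only about 0.455 once chance agreement is factored out, which is exactly the kind of gap that surprises people who treat annotation like magic.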