ketsuban
@ketsuban

I said in a comment that everyone makes the same mistakes over and over again when it comes to tagging rather than learning from sites like Danbooru, and I feel energised to talk about it in a post which, in the name of irony and self-demonstration, I will not be tagging.

I strongly believe that most websites which implement tags do it wrong, and that the result of doing it wrong is always a tagging system which is at best unhelpful and at worst useless. Here's an unordered list of some ways websites do it wrong and the consequences.

  • Tagging images is the exclusive domain of the person who uploads the image. This tends to come up with websites that think of themselves as a repository of creations, and I get where artists are coming from when they argue that you are the only person empowered to provide an accurate summary of your own work. The problem is that tags are not exclusively for you, they're for everyone. If you are the only person tagging your work, the best case scenario is your tags aren't meaningful outside of your work. More often than not, though, you get people treating the tags as just another input field, which pollutes the namespace. (Tumblr is so bad at this that it has reified "the tags" on a post as a whole parallel comments section, which is straight-up an admission of failure. Cohost's tags working in precisely the same way does not give me confidence.)
  • Tags are immutable once created, and can't be renamed, merged into a canonical name, etc. As soon as your tag cloud has big_boobs and big_breasts as separate but overlapping categories, you've created busywork for the prospective user who has to not just know what they want to find but also predict synonyms people might use and perform multiple searches to find everything. More likely, they won't do that and miss up to half the potential results that interest them.
  • Lack of metadata. People have already commented on how searching #latex on Cohost gets you two very different categories of post in one search result; in a functional tag ecosystem like Danbooru those would be something like latex (clothing) and latex (typesetting), and the social norm (aided by tag-completion functionality which suggests expansions as you type a tag) would be to use what you actually mean rather than assuming that clearly the only thing anyone could ever mean by latex is LaTeX.
  • No implications. Implications let a relatively small number of tags added by humans turn into a much larger group of tags on an actual work. Without them you get a scenario where someone who dutifully tags every tardigrade they post never gets their work seen despite protostome being very popular with people who would be very happy to see tardigrades.
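
To make the last three points concrete, here is a minimal Python sketch of aliases, qualified tag names with completion, and implications. Everything in it is an assumption for illustration: the table contents, the `ecdysozoan` intermediate tag, and the function names `canonical`, `suggest` and `expand` are hypothetical, and this is not a description of how Danbooru or any other site implements these features.

```python
# Hypothetical tag data for illustration; not taken from any real site.
ALIASES = {"big_boobs": "big_breasts"}  # synonym -> canonical name
IMPLICATIONS = {  # tag -> tags it implies
    "tardigrade": {"ecdysozoan"},
    "ecdysozoan": {"protostome"},
}
KNOWN_TAGS = ["latex (clothing)", "latex (typesetting)", "tardigrade"]


def canonical(tag):
    """Follow alias links until reaching a canonical tag name."""
    seen = set()
    while tag in ALIASES and tag not in seen:  # guard against alias cycles
        seen.add(tag)
        tag = ALIASES[tag]
    return tag


def suggest(prefix):
    """Tag completion: list the known tags that start with the typed prefix."""
    return sorted(t for t in KNOWN_TAGS if t.startswith(prefix))


def expand(tags):
    """Canonicalize tags, then add everything they transitively imply."""
    result = set()
    stack = [canonical(t) for t in tags]
    while stack:
        tag = stack.pop()
        if tag in result:
            continue
        result.add(tag)
        stack.extend(canonical(t) for t in IMPLICATIONS.get(tag, ()))
    return result
```

Under these assumptions, `expand(["tardigrade"])` also yields `protostome`, so a search for the broad tag finds the specific post, and typing `latex` into `suggest` offers both qualified names instead of one ambiguous tag.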

I can't think of any others right now but so many websites go "let's use tags to help discoverability!", decide it's so easy they don't need to do any research and promptly tie their shoelaces together and fall flat on their face.


atomicthumbs
@atomicthumbs

i've been saying this for years: tags have never been a good way to describe, organize, and search for images. what is needed for booru-style sites, and (in a perfect world) every site where art is shared, is a formal subject-verb-object or subject-verb-object-context ontology with, respectively, semantic triples or quads, backed by an appropriate triplestore or graph database, for every characteristic of the art piece worth documenting.
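
A rough sketch of what subject-verb-object triples buy over flat tags, written as a toy in-memory store in Python. This is an assumption-laden illustration, not a real triplestore: the `TripleStore` class, the `who` helper, the `is_a` and `wears` verbs, and the `img1#a`-style entity names are all hypothetical.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    verb: str
    obj: str


class TripleStore:
    """Toy in-memory store; a real site would use a graph database."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, verb, obj):
        self._triples.add(Triple(subject, verb, obj))

    def match(self, subject=None, verb=None, obj=None):
        """Return triples matching the pattern; None acts as a wildcard."""
        return sorted(
            t for t in self._triples
            if (subject is None or t.subject == subject)
            and (verb is None or t.verb == verb)
            and (obj is None or t.obj == obj)
        )


def who(store, kind, verb, obj):
    """Entities of the given kind for which (entity, verb, obj) holds."""
    entities = {t.subject for t in store.match(verb="is_a", obj=kind)}
    return sorted(
        t.subject for t in store.match(verb=verb, obj=obj)
        if t.subject in entities
    )


store = TripleStore()
# Two images, each with a fox and a cat; only one of them has the fox in the hat.
store.add("img1#a", "is_a", "fox")
store.add("img1#a", "wears", "hat")
store.add("img1#b", "is_a", "cat")
store.add("img2#a", "is_a", "fox")
store.add("img2#b", "is_a", "cat")
store.add("img2#b", "wears", "hat")

fox_in_hat = who(store, "fox", "wears", "hat")  # only the entity from img1
```

With flat tags both images would carry `fox`, `cat` and `hat`, so a tag search cannot tell them apart; the triple query can, because the hat is attached to a specific subject.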

it's been done incorrectly for years. tags are an ugly hack that's been passed down solely because "well everyone did it before." there's no formal documentation and moderators have to scramble to keep up, and add even worse hacks like tag categories and tags implying other tags in an attempt to add additional metadata search abilities. This system was designed by amateurs and only presents an increasing burden as time goes on, requiring massive work investment by developers, moderators and taggers, for a system whose precision and abilities are inherently and severely limited by the single bit of information a tag association can convey about an image.

fuck tags! do it correctly! use triples! standardized and capable ways to define and search for this information exist! this shit is what the Semantic Web was built for!

caveat: you may have to have someone with the Correct Type of Neurodivergency to build and maintain your tag specification. this is not necessarily a drawback and many of those folks are overloaded maintaining the existing tags already. it becomes easier when you have a formal way to define important information


in reply to @ketsuban's post:

it's especially bad for fandom media where a show has multiple well-known abbreviations and shortenings. Star Trek: Deep Space Nine does not need a dozen different canonical tags but that's where we are since there's every iteration of abbreviation and also swapping "Nine" for 9 and also leaving Star Trek out entirely. there shouldn't have to be a dozen tags on one shitpost just to categorize it

That's easily solved, and anyway the current setup does nothing to prevent me reblogging this post with my own set of tags appended. You just don't see it done all that frequently because that's terrible for anything but malicious applications

it actually does in the sense that reblogging with tags is useless and does nothing. tags on a share do literally nothing unless they were already in the op's tag set.

for reasons you have already identified, mind. but it means that the behavior is impossible, not just discouraged.

having seen the absolute destruction of tags on e621 in particular, i cant help but be thoroughly opposed to tag merging except under extremely specific circumstances where there's exactly 0 chance of overlapping 2 different concepts like in the above example, though huge_x should not be the same as big_x.

'milking' for instance can refer to different things, but the tag has been merged making it extremely annoying to filter out the ones you dont want vs the ones you do.

'enf' is a concept you literally cannot search for, having been redirected to an on-paper-related, but actually not all that related tag, and combining the words it stands for as tags does not result in the correct results.

these are the two that first came to mind but there are many other such examples.

specificity has been destroyed in the name of streamlining and minimizing the aforementioned overhead of similar terms. it is not a price worth paying.

ah, yep, for sure. tag wrangling wasn't something i thought about until i learned AO3 does it, but after that, i couldn't stop thinking "ah, this tag-based site doesn't do tag wrangling" whenever i saw a new site with tags. i mean, it's understandable, since AO3 has paid staff to do that, but yeah.

i also don't think danbooru's model would work 1:1 in other places, but you probably already knew that, and having been using the site for some 10 years by now, yeah i can say that it would DEFINITELY be better than what most other tag-based sites have going on lol. ah, to not have to put like 3 different tags in my armored core 6 shitpost, or to be able to find pictures on pixiv as easily as i can on danbooru.

I wanted to make sure you saw that the recent Cohost financials update mentioned, "...tagging is currently a wild west in general, and we believe that some sort of tag grouping or synonym system (which is on our next-six-months roadmap)..."

in reply to @atomicthumbs's post: