May. 29th, 2022

doranwen: female nerds, rare and precious (Default)
At last, the database of Yahoo Groups metadata is ready for tagging to begin in earnest - and to be done properly this time, unlike last summer's attempt (I have kept the info from that attempt, so the effort was not in vain).

Big pluses this time:

- Tabs are organized by one or more cat_id numbers, each of which is unique to the combination of category path, category id, and category name. This means that, on the whole, the groups on a tab should be roughly similar in theme/content, and in some cases, tabs can be limited to a specific fandom (generally super popular fandoms of the early 2000s such as LOTR, HP, Buffy, Sailor Moon, or Backstreet Boys).

- Tabs are separated by language, so the average volunteer doesn't have to deal with identifying languages at all, nor do they have to tag groups in a language other than English if they don't want to. (Of course, that means we are definitely eager for volunteers who ARE able and willing to tag groups in other languages!)

- Nonfandom tagging is done alongside the fandom tagging by the same volunteer, so there's no second pass that has to be done later. (There's a preset list of nonfandom categories to copy/paste from, to make it simple for taggers.)

- Each tab only needs to be looked at twice (once for initial tagging, once for checking). Checkers have a slightly different set of tasks, and should be able to breeze through most tabs more quickly than the initial tagger did.


If you're interested in helping, here are the relevant links:

Tagging guidelines: https://docs.google.com/document/d/1AWFSmXLH-KsVU7N1EGkmbrLyv1N_fYoRlWLCEWLxtX4/edit?usp=sharing
Category list with cat_id and groups count: https://mega.nz/file/EZ9xCY4b#N8l9_LTJ-mV4KsMRO0DT4J--ws1fackYRcLZHPJ370o
Discord server (Save Yahoo Groups): https://discord.gg/fqsNqdpF7r

You do NOT have to be on the server, if you don't want to do Discord. It is helpful, as I update the last processed cat_id so you know which ones are already on tabs and can be requested, but it's not necessary.


Especially needed:

- anyone who can read a language other than English (the first 1000+ cat_ids are all Italian category paths, for instance, and there are tons of groups later on in Spanish, French, German, Portuguese, Chinese, Turkish, Indonesian, Arabic, some Hebrew - and we will particularly need to find someone who can read Persian written with the Latin alphabet)

- anyone with specialist knowledge in particular areas, whether it's a specific fandom or general area of fandom you know well, or a nonfandom area like computers, biology, or various cultures

- anyone willing to download and import mbox files in order to identify language and/or fandom/category for groups on the "unknown" tabs (groups where the metadata is not sufficient for tagging); we've got a visual tutorial for a lightweight free software program, so it's not hard!
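
For anyone who would rather peek at an mbox file from a script than import it into a mail program, here is a minimal sketch using Python's standard `mailbox` module. It is not the tutorial's method, just an alternative under stated assumptions: the function name `sample_subjects` and the `limit` parameter are made up for illustration, and the file is assumed to be an ordinary mbox. It only pulls a handful of subject lines, which is often enough to guess a group's language or topic.

```python
import mailbox
from email.header import decode_header, make_header


def sample_subjects(mbox_path, limit=20):
    """Return up to `limit` decoded Subject lines from an mbox file.

    `sample_subjects` and `limit` are hypothetical names for this sketch.
    """
    subjects = []
    # create=False: raise an error instead of silently creating an empty file
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for message in box:
            raw = message.get("Subject")
            if raw is None:
                continue  # skip messages without a subject header
            try:
                # Decode RFC 2047 encoded words such as =?utf-8?q?...?=
                subject = str(make_header(decode_header(raw)))
            except (LookupError, UnicodeDecodeError, ValueError):
                # Unknown or broken encoding: fall back to the raw header
                subject = str(raw)
            subjects.append(subject)
            if len(subjects) >= limit:
                break
    finally:
        box.close()
    return subjects
```

Calling `sample_subjects("somegroup.mbox")` (the file name is a placeholder) returns a list of strings you can skim or paste to someone who reads the language.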
