Commons:Village pump/Proposals
This page is used for proposals relating to the operations, technical issues, and policies of Wikimedia Commons; it is distinguished from the main Village pump, which handles community-wide discussion of all kinds. The page may also be used to advertise significant discussions taking place elsewhere, such as on the talk page of a Commons policy. Recent sections with no replies for 30 days and sections tagged with {{Section resolved|1=--~~~~}} may be archived; for old discussions, see the archives; the latest archive is Commons:Village pump/Proposals/Archive/2023/10.
- One of Wikimedia Commons’ basic principles is: "Only free content is allowed." Please do not ask why unfree material is not allowed on Wikimedia Commons or suggest that allowing it would be a good thing.
- Have you read the FAQ?
SpBot archives all sections tagged with {{Section resolved|1=~~~~}} after 5 days and sections whose most recent comment is older than 30 days.
Shutdown of Computer-aided tagging: Mass revert?
After the WMF team evaluated the quality of edits made through the Computer-aided tagging tool they decided to shut it down.
This also raises the question of whether we want to revert all edits made through the tool, which would affect one and a half million edits. We could exempt edits made by users with autopatrol rights from the revert to reduce the number of potentially good edits lost. GPSLeo (talk) 12:49, 16 September 2023 (UTC)
- I come across these mistakes very frequently, and the bot tags are completely inaccurate. When I look at the file's history, no one but the bot has edited it. What the solution is, I do not know, but my belief is that Commons has been massively harmed by bot tagging. Krok6kola (talk) 15:25, 16 September 2023 (UTC)
- The WMF classed 73.4% of such values as "bad". Absent an alternative proposal, I think this is inevitable. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:29, 16 September 2023 (UTC)
- If we mass revert, then the bot should leave an edit summary that encourages anyone watching the file to check to see if what it has reverted should be restored. After all, 26.6% of 1.5 million is not small. If they are right in their count, we would be having a bot revert about 400,000 good edits to get rid of 1.1 million bad ones. (BTW, I think the numbers are a bit misleading, because thousands of these edits were things like two people edit warring over the depicts on a file.) - Jmabel ! talk 17:14, 16 September 2023 (UTC)
- Are you referring to the ISA tool disaster? Those edits are not marked as made with Computer-aided tagging. We should only include edits with the "computer-aided-tagging" tag; edits with the "computer-aided-tagging-manual" tag should also be excluded. GPSLeo (talk) 17:22, 16 September 2023 (UTC)
- I doubt that any such edit wars were tagged as being by the Computer-aided tagging tool, so they won't be included in the figure given. Do you have any examples to the contrary? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:53, 23 September 2023 (UTC)
- Support bulk revert. Up to a simple bulk deletion of everything, if we have no better way to separate out the trash. Yes, it's that bad. Andy Dingley (talk) 15:01, 23 September 2023 (UTC)
- Support I don't see any other way. Of course, the bot reverting the tags should leave a proper entry in the history. --Robert Flogaus-Faust (talk) 15:08, 23 September 2023 (UTC)
- Comment - It would be worth saving the added values to a file or similar before bulk reverting them, in case somebody would like to try to filter out the useful ones (using machine vision, for example). I think something like open_clip could work for finding useful tags, and I could do a practical test of the idea in October. --Zache (talk) 18:28, 23 September 2023 (UTC)
- Support bulk revert. — 🇺🇦Jeff G. ツ please ping or talk to me🇺🇦 13:49, 24 September 2023 (UTC)
- Comment Has anyone asked the tech team to share the list of what they determined to be "good" edits, so we can assess whether there would be a fair amount worth keeping? But I wouldn't object to just deleting it all. One ham-handed mass edit deserves another.
- Edit summary should make clear that this is "without prejudice" and if you think the item was correct you should feel free to re-add. - Jmabel ! talk 15:58, 24 September 2023 (UTC)
- The criteria are detailed in the linked Phabricator ticket. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:15, 24 September 2023 (UTC)
- Support (with an appropriate edit summary that encourages people to re-revert bad reverts) El Grafo (talk) 14:43, 2 October 2023 (UTC)
[restored from archive] This needs to be properly closed - do we have consensus? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:45, 2 November 2023 (UTC)
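For reference, the figures quoted in this thread (roughly 1.5 million tool edits, of which the WMF classed 73.4% as "bad") can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the total is the rounded number used in the discussion, not an exact count, and the function name `split_edits` is made up for illustration.

```python
def split_edits(total_edits: int, bad_share_percent: float) -> tuple[int, int]:
    """Return (bad, good) edit counts for a total and a 'bad' percentage."""
    bad = round(total_edits * bad_share_percent / 100)
    good = total_edits - bad
    return bad, good


# Figures quoted in the discussion; the total is a rounded estimate.
bad, good = split_edits(1_500_000, 73.4)
print(f"bad: {bad:,}  good: {good:,}")
```

Under these assumptions the result is consistent with the "about 400,000 good edits" and "1.1 million bad ones" mentioned above.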
Increase of file size limit on Commons for future-proof purposes
Hey folks!
The current file size limit is 4 GiB (approx. 4.3 Gigabytes), see COM:MAXSIZE. I want to propose an increased file size limit. The limit was last increased in April 2016, from 2 to 4 GiB.
Since then, the sizes of files increased over time due to larger video resolutions.
I want to give some examples when files exceed the 4 GiB threshold:
- 4K YouTube videos after 25-35 minutes
- FHD DSLR/DSLM videos after 8-15 minutes
- 4K DSLR/DSLM videos after 2.5-8 minutes
- 8K DSLR/DSLM videos after 1.25-4 minutes
Videos, for example, exceed the size limit of 4 GiB quite fast, but high-resolution scans of 3D objects from organizations like the Smithsonian Institution may also be larger than the limit (and splitting such files is very problematic). I have a large aerial image of Munich that is also too large right now, but offers many details. Over time, more and more files will come into conflict with this limit, as cameras etc. become more capable. I would like to propose an increase to 32 or 64 GiB if possible. When colored meshes become available on Commons, a higher file size limit would also be very much appreciated.
What do you think?
Greetings and thank you a lot, --PantheraLeo1359531 😺 (talk) 17:17, 9 October 2023 (UTC)
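To put the numbers in this post into perspective, the following Python sketch converts the 4 GiB limit to decimal gigabytes and derives the average bitrate at which a recording would fill the limit in the quoted time. It is a rough estimate only: it assumes a constant average bitrate and ignores container overhead, the durations are simply the ranges listed above, and the helper names are hypothetical.

```python
GIB = 2**30  # bytes in one gibibyte


def limit_in_gb(limit_gib: float) -> float:
    """Convert a size in GiB to decimal gigabytes (10**9 bytes)."""
    return limit_gib * GIB / 1e9


def implied_bitrate_mbps(limit_gib: float, minutes: float) -> float:
    """Average bitrate (Mbit/s) at which a file reaches the limit in `minutes`."""
    bits = limit_gib * GIB * 8
    return bits / (minutes * 60) / 1e6


# Duration ranges (minutes) taken from the list in the post above.
examples = {
    "4K YouTube": (25, 35),
    "FHD DSLR/DSLM": (8, 15),
    "4K DSLR/DSLM": (2.5, 8),
    "8K DSLR/DSLM": (1.25, 4),
}

print(f"4 GiB = {limit_in_gb(4):.2f} GB")
for label, (shortest, longest) in examples.items():
    high = implied_bitrate_mbps(4, shortest)  # shorter time means higher bitrate
    low = implied_bitrate_mbps(4, longest)
    print(f"{label}: roughly {low:.0f}-{high:.0f} Mbit/s")
```

The same functions can be called with 5, 32 or 64 in place of 4 to see how much recording time a higher limit would allow at a given bitrate.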
- This has already been requested multiple times, but till now the WMF team did not work on a solution for the current technical limitations. GPSLeo (talk) 18:19, 9 October 2023 (UTC)
- Thank you for mentioning this, I hope this issue will be addressed soon :) --PantheraLeo1359531 😺 (talk) 20:21, 9 October 2023 (UTC)
- Comment It was recently made possible to upload files up to 5 GB. I don't know when this will be live. See phab:T191805. Yann (talk) 17:29, 2 November 2023 (UTC)
- While the file size limit is likely to increase a bit in the short term (a year?) to 5GB (as mentioned by Yann), a further rise is very unlikely to happen any time soon. Reasons:
- 1. It costs a TON of money. Big file handling is expensive. Much more expensive than text. Not just in storage (originals, backups), but also in network cost (all files have to be moved over the internal network), hardware costs (plain server capacity). There is currently 440TB of originals, 22TB of original video. And then a not exactly known amount of derivatives of those originals (thumbnails and smaller versions of the big videos). In many ways, video is already creating a 'disproportionate' impact, compared to how much it is used by the users.
- 2. Anything over 5MB basically cannot be sent to a browser. Therefore you need to thumbnail, postprocess etc. So CPU, and yet more data storage to store the results of that. Specifically, in the above example you are speaking about things like 8K, which not even YouTube is doing (publicly). And that is for a good reason. Handling such big files essentially requires purpose-designed hardware to do the decoding and encoding (GPUs, or, as YouTube does, designing and manufacturing their own hardware for this). It cannot really be done with the generic compute power that we have available within Wikimedia. I advise watching this video by LTT, where they comment on last year's YouTube experiment of charging for 4K. LTT runs their own video website called https://www.floatplane.com so they have some experience with the cost of high-res video.
- 3. Media handling is unresourced by WMF. As in, there are 0 engineers dedicated to improving and modernizing the multimedia stack. The only effort spent is on keeping the current multimedia stack alive.
- 4. It is pretty hard to be WordPress AND the Internet Archive AND YouTube AND Thingiverse AND Wolfram Alpha on a shoestring budget. Wikimedia has always been 10+ years behind these websites and is likely to remain so. We can't scale like them, because we don't have a narrow focus, and because expertise at the bleeding edge is very expensive. Just waiting 10 years is the affordable way to get there.
- 5. The entire infrastructure for file storage currently has a 5GB limit. Files above that size cannot be addressed without major re-architecting of the storage layer used by Wikimedia. This re-architecting is unlikely to happen due to the earlier mentioned points.
- The best solution for this is still to host these on a separate archive site, and upload a smaller more websuitable version to Commons. —TheDJ (talk • contribs) 19:54, 2 November 2023 (UTC)
- Thank you very much for giving insights into this issue! The 5 GiB limit is a good thing to hear! Maybe we get a solution some time that is a compromise between resources and upload limits :) --PantheraLeo1359531 😺 (talk) 18:27, 4 November 2023 (UTC)
- Otherwise we have to ask Google for server (technology) support ;D --PantheraLeo1359531 😺 (talk) 18:31, 4 November 2023 (UTC)
Leonore Template ?
Hello, I quite often use the Gallica Template to source my uploads. Is there anything like that for the Léonore Database? If not, could this be done? Thanks in advance. William C. Minor (talk) 05:14, 15 October 2023 (UTC)
- We have templates for all types of sources (e.g. France-related ones), but I couldn't find one for Léonore. If this is a source we regularly use, it certainly makes sense to create a template. --rimshottalk 07:18, 3 November 2023 (UTC)
image-reviewer → license-reviewer
Hi. I propose to change the technical name of the license reviewer user group from image-reviewer to license-reviewer. This will make it consistent with the group's real name. In general everyone uses the term "license reviewer" and not "image reviewer", which is also more accurate, as the group members review all types of files and not just images. I see no reason why this shouldn't be done. We'll have to do the following things for this:
- Request on Phabricator for technical stuff (config, WikimediaMessages, migration...)
- Update abuse filters
- Update MediaWiki pages (includes Gadgets)
- Update user scripts if any
- Update required Commons, Template, Help pages (like COM:LR)
Please point out other things if any. Thanks! -- CptViraj (talk) 05:13, 2 November 2023 (UTC)
Discussion
I do not think this is a problem. The technical term for Admins is sysop and for Oversighters it is suppress. But if we do a change here we should also consider the proposal I made some months ago (Should we simplify user rights?) to remove the dedicated rollback right and give this right to all patrollers and license reviewers. GPSLeo (talk) 15:27, 2 November 2023 (UTC)
- I personally dislike sysop for admins, as it is no longer relevant and can be confused with system administrators, but it is a MediaWiki core setting, so changing it would be a big mess and is not in our hands. The license reviewer group, on the other hand, is a local Commons group, so it makes sense to keep it consistent and updated, and it is in our hands, so easy to change. For the proposal you linked, I personally agree with Mdaniels5757 there. For now, I'd say let's please keep that separate, as otherwise this proposal will be wider in scope than its original non-controversial (?) subject. -- CptViraj (talk) 17:26, 2 November 2023 (UTC)
- Not to say I'm against the idea, but don't you think "license-reviewer" is too close in wording (if not purpose) to the Volunteer Response Team or that there is a chance people will mix them up? --Adamant1 (talk) 20:10, 2 November 2023 (UTC)
- Well, non/new users have always confused them and will continue to do so. But I have never seen them calling it image reviewer, as our help and project pages use the term license reviewer and established users also use the latter term in general except when talking in a technical way, so I believe it won't make much of a difference for them. Even if they are using/reading/understanding the former term, it could be misleading for them, as it suggests that the group members review only images, which is not the case. They can also get confused and believe that license-reviewer and image-reviewer are two different user groups, so I think it makes sense to change it and make it consistent. And for us, established contributors who understand the workings, it was license reviewer with VRT access or without VRT access, and will remain the same. -- CptViraj (talk) 09:13, 3 November 2023 (UTC)
- @CptViraj: Wouldn't that be solved by calling it "file reviewer" then? I don't think just because "image" doesn't work that it necessarily means "license" does. --Adamant1 (talk) 20:33, 4 November 2023 (UTC)
- @Adamant1: The sole role of the user group is to review licenses, and that's what makes it different from patrollers; otherwise files can be reviewed by patrollers, too. So I believe 'file reviewer' could cause confusion. Also, 'license reviewer' has become a common name now, so changing it completely wouldn't be a good idea IMHO. -- CptViraj (talk) 21:12, 4 November 2023 (UTC)