Commons:Village pump/Technical

Shortcuts: COM:VP/T • COM:VPT

Welcome to the Village pump technical section

This page is used for technical questions relating to the tools, gadgets, or other technical issues about Commons; it is distinguished from the main Village pump, which handles community-wide discussion of all kinds. The page may also be used to advertise significant discussions taking place elsewhere, such as on the talk page of a Commons policy. Recent sections with no replies for 30 days and sections tagged with {{Section resolved|1=--~~~~}} may be archived; for old discussions, see the archives; recent archives: /Archive/2024/06 /Archive/2024/07.

Please note
 
SpBot archives all sections tagged with {{Section resolved|1=~~~~}} after 1 day and sections whose most recent comment is older than 30 days.

Can someone perhaps remove all the deleted and renamed files from User:Dispenser/Double extension so that only the ones that still need to be fixed are left? Jonteemil (talk) 20:46, 28 May 2024 (UTC)[reply]

Anyone can do so by forking the Quarry query mentioned in the SQL snippet (look for Fork in the upper right corner of the Quarry page), but I’m not comfortable editing a random other user’s subpage without seeing an explicit permission for doing so. —Tacsipacsi (talk) 17:37, 9 June 2024 (UTC)[reply]
I tried forking it, see quarry:query/83409, but it just keeps running. It has been running for a week now and still has not finished. Jonteemil (talk) 17:21, 11 June 2024 (UTC)
You could try to do just jpg ones or recent uploads.
At File:BSicon_numN327.svg.svg, I noticed that the GUI at Wikipedia lets you enter an extension and then adds one as well.
Before fixing individual files, I'd try to identify as many tools that create them and fix these first. Otherwise, more files will just be created. Enhancing999 (talk) 12:06, 1 July 2024 (UTC)[reply]
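
A minimal sketch of how such titles could be recognised before looking for the tools that create them. It assumes that a "double extension" means the same extension repeated (ignoring case); the actual criteria of the Quarry query are not shown in this thread, and the function name is made up for illustration.

```python
import os

def has_double_extension(title: str) -> bool:
    """Return True when a file title ends in the same extension twice (case-insensitive)."""
    name = title[len("File:"):] if title.startswith("File:") else title
    root, ext = os.path.splitext(name)       # "a.svg.svg" -> ("a.svg", ".svg")
    _, inner_ext = os.path.splitext(root)    # "a.svg"     -> ("a", ".svg")
    return ext != "" and ext.lower() == inner_ext.lower()

# Sample titles; the last one is made up for illustration.
titles = [
    "File:BSicon_numN327.svg.svg",
    "File:Kalocsaizsuzsa.jpg",
    "File:Example photo.JPG.jpg",
]
flagged = [t for t in titles if has_double_extension(t)]
```

Mixed pairs such as ".jpg.png" are deliberately not flagged here, since it is unclear from the thread whether the list covers them.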

Protection level

File:Kalocsaizsuzsa.jpg is autopatrol protected, so why does it show the (protectedpagetext: editprotected, edit) system message? Is the Wikibase part of the page indeed under full (sysop) protection? --Geohakkeri (talk) 21:40, 12 June 2024 (UTC)

That is weird. The text "This page is currently protected, and can be edited only by administrators." comes from Template:Protectedpagetext/PageProtected, but the expected text is at Template:Protectedpagetext/PageAutopatrolProtected. Both of these are transcluded by MediaWiki:Protectedpagetext depending on its first parameter $1. Here's the wikitext:
{{#switch: {{{1|$1}}}
 | editprotected = {{Protectedpagetext/PageProtected}} <!-- Fully protected -->
 | templateeditor = {{Protectedpagetext/PageTemplateProtected}} <!-- Template protected -->
 | editautopatrolprotected = {{Protectedpagetext/PageAutopatrolProtected}} <!-- editautopatrolprotected -->
 | #default = {{Protectedpagetext/PageSemiProtected}} <!-- Semi-protected -->
}}
Per mw:Manual:Interface/Protectedpagetext: $1 - the raw name of the right which is needed to edit the page. Special:ExpandTemplates for page File:Kalocsaizsuzsa.jpg and wikitext {{PROTECTIONLEVEL:edit}} gives editautopatrolprotected, as expected, but "protection level" and "name of the right" might not be the same. —⁠andrybak (talk) 20:08, 15 June 2024 (UTC)[reply]
editautopatrolprotected was added to MediaWiki:Protectedpagetext in Special:Diff/853065284 by User:GPSLeo, who is also the author of Template:Protectedpagetext/PageAutopatrolProtected. Perhaps they can check what went wrong. —⁠andrybak (talk) 20:15, 15 June 2024 (UTC)[reply]
This is the relevant code, I guess. There are editprotected and editsemiprotected hardcoded as the only options there. --Geohakkeri (talk) 20:51, 15 June 2024 (UTC)[reply]
So, if MediaWiki:Protectedpagetext depended on {{PROTECTIONLEVEL:edit}} rather than the proper parameter, it would be a quick fix at least. --Geohakkeri (talk) 21:11, 15 June 2024 (UTC)
Hmm. For reference, English Wikipedia's en:MediaWiki:Protectedpagetext has a similar #switch, with protect, editprotected, templateeditor, and extendedconfirmed.
Searching the code of MediaWiki,[1] I also found mentions of Protectedpagetext in PermissionManager.php,[2] which passes as the first parameter $1 either the string protect or a variable $right, which comes from function getRestrictions of RestrictionStore. My knowledge of PHP is limited, but I'd guess that possible values for restrictions come from $wgRestrictionLevels, hence templateeditor and editautopatrolprotected in Commons' version and templateeditor and extendedconfirmed in enwiki's version. —⁠andrybak (talk) 22:15, 15 June 2024 (UTC)[reply]
{{MediaWiki:Protectedpagetext|{{PROTECTIONLEVEL:edit}}}} on File:Kalocsaizsuzsa.jpg would display Template:Protectedpagetext/PageAutopatrolProtected.
Wonder if it works correctly on enwiki, w:Special:WhatLinksHere/Template:Protected_page_text/extendedconfirmed has no uses.
https://commons.wikimedia.org/wiki/File:Kalocsaizsuzsa.jpg?uselang=qqx shows (protectedpagetext: editprotected, edit)
Not sure what {{CASCADINGSOURCES}} is meant to do.
Maybe we could insert a switch based on Protectionlevel after "|editprotected =". Enhancing999 (talk) 15:15, 30 June 2024 (UTC)[reply]
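
To make the suggestion concrete, here is a small Python model of the #switch quoted above, plus a hypothetical variant that re-dispatches on the protection level whenever $1 arrives as editprotected. This only simulates the template logic; the function names are invented, and the real change would have to be made in the wikitext of MediaWiki:Protectedpagetext.

```python
# Template transcluded for each value of $1, as in the wikitext quoted in this thread.
SWITCH = {
    "editprotected": "Protectedpagetext/PageProtected",
    "templateeditor": "Protectedpagetext/PageTemplateProtected",
    "editautopatrolprotected": "Protectedpagetext/PageAutopatrolProtected",
}
DEFAULT = "Protectedpagetext/PageSemiProtected"

def current_message(right: str) -> str:
    """Model of the existing switch: dispatch on $1 only."""
    return SWITCH.get(right, DEFAULT)

def proposed_message(right: str, protection_level: str) -> str:
    """Hypothetical variant: under 'editprotected', consult {{PROTECTIONLEVEL:edit}} as well."""
    if right == "editprotected" and protection_level in SWITCH:
        return SWITCH[protection_level]
    return current_message(right)

# The case reported for File:Kalocsaizsuzsa.jpg: $1 is editprotected,
# while {{PROTECTIONLEVEL:edit}} expands to editautopatrolprotected.
before = current_message("editprotected")
after = proposed_message("editprotected", "editautopatrolprotected")
```

With the reported inputs, the current logic picks PageProtected and the variant picks PageAutopatrolProtected; pages whose protection level is not listed in the switch keep the existing behaviour.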
I tried to look into this at test.wikipedia.org, but the settings there are different.
Here what is displayed on a file description page comes from javascript var "wbmiProtectionMsg". This might be filled with the actual protection level of the javascript source. Enhancing999 (talk) 12:16, 1 July 2024 (UTC)[reply]
I think CASCADINGSOURCES means sources of cascade protection. Alfa-ketosav (talk) 16:17, 19 July 2024 (UTC)[reply]

Footnotes

  1. git grep -i protectedpagetext -- '.' ':^languages/'
  2. note the manual mapping of sysop and autoconfirmed needed for backwards compatibility

Interface administrator requests at MediaWiki talk:Gadget-Cat-a-lot.js

There are several edit requests for interface administrators at MediaWiki talk:Gadget-Cat-a-lot.js. The following edit requests have diffs with proposals. In order of importance:

  1. Bug fix: MediaWiki talk:Gadget-Cat-a-lot.js/Archive 3#Minor edit unmarking feature not working (Special thanks to User:Miraclepine for reporting the bug.)
  2. Localization fix: MediaWiki talk:Gadget-Cat-a-lot.js/Archive 3#Mobile-frontend-return-to-page
  3. UI tweak: MediaWiki talk:Gadget-Cat-a-lot.js/Archive 3#Please add link to Help:Gadget-Cat-a-lot in the box

The page MediaWiki talk:Gadget-Cat-a-lot.js already has instances of {{Edit request}}. Because of it, these new requests won't show up in watchlists of those watching Category:Commons protected edit requests for interface administrators. Hence this additional message at Village pump. —⁠andrybak (talk) 16:31, 15 June 2024 (UTC)[reply]

Lucas Werkmeister, as the most recently active interface administrator with recent edits in Gadgets, could you please take a look? —⁠andrybak (talk) 19:46, 16 June 2024 (UTC)[reply]
Did two of them, leaving the third one open for feedback for a moment. And yeah, the watchlist issue is a general problem with the current edit request system – MediaWiki talk:Copyupload-allowed-domains also suffers from it from time to time. Lucas Werkmeister (talk) 21:03, 16 June 2024 (UTC)[reply]
Thank you! I've struck out the completed requests above. —⁠andrybak (talk) 21:20, 16 June 2024 (UTC)[reply]
Third one also done, and I’ll see if I can deal with Valerio’s edit request too, to get this out of the category. Lucas Werkmeister (talk) 20:03, 19 June 2024 (UTC)[reply]
I've disabled Valerio's request. Nardog proposed a bugfix two days ago in MediaWiki talk:Gadget-Cat-a-lot.js § Random unexpected failures at enwiki. —⁠andrybak (talk) 00:01, 25 June 2024 (UTC)[reply]
Updated the links to the archived sections. Struck out the third request, which was implemented in Special:Diff/885487790. —⁠andrybak (talk) 19:24, 27 June 2024 (UTC)[reply]
Section MediaWiki talk:Gadget-Cat-a-lot.js#Random unexpected failures at enwiki has a patch, which is already tested. Could an interface administrator please take a look? —⁠andrybak (talk) 20:24, 8 July 2024 (UTC)[reply]
User:AntiCompositeNumber or User:Mike Peel, could you please take a look at the edit request by Nardog: MediaWiki talk:Gadget-Cat-a-lot.js#Random unexpected failures at enwiki? —⁠andrybak (talk) 13:28, 13 July 2024 (UTC)[reply]

Upload functions used by various tools

Just wondering, is there a technical difference in the backend between the following ways:

Some observations:

  • I'd expect #1 and #2 to be the same, but somehow uploads are less likely to fail if one creates the file description page first and then uses the "upload" link there (#2).
  • The documentation for #4 mentions the api. Presumably this is the same being used by #5. The test I did with #4 seemed to work better than #3 usually does.

If asked to rank the reliability of these tools, I'd say #5/#4, #2, #1, #3. Enhancing999 (talk) 15:29, 24 June 2024 (UTC)

Still curious about this. Maybe #2 works better than #1 as it doesn't involve creating the page. Enhancing999 (talk) 11:40, 24 July 2024 (UTC)[reply]
  • 1 and 2 are exactly the same. But indeed, if there is no page yet, more operations are involved. And if you upload a new version of the same file, even more operations are involved (such as archiving the old file), and all of them need to succeed.
  • 3 uses chunked uploading, which is a lot more complex than 1 and 2 but can also support much larger files.
  • 4 uses JS to upload to the API. This is another entry point into 1/2, but behind the entry point it works identically.
  • 5 uses the same APIs as 4 and 3 (and can do both chunked and non-chunked uploads).
The backend is not the full story, however. Each frontend/entry point has to implement multiple 'recovery' procedures that may improve the reliability of uploading. Session expiration, dropped connections, token refreshing, etc. can all be handled by each entry point (or not). —TheDJ (talkcontribs) 13:38, 24 July 2024 (UTC)
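
As a rough illustration of why chunked uploading has more moving parts, the sketch below only shows the client-side splitting step: the file is cut into pieces, and each piece is sent together with its byte offset so the server can reassemble them. No network calls are made, and the function name and chunk size are assumptions for illustration, not what any of the tools above actually use.

```python
from typing import Iterator, Tuple

def iter_chunks(data: bytes, chunk_size: int) -> Iterator[Tuple[int, bytes]]:
    """Yield (offset, chunk) pairs that together cover the whole payload in order."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for offset in range(0, len(data), chunk_size):
        yield offset, data[offset:offset + chunk_size]

# Tiny payload so the offsets are easy to follow; real chunks would be megabytes.
payload = b"abcdefghij"
chunks = list(iter_chunks(payload, 4))
```

Every chunk is a separate request that can fail or hit an expired session on its own, which is where the recovery procedures mentioned above come in.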

Tech News: 2024-26

MediaWiki message delivery 22:29, 24 June 2024 (UTC)[reply]

Tech News: 2024-27

MediaWiki message delivery 23:56, 1 July 2024 (UTC)[reply]

Cleaning up list of followed pages / How to limit an extended watchlist?

Hi, is there a way to do a mass cleanup of the list of followed pages? My list is so long that it frequently refuses to load (it times out). I want to keep a much smaller collection of files on the list. How can I mass-remove selections of files, such as the complete contents of some categories and their subcategories? What's the best practice for this? Alternatively, can I follow only what happens to my personal uploads? Peli (talk) 12:20, 2 July 2024 (UTC)

Have you reviewed Special:Preferences#mw-prefsection-watchlist, so that you don't keep adding more to your list? RZuo (talk) 13:59, 2 July 2024 (UTC)
@Pelikana: You can get the raw contents of your watchlist from Special:EditWatchlist/raw, and you can then edit it either in your browser or in the text editor of your choice. I can't immediately think of a way to winnow it by category, but it looks like it's sorted by when items were added so it might be that the items you want to remove are conveniently close together. --bjh21 (talk) 23:17, 2 July 2024 (UTC)[reply]
Thanks both. I have now limited the new additions to my list by following the first suggestion. But for suggestion #2: any time I try to edit my watchlist I receive some kind of fatal error like "Wikimedia\Rdbms\DBQueryDisconnectedError", so I have never been able to actually edit anything in it. Peli (talk) 13:58, 7 July 2024 (UTC)
I got the raw list now and worked on it and want to know some details. Can it ignore lines with typos or does that give an error? And how to get the cleaned up list back in place, since the old list is so long that it times out in the replacing process with cut and paste. Peli (talk) 22:43, 9 July 2024 (UTC)[reply]
SOLVED. By clearing the list by the red button, and pasting the limited one, it seemed to have worked out well. The manual clearing was very tedious and time consuming tho even in notepad++. But I found the options to add and remove to watchlist by buttons on the watchlist now, great. Thanks. Peli (talk) 16:41, 14 July 2024 (UTC)[reply]
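
For anyone who wants to avoid the manual clearing, the winnowing step can be sketched offline. The code assumes the raw watchlist is plain text with one page title per line, and that the titles to drop (for example the members of a category) have already been collected by other means; both the function name and the sample titles are made up for illustration.

```python
def prune_watchlist(raw: str, unwanted: set) -> str:
    """Return the raw watchlist text without blank lines and without unwanted titles."""
    # Compare titles with underscores treated as spaces, as wiki titles are.
    unwanted_normalised = {t.replace("_", " ") for t in unwanted}
    kept = []
    for line in raw.splitlines():
        title = line.strip()
        if not title:
            continue
        if title.replace("_", " ") in unwanted_normalised:
            continue
        kept.append(title)
    return "\n".join(kept)

raw_list = "File:A.jpg\nFile:B_c.jpg\n\nCategory:Trees\n"
cleaned = prune_watchlist(raw_list, {"File:B c.jpg"})
```

The cleaned text can then be pasted back after clearing the old list, as described above.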

It appears that the most recent version of this file (which, according to the talk page is a 4K restored version of the film) was not uploaded properly and cannot be played: "No compatible source was found for this media." Can someone please fix this? Johnj1995 (talk) 03:16, 4 July 2024 (UTC)[reply]

@Johnj1995: Hmm, the raw webm file seems to work fine, but it won't play in the Media player. I would suggest filing a Phabricator ticket about it. You may need to revert it to the previous version for now. Nosferattus (talk) 22:32, 7 July 2024 (UTC)[reply]
@Nosferattus: Per the uploader's comment on a featured media nomination for another film that cannot be played, the error is related to this Phabricator ticket: https://phabricator.wikimedia.org/T357215 Johnj1995 (talk) 03:27, 8 July 2024 (UTC)[reply]

SVG rendering on election maps

I just uploaded a series of new maps for Icelandic parliamentary elections. I am seeing that despite the files being very similar, there are some inconsistencies in the rendering of certain text elements. The circles should have abbreviations of the district names, but these only appear in the 2021 map. In front of the party names there are boxes with the letters used to identify the parties, and these sometimes don't show up. I have no idea why this happens. The font used is DejaVu Sans, which should work fine with Wikimedia. Bjarki S (talk) 09:41, 4 July 2024 (UTC)

I have identified the problem. For whatever reason, Inkscape left the x and y coordinates of the missing elements as 0 in their tspan tags. I'm fixing this manually in the XML editor. Bjarki S (talk) 10:16, 4 July 2024 (UTC)
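
A possible way to spot such elements without opening the XML editor, sketched with Python's standard library. It assumes the affected tspan elements carry both x and y attributes equal to zero; the sample SVG and its ids are invented.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

def _is_zero(value) -> bool:
    """True if the attribute is present and parses as the single number 0."""
    try:
        return value is not None and float(value) == 0.0
    except ValueError:
        return False  # e.g. a list of coordinates such as "0 10"

def zeroed_tspans(svg_source: str) -> list:
    """Return the ids of tspan elements whose x and y are both 0."""
    root = ET.fromstring(svg_source)
    return [
        tspan.get("id")
        for tspan in root.iter("{%s}tspan" % SVG_NS)
        if _is_zero(tspan.get("x")) and _is_zero(tspan.get("y"))
    ]

sample = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    '<text x="10" y="20"><tspan id="tspan1" x="10" y="20">RN</tspan></text>'
    '<text><tspan id="tspan2" x="0" y="0">RS</tspan></text>'
    '</svg>'
)
suspects = zeroed_tspans(sample)
```

The returned ids only point at candidates; a tspan at the origin can be intentional, so each hit still needs a look.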

Occupation "greek-catholic priest" instead of "politician" in Wikidata Infobox

Is it just me or is the Wikidata Infobox at Category:Iriana saying that Iriana's occupation is "greek-catholic priest" instead of "politician"? I checked the Wikidata entry on her and on "politician" and it says correctly "politician". Where is the "greek-catholic priest" coming from? (note: I'm accessing the page via mobile browser. I've checked mobile view and desktop view on mobile browser but the infobox display is the same.) Nakonana (talk) 20:12, 4 July 2024 (UTC)[reply]

Just checked infoboxes of other politicians on Commons and they all list "greek-catholic priest" as occupation instead of "politician". Nakonana (talk) 20:14, 4 July 2024 (UTC)[reply]
Maybe mention it at Template talk:Wikidata Infobox. Seems to come from [14]. Enhancing999 (talk) 08:12, 5 July 2024 (UTC)[reply]
As this is already reverted, purging the page to clear the cache should solve this. GPSLeo (talk) 08:32, 5 July 2024 (UTC)
Would you kindly do so? Enhancing999 (talk) 08:33, 5 July 2024 (UTC)[reply]
Up to 147559 category pages are concerned: [15], but it seems to be better now. Enhancing999 (talk) 05:58, 6 July 2024 (UTC)[reply]
Looks like it got fixed now. Nakonana (talk) 11:42, 6 July 2024 (UTC)[reply]
A manual purge of Category:Iriana does not seem to do the trick. Nakonana (talk) 15:56, 5 July 2024 (UTC)[reply]

Harvest coord from metadata

Somehow the coordinates of File:Ccmhj.jpg, taken with an iPhone 14 Pro, were not detected by Commons. A bot to check metadata and fill the coordinates into SDC would be nice. RZuo (talk) 08:57, 6 July 2024 (UTC)
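
One step such a bot would need is converting the degrees/minutes/seconds values typically stored in photo metadata into the decimal degrees used for coordinates. A minimal sketch of that conversion follows; reading the metadata and writing to SDC are left out, and the function name and sample values are illustrative only.

```python
def dms_to_decimal(degrees: float, minutes: float, seconds: float, ref: str) -> float:
    """Convert degrees/minutes/seconds plus a hemisphere letter to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # South latitudes and west longitudes are negative in decimal notation.
    return -value if ref.upper() in ("S", "W") else value

lat = dms_to_decimal(48, 51, 30.0, "N")
lon = dms_to_decimal(0, 30, 0.0, "W")
```

Why the coordinates of this particular file were not picked up is a separate question that the conversion does not answer.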

Annotations not showing

It seems I was able to add image notes in the past here but now I am unable to see them or the add note button - File:Coleman_Bangalore_entomologists.jpg - any way to turn on the annotation button which shows up on other images? Shyamal L. 11:10, 6 July 2024 (UTC)[reply]

Hi Shyamal, I am not sure if your problem is related but Fix the Image Annotator may be relevant. Commander Keane (talk) 05:59, 8 July 2024 (UTC)[reply]
@Commander Keane: Added my support. Jeez, never knew we could be that helpless in the open source world. Shyamal L. 06:02, 8 July 2024 (UTC)[reply]
@Shyamal: I think voting has closed for that RfC. I support a technical needs survey that is always open to suggestions and voting on Commons though. Commander Keane (talk) 06:09, 8 July 2024 (UTC)

Automatic categorization of subtitles needs to be renamed

If a video (e.g. File:1952. Аленький цветочек.webm) has Slovene subtitles (e.g. TimedText:1952._Аленький_цветочек.webm.sl.srt), then it is categorized in Category:Files with closed captioning in Slovenian, but the main category (and English Wikipedia article, for what it's worth) are called "Slovene", not "Slovenian", cf. Category:Slovene language. —Justin (koavf)TCM 05:50, 8 July 2024 (UTC)[reply]

✓ Done Special:Diff/894319705 --Geohakkeri (talk) 06:21, 8 July 2024 (UTC)[reply]
hvala. —Justin (koavf)TCM 16:22, 8 July 2024 (UTC)[reply]

Tech News: 2024-28

MediaWiki message delivery 21:28, 8 July 2024 (UTC)[reply]

Help needed from admins speaking javascript

I am working on a backlog of {{Edit request}}s. I can handle most file, template and Lua requests but I do not speak javascript. Can an admin help with requests at Category:Commons_protected_edit_requests_for_interface_administrators? Jarekt (talk) 17:23, 9 July 2024 (UTC)[reply]

A gadget to mute audio of a video with one click

Is there any gadget/tool/proposal for such a button on pages for videos that have audio?

I think many files in Category:Videos featuring unidentified music need their audio muted and one example case of a video that (as far as I can see) needs to be muted is File:Beijing to Shanghai by train timelapse.webm.

It would be very cumbersome if one first needs to download a large video, modify it somehow (which most users can't readily, don't bother doing, or would take them long), and then reupload as a new version before tagging the page with {{Overwritten revdel}} which probably even most active users don't know about (and adding Category:Videos without audio).

Instead, it should be just a click that makes the server run some ffmpeg command to remove the audio or similar. I don't know if this has been proposed somewhere if it doesn't yet exist. Prototyperspective (talk) 22:14, 12 July 2024 (UTC)[reply]
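
For reference, the kind of ffmpeg invocation meant here could look like the sketch below: -an drops the audio streams and -c:v copy keeps the video stream as it is, so nothing is re-encoded. How a server-side job would be triggered from a button is left open; the helper name and file names are placeholders.

```python
import subprocess

def mute_command(src: str, dst: str) -> list:
    """Build an ffmpeg command that copies the video stream and drops all audio."""
    return ["ffmpeg", "-i", src, "-an", "-c:v", "copy", dst]

def mute(src: str, dst: str) -> None:
    """Run the command; requires ffmpeg to be installed and on the PATH."""
    subprocess.run(mute_command(src, dst), check=True)

cmd = mute_command("input.webm", "output_muted.webm")
```

Because the video stream is copied rather than re-encoded, the muted file should keep the original picture quality.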

you imported the example video.
if you were not sure that the music is free, then you should have imported only the video using v2c! RZuo (talk) 05:43, 13 July 2024 (UTC)[reply]
Yes, I noticed it only afterwards and this made me wonder about such a button; your comment is not helpful. Prototyperspective (talk) 10:22, 13 July 2024 (UTC)[reply]
But why do you trust that the copyright statement at the source is correct for the video but not for the audio? GPSLeo (talk) 12:26, 13 July 2024 (UTC)[reply]
Because it was self-recorded by the youtuber who set this license? Also not helpful and offtopic. Prototyperspective (talk) 12:28, 13 July 2024 (UTC)[reply]

DelReqHandler broken for April requests?

It seems that something broke the DelReqHandler tool on Commons:Deletion requests/2024/04, the usual links for closing requests don't appear there, any idea how to fix? Gestumblindi (talk) 11:18, 14 July 2024 (UTC)[reply]

DelReqHandler links appear for requests from April 18 and newer, but not for older April requests. I suppose something around April 18 went wrong? Gestumblindi (talk) 18:59, 15 July 2024 (UTC)[reply]
Issue still persists. Gestumblindi (talk) 09:19, 24 July 2024 (UTC)[reply]

New technical problem with generation of SVG preview images

The preview images of File:MitigationOptions costs potentials IPCCAR6WGIII rotated-de.svg are broken. They used to be rendered and shown correctly. Since the graphic hasn't changed since March 2023, it appears something with the SVG renderer is broken. Does anyone know what happened? --DeWikiMan (talk) 14:56, 14 July 2024 (UTC)

Possibly the use of fill:currentColor and stroke:currentColor. WMF supports SVG 1.1. File claims to be SVG 1.1 (which uses a subset of CSS 2), but currentColor is from CSS 3. The value is supposed to select the current value of the color property. GNUPlot is not emitting SVG 1.1. The WMF renderer was changed (April 2024?) to a version that is a few years behind the latest release. Maybe a more recent version of librsvg (the WMF renderer) supports the property. Glrx (talk) 01:40, 15 July 2024 (UTC)[reply]
Thank you Glrx for the suggestion.
I substituted all occurrences of "currentColor" with "black". The SVG 1.1 validator basically says that it is correct now (except for the RDF metadata and Inkscape elements, see validator.nu). I also tried saving it as "plain SVG" from Inkscape. I uploaded both to Commons. Neither helped.
I ran rsvg-convert (version 2.52.5) on it and it gave a "rendering error: InvalidMatrix", whatever that means...
Do you have any further suggestions? I'd really appreciate it.
--DeWikiMan (talk) 17:58, 15 July 2024 (UTC)[reply]
Creating an "optimized SVG" from Inkscape did the trick. I don't know exactly why. I believe the problem could be related to this librsvg problem [16]. Probably one of the transform matrices was not invertible. In such a case, the librsvg version now used on Commons possibly no longer ignores the transform, but fails and stops rendering.
--DeWikiMan (talk) 19:16, 15 July 2024 (UTC)[reply]
@DeWikiMan: Looks like you found the answer 30 minutes later. Glrx (talk) 19:21, 15 July 2024 (UTC)[reply]
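
If the non-invertible-matrix guess is right, affected files could be found by checking the determinant of each matrix(a b c d e f) transform, which is a*d - b*c. The sketch below does only that: it looks at transform attributes, ignores other transform functions such as scale(0), and assumes the numbers are separated by spaces or commas. The sample SVG and ids are invented.

```python
import re
import xml.etree.ElementTree as ET

MATRIX_RE = re.compile(r"matrix\(([^)]*)\)")

def singular_matrices(svg_source: str) -> list:
    """Return the ids of elements whose transform contains a matrix with zero determinant."""
    root = ET.fromstring(svg_source)
    bad = []
    for element in root.iter():
        for match in MATRIX_RE.finditer(element.get("transform", "")):
            parts = [p for p in re.split(r"[\s,]+", match.group(1).strip()) if p]
            try:
                numbers = [float(p) for p in parts]
            except ValueError:
                continue  # number syntax this sketch does not handle
            if len(numbers) != 6:
                continue
            a, b, c, d = numbers[:4]
            if a * d - b * c == 0:
                bad.append(element.get("id"))
    return bad

sample = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    '<g id="ok" transform="matrix(1,0,0,1,10,20)"/>'
    '<g id="flat" transform="translate(5) matrix(2 0 0 0 3 4)"/>'
    '</svg>'
)
flat_ids = singular_matrices(sample)
```

An exact comparison with zero is used for simplicity; whether the renderer applies a tolerance is not known from this thread.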

MediaWiki internal error

Accidentally set the license tag to {{|cc-by-sa-4.0-sikander}} instead of {{cc-by-sa-4.0-sikander}} on File:LCBO strike - Market street - 20240713C.jpg and got this error:
MediaWiki internal error.
Original exception: [a752adf9-f969-4f5e-b251-829dc2d1186e] 2024-07-14 20:57:43: Fatal exception of type "Wikimedia\Rdbms\DBUnexpectedError"
Exception caught inside exception handler.
Set $wgShowExceptionDetails = true; at the bottom of LocalSettings.php to show detailed debugging information.

Should I report this somewhere other than here? Regards // sikander { talk } 🦖 21:03, 14 July 2024 (UTC)[reply]

@PantheraLeo1359531: No, not happening now. Got that error a few times when updating the files but after a few minutes it started working fine. // sikander { talk } 🦖 16:54, 15 July 2024 (UTC)[reply]
Good, I assume it was only a short temporary error ;) --PantheraLeo1359531 😺 (talk) 18:18, 15 July 2024 (UTC)

Tech News: 2024-29

MediaWiki message delivery 01:28, 16 July 2024 (UTC)[reply]

Query to find dates of DR items

Hi, the first list of files in the DR Commons:Deletion requests/Professional wrestling magazines has copyright issues depending on the date; if published after ~23 October 1987 then they are likely to be deleted. Many of the files only have the year in the filename (and some are missing the year in the filename), but most seem to have a specific date on the file itself - e.g. File:Ric Flair, circa Spring 1987 (cropped).jpg has 1987 in the filename but a date of 1 March 1987 on the file. Is it possible for someone to run a query or join to bring the date from each file into the DR, so that the closer can identify which fall before 23 October 1987 and which fall after it and are eligible for deletion (depending on the DR decision)? Consigned (talk) 13:06, 17 July 2024 (UTC)

✓ Done, thank you Geohakkeri! Consigned (talk) 16:52, 17 July 2024 (UTC)[reply]
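
Once the dates are available, sorting the files around the cutoff is straightforward. The sketch below is not the query that was actually run; it assumes the dates have already been extracted into a mapping, treats the cutoff day itself as belonging to the later group (the original post only gives an approximate date), and uses one invented file name next to the example from the request.

```python
from datetime import date

CUTOFF = date(1987, 10, 23)

def split_by_cutoff(files: dict) -> tuple:
    """Split a {title: date} mapping into titles dated before the cutoff and the rest."""
    before = sorted(title for title, d in files.items() if d < CUTOFF)
    on_or_after = sorted(title for title, d in files.items() if d >= CUTOFF)
    return before, on_or_after

dated_files = {
    "File:Ric Flair, circa Spring 1987 (cropped).jpg": date(1987, 3, 1),
    "File:Hypothetical cover, 1988.jpg": date(1988, 1, 15),  # made-up entry
}
earlier, later = split_by_cutoff(dated_files)
```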

Problems with PDF Preview

Hello, I noticed that since a few hours ago the PDF preview function has been buggy. I've been uploading slides for some time today and noticed that I could see neither the thumbnail nor the preview on file pages. I checked with my friends, one on the same network as me and another working in a different location, and also on my phone with different networks. All of them reported the same problem. Is this a known bug, or is it a problem with my files? Thank you Hisyam Athaya (WMID) (talk) 09:25, 18 July 2024 (UTC)

It seems to be a known problem. @Sannita (WMF): is there a plan to fix it? Enhancing999 (talk) 17:48, 22 July 2024 (UTC)[reply]
The preview worked again, I asked other people who worked a lot with Commons and they confirmed this is a known problem. Hisyam Athaya (WMID) (talk) 02:19, 23 July 2024 (UTC)[reply]
It still needs to be fixed. Enhancing999 (talk) 07:18, 23 July 2024 (UTC)[reply]
AFAIK there are several tickets on Phabricator on the topic, so it is a known bug. I don't know which team has it, though, and I'm afraid the priority is not high on this. I'll try to investigate this. Sannita (WMF) (talk) 15:34, 23 July 2024 (UTC)[reply]

Hosting of free fonts in Commons

Because of the technical aspects of the following RfC, I thought it would be a good idea to crosspost a link to it, Commons:Requests for comment/Hosting of free fonts in Commons, in the technical village pump. Pardon me if you find this unsuitable or think the RfC is already visible enough. Thanks 😊 −Ebrahimtalk 12:48, 18 July 2024 (UTC)

Good idea, thank you for bringing this up :) --PantheraLeo1359531 😺 (talk) 07:25, 19 July 2024 (UTC)[reply]

SVG abruptly not displaying

At some point within the last month File:LGBTCannabis_white.svg stopped displaying, and it's unclear to me why as this file was uploaded in 2020 and hasn't recently been changed other than being added to an additional category. Clicking 'Original file' gives a 'XML Parsing Error: prefix not bound to a namespace' error, while clicking any of the resolution PNG previews gives a 429 error. This error doesn't seem to be something on my end as I asked someone else on a different computer from a different internet connection to take a look and they confirmed that it's broken for them as well. Apologies if this is a known issue that's being worked on or something a-la graphs extension - I don't frequent Commons. Waxworker (talk) 05:08, 19 July 2024 (UTC)[reply]

@Waxworker: File:LGBTCannabis_white.svg is not a valid XML file. The Commons SVG renderer librsvg was recently upgraded from 2.44 to 2.50, which uses a stricter XML parser, resulting in this error. The fix here is to add an xmlns:sodipodi namespace declaration or just remove the sodipodi:nodetypes attribute. Other files affected by this:
Dexxor (talk) 09:49, 19 July 2024 (UTC)[reply]
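
The second fix can be tried locally with a few lines of Python. The sketch relies on the standard-library parser rejecting unbound namespace prefixes, strips only double-quoted sodipodi:nodetypes attributes, and uses a made-up SVG fragment rather than the actual file.

```python
import re
import xml.etree.ElementTree as ET

def is_well_formed(svg_source: str) -> bool:
    """True if the source parses as namespace-aware XML."""
    try:
        ET.fromstring(svg_source)
        return True
    except ET.ParseError:
        return False

def drop_nodetypes(svg_source: str) -> str:
    """Remove every sodipodi:nodetypes="..." attribute (double-quoted values only)."""
    return re.sub(r'\s+sodipodi:nodetypes="[^"]*"', "", svg_source)

broken = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    '<path sodipodi:nodetypes="cc" d="M0 0L1 1"/>'
    '</svg>'
)
fixed = drop_nodetypes(broken)
```

If other sodipodi: or inkscape: attributes are present without a declaration, the file will still fail to parse after this step, and adding the namespace declaration is the more general fix.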

The link on this template showing the copyright notice does not work; perhaps it is outdated. I mean the link in the sentence "The text of permission is available here." The current link on "here" is https://mosreg.ru/about/, which is not accessible. The correct link would be https://mk.mosreg.ru/o-sayte . Could a template expert please correct the link? I am not aware of all the template technicalities. Regards, Ellywa (talk) 08:44, 19 July 2024 (UTC)

OCR to auto-categorize maps / charts by year shown

Is there any gadget/tool for optical character recognition (OCR) of files on Wikimedia Commons?

If there is no such thing it would be really great if somebody could give it a try, it could be very useful.


I'd like to categorize Our World in Data maps by the year of the data into Category:Maps of the world by year as well as OWID charts by the latest data point into Category:Charts by year of latest data.

This is useful for many reasons such as making things in the image explicit as metadata, making things queryable (for example combining cats using petscan), statistics, search (see the search box), better enabling people to find the latest version for some data, better WMC search engine results, and (probably most importantly) updating outdated/old datagraphics that are in use (GLAMorgan can be used for that).

The issue there is that there are really many OWID files (which should now all be in the OWID category) and there may be even far more once people upload "image stacks" for the OWID Gadget if that is the way used to display more interactive OWID data (which I oppose as suboptimal).

One could go through the former manually which also has the advantage that many of these are missing one or a few other categories but the second one really has too many items to do that manually and again more OWID datagraphics keep getting uploaded and this isn't only about OWID datagraphics (there's also other cats one could scan).

See also my related comment here that is about machine vision on WMC more generally or automated species identification: …open letter…#Image recognition software for categorisers.

In my example usecase, an OCR Commons tool could for example OCR read all numbers in a file (files of the petscan results) and then (if it found one or a plausible one) set the category for the latest year that is ≤ current year. Prototyperspective (talk) 11:43, 19 July 2024 (UTC)[reply]

For Category:Images by text that could be helpful too. Ideally one could choose
  • a word, group of words, or category tree
  • define a maximum number of words or characters that should be on an image (sample: less than 5 words). This to avoid doing OCR on lengthy texts.
Then confirm suggestions made by OCR. Enhancing999 (talk) 12:21, 19 July 2024 (UTC)[reply]
SVG file to OCR
I do not know about gadgets.
There is an OCR tool.
See https://ocr.wmcloud.org/ for direct interface and API documentation.
It will work with PNG files but not SVG files (which can be converted to PNG and then OCR'd).
One can get the URL for a PNG rendering of an SVG file. Here's a conversion that is 887 pixels wide
Here's a Polish OCR run on that PNG:
So the Polish text is (converting Unicode code points to Unicode)
  • Typ ściągający
  • Typ naciskający
  • Typ obustronny
But why OCR an SVG file? The PetScan query shows SVG files that have text elements.
With JavaScript, read the SVG file with the Fetch API, grab the text elements with getElementsByTagNameNS(nsSVG, "text"), ask for the .textContent of each text element, and then search that string for the years or terms you want.
I do not know about the rest of the task.
Glrx (talk) 14:57, 19 July 2024 (UTC)[reply]
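
The same idea, reading the text elements instead of running OCR, can be sketched offline in Python for batch work. This is an equivalent of the JavaScript approach described above rather than a tested gadget; the sample SVG is invented and merely reuses two of the Polish labels from the OCR result.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

def svg_text_strings(svg_source: str) -> list:
    """Return the text content of every SVG text element, including nested tspans."""
    root = ET.fromstring(svg_source)
    return [
        "".join(text_element.itertext()).strip()
        for text_element in root.iter("{%s}text" % SVG_NS)
    ]

sample = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    '<text x="5" y="10"><tspan>Typ </tspan><tspan>ściągający</tspan></text>'
    '<text x="5" y="30">Typ naciskający</text>'
    '</svg>'
)
labels = svg_text_strings(sample)
```

The resulting strings can then be searched for years or terms exactly as with OCR output, without recognition errors.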
Wow great so around 70% of this already exists! Thanks a lot for this info. Now it basically only needs a way to make it scan files in petscan results.
SVG files always have a PNG file linked beneath them so they don't need to be converted again.
However, SVG files already have the text as plain text in them, so rather than OCRing them it would be better if the text contained in them was read somehow. However, that (which you also described in your bottom paragraph) is not needed here:
I tested it like so with a PNG render underneath File:Death-rate-smoking,1996.svg and it worked very well.
If there was a tool where one can e.g. enter a petscan ID and it makes these requests the other thing needed would be
  1. the small code that checks for the latest plausible year-number (and either in the first few lines / title or not in the same line as Data source)
  2. a bot that adds the categories to the files accordingly.
Is there a developer here who is interested in building these three missing parts assuming they don't also exist already? Prototyperspective (talk) 15:37, 19 July 2024 (UTC)[reply]
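
The first missing part could be as small as the sketch below. It assumes the OCR result is available as plain text, skips any line mentioning "Data source", treats four-digit numbers from 1500 to 2099 as candidate years, and keeps the latest one that is not in the future. The range, the function name and the sample text are all assumptions for illustration.

```python
import re
from datetime import date

# Four-digit numbers from 1500 to 2099 that are not part of a longer digit run.
YEAR_RE = re.compile(r"(?<!\d)(1[5-9]\d\d|20\d\d)(?!\d)")

def latest_plausible_year(ocr_text: str, current_year: int = None):
    """Return the latest year <= current_year found outside 'Data source' lines, or None."""
    if current_year is None:
        current_year = date.today().year
    years = []
    for line in ocr_text.splitlines():
        if "data source" in line.lower():
            continue
        years.extend(int(y) for y in YEAR_RE.findall(line))
    plausible = [y for y in years if y <= current_year]
    return max(plausible) if plausible else None

# Invented OCR output in the style of a chart with a title and a source line.
sample_text = (
    "Death rate from smoking, 1996\n"
    "Data source: Example Institute (2024)\n"
    "Rate per 100,000 people\n"
)
year = latest_plausible_year(sample_text, current_year=2024)
```

A bot would still need a rule for files where no plausible year is found; returning None leaves that decision to the caller.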
https://ocr.wmcloud.org/ interesting tool. Quite surprising what OCR on photos actually gives. I tried:
Both found "rue des lauriers", but the first also a motto and the second part of sticker from a key service on the pole ;)
Maybe OCR could be added automatically on upload and stored somehow to be searchable. Possibly, as structured data so it's editable. Enhancing999 (talk) 10:49, 22 July 2024 (UTC)[reply]
About SVG: ideally the text would be rendered on the file description page separately. Maybe that's something that can be added through Lua directly on Template:Information. Enhancing999 (talk) 17:46, 22 July 2024 (UTC)

Characters Not Entering Properly

I have been having a strange error where certain characters will not enter properly when editing Wikimedia Commons. For example, typing two left brackets ("[[") converts both of them into a "ʽ". The same happens in reverse for two right brackets. However, it only happens when typing them sequentially. In other words, if I type one, then move the caret to the left and type the second one they remain brackets. Similarly, copy-pasting them from somewhere else also doesn't cause any issues. In another case, typing an asterisk ("*") results in what is apparently a diaeresis (It won't reproduce ). This only happens on Wikimedia Commons and not Wikipedia or any computer program. However, it does occur on both my userpage and this post. Any idea what is causing this and how to fix it? –Noha307 (talk) 17:33, 22 July 2024 (UTC)[reply]

@Noha307: Focus the Commons search bar. If you see a little keyboard icon as depicted, click it and select “Disable input tools.” --Geohakkeri (talk) 19:13, 22 July 2024 (UTC)[reply]
Hey, that fixed it! Thank you! Noha307 (talk) 21:53, 22 July 2024 (UTC)[reply]

Tech News: 2024-30

MediaWiki message delivery 00:01, 23 July 2024 (UTC)[reply]

PD template error: author "I, John Doe"

Hi, I found an irritating error in an early PD template (2007-2008) and assume there are more than 23K instances of its faulty use. check

The template creates a set of three lines and adds an "I" to the author's name in two of them. Obviously derived from "I, John Doe, the copyright holder", it gives the author's name as "I, John Doe". I am not exactly sure where this error lives (I can see that it is on the pages now, inside the PD|self template). I find it very irritating and rather disrespectful towards the creators to misspell their names this way by adding a random "I". Does anyone have a good approach to fix these instances and check the template? Thanks. I saw this in the Dutch translation template. Peli (talk) 23:08, 23 July 2024 (UTC)

This is a good candidate for Commons:Bots/Work requests. I went ahead and made a request to fix this issue here. —CalendulaAsteraceae (talkcontribs) 06:53, 24 July 2024 (UTC)
Thanks, great move. But I'd like to add that the 'list' is just a kind of educated guess, created by a certain search key; I was not able to check the text in all or even a significant number of the real pages. The test was only confirmed by looking at a very small number of pages on the first page of the results. Peli (talk) 07:13, 24 July 2024 (UTC)