English Wikipedia @ Freddythechick:Administrators' noticeboard/CXT/Pages to review/Regex test 1
Testing regexes per WP:AN#X2-nuke interim period.
May 7 ver #779254187
As of version 779254187 of Wikipedia:Administrators' noticeboard/CXT/Pages to review (21:56, 7 May 2017) I count:
- 738 <s> tags, 732 </s> tags, and 587 non-User space articles per this pattern:
Keepers
^\|\s*<s>\s*\[\[((?!User).*?)\s*\]\]</s>
What I learned
- This pattern does not exclude Draft-space articles and (of course) cannot recognize articles that don't exist (red-links) but maybe there's a magic word the script can use for that?
- Are we really keeping articles where the title is entirely in non-Latin script (e.g., 三国)?
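For anyone wanting to reproduce the Keepers count, here is a minimal Python sketch. The pattern is the one quoted above; the sample rows, the `re.MULTILINE` flag, and the helper names `count_tags` and `keepers` are assumptions made for illustration and are not taken from the real page or from any script actually used.

```python
import re

# Keepers pattern as quoted above. MULTILINE makes ^ anchor at the start of
# every table row rather than only at the start of the whole text.
KEEPERS = re.compile(r"^\|\s*<s>\s*\[\[((?!User).*?)\s*\]\]</s>", re.MULTILINE)

# Hypothetical sample rows (not copied from the real page).
SAMPLE = """\
| <s>[[Example article]]</s>
| <s>[[User:Example/Sandbox]]</s>
| <s>[[Draft:Example draft]]</s>
| [[Unstruck article]]
| <s>[[三国]]</s>
"""


def count_tags(text):
    """Return the number of opening <s> and closing </s> tags in text."""
    return text.count("<s>"), text.count("</s>")


def keepers(text):
    """Return the struck-out titles that do not start with 'User'."""
    return KEEPERS.findall(text)


if __name__ == "__main__":
    print(count_tags(SAMPLE))
    print(keepers(SAMPLE))
```

On these sample rows the Draft-space title is still returned, which is the first limitation noted in the list above, and the non-Latin title is returned like any other.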
Nukers
^\|\s*\[\[((?!User)(?!Draft).*?)\s*\]\]
What I learned
- Doesn't yet exclude Template or Template talk spaces; doesn't exclude subpages (titles with slash in them).
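- The pattern above needs a colon after each namespace name, i.e. (?!User:)(?!Draft:), to avoid false positives like Draft board or User story, which it currently skips as if they were in User or Draft space.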
- 587 + 2785 = 3372, which is 230 short of 3602. There are a handful of malformed <s> tags, but not enough to account for the discrepancy.
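The colon point can be checked with a small Python sketch. `NUKERS_AS_USED` is the pattern quoted above; `NUKERS_WITH_COLONS` is an assumed variant written only to illustrate the note, and the sample rows are hypothetical. The last lines redo the arithmetic from the final bullet.

```python
import re

# Pattern as quoted above: the lookaheads reject any title that merely
# begins with the letters "User" or "Draft".
NUKERS_AS_USED = re.compile(
    r"^\|\s*\[\[((?!User)(?!Draft).*?)\s*\]\]", re.MULTILINE
)

# Assumed variant with colons, so only real namespace prefixes are rejected.
NUKERS_WITH_COLONS = re.compile(
    r"^\|\s*\[\[((?!User:)(?!Draft:).*?)\s*\]\]", re.MULTILINE
)

# Hypothetical sample rows (not copied from the real page).
SAMPLE = """\
| [[Draft board]]
| [[User story]]
| [[Draft:Example draft]]
| [[User:Example/Sandbox]]
| [[Ordinary article]]
| <s>[[Struck article]]</s>
"""

as_used = NUKERS_AS_USED.findall(SAMPLE)
with_colons = NUKERS_WITH_COLONS.findall(SAMPLE)

# Counts quoted in the notes above: keepers + nukers versus 3602.
gap = 3602 - (587 + 2785)

if __name__ == "__main__":
    print(as_used)
    print(with_colons)
    print(gap)
```

One caveat with the colon variant: a title in `User talk:` does not begin with `User:`, so it would pass the lookahead unless a further exclusion is added.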
May 14 ver #780319854
Malformed strikeout test
As of version 780319854 of WP:CXT/PTR (09:02, 14 May 2017) I count 43 malformed strikeouts, where the closing </s> tag does not immediately follow the double close-brackets (]]) of the linked article title.
These are items matching the pattern:
^\|\s*<s>\[\[([^]]+)\]\](?!</s>)
(The pattern ^\|\s*<s>\[\[([^]]+)\]\](?!\s*</s>), which keeps the optional whitespace inside the lookahead, would be more robust but wasn't used for this try.)
@Mathglot: These should all be fixed now. Tazerdadog (talk) 10:35, 14 May 2017 (UTC)
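For reference, the malformed-strikeout check can be sketched in Python as follows. `MALFORMED_AS_USED` is the pattern used for the count above; `MALFORMED_TOLERANT` is an assumed whitespace-tolerant variant, and the sample rows are hypothetical. The whitespace has to sit inside the negative lookahead: with `\s*` placed in front of the lookahead, the engine can backtrack to zero whitespace and the result is the same as without it.

```python
import re

# Pattern used for the count: flags a row when </s> does not come
# immediately after the closing ]] of the link.
MALFORMED_AS_USED = re.compile(
    r"^\|\s*<s>\[\[([^]]+)\]\](?!</s>)", re.MULTILINE
)

# Assumed variant: whitespace between ]] and </s> is tolerated because the
# \s* is part of the lookahead itself.
MALFORMED_TOLERANT = re.compile(
    r"^\|\s*<s>\[\[([^]]+)\]\](?!\s*</s>)", re.MULTILINE
)

# Hypothetical sample rows (not copied from the real page).
SAMPLE = """\
| <s>[[Well formed]]</s>
| <s>[[Space before close]] </s>
| <s>[[Note inside strike]] (redirect)</s>
| <s>[[Never closed]]
"""

flagged_as_used = MALFORMED_AS_USED.findall(SAMPLE)
flagged_tolerant = MALFORMED_TOLERANT.findall(SAMPLE)

if __name__ == "__main__":
    print(flagged_as_used)
    print(flagged_tolerant)
```

The two patterns differ only on rows where nothing but whitespace separates `]]` from `</s>`; whether those rows should count as malformed is a judgment call for whoever runs the cleanup.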