Author |
Message |
mosc
Site Admin

Joined: Jan 31, 2003 Posts: 18235 Location: Durham, NC
Audio files: 222
G2 patch files: 60
|
Posted: Thu Sep 23, 2004 2:49 pm Post subject:
Regular expression problem |
 |
|
If you don't know what a regular expression is, then you won't get anything out of this topic. But there are some hackers who read the forum.
We're experiencing a problem on the Nord Modular mailing list. There are "topics" that you can subscribe to using the Mailman software. One is "G2" and one is "Not G2". People who subscribe to the NotG2 topic are getting mail to the G2 topic.
Mailman uses regular expressions to select topics. People who send mail about the G2 are asked to put G2 in the subject. Most put it in brackets like this [G2]. The regexp for this is simply /[gG][2]/. Works fine.
The regexp for the NotG2 is /[^gG][^2]/ which is, of course, useless. It matches any message whose subject contains two consecutive characters where the first is not g or G and the second is not 2. That is nearly every subject, including the ones tagged [G2].
Anyone have any ideas on how to construct a regexp which matches a line that doesn't have G2 or g2 in it? At first it seems like a trivial problem, but it's tricky.
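To see why the NotG2 pattern misfires, here is a minimal Python sketch that runs both quoted patterns over a few sample subjects. The sample subjects and the `classify` helper are made up for illustration, and the sketch assumes the patterns are applied as an unanchored search, which may not be exactly how Mailman applies them.

```python
import re

# The two topic patterns quoted above.
G2_PATTERN = re.compile(r"[gG][2]")
NOT_G2_PATTERN = re.compile(r"[^gG][^2]")


def classify(subject):
    """Return (matches_g2, matches_not_g2) for a subject line."""
    return (
        bool(G2_PATTERN.search(subject)),
        bool(NOT_G2_PATTERN.search(subject)),
    )


# Hypothetical sample subjects, not taken from the real list.
for subject in ["[NM][G2] new patch", "[NM] filter question", "G2"]:
    print(repr(subject), classify(subject))
```

The first subject matches both patterns, because `[N` is already a pair of characters that is "not g" followed by "not 2". Negating each character class does not negate the pattern as a whole.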
 _________________ --Howard
my music and other stuff |
|
|
|
 |
DrJustice

Joined: Sep 13, 2004 Posts: 2112 Location: Morokulien
Audio files: 4
|
Posted: Thu Sep 23, 2004 4:09 pm Post subject:
|
 |
|
Don't know if the following could help:
click here
DJ
--
[editor's note: DJ, I edited this post to make the URL link shorter so we wouldn't have wide pages. Use the second form of the URL bbcode, as indicated in the help popup when you put the mouse over the URL button. --mosc] |
|
|
|
 |
oXo

Joined: Sep 20, 2004 Posts: 36 Location: Paris
|
Posted: Fri Sep 24, 2004 2:09 am Post subject:
|
 |
|
Hello,
Maybe you can try this:
For the G2, your expression is not really correct, since it matches only G2 or g2 but misses the brackets! So the right expression would be /\[[gG]2\]/
You must use \ to escape special characters. You can also write it like this:
/\[G2\]|\[g2\]/
The next one is more tricky
/^(?:(?!G2|g2).)*$/
Normally it should work.
Have fun. Last edited by oXo on Fri Sep 24, 2004 2:13 am; edited 1 time in total |
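As a rough check of the two suggestions above, here is a small Python sketch. Python's `re` module supports the `(?: )` and `(?! )` syntax; the helper names and the sample subjects are hypothetical, chosen only for illustration.

```python
import re

# Bracketed tag, with the brackets escaped as suggested above.
G2_TAG = re.compile(r"\[[gG]2\]")

# Negative lookahead: at every position, check that "G2"/"g2" does not
# start there, then consume one character, all the way to the end.
NO_G2 = re.compile(r"^(?:(?!G2|g2).)*$")


def has_g2_tag(subject):
    return G2_TAG.search(subject) is not None


def lacks_g2(subject):
    return NO_G2.search(subject) is not None


# Hypothetical subjects for illustration.
for subject in ["[NM][G2] new patch", "[NM] filter question", "G2: no brackets"]:
    print(repr(subject), has_g2_tag(subject), lacks_g2(subject))
```

Note that the two patterns are not exact opposites: a subject such as "G2: no brackets" has no bracketed tag but still contains G2, so neither pattern selects it.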
|
|
|
 |
mosc
Site Admin

Joined: Jan 31, 2003 Posts: 18235 Location: Durham, NC
Audio files: 222
G2 patch files: 60
|
Posted: Fri Sep 24, 2004 1:59 pm Post subject:
|
 |
|
Thanks, but I'm still not having much luck. Maybe I'm not doing something correctly. I have a test file called junk. In that file is this:
My favorite martian
My [G2] is good
My [g2] is not good
My g45 is 2 good
My x2 is good 2
[NM][G2]this is a test
Re:[NM][G2]this is a test
G2
nothing here
a
I run egrep like this:
egrep '^(?:(?!G2|g2).)*$' junk
I get nothing. Maybe the ?: and the ?! aren't standard regexps but extensions that Perl added, which are now supported in some programs but not universally. _________________ --Howard
my music and other stuff |
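For comparison, the same pattern can be tried against the same test lines in Python. The POSIX extended regexps that egrep implements do not define `(?:` or `(?!`, while Python's `re` module does. This is only a sketch: the `JUNK` string stands in for the test file, and `lines_without_g2` is a made-up helper name.

```python
import re

# The contents of the test file "junk" from the post, one subject per line.
JUNK = """\
My favorite martian
My [G2] is good
My [g2] is not good
My g45 is 2 good
My x2 is good 2
[NM][G2]this is a test
Re:[NM][G2]this is a test
G2
nothing here
a
"""

NO_G2 = re.compile(r"^(?:(?!G2|g2).)*$")


def lines_without_g2(text):
    """Return the lines that the lookahead pattern accepts."""
    return [line for line in text.splitlines() if NO_G2.search(line)]


for line in lines_without_g2(JUNK):
    print(line)
```

With a Perl-style engine the pattern keeps the five lines that contain neither G2 nor g2 and drops the rest.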
|
|
|
 |
play

Joined: Feb 08, 2004 Posts: 489 Location: behind the mustard
Audio files: 2
|
Posted: Fri Sep 24, 2004 8:11 pm Post subject:
|
 |
|
well ya know if you are using php for this you can just go:
Code: |
if (preg_match("/g2/i", $subject) === false) {
echo $subject;
}
|
|
|
|
|
 |
Dovdimus Prime

Joined: Jul 26, 2004 Posts: 664 Location: Bristol, UK
Audio files: 6
|
Posted: Sat Sep 25, 2004 2:20 am Post subject:
|
 |
|
=== to test equality?? What a strange language!
What is == used for? _________________ This message was brought to you from Beyond The Grave. |
|
|
|
 |
play

Joined: Feb 08, 2004 Posts: 489 Location: behind the mustard
Audio files: 2
|
Posted: Sat Sep 25, 2004 8:08 am Post subject:
|
 |
|
actually I messed that up in my haste. it should be
if (preg_match("/g2/i", $subject) == 0){
echo $subject;
}
php has implicit types so when '==' is used it only checks if the values are equal after conversion. 0 == false is true because they evaluate to the same thing. '===' also checks type so '0 === false' is false. Only 'false === false' is true. It's necessary with the strpos() function because it can find a string at position 0 and 0 evaluates to false.
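The same "match positively, then invert in code" idea looks like this in Python. It is a sketch only, and `is_not_g2` is a hypothetical helper name.

```python
import re


def is_not_g2(subject):
    """True when the subject does not mention G2 in either case."""
    # re.search returns a match object or None, never a position,
    # so the test cannot be confused by a hit at index 0.
    return re.search(r"g2", subject, re.IGNORECASE) is None


# Hypothetical subjects for illustration.
for subject in ["[NM] filter question", "G2 at the very start"]:
    print(repr(subject), is_not_g2(subject))
```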
this silly regex thing has kept me busy for several hours last night. Here's my best effort so far, very buggy.
/^(g[^2]|[^g]2|[^g][^2])*/i
it doesn't work with "[NM][G2]" but it does with "G2:" and "[G2]"
This regex also will not match a string with a single character.
you're gonna have to use some kind of scripting language regardless because the regex, even if it worked, would still return the rest of the subject string, minus the G2 part. Last edited by play on Sat Sep 25, 2004 11:05 am; edited 1 time in total |
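Two things seem to hold the pattern above back. The `*` allows zero repetitions and there is no `$` at the end, so it can always match the empty string at the start of a subject. Each alternative also consumes two characters at a time, so a G2 that straddles a pair boundary slips through. The Python sketch below shows the first point and tries a lookahead-free alternative; that alternative is my own assumption rather than something from the thread, and the helper names are hypothetical.

```python
import re

# The pattern from the post, unchanged.
PLAY_PATTERN = re.compile(r"^(g[^2]|[^g]2|[^g][^2])*", re.IGNORECASE)

# Lookahead-free alternative (an assumption, not from the thread):
# any non-g character, or a run of g's followed by something that is
# neither g nor 2, repeated; the subject may end in a run of g's.
NO_G2 = re.compile(r"^([^gG]|[gG]+[^gG2])*[gG]*$")


def play_matches(subject):
    return PLAY_PATTERN.search(subject) is not None


def lacks_g2(subject):
    return NO_G2.search(subject) is not None


for subject in ["[NM][G2]this is a test", "My g45 is 2 good", "G2", "a"]:
    print(repr(subject), play_matches(subject), lacks_g2(subject))
```

The alternative uses only character classes, grouping, `+` and `*`, so it should also be expressible in tools without Perl-style lookahead, though that has not been checked against Mailman itself.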
|
|
|
 |
mosc
Site Admin

Joined: Jan 31, 2003 Posts: 18235 Location: Durham, NC
Audio files: 222
G2 patch files: 60
|
Posted: Sat Sep 25, 2004 9:02 am Post subject:
|
 |
|
Thanks for trying. I can't use a scripting language because the mailing list is already written in Python. I don't want to hack that code. I'm thinking of asking people to put [G1] in the subject if their posts are about non-G2 topics. Or maybe more simply, if it is G2 then the subject would start [NM][G2]. If not, it would just be [NM] with no right bracket following it. Then I could do a positive match (what regexps were designed for) for '^*\[NM\] *[^\[]*$'
Or something like that. _________________ --Howard
my music and other stuff |
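A quick way to try the positive-match idea is the Python sketch below. It reads the leading `^*` in the pattern as `^.*`, which is an assumption on my part, and `is_not_g2_subject` and the sample subjects are hypothetical.

```python
import re

# Cleaned-up reading of the pattern in the post: "^*" is taken to mean
# "^.*" (an assumption), so a "Re:" prefix is still allowed.
NOT_G2_SUBJECT = re.compile(r"^.*\[NM\] *[^\[]*$")


def is_not_g2_subject(subject):
    """True for "[NM] ..." subjects with no further "[" after the tag."""
    return NOT_G2_SUBJECT.search(subject) is not None


# Hypothetical subjects for illustration.
for subject in [
    "[NM] filter question",
    "Re: [NM] filter question",
    "[NM][G2]this is a test",
    "Re:[NM][G2]this is a test",
]:
    print(repr(subject), is_not_g2_subject(subject))
```

This only works if posters follow the tagging convention: a subject with no [NM] tag at all matches neither topic.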
|
|
|
 |
diskonext

Joined: Aug 26, 2004 Posts: 306 Location: London, UK
|
Posted: Sun Oct 10, 2004 4:36 am Post subject:
|
 |
|
Hi Mosc,
I don't know if this could help, but there is a setting to explicitly block messages with a certain header-content:
[from http://staff.imsa.edu/~ckolar/mailman/mailman-administration-v2.html]
Hold posts with header value matching a specified regexp.
Allows you to filter out known addresses or domains that function primarily as spam providers.
So maybe you could post-filter with this (negative) filter, instead of using a positive one? The example is rather specific, but the 'hold posts with header value' seems broad enough to allow tweaking.
I never used Mailman, so this might be totally off...
-diskonext _________________ :wq |
|
|
|
 |
cappy2112

Joined: Dec 24, 2004 Posts: 2490 Location: San Jose, California
Audio files: 2
G2 patch files: 1
|
Posted: Fri Nov 18, 2005 7:09 pm Post subject:
|
 |
|
mosc wrote: | Thanks for trying. I can't use a scripting language because the mailing list is already written in Python. I don't want to hack that code. I'm thinking of asking people to put [G1] in the subject if their posts are about non-G2 topics. Or maybe more simply, if it is G2 then the subject would start [NM][G2]. If not, it would just be [NM] with no right bracket following it. Then I could do a positive match (what regexps were designed for) for '^*\[NM\] *[^\[]*$'
Or something like that. |
Mosc, this is an old question, and you've probably got this solved by now.
You can post this on the Python Tutor mail list.
I've gotten help with regex questions before.
There are also regex newsgroups, but you gotta watch out: many of them are Perl-centric, and Python's regex engine is slightly different.
I used a Regex debugger called Kodos, but there are many others on the web. |
|
|
|
 |
mosc
Site Admin

Joined: Jan 31, 2003 Posts: 18235 Location: Durham, NC
Audio files: 222
G2 patch files: 60
|
Posted: Tue Nov 22, 2005 10:19 am Post subject:
|
 |
|
Thanks for the tip, Tony. If this gets back on my radar screen, I'll check out the Python list for sure. _________________ --Howard
my music and other stuff |
|
|
|
 |
|