PHP - Regex for names with special characters (Unicode)
Okay, I have read about regex all day now, and I still don't understand it properly. I'm trying to validate a name, but the functions I can find on the internet only use [a-zA-Z], leaving out characters that I need to accept too.
I need a regex that checks that a name is at least 2 words and does not contain numbers or special characters like !"#¤%&/()=..., while the words can contain characters like æ, é, Â and so on...
An example of an accepted name would be: "john elkjærd" or "andré svenson"
A non-accepted name would be: "hans", "h4nn3 andersen" or "martin henriksen!"
If it matters, I use the JavaScript .match() function client side, and I want to use PHP's preg_replace() "in negative" server side (removing non-matching characters).
Any help would be appreciated.
Update:
Okay, thanks to Alix Axel's answer I have the important part down, the server side one.
But as the page from lightwing's answer suggests, I'm unable to find any Unicode support for JavaScript, so I ended up with half a solution for the client side, just checking for at least 2 words and a minimum of 5 characters, like this:
// Count runs of non-whitespace (words); "|| []" guards against match() returning null.
if ((name.match(/\S+/g) || []).length >= minwords && name.length >= 5) {
    // valid
}
An alternative is to specify the Unicode characters explicitly, as suggested in shifty's answer. I might end up doing something like that, along with the solution above, although it is a bit impractical.
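For illustration, the explicit-range approach could look roughly like the sketch below. It assumes names written in Latin script only. The ranges (U+00C0 to U+024F, minus × and ÷) and the helper name isLatinName are assumptions made for this example, not something taken from shifty's answer.

```javascript
// Sketch only: explicit Latin ranges for engines without \p{...} support.
// U+00C0-U+024F covers accented Latin letters; U+00D7 (×) and U+00F7 (÷)
// sit inside that span but are not letters, so the ranges skip them.
const LATIN_LETTERS = "A-Za-z\\u00C0-\\u00D6\\u00D8-\\u00F6\\u00F8-\\u024F";

// A word: letters, apostrophes (' or U+2019) and hyphens.
const LATIN_WORD = "[" + LATIN_LETTERS + "'\\u2019-]+";

// At least two words, separated by single whitespace characters.
const LATIN_NAME_RE = new RegExp("^" + LATIN_WORD + "(?:\\s" + LATIN_WORD + ")+$");

// Hypothetical helper name, used only in this example.
function isLatinName(name) {
  return LATIN_NAME_RE.test(name);
}

console.log(isLatinName("john elkjærd"));   // true
console.log(isLatinName("h4nn3 andersen")); // false
```

Anything outside those ranges (Greek, Cyrillic, letters followed by separate combining accents) is rejected, which is what makes this approach impractical for general use.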
Try the following regular expression:
^(?:[\p{L}\p{Mn}\p{Pd}\'\x{2019}]+\s[\p{L}\p{Mn}\p{Pd}\'\x{2019}]+\s?)+$
In PHP this translates to:
if (preg_match('~^(?:[\p{L}\p{Mn}\p{Pd}\'\x{2019}]+\s[\p{L}\p{Mn}\p{Pd}\'\x{2019}]+\s?)+$~u', $name) > 0) {
    // valid
}
You should read it like this:
^                # start of subject
(?:              # match this:
    [            # match a:
        \p{L}    # Unicode letter, or
        \p{Mn}   # Unicode accents, or
        \p{Pd}   # Unicode hyphens, or
        \'       # single quote, or
        \x{2019} # single quote (alternative)
    ]+           # 1 or more times
    \s           # any kind of space
    [            # match a:
        \p{L}    # Unicode letter, or
        \p{Mn}   # Unicode accents, or
        \p{Pd}   # Unicode hyphens, or
        \'       # single quote, or
        \x{2019} # single quote (alternative)
    ]+           # 1 or more times
    \s?          # any kind of space (0 or 1 times)
)+               # 1 or more times
$                # end of subject
I don't know how to port this to JavaScript, and I'm not even sure JavaScript supports Unicode properties, but in PHP PCRE this seems to work flawlessly at ideone.com:
$names = array(
    'alix',
    'andré svenson',
    'h4nn3 andersen',
    'hans',
    'john elkjærd',
    'kristoffer la cour',
    'marco d\'almeida',
    'martin henriksen!',
);

// Same pattern as above; the u modifier makes PCRE treat pattern and subject as UTF-8.
$pattern = '~^(?:[\p{L}\p{Mn}\p{Pd}\'\x{2019}]+\s[\p{L}\p{Mn}\p{Pd}\'\x{2019}]+\s?)+$~u';

foreach ($names as $name) {
    echo sprintf('%s %s' . "\n", $name, (preg_match($pattern, $name) > 0) ? 'valid' : 'invalid');
}
I'm sorry I can't help you regarding the JavaScript part, but probably someone here will.
Validates:
- john elkjærd
- andré svenson
- marco d'almeida
- kristoffer la cour
Invalidates:
- hans
- h4nn3 andersen
- martin henriksen!
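On the JavaScript side, engines that implement ES2018 accept Unicode property escapes such as \p{L} when the regex carries the u flag, so a fairly direct port looks possible. The sketch below is an assumption about how such a port could look, not code from the answer itself. isValidName is a hypothetical helper name, and the \' escape is written as a plain ' because identity escapes like \' are rejected under the u flag.

```javascript
// Sketch only: direct port of the PCRE pattern using ES2018 property escapes.
// Requires the "u" flag; without it \p{L} is not a Unicode property escape.
const NAME_RE =
  /^(?:[\p{L}\p{Mn}\p{Pd}'\u2019]+\s[\p{L}\p{Mn}\p{Pd}'\u2019]+\s?)+$/u;

// Hypothetical helper name, used only in this example.
function isValidName(name) {
  return NAME_RE.test(name);
}

// Same sample names as in the PHP test.
const sampleNames = [
  "alix",
  "andré svenson",
  "h4nn3 andersen",
  "hans",
  "john elkjærd",
  "kristoffer la cour",
  "marco d'almeida",
  "martin henriksen!",
];

for (const name of sampleNames) {
  console.log(name, isValidName(name) ? "valid" : "invalid");
}
```

Older browsers without property-escape support throw a SyntaxError on this pattern, so the server-side check remains the one to rely on.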
To replace invalid characters, though I'm not sure why you need this, you just need to change it slightly. The character class is negated and every match is replaced with an empty string:
$name = preg_replace('~[^\p{L}\p{Mn}\p{Pd}\'\x{2019}\s]~u', '', $name);
Examples:
- h4nn3 andersen -> hnn andersen
- martin henriksen! -> martin henriksen
Note that you always need to use the u modifier.
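If the same clean-up is wanted on the client, a rough JavaScript equivalent could look like the sketch below. It assumes an ES2018 engine with Unicode property escapes, and stripInvalidNameChars is a hypothetical helper name chosen for this example.

```javascript
// Sketch only: remove every character that is not a letter, combining mark,
// dash, apostrophe (' or U+2019) or whitespace.
// "g" replaces all matches; "u" enables the \p{...} property escapes.
function stripInvalidNameChars(name) {
  return name.replace(/[^\p{L}\p{Mn}\p{Pd}'\u2019\s]/gu, "");
}

console.log(stripInvalidNameChars("h4nn3 andersen"));    // hnn andersen
console.log(stripInvalidNameChars("martin henriksen!")); // martin henriksen
```

This only mirrors the PHP call; the server should still sanitize the input itself, since client-side checks can be bypassed.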