PHP - Regex for names with special characters (Unicode)


Okay, I have read about regex all day now, and I still don't understand it properly. I'm trying to validate a name, but the functions I can find on the internet only use [a-zA-Z], leaving out characters that I need to accept too.

I need a regex that checks that a name is at least 2 words and does not contain numbers or special characters like !"#¤%&/()=..., while the words can still contain characters like æ, é, Â and so on...

An example of an accepted name would be: "John Elkjærd" or "André Svenson"
A non-accepted name would be: "Hans", "H4nn3 Andersen" or "Martin Henriksen!"

If it matters, I use the JavaScript .match() function client side, and I want to use PHP's preg_replace() "in negative" server side (removing the non-matching characters).

Any help would be appreciated.

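Update:
Okay, thanks to Alix Axel's answer I have the important part down, the server side one.

But as the page from lightwing's answer suggests, I'm unable to find any Unicode support for JavaScript, so I ended up with half a solution for the client side, just checking for at least 2 words and a minimum of 5 characters like this:

// match() returns null when the name contains no whitespace, hence the || []
if ((name.match(/\s+/g) || []).length >= minwords && name.length >= 5) {
    // valid
}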

An alternative is to specify all the Unicode characters explicitly, as suggested in shifty's answer. I might end up doing something like that along with the solution above, although it is a bit impractical.
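As a rough illustration of the "list the characters explicitly" idea, here is a minimal sketch for JavaScript engines without Unicode property support. The chosen ranges (basic Latin plus U+00C0 to U+024F, skipping × and ÷) and the names `LETTERS`, `EXPLICIT_NAME_PATTERN` and `looksLikeFullName` are assumptions made for this example, not something taken from shifty's answer.

```javascript
// Hypothetical letter set: A-Z, a-z, and the accented Latin letters from
// Latin-1 Supplement and Latin Extended-A/B (U+00C0 to U+024F), leaving out
// the multiplication sign (U+00D7) and the division sign (U+00F7).
const LETTERS = "A-Za-z\\u00C0-\\u00D6\\u00D8-\\u00F6\\u00F8-\\u024F";

// A word is one or more letters, apostrophes or hyphens;
// a name is at least two such words separated by whitespace.
const EXPLICIT_NAME_PATTERN = new RegExp(
    "^[" + LETTERS + "'-]+(?:\\s+[" + LETTERS + "'-]+)+$"
);

function looksLikeFullName(name) {
    return EXPLICIT_NAME_PATTERN.test(name);
}

console.log(looksLikeFullName("John Elkjærd"));      // true
console.log(looksLikeFullName("H4nn3 Andersen"));    // false
```

This only covers Latin-based alphabets and precomposed accented letters, which is the impractical part: every additional script needs its own ranges.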

Try the following regular expression:

^(?:[\p{L}\p{Mn}\p{Pd}\'\x{2019}]+\s[\p{L}\p{Mn}\p{Pd}\'\x{2019}]+\s?)+$

In PHP this translates to:

if (preg_match('~^(?:[\p{L}\p{Mn}\p{Pd}\'\x{2019}]+\s[\p{L}\p{Mn}\p{Pd}\'\x{2019}]+\s?)+$~u', $name) > 0) {
    // valid
}

You should read it like this:

^                   # start of subject
    (?:             # match this:
        [           # match a:
            \p{L}       # Unicode letter, or
            \p{Mn}      # Unicode accent (non-spacing mark), or
            \p{Pd}      # Unicode hyphen (dash punctuation), or
            \'          # single quote, or
            \x{2019}    # single quote (alternative)
        ]+          # 1 or more times
        \s          # any kind of space
        [           # match a:
            \p{L}       # Unicode letter, or
            \p{Mn}      # Unicode accent (non-spacing mark), or
            \p{Pd}      # Unicode hyphen (dash punctuation), or
            \'          # single quote, or
            \x{2019}    # single quote (alternative)
        ]+          # 1 or more times
        \s?         # any kind of space (0 or 1 times)
    )+              # 1 or more times
$                   # end of subject

I don't know how to port this to JavaScript, and I'm not even sure JavaScript supports Unicode properties, but in PHP's PCRE this seems to work flawlessly at ideone.com:

$names = array (
    'Alix',
    'André Svenson',
    'H4nn3 Andersen',
    'Hans',
    'John Elkjærd',
    'Kristoffer la Cour',
    'Marco d\'Almeida',
    'Martin Henriksen!',
);

foreach ($names as $name) {
    echo sprintf('%s %s' . "\n", $name, (preg_match('~^(?:[\p{L}\p{Mn}\p{Pd}\'\x{2019}]+\s[\p{L}\p{Mn}\p{Pd}\'\x{2019}]+\s?)+$~u', $name) > 0) ? 'valid' : 'invalid');
}

I'm sorry I can't help you with the JavaScript part, but probably someone here will.
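For what it's worth, JavaScript engines that implement ES2018 Unicode property escapes accept `\p{L}`, `\p{Mn}` and `\p{Pd}` when the regex carries the `u` flag, so a direct port is possible there. The following is a minimal sketch under that assumption; `NAME_PATTERN` and `isValidName` are names invented for this example, and older browsers will reject the pattern.

```javascript
// Port of the PCRE pattern above. The "u" flag enables \p{...} escapes;
// \x{2019} becomes \u2019 and the single quote needs no escaping.
const NAME_PATTERN =
    /^(?:[\p{L}\p{Mn}\p{Pd}'\u2019]+\s[\p{L}\p{Mn}\p{Pd}'\u2019]+\s?)+$/u;

function isValidName(name) {
    return NAME_PATTERN.test(name);
}

const sampleNames = [
    "Alix",
    "André Svenson",
    "H4nn3 Andersen",
    "Hans",
    "John Elkjærd",
    "Kristoffer la Cour",
    "Marco d'Almeida",
    "Martin Henriksen!",
];

for (const name of sampleNames) {
    console.log(name, isValidName(name) ? "valid" : "invalid");
}
```

The pattern is kept identical to the server-side one so that both sides agree on which names pass; the server-side check should still be treated as the authoritative one.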


Validates:

  • John Elkjærd
  • André Svenson
  • Marco d'Almeida
  • Kristoffer la Cour

Invalidates:

  • Hans
  • H4nn3 Andersen
  • Martin Henriksen!

To replace invalid characters, though I'm not sure why you need this, you just need to change it slightly:

$name = preg_replace('~[^\p{L}\p{Mn}\p{Pd}\'\x{2019}\s]~u', '', $name);

Examples:

  • H4nn3 Andersen -> Hnn Andersen
  • Martin Henriksen! -> Martin Henriksen

Note that you always need to use the u modifier.
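The same kind of clean-up can be sketched in JavaScript, assuming an engine with ES2018 Unicode property escapes. The `u` flag plays a similar role there: without it, `\p{L}` is not treated as a Unicode property. `INVALID_CHARS` and `stripInvalidChars` are hypothetical names used only for this example.

```javascript
// Negated class: anything that is not a letter, combining mark, dash,
// apostrophe or whitespace. "g" replaces every occurrence, "u" enables \p{...}.
const INVALID_CHARS = /[^\p{L}\p{Mn}\p{Pd}'\u2019\s]/gu;

function stripInvalidChars(name) {
    return name.replace(INVALID_CHARS, "");
}

console.log(stripInvalidChars("H4nn3 Andersen"));    // Hnn Andersen
console.log(stripInvalidChars("Martin Henriksen!")); // Martin Henriksen
```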

