PHP - Search through an array of strings for a fuzzy string match


I have the following two arrays:

$arr1 = array("stringtype1andsomerandomstuff",
              "stringtype2andsomerandomstuff",
              "stringtype3andsomerandomstuff",
              "stringtype1andsomerandomstuff",
              "stringtype2andsomerandomstuff",
              "i don't belong here at all!",
              "stringtype4andsomerandomstuff");

In the first array ($arr1), most of the strings have some sort of common attribute. In the example above, this is stringtypeX. This is the 'common factor' I need to search by. Each string also has some sort of additional data, exemplified by andsomerandomstuff.

The second array looks like this:

$arr2 = array("stringtype1" => "category1",
              "stringtype2" => "category2",
              "stringtype3" => "category3",
              "stringtype4" => "category4");

I need to go through each string in $arr1 and see if it closely matches any of the keys in $arr2. If it matches one of the keys, I need the value of that key from $arr2.

How can I iterate through each of the strings in $arr1 and determine which (if any) of the keys in $arr2 apply? Basically, I need to go through every string in $arr1, perform a partial match on all of the keys in $arr2, and find the closest match. The immediate solution that comes to mind is to use two loops (the outer over $arr1 and the inner over each key in $arr2), but is there a function in PHP that can take a string and see if it matches any string in an existing array? Does anyone know of a more performant way to do this?

Map $arr1 to a function that calculates the string edit distance to the keys in $arr2 and returns the closest match. Take a look at the levenshtein distance function. Or, you could simply do a startsWith comparison in your mapping function.

You'll probably end up with something that looks like this:

$stringeditdistancethreshold = 5; // anything greater than this is rejected

// define the mapping function
function findclosestmatchingstring($s) {
    // $arr2 and the threshold are defined in the global scope
    global $arr2, $stringeditdistancethreshold;

    $closestdistancethusfar = $stringeditdistancethreshold + 1;
    $closestmatchvalue      = null;

    foreach ($arr2 as $key => $value) {
        $editdistance = levenshtein($key, $s);

        // exact match
        if ($editdistance == 0) {
            return $value;

        // best match so far, update the values to compare against/return
        } elseif ($editdistance < $closestdistancethusfar) {
            $closestdistancethusfar = $editdistance;
            $closestmatchvalue      = $value;
        }
    }

    return $closestmatchvalue; // possible to return null if the threshold hasn't been met
}

// do the mapping
$matchingvalues = array_map('findclosestmatchingstring', $arr1);

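You'll have to tune $stringeditdistancethreshold until you get values that you're happy with. With the sample data, for instance, the distance between "stringtype1andsomerandomstuff" and "stringtype1" is 18 (the length of the trailing "andsomerandomstuff"), so a threshold of 5 would reject every string. Or you could use a startsWith function, which would greatly simplify what findclosestmatchingstring has to do.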
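The startsWith variant mentioned above could look roughly like the following. This is only a sketch: PHP has no built-in called startsWith (PHP 8 added str_starts_with), so the check here is written with strncmp, and the function name findcategorybyprefix and its explicit $categories parameter are made up for illustration.

```php
<?php
// Sample data, shortened from the question.
$arr1 = array(
    "stringtype1andsomerandomstuff",
    "stringtype2andsomerandomstuff",
    "i don't belong here at all!",
    "stringtype4andsomerandomstuff",
);
$arr2 = array(
    "stringtype1" => "category1",
    "stringtype2" => "category2",
    "stringtype3" => "category3",
    "stringtype4" => "category4",
);

// Hypothetical helper: returns the value of the first key that $s starts
// with, or null if no key is a prefix of $s.
function findcategorybyprefix($s, array $categories) {
    foreach ($categories as $prefix => $category) {
        // numeric-looking array keys are stored as ints, so cast back to string
        $prefix = (string) $prefix;

        // compare only the first strlen($prefix) bytes of $s
        if (strncmp($s, $prefix, strlen($prefix)) === 0) {
            return $category;
        }
    }

    return null;
}

// pass the lookup table in explicitly instead of relying on globals
$matchingvalues = array_map(
    function ($s) use ($arr2) {
        return findcategorybyprefix($s, $arr2);
    },
    $arr1
);
```

If one key is itself a prefix of another key, the first one listed in $arr2 wins, so in that case the keys would need to be ordered longest first.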

Finally, this isn't very efficient. It's effectively an ugly nested loop. You may be able to do some pruning or something else clever, but I suspect that if the arrays are fairly small, you may not care.
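One cheap form of pruning relies on the fact that the Levenshtein distance between two strings is never smaller than the difference in their lengths, so a key that cannot beat the best distance found so far can be skipped without calling levenshtein at all. A sketch of that idea follows; the function name findclosestwithpruning and its parameters are hypothetical, and the threshold of 20 in the usage line is picked only to suit the sample strings.

```php
<?php
// Hypothetical variant of the mapping function with a length-based prune.
function findclosestwithpruning($s, array $categories, $threshold) {
    $best      = $threshold + 1; // a distance must be below this to be accepted
    $bestvalue = null;
    $slen      = strlen($s);

    foreach ($categories as $key => $value) {
        $key = (string) $key;

        // lower bound: the edit distance is at least the length difference,
        // so this key cannot improve on the current best
        if (abs($slen - strlen($key)) >= $best) {
            continue;
        }

        $distance = levenshtein($key, $s);

        if ($distance < $best) {
            $best      = $distance;
            $bestvalue = $value;

            if ($distance === 0) {
                break; // exact match, nothing can be closer
            }
        }
    }

    return $bestvalue; // null if nothing came in under the threshold
}

$arr2 = array(
    "stringtype1" => "category1",
    "stringtype2" => "category2",
    "stringtype3" => "category3",
    "stringtype4" => "category4",
);

$result = findclosestwithpruning("stringtype3andsomerandomstuff", $arr2, 20);
```

How much this saves depends on how much the string lengths vary. With keys that all have the same length, as in the sample data, only keys visited after a best-possible match are skipped.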

Edit: As stated by @ohgodwhy in the comment below, preg_grep will likely work even better for you. In that case, your map function will look something like the following:

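function findfirstmatchingstring($s) {
    global $arr2; // $arr2 is defined in the global scope

    foreach ($arr2 as $key => $value) {
        // build the pattern from the key, since the keys are the shorter strings;
        // preg_grep returns the elements of array($s) that match it
        $matches = preg_grep('/^' . preg_quote($key, '/') . '/', array($s));

        if (!empty($matches)) {
            // return the value of the first matching key
            return $value;
        }
    }

    return null;
}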
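Since preg_grep filters a whole array against one pattern, another way to use it is to run it once per key of $arr2 over all of $arr1. The following is a sketch under that assumption: the $categorized variable is introduced here for illustration, and preg_quote is used so that keys containing regex metacharacters are matched literally.

```php
<?php
$arr1 = array(
    "stringtype1andsomerandomstuff",
    "stringtype2andsomerandomstuff",
    "stringtype3andsomerandomstuff",
    "stringtype1andsomerandomstuff",
    "stringtype2andsomerandomstuff",
    "i don't belong here at all!",
    "stringtype4andsomerandomstuff",
);
$arr2 = array(
    "stringtype1" => "category1",
    "stringtype2" => "category2",
    "stringtype3" => "category3",
    "stringtype4" => "category4",
);

// index in $arr1 => category; strings that match no key get no entry
$categorized = array();

foreach ($arr2 as $key => $category) {
    // anchor at the start so the key has to be a prefix of the string
    $pattern = '/^' . preg_quote((string) $key, '/') . '/';

    // preg_grep keeps the original indexes of $arr1 in its result
    foreach (preg_grep($pattern, $arr1) as $index => $string) {
        $categorized[$index] = $category;
    }
}

// restore the order of $arr1
ksort($categorized);
```

This is still one regex pass over $arr1 per key, so it is not asymptotically cheaper than the nested loop, but the inner loop runs inside preg_grep rather than in PHP code.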
