Computes a number that is intended to reflect the “distance” between two strings.
Parameters
$string1
string required
$string2
string required
Source
public function compute_string_distance( $string1, $string2 ) {
    // Use an md5 hash of the strings for a count cache, as it's fast to generate, and collisions aren't a concern.
    $count_key1 = md5( $string1 );
    $count_key2 = md5( $string2 );

    // Cache vectors containing character frequency for all chars in each string.
    if ( ! isset( $this->count_cache[ $count_key1 ] ) ) {
        $this->count_cache[ $count_key1 ] = count_chars( $string1 );
    }
    if ( ! isset( $this->count_cache[ $count_key2 ] ) ) {
        $this->count_cache[ $count_key2 ] = count_chars( $string2 );
    }

    $chars1 = $this->count_cache[ $count_key1 ];
    $chars2 = $this->count_cache[ $count_key2 ];

    $difference_key = md5( implode( ',', $chars1 ) . ':' . implode( ',', $chars2 ) );
    if ( ! isset( $this->difference_cache[ $difference_key ] ) ) {
        // L1-norm of difference vector.
        $this->difference_cache[ $difference_key ] = array_sum( array_map( array( $this, 'difference' ), $chars1, $chars2 ) );
    }

    $difference = $this->difference_cache[ $difference_key ];

    // $string1 has zero length? Odd. Give huge penalty by not dividing.
    if ( ! $string1 ) {
        return $difference;
    }

    // Return distance per character (of string1).
    return $difference / strlen( $string1 );
}
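To make the calculation easier to follow, here is a minimal standalone sketch of the same idea with both caches left out. The function name `sketch_string_distance` is hypothetical and not part of WordPress. The sketch also assumes that the `difference` callback used in the source returns the absolute difference of its two arguments, which the "L1-norm" comment suggests but the listing above does not show.

```php
<?php
/**
 * Standalone sketch of the per-character frequency distance.
 * Hypothetical helper for illustration only; not a WordPress function.
 */
function sketch_string_distance( string $string1, string $string2 ): float {
    // count_chars() in its default mode returns an array indexed by
    // byte value (0..255) holding how often each byte occurs.
    $chars1 = count_chars( $string1 );
    $chars2 = count_chars( $string2 );

    // L1 norm of the difference vector.
    // Assumption: the per-element difference is abs( $a - $b ).
    $difference = array_sum(
        array_map(
            static function ( $a, $b ) {
                return abs( $a - $b );
            },
            $chars1,
            $chars2
        )
    );

    // Mirror the falsy check from the source: it skips the division
    // for '' and also for '0', which PHP treats as falsy.
    if ( ! $string1 ) {
        return (float) $difference;
    }

    // Distance per byte of the first string.
    return $difference / strlen( $string1 );
}

echo sketch_string_distance( 'abc', 'abd' ), "\n";  // about 0.667
echo sketch_string_distance( 'ab', 'abcd' ), "\n";  // 1
echo sketch_string_distance( 'abcd', 'ab' ), "\n";  // 0.5
```

The last two calls show that the result depends on argument order, because the sum is divided by the length of the first string only. Since `count_chars()` counts bytes, a multibyte character contributes one count per byte rather than one count per character. Character order is ignored as well, so two anagrams have a distance of 0.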
Changelog
Version | Description |
---|---|
2.6.0 | Introduced. |