If the text starting at a given offset is a lookup key in the map, returns the corresponding transformation from the map; otherwise returns null.
Description
This function returns the translated string. It also accepts an optional by-reference parameter, $matched_token_byte_length, which reports how many bytes long the lookup key was if one was found. Calling code can use this value to advance a cursor past the matched key.
Example:
null === $smilies->read_token( 'Not sure :?.', 0, $token_byte_length );
'😕' === $smilies->read_token( 'Not sure :?.', 9, $token_byte_length );
2 === $token_byte_length;
Example:
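$output    = '';
$at        = 0; // Start of the text not yet copied to the output.
$search_at = 0; // Where to look for the next candidate colon.
while ( $search_at < strlen( $input ) ) {
    $next_at = strpos( $input, ':', $search_at );
    if ( false === $next_at ) {
        break;
    }
    $smily = $smilies->read_token( $input, $next_at, $token_byte_length );
    if ( null === $smily ) {
        // No smiley starts at this colon; keep searching after it.
        $search_at = $next_at + 1;
        continue;
    }
    $prefix    = substr( $input, $at, $next_at - $at );
    $at        = $next_at + $token_byte_length;
    $search_at = $at;
    $output   .= "{$prefix}{$smily}";
}
// Append whatever follows the last replaced smiley.
$output .= substr( $input, $at );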
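Both the offset and the reported token length are byte counts, so they line up with strpos(), strlen(), and substr() rather than with character-based functions. The following minimal sketch uses only PHP built-ins; the input string is made up, and the 2-byte length is assumed to be what a lookup for ':?' would report.

```php
<?php
// Offsets are counted in bytes, so multibyte characters shift them.
$input = 'Café :?';

// strpos() reports a byte offset: "é" occupies two bytes in UTF-8.
$colon_at = strpos( $input, ':' );

// Assumed for illustration: a lookup matched ':?' and reported 2 bytes.
$token_byte_length = 2;

// Byte-based slicing around the matched key.
$before = substr( $input, 0, $colon_at );
$after  = substr( $input, $colon_at + $token_byte_length );

$result = $before . '😕' . $after;
```

Mixing these byte offsets with character-based functions such as mb_substr() would slice at the wrong position whenever the text before the key contains multibyte characters.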
Parameters
$text
string, required. String in which to search for a lookup key.
$offset
int, optional. Byte offset into the string at which the lookup key ought to start. Default 0.
$matched_token_byte_length
optional, passed by reference. Holds the byte length of the matched token if one was found; otherwise it is not set. Default null.
$case_sensitivity
string, optional. Pass 'ascii-case-insensitive' to ignore ASCII case when matching. Default 'case-sensitive'.
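As a rough illustration of what ignoring ASCII case means in practice, the sketch below uses PHP's built-in substr_compare(), which the method's source also calls with a case-insensitivity flag. The sample text and the '&amp;' key are invented for the example and are not taken from any real token map.

```php
<?php
// Hypothetical input and lookup key, chosen only for illustration.
$text   = 'Tom &AMP; Jerry';
$key    = '&amp;';
$offset = strpos( $text, '&' );

// Case-sensitive comparison: '&AMP;' differs from '&amp;'.
$case_sensitive_hit = 0 === substr_compare( $text, $key, $offset, strlen( $key ) );

// Case-insensitive comparison: the ASCII letters A-Z fold to a-z.
$ignore_case_hit = 0 === substr_compare( $text, $key, $offset, strlen( $key ), true );

// Non-ASCII letters are not folded (assuming the default "C" locale):
// 'É' and 'é' remain different byte sequences.
$accent_hit = 0 === substr_compare( 'É', 'é', 0, strlen( 'é' ), true );
```

Here only $ignore_case_hit ends up true, which matches the idea that the option affects ASCII letters and nothing else.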
Source
public function read_token( $text, $offset = 0, &$matched_token_byte_length = null, $case_sensitivity = 'case-sensitive' ) {
    $ignore_case = 'ascii-case-insensitive' === $case_sensitivity;
    $text_length = strlen( $text );

    // Search for a long word first, if the text is long enough, and if that fails, a short one.
    if ( $text_length > $this->key_length ) {
        $group_key = substr( $text, $offset, $this->key_length );
        $group_at  = $ignore_case ? stripos( $this->groups, $group_key ) : strpos( $this->groups, $group_key );
        if ( false === $group_at ) {
            // Perhaps a short word then.
            return strlen( $this->small_words ) > 0
                ? $this->read_small_token( $text, $offset, $matched_token_byte_length, $case_sensitivity )
                : null;
        }

        $group        = $this->large_words[ $group_at / ( $this->key_length + 1 ) ];
        $group_length = strlen( $group );
        $at           = 0;
        while ( $at < $group_length ) {
            $token_length   = unpack( 'C', $group[ $at++ ] )[1];
            $token          = substr( $group, $at, $token_length );
            $at            += $token_length;
            $mapping_length = unpack( 'C', $group[ $at++ ] )[1];
            $mapping_at     = $at;

            if ( 0 === substr_compare( $text, $token, $offset + $this->key_length, $token_length, $ignore_case ) ) {
                $matched_token_byte_length = $this->key_length + $token_length;
                return substr( $group, $mapping_at, $mapping_length );
            }

            $at = $mapping_at + $mapping_length;
        }
    }

    // Perhaps a short word then.
    return strlen( $this->small_words ) > 0
        ? $this->read_small_token( $text, $offset, $matched_token_byte_length, $case_sensitivity )
        : null;
}
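The inner loop in the source walks a packed string in which each entry appears to be a one-byte length, the rest of the token, another one-byte length, and then the mapping. The sketch below imitates that layout in isolation so the scan is easier to follow. Both helper functions and the sample data are hypothetical, ord() stands in for unpack( 'C', ... ), and none of this is the class's actual storage code.

```php
<?php
/**
 * Hypothetical helper: packs [ suffix => mapping ] pairs into one string.
 * Each entry is: 1 length byte, the suffix, 1 length byte, the mapping.
 */
function sketch_pack_group( array $pairs ): string {
    $group = '';
    foreach ( $pairs as $suffix => $mapping ) {
        $group .= chr( strlen( $suffix ) ) . $suffix . chr( strlen( $mapping ) ) . $mapping;
    }
    return $group;
}

/**
 * Hypothetical helper: scans a packed group for a suffix found at $offset in $text.
 * Returns the mapping and sets $matched_length, or returns null when nothing matches.
 */
function sketch_scan_group( string $group, string $text, int $offset, &$matched_length = null ): ?string {
    $at  = 0;
    $end = strlen( $group );
    while ( $at < $end ) {
        // Read the length-prefixed suffix.
        $token_length = ord( $group[ $at++ ] );
        $token        = substr( $group, $at, $token_length );
        $at          += $token_length;

        // Read the length prefix of the mapping that follows it.
        $mapping_length = ord( $group[ $at++ ] );

        if ( 0 === substr_compare( $text, $token, $offset, $token_length ) ) {
            $matched_length = $token_length;
            return substr( $group, $at, $mapping_length );
        }

        // Skip this mapping and move on to the next entry.
        $at += $mapping_length;
    }
    return null;
}

// Made-up data: suffixes that follow a shared two-byte prefix "ap".
$group = sketch_pack_group( array( 'ple' => 'A', 'ricot' => 'B' ) );
$fruit = sketch_scan_group( $group, 'apricot', 2, $length );
$none  = sketch_scan_group( $group, 'apex', 2 );
```

Because lengths are stored in a single byte, a layout like this limits each suffix and each mapping to 255 bytes. In this sketch a match on "ricot" returns its mapping and reports a suffix length of 5, to which the real method would add the key length.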
Changelog
Version | Description |
---|---|
6.6.0 | Introduced. |