WP_HTML_Tag_Processor::get_modifiable_text(): string

Returns the modifiable text for a matched token, or an empty string.

Description

Modifiable text is text content that may be read and changed without changing the HTML structure of the document around it. This includes the contents of #text nodes in the HTML as well as the inner contents of HTML comments, Processing Instructions, and others, even though these nodes aren’t part of a parsed DOM tree. It also includes the contents of SCRIPT and STYLE tags, of TEXTAREA tags, and of any other section in an HTML document which cannot contain HTML markup (DATA).

If a token has no modifiable text then an empty string is returned to avoid needless crashing or type errors. An empty return value therefore does not indicate whether a token has modifiable text: a token without modifiable text returns an empty string, and so does a token whose modifiable text is itself empty (e.g. a comment with no contents).
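
Because an empty string is ambiguous, one way to tell the cases apart is to look at the token type next to the text. The following is a minimal sketch, assuming WordPress 6.5 or later is loaded so that WP_HTML_Tag_Processor is available; the helper name wpdocs_list_modifiable_text() is hypothetical and is not part of WordPress.

```php
<?php
/**
 * Hypothetical helper: lists each token's type alongside its modifiable text.
 *
 * Assumes WordPress 6.5+ is loaded, so that WP_HTML_Tag_Processor exists.
 *
 * @param string $html Markup to scan.
 * @return array[] One array( 'type' => ..., 'text' => ... ) entry per token.
 */
function wpdocs_list_modifiable_text( string $html ): array {
	$processor = new WP_HTML_Tag_Processor( $html );
	$tokens    = array();

	// next_token() visits every token, not only tags.
	while ( $processor->next_token() ) {
		$tokens[] = array(
			// Token type, for example '#tag', '#text', or '#comment'.
			'type' => $processor->get_token_type(),
			// Empty both when there is no modifiable text and when it is empty.
			'text' => $processor->get_modifiable_text(),
		);
	}

	return $tokens;
}
```

With a helper like this, an ordinary tag such as DIV and a comment with no contents would both be expected to report an empty 'text' value, so only the 'type' value distinguishes them.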

Return

string

Source

public function get_modifiable_text() {
	if ( null === $this->text_starts_at ) {
		return '';
	}

	$text = substr( $this->html, $this->text_starts_at, $this->text_length );

	// Comment data is not decoded.
	if (
		self::STATE_CDATA_NODE === $this->parser_state ||
		self::STATE_COMMENT === $this->parser_state ||
		self::STATE_DOCTYPE === $this->parser_state ||
		self::STATE_FUNKY_COMMENT === $this->parser_state
	) {
		return $text;
	}

	$tag_name = $this->get_tag();
	if (
		// Script data is not decoded.
		'SCRIPT' === $tag_name ||

		// RAWTEXT data is not decoded.
		'IFRAME' === $tag_name ||
		'NOEMBED' === $tag_name ||
		'NOFRAMES' === $tag_name ||
		'STYLE' === $tag_name ||
		'XMP' === $tag_name
	) {
		return $text;
	}

	$decoded = WP_HTML_Decoder::decode_text_node( $text );

	/*
	 * TEXTAREA skips a leading newline, but this newline may appear not only as the
	 * literal character `\n`, but also as a character reference, such as in the
	 * following markup: `<textarea>&#x0a;Content</textarea>`.
	 *
	 * For these cases it's important to first decode the text content before checking
	 * for a leading newline and removing it.
	 */
	if (
		self::STATE_MATCHED_TAG === $this->parser_state &&
		'TEXTAREA' === $tag_name &&
		strlen( $decoded ) > 0 &&
		"\n" === $decoded[0]
	) {
		return substr( $decoded, 1 );
	}

	return $decoded;
}
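
The ordering in the TEXTAREA branch above (decode first, then strip one leading newline) can be illustrated with a standalone sketch. It uses PHP's built-in html_entity_decode() as a stand-in for WP_HTML_Decoder::decode_text_node(); that substitution is an assumption made only so the example runs without WordPress, and the two functions are not equivalent in general. The function name wpdocs_textarea_text() is hypothetical.

```php
<?php
/**
 * Hypothetical helper mimicking the TEXTAREA branch of get_modifiable_text().
 *
 * html_entity_decode() is only a stand-in for
 * WP_HTML_Decoder::decode_text_node() in this sketch.
 *
 * @param string $raw Raw text found between <textarea> and </textarea>.
 * @return string Decoded text without a single leading newline.
 */
function wpdocs_textarea_text( string $raw ): string {
	// Decode first, so that `&#x0a;` becomes a literal newline character.
	$decoded = html_entity_decode( $raw, ENT_QUOTES | ENT_HTML5, 'UTF-8' );

	// Then drop one leading newline, whether it was literal or encoded.
	if ( '' !== $decoded && "\n" === $decoded[0] ) {
		return substr( $decoded, 1 );
	}

	return $decoded;
}

echo wpdocs_textarea_text( '&#x0a;Content' ), "\n"; // Content
echo wpdocs_textarea_text( "\nContent" ), "\n";     // Content
echo wpdocs_textarea_text( 'Content' ), "\n";       // Content
```

Checking for the newline before decoding would miss the `&#x0a;` form, which is why the check comes second.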

Changelog

Version    Description
6.5.0      Introduced.

User Contributed Notes

    Because a #text node is not part of the tag itself, get_modifiable_text() can’t be called on the tag directly.
    After matching the desired tag, first advance to the token that follows it with next_token().

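    function wpdocs_get_text_from_block( $block_content, $block ) {

    	// Example: $block_content = "<div>Lorem Ipsum</div>"
    	$processor = new WP_HTML_Tag_Processor( $block_content );

    	// Match the DIV, then advance to the token that follows its opening tag.
    	if ( $processor->next_tag( 'div' ) && $processor->next_token() ) {
    		// Only read the text if that token is a #text node.
    		if ( '#text' === $processor->get_token_type() ) {
    			$node_text = $processor->get_modifiable_text();
    			error_log( $node_text ); // Logs "Lorem Ipsum" for the example above.
    		}
    	}

    	// No updates were enqueued, so the markup is returned unchanged.
    	return $processor->get_updated_html();
    }
    add_filter( 'render_block_custom/div', 'wpdocs_get_text_from_block', 10, 2 );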
