WP_HTML_Tag_Processor::get_modifiable_text(): string


Returns the modifiable text for a matched token, or an empty string.


Modifiable text is text content that may be read and changed without changing the HTML structure of the document around it. This includes the contents of #text nodes in the HTML as well as the inner contents of HTML comments, Processing Instructions, and others, even though these nodes aren’t part of a parsed DOM tree. It also includes the contents of SCRIPT and STYLE tags, of TEXTAREA tags, and of any other section in an HTML document which cannot contain HTML markup (DATA).

If a token has no modifiable text then an empty string is returned to avoid needless crashing or type errors. An empty string therefore does not indicate whether a token has modifiable text: a token with modifiable text may also return an empty string (e.g. a comment with no contents).
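As a rough illustration of the description above, the following sketch walks a fragment token by token and records each token's modifiable text. It assumes it runs inside WordPress 6.5 or later, where `WP_HTML_Tag_Processor::next_token()`, `get_token_type()`, and `get_token_name()` are available. `collect_modifiable_text()` is a hypothetical helper name used only for this example, not part of the API.

```php
<?php
/**
 * Hypothetical helper: collects the modifiable text of every token in a fragment.
 * Assumes WordPress 6.5+ is loaded so that WP_HTML_Tag_Processor exists.
 */
function collect_modifiable_text( string $html ): array {
	$processor = new WP_HTML_Tag_Processor( $html );
	$found     = array();

	// next_token() visits tags, text nodes, comments, and other tokens in order.
	while ( $processor->next_token() ) {
		$found[] = array(
			'type' => $processor->get_token_type(),
			'name' => $processor->get_token_name(),
			'text' => $processor->get_modifiable_text(),
		);
	}

	return $found;
}

// Only runs where WordPress has defined the class.
if ( class_exists( 'WP_HTML_Tag_Processor' ) ) {
	$tokens = collect_modifiable_text( '<p>Fish &amp; chips</p><!----><script>a &amp;&amp; b</script>' );
	foreach ( $tokens as $token ) {
		printf( "%s %s: %s\n", $token['type'], $token['name'], var_export( $token['text'], true ) );
	}
}
```

Under these assumptions, the `P` opener (no modifiable text) and the empty comment (modifiable text that happens to be empty) would both be expected to report `''`. When the distinction matters, checking the token type is likely a more reliable signal than the returned string.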




public function get_modifiable_text() {
	if ( null === $this->text_starts_at ) {
		return '';
	}

	$text = substr( $this->html, $this->text_starts_at, $this->text_length );

	// Comment data is not decoded.
	if (
		self::STATE_CDATA_NODE === $this->parser_state ||
		self::STATE_COMMENT === $this->parser_state ||
		self::STATE_DOCTYPE === $this->parser_state ||
		self::STATE_FUNKY_COMMENT === $this->parser_state
	) {
		return $text;
	}

	$tag_name = $this->get_tag();
	if (
		// Script data is not decoded.
		'SCRIPT' === $tag_name ||

		// RAWTEXT data is not decoded.
		'IFRAME' === $tag_name ||
		'NOEMBED' === $tag_name ||
		'NOFRAMES' === $tag_name ||
		'STYLE' === $tag_name ||
		'XMP' === $tag_name
	) {
		return $text;
	}

	$decoded = html_entity_decode( $text, ENT_QUOTES | ENT_HTML5 | ENT_SUBSTITUTE );

	/*
	 * TEXTAREA skips a leading newline, but this newline may appear not only as the
	 * literal character `\n`, but also as a character reference, such as in the
	 * following markup: `<textarea>&#x0a;Content</textarea>`.
	 *
	 * For these cases it's important to first decode the text content before checking
	 * for a leading newline and removing it.
	 */
	if (
		self::STATE_MATCHED_TAG === $this->parser_state &&
		'TEXTAREA' === $tag_name &&
		strlen( $decoded ) > 0 &&
		"\n" === $decoded[0]
	) {
		return substr( $decoded, 1 );
	}

	return $decoded;
}
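
The reason for decoding before stripping the newline can be seen in isolation. The sketch below is plain PHP and uses a hypothetical `textarea_text_sketch()` helper that mirrors only the last two steps of the source above. It is not the method itself and it ignores parser state.

```php
<?php
/**
 * Hypothetical helper mirroring the decode-then-strip step for TEXTAREA content.
 * Takes the raw text found between <textarea> and </textarea>.
 */
function textarea_text_sketch( string $raw ): string {
	// Decode character references first, using the same flags as the source above.
	$decoded = html_entity_decode( $raw, ENT_QUOTES | ENT_HTML5 | ENT_SUBSTITUTE );

	// Only then drop a single leading newline, however it was written in the markup.
	if ( '' !== $decoded && "\n" === $decoded[0] ) {
		return substr( $decoded, 1 );
	}

	return $decoded;
}

var_dump( textarea_text_sketch( "\nContent" ) );     // Newline as a literal character.
var_dump( textarea_text_sketch( '&#x0a;Content' ) ); // Newline as a character reference.
var_dump( textarea_text_sketch( 'A &amp; B' ) );     // No leading newline to remove.
```

The first two calls return `Content` and the third returns `A & B`. Checking for `"\n"` before decoding would miss the second case, because the raw text starts with `&` rather than a newline.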


