Uses RegEx to extract URLs from arbitrary content.
Parameters
$content
string, required. Content to extract URLs from.
Source
function wp_extract_urls( $content ) {
	preg_match_all(
		"#([\"']?)("
			. '(?:([\w-]+:)?//?)'
			. '[^\s()<>]+'
			. '[.]'
			. '(?:'
				. '\([\w\d]+\)|'
				. '(?:'
					. "[^`!()\[\]{}:'\".,<>«»“”‘’\s]|"
					. '(?:[:]\d+)?/?'
				. ')+'
			. ')'
		. ")\\1#",
		$content,
		$post_links
	);

	$post_links = array_unique(
		array_map(
			static function ( $link ) {
				// Decode to replace valid entities, like &amp;.
				$link = html_entity_decode( $link );
				// Maintain backward compatibility by removing extraneous semi-colons (`;`).
				return str_replace( ';', '', $link );
			},
			$post_links[2]
		)
	);

	return array_values( $post_links );
}
This doesn’t work for localhost URLs without TLDs, because the pattern requires a literal dot somewhere after the scheme.
(See this ticket.)
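A minimal sketch of this limitation, assuming WordPress is loaded so that wp_extract_urls() is defined. The sample strings and variable names are made up for illustration.

```php
<?php
// Assumption: this runs inside a loaded WordPress environment,
// so wp_extract_urls() is available.

// No dot anywhere after the scheme, so the pattern cannot match.
$local = wp_extract_urls( 'Dev site: http://localhost/wp-admin' );

// Same path on a host with a TLD is picked up as expected.
$public = wp_extract_urls( 'Live site: http://example.com/wp-admin' );

var_dump( $local );  // empty array
var_dump( $public ); // array with one element: "http://example.com/wp-admin"
```

If localhost links need to be collected, they would have to be handled separately, for example with a second, project-specific pattern.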
Example
This code:
Will return an array like this:
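For instance, here is a minimal sketch of a call and its result, assuming WordPress is loaded so that wp_extract_urls() is defined. The sample markup is made up for illustration and is not taken from the original example.

```php
<?php
// Assumption: this runs inside a loaded WordPress environment,
// so wp_extract_urls() is available. The content below is hypothetical.
$content = 'Read <a href="https://example.com/?a=1&amp;b=2">this</a> '
	. 'and http://wordpress.org today, then http://wordpress.org again.';

$urls = wp_extract_urls( $content );

print_r( $urls );
// Array
// (
//     [0] => https://example.com/?a=1&b=2
//     [1] => http://wordpress.org
// )
```

The quoted href is returned without its surrounding quotes, the `&amp;` entity is decoded to `&`, and the repeated link appears only once because the result passes through array_unique() and array_values().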