Requests::compatible_gzinflate( string $gz_data ): string|bool

Decompression of deflated string while staying compatible with the majority of servers.

Description

Certain servers return deflated data with headers that PHP’s gzinflate() function cannot handle out of the box. This function was assembled from various snippets on the gzinflate() PHP documentation page.
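To see why a plain gzinflate() call is not always enough, the following minimal sketch compresses one string three ways with PHP's zlib extension and passes each result to gzinflate(). The variable names are illustrative only, and the @ operator simply hides the warning that gzinflate() emits when it rejects its input.

```php
<?php
// Assumes the zlib extension is loaded.
$text = 'Hello, world!';

$raw  = gzdeflate($text);  // raw DEFLATE stream (RFC 1951), no wrapper
$zlib = gzcompress($text); // zlib wrapper (RFC 1950): 2-byte header, Adler-32 trailer
$gzip = gzencode($text);   // gzip wrapper (RFC 1952): header plus CRC-32 and size trailer

$from_raw  = @gzinflate($raw);
$from_zlib = @gzinflate($zlib);
$from_gzip = @gzinflate($gzip);

var_dump($from_raw === $text); // raw DEFLATE is accepted
var_dump($from_zlib);          // zlib-wrapped input is rejected
var_dump($from_gzip);          // gzip-wrapped input is rejected
```

Only the unwrapped stream decompresses directly, which is the gap this function tries to close.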

Warning: Magic numbers within. Because the compressed data may be returned in several different formats, some "magic offsets" are needed to ensure proper decompression takes place. For a simple programmatic way to determine the magic offset in use, see: https://core.trac.wordpress.org/ticket/18273
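As a rough illustration of what such offsets look like, the sketch below strips the fixed-size wrappers that PHP's own gzencode() and gzcompress() add. It assumes the simplest case: a gzip header with no optional fields (10 bytes) and a 2-byte zlib header. Real responses may carry optional gzip fields, so a hard-coded 10 is not safe in general. The variable names are made up for this example.

```php
<?php
// Assumes the zlib extension is loaded.
$text = 'Hello, world!';

// gzip: 10-byte header (gzencode() writes no optional fields),
// 8-byte trailer (CRC-32 and uncompressed size).
$gzip      = gzencode($text);
$from_gzip = gzinflate(substr($gzip, 10, -8));

// zlib: 2-byte header (CMF, FLG), 4-byte Adler-32 trailer.
$zlib      = gzcompress($text);
$from_zlib = gzinflate(substr($zlib, 2, -4));

// A zlib header can be recognised without decompressing: the low nibble
// of the first byte is the compression method (8 = deflate), and the first
// two bytes read as a big-endian integer are a multiple of 31.
// This sketch uses ord() and bit masks rather than unpack().
$cmf        = ord($zlib[0]);
$flg        = ord($zlib[1]);
$looks_zlib = (($cmf & 0x0F) === 8) && ((($cmf << 8) | $flg) % 31 === 0);

var_dump($from_gzip === $text, $from_zlib === $text, $looks_zlib);
```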

Parameters

$gz_data string required
String to decompress.

Return

string|bool Decompressed string on success, false on failure.
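Because the return type mixes strings and false, callers probably want a strict comparison. The sketch below shows one way to handle that. decode_or_keep() is a hypothetical helper, not part of Requests or WordPress, and the stand-in closure only mimics the string-or-false contract so the example runs on its own.

```php
<?php
/**
 * Hypothetical helper, not part of Requests or WordPress.
 *
 * @param callable $inflater Same contract as compatible_gzinflate():
 *                           string in, string|false out.
 * @param string   $body     Possibly compressed response body.
 * @return string Decoded body, or the original body if decoding failed.
 */
function decode_or_keep(callable $inflater, string $body): string {
    $decoded = $inflater($body);

    // Compare with === false: a loose check such as !$decoded would also
    // treat valid results like '' or '0' as failures.
    return $decoded === false ? $body : $decoded;
}

// Stand-in inflater so the sketch runs without WordPress loaded.
$inflater = static function (string $data) {
    return @gzinflate($data);
};

$zero  = decode_or_keep($inflater, gzdeflate('0'));
$plain = decode_or_keep($inflater, 'not compressed');
```

In WordPress, the callable would instead point at the compatible_gzinflate() static method of the bundled Requests class.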

Source

public static function compatible_gzinflate($gz_data) {
	if (is_string($gz_data) === false) {
		throw InvalidArgument::create(1, '$gz_data', 'string', gettype($gz_data));
	}

	if (trim($gz_data) === '') {
		return false;
	}

	// Compressed data might contain a full gzip header (magic bytes
	// 1f 8b, method 08); if so, strip it for gzinflate()
	if (substr($gz_data, 0, 3) === "\x1f\x8b\x08") {
		$i   = 10;
		$flg = ord(substr($gz_data, 3, 1));
		if ($flg > 0) {
			if ($flg & 4) {
				list($xlen) = unpack('v', substr($gz_data, $i, 2));
				$i         += 2 + $xlen;
			}

			if ($flg & 8) {
				$i = strpos($gz_data, "\0", $i) + 1;
			}

			if ($flg & 16) {
				$i = strpos($gz_data, "\0", $i) + 1;
			}

			if ($flg & 2) {
				$i += 2;
			}
		}

		$decompressed = self::compatible_gzinflate(substr($gz_data, $i));
		if ($decompressed !== false) {
			return $decompressed;
		}
	}

	// If the data is Huffman Encoded, we must first strip the leading 2
	// byte Huffman marker for gzinflate()
	// The response is Huffman coded by many compressors such as
	// java.util.zip.Deflater, Ruby's Zlib::Deflate, and .NET's
	// System.IO.Compression.DeflateStream.
	//
	// See https://decompres.blogspot.com/ for a quick explanation of this
	// data type
	$huffman_encoded = false;

	// low nibble of first byte should be 0x08
	list(, $first_nibble) = unpack('h', $gz_data);

	// First 2 bytes should be divisible by 0x1F
	list(, $first_two_bytes) = unpack('n', $gz_data);

	if ($first_nibble === 0x08 && ($first_two_bytes % 0x1F) === 0) {
		$huffman_encoded = true;
	}

	if ($huffman_encoded) {
		$decompressed = @gzinflate(substr($gz_data, 2));
		if ($decompressed !== false) {
			return $decompressed;
		}
	}

	if (substr($gz_data, 0, 4) === "\x50\x4b\x03\x04") {
		// ZIP file format header
		// Offset 6: 2 bytes, General-purpose field
		// Offset 26: 2 bytes, filename length
		// Offset 28: 2 bytes, optional field length
		// Offset 30: Filename field, followed by optional field, followed
		// immediately by data
		list(, $general_purpose_flag) = unpack('v', substr($gz_data, 6, 2));

		// If the file has been compressed on the fly, the 0x08 bit of
		// the general purpose field is set. We can use this to
		// differentiate between a compressed document and a ZIP file
		$zip_compressed_on_the_fly = ((0x08 & $general_purpose_flag) === 0x08);

		if (!$zip_compressed_on_the_fly) {
			// Don't attempt to decode a compressed zip file
			return $gz_data;
		}

		// Determine the first byte of data, based on the above ZIP header
		// offsets:
		$first_file_start = array_sum(unpack('v2', substr($gz_data, 26, 4)));
		$decompressed     = @gzinflate(substr($gz_data, 30 + $first_file_start));
		if ($decompressed !== false) {
			return $decompressed;
		}

		return false;
	}

	// Finally fall back to straight gzinflate
	$decompressed = @gzinflate($gz_data);
	if ($decompressed !== false) {
		return $decompressed;
	}

	// Fallback for all above failing, not expected, but included for
	// debugging and preventing regressions and to track stats
	$decompressed = @gzinflate(substr($gz_data, 2));
	if ($decompressed !== false) {
		return $decompressed;
	}

	return false;
}
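
The ZIP branch is probably the least obvious part of the function, so here is a minimal sketch that hand-builds a local file header and then locates the compressed payload using the same offsets (6, 26, 28, 30). The entry name, payload, and zeroed CRC and size fields are made up for illustration. Leaving those fields at zero while bit 0x08 is set mirrors streamed archives, which defer them to a data descriptor after the payload.

```php
<?php
// Assumes the zlib extension is loaded.
$name    = 'hello.txt';      // hypothetical entry name
$payload = 'Hello from ZIP'; // hypothetical file contents

// 30-byte local file header, all fields little-endian.
$header = pack(
    'VvvvvvVVVvv',
    0x04034b50,    // offset 0:  signature "PK\x03\x04"
    20,            // offset 4:  version needed to extract
    0x08,          // offset 6:  general-purpose flags, bit 3 set
    8,             // offset 8:  compression method (8 = deflate)
    0,             // offset 10: modification time
    0,             // offset 12: modification date
    0,             // offset 14: CRC-32 (deferred)
    0,             // offset 18: compressed size (deferred)
    0,             // offset 22: uncompressed size (deferred)
    strlen($name), // offset 26: file name length
    0              // offset 28: extra field length
);
$zip = $header . $name . gzdeflate($payload);

// Read the flags, then skip the header, file name, and extra field.
list(, $flags) = unpack('v', substr($zip, 6, 2));
$skip = array_sum(unpack('v2', substr($zip, 26, 4)));
$data = gzinflate(substr($zip, 30 + $skip));

var_dump(($flags & 0x08) === 0x08, $skip, $data === $payload);
```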

Changelog

Version    Description
1.6.0      Introduced.
