utf8_decode
(PHP 4, PHP 5, PHP 7, PHP 8)
utf8_decode — Converts a string from UTF-8 to ISO-8859-1, replacing invalid or unrepresentable characters
This function has been DEPRECATED as of PHP 8.2.0. Relying on this function is highly discouraged.
Description
This function converts the string string from the UTF-8 encoding to ISO-8859-1. Bytes in the string which are not valid UTF-8, and UTF-8 characters which do not exist in ISO-8859-1 (that is, code points above U+00FF) are replaced with ?.
Note:
Many web pages marked as using the
ISO-8859-1character encoding actually use the similarWindows-1252encoding, and web browsers will interpretISO-8859-1web pages asWindows-1252.Windows-1252features additional printable characters, such as the Euro sign (€) and curly quotes (“”), instead of certainISO-8859-1control characters. This function will not convert suchWindows-1252characters correctly. Use a different function ifWindows-1252conversion is required.
Parameters
stringA UTF-8 encoded string.
Return Values
Returns the ISO-8859-1 translation of string.
Changelog
| Version | Description |
|---|---|
| 8.2.0 | This function has been deprecated. |
| 7.2.0 | This function has been moved from the XML extension to the core of PHP. In previous versions, it was only available if the XML extension was installed. |
Examples
Example #1 Basic examples
<?php
// Convert the string 'Zoë' from UTF-8 to ISO 8859-1
$utf8_string = "\x5A\x6F\xC3\xAB";
$iso8859_1_string = utf8_decode($utf8_string);
echo bin2hex($iso8859_1_string), "\n";
// Invalid UTF-8 sequences are replaced with '?'
$invalid_utf8_string = "\xC3";
$iso8859_1_string = utf8_decode($invalid_utf8_string);
var_dump($iso8859_1_string);
// Characters which don't exist in ISO 8859-1, such as
// '€' (Euro Sign) are also replaced with '?'
$utf8_string = "\xE2\x82\xAC";
$iso8859_1_string = utf8_decode($utf8_string);
var_dump($iso8859_1_string);
?>The above example will output:
5a6feb string(1) "?" string(1) "?"
Notes
Note: Deprecation and alternatives
This function is deprecated as of PHP 8.2.0, and will be removed in a future version. Existing uses should be checked and replaced with appropriate alternatives.
Similar functionality can be achieved with mb_convert_encoding(), which supports ISO-8859-1 and many other character encodings.
<?php
$utf8_string = "\xC3\xAB"; // 'ë' (e with diaeresis) in UTF-8
$iso8859_1_string = mb_convert_encoding($utf8_string, 'ISO-8859-1', 'UTF-8');
echo bin2hex($iso8859_1_string), "\n";
$utf8_string = "\xCE\xBB"; // 'λ' (Greek lower-case lambda) in UTF-8
$iso8859_7_string = mb_convert_encoding($utf8_string, 'ISO-8859-7', 'UTF-8');
echo bin2hex($iso8859_7_string), "\n";
$utf8_string = "\xE2\x82\xAC"; // '€' (Euro sign) in UTF-8 (not present in ISO-8859-1)
$windows_1252_string = mb_convert_encoding($utf8_string, 'Windows-1252', 'UTF-8');
echo bin2hex($windows_1252_string), "\n";
?>The above example will output:
eb eb 80Other options which may be available depending on the extensions installed are UConverter::transcode() and iconv().
The following all give the same result:
Specifying<?php
$utf8_string = "\x5A\x6F\xC3\xAB"; // 'Zoë' in UTF-8
$iso8859_1_string = utf8_decode($utf8_string);
echo bin2hex($iso8859_1_string), "\n";
$iso8859_1_string = mb_convert_encoding($utf8_string, 'ISO-8859-1', 'UTF-8');
echo bin2hex($iso8859_1_string), "\n";
$iso8859_1_string = iconv('UTF-8', 'ISO-8859-1', $utf8_string);
echo bin2hex($iso8859_1_string), "\n";
$iso8859_1_string = UConverter::transcode($utf8_string, 'ISO-8859-1', 'UTF8');
echo bin2hex($iso8859_1_string), "\n";
?>The above example will output:
5a6feb 5a6feb 5a6feb 5a6feb'?'as the'to_subst'option to UConverter::transcode() gives the same result as utf8_decode() for strings which are invalid or which can not be represented in ISO 8859-1.<?php
$utf8_string = "\xE2\x82\xAC"; // € (Euro Sign) does not exist in ISO 8859-1
$iso8859_1_string = UConverter::transcode(
$utf8_string, 'ISO-8859-1', 'UTF-8', ['to_subst' => '?']
);
var_dump($iso8859_1_string);
?>The above example will output:
sring(1) "?"
See Also
- utf8_encode() - Converts a string from ISO-8859-1 to UTF-8
- mb_convert_encoding() - Convert a string from one character encoding to another
- UConverter::transcode() - Convert a string from one character encoding to another
- iconv() - Convert a string from one character encoding to another