IRI
```bash
composer require php-standard-library/iri
```
The IRI component provides RFC 3987-compliant parsing of Internationalized Resource Identifiers (IRIs), with full Unicode support, Punycode encoding and decoding, and IDNA 2008 domain name processing.
Parsing
Parse IRI strings containing Unicode characters:
```php
use Psl\IO;
use Psl\IRI;

$iri = IRI\parse('https://münchen.de/straße?q=ünited#§ion');

// "https"
IO\write_line('%s', $iri->scheme ?? '<unknown>');

// "münchen.de"
IO\write_line('%s', $iri->authority?->host?->toString() ?? '<unknown>');

// "/straße"
IO\write_line('%s', $iri->path);

// "q=ünited"
IO\write_line('%s', $iri->query ?? '<unknown>');

// "§ion"
IO\write_line('%s', $iri->fragment ?? '<unknown>');

// "https://münchen.de/straße?q=ünited#§ion"
IO\write_line('%s', $iri->toString());
```
Input is normalized to Unicode Normalization Form C (NFC) and validated against the character ranges defined in RFC 3987. Private-use characters are permitted only in the query component.
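To make the range check concrete, here is a minimal sketch of how a non-ASCII code point could be classified against the `ucschar` and `iprivate` productions of RFC 3987. The function names are hypothetical and are not part of `Psl\IRI`; the component's actual validation may be organized differently, and NFC normalization is not covered here.

```php
<?php

declare(strict_types=1);

// Hypothetical helper (not part of Psl\IRI): RFC 3987 `ucschar` check.
function is_ucschar(int $cp): bool
{
    // BMP ranges: A0-D7FF, F900-FDCF, FDF0-FFEF.
    if (($cp >= 0xA0 && $cp <= 0xD7FF) || ($cp >= 0xF900 && $cp <= 0xFDCF) || ($cp >= 0xFDF0 && $cp <= 0xFFEF)) {
        return true;
    }

    // Planes 1 to 13: everything except the last two code points of each plane.
    if ($cp >= 0x10000 && $cp <= 0xDFFFD) {
        return ($cp & 0xFFFF) <= 0xFFFD;
    }

    // Plane 14 is only allowed from E1000 upward.
    return $cp >= 0xE1000 && $cp <= 0xEFFFD;
}

// Hypothetical helper (not part of Psl\IRI): RFC 3987 `iprivate` check.
function is_iprivate(int $cp): bool
{
    return ($cp >= 0xE000 && $cp <= 0xF8FF)
        || ($cp >= 0xF0000 && $cp <= 0xFFFFD)
        || ($cp >= 0x100000 && $cp <= 0x10FFFD);
}

// Private-use code points are accepted only when validating a query.
function is_allowed_non_ascii(int $cp, bool $in_query): bool
{
    return is_ucschar($cp) || ($in_query && is_iprivate($cp));
}

var_dump(is_allowed_non_ascii(0x00FC, false)); // U+00FC (ü) in a path: bool(true)
var_dump(is_allowed_non_ascii(0xE000, false)); // private use in a path: bool(false)
var_dump(is_allowed_non_ascii(0xE000, true));  // private use in a query: bool(true)
```

Working on integer code points keeps the sketch free of extensions; a real validator would first decode the UTF-8 input into code points.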
Converting to URI
Convert an IRI to an ASCII-only RFC 3986 URI. Internationalized domain names are Punycode-encoded, and Unicode characters in the path, query, and fragment are percent-encoded as UTF-8 bytes:
```php
use Psl\IO;
use Psl\IRI;

$iri = IRI\parse('https://münchen.de/straße');
$uri = $iri->toURI();

// "https://xn--mnchen-3ya.de/stra%C3%9Fe"
IO\write_line('%s', $uri->toString());
```
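The percent-encoding half of this conversion can be illustrated with plain PHP. The sketch below is an assumption about the general technique, not the component's implementation: `percent_encode_non_ascii` is a hypothetical helper that encodes every byte at or above 0x80 of a UTF-8 string and leaves ASCII untouched. Host names are not handled here, since they go through Punycode instead.

```php
<?php

declare(strict_types=1);

// Hypothetical helper (not part of Psl\IRI): percent-encode the UTF-8 bytes
// of non-ASCII characters, leaving every ASCII byte as it is.
function percent_encode_non_ascii(string $component): string
{
    $encoded = preg_replace_callback(
        '/[\x80-\xFF]/', // any single byte outside the ASCII range
        static fn(array $m): string => '%' . strtoupper(bin2hex($m[0])),
        $component,
    );

    if ($encoded === null) {
        throw new RuntimeException('Percent-encoding failed.');
    }

    return $encoded;
}

echo percent_encode_non_ascii('/straße'), "\n";  // prints "/stra%C3%9Fe"
echo percent_encode_non_ascii('q=ünited'), "\n"; // prints "q=%C3%BCnited"
```

Because the pattern runs in byte mode (no `/u` flag), each byte of a multi-byte character is encoded separately, which is what produces `%C3%9F` for `ß`.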
Converting from URI
Reverse the process: decode a URI back into an IRI, restoring the Unicode characters:
```php
use Psl\IO;
use Psl\IRI;
use Psl\URI;

$uri = URI\parse('https://xn--mnchen-3ya.de/stra%C3%9Fe');
$iri = IRI\from_uri($uri);

// "https://münchen.de/straße"
IO\write_line('%s', $iri->toString());
```
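As a rough illustration of the decoding direction, the sketch below restores percent-encoded sequences only when they form valid UTF-8, and leaves ASCII escapes such as `%2F` alone so that the structure of the identifier is preserved. `decode_utf8_escapes` is a hypothetical helper, not the logic behind `IRI\from_uri`, and it omits the extra exclusions RFC 3987 describes (for example, bidirectional formatting characters).

```php
<?php

declare(strict_types=1);

// Hypothetical helper (not part of Psl\IRI): decode runs of percent-encoded
// non-ASCII bytes, but only when the run is well-formed UTF-8.
function decode_utf8_escapes(string $component): string
{
    $decoded = preg_replace_callback(
        // One or more consecutive escapes whose byte value is >= 0x80.
        '/(?:%[89A-Fa-f][0-9A-Fa-f])+/',
        static function (array $m): string {
            $bytes = rawurldecode($m[0]);

            // preg_match() with the /u flag fails on malformed UTF-8,
            // in which case the original escapes are kept.
            return preg_match('//u', $bytes) === 1 ? $bytes : $m[0];
        },
        $component,
    );

    if ($decoded === null) {
        throw new RuntimeException('Percent-decoding failed.');
    }

    return $decoded;
}

echo decode_utf8_escapes('/stra%C3%9Fe'), "\n"; // prints "/straße"
echo decode_utf8_escapes('a%2Fb'), "\n";        // prints "a%2Fb"
echo decode_utf8_escapes('%FF'), "\n";          // prints "%FF"
```

A run that contains any malformed sequence is left fully encoded in this sketch; a stricter implementation could decode the valid parts of a run individually.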
Standards
| RFC | Title |
|---|---|
| RFC 3987 | Internationalized Resource Identifiers (IRIs) |
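| RFC 3492 | Punycode: A Bootstring encoding of Unicode for Internationalized Domain Names in Applications (IDNA) |
| RFC 5891 | Internationalized Domain Names in Applications (IDNA): Protocol |
| RFC 5892 | The Unicode Code Points and Internationalized Domain Names for Applications (IDNA) |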
See src/Psl/IRI/ for the full API.