International Domain Names in IE7
Hi, I am Vishu Gupta, a developer on the IE team. For the past year, I have been working primarily on CURI and International Domain Names (IDN) support. Browser support for navigating to URLs written in users' native languages is critical for making the Internet truly international. IDN relies upon a standardized mechanism known as "Punycode" for encoding Unicode domain names using only the ASCII characters that are permitted by the DNS system.
After XPSP2 was released, I was asked to study and evaluate what it would take to implement IDN support in Internet Explorer. We determined that the workitems involved in implementing IDN support in IE were:
Converting the Unicode domain names to Punycode before sending them over the wire.
Maintaining consistency within IE for handling domain names which enter IE in Punycode, and treating them equivalent to their Unicode counterparts.
Handling compatibility for existing scenarios.
Providing security against homograph-spoofing attacks without giving a bad user-experience for IDN URLs.
Conversion to Punycode
This is accomplished by using the APIs provided by the recently released "Microsoft Internationalized Domain Names (IDN) Mitigation APIs 1.0"; these APIs will ship with Windows Vista and IE7 and are also available for download here. You can learn more about these APIs by reading the MSDN documentation.
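To make the conversion concrete, here is a minimal sketch in Python. It uses Python's built-in `idna` codec, which follows RFC 3490, rather than the Windows APIs mentioned above. The helper names `to_punycode` and `to_unicode` and the host name `bücher.example.com` are made up for illustration.

```python
def to_punycode(host: str) -> str:
    """Return the ASCII (Punycode) form of a possibly-Unicode host name."""
    # The "idna" codec applies Nameprep and Punycode to each label.
    return host.encode("idna").decode("ascii")


def to_unicode(host: str) -> str:
    """Return the Unicode form of an ASCII (Punycode) host name."""
    return host.encode("ascii").decode("idna")


print(to_punycode("bücher.example.com"))        # xn--bcher-kva.example.com
print(to_unicode("xn--bcher-kva.example.com"))  # bücher.example.com
```

Only the label that contains non-ASCII characters gets the `xn--` prefix; the ASCII labels pass through unchanged.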
Maintaining consistency within IE
Many websites work around the limitation that IE6 does not support IDN by linking to the Punycoded URL. To improve user experience with those websites and to ensure that IE behaves consistently for equivalent Punycode and Unicode domain names, IE7 handles the URL as Nameprep Unicode internally (as suggested by RFC 3490). IE converts Unicode domain names to Punycode just before the domain name is resolved or sent to the proxy. This ensures, for example, that if the user added ŧēśŧ.example.com to the Restricted Sites zone, http://xn--hea8l8ac.example.com is also treated as a restricted site.
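The idea of treating both spellings as one name can be sketched as follows. This is only an illustration in Python using its built-in `idna` codec, not IE's actual code; `canonical_host`, `RESTRICTED_SITES`, `is_restricted` and the host `bücher.example.com` are hypothetical.

```python
def canonical_host(host: str) -> str:
    """Map the Unicode and Punycode spellings of a host to one Unicode form."""
    # Encoding normalizes Unicode labels; decoding turns xn-- labels back
    # into Unicode, so both spellings end up as the same string.
    return host.lower().encode("idna").decode("idna")


# Hypothetical zone list, stored in canonical (Unicode) form.
RESTRICTED_SITES = {canonical_host("bücher.example.com")}


def is_restricted(host: str) -> bool:
    return canonical_host(host) in RESTRICTED_SITES


print(is_restricted("bücher.example.com"))         # True
print(is_restricted("xn--bcher-kva.example.com"))  # True
print(is_restricted("buecher.example.com"))        # False
```

Because every lookup goes through the same canonical form, a site cannot escape a zone setting simply by being linked in its Punycode spelling.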
Maintaining compatibility
Using Punycode for name resolution is the default behavior for IE7. A new "International" section in the Internet Control Panel permits disabling IDN when sending the domain name either to the proxy or to the DNS resolver. Disabling both options will revert IE7 to IE6 behavior when handling Unicode domain names.
Blocking IDN spoofing
Lookalike attacks (sometimes called "homograph" attacks) are possible within the ASCII character set (the usual examples are www.example.com vs. www.examp1e.com). But, with IDN, the character repertoire expands from a few dozen characters to many thousands of characters from all of the world's languages, thereby increasing the attack surface for spoofing attacks immensely.
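A small Python illustration of why such lookalikes are hard to spot: the two labels below render almost identically in many fonts, yet they are different strings. The label "bank" and the helper `describe` are chosen here purely for illustration.

```python
import unicodedata


def describe(label: str) -> list[tuple[str, str]]:
    """List each character's code point and Unicode name."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in label]


genuine = "bank"       # all Latin letters
spoof = "b\u0430nk"    # second letter is CYRILLIC SMALL LETTER A

print(genuine == spoof)  # False
for code_point, name in describe(spoof):
    print(code_point, name)
```

The per-character listing shows the Cyrillic letter that the eye cannot distinguish from its Latin counterpart.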
There is little doubt that showing the Punycode form leaves no ground for spoofing using the full range of Unicode characters; however, showing Punycode isn't very user-friendly. The design of our anti-spoofing mitigation for IDN aims to:
Reduce attack surface
Treat Unicode domain names fairly
Offer a good user-experience for users worldwide
Offer simple, logical options to enable the user to fine-tune the IDN-experience
Given these considerations, IE7 imposes restrictions on the scripts allowed to be displayed inside the address bar. These restrictions are based on the user's configured browser language settings. Using APIs from the aforementioned idndl.dll, IE will detect what scripts (character sets) are used by the current domain name. If the domain name contains characters outside of the user's chosen languages, it is displayed in Punycode form to help prevent spoofing.
A domain name is displayed in Punycode if any of the following are true:
The domain name contains characters which are not a part of any language (e.g. www.▯.com)
Any one of its labels* contains a mix of scripts that do not appear together within a single language. For instance, Greek characters cannot mix with Cyrillic within a single label.
Any of its labels* contains characters that appear only in languages other than the user's list of chosen languages. Note that ASCII-only labels are always permitted for compatibility with existing sites.
(* A label is a segment of a domain name, delimited by dots. www.microsoft.com contains three labels, "www", "microsoft" and "com".)
If none of the above conditions apply, the domain name is displayed in Unicode. Note that different languages are allowed to appear in different labels, so long as all of the languages are in the list chosen by the user. This is to support domain names like name.example.com where "example" and "name" are composed of different languages.
We do not describe "other language" URLs as "suspicious" because such URLs are completely harmless when displayed in Punycode form. Whenever IE7 has prevented an IDN domain name from displaying in Unicode, an Information Bar notifies the user that the domain name contains characters IE is not configured to display. It is easy to add additional languages to the Allow List using the IDN Information Bar. By default, the user's list of languages will usually only contain the currently-configured Windows language.
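The three display rules can be approximated in a short Python sketch. It is a simplification under several assumptions and not IE's implementation: scripts are guessed from the first word of each character's Unicode name instead of the IDN Mitigation APIs, the user's languages are modeled as a set of script names, and any mix of scripts within a label is rejected, which is stricter than the rule above (some languages legitimately combine scripts). The names `label_scripts` and `show_unicode` are hypothetical.

```python
import unicodedata


def label_scripts(label: str):
    """Return the set of scripts used by a label's letters, or None if the
    label contains an unassigned code point."""
    scripts = set()
    for ch in label:
        if unicodedata.category(ch) == "Cn":   # not assigned to any script
            return None
        if ch.isalpha():
            # Assumption: the first word of the character name is the script,
            # e.g. "CYRILLIC SMALL LETTER A" -> "CYRILLIC".
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def show_unicode(host: str, allowed_scripts: set) -> bool:
    """True if the host may be shown in Unicode, False if Punycode is safer."""
    for label in host.split("."):
        if label.isascii():
            continue                           # ASCII-only labels always pass
        scripts = label_scripts(label)
        if scripts is None:                    # rule 1: unknown characters
            return False
        if len(scripts) > 1:                   # rule 2: mixed scripts in a label
            return False
        if not scripts <= allowed_scripts:     # rule 3: script not chosen by user
            return False
    return True


print(show_unicode("пример.example.com", {"LATIN"}))              # False
print(show_unicode("пример.example.com", {"LATIN", "CYRILLIC"}))  # True
print(show_unicode("b\u0430nk.example.com", {"LATIN", "CYRILLIC"}))  # False
```

Checking each label on its own is what lets different labels use different languages, as long as each one stays within the allowed set.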
Attack Surface Reduction
Our language-aware mitigation does two things:
It disallows non-standard combinations of scripts from being displayed inside a label. This takes care of attacks like http://bạnk.example.com. That domain name will always be displayed as http://xn--bnk-sgz.example.com, because two scripts (Cyrillic and Latin) are mixed inside a label. This reduces the attack-surface to "single-language attacks".
It further reduces the attack surface for single-language attacks to only those users who have chosen to permit the target language.
Defense-in-Depth
Users who allow Greek in their language-settings are as susceptible to Greek-only spoofs as the population using English is susceptible to pure-ASCII based spoofs. That's where IE7's Phishing Filter kicks in for both Unicode and ASCII URLs. If the user has opted into the Phishing Filter, a real-time check is performed during navigation to see if the target domain name is a reported phishing site. If so, navigation is blocked. For additional defense-in-depth, the Phishing Filter's web service can apply additional heuristics to determine if the domain name is visually ambiguous. If so, the Phishing Filter will warn the user via the indicator in the IE address bar.
Whenever viewing a site addressed by an International Domain Name, an indicator will appear in the IE address bar to notify the user that IDN is in use. The user can click on the IDN indicator to view more information about the current domain name.
Users who do not wish to see Unicode addresses may set an Internet Control Panel option to "Always show encoded addresses".
Call to Action
Internet Explorer 7 Beta 2 will include IDN support in nearly-final form and we would greatly appreciate feedback on the design. If you see a scenario not working properly (for example, if adding native language URLs to favorites is broken), please let us know.
- Vishu Gupta