Getting a Better Handle on International Domain Names and Punycode

    Published: 2025-08-26. Last Updated: 2025-08-26 16:22:52 UTC
    by Johannes Ullrich (Version: 1)

    Internationalized domain names (IDNs) continue to be an interesting topic. For the most part, they are probably less of an issue than some people make them out to be, given that popular browsers like Google Chrome are quite selective about when they display them in Unicode form. On the other hand, they are still in use, legitimately or not, and it is worth keeping a handle on them.

    When analyzing DNS traffic, you will see these domain names in their Punycode encoding. Punycode is defined in RFC 3492 [1]. Punycode-encoded labels start with "xn--", which makes them easy to identify.
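    As a rough illustration of the "xn--" prefix check, the sketch below splits a domain name into labels and decodes the Punycode ones with the "punycode" codec that ships with Python. The function names are made up for this example and are not taken from any particular tool.

```python
ACE_PREFIX = "xn--"  # prefix that marks a Punycode-encoded label


def is_punycode_label(label: str) -> bool:
    """Return True if the label carries the Punycode prefix (case-insensitive)."""
    return label.lower().startswith(ACE_PREFIX)


def decode_label(label: str) -> str:
    """Decode one label; labels without the prefix are returned unchanged."""
    if not is_punycode_label(label):
        return label
    # The codec expects the part after "xn--" as ASCII bytes.
    return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")


def decode_domain(domain: str) -> str:
    """Decode every label of a domain name as seen in DNS traffic."""
    labels = domain.rstrip(".").split(".")
    return ".".join(decode_label(label) for label in labels)


if __name__ == "__main__":
    print(decode_domain("xn--mnchen-3ya.de"))
```

    Decoding label by label matters because only some labels of a name may be encoded; the top-level domain in the example is plain ASCII.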

    A number of anomalies can occur with Punycode, and luckily, there are some Python modules that can help us identify them.

    1 - Invalid Punycode

    The Punycode standard is complex, and you may end up with domain names that carry the "xn--" prefix but whose Punycode is invalid and does not decode cleanly.
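    One simple way to flag invalid Punycode is to attempt a strict decode and report the labels that fail. The sketch below assumes Python's built-in "punycode" codec; the helper name is hypothetical. It only checks that the label decodes, not that the result satisfies all IDNA rules, so treat it as a first filter.

```python
def invalid_punycode_labels(domain: str) -> list[str]:
    """Return the "xn--" labels of a domain that fail to decode as Punycode."""
    bad = []
    for label in domain.rstrip(".").split("."):
        if not label.lower().startswith("xn--"):
            continue  # not Punycode-encoded, nothing to validate
        try:
            # Strict decoding raises a UnicodeError subclass on bad input,
            # including non-ASCII bytes in the label itself.
            label[4:].encode("ascii").decode("punycode")
        except UnicodeError:
            bad.append(label)
    return bad


if __name__ == "__main__":
    # The second name is the first one with its last character removed.
    for name in ("xn--mnchen-3ya.de", "xn--mnchen-3y.de"):
        print(name, invalid_punycode_labels(name))
```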

    2 - Mixed Script

    That is probably the most interesting issue. The goal is to detect whether a domain name mixes different languages. There is no easy way to identify the "language" of a character, so we use the "script" instead. A script identifies a group of languages that share the same characters; the Latin script, for example, is used for most European languages.

    In Python, the "unicodedata2" module can be used to look up the Unicode name of a character, and the first word of that name identifies the script the character belongs to. Mixing different scripts in a domain name is suspect, as legitimate internationalized domain names should stick to one language and therefore one script.
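    The first-word-of-the-name heuristic can be sketched as follows. This is not the code from the GitHub repository; it is a minimal example that uses the standard library "unicodedata" module, which offers the same name() lookup as "unicodedata2" but may ship older Unicode data. The function names, and the choice to ignore digits and hyphens, are assumptions made for this example.

```python
import unicodedata


def char_script(ch: str) -> str:
    """Approximate a character's script as the first word of its Unicode name."""
    try:
        return unicodedata.name(ch).split()[0]
    except ValueError:
        # Raised for characters that have no name in the database.
        return "UNKNOWN"


def scripts_in_label(label: str) -> set[str]:
    """Collect the scripts of all letters in an already decoded label."""
    # Digits and hyphens are skipped: their names start with words such as
    # "DIGIT" or "HYPHEN-MINUS", which are not scripts.
    return {char_script(ch) for ch in label if ch.isalpha()}


def is_mixed_script(label: str) -> bool:
    """Return True if the letters of the label come from more than one script."""
    return len(scripts_in_label(label)) > 1


if __name__ == "__main__":
    lookalike = "p\u0430ypal"  # second letter is CYRILLIC SMALL LETTER A
    print(scripts_in_label("paypal"), is_mixed_script("paypal"))
    print(scripts_in_label(lookalike), is_mixed_script(lookalike))
```

    The input should be the decoded Unicode label, not the "xn--" form, since the encoded form contains only ASCII characters.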

    You can find a quick Python implementation on GitHub: https://github.com/jullrich/idntest

    [1] https://datatracker.ietf.org/doc/html/rfc3492


    Johannes B. Ullrich, Ph.D. , Dean of Research, SANS.edu