Internationalized Domain Names for Python (IDNA 2008 and UTS #46)

Overview

Internationalized Domain Names in Applications (IDNA)

Support for the Internationalised Domain Names in Applications (IDNA) protocol as specified in RFC 5891. This is the latest version of the protocol and is sometimes referred to as “IDNA 2008”.

This library also provides support for Unicode Technical Standard 46, Unicode IDNA Compatibility Processing.

This acts as a suitable replacement for the “encodings.idna” module that comes with the Python standard library, but which only supports the older superseded IDNA specification (RFC 3490).

Basic functions are simply executed:

>>> import idna
>>> idna.encode('ドメイン.テスト')
b'xn--eckwd4c7c.xn--zckzah'
>>> print(idna.decode('xn--eckwd4c7c.xn--zckzah'))
ドメイン.テスト

Installation

To install this library, you can use pip:

$ pip install idna

Alternatively, you can install the package using the bundled setup script:

$ python setup.py install

Usage

For typical usage, the encode and decode functions will take a domain name argument and perform a conversion to A-labels or U-labels respectively.

>>> import idna
>>> idna.encode('ドメイン.テスト')
b'xn--eckwd4c7c.xn--zckzah'
>>> print(idna.decode('xn--eckwd4c7c.xn--zckzah'))
ドメイン.テスト

You may use the codec encoding and decoding methods using the idna.codec module:

>>> import idna.codec
>>> print('домен.испытание'.encode('idna'))
b'xn--d1acufc.xn--80akhbyknj4f'
>>> print(b'xn--d1acufc.xn--80akhbyknj4f'.decode('idna'))
домен.испытание

Conversions can be applied at a per-label basis using the ulabel or alabel functions if necessary:

>>> idna.alabel('测试')
b'xn--0zwm56d'

Compatibility Mapping (UTS #46)

As described in RFC 5895, the IDNA specification does not normalize input from different potential ways a user may input a domain name. This functionality, known as a “mapping”, is considered by the specification to be a local user-interface issue distinct from IDNA conversion functionality.

This library provides one such mapping, that was developed by the Unicode Consortium. Known as Unicode IDNA Compatibility Processing, it provides for both a regular mapping for typical applications, as well as a transitional mapping to help migrate from older IDNA 2003 applications.

For example, “Königsgäßchen” is not a permissible label as LATIN CAPITAL LETTER K is not allowed (nor are capital letters in general). UTS 46 will convert this into lower case prior to applying the IDNA conversion.

>>> import idna
>>> idna.encode('Königsgäßchen')
...
idna.core.InvalidCodepoint: Codepoint U+004B at position 1 of 'Königsgäßchen' not allowed
>>> idna.encode('Königsgäßchen', uts46=True)
b'xn--knigsgchen-b4a3dun'
>>> print(idna.decode('xn--knigsgchen-b4a3dun'))
königsgäßchen

Transitional processing provides conversions to help transition from the older 2003 standard to the current standard. For example, in the original IDNA specification, the LATIN SMALL LETTER SHARP S (ß) was converted into two LATIN SMALL LETTER S (ss), whereas in the current IDNA specification this conversion is not performed.

>>> idna.encode('Königsgäßchen', uts46=True, transitional=True)
'xn--knigsgsschen-lcb0w'

Implementors should use transitional processing with caution, only in rare cases where conversion from legacy labels to current labels must be performed (i.e. IDNA implementations that pre-date 2008). For typical applications that just need to convert labels, transitional processing is unlikely to be beneficial and could produce unexpected incompatible results.

encodings.idna Compatibility

Function calls from the Python built-in encodings.idna module are mapped to their IDNA 2008 equivalents using the idna.compat module. Simply substitute the import clause in your code to refer to the new module name.

Exceptions

All errors raised during the conversion following the specification should raise an exception derived from the idna.IDNAError base class.

More specific exceptions that may be generated as idna.IDNABidiError when the error reflects an illegal combination of left-to-right and right-to-left characters in a label; idna.InvalidCodepoint when a specific codepoint is an illegal character in an IDN label (i.e. INVALID); and idna.InvalidCodepointContext when the codepoint is illegal based on its positional context (i.e. it is CONTEXTO or CONTEXTJ but the contextual requirements are not satisfied.)

Building and Diagnostics

The IDNA and UTS 46 functionality relies upon pre-calculated lookup tables for performance. These tables are derived from computing against eligibility criteria in the respective standards. These tables are computed using the command-line script tools/idna-data.

This tool will fetch relevant codepoint data from the Unicode repository and perform the required calculations to identify eligibility. There are three main modes:

  • idna-data make-libdata. Generates idnadata.py and uts46data.py, the pre-calculated lookup tables using for IDNA and UTS 46 conversions. Implementors who wish to track this library against a different Unicode version may use this tool to manually generate a different version of the idnadata.py and uts46data.py files.
  • idna-data make-table. Generate a table of the IDNA disposition (e.g. PVALID, CONTEXTJ, CONTEXTO) in the format found in Appendix B.1 of RFC 5892 and the pre-computed tables published by IANA.
  • idna-data U+0061. Prints debugging output on the various properties associated with an individual Unicode codepoint (in this case, U+0061), that are used to assess the IDNA and UTS 46 status of a codepoint. This is helpful in debugging or analysis.

The tool accepts a number of arguments, described using idna-data -h. Most notably, the --version argument allows the specification of the version of Unicode to use in computing the table data. For example, idna-data --version 9.0.0 make-libdata will generate library data against Unicode 9.0.0.

Additional Notes

  • Packages. The latest tagged release version is published in the Python Package Index.
  • Version support. This library supports Python 3.5 and higher. As this library serves as a low-level toolkit for a variety of applications, many of which strive for broad compatibility with older Python versions, there is no rush to remove older intepreter support. Removing support for older versions should be well justified in that the maintenance burden has become too high.
  • Python 2. Python 2 is supported by version 2.x of this library. While active development of the version 2.x series has ended, notable issues being corrected may be backported to 2.x. Use "idna<3" in your requirements file if you need this library for a Python 2 application.
  • Testing. The library has a test suite based on each rule of the IDNA specification, as well as tests that are provided as part of the Unicode Technical Standard 46, Unicode IDNA Compatibility Processing.
  • Emoji. It is an occasional request to support emoji domains in this library. Encoding of symbols like emoji is expressly prohibited by the technical standard IDNA 2008 and emoji domains are broadly phased out across the domain industry due to associated security risks. For now, applications that wish need to support these non-compliant labels may wish to consider trying the encode/decode operation in this library first, and then falling back to using encodings.idna. See the Github project for more discussion.
Owner
Kim Davies
Kim Davies
Vuln Scanner With Python

VulnScanner Features Web Application Firewall (WAF) detection. Cross Site Scripting (XSS) tests. SQL injection time based test. SQL injection error ba

< / N u l l S 0 U L > 1 Dec 25, 2021
CodeTest信息收集和漏洞利用工具

CodeTest信息收集和漏洞利用工具,可在进行渗透测试之时方便利用相关信息收集脚本进行信息的获取和验证工作,漏洞利用模块可选择需要测试的漏洞模块,或者选择所有模块测试,包含CVE-2020-14882, CVE-2020-2555等,可自己收集脚本后按照模板进行修改。

23 Mar 18, 2021
NexScanner is a tool which allows you to scan a website and find the admin login panel and sub-domains

NexScanner NexScanner is a tool which helps you scan a website for sub-domains and also to find login pages in the website like the admin login panel

8 Sep 03, 2022
A tool to crack a wifi password with a help of wordlist

A tool to crack a wifi password with a help of wordlist. This may take long to crack a wifi depending upon number of passwords your wordlist contains. Also it is slower as compared to social media ac

Saad 144 Dec 29, 2022
Uses Sharphound, Bloodhound and Neo4j to produce an actionable list of attack paths for targeted remediation.

GoodHound ______ ____ __ __ / ____/___ ____ ____/ / / / /___ __ ______ ____/ / / / __/ __ \/ __ \/ __

idna 352 Jan 02, 2023
Learning to compose soft prompts for compositional zero-shot learning.

Compositional Soft Prompting (CSP) Compositional soft prompting (CSP), a parameter-efficient learning technique to improve the zero-shot compositional

Bats Research 32 Jan 02, 2023
Python bindings to LibreSSL library

LibreSSL bindings for Python using CFFI Python3 bindings to LibreSSL using CFFI. It aims to provide interface to the most important bits of LibreSSL o

Alexander Kiselyov 1 Aug 02, 2022
Huskee: Malware made in Python for Educational purposes

𝐇𝐔𝐒𝐊𝐄𝐄 Caracteristicas: Discord Token Grabber Wifi Passwords Grabber Googl

chew 4 Aug 17, 2022
The self-hostable proxy tunnel

TTUN Server The self-hostable proxy tunnel. Running Running: docker run -e TUNNEL_DOMAIN=Your tunnel domain -e SECURE=True if using SSL ghcr.io/to

Tom van der Lee 2 Jan 11, 2022
聚合Github上已有的Poc或者Exp,CVE信息来自CVE官网。Auto Collect Poc Or CVE from Github by CVE ID.

PocOrExp in Github 聚合Github上已有的Poc或者Exp,CVE信息来自CVE官网 注意:只通过通用的CVE号聚合,因此对于MS17-010等Windows编号漏洞以及著名的有绰号的漏洞,还是自己检索一下比较好 Usage python3 exp.py -h usage: ex

567 Dec 30, 2022
MozDef: Mozilla Enterprise Defense Platform

MozDef: Documentation: https://mozdef.readthedocs.org/en/latest/ Give MozDef a Try in AWS: The following button will launch the Mozilla Enterprise Def

Mozilla 2.2k Jan 08, 2023
User-friendly reference finder in IDA

IDARefHunter Updated: This project's been introduced on IDA Plugin Contest 2021! Why do we need RefHunter? Getting reference information in one specif

Jiwon 29 Dec 04, 2022
一款辅助探测Orderby注入漏洞的BurpSuite插件,Python3编写,适用于上xray等扫描器被ban的场景

OrderbyHunter 一款辅助探测Orderby注入漏洞的BurpSuite插件,Python3编写,适用于上xray等扫描器被ban的场景 1. 支持Get/Post型请求参数的探测,被动探测,对于存在Orderby注入的请求将会在HTTP Histroy里标红 2. 自定义排序参数list

Automne 21 Aug 12, 2022
Getting my gitlab commit history into github

🔰 ᵀᴱᴸᴱᴳᴿᴬᴹ ᴴᴬᶜᴷ ᴮᴼᵀ 🔰 The owner would not be responsible for any kind of bans due to the bot. • ⚡ INSTALLING ⚡ • • 🛠️ Lᴀɴɢᴜᴀɢᴇs Aɴᴅ Tᴏᴏʟs 🔰 • If

Santiago Chiesa 1 Dec 24, 2021
Automated tool to find & created Exploit Poc for Clickjacking Vulnerability

ClickJackPoc This tool will help you automate finding Clickjacking Vulnerability by just passing a file containing list of Targets . Once the Target i

Chirag Agrawal 24 Dec 19, 2022
Simples brute forcer de diretorios para web pentest.

🦑 dirbruter Simples brute forcer de diretorios para web pentest. ❕ Atenção Não ataque sites privados. Isto é illegal. 🖥️ Pré-requisitos Ultima versã

Dio brando 6 Jan 22, 2022
GRR Rapid Response: remote live forensics for incident response

GRR Rapid Response is an incident response framework focused on remote live forensics. Build Type Status Tests End-to-end Tests Windows Templates Linu

Google 4.3k Jan 05, 2023
HatSploit native powerful payload generation and shellcode injection tool that provides support for common platforms and architectures.

HatVenom HatSploit native powerful payload generation and shellcode injection tool that provides support for common platforms and architectures. Featu

EntySec 100 Dec 23, 2022
This is simple python FTP password craker. To crack FTP login using wordlist based brute force attack

This is simple python FTP password craker. To crack FTP login using wordlist based brute force attack

Varun Jagtap 5 Oct 08, 2022
DCSync - DCSync Attack from Outside using Impacket

Adding DCSync Permissions Mostly copypasta from https://github.com/tothi/rbcd-at

n00py 77 Dec 16, 2022