Water Information Code
Encoding Structure
The water information code is the unique identifier for water conservancy elements. It adopts a three-part structure, "proprietary identification domain + standard domain + extended domain", to standardize the structure and content of river entity identity codes. The domains are separated by "/".
The "proprietary identification domain" consists of a 2-character root identifier code, a 4-digit river entity-specific code, a 4-digit registration service node code, and a 6-digit coding rules and version number code. It characterizes the proprietary identification, service area, and version of river entities under the MA international identification system.
The "standard domain" uniquely identifies each river segment entity. It consists of a 1-digit continent code, a 3-digit basin or basin area code, a variable-length sub-basin code, and a variable-length river segment binary tree code.
The "extended domain" is a variable-length code used to record the main attributes of river segments and the coding of other related elements, meeting the "one code, multiple forms" usage requirements of river entity identity coding.
River Entity Proprietary Identification Domain Coding Rules
The proprietary identification domain consists of 4 parts separated by ".", as shown in the figure below.
The root identifier code is set according to international standard ISO/IEC 15459; it is a 2-letter alphabetic code with the value MA. The MA identification system is China's first independently controllable international standard identification system with global root node management rights and code resource allocation rights, used for globally unique identification of any type of object.
The river entity-specific code is a 4-digit numeric code, the exclusive code of river entities in the MA identification system, with the value 1002.
The river entity's registration service node code is a 4-digit numeric code. The 1st digit is the continent code, with coding rules shown in Table 1; digits 2 to 4 are the basin or basin area code, with the same coding rules as the corresponding part of the standard domain. When the registration service node is a continental root node, digits 2 to 4 of the registration service node code are 000.
Table 1. Continent codes

| Continent | Europe | Africa | Asia | Oceania | North America | South America | Antarctica |
|---|---|---|---|---|---|---|---|
| Code | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
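The registration service node rule can be sketched in a few lines of Python. This is a minimal illustration only: the function name, the dictionary, and the convention of passing basin 0 for a continental root node are assumptions made for the example, not part of the coding rules.

```python
# Continent codes as listed in Table 1.
CONTINENT_CODES = {
    "Europe": 1, "Africa": 2, "Asia": 3, "Oceania": 4,
    "North America": 5, "South America": 6, "Antarctica": 7,
}

def registration_node_code(continent: str, basin: int = 0) -> str:
    """Return the 4-digit node code; basin=0 stands for a continental root node."""
    if not 0 <= basin <= 999:
        raise ValueError("basin code must fit in 3 digits")
    # 1 digit for the continent, then the basin code zero-padded to 3 digits.
    return f"{CONTINENT_CODES[continent]}{basin:03d}"
```

Under these assumptions, the Asian continental root node would be coded 3000, and a node serving basin 001 in Asia would be coded 3001.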
The coding rules and version number code is a 6-digit numeric code. Digits 1 to 2 give the coding rules adopted by the current river entity identity code; Tsinghua University's binary tree river network coding rules take the value 10. Digits 3 to 4 give the spatial resolution of the surface elevation data from which the river network is derived, keeping only the integer part in meters; for example, 12.5-meter resolution data takes the value 12. Digits 5 to 6 give the version number; the first version takes the value 01. The version description should state the spatiotemporal reference adopted by the data and be published on the river entity identity coding registration and resolution website. Under these rules, the first version of a river network that uses Tsinghua University's binary tree coding and is derived from 12.5-meter resolution surface elevation data has the code 101201.
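The composition of this 6-digit code can be illustrated with a short Python sketch. The function and parameter names are hypothetical, and rejecting values that do not fit in two digits (for example a resolution of 100 meters or more) is an assumption, since the text does not say how such cases are handled.

```python
import math

def rules_version_code(rule: int, resolution_m: float, version: int) -> str:
    """Build the 6-digit coding rules and version number code."""
    res = math.floor(resolution_m)  # keep only the integer meter value
    for name, value in (("rule", rule), ("resolution", res), ("version", version)):
        if not 0 <= value <= 99:
            raise ValueError(f"{name} must fit in 2 digits")
    # Two digits each: coding rules, resolution, version number.
    return f"{rule:02d}{res:02d}{version:02d}"
```

Calling `rules_version_code(10, 12.5, 1)` reproduces the worked example from the text, 101201.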
River Entity Standard Domain Coding Rules
Basin Coding
The river entity standard domain code is the globally unique code of each river segment entity. It consists of 4 parts: a 1-digit continent code, a 3-digit basin or basin area code, a variable-length sub-basin code, and a variable-length river segment binary tree code. A "." precedes the river segment binary tree code and separates its two components, as shown in the figure below.

The continent code follows the same rules and values as the 1st digit of the river entity's registration service node code. The basin or basin area code is a 3-digit numeric code, identical to digits 2 to 4 of the registration service node code; it is unique among the independent basins and basin areas of its continent. The coding principle is shown in the figure below.
Within a continent, independent basins and basin areas are coded in clockwise order starting from due north, beginning with 001 or 002. Odd numbers denote independent basins, which flow into the sea through a concentrated basin outlet. Even numbers denote basin areas, which lie between two independent basins and flow into the sea through multiple dispersed outlets. Endorheic areas and island (or archipelago) basins are coded with even numbers greater than 800.
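The parity rule lends itself to a small classifier. The sketch below is one possible reading of the rule in Python; the function name and the returned labels are illustrative, and treating 000 as invalid in this context is an assumption (the text reserves 000 for continental root nodes in the registration service node code).

```python
def classify_basin_code(code: str) -> str:
    """Classify a 3-digit basin or basin area code by the parity rule."""
    if len(code) != 3 or not (code.isascii() and code.isdigit()):
        raise ValueError("expected a 3-digit code")
    n = int(code)
    if n == 0:
        raise ValueError("000 is reserved for continental root nodes")
    if n % 2 == 1:
        return "independent basin"
    if n > 800:
        # Even numbers greater than 800.
        return "endorheic area or island basin"
    return "basin area"
```

For example, 001 classifies as an independent basin, 002 as a basin area, and 802 as an endorheic area or island basin.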
The sub-basin code is a variable-length numeric code that further subdivides independent basins, basin areas, endorheic areas, and island basins. It uniquely identifies the basin of a main stream or of a tributary at any level within such an area. The sub-basin coding rules for the main stream and the tributaries at different levels of an independent basin are shown in the figure below.
For the main stream of an independent basin, the sub-basin code has zero digits, that is, it is empty. In the example in the figure, 001 is the basin code of the independent basin. Starting from the basin outlet, first-level sub-basins of a certain scale are selected in sequence from downstream to upstream, and each is assigned a 3-digit numeric code, its first-level sub-basin code.
Within a first-level sub-basin, second-level sub-basins of a certain scale are selected in the same way and assigned 3-digit second-level sub-basin codes. Subdivision continues until every sub-basin can be expressed by the river segment binary tree code. The codes of all levels are concatenated in sequence to form the sub-basin code, so its length depends on the sub-basin level: 3 digits for a first-level sub-basin, 6 digits for a second-level sub-basin, and so on.
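The level-by-level concatenation can be sketched as follows. This is an illustrative Python fragment under stated assumptions: the function names are hypothetical, and the text does not specify the range of values a level code may take, so the sketch only checks that each one fits in 3 digits.

```python
def sub_basin_code(levels: list[int]) -> str:
    """Concatenate one 3-digit code per sub-basin level (empty for the main stream)."""
    for n in levels:
        if not 0 <= n <= 999:
            raise ValueError("each level code must fit in 3 digits")
    return "".join(f"{n:03d}" for n in levels)

def sub_basin_level(code: str) -> int:
    """Return the sub-basin level implied by the code length (3 digits per level)."""
    if len(code) % 3 != 0 or (code and not (code.isascii() and code.isdigit())):
        raise ValueError("sub-basin code must be a multiple of 3 digits")
    return len(code) // 3
```

With these helpers, a second-level sub-basin numbered 12 inside first-level sub-basin 3 would carry the 6-digit sub-basin code 003012, and the main stream would carry an empty sub-basin code.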
Areas with even basin or basin area codes (basin areas, endorheic areas, and island basins) are first subdivided further by the same method used to delineate independent basins, basin areas, endorheic areas, and island basins. Each level of sub-region is assigned a 3-digit sub-basin or basin area code, and the codes are concatenated in sequence. Once a sub-region at some level is identified as an independent basin, the sub-basin coding rules for independent basins described above apply from that level on, until the sub-basin scale meets requirements.
Binary Tree Coding
The river segment binary tree code is a variable-length numeric code made up of two components separated by "."; it uniquely identifies each river segment within an independent basin or sub-basin. The method abstracts the river network as a binary tree, takes the outlet river segment of the independent basin or sub-basin as the root of the tree, and codes each segment with a length component and a value component based on the topology of main stream and tributary confluences and on upstream-downstream inheritance, as shown in the figure below. For each independent basin or sub-basin, the length component starts from 1 and the value component starts from 0.

It is estimated that, worldwide, the basin or basin area code and the sub-basin code together generally total fewer than 30 digits. The length component of the river segment binary tree code does not exceed the depth of the basin's river network counted in river segments, and generally has fewer than 5 digits; the value component does not exceed the value range of a 64-bit long integer, and has fewer than 20 digits.
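The bounds stated here can be checked mechanically. The Python sketch below parses a binary tree code and validates both components; it assumes the length component is written before the value component and that the 64-bit long integer is signed, neither of which the text states explicitly. It does not derive codes from river network topology, which depends on rules shown only in the figure.

```python
INT64_MAX = 2**63 - 1  # largest signed 64-bit integer, 19 decimal digits

def parse_binary_tree_code(code: str) -> tuple[int, int]:
    """Split 'length.value' and check the bounds given in the text."""
    # Unpacking raises ValueError unless there are exactly two components.
    length_str, value_str = code.split(".")
    length, value = int(length_str), int(value_str)
    if length < 1:
        raise ValueError("length component starts from 1")
    if not 0 <= value <= INT64_MAX:
        raise ValueError("value component must fit in a 64-bit long integer")
    return length, value
```

Under these assumptions, the outlet river segment of a basin, where both components take their starting values, would be written 1.0.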
River Entity Extended Domain Coding Rules
The extended domain consists of several extended codes used to record the main attributes of this river segment and the coding of related entities, achieving the extension of river entity attributes and functions. The extended domain is coded through markers and marker content, with different extended content separated by ".".
A river segment attribute extended code consists of a two-character marker followed by marker content of variable length; the rules should be published on the river entity identity coding registration and resolution website. Extended attributes can include the following. River hierarchy and level, both expressed as Horton-Strahler river levels, use the marker HS, with two groups of two digits as content, such as HS0303. The longitude and latitude of a river segment's downstream outlet use the marker OP (Outlet Position); the content gives longitude and then latitude, each as a 1-letter direction followed by 7 digits of degrees, minutes, and seconds, such as OPE1154841N0412331.
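Decoding the two example extended codes might look like the Python sketch below. The function names are hypothetical. The 7 digits are read as DDDMMSS, which matches the example; the letters W and S and the negative sign convention for them are assumptions, since the text only shows E and N.

```python
def decode_hs(code: str) -> tuple[int, int]:
    """Decode 'HS' + four digits into two 2-digit Horton-Strahler values."""
    if len(code) != 6 or not code.startswith("HS") or not code[2:].isdigit():
        raise ValueError("expected HS followed by four digits")
    return int(code[2:4]), int(code[4:6])

def _dms_to_degrees(dms: str) -> float:
    """Convert a 7-digit DDDMMSS string to decimal degrees."""
    d, m, s = int(dms[0:3]), int(dms[3:5]), int(dms[5:7])
    return d + m / 60 + s / 3600

def decode_op(code: str) -> tuple[float, float]:
    """Decode an OP code into (longitude, latitude) in decimal degrees."""
    if len(code) != 18 or not code.startswith("OP"):
        raise ValueError("expected OP followed by two 8-character coordinates")
    lon_dir, lon, lat_dir, lat = code[2], code[3:10], code[10], code[11:18]
    if lon_dir not in "EW" or lat_dir not in "NS":
        raise ValueError("unexpected direction letter")
    # Assumption: west and south are returned as negative values.
    lon_deg = _dms_to_degrees(lon) * (1 if lon_dir == "E" else -1)
    lat_deg = _dms_to_degrees(lat) * (1 if lat_dir == "N" else -1)
    return lon_deg, lat_deg
```

Applied to the examples in the text, `decode_hs("HS0303")` yields the pair (3, 3), and `decode_op("OPE1154841N0412331")` yields a longitude of 115°48'41" E and a latitude of 41°23'31" N expressed in decimal degrees.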
The extended domain can also record other forms of river coding, or record the coding of water-related elements (water conservancy projects, monitoring stations, and other management objects) on corresponding river segments, achieving coding associations between objects. When applied in China, water conservancy object types and classification codes should refer to the "General Principles for Classification and Coding of Water Conservancy Objects" (SL/T213-2020), but with SL as the starting marker; specific coding rules for various water conservancy objects should refer to current standards, such as "Chinese River Codes" (SL249-2012), "Lake Codes" (SL261-2017), etc.
Representation Form
The three coding domains of the river entity identity code are connected in sequence to form its complete representation, as shown in the figure below.
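To make the separators concrete, the following Python sketch assembles a complete code from its parts and splits it again. The function names and the sample values passed to them are illustrative only and do not identify a real river segment; handling an empty extended domain as an empty list is likewise an assumption.

```python
def assemble_code(proprietary: list[str], basin: str,
                  tree: tuple[int, int], extensions: list[str]) -> str:
    """Join the three domains with '/' and the parts within each with '.'."""
    # basin = continent code + basin code + sub-basin code, written without separators;
    # a '.' precedes the binary tree code and separates its two components.
    standard = f"{basin}.{tree[0]}.{tree[1]}"
    return "/".join([".".join(proprietary), standard, ".".join(extensions)])

def split_code(code: str) -> dict[str, list[str]]:
    """Split a complete code into its three domains and their '.'-separated parts."""
    proprietary, standard, extended = code.split("/")
    return {
        "proprietary": proprietary.split("."),
        "standard": standard.split("."),
        "extended": extended.split(".") if extended else [],
    }
```

For instance, `assemble_code(["MA", "1002", "3000", "101201"], "3001", (1, 0), ["HS0303", "OPE1154841N0412331"])` produces `MA.1002.3000.101201/3001.1.0/HS0303.OPE1154841N0412331`, and `split_code` recovers the same parts from that string.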
