Background Ever since the release of , has been providing strong dynamic management capabilities to the ShardingSphere ecosystem. ShardingSphere 5.0.0 DistSQL Thanks to DistSQL, users have been empowered to do the following: Create logical databases online. Dynamically configure rules (i.e. sharding, data encryption, read/write splitting, database discovery, shadow DB, and global rules). Adjust storage resources in real-time. Switch transaction types instantly. Turn SQL logs on and off at any time. Preview the SQL routing results. At the same time, in the context of increasingly diversified scenarios, more and more DistSQL features are being created and a variety of valuable syntaxes have been gaining popularity among users. Overview This post takes data sharding as an example to illustrate DistSQL’s application scenarios and related sharding methods. A series of DistSQL statements are concatenated through practical cases, to give you a complete set of practical DistSQL sharding management schemes, which create and maintain distributed databases through dynamic management. The following DistSQL will be used in this example: Practical Case Required scenarios Create two sharding tables: and . t_order t_order_item For both tables, database shards are carried out with the field, and table shards with the field. user_id order_id The number of shards is 2 databases * 3 tables. As shown in the figure below: Setting up the environment Prepare a MySQL database instance for access. Create two new databases: and . demo_ds_0 demo_ds_1 Here we take MySQL as an example, but you can also use PostgreSQL or openGauss databases. Deploy and . ZooKeeper acts as a governance center and stores ShardingSphere metadata information. Apache ShardingSphere-Proxy 5.1.2 Apache ZooKeeper Configure in the directory as follows: server.yaml Proxy conf mode: type: Cluster repository: type: ZooKeeper props: namespace: governance_ds server-lists: localhost:2181 # ZooKeeper address retryIntervalMilliseconds: 500 timeToLiveSeconds: 60 maxRetries: 3 operationTimeoutMilliseconds: 500 overwrite: false rules: - !AUTHORITY users: - root@%:root Start ShardingSphere-Proxy and connect it to Proxy using a client, for example: mysql -h 127.0.0.1 -P 3307 -u root -p Creating a distributed database CREATE DATABASE sharding_db; USE sharding_db; Adding storage resources Add storage resources corresponding to the prepared database. MySQL ADD RESOURCE ds_0 ( HOST=127.0.0.1, PORT=3306, DB=demo_ds_0, USER=root, PASSWORD=123456 ), ds_1( HOST=127.0.0.1, PORT=3306, DB=demo_ds_1, USER=root, PASSWORD=123456 ); View storage resources mysql> SHOW DATABASE RESOURCES\G; *************************** 1. row *************************** name: ds_1 type: MySQL host: 127.0.0.1 port: 3306 db: demo_ds_1 -- Omit partial attributes *************************** 2. row *************************** name: ds_0 type: MySQL host: 127.0.0.1 port: 3306 db: demo_ds_0 -- Omit partial attributes Adding \G to the query statement aims to make the output format more readable, and it is not a must. ShardingSphere’s sharding rules support regular sharding and automatic sharding. Both sharding methods have the same effect. The difference is that the configuration of automatic sharding is more concise, while regular sharding is more flexible and independent. Creating sharding rules Please refer to the following links for more details on automatic sharding: Intro to DistSQL-An Open Source and More Powerful SQL AutoTable: Your Butler-Like Sharding Configuration Tool Next, we’ll adopt regular sharding and use the expression algorithm to implement the sharding scenarios described in the requirements. INLINE Primary key generator The primary key generator can generate a secure and unique primary key for a data table in a distributed scenario. For details, refer to the document . Distributed Primary Key Create the primary key generator. CREATE SHARDING KEY GENERATOR snowflake_key_generator ( TYPE(NAME=SNOWFLAKE) ); Query primary key generator mysql> SHOW SHARDING KEY GENERATORS; +-------------------------+-----------+-------+ | name | type | props | +-------------------------+-----------+-------+ | snowflake_key_generator | snowflake | {} | +-------------------------+-----------+-------+ 1 row in set (0.01 sec) Sharding algorithm Create a database sharding algorithm, used by and in common. t_order t_order_item -- Modulo 2 based on user_id in database sharding CREATE SHARDING ALGORITHM database_inline ( TYPE(NAME=INLINE,PROPERTIES("algorithm-expression"="ds_${user_id % 2}")) ); Create different table shards algorithms for and . t_order t_order_item -- Modulo 3 based on order_id in table sharding CREATE SHARDING ALGORITHM t_order_inline ( TYPE(NAME=INLINE,PROPERTIES("algorithm-expression"="t_order_${order_id % 3}")) ); CREATE SHARDING ALGORITHM t_order_item_inline ( TYPE(NAME=INLINE,PROPERTIES("algorithm-expression"="t_order_item_${order_id % 3}")) ); Query sharding algorithm mysql> SHOW SHARDING ALGORITHMS; +---------------------+--------+---------------------------------------------------+ | name | type | props | +---------------------+--------+---------------------------------------------------+ | database_inline | inline | algorithm-expression=ds_${user_id % 2} | | t_order_inline | inline | algorithm-expression=t_order_${order_id % 3} | | t_order_item_inline | inline | algorithm-expression=t_order_item_${order_id % 3} | +---------------------+--------+---------------------------------------------------+ 3 rows in set (0.00 sec) Default sharding strategy Sharding strategy consists of sharding key and sharding algorithm. Please refer to for its concept. Sharding Strategy Sharding strategy consists of and . databaseStrategy tableStrategy Since and have the same database sharding field and sharding algorithm, we create a default strategy that will be used by all shard tables with no sharding strategy configured: t_order t_order_item Create a default database sharding strategy CREATE DEFAULT SHARDING DATABASE STRATEGY ( TYPE=STANDARD,SHARDING_COLUMN=user_id,SHARDING_ALGORITHM=database_inline ); Query default strategy mysql> SHOW DEFAULT SHARDING STRATEGY\G; *************************** 1. row *************************** name: TABLE type: NONE sharding_column: sharding_algorithm_name: sharding_algorithm_type: sharding_algorithm_props: *************************** 2. row *************************** name: DATABASE type: STANDARD sharding_column: user_id sharding_algorithm_name: database_inline sharding_algorithm_type: inline sharding_algorithm_props: {algorithm-expression=ds_${user_id % 2}} 2 rows in set (0.00 sec) The default table sharding strategy is not configured, so the default strategy of is . TABLE NONE Sharding rules The primary key generator and sharding algorithm are both ready. Now create sharding rules. 1. t_order CREATE SHARDING TABLE RULE t_order ( DATANODES("ds_${0..1}.t_order_${0..2}"), TABLE_STRATEGY(TYPE=STANDARD,SHARDING_COLUMN=order_id,SHARDING_ALGORITHM=t_order_inline), KEY_GENERATE_STRATEGY(COLUMN=order_id,KEY_GENERATOR=snowflake_key_generator) ); specifies the data nodes of shard tables. specifies the table strategy, among which uses created sharding algorithm ; specifies the primary key generation strategy of the table. Skip this configuration if primary key generation is not required. DATANODES TABLE_STRATEGY SHARDING_ALGORITHM t_order_inline KEY_GENERATE_STRATEGY 2. t_order_item CREATE SHARDING TABLE RULE t_order_item ( DATANODES("ds_${0..1}.t_order_item_${0..2}"), TABLE_STRATEGY(TYPE=STANDARD,SHARDING_COLUMN=order_id,SHARDING_ALGORITHM=t_order_item_inline), KEY_GENERATE_STRATEGY(COLUMN=order_item_id,KEY_GENERATOR=snowflake_key_generator) ); Query sharding rules mysql> SHOW SHARDING TABLE RULES\G; *************************** 1. row *************************** table: t_order actual_data_nodes: ds_${0..1}.t_order_${0..2} actual_data_sources: database_strategy_type: STANDARD database_sharding_column: user_id database_sharding_algorithm_type: inline database_sharding_algorithm_props: algorithm-expression=ds_${user_id % 2} table_strategy_type: STANDARD table_sharding_column: order_id table_sharding_algorithm_type: inline table_sharding_algorithm_props: algorithm-expression=t_order_${order_id % 3} key_generate_column: order_id key_generator_type: snowflake key_generator_props: *************************** 2. row *************************** table: t_order_item actual_data_nodes: ds_${0..1}.t_order_item_${0..2} actual_data_sources: database_strategy_type: STANDARD database_sharding_column: user_id database_sharding_algorithm_type: inline database_sharding_algorithm_props: algorithm-expression=ds_${user_id % 2} table_strategy_type: STANDARD table_sharding_column: order_id table_sharding_algorithm_type: inline table_sharding_algorithm_props: algorithm-expression=t_order_item_${order_id % 3} key_generate_column: order_item_id key_generator_type: snowflake key_generator_props: 2 rows in set (0.00 sec) 💡So far, the sharding rules for and have been configured. t_order t_order_item A bit complicated? Well, you can also skip the steps of creating the primary key generator, sharding algorithm, and default strategy, and complete the sharding rules in one step. Let's see how to make it easier. Syntax Now, if we have to add a shard table , we can create sharding rules as follows: t_order_detail CREATE SHARDING TABLE RULE t_order_detail ( DATANODES("ds_${0..1}.t_order_detail_${0..1}"), DATABASE_STRATEGY(TYPE=STANDARD,SHARDING_COLUMN=user_id,SHARDING_ALGORITHM(TYPE(NAME=INLINE,PROPERTIES("algorithm-expression"="ds_${user_id % 2}")))), TABLE_STRATEGY(TYPE=STANDARD,SHARDING_COLUMN=order_id,SHARDING_ALGORITHM(TYPE(NAME=INLINE,PROPERTIES("algorithm-expression"="t_order_detail_${order_id % 3}")))), KEY_GENERATE_STRATEGY(COLUMN=detail_id,TYPE(NAME=snowflake)) ); **Note: **The above statement specified database sharding strategy, table strategy, and primary key generation strategy, but it didn’t use existing algorithms. Therefore, the DistSQL engine automatically uses the input expression to create an algorithm for the sharding rules of . Now the primary key generator, sharding algorithm, and sharding rules are as follows: t_order_detail Primary key generator mysql> SHOW SHARDING KEY GENERATORS; +--------------------------+-----------+-------+ | name | type | props | +--------------------------+-----------+-------+ | snowflake_key_generator | snowflake | {} | | t_order_detail_snowflake | snowflake | {} | +--------------------------+-----------+-------+ 2 rows in set (0.00 sec) Sharding algorithm mysql> SHOW SHARDING ALGORITHMS; +--------------------------------+--------+-----------------------------------------------------+ | name | type | props | +--------------------------------+--------+-----------------------------------------------------+ | database_inline | inline | algorithm-expression=ds_${user_id % 2} | | t_order_inline | inline | algorithm-expression=t_order_${order_id % 3} | | t_order_item_inline | inline | algorithm-expression=t_order_item_${order_id % 3} | | t_order_detail_database_inline | inline | algorithm-expression=ds_${user_id % 2} | | t_order_detail_table_inline | inline | algorithm-expression=t_order_detail_${order_id % 3} | +--------------------------------+--------+-----------------------------------------------------+ 5 rows in set (0.00 sec) Sharding rules mysql> SHOW SHARDING TABLE RULES\G; *************************** 1. row *************************** table: t_order actual_data_nodes: ds_${0..1}.t_order_${0..2} actual_data_sources: database_strategy_type: STANDARD database_sharding_column: user_id database_sharding_algorithm_type: inline database_sharding_algorithm_props: algorithm-expression=ds_${user_id % 2} table_strategy_type: STANDARD table_sharding_column: order_id table_sharding_algorithm_type: inline table_sharding_algorithm_props: algorithm-expression=t_order_${order_id % 3} key_generate_column: order_id key_generator_type: snowflake key_generator_props: *************************** 2. row *************************** table: t_order_item actual_data_nodes: ds_${0..1}.t_order_item_${0..2} actual_data_sources: database_strategy_type: STANDARD database_sharding_column: user_id database_sharding_algorithm_type: inline database_sharding_algorithm_props: algorithm-expression=ds_${user_id % 2} table_strategy_type: STANDARD table_sharding_column: order_id table_sharding_algorithm_type: inline table_sharding_algorithm_props: algorithm-expression=t_order_item_${order_id % 3} key_generate_column: order_item_id key_generator_type: snowflake key_generator_props: *************************** 3. row *************************** table: t_order_detail actual_data_nodes: ds_${0..1}.t_order_detail_${0..1} actual_data_sources: database_strategy_type: STANDARD database_sharding_column: user_id database_sharding_algorithm_type: inline database_sharding_algorithm_props: algorithm-expression=ds_${user_id % 2} table_strategy_type: STANDARD table_sharding_column: order_id table_sharding_algorithm_type: inline table_sharding_algorithm_props: algorithm-expression=t_order_detail_${order_id % 3} key_generate_column: detail_id key_generator_type: snowflake key_generator_props: 3 rows in set (0.01 sec) In the statement, , , and can all reuse existing algorithms. Note: CREATE SHARDING TABLE RULE DATABASE_STRATEGY TABLE_STRATEGY KEY_GENERATE_STRATEGY Alternatively, they can be defined quickly through syntax. The difference is that additional algorithm objects are created. Users can use it flexibly based on scenarios. After the configuration verification rules are created, you can verify them in the following ways: Checking node distribution DistSQL provides for checking node distribution and users can quickly learn the distribution of shard tables. SHOW SHARDING TABLE NODES We can see the node distribution of the shard table is consistent with what is described in the requirement. SQL Preview Previewing SQL is also an easy way to verify configurations. Its syntax is : PREVIEW SQL 1. Query with no shard key with all routes Specify to query with a single database route user_id Specify and with a single table route user_id order_id Single-table routes scan the least shard tables and offer the highest efficiency. DistSQL auxiliary query During the system maintenance, algorithms or storage resources that are no longer in use may need to be released, or resources that need to be released may have been referenced and cannot be deleted. The following DistSQL can solve these problems. Query unused resources Syntax: SHOW UNUSED RESOURCES 2. Sample: Query unused primary key generator Syntax: SHOW UNUSED SHARDING KEY GENERATORS Sample: Query unused sharding algorithm Syntax: SHOW UNUSED SHARDING ALGORITHMS Sample: Query rules that use the target storage resources 1.Syntax: SHOW RULES USED RESOURCE 2.Sample: All rules that use the resource can be queried, not limited to the . Sharding Rule Query sharding rules that use the target primary key generator Syntax: SHOW SHARDING TABLE RULES USED KEY GENERATOR 2. Sample: Query sharding rules that use the target algorithm Syntax: SHOW SHARDING TABLE RULES USED ALGORITHM 2. Sample: Conclusion This post takes the data sharding scenario as an example to introduce ’s applications and methods. DistSQL DistSQL provides flexible syntax to help simplify operations. In addition to the algorithm, DistSQL supports standard sharding, compound sharding, Hint sharding, and custom sharding algorithms. More examples will be covered in the coming future. INLINE If you have any questions or suggestions about , please feel free to post them on the GitHub Issue list. Apache ShardingSphere Project Links: ShardingSphere Github ShardingSphere Twitter ShardingSphere Slack Contributor Guide GitHub Issues Contributor Guide References Concept-DistSQL Concept-Distributed Primary Key Concept-Sharding Strategy Concept INLINE Expression Built-in Sharding Algorithm User Manual: DistSQL Author Middleware R&D Engineer & Apache ShardingSphere Committer. Jiang Longtao SphereEx Longtao focuses on the R&D of DistSQL and related features. Also Published Here