5 new improvements in Apache ShardingSphere

ShardingSphere supports multiple databases, and each improvement in the 5.2.0 release takes more work off your hands.

Opensource.com

Apache ShardingSphere, a powerful distributed database, recently released a major update to optimize and enhance its features, performance, testing, documentation, and examples. In short, the project continues to work hard at development to make it easier for you to manage your organization's data.

1. SQL audit for data sharding

The problem: When a user executes an SQL query without a sharding condition in a large-scale data sharding scenario, the query is routed to every underlying database for execution. As a result, many database connections are occupied, and the business is severely affected by timeouts or other issues. Worse still, if the user performs an UPDATE or DELETE operation, a large amount of data may be incorrectly updated or deleted.

ShardingSphere's solution: As of version 5.2.0, ShardingSphere provides the SQL audit for data sharding feature and allows users to configure audit strategies. A strategy can specify multiple audit algorithms, and users can decide whether the audit rules may be disabled. SQL execution is strictly prohibited if any audit algorithm fails to pass.

Here's the configuration of an SQL audit for data sharding:

rules:
- !SHARDING
  tables:
    t_order:
      actualDataNodes: ds_${0..1}.t_order_${0..1}
      tableStrategy:
        standard:
          shardingColumn: order_id
          shardingAlgorithmName: t_order_inline
      auditStrategy:
        auditorNames:
          - sharding_key_required_auditor
        allowHintDisable: true
  defaultAuditStrategy:
    auditorNames:
      - sharding_key_required_auditor
    allowHintDisable: true
  auditors:
    sharding_key_required_auditor:
      type: DML_SHARDING_CONDITIONS

ShardingSphere has a built-in audit algorithm that blocks full-route DML statements, that is, DML statements that carry no sharding condition. You can also implement the ShardingAuditAlgorithm interface to build more advanced SQL audit functions.

Because business scenarios can be complex, this new feature also lets you disable an audit algorithm dynamically with an SQL hint so that specific business SQL statements can still be executed:

/* ShardingSphere hint: disableAuditNames=sharding_key_required_auditor */ SELECT * FROM t_order;
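
The way an audit strategy gates execution can be sketched in a few lines of Python. This is an illustrative model only, not ShardingSphere's implementation: the function names, the naive WHERE-clause check, and the comma-separated hint parsing are all assumptions made for the example.

```python
import re


def has_sharding_condition(sql, sharding_column):
    """Naive check (assumption): does the WHERE clause mention the sharding column?"""
    where = re.search(r"\bWHERE\b(.*)", sql, flags=re.IGNORECASE | re.DOTALL)
    if where is None:
        return False
    column = rf"\b{re.escape(sharding_column)}\b"
    return re.search(column, where.group(1), flags=re.IGNORECASE) is not None


def disabled_by_hint(sql):
    """Return auditor names listed in a disableAuditNames hint (comma-separated here)."""
    hint = re.search(r"disableAuditNames=([\w,]+)", sql)
    return set(hint.group(1).split(",")) if hint else set()


def audit(sql, auditors, allow_hint_disable=True):
    """Raise PermissionError if any active auditor rejects the statement."""
    skipped = disabled_by_hint(sql) if allow_hint_disable else set()
    for name, check in auditors.items():
        if name not in skipped and not check(sql):
            raise PermissionError(f"SQL rejected by auditor {name!r}")
    return True


# Hypothetical registry mirroring the auditor name used in the configuration.
AUDITORS = {
    "sharding_key_required_auditor": lambda sql: has_sharding_condition(sql, "order_id"),
}
```

In this model, `audit("SELECT * FROM t_order", AUDITORS)` raises an error, while the same statement prefixed with the hint passes as long as `allow_hint_disable` is true.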

2. SQL execution process management

MySQL provides a SHOW PROCESSLIST statement that lets you view the currently running threads. When an SQL statement runs too long and needs to be terminated, you can kill its thread with the KILL statement.

MySQL KILL command result

(Duan Zhengqiang, CC BY-SA 4.0)

The SHOW PROCESSLIST and KILL statements are widely used in daily operation and maintenance management. To enhance your ability to manage ShardingSphere, version 5.2.0 supports the MySQL SHOW PROCESSLIST and KILL statements. When you execute a DDL/DML statement through ShardingSphere, ShardingSphere automatically generates a UUID as the execution ID and stores the SQL execution information in each instance.

When you execute the SHOW PROCESSLIST statement, ShardingSphere processes the SQL execution information based on the current operating mode.

If the current mode is cluster mode, ShardingSphere collects and synchronizes the SQL execution information of each compute node through the governance center and then returns the summary to the user. If the current mode is the standalone mode, ShardingSphere only returns SQL execution information in the current compute node.
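
The difference between the two modes can be modeled roughly as follows. This is a simplified sketch, not ShardingSphere code: `ComputeNode`, `show_processlist`, and the mode strings are hypothetical names, and the governance center is reduced to a plain list of nodes.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class ComputeNode:
    """Hypothetical compute node that tracks its in-flight SQL statements."""
    name: str
    processes: dict = field(default_factory=dict)

    def execute(self, sql):
        # A unique ID is generated for every statement and stored with its SQL.
        process_id = uuid.uuid4().hex
        self.processes[process_id] = sql
        return process_id


def show_processlist(current, cluster, mode):
    """Return execution info from all nodes (Cluster) or only this node (Standalone)."""
    if mode == "Cluster":
        merged = {}
        for node in cluster:  # stands in for collection via the governance center
            merged.update(node.processes)
        return merged
    if mode == "Standalone":
        return dict(current.processes)
    raise ValueError(f"unknown mode: {mode}")
```

With two nodes that each run one statement, the cluster view lists both statements, while the standalone view lists only the statement on the node that received the request.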

You can then decide whether to execute a KILL statement based on the result returned by SHOW PROCESSLIST. ShardingSphere cancels the running SQL that matches the ID given in the KILL statement.
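
Cancelling by ID can be pictured as a lookup from the execution ID to a cancellation flag that the running statement checks. The sketch below is a generic illustration under that assumption. `ProcessRegistry` and `run_statement` are hypothetical names, and real cancellation in ShardingSphere involves the underlying database connections.

```python
import threading
import uuid


class ProcessRegistry:
    """Hypothetical registry mapping execution IDs to cancellation flags."""

    def __init__(self):
        self._cancel_flags = {}

    def register(self):
        process_id = uuid.uuid4().hex
        self._cancel_flags[process_id] = threading.Event()
        return process_id

    def kill(self, process_id):
        """Request cancellation; return False if the ID is unknown."""
        flag = self._cancel_flags.get(process_id)
        if flag is None:
            return False
        flag.set()
        return True

    def is_cancelled(self, process_id):
        return self._cancel_flags[process_id].is_set()

    def finish(self, process_id):
        self._cancel_flags.pop(process_id, None)


def run_statement(registry, process_id, steps):
    """Simulate a long statement that checks for cancellation between steps."""
    completed = 0
    for _ in range(steps):
        if registry.is_cancelled(process_id):
            break
        completed += 1
    registry.finish(process_id)
    return completed
```

A statement that is killed before it starts completes zero steps, and a KILL for an ID that no longer exists simply reports that nothing was found.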

3. Shardingsphere-on-cloud

Shardingsphere-on-cloud is a project of Apache ShardingSphere providing cloud-oriented solutions. Version 0.1.0 has been released, and it has been officially voted as a sub-project of Apache ShardingSphere.

Shardingsphere-on-cloud will continue releasing configuration templates, deployment scripts, and other automation tools for ShardingSphere on the cloud.

It will also polish engineering practices for high availability, data migration, observability, shadow DB, security, and audit; optimize the delivery mode of Helm Charts; and continue to enhance its cloud-native management capabilities through the Kubernetes Operator. The project repository already has introductory issues to help those interested in Go, databases, and the cloud get up and running quickly.

4. Access port

In version 5.2.0, ShardingSphere-Proxy can listen on specified IP addresses and integrates the openGauss database driver by default. ShardingSphere-JDBC supports c3p0 data sources, and Connection.prepareStatement can specify the columns.

5. Distributed transaction

The original logical database-level transaction manager has been adjusted to a global manager, supporting distributed transactions across multiple logical databases. XA transactions are now automatically managed by ShardingSphere, which removes the XA statement's ability to control distributed transactions.
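
XA transactions are built on two-phase commit, so a global manager that spans several logical databases can be pictured with a generic two-phase commit sketch. This is not ShardingSphere's transaction manager: the class names, the `healthy` flag, and the state strings are assumptions for illustration.

```python
class LogicalDatabase:
    """Hypothetical participant; 'healthy' decides how it votes in the prepare phase."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.state = "idle"

    def prepare(self):
        self.state = "prepared" if self.healthy else "failed"
        return self.healthy

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled_back"


class GlobalTransactionManager:
    """Coordinates one transaction across several logical databases."""

    def commit(self, participants):
        # Phase 1: every participant must vote yes.
        for database in participants:
            if not database.prepare():
                # Any "no" vote aborts the transaction everywhere.
                for participant in participants:
                    participant.rollback()
                return False
        # Phase 2: all votes were yes, so commit everywhere.
        for database in participants:
            database.commit()
        return True
```

Because one manager sees all participants, a failure in any logical database rolls back the work in the others as well, which a per-database manager could not guarantee.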

Use ShardingSphere for distributed data

ShardingSphere supports multiple databases, and each improvement takes more work off your hands. The establishment of the shardingsphere-on-cloud sub-project shows ShardingSphere's commitment to being cloud-native. The greater ShardingSphere community welcomes anyone interested in Go, databases, and the cloud to join the shardingsphere-on-cloud sub-project!


The article was first published on Medium.com and has been republished with permission.

Yacine Si Tayeb
I am passionate about technology and innovation. I moved to Beijing to pursue my Ph.D. in Management and fell in awe of the local startup and tech scene. My career path has so far been shaped by opportunities at the intersection of technology and business.


This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.