## Overview

Metadata is data that describes data. In database terms, any data that describes the database itself is metadata: column names, database names, usernames, table names, and so on, as well as the system tables that store information about database objects. ShardingSphere core features such as data sharding, encryption, and decryption are all built on top of database metadata.

This shows that metadata is the core of the ShardingSphere system, and it is likewise the core data of every data-storage-related middleware or component. Injecting metadata gives the whole system a nerve center: combined with the metadata, the system can perform fine-grained operations on databases, tables, and columns, such as data sharding, data encryption, and SQL rewriting.

To follow the ShardingSphere metadata loading process, we first need to clarify the types and hierarchy of metadata in ShardingSphere. The metadata in ShardingSphere is organized around `ShardingSphereMetaData`, the core of which is `ShardingSphereSchema`, which is the metadata of the database and also the top-level object of the data source metadata. In the hierarchical structure of the database metadata in ShardingSphere, each upper layer is assembled from the data of the layer below it, so we analyze the layers bottom-up, one by one.

## ColumnMetaData and IndexMetaData

`ColumnMetaData` and `IndexMetaData` are the basic elements that make up `TableMetaData`. In the following, we analyze the structure of these two metadata types and their loading processes separately. `ColumnMetaData` has the following main structure:

```java
public final class ColumnMetaData {

    // Column name
    private final String name;

    // Column data type
    private final int dataType;

    // Whether the column is a primary key
    private final boolean primaryKey;

    // Whether the column value is auto-generated
    private final boolean generated;

    // Whether the column is case sensitive
    private final boolean caseSensitive;
}
```

The loading process is mainly encapsulated in the `org.apache.shardingsphere.infra.metadata.schema.builder.loader.ColumnMetaDataLoader#load` method. Its main flow is to obtain, through the database connection, the metadata matching a given table name and then load the metadata of all columns under that table. The core code is as follows:

```java
/**
 * Load column meta data list.
 *
 * @param connection connection
 * @param tableNamePattern table name pattern
 * @param databaseType database type
 * @return column meta data list
 * @throws SQLException SQL exception
 */
public static Collection<ColumnMetaData> load(final Connection connection, final String tableNamePattern, final DatabaseType databaseType) throws SQLException {
    Collection<ColumnMetaData> result = new LinkedList<>();
    Collection<String> primaryKeys = loadPrimaryKeys(connection, tableNamePattern);
    List<String> columnNames = new ArrayList<>();
    List<Integer> columnTypes = new ArrayList<>();
    List<String> columnTypeNames = new ArrayList<>();
    List<Boolean> isPrimaryKeys = new ArrayList<>();
    List<Boolean> isCaseSensitives = new ArrayList<>();
    try (ResultSet resultSet = connection.getMetaData().getColumns(connection.getCatalog(), connection.getSchema(), tableNamePattern, "%")) {
        while (resultSet.next()) {
            String tableName = resultSet.getString(TABLE_NAME);
            if (Objects.equals(tableNamePattern, tableName)) {
                String columnName = resultSet.getString(COLUMN_NAME);
                columnTypes.add(resultSet.getInt(DATA_TYPE));
                columnTypeNames.add(resultSet.getString(TYPE_NAME));
                isPrimaryKeys.add(primaryKeys.contains(columnName));
                columnNames.add(columnName);
            }
        }
    }
    try (
            Statement statement = connection.createStatement();
            ResultSet resultSet = statement.executeQuery(generateEmptyResultSQL(tableNamePattern, databaseType))) {
        for (int i = 0; i < columnNames.size(); i++) {
            isCaseSensitives.add(resultSet.getMetaData().isCaseSensitive(resultSet.findColumn(columnNames.get(i))));
            result.add(new ColumnMetaData(columnNames.get(i), columnTypes.get(i), isPrimaryKeys.get(i), resultSet.getMetaData().isAutoIncrement(i + 1), isCaseSensitives.get(i)));
        }
    }
    return result;
}
```
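Before moving on to indexes, it may help to see the JDBC mechanism this loader relies on in isolation. The following minimal sketch is not ShardingSphere code; the connection URL, credentials, and table name are placeholders. It reads the column names and JDBC types of a table through `DatabaseMetaData#getColumns`, the same call `ColumnMetaDataLoader` uses above:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;

public final class ColumnMetaDataJdbcSketch {

    public static void main(final String[] args) throws SQLException {
        // Placeholder URL and credentials: point these at a real database to run the sketch.
        try (Connection connection = DriverManager.getConnection("jdbc:mysql://localhost:3306/demo_ds", "root", "root")) {
            // The same JDBC call used by ColumnMetaDataLoader: one table name pattern, all columns.
            try (ResultSet resultSet = connection.getMetaData().getColumns(connection.getCatalog(), connection.getSchema(), "t_order", "%")) {
                while (resultSet.next()) {
                    // TABLE_NAME, COLUMN_NAME and DATA_TYPE are standard columns of the getColumns() result set.
                    System.out.printf("%s.%s -> JDBC type %d%n",
                            resultSet.getString("TABLE_NAME"), resultSet.getString("COLUMN_NAME"), resultSet.getInt("DATA_TYPE"));
                }
            }
        }
    }
}
```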
`IndexMetaData` is essentially the name of an index in a table, so it has no complex structural properties, just a name. Instead of dwelling on the structure, we focus on the loading process. It is similar to column loading, and the main flow lives in the `org.apache.shardingsphere.infra.metadata.schema.builder.loader.IndexMetaDataLoader#load` method. The basic process likewise obtains the index information (`IndexInfo`) of the relevant database and table through the database connection and organizes it into `IndexMetaData`. The implementation code is as follows:

```java
public static Collection<IndexMetaData> load(final Connection connection, final String table) throws SQLException {
    Collection<IndexMetaData> result = new HashSet<>();
    try (ResultSet resultSet = connection.getMetaData().getIndexInfo(connection.getCatalog(), connection.getSchema(), table, false, false)) {
        while (resultSet.next()) {
            String indexName = resultSet.getString(INDEX_NAME);
            if (null != indexName) {
                result.add(new IndexMetaData(indexName));
            }
        }
    } catch (final SQLException ex) {
        if (ORACLE_VIEW_NOT_APPROPRIATE_VENDOR_CODE != ex.getErrorCode()) {
            throw ex;
        }
    }
    return result;
}
```

## TableMetaData

This class is a basic element of `ShardingSphereMetaData` and has the following structure:

```java
public final class TableMetaData {

    // Table name
    private final String name;

    // Column metadata
    private final Map<String, ColumnMetaData> columns;

    // Index metadata
    private final Map<String, IndexMetaData> indexes;

    // Omit other methods
}
```

From the above structure, we can see that `TableMetaData` is assembled from `ColumnMetaData` and `IndexMetaData`, so its loading process can be understood as an intermediate layer: the concrete work still depends on `ColumnMetaDataLoader` and `IndexMetaDataLoader` to take the table name and the related connection and load the data.
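To make this assembly relationship concrete, here is a small hedged sketch that builds a `TableMetaData` by hand for a hypothetical `t_order` table, using the constructors seen in the loader code above. The `model` package paths follow the 5.0.0-beta source layout referenced in this article and may differ in other versions:

```java
import java.sql.Types;
import java.util.Arrays;

// Package paths assume the 5.0.0-beta source layout referenced in this article.
import org.apache.shardingsphere.infra.metadata.schema.model.ColumnMetaData;
import org.apache.shardingsphere.infra.metadata.schema.model.IndexMetaData;
import org.apache.shardingsphere.infra.metadata.schema.model.TableMetaData;

public final class TableMetaDataAssemblySketch {

    public static void main(final String[] args) {
        // The five constructor arguments mirror the fields of ColumnMetaData shown earlier:
        // name, dataType, primaryKey, generated, caseSensitive.
        ColumnMetaData orderId = new ColumnMetaData("order_id", Types.BIGINT, true, true, true);
        ColumnMetaData userId = new ColumnMetaData("user_id", Types.INTEGER, false, false, true);
        // IndexMetaData carries only the index name.
        IndexMetaData userIdIndex = new IndexMetaData("idx_user_id");
        // The intermediate layer: the same three-argument constructor used by TableMetaDataLoader.
        TableMetaData tOrder = new TableMetaData("t_order", Arrays.asList(orderId, userId), Arrays.asList(userIdIndex));
        // getColumns() exposes the column map keyed by column name (see the class structure above).
        System.out.println(tOrder.getColumns().keySet());
    }
}
```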
The `TableMetaData` loading process itself is therefore relatively simple and lives mainly in the `org.apache.shardingsphere.infra.metadata.schema.builder.loader.TableMetaDataLoader#load` method. Its core flow is as follows:

```java
public static Optional<TableMetaData> load(final DataSource dataSource, final String tableNamePattern, final DatabaseType databaseType) throws SQLException {
    // Get the connection
    try (MetaDataLoaderConnectionAdapter connectionAdapter = new MetaDataLoaderConnectionAdapter(databaseType, dataSource.getConnection())) {
        // Format the fuzzy-matching field of the table name according to the database type
        String formattedTableNamePattern = databaseType.formatTableNamePattern(tableNamePattern);
        // Load ColumnMetaData and IndexMetaData to assemble TableMetaData
        return isTableExist(connectionAdapter, formattedTableNamePattern)
                ? Optional.of(new TableMetaData(tableNamePattern, ColumnMetaDataLoader.load(connectionAdapter, formattedTableNamePattern, databaseType), IndexMetaDataLoader.load(connectionAdapter, formattedTableNamePattern)))
                : Optional.empty();
    }
}
```

## SchemaMetaData

From the analysis of the two lower layers, it is clear that this layer is the outermost layer of metadata exposure. The outermost structure is `ShardingSphereSchema`, with the following main structure:

```java
/**
 * ShardingSphere schema.
 */
@Getter
public final class ShardingSphereSchema {

    private final Map<String, TableMetaData> tables;

    @SuppressWarnings("CollectionWithoutInitialCapacity")
    public ShardingSphereSchema() {
        tables = new ConcurrentHashMap<>();
    }

    public ShardingSphereSchema(final Map<String, TableMetaData> tables) {
        this.tables = new ConcurrentHashMap<>(tables.size(), 1);
        tables.forEach((key, value) -> this.tables.put(key.toLowerCase(), value));
    }

    // Omit other methods
}
```

In line with the schema concept, a schema contains several tables. The `tables` attribute of `ShardingSphereSchema` is a map: the key is the `tableName` (lower-cased by the constructor) and the value is the metadata of the table corresponding to that `tableName`. Initialization is done primarily through the constructor.
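Because the constructor above lower-cases table names on the way in, lookups against the schema should use the lower-cased name. A short hedged sketch, under the same package-path assumptions as before:

```java
import java.util.Collections;

// Package paths assume the 5.0.0-beta source layout referenced in this article.
import org.apache.shardingsphere.infra.metadata.schema.ShardingSphereSchema;
import org.apache.shardingsphere.infra.metadata.schema.model.TableMetaData;

public final class SchemaLookupSketch {

    public static void main(final String[] args) {
        // An empty TableMetaData is enough to demonstrate the key handling.
        TableMetaData tableMetaData = new TableMetaData("T_ORDER", Collections.emptyList(), Collections.emptyList());
        ShardingSphereSchema schema = new ShardingSphereSchema(Collections.singletonMap("T_ORDER", tableMetaData));
        // The constructor stores the key in lower case, so only the lower-cased name matches.
        System.out.println(schema.getTables().containsKey("t_order")); // true
        System.out.println(schema.getTables().containsKey("T_ORDER")); // false
    }
}
```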
So again, the focus is on loading the table metadata; let's follow up from the entry point. The core entry point for the entire metadata load is `org.apache.shardingsphere.infra.context.metadata.MetaDataContextsBuilder#build`. In `build`, the corresponding metadata is assembled and loaded according to the configuration rules. The core code is as follows:

```java
/**
 * Build meta data contexts.
 *
 * @exception SQLException SQL exception
 * @return meta data contexts
 */
public StandardMetaDataContexts build() throws SQLException {
    Map<String, ShardingSphereMetaData> metaDataMap = new HashMap<>(schemaRuleConfigs.size(), 1);
    Map<String, ShardingSphereMetaData> actualMetaDataMap = new HashMap<>(schemaRuleConfigs.size(), 1);
    for (String each : schemaRuleConfigs.keySet()) {
        Map<String, DataSource> dataSourceMap = dataSources.get(each);
        Collection<RuleConfiguration> ruleConfigs = schemaRuleConfigs.get(each);
        DatabaseType databaseType = DatabaseTypeRecognizer.getDatabaseType(dataSourceMap.values());
        // Obtain configuration rules
        Collection<ShardingSphereRule> rules = ShardingSphereRulesBuilder.buildSchemaRules(each, ruleConfigs, databaseType, dataSourceMap);
        // Load actualTableMetaData and logicTableMetaData
        Map<TableMetaData, TableMetaData> tableMetaDatas = SchemaBuilder.build(new SchemaBuilderMaterials(databaseType, dataSourceMap, rules, props));
        // Assemble rule metadata
        ShardingSphereRuleMetaData ruleMetaData = new ShardingSphereRuleMetaData(ruleConfigs, rules);
        // Assemble data source metadata
        ShardingSphereResource resource = buildResource(databaseType, dataSourceMap);
        // Assemble database metadata
        ShardingSphereSchema actualSchema = new ShardingSphereSchema(tableMetaDatas.keySet().stream().filter(Objects::nonNull).collect(Collectors.toMap(TableMetaData::getName, v -> v)));
        actualMetaDataMap.put(each, new ShardingSphereMetaData(each, resource, ruleMetaData, actualSchema));
        metaDataMap.put(each, new ShardingSphereMetaData(each, resource, ruleMetaData, buildSchema(tableMetaDatas)));
    }
    OptimizeContextFactory optimizeContextFactory = new OptimizeContextFactory(actualMetaDataMap);
    return new StandardMetaDataContexts(metaDataMap, buildGlobalSchemaMetaData(metaDataMap), executorEngine, props, optimizeContextFactory);
}
```

The above code shows that the `build` method loads basic database information, such as the database type and the database connection pools, based on the configured `schemaRule`, and with these completes the assembly of the `ShardingSphereResource`; it assembles the `ShardingSphereRuleMetaData`, such as configuration rules, encryption rules, and authentication rules; and it loads the database metadata needed by the `ShardingSphereSchema`. Tracing further, the method for loading table metadata is `org.apache.shardingsphere.infra.metadata.schema.builder.SchemaBuilder#build`, in which the `actualTableMetaData` and the `logicTableMetaData` are loaded respectively.

Then what is an `actualTable`, and what is a `logicTable`? Simply put, for `t_order`, the tables `t_order_1` and `t_order_2` are considered nodes of `t_order`; so in this analysis, `t_order` is the `logicTable`, while `t_order_1` and `t_order_2` are `actualTable`s. The short sketch below illustrates this relationship.
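This sketch is a plain-Java illustration of the two concepts, using the `t_order` example from above; the names are illustrative only, and this is not ShardingSphere code:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public final class LogicTableConceptSketch {

    public static void main(final String[] args) {
        // The logic table is what SQL references; the actual tables are the physical shards behind it.
        Map<String, List<String>> logicToActual = new HashMap<>();
        logicToActual.put("t_order", Arrays.asList("t_order_1", "t_order_2"));
        // Metadata is loaded for the actual tables first (step 1 below),
        // then decorated into logic-table metadata (step 2 below).
        logicToActual.forEach((logic, actuals) -> actuals.forEach(actual -> System.out.println(logic + " -> " + actual)));
    }
}
```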
With these two concepts clearly defined, let's look at the `build` method together. It is mainly divided into the following two steps.

1) actualTableMetaData loading

The `actualTableMetaData` is the metadata of the basic tables underlying system sharding. In the 5.0 beta version, we adopt database dialects to query and load metadata via SQL, so the basic process is to query and load the database metadata through SQL first; if no database dialect loader is found, the JDBC driver connection is used instead. The metadata of the configured tables is then loaded in combination with the table names configured in `ShardingSphereRule`. The core code is shown below:

```java
private static Map<String, TableMetaData> buildActualTableMetaDataMap(final SchemaBuilderMaterials materials) throws SQLException {
    Map<String, TableMetaData> result = new HashMap<>(materials.getRules().size(), 1);
    // Load metadata via database dialect SQL
    appendRemainTables(materials, result);
    for (ShardingSphereRule rule : materials.getRules()) {
        if (rule instanceof TableContainedRule) {
            for (String table : ((TableContainedRule) rule).getTables()) {
                if (!result.containsKey(table)) {
                    TableMetaDataBuilder.load(table, materials).map(optional -> result.put(table, optional));
                }
            }
        }
    }
    return result;
}
```

2) logicTableMetaData loading

From the concepts above, we can see that a `logicTable` is a logical node assembled from `actualTable`s according to different rules; it may be a sharded node, an encrypted node, or something else. Therefore, the `logicTableMetaData` is based on the `actualTableMetaData`, combined with specific configuration rules, such as database and table sharding rules, and other associated nodes. In terms of the specific process, it first obtains the table names of the configuration rules, then checks whether the corresponding `actualTableMetaData` has been loaded, and generates the metadata of the relevant logical nodes by combining the configuration rules with the `TableMetaDataBuilder#decorate` method. The core code flow is shown below:

```java
private static Map<String, TableMetaData> buildLogicTableMetaDataMap(final SchemaBuilderMaterials materials, final Map<String, TableMetaData> tables) throws SQLException {
    Map<String, TableMetaData> result = new HashMap<>(materials.getRules().size(), 1);
    for (ShardingSphereRule rule : materials.getRules()) {
        if (rule instanceof TableContainedRule) {
            for (String table : ((TableContainedRule) rule).getTables()) {
                if (tables.containsKey(table)) {
                    TableMetaData metaData = TableMetaDataBuilder.decorate(table, tables.get(table), materials.getRules());
                    result.put(table, metaData);
                }
            }
        }
    }
    return result;
}
```

At this point, the core metadata has been loaded and encapsulated into a map for return, ready for use in each scenario that needs it.

## Metadata Loading Optimization Analysis

Although metadata is the essential core of our system, loading it during system startup inevitably increases system load and lowers startup efficiency. Therefore, we need to optimize the loading process. At present, we are exploring the following two approaches:

A. Replace native JDBC driver connections with SQL queries

Prior to the 5.0 beta version, metadata was loaded through the native JDBC driver. In 5.0 beta, we have gradually switched to loading metadata with multi-threaded SQL queries based on database dialects, which further improves the loading speed of system data. A detailed dialect loader can be found in the related implementations of `org.apache.shardingsphere.infra.metadata.schema.builder.spi.DialectTableMetaDataLoader`.

B. Reduce metadata load times

For loading resources common to the whole system, we follow the concept of "load once, use many times". Of course, space and time must be weighed against each other in this process. As a result, we are constantly optimizing to reduce duplicate loading of metadata and to enhance overall system efficiency.

ShardingSphere Community:

ShardingSphere GitHub: https://github.com/apache/shardingsphere