Segment-Page Feature of openGauss for Solving File Storage Problems
Data volumes are growing explosively and service requirements are becoming more complex, so both the amount of data to be stored and the number of tables to be created keep increasing. Each ordinary data table in openGauss corresponds to one large logical file (maximum size: 32 TB). The logical file is divided into multiple physical files of a fixed size, which are stored in the corresponding database directory. As the data volume of a table grows, the number of files needed to store it therefore grows as well. In addition, openGauss provides features such as hash bucket tables and large partitioned tables. Each such table is split into several sub-tables, and the number of underlying files is multiplied accordingly. This storage management mode has the following problems:
- It depends heavily on the file system and offers no fine-grained control over storage, which limits maintainability.
- With a large amount of data, there are too many file handles. Currently, the only workaround is to use virtual handles, which degrades system performance.
- Too many small files may cause random I/O problems in scenarios such as full build and full backup, affecting performance.
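To get a feel for how quickly the file count grows, here is a rough sketch. It assumes that each physical file holds at most 1 GiB; the text only says "fixed size", so `ASSUMED_FILE_SIZE` and the function name are illustrative, not openGauss parameters. It also assumes that data is spread evenly over the sub-tables.

```python
GIB = 1024 ** 3
TIB = 1024 ** 4

# Assumption: each physical file holds at most 1 GiB (the text only says "fixed size").
ASSUMED_FILE_SIZE = 1 * GIB


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division."""
    return -(-a // b)


def physical_files(table_bytes: int, sub_tables: int = 1,
                   file_size: int = ASSUMED_FILE_SIZE) -> int:
    """Estimate the data files of a table whose data is spread evenly over sub-tables."""
    per_sub_table = ceil_div(table_bytes, sub_tables)
    # Every sub-table owns at least one file, even when it holds little data.
    files_per_sub_table = max(1, ceil_div(per_sub_table, file_size))
    return sub_tables * files_per_sub_table


if __name__ == "__main__":
    print(physical_files(10 * GIB))                    # one ordinary table
    print(physical_files(10 * GIB, sub_tables=1024))   # same data in 1024 sub-tables
```

Under these assumptions, splitting the same 10 GiB into 1024 sub-tables raises the file count from 10 to 1024, because every sub-table needs its own file regardless of how little data it holds.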
To resolve the foregoing problems, openGauss introduces a segment-page storage management mechanism, which is similar to segment-page memory management of an operating system, but differs greatly in implementation mechanisms.
Implementation Principle of the Segment-Page Mechanism
With segment-page storage management, tablespaces and data files are logically organized into segments, extents, and pages/blocks for storage allocation and management, as shown in the following figure. Specifically, each database (within a tablespace) has exactly one segment space. Physically, this space may be a single file or may be split into multiple files. Storage for all tables in the database is allocated from this space, so the number of physical files no longer depends on the number of tables. Each table has one logical segment, and all data of the table is stored in that segment. Multiple extents are attached to each segment, and each extent is a contiguous run of physical pages. Extent sizes can be adjusted flexibly based on service requirements to avoid wasting storage space.
Figure 1 Segment-page storage design
Segment-page files expand automatically until the disk space is used up or the tablespace limit is reached. Segment-page storage does not reclaim disk space automatically. After data tables are deleted, the space they occupied in the segment-page file is kept in reserve and the disk space is not released. This reserved space is reused by tables that are created or expanded later. If you do not need to reuse the space, you can manually call system functions to reclaim it and release the disk space.
Internally, each segment corresponds to what would have been one physical file under the original page-based storage. For example, each partitioned table and each bucket of a hash bucket table has its own segment. Multiple extents are attached to each segment. The pages within an extent are contiguous in the file, but the extents of a segment are not necessarily adjacent to each other. A segment grows dynamically by adding new extents, but an individual extent cannot be reclaimed directly. Storage space can be reclaimed per segment by truncating or clustering the entire table.
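The allocation behavior described above can be pictured with a toy model. This is only a sketch: `SegmentSpace`, its methods, and the same-size reuse policy are hypothetical and do not reflect the actual openGauss data structures.

```python
from collections import defaultdict


class SegmentSpace:
    """Toy model of one shared segment-page file (all names are hypothetical)."""

    def __init__(self):
        self.high_water = 0                 # pages currently in the physical file
        self.free = []                      # reserved (start, pages) ranges of dropped tables
        self.segments = defaultdict(list)   # segment name -> [(start, pages), ...]

    def add_extent(self, segment, pages):
        """Attach one contiguous extent of `pages` pages to `segment`."""
        for i, (start, size) in enumerate(self.free):
            if size == pages:               # reuse a reserved range of the same size
                del self.free[i]
                break
        else:
            start = self.high_water         # otherwise extend the file
            self.high_water += pages
        self.segments[segment].append((start, pages))
        return start

    def drop_segment(self, segment):
        """Drop a table: its extents become reusable, but the file does not shrink."""
        self.free.extend(self.segments.pop(segment, []))


space = SegmentSpace()
space.add_extent("t1", 8)   # pages 0-7
space.add_extent("t2", 8)   # pages 8-15
space.add_extent("t1", 8)   # pages 16-23: the two extents of t1 are not adjacent
space.drop_segment("t2")    # the file still spans 24 pages
space.add_extent("t3", 8)   # reuses pages 8-15 instead of growing the file
```

In this model, dropping `t2` leaves the file size unchanged, and the next table to grow picks up the reserved range, which mirrors the reuse behavior described for deleted tables.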
Currently, four extent sizes are supported: 64 KB, 1 MB, 8 MB, and 64 MB. For a given segment, the size of each newly added extent is determined by its position in the segment: the first 16 extents are 64 KB each, and the 17th to 143rd extents are 1 MB each. The same rule applies to the remaining extent sizes. The following table lists the parameters.
Table 1 Classification of extents stored in a segment
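The position-based sizing rule can be sketched as a small lookup. The first two tiers come from the text; the boundary between the 8 MB and 64 MB tiers is not stated there, so the value 255 and the open-ended last tier are assumptions, as are the names `EXTENT_TIERS`, `extent_size`, and `segment_capacity`.

```python
KB = 1024
MB = 1024 * KB

# (last extent number in the tier, extent size in bytes).
# The first two rows come from the text; the last two are assumptions.
EXTENT_TIERS = [
    (16, 64 * KB),
    (143, 1 * MB),
    (255, 8 * MB),      # assumed boundary
    (None, 64 * MB),    # assumed: every later extent
]


def extent_size(n: int) -> int:
    """Size in bytes of the n-th extent (1-based) of a segment."""
    if n < 1:
        raise ValueError("extent numbers start at 1")
    for last, size in EXTENT_TIERS:
        if last is None or n <= last:
            return size
    raise AssertionError("unreachable: the last tier is open-ended")


def segment_capacity(extents: int) -> int:
    """Total bytes covered by the first `extents` extents of a segment."""
    return sum(extent_size(i) for i in range(1, extents + 1))
```

With the two tiers given in the text, the first 16 extents cover 1 MB in total and the first 143 extents cover 128 MB, so small tables consume space in small steps while large tables grow in larger ones.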
Guide to Using Segment-Page Tables
When creating a table with the CREATE TABLE statement, you can specify segment=on to store a row-store table in segment-page mode. If hashbucket=on is specified, segment=on is enforced automatically. Currently, segment-page storage does not support column-store tables. Segment-page tablespaces are created automatically and require no additional commands.
Set segment to on to create an ordinary segment-page table.
create table t1(a int, b int, PRIMARY KEY(a,b)) with(segment=on);
Set hashbucket to on to create a segment-page hash bucket table.
create table t2(a int, b int, PRIMARY KEY(a,b)) with(hashbucket=on);
To help users work with segment-page storage, openGauss provides two built-in system functions that display extent usage. Users can use these two functions to decide whether to reclaim space and which part of the data to reclaim.
**pg_stat_segment_space_info(Oid tablespace, Oid database);** takes the OIDs of a tablespace and a database and displays usage information for all extent groups in that tablespace.
Table 2 pg_stat_segment_space_info view column information
**pg_stat_segment_extent_usage(Oid tablespace, Oid database, uint32 extent_type);** returns, for each call, the usage information of every allocated extent in one extent group. extent_type indicates the type of the extent group. The value is an integer ranging from 1 to 5. If the value is outside this range, an error is reported.
Table 3 pg_stat_segment_extent_usage view column information
**gs_spc_shrink(Oid tablespace, Oid database, uint32 extent_type);** shrinks one extent group per call. The target size is calculated automatically as the active data volume plus 128 MB, rounded up to a multiple of 128 MB.
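The target-size rule can be worked through with a short sketch. It assumes one reading of the rule: add 128 MB to the active data volume first, then round the sum up to the next multiple of 128 MB. The function name `shrink_target` is illustrative and is not an openGauss API.

```python
MB = 1024 * 1024
ALIGN = 128 * MB


def shrink_target(active_bytes: int) -> int:
    """Active data volume + 128 MB, rounded up to a multiple of 128 MB."""
    if active_bytes < 0:
        raise ValueError("active_bytes must be non-negative")
    padded = active_bytes + ALIGN
    return -(-padded // ALIGN) * ALIGN   # integer ceiling division


if __name__ == "__main__":
    # 300 MB of active data: 300 + 128 = 428 MB, rounded up to 512 MB.
    print(shrink_target(300 * MB) // MB)
```

Under this reading, an extent group always keeps between 128 MB and 256 MB of headroom above its active data after shrinking.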
Summary
openGauss provides the segment-page solution to address the excessive number of underlying file handles that arises with many hash bucket tables and large partitioned tables. In the segment-page solution, each table corresponds to one logical segment, and different segments are stored together in shared physical files, which greatly reduces the number of physical file handles. Even with a large amount of data, the handle explosion seen with ordinary tables is avoided, and system maintainability improves. In addition, in scenarios such as full build and full backup, the random I/O caused by too many small files is reduced, which improves I/O performance. The parameters of segment-page tables are currently fixed. In the future, openGauss may use AI techniques to tune the parameters of the segment-page storage mechanism automatically, giving users smarter and better-performing storage policies.