Table/HTable |
A collection of related data stored in HBase in a column-oriented format. |
Region |
HBase tables are divided horizontally by row key range into “regions.” A region contains all rows in the table between the region’s start key (inclusive) and end key (exclusive). |
Store |
The data storage unit of an HBase region. Each Store holds the data of one column family within that region. |
HFile/StoreFile |
The file unit of a Store. HFiles are stored on HDFS and co-located with a Hadoop DataNode. |
memStore |
Data written to an HTable is first held in an in-memory cache, the memStore. Once the cache size exceeds a predefined threshold, the memStore is flushed to HDFS and saved as an HFile. |
HMaster |
The HBase cluster master. It monitors RegionServer behavior for load balancing and handles table operations, e.g., creating, deleting, and updating a table. |
RegionServer |
Serves read/write I/O for all regions hosted on a cluster node. When RegionServers are co-located with Hadoop DataNodes, they achieve data locality. Consequently, most reads are served by the RegionServer from the local disk and memory cache, and short-circuit reads are enabled. |
Rowkey |
A unique identifier of a row in a table. |
Column family |
Columns in Apache HBase are grouped into column families. |
Column identifier |
A member of a column family, also called a column qualifier. Multiple column identifiers can be used within one column family. |
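
To make the relationship between regions, Stores, the memStore, and HFiles more concrete, the following is a minimal toy model in Python. It is only a sketch and not the HBase API: the names `Store`, `flush_threshold`, and `find_region` are hypothetical, the threshold counts cells rather than bytes, and "HFiles" are plain in-memory lists rather than files on HDFS.

```python
from bisect import bisect_right


class Store:
    """Toy model of one Store: an in-memory memStore plus flushed HFiles.

    Assumption for illustration: the flush threshold counts cells. Real HBase
    measures the memStore size in bytes and writes HFiles to HDFS.
    """

    def __init__(self, flush_threshold=3):
        self.flush_threshold = flush_threshold
        self.memstore = {}  # (row_key, qualifier) -> value
        self.hfiles = []    # each entry models one immutable, sorted HFile

    def put(self, row_key, qualifier, value):
        # Writes land in the memStore first.
        self.memstore[(row_key, qualifier)] = value
        # Flush once the cache size exceeds the threshold.
        if len(self.memstore) > self.flush_threshold:
            self.flush()

    def flush(self):
        # Persist the memStore contents as a new sorted "HFile" and clear it.
        self.hfiles.append(sorted(self.memstore.items()))
        self.memstore = {}

    def get(self, row_key, qualifier):
        key = (row_key, qualifier)
        # Check the memStore first, then HFiles from newest to oldest.
        if key in self.memstore:
            return self.memstore[key]
        for hfile in reversed(self.hfiles):
            for cell_key, value in hfile:
                if cell_key == key:
                    return value
        return None


def find_region(region_start_keys, row_key):
    """Return the index of the region whose key range contains row_key.

    region_start_keys must be sorted and begin with "" (the first region
    has no lower bound). Each region covers [start_key, next_start_key).
    """
    return bisect_right(region_start_keys, row_key) - 1


if __name__ == "__main__":
    # One Store per column family; "info" is a hypothetical family name.
    stores = {"info": Store(flush_threshold=3)}
    for i in range(5):
        stores["info"].put("row%d" % i, "name", "user%d" % i)
    print(len(stores["info"].hfiles), len(stores["info"].memstore))
    print(find_region(["", "g", "n"], "hbase"))
```

In this sketch a read consults the memStore before any flushed file, which mirrors why recently written data can be served from memory, and `find_region` shows how a row key maps to exactly one region by key range.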