Delta Lake Z Ordering (multi-dimensional clustering)
13.10.2023
Data skipping in Delta Lake uses the minimum and maximum values of each column in a file, together with the column predicates of a query, to skip files that cannot contain matching rows. This speeds up queries and saves resources. However, when data is sorted linearly by several columns, skipping is highly effective only for the first sort column and drops off rapidly for subsequent ones. Z Ordering reorganizes the data so that rows with similar values across several columns end up in the same files, which lets certain queries read less data, run faster, and use fewer resources. Z Ordering is particularly useful when queries filter on multiple columns.
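The effect can be illustrated with a small, self-contained sketch in plain Python. It is not Delta Lake's implementation: the toy table of 64 `(x, y)` rows, the file size of 4 rows, and the helpers `z_value`, `split_into_files`, and `files_read` are all assumptions made for illustration. The sketch writes the same rows into "files" twice, once sorted linearly by `(x, y)` and once sorted by a Z-order (bit-interleaved) key, and counts how many files a min/max check would have to read for an equality filter on each column.

```python
from itertools import product


def z_value(x: int, y: int, bits: int = 3) -> int:
    """Interleave the low `bits` bits of x and y into one Z-order (Morton) key."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)       # bits of x go to even positions
        z |= ((y >> i) & 1) << (2 * i + 1)   # bits of y go to odd positions
    return z


def split_into_files(rows, rows_per_file):
    """Cut an ordered list of rows into consecutive fixed-size 'files'."""
    return [rows[i:i + rows_per_file] for i in range(0, len(rows), rows_per_file)]


def files_read(files, col, value):
    """Count files whose [min, max] range for column `col` contains `value`."""
    count = 0
    for rows in files:
        values = [row[col] for row in rows]
        if min(values) <= value <= max(values):
            count += 1
    return count


# Toy table: every (x, y) combination with x and y in 0..7, i.e. 64 rows.
rows = list(product(range(8), range(8)))

# Linear sort by x, then y, versus sort by the interleaved Z-order key.
linear = split_into_files(sorted(rows), 4)
zorder = split_into_files(sorted(rows, key=lambda r: z_value(*r)), 4)

for name, files in (("linear", linear), ("z-order", zorder)):
    print(name, "x = 5:", files_read(files, 0, 5), "y = 5:", files_read(files, 1, 5))
# linear  x = 5: 2 y = 5: 8
# z-order x = 5: 4 y = 5: 4
```

Out of 16 files, the linear layout reads only 2 for a filter on `x` but 8 for a filter on `y`, while the Z-ordered layout reads 4 for either column. In Delta Lake itself, this reorganization is requested with the `OPTIMIZE` command and its `ZORDER BY` clause rather than computed by hand.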
Resources
- Delta Lake Z Order, Delta Lake
- Delta Lake Z-Ordering from A to Z, Yousry Mohamed
- Support for Data Clustering using Z-Order Curves - Design Sketch, Delta Lake