Data orientation
From Wikipedia, the free encyclopedia
Data orientation refers to how tabular data is represented in a linear memory model, whether on disk or in memory. The two most common representations are column-oriented (columnar format) and row-oriented (row format).[1][2]
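The difference between the two representations can be illustrated by flattening a small table into a linear sequence. The following is a minimal sketch rather than the layout of any particular format; the table contents and the function names `to_row_oriented` and `to_column_oriented` are hypothetical and chosen only for illustration.

```python
# Hypothetical 3-row table with columns (id, name, price).
table = [
    (1, "apple", 0.50),
    (2, "pear", 0.75),
    (3, "plum", 0.20),
]


def to_row_oriented(rows):
    """Flatten so that all fields of one record are adjacent."""
    return [value for row in rows for value in row]


def to_column_oriented(rows):
    """Flatten so that all values of one column are adjacent."""
    # zip(*rows) transposes the table: one tuple per column.
    return [value for column in zip(*rows) for value in column]


row_layout = to_row_oriented(table)
column_layout = to_column_oriented(table)

print(row_layout)     # records stored one after another
print(column_layout)  # columns stored one after another
```

In the row-oriented sequence each record occupies a contiguous run of values, while in the column-oriented sequence each column does.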
The choice of data orientation is a trade-off and an architectural decision in databases, query engines, and numerical simulations.[1] As a result of these trade-offs, row-oriented formats are more commonly used in online transaction processing (OLTP), and column-oriented formats are more commonly used in online analytical processing (OLAP).[2]
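One way to see the trade-off is to compare which positions in linear memory each kind of workload touches. The sketch below assumes fixed-width values, so that positions can be counted in units of one value, and a hypothetical table of 4 rows and 3 columns; the helper names `row_offset` and `column_offset` are illustrative and do not come from any specific system.

```python
def row_offset(r, c, n_rows, n_cols):
    """Position of cell (r, c) when records are stored one after another."""
    return r * n_cols + c


def column_offset(r, c, n_rows, n_cols):
    """Position of cell (r, c) when columns are stored one after another."""
    return c * n_rows + r


n_rows, n_cols = 4, 3  # assumed table shape

# Transactional-style access: read every field of record 2.
record_in_row_format = [row_offset(2, c, n_rows, n_cols) for c in range(n_cols)]
record_in_column_format = [column_offset(2, c, n_rows, n_cols) for c in range(n_cols)]

# Analytical-style access: scan column 1 across all records.
scan_in_row_format = [row_offset(r, 1, n_rows, n_cols) for r in range(n_rows)]
scan_in_column_format = [column_offset(r, 1, n_rows, n_cols) for r in range(n_rows)]

print(record_in_row_format)     # [6, 7, 8]      contiguous
print(record_in_column_format)  # [2, 6, 10]     strided
print(scan_in_row_format)       # [1, 4, 7, 10]  strided
print(scan_in_column_format)    # [4, 5, 6, 7]   contiguous
```

Under these assumptions, reading a whole record is contiguous only in the row format, and scanning a single column is contiguous only in the column format, which is consistent with the OLTP and OLAP preferences described above.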
Examples of column-oriented formats include Apache ORC,[3] Apache Parquet,[4] Apache Arrow,[5] and the formats used by BigQuery, Amazon Redshift, and Snowflake. Predominant examples of row-oriented formats include CSV, the formats used in most relational databases, the in-memory format of Apache Spark, and Apache Avro.[6]