Distributed Systems | A collection of distributed systems concepts, papers, and implementations

File Storage

Heap File

Page directory

Page Layout

Approach 1

Slotted Pages

Ref: https://www.cs.swarthmore.edu/~soni/cs44/f18/Labs/lab2.html

More concrete example

Ref: Internal Layout of a Heap Table File in PostgreSQL https://www.interdb.jp/pg/pgsql01/03.html

Writing of a Tuple

Suppose a table is composed of one page that contains just one heap tuple. The pd_lower of this page points to the first line pointer, and both the line pointer and the pd_upper point to the first heap tuple.

When the second tuple is inserted, it is placed after the first one. The second line pointer is appended to the first one, and it points to the second tuple. The pd_lower changes to point to the second line pointer, and the pd_upper to the second heap tuple.
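The insert steps above can be sketched in a few lines of Python. This is a simplified model, not PostgreSQL's actual on-disk format: the class name SlottedPage, its methods, and the (offset, length) encoding of a line pointer are assumptions made for illustration, and the page header is left as zeroed bytes. The sketch assumes tuples are stacked from the end of the page toward the line pointers, so pd_upper decreases while pd_lower increases.

```python
import struct

PAGE_SIZE = 8192        # PostgreSQL's default block size
HEADER_SIZE = 24        # space reserved for the page header (left zeroed in this sketch)
LINE_POINTER_SIZE = 4   # simplified line pointer: 2-byte offset + 2-byte length


class SlottedPage:
    """Hypothetical slotted page: line pointers grow up, tuples grow down."""

    def __init__(self):
        self.data = bytearray(PAGE_SIZE)
        self.pd_lower = HEADER_SIZE  # end of the line pointer array
        self.pd_upper = PAGE_SIZE    # start of the most recently added tuple

    def free_space(self):
        # The gap between the line pointers and the tuples.
        return self.pd_upper - self.pd_lower

    def insert(self, tuple_bytes):
        """Store a tuple and return its 1-based slot (offset) number."""
        if len(tuple_bytes) + LINE_POINTER_SIZE > self.free_space():
            raise ValueError("not enough free space in page")
        # Place the tuple just below the previous one.
        self.pd_upper -= len(tuple_bytes)
        end = self.pd_upper + len(tuple_bytes)
        self.data[self.pd_upper:end] = tuple_bytes
        # Append a line pointer that records where the tuple lives.
        struct.pack_into("<HH", self.data, self.pd_lower,
                         self.pd_upper, len(tuple_bytes))
        self.pd_lower += LINE_POINTER_SIZE
        return (self.pd_lower - HEADER_SIZE) // LINE_POINTER_SIZE

    def read(self, slot):
        """Follow line pointer number `slot` to its tuple bytes."""
        pos = HEADER_SIZE + (slot - 1) * LINE_POINTER_SIZE
        if slot < 1 or pos >= self.pd_lower:
            raise IndexError("no such slot")
        offset, length = struct.unpack_from("<HH", self.data, pos)
        return bytes(self.data[offset:offset + length])


page = SlottedPage()
first = page.insert(b"alice")
second = page.insert(b"bob")
```

Because readers reach a tuple only through its line pointer, the tuple bytes could later be moved within the page without changing the slot number that other structures refer to.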

Reading Heap tuples

Two typical access methods, sequential scan and B-tree index scan, are outlined here:

(a) Sequential scan – It reads all tuples in all pages sequentially by scanning all line pointers in each page.
(b) B-tree index scan – It reads an index file that contains index tuples, each of which is composed of an index key and a TID that points to the target heap tuple. If the index tuple with the key that you are looking for has been found, PostgreSQL reads the desired heap tuple using the obtained TID value.
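The two access methods can be contrasted with a small sketch. Everything here is illustrative: the heap file is modelled as a list of pages, each page as a list of rows, and a plain dict stands in for the B-tree index. The names TID, sequential_scan, build_index, and index_scan are hypothetical. The sketch assumes a TID is a (block number, offset number) pair, with the offset being the 1-based line pointer number within the page.

```python
from typing import Iterator, NamedTuple, Optional


class TID(NamedTuple):
    block: int   # page number within the heap file (0-based)
    offset: int  # line pointer number within the page (1-based)


# Toy heap file: position i in a page stands in for line pointer i + 1.
heap = [
    [(1, "alice"), (2, "bob")],
    [(3, "carol")],
]


def sequential_scan(heap) -> Iterator[tuple]:
    """Visit every page and every line pointer in order."""
    for page in heap:
        for row in page:
            yield row


def build_index(heap) -> dict:
    """Map each key (first column) to the TID of its heap tuple.

    A dict is used only as a stand-in for a B-tree.
    """
    index = {}
    for block, page in enumerate(heap):
        for offset, row in enumerate(page, start=1):
            index[row[0]] = TID(block, offset)
    return index


def index_scan(heap, index, key) -> Optional[tuple]:
    """Look up the key in the index, then fetch the tuple by its TID."""
    tid = index.get(key)
    if tid is None:
        return None
    return heap[tid.block][tid.offset - 1]


index = build_index(heap)
all_rows = list(sequential_scan(heap))
carol = index_scan(heap, index, 3)
```

The sequential scan touches every page, while the index scan goes directly to one page and one line pointer once the TID is known.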

Record IDs

Tuple Layout

Denormalized Tuple Data

Conclusion

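A database is organized in pages.
There are different ways to track pages, such as a page directory.
There are different ways to store pages, such as a heap file.
There are different ways to store tuples within a page, such as slotted pages.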