Lecture Notes by Anthony Zhang.

\[ \newcommand{\set}[1]{\left\{ #1 \right\}} \newcommand{\tup}[1]{\left\langle #1 \right\rangle} \newcommand{\abs}[1]{\left\lvert #1 \right\rvert} \newcommand{\floor}[1]{\left\lfloor #1 \right\rfloor} \newcommand{\ceil}[1]{\left\lceil#1 \right\rceil} \newcommand{\mb}[1]{\mathbb{#1}} \newcommand{\rem}{\operatorname{rem}} \newcommand{\sign}{\operatorname{sign}} \newcommand{\imag}{\boldsymbol{i}} \newcommand{\dee}{\mathop{}\!\mathrm{d}} \newcommand{\lH}{\overset{\text{l'H}}{=}} \newcommand{\evalat}[1]{\left.\left(#1\right)\right|} \newcommand{\sech}{\operatorname{sech}} \newcommand{\spn}{\operatorname{Span}} \newcommand{\proj}{\operatorname{proj}} \newcommand{\prp}{\operatorname{perp}} \newcommand{\refl}{\operatorname{refl}} \newcommand{\magn}[1]{\left\lVert #1 \right\rVert} \newcommand{\rank}{\operatorname{rank}} \newcommand{\sys}[2]{\left[ #1 \mid #2\hskip2pt \right]} \newcommand{\range}{\operatorname{Range}} \newcommand{\adj}{\operatorname{adj}} \newcommand{\cof}{\operatorname{cof}} \newcommand{\diag}{\operatorname{diag}} \newcommand{\formlp}{\operatorname{Form}(\mathcal{L}^P)} \]


Introduction to Database Management.

Edward Chan
Section 002
Office Hours: Monday, Wednesday 12:30 PM-1:15 PM, DC2338


Databases, data modelling, and relational algebra, in theory and in practice.

Assignments are due before 1 PM on their respective due dates. No late assignments. Midterm on November 6, 4:30 PM-6 PM.


Elements of Databases

Databases are needed to manage data because we want to store and retrieve it in increasingly elaborate ways. Data is stored information.

The goal of data management is to model part of the real world in a way that is practically useful. We want to do this efficiently and accurately, and hiding irrelevant details so we can concentrate on the common properties of the set (abstraction).

Intension of data is the definition of data, like the definition of employee records. Extension of data is the actual instances of the data, such as the set of employee records - it changes with the state of the world. In databases, the intension of data is the database schema. and the extension of data are the actual records in the database.

A database management system (DBMS) dictates how data is organized and accessed - the structure of the data. A DBMS also exposes interfaces to store, retrieve, and update data efficiently. A modern DBMS must handle concurrent operations, ensure integrity (database constraints, such as a column being unique or a superset of another), enforce security (access control), and support versioning and recovery (ensuring data can be restored after failures such as hard drive crashes and data corruption).


Logical files are those seen by application programmers, such as database tables. Physical files are those seen by system programmers, such as the actual files organized by the database.

The goal of the modern DBMS is to make the data and the applications that use it more independent. This allows us to do things like swapping out the physical storage without changing any applications, or adding and removing indices.

DBMSs can require people like operators (data entry, machine operators), system developers, and database administrators. DBMSs also have associated procedures such as what to do in case of failures and how to use the interfaces

DBMSs have a 3-level architecture - the external level (the application programs and views), the conceptual level (the conceptual schema/data model), and the internal level (data storage, OS, hardware).

DBMSs also have various language interfaces, such as data definition language (DDL), which is used for defining schemas. The DDL is compiled into a system catalog, which is essentially just schema metadata. There is also the data manipulation language (DML), which is the instructions from the application that allows data to be manipulated, and the query language, which allows data to be manipulated.

A data model is the logical organization of data. On top of the data model, we define operations to manipulate it, and constraints to ensure data integrity and enforce logical restrictions. The entity-relational model is the most common data model.

Objects in DBMSs can be stored by value or by reference. Storing by reference supports everything storing by value does, but can also have multiple entries reference the same object.


An entity is an object that has properties and can be uniquely identified. An entity set is a set of entities with similar properties - an extension. An entity type is an abstraction over an entity set - an intension. An attribute is a property of an entity, and an attribute value is an instance of an attribute. Attributes can be single-valued or multi-valued (such as arrays of other values), and simple (primitive) or composite (contains other simple/composite values). The domain is the set of all possible attribute values.

A relationship is an association between multiple entities. An \(n\)-ary relationship associates \(n\) entities with each other. A relationship set is a set of similar relationships, such as those containing common entities - an extension. A relationship type is an abstraction over a relationship set - the intension, like WORKS_IN_DEPARTMENT in an employee databse.

Relationship types and entity types can have attributes.

When designing databases, we need to determine the functional requirements and the database requirements. Using these, we figure out the database interface (the transaction specification) and schema, and then implement the application program and set up the DBMS.

Constraints and Relationships

A candidate key is a minimal set of attributes (least possible number of attributes) of an entity type that uniquely identifies each entity in its entity set. For example, a collection of students might have student number or social insurance number as candidate key.

A primary key is a candidate key that is chosen (as part of the design) as the main way to identify entities in an entity set. Primary keys can be added to any entity type, but are always optional. These key constraints apply to entity types.

An entity-relation diagram represents entity types and relationship types using a flowchart-like diagram:

Entity-relationship diagrams often get cluttered by attribute ellipses. We can avoid this by not drawing attribute ellipses, and then also including a list with entries of the form ENTITY_TYPE (ATTRIBUTE1, ATTRIBUTE2, ...) to list out the entities. As before, primary key attributes are underlined in this list.


For example, for a school application we might have entity types STUDENT, PROGRAM, and CLASS. Relationships might include a STUDENT entity being ENROLLED_IN multiple CLASS entities, or a STUDENT entity MAJORED_IN a PROGRAM entity.

The cardinality of a relationship type restricts the number of each entity type that can be in that relationship type. For example, for binary relationships (relationships between two entity types), common cardinalities are 1:1 (one to one, 1 of each entity type per relationship), 1:N (one to many, one or more of one entity type per each of the other), and N:M (many to many, any number of either entity type). These cardinality constraints apply to relationship types. The cardinality, essentially, retricts the number of relationships an entity can can participate in.

The existance constraint constrains whether an entity can exist independently of something else. For example, a department should not exist if there are no employees in it. Since a department is always associated with an employee. An entity type is totally dependent on another if every entity of the first is always associated with at least one of the second - the first cannot exist without the second.

Relationship Types

An is-a relationship represents subclass entity types inheriting from a superclass entity type, and is drawn in entity-relationship diagrams as a down-pointing triangle labelled "ISA". The superclass entity type is connected by a line, above the triangle, and the subclass entity types are connected by lines below it. Also, attributes of the superclass are inherited by subclasses.

A generalization is when multiple related entity types are combined into a single, more general entity type. In ERDs, we make the new generalized entity type, and then add an is-a relationship between the specialized entity types and the new general entity type.


A specialization is when a single general entity type is broken down into multiple, more specific entity types. In ERDs, we make the new specialized entity types, and then add an is-a relationship between these new entity types and the general entity type.

An aggregation is when a relationship between multiple entities is itself treated like an entity type, like having relationships between that relationship and other entity types. In ERDs, we draw these as a box around the entire relationship diamond and any entity types it relates together.

For example, suppose a school offers courses, and students can enquire into offered courses, we can model this using a school entity type, related with a course entity type with a "OFFERS" relationship, and a student entity type. Here, we could have a "ENQUIRES_ABOUT" relationship between students, schools, and courses, but this rather cumberome. Instead, we can use aggregation - we treat the OFFERS relationship like an entity, and relate students to "OFFERS" using an "ENQUIRES_ABOUT" relationship.


Databases are often too large to fit in main memory. Therefore, they are generally stored on disk drives or SSDs.

Disk drives have disk packs (disks on a spindle), which have multiple platters (disks), each with many concentric tracks (circles). Tracks are divided into equally sized pages/blocks/sectors (chunks), each of which stores around 512 bytes to 4 kilobytes. The tracks with corresponding radii on each platter form a cylinder.

On disks themselves, 1's and 0's are stored as transitions between north and south magnetized poles. That means that within the space for 1 bit, the track needs to change in the middle, from north to south or south to north. This allows a small current to be induced in the read/write head.

The disk pack is rotated at a constant speed (typically thousands of RPM), and read/write heads move back and forth over the disks to read/write the data. The blocks on each track have the track number (the cylinder in which the block resides), surface number (which platter the block is on), and block number within the track. Blocks are the unit of transfer and are uniquely identified by these 3 values.

To perform a read/write a block on a disk drive, the read/write head has to seek to the right track, wait for the disk to spin around to the right track, and then perform the read/write until done. As this is a mechanical process, this can take multiple milliseconds - two orders of magnitude greater than CPU operations.

As a result of the mechanical aspects of disk drives, it's a good idea to put related data close together on the same track, in order to avoid seek times and rotational delay when reading related information - this is known as clustering. We also try to cache recently used values (and marking modified cache entires as dirty to write them back later), in order to avoid disk operations entirely when possible.

Solid state drives are a lot better, with access times around 150 nanoseconds, but are still very slow relative to the CPU.

A file is a sequence of records (fixed or variable length). Each record is stored on at most one block. The blocks storing records can be contiguous (side by side in memory), linked (blocks point to their neighbors), or indexed (a separate mapping is maintained for where certain data is located).

An ordered/sequential file has its records sorted by a particular attribute, the ordering field. This allows us to find records very fast using binary search, but it also slows down insertions - one common pattern is to accumulate records to insert in an overflow file, and merge it in periodically.

An index is extra information added to a file to provide faster accesses at the cost of slower writes. The index is generally a separate file, so there is a data file and an index file.

Indices are built on a particular set of attributes, called the search key. A primary index is one built on the primary key alone, and any other is a secondary index. An ordered index keeps the search key in sorted order, while an unordered index does not.

Indices are generally implemented using B+ trees. These are very good for range queries, and can easily be inserted into or removed from.


B+ trees are similar to the B trees introduced in CS240, but with a few important differences:

The index is a useful concept because it is independent of the data itself - we can manage indices without touching the data at all.

Although indices make data access faster, they also take up additional space, and slow down data insertion/deletion, as we also have to manage structures like the B+ tree.

The B+ tree represents data as the leaf nodes of a tree in data blocks, each of which can contain multiple entries (these are generally pages on the disk). Entries within a data block contain the key and a reference/pointer to a record in the database, and are sorted by their key.

The interior nodes of the tree, the index blocks, contain references to other blocks for each range they contain.

Each index block will have a lot of ranges, generally around 70 or more. Each block is limited only by the size of disk pages, and we try to fill the pages as much as possible to minimise the number of page accesses - the most expensive operation. Also, index blocks are cached as much as possible, especially the upper levels, in order to avoid the first few page accesses.

A B+ tree of order \(m\) is a search tree in which all leaves are on the same level. Every interior node has between \(\floor{\frac {m - 1} 2}\) and \(m - 1\) keys sorted in asecending order (except the root, which can have between 1 and \(m - 1\)), and has children for each range between its keys - for keys \(k_1, \ldots, k_n\), we have ranges \((-\infty, k_1], (k_1, k_2], \ldots, (k_{n - 1}, k_n], (k_n, \infty)\). Leaf nodes have between \(\floor{\frac d 2}\) and \(d\) records, where \(d\) is a positive integer.

The constraint on the number of keys in each node ensures that the tree's leaf level is always at least half full, to minimise the height of the tree.


Suppose we want to insert a record into a B+ tree. First, we search for the leaf node we want to insert into, and then insert it.

If a leaf node overflows (has more than \(d\) keys), we split the records into two new, roughly equally sized data nodes (the left size is 0 or 1 more than the right size), then add a key to the parent index node for the largest value of the left side. This can cause the parent node to overflow as well, in which case we repeat the previous steps for the parent.

If an index node overflows (has more than \(m - 1\) keys), we split the keys into two new, roughly equally sized index nodes (the left side is 0 or 1 more than the right size), where the middle key of the original index node (or the key right after the middle of the key list) is moved into the parent index node. This can cause the parent node to overflow as well. If this node was originally the root, the new parent node is now the new root.

Suppose we want to delete a record from a B+ tree. First, we search for the leaf node we want to delete from, and then delete it.

If a leaf node underflows (has less than \(\floor{\frac d 2}\) keys), we check its neighbor - if the neighbor has exactly \(\floor{\frac d 2}\) keys, we merge the leaf node with its neighbor, removing a key from the parent. Otherwise, we redistribute records from the neighbor into this leaf node so they have roughly the same amount (at this point, neither is underflowing), and then recalculate keys in the parent node if necessary. This can cause the parent node to underflow as well if we end up merging (merging is only done if redistribution isn't possible).

If an index node underflows (has less than \(\floor{\frac m 2}\) keys), we also check its neighbor - if the neighbor has \(\floor{\frac m 2}\) keys, we merge the index node with its neighbor, removing a key from the parent. Otherwise, we redistribute keys from the neighbor into this index node, like with leaf nodes. This can cause the parent node to underflow as well if we end up merging (merging is only done if redistribution isn't possible).


The B+ trees we have seen so far are only for the primary key indices. For multiple indices, things get a lot more complicated.

The Relational Model

Modern DBMS use the relational model, based on relational algebra and first order logic. Models have the structural components, contraint components, and the operational components. These DBMSs became possible to implement a few decades ago, replacing earlier heirarchical models.

In relational models, the main idea is that all the information is represented using flat relations, which allows us to specify the structure of the data declaratively.

The relational schema represents relations that we want to model.

We can represent a relation \(R\) over \(n\) attributes as a set of \(n\)-tuples \(\tup{a_1, \ldots, a_n} \in D_1 \times \ldots \times D_n\). \(\tup{a_1, \ldots, a_n}\), where \(D_1, \ldots, D_n\) are the domains, sets of simple, single-valued attributes - \(D_i\) is the set of all values of attribute \(i\). In other words, relations are sets of tuples of attributes.

Attributes may also have a null value, which means that the value of the attribute is actually unknown.

If the relation is in \(D_1 \times \ldots \times D_n\), then it is on \(n\) sets and \(n\) is the degree of the relation. So a binary relation is a relation of degree 2.

In DBMSs, all relations are represented as flat tables, where the columns are the attributes \(D_1, \ldots, D_n\), and the rows are the instances of the relation. Note that there is no concept of entities here - relations represent both entity-relational entities and entity-relational relations.

The intension of a relation is the relational schema, and the intension of the database is the database schema \(R = \set{R_1, \ldots, R_k}\). The extension of a relation is the set of possible tuples \(\tup{a_1, \ldots, a_n}\).


By the closed world assumption (if something is true, then we know it's true), relations are complete - if things are related to each other, then we also know that they are related to each other. In other words, all the things we want to represent are known to us/representable in the table - everything is true or false, and all true things are in our database.

Converting ERDs to relational schemas

We have several heuristics for converting ERDs to relational schemas, but there isn't always a good way to.

When there is a one to many relation between doctors (Doctor(DOCTOR_ID, ...) in the ERD) and hospitals (Hospital(HOSPITAL_ID, ...) in the ERD), we might represent doctors in the relational schema as Doctor(DOCTOR_ID, HOSPITAL_ID, ...) - we added an attribute to doctors that specifies which hospital they work in, to represent this one-to-many relation. In general, when we have a many-to-one or one-to-one relation, we can add an attribute to one of the objects identifying the instance of the other object it's associated with.

When there is a many to many relation between students (Student(STUDENT_ID, ...)) and courses (Course(COURSE_ID)), we might add a new table for the relation Taking(STUDENT_ID, COURSE_ID). In general, when we have a many-to-many relation, we can add a new table that associates them with each other, where the primary key is chosen from the attributes of the entity types it relates.

A strong entity type is one that has a primary key, while a weak entity type does not. Strong entity types dominate weak entity types when connected by a relationship. Strong entity types are directly translated to tables where the columns are their attributes. Weak entity types result in tables with their attributes as well, but also include the primary key of entity types that dominate it and the attributes of its relationships.

For specialization (is-a relationships), we have separate tables for the subclasses, and the superclass doesn't need a table. Also, attributes are inherited, so we write them out directly for each subclass.

For aggregation, we create a separate table for the relationship if it doesn't exist, where the primary key is the primary key for the relationship.


Constraints limit the form of the data that can exist within the DBMS. Some constraints are inherent, such as the fact that a prmary key cannot contain null values.

There are also other types of constraints:


Table definitions have the following form:


Table contraints take the following form:

    "UNIQUE" "(" <COLUMN 1> "," <COLUMN 2> "," <...> ")" |
    "PRIMARY KEY" "(" <COLUMN 1> "," <COLUMN 2> "," <...> ")" |
    "FOREIGN KEY" "(" <COLUMN 1> "," <COLUMN 2> "," <...> ")" "REFERENCES" <TABLE NAME> "(" <COLUMN> ")" ["ON" ("UPDATE" | "DELETE") ("SET NULL" | "NO ACTION" | "CASCADE" | SET DEFAULT)] |

The SET NULL foreign key option sets broken references to null - when we delete the parent instance, the child reference becomes null. The NO ACTION option prevents the addition or deletion in the first place. The CASCASE option also deletes the child record when the referenced parent record is deleted. The SET DEFAULT option causes child reference's foreign key attribute to fall back to the default value set for the column.

Relational Algebra

Operations are functions of relations - they accept and return relations/tables. The basic operations are union (\(A \cup B\)), difference (\(\setminus\)), Cartesian product (\(\times\)), projection (\(\pi\)), selection (\(\sigma\)), and renaming (\(\rho\)).

Relational algebra is the formal basis for modern relational models. It is the basis of SQL, the language used to query DBMSs. Relational algebra is important because it has the closure quality - operands and the results of operations are both in the set of relations.

Union of \(A\) and \(B\) is simply the set of those tuples that are in \(A\) or \(B\). Difference of \(A\) and \(B\) is simply the set of those tuples that are in \(A\) but not \(B\).


The attributes of a relation scheme \(R\) are represented as \(\mathrm{Attr}(R)\).

A union or difference of two relational schemes \(R, S\) is the set union/difference of those sets. These are written as \(R \cup S\) and \(R - S\).

An intersection is the set of all records that are identical between two relation schemes \(R, S\), written \(R \cap S\).

A projection of a relation scheme \(R\) is a permutation of a subset of every tuple in \(R\) - a table where the columns are the \(R\)'s columns rearranged or removed. This is written as \(\pi_{A_{i_1}, \ldots, A_{i_m}}: R[A_1, \ldots, A_n] \to A_{i_1} \times \ldots \times A_{i_m}\).

Note that the resulting table for a projection may have duplicate records if \(A_{i_1}, \ldots, A_{i_m}\) is not a candidate key.

A selection of a relation scheme \(R\) is the set of tuples in \(R\) that satisfy a boolean condition \(F\) (where the expression allows attributes, constants, \(\le, \ge, <, >, =, \ne, \wedge, \vee, \neg\) operators, and parentheses). This is written as \(\sigma_F(R) = \set{t \in R \middle| F(t)}\).

The Cartesian product of two relational schemes \(R, S\) is the set of all possible tuples formed by concatenating a tuple in \(R\) with a tuple in \(S\) - each tuple in \(R\) prepended to each tuple in \(S\). This is written as \(R \times S\), and results in \(\abs{R} \abs{S}\) tuples, which can be a lot if both relation schemes were large.

A rename of a relation scheme \(R\) allows us to rename relation schemes and their attributes. This is written as \(\rho_{S[B_1, \ldots, B_n]}(R[A_1, \ldots, A_n])\), which renames \(R[A_1, \ldots, A_n]\) to \(S[B_1, \ldots, B_n]\).

Basically, projection is a rearrangement of columns, selection is a filter for records, Cartesian product is a combination of records, and rename is simply a renaming.

A relational algebra expression is a finite expression of the following form:

Any language that provides at least the retrival power of relational algebra expressions is relationally complete. Relational algebra is a useful notation since we can analyze it using first order logic techniques, and allows analysis independent of the DBMS.

Relational queries are a subset of all computable queries, and represent real-world querying needs pretty well.


A natural join over relation schemes \(R, S\) is a relation scheme \(R * S\) with records having attributes \(\mathrm{Attr}(R) \cup \mathrm{Attr}(S)\), where \(\mathrm{Attr}(R) \cap \mathrm{Attr}(S)\) are identical between \(R\) and \(S\). In other words, it is the set of records of unique attributes from both relation schemes where the common attributes agree between \(R\) and \(S\).

A theta join (\(\theta\)-join) over relation schemes \(R, S\) and attributes \(A \in \mathrm{Attr}(R), B \in \mathrm{Attr}(S)\) is \(R *_{A \theta B} S = \sigma_{A \theta B}(R \times S)\), where \(\theta\) is a comparison operator like \(\ne\) or \(\ge\). For example, an equijoin is a theta join where \(\theta\) is \(=\) - \(R *_{A = B} S = \sigma_{A = B}(R \times S)\).

Note that natural join is simply an equijoin where duplicate columns are removed - \(R * S = \pi_{\mathrm{Attr}(R) \cup \mathrm{Attr}(S)}(\sigma_{A = A, A \in \mathrm{Attr}(R) \cap \mathrm{Attr}(S)}(R \times S))\)

Universally quantified queries are those that use universal quantifiers, like "for all" and "exists".

Relational algebra examples:

Structured Query Language (SQL)

The query language we will use in this course will be SQL. This language is a way to represent relational algebra in a machine-readable way, and also allow definitions of relational schemes. Since SQL is built on sound mathematical foundations, we have certain guarantees about the data stored in the database, such as safety and consistency.


SQL is a way to write relational algebra in a machine-readable format. SQL can be either interactive or embedded in an application, and embedded SQL is either static or dynamic. SQL isn't Turing complete (no loops/recursion), but it can be combined with a Turing complete language to be able to evaluate all computable queries.

In SQL, columns have data types like INTEGER, VARCHAR, VARBINARY, or others like dates and times.

Columns in a table can also be null unless otherwise constrained (like primary key or not null constraints). Null values don't violate the closed world assuption because it doesn't mean that the value isn't in the database, just that it isn't known/applicable. Conditions with null values are neither true nor false (checking whether something is equal to null, for example), so we actually have three-valued logic (true, false, and unknown). SQL queries only select tuples that evaluate to true, so tuples that evaluate to neither true nor false aren't selected.

A table is like a relaion, but duplicates are allowed unless otherwise constrained. Indices are generated for columns of tables to speed up selection and joins.

SQL is both a data declaration language and a retrieval language. There are commands such as table create/alter/drop, index create/drop, and view create/drop, as well as access control commands such as grant/revoke. See slide 122 and after for the exact syntax used in this course.

Creating tables/indices/views builds a new table/index/view, altering tables add/remove columns (expanding with nulls as necessary), and dropping tables/indices/views deletes them, removing associated indices/views as necessary.

Consider SELECT 'the supplier ID is ', supplier_id, ' with status ', status FROM SUPPLIER WHERE city = 'Paris', which selects tuples of the form \(\tup{\text{the supplier ID is }, x \in \mb{Z}, \text{ with status }, y \in \mb{Z}}\).

SQL has tuple iteration semantics - we can iterate over the result to get a sequence of tuples.

For example, SELECT DISTINCT a, b FROM X, Y WHERE X.c = Y.c is an equijoin/inner join on \(c\) over \(X\) and \(Y\), projecting the attributes \(a, b\).


Suppose we have two relation schemes \(A, B\) both with attribute \(x\).

An inner join on \(A\) and \(B\) over \(x\) is the set of all records in \(A\) extended with attributes of \(B\) where \(x\) in \(A\) is equal to \(x\) in \(B\) - records of \(A\) and \(B\), only if they have matches in each other.

A left join on \(A\) and \(B\) over \(x\) is the set of all records in \(A\), extended with attributes of \(B\) where applicable (for records that don't match a record in \(B\), the record is extended with attributes of \(B\) all with value null) - records of \(A\), regardless of whether they have matches in \(B\) or not. A right join is the same but with \(A\) and \(B\) flipped around.

An outer join on \(A\) and \(B\) over \(x\) is the set of all records in \(A\) and \(B\), extended with attributes of each other where \(x\) matches up - records of \(A\) and \(B\), regardless of whether they have matches or not.

A cross join on \(A\) and \(B\) is simply \(A \times B\), the Cartesian product.

In SQL, we can have subqueries that allow us to query other things in our queries. For example, SELECT DISTINCT employee_id, name FROM EMPLOYEES WHERE employee_id IN (SELECT employee_id FROM BENEFIT_PLANS WHERE type = "fixed") - employees with at least one fixed benefit plan (here, we could also just use a join).

SQL can easily compute the answer for existential quantifiers, by querying to find something. To compute the answer for universal quantifier, we simply use \(\forall x. y \equiv \neg \exists x. \neg y\). For example, SELECT * FROM A WHERE NOT EXISTS (SELECT * FROM B WHERE x = 5 AND y = A.y) selects all entries of A corresponding to entries in B such that none of them have x equal to 5.

UNIQUE(...) returns true if there are no duplicate items in .... EXISTS(...) returns true if there are any items in ....


In SQL, we can rename schemes using AS: SELECT * FROM A AS B WHERE B.x = 5. This is very useful when we have subqueries that operate on the same tables.

Find suppliers that supply every part: SELECT name FROM SUPPLIER WHERE NOT EXISTS (SELECT * FROM PART WHERE NOT EXISTS (SELECT * FROM SUPPLIER_PART WHERE supplier_id = SUPPLIER.supplier_id AND part_id = PART.part_id)). Basically, SELECT * FROM PART WHERE NOT EXISTS (SELECT * FROM SUPPLIER_PART WHERE supplier_id = SUPPLIER.supplier_id AND part_id = PART.part_id) means "select parts which are not supplied by the given supplier", so the expression overall means "select supplier names for which there are no parts that are not supplied by the given supplier".

The strategy for solving this kind of problem is to first write the query using formal logic, and then transform that into SQL. For the above example, we could have written the problem first as \(\set{s_{\text{name}}, s \in \text{SUPPLIER} \middle| \forall p \in \text{PART}, \exists r \in \text{SUPPLIER_PART}, s_{\text{supplier_id} = r_{supplier_id} \wedge p_{\text{part_id} = r_{part_id}}}\) - the suppliers for which for every part, they supply those parts.

There are also aggregate functions over tables to allow us to obtain aggregate values such as sum, average, and count, which is something that relational algebra cannot do. For example, COUNT(SOME_EXPR) and COUNT(DISTINCT SOME_EXPR) is the number of values/distinct values in SOME_EXPR (which can be an expression like passed_amount + failed_amount), and SUM(SOME_EXPR) and AVG(SOME_EXPR) obtain the sum and average.

We can also group tuples together using GROUP BY: SELECT part_id, SUM(quantity_available) FROM SUPPLIER_PART GROUP BY part_id (the attributes in the GROUP_BY clause must uniquely identify each resulting group). Additionally, we can use the form GROUP BY SOME_ATTRIBUTES HAVING SOME_EXPR, such as GROUP BY part_id HAVING COUNT(*) > 2 would only show those parts where there are at least 3 suppliers for each.

List the ID and average ratings for suppliers that supply at least one part that isn't supplied by any other supplier: SELECT supplier_id, AVG(rating) FROM SUPPLIER GROUP BY supplier_id HAVING EXISTS (SELECT * FROM SUPPLIER_PART WHERE supplier_id = SUPPLIER.supplier_id AND part_id NOT IN (SELECT part_id FROM SUPPLIER_PART WHERE supplier_id != SUPPLIER.supplier_id)). Here, SELECT part_id FROM SUPPLIER_PART WHERE supplier_id != SUPPLIER.supplier_id are all the part IDs for parts not supplied by the given supplier, and SELECT * FROM SUPPLIER_PART WHERE supplier_id = SUPPLIER.supplier_id AND part_id NOT IN (SELECT part_id FROM SUPPLIER_PART WHERE supplier_id != SUPPLIER.supplier_id) are all the supplier parts for which no supplier other than the given one supplies the part.

In SQL, FROM SOME_TABLES generates the data sets, WHERE SOME_CLAUSES filters data sets, GROUP BY SOME_ATTRIBUTES aggregates data sets, HAVING SOME_CLAUSES filters aggregated data sets, SELECT SOME_VALUES transforms data sets, ORDER BY SOME_ATTRIBUTES sorts data sets, and LIMIT SOME_LIMIT/OFFSET SOME_OFFSET filters specific ranges of entries in data sets.


Midterm covers everything right up to ut not including relaitonal algebra. It takes place on November 6 and I will be writing in M3 1006.

We insert tuples into the table using INSERT INTO SOME_TABLE_OR_VIEW VALUES(...) to insert explicit values or INSERT INTO SOME_TABLE_OR_VIEW SELECT ... to insert all selected tuples.

We delete tuples from the table using DELETE SOME_TABLE WHERE ... or DELETE SOME_TABLE to delete all elements.

A view is a virtual table, derived from one or more base tables/views. This table doesn't actually physically exist in the database, but is computed dynamically from its definition in SQL. Basically, it's a schema defined in SQL whose elements are dynamically computed. Views can be queried but not updated, since their values are purely a function of their base tables/views.

Named views can be created with CREATE VIEW SOME_VIEW AS SOME_SUBQUERY. For example, CREATE VIEW S AS SELECT * FROM SUPPLIER WHERE volume <= 20 WITH CHECK OPTION. WITH CHECK OPTION disallows changes to S that modify records but don't modify the view itself - in the previous example, that means that we can't insert, update, or delete suppliers who have volume greater than 20.

Operations on some views can't easily be translated into operations on their base tables. For example, for CREATE VIEW X AS SELECT supplier_id, AVG(rating) FROM SUPPLIER there is no unambiguous way to map changes to the view to SUPPLIER.

A cursor is similar to a pointer for a set of records, a construct that allows records to be traversed over.

Embedded SQL can be embedded by allowing host applications to perform queries by preprocessing SQL statements embedded in the host language using a special SQL compiler (statement level interface), or construct queries as strings in the host language and perform queries by calling into a library (call level interface).

In this course we will use C as our host language and Oracle as our database.

We can do statement level interface using the exec sql macro:

#include <stdio.h>
#include <ctype.h>
exec sql sqlca; // include the SQL communication area - a structure used for communicating between Oracle and C and storing member variables
int main() {
    // declare SQL host-side variables to use with our application
    exec sql begin declare section; // host variables must be defined within a declare section
        VARCHAR customer_city[21]; // variable sized character array with space for the null terminator
        VARCHAR customer_name[21]; customer_float discount;
        VARCHAR db_username[11], db_password[11];
    exec sql end declare section;
    exec sql CONNECT :db_username IDENTIFIED BY :db_password; // host-side variables are referenced by adding a preceding colon
    exec sql DECLARE c_city CURSOR FOR select name, discount FROM CUSTOMERS WHERE city = :customer_city;
    exec sql OPEN c_city;
    exec sql WHENEVER NOT FOUND GO TO next; // do a jump when there are no more records
    while (TRUE) {
        exec sql FETCH c_city INTO :customer_name, :customer_discount; // automatically jumps to `next` when there are no more entries to fetch
        customer_name.arr[customer_name.len] = '\0';
        printf("name: %s, discount: %f\n", customer_name.arr, customer_discount);
    // if we modified any records above, we would have needed a `exec sql COMMIT WORK` to the 
    exec sql CLOSE c_city;
    exec sql DISCONNECT;


Cursors can be insensitive using the DECLARE c_city INSENSTIVE CURSOR, which means that the cursor can't update records in the database since they make a local copy (this is more efficient, since the DBMS knows we are reading but not writing).

Cursors can be scrollable or not (default is not), which means that the cursor can be read forward and backward - without scrollable, we can only read the next record and can't go backward.

For dynamic SQL, we can prepare a query from a character string, and then execute it as many times as we want. In dynamic SQL, host variables don't need to be declared; instead, we can bind and retrieve values from dummy variables in the query.

For example, we can execute a query with exec sql EXECUTE IMMEDIATE "SELECT * FROM CUSTOMER". We can also prepare a statement and bind some variables:

    char *db_username = "aaaa";
    char *db_password = "bbbb";
    varchar statement[80];
    varchar cust_name[21];
exec sql CONNECT :db_username IDENTIFIED BY :db_password;

// prepare the statement
statement.len = sprintf(statement.arr, "SELECT * FROM CUSTOMER WHERE name = :name");
exec sql PREPARE S FROM :statement;

// create cursor
exec sql DECLARE c_city CURSOR FOR S;
exec sql OPEN c_city USING :cust_name; // bind the variable :name to use :cust_name as its value

// execute the query here

JDBC is a call-level interface for SQL. In JDBC, we connect to the database, send SQL statements, and process the results.


Schema Analysis

Schema analysis is the analysis of relational models - what makes one schema good, and another bad?

In this course, \(R, X, Y, Z\) refer to relation schemes, \(r\) refer to a relation, \(s, t, u, v\) refer to tuples in relation schemes, \(A, B, C, D, E\) refer to attributes.

\(ABC\) and other combinations of attributes represent the set of those attributes, \(\set{A, B, C}\). \(f, g, h\) represent functional dependencies, while \(F, G, H\) represent sets of functional dependencies.


Midterm covers everything up to foreign key contraints, slide 93.

A functional dependency is a constraint between two sets of attributes in a relation \(R\), such that the value of one set of attributes combined with the tuples in \(R\) fully specify the value of the other. A functional dependency is written as \(X \to Y\) - the attributes \(X\) and the tuples in \(R\) fully specify the attributes \(Y\). We can write this as \(X\) functionally determines \(Y\).

Basically, \(X \to Y\) means that knowing \(X\) allows you to know \(Y\), if you have \(R\).

For example, if the customer's fee is $2 per book bought, and a table has a "fee" column and a "books_bought" column, there is a functional dependency between those columns - we can look up the fee from the books bought, and the books bought from the fee.

For example, if an employees table has employee IDs and their names, then the names are functionally determined by employee IDs - we can look up a name from an employee ID and the records in the table.

A relation \(r\) satisfies \(X \to Y\) if and only if when two tuples in \(r\) have the same values for attributes in \(X\), they also have the same values for attributes in \(Y\). In other words, a relation satisfies a functional dependency if it has that functional dependency.

If \(F\) (a set of functional dependencies) being satisfied implies that \(f\) (a functional dependency) is satisfied, then \(F \models f\) (\(F\) logically implies \(f\)). For example, \(\set{A \to B, B \to C} \models A \to B, B \to C, AC \to B, AB \to C, A \to C\) (as well as all trivial functional dependencies).

The closure of a set of functional dependencies \(F\) is the set of all the functional dependencies it logically implies - \(F^+ = \set{X \to Y \middle| F \models X \to Y}\). Saying \(f \in F^+\) is equivalent to saying \(F \models f\). Note that the size of \(F^+\) grows exponentially with respect to the sixe of \(F\).

The closure of a set of attributes \(X\) is the set of all attributes that functionally depend on \(X\) - \(X^+ = \set{A \middle| X \to A \in F^+}\). Note that \(X \to Y\) is in \(F^+\) if and only if \(Y \in X^+\).

Two sets of functional dependencies \(F, D\) are equivalent or cover each other if and only if they have the same closure - \(F^+ = G^+\). We write this as \(F \approx G\).

A set of attributes \(X\) is a candidate key of a relation scheme \(R\) if and only if \(X \to R \in F^+\) (all the attributes of \(R\) functionally depend on \(X\)) and there is no proper subset \(Y \subset X\) such that \(Y \to R \in F^+\). Basically, a key is a minimal set of attributes that specifies a single record.

A superkey is a superset of any key - a set of attributes that contains a key.

Given, a set of attributes \(Y = \set{A_1, \ldots, A_n}\), \(F \models X \to Y\) if and only if \(F \models X \to A_1, \ldots, F \models A_n\).

\(F\) is a minimal cover if and only if all of the following are true:


To compute the closure of an attribute set \(X\), \(X \to Y\) is logically implied by \(F\):

def attribute_set_closure(X, F):
    closure = X # initially, the closure contains all the attributes of the functional dependency
    changed = True
    while changed: # loop while the closure changes
        changed = False
        for (left_side, right_side) in F:
            if left_side.issubset(closure) and not right_side.issubset(closure): # left side is part of closure and right side has attributes not in the closure
                closure += right_side # add all the new attributes to the closure
                changed = True
    return closure

A decomposition of a relation scheme \(R\) splits its attributes over multiple new relation schemes, possibly duplicating them. Formally, a decomposition of \(R\) is \(set{R_1, \ldots, R_k}\) where \(\operatorname{attributes}(R_1), \ldots, \operatorname{attributes}(R_k) \subseteq \operatorname{attributes}(R)\) and \(\operatorname{attributes}(R_1) \cup \ldots \cup \operatorname{attributes}(R_k) = \operatorname{attributes}(R)\).

For example, a relation \(\text{CAR}(\text{model}, \text{cylinders}, \text{origin}, \text{tax})\) can be decomposed into \(\set{\text{CAR}(\text{model}, \text{cylinders}, \text{origin}), \text{TAX}(\text{origin}, \text{tax})}\).

An attribute in a relation \(R\) is prime if and only if it is in some candidate key of \(R\), and is nonprime otherwise. Finding all candidate keys is NP-complete, but in practice it is generally quite easy to at least find some of them.

An attribute \(A\) is fully dependent on a set of attributes \(X\) if and only if \(X\) is a minimal set of attributes that functionally determines \(A\) - \(X \to A\) and there does not exist a proper subset \(Z \subset X\) such that \(Z \to A\)

Let \(X \to Y\) be in \(R\) and \(Y \to X\) not in \(R\) (\(X\) functionally determines \(Y\), but not the other way around). An attribute \(A\) is transitively dependent on a set of attributes \(X\) if and only if \(A \notin X \cup Y\) and \(Y \to A\) is in \(R\) - \(A\) can be determined from \(Y\), which can be determined from \(X\), but \(X\) can't be determined from \(Y\).

A relation \(R\) is in 1NF (first normal form) if and only if the domains of all attributes in \(R\) are just atomic/simple/primitive values, and each attribute only has a single value from its domain.

A relation \(R\) is in 2NF (second normal form) if and only if \(R\) is in 1NF, and each nonprime attribute (attributes that are not part of any candidate keys) are fully dependent on every candidate key of \(R\) - every candidate key is a minimal set of attributes that functionally determines \(A\).

A relation \(R\) is in 3NF (third normal form) if and only if \(R\) is in 2NF, and there are no nonprime attributes (attributes that are not part of any candidate keys) that are transitively dependent on any candidate key of \(R\) - every nonprime attribute is not transitively dependent on any candidate key of \(R\).

A relation \(R\) is in BCNF (Boyce-Codd normal form) if and only if \(X \to \set{A}\) and \(A \notin X\), then \(X\) is a superkey of \(R\) (\(X\) contains a candidate key of \(R\)). This is the usual form we aim for. Also, all relations with two attributes is in BCNF.

A set of relations \(\set{R_1, \ldots, R_k}\) is in 2NF or 3NF or BCNF if and only if each relation in it is in that normal form.


No class today - there's a midterm at 4:30PM!


A relation \(R\) satisfies a set of functional dependencies \(F\) if and only if, for all functional dependencies \(f \in F\) over only those attributes in \(R\), \(R\) satisfies \(f\) - \(R\) satisfies all applicable functional dependencies in \(F\).

A decomposition \(U = \tup{R_1, \ldots, R_k}\) of a relation \(R\) is a lossless join decomposition with respect to \(F\) if and only if for all \(R_i \in U\) satisfying \(F\), \(R = \pi_{A_1}(R) * \ldots * \pi_{A_n}(R)\) - we can recover the original relation by natural joining all the relations in the decomposition with each other. If we cannot recover this, then \(U\) is a lossy join decomposition.

Basically, if we have a decomposition \(U = \tup{R_1, R_2}\) such that \(R_1\) and \(R_2\) satisfy \(F\), then it is lossless if \(R_1 * R_2 = R\). There is a theorem that says \(U = \tup{R_1, R_2}\) is a lossless join decomposition if and only if either \((\operatorname{Attr}(R_1) \cap \operatorname{Attr}(R_2) \to R_1) \in F^+\) or \((\operatorname{Attr}(R_1) \cap \operatorname{Attr}(R_2) \to R_2) \in F^+\) - the common attributes of the relations must functionally determine at least one of the two relations.

The chase algorithm is a polynomial time way to check if a decomposition is lossless or not. Given a decomposition \(U = \tup{R_1, \ldots, R_k}\) of \(R\), and a set of functional dependencies \(F\):

A decomposition \(U = \tup{R_1, \ldots, R_k}\) or a relation \(R\) where \(R\) has functional dependencies \(F\) and \(R_i\) has functional dependencies \(F_i\) is dependency-preserving if and only if \(F_1^+ \cup \ldots F_k^+ = F^+\) - if the combined closures of functional dependencies of all the relations in the decomposition is the same as the closure of the functional attributes of the original relation.

For example, consider a relation \(\text{SUPPLIER}(\text{SUPPLIER_ID}, \text{CITY}, \text{STATUS})\) with functional dependencies \(\text{SUPPLIER_ID} \to \text{CITY}\) and \(\text{CITY} \to \text{STATUS}\). One possible decomposition of this is \(U = \tup{\tup{\text{SUPPLIER_ID}, \text{CITY}}, \tup{\text{CITY}, \text{STATUS}}}\), and another is \(V = \tup{\tup{\text{SUPPLIER_ID}, \text{CITY}}, \tup{\text{SUPPLIER_ID}, \text{STATUS}}}\).

Clearly, \(U\) preserves dependencies, while \(V\) does not - there could be a case where \(\text{CITY} \to \text{STATUS}\) is not the case, when two different SUPPLIER_IDs have different cities but the same STATUS. Also note that, \(U\) is a lossy join decomposition while \(V\) is a lossless join composition.

There are actually algorithms that can generate 3NF, BCNF, lossless join decompositions, and dependency-preserving decompositions.



Consider a banking app. The following code might be used for a transfer:

read source account
subtract amount to transfer
write source account
read target account
add amount to transfer
write target account

This is a very poor way of going about data management. For example, if a power failure occurs right after write source account, the amount to transfer is simply lost, and the database is left in an inconsistent state. If the source account has an amount transferred into it between read source amount and write source amount, that transfer is lost.

Suppose we have concurrent operations that both read the same thing, both process the read data in different ways, then both write it back to the same place. Clearly, one of the written values must be lost - this is the lost update problem.

What if we're in the middle of a series of database operations, and we suddenly find that an error occurred? If we stopped in the middle of that series of operations, we might leave the database in an inconsistent state - we need the ability to rollback changes. Also, if we change written values in one series of operations from a concurrent one, even if we abort this one the changes made by the other concurrent operations should not be rolled back - the dirty read problem. Additionally, if other concurrent operations take place in the middle of this series of operations, they might see the intermediate values instead of the final result - the inconsistent analysis problem.

A transaction is a logical unit of work Transactions should have the following characteristics (ACID compliance):

Transactions only apply to writes, since we don't really need to worry about these issues when reading. Atomicity and consistency ensure that inconsistent states never happen, even in the event of power failures. Durability ensures that the lost update problem never manifests. Atomicity and isolation ensures that the dirty read problem never manifests. Isolation ensures that the inconsistent analysis problem never manifests.

The typical workflow when using transactions is to begin a transaction, make some changes, them either commit the changes (successfully execute the transaction) or abort/rollback (undo all the changes made as if the transaction never occurred).

Let \(T\) be a set of transactions. A schedule \(S\) of \(T\) is a sequence of operations formed by merging the operations in each transaction in \(T\) in the order they would be executed - that means interleaving them if run concurrently.

Suppose one transaction modifies something, and then second, concurrent transaction reads that modified thing and writes something else based on that. If the first transaction is aborted for any reason, the second must be too, since it is based on data that is now rolled back. The rolling back of the second transaction is called a cascading rollback. A schedule avoids cascading rollbacks if and only if every transaction on the schedule never reads something that is written by an uncommited transaction.

In the read before write model of transactions, the operations in transactions are read (obtaining an item's value), read/write (modifying an item based on its value), insert (add a new item), and delete (remove an existing item). From now on, we will use this model.

A schedule \(S\) is serial if and only if for each transaction in \(S\), all the steps execute consecutively - one transaction runs without interruption, then another, and so on, and no steps are interleaved.


In the read before write model, two schedules \(S_1, S_2\) are view equivalent when the following are satisfied:

Some non-serial schedules are equivalent to serial schedules. A schedule is serializable if and only if it is serial or equivalent to some serial schedule.

Let \(S\) be a schedule. Two steps \(i, j\) in \(S\) conflict if and only if they are from different transactions, operate on the same data item, and at least one of the steps involves a write operation. In other words, these are the pairs of steps for which the order of the steps affects the result of the operation.

For the read before write model, there is actually an algorithm that determines whether a schedule \(S\) is conflict-serializable (this doesn't determine view-serializability):

  1. Generate a precedence graph \(G\) for the schedule.
  2. Attempt to find a topological ordering of \(G\) (see CS341 notes for topological ordering properties).
  3. If a topological ordering cannot be found (this would be because \(G\) contains a cycle), \(S\) is not serializable.
  4. Otherwise, the topological ordering is an order of transactions that form an equivalent serial schedule.

In real-world use, serializing schedules is NP-complete and isn't practical for scheduling transactions. Instead, we use locks in the read/write lock model to ensure that we don't change or read data when we shouldn't.

Locking operations are managed by the lock manager, which implements:


Read locks and write locks are mutually exclusive - they cannot both be acquired at the same time.

Livelock/starvation is when a transaction wants a lock, but can never get it because it keeps getting locked by other transactions first. This can be solved by using a lock request queue - when a transaction tries to acquire a lock on a locked item, it gets put in the lock request queue and eventually gets its turn.

A deadlock is when transactions in a set of two or more transactions are each waiting to acquire a lock on an item that is locked by a different transaction in that set - A is waiting for Y while locking X, while B is waiting for X while locking Y. We can solve this by requesting all the locks at once, always locking items in a fixed order, or detecting deadlocks and aborting one of the deadlocked transactions.

In the read/write lock model, two schedules \(S_1, S_2\) are view equivalent when the following are satisfied:

We can generate a serial schedule in the read/write lock model quite simply as well, also using a topological sort:

  1. Generate a precedence graph \(G\) for the schedule.
  2. If there is a topological ordering, then the transactions in that ordering form a serial schedule, otherwise there is no possible serial schedule.

In two-phase locking, locks must always appear before unlocks within individual transactions. This divides the transaction into a growing phase, when we acquire all the locks, and a shrinking phase, when we release them. There is a theorem that says that schedules with transactions using two-phase locking are always serializable.

What is a data item? In our models they are the smallest unit that we can lock - a single row.


A predicate lock is another type of lock that defines a set of satisfying rows (via a predicate), like those that have a matching attribute value. Predicate locks have read and write variants just like locks on data items.

A phantom read is when two identical queries in the same transaction are executed, and they give different rows. This can still happen under our locking model because we can only lock rows that already exist - if new rows get added in between, they aren't included in the locks. These rows are called phantom rows:

transaction A: SELECT * FROM A ; select some rows, locking the selected rows
transaction B: INSERT INTO A VALUES(...) ; add a row, this row is not locked by the select statement above, so we can freely insert
transaction C: SELECT * FROM A ; the newly inserted row is selected here, in addition to the previously selected rows - there are more results than before, even though both queries are in transaction A

Predicate locks mitigate this because when a new row that matches the predicate gets inserted, it gets automatically locked and added to the predicate lock.

A non-repeatable read is when reading a row twice result in two different read values - sort of like phantom reads, but the row's value is changed rather than new rows being added:

transaction A: SELECT sum(x) FROM A ; select some rows, locking the selected rows
transaction B: UPDATE A SET x = 1 WHERE x != 1 ; the sum of all the values changes
transaction C: SELECT sum(x) FROM A ; sum is changed when reading here

Non-repeatable reads can only occur if we don't lock properly - long duration read/write locks prevent these.

Most SQL statements operate over a set of rows - this set of rows can be expressed as a predicate. When we run a statement, we can simply acquire a predicate lock for the SQL statement's predicate to ensure isolation.

Two predicate locks conflict if they have a common row matching both predicates, and one of the predicate locks is a write. If two statements have predicate locks that don't conflict, they can be run concurrently.

DBMSs can have their own nuances to locks, such as how they do locking based on rows/predicates. A short duration lock is held for the duration of one statement, a long duration lock is held for the rest of a transaction, and a medium duration lock is held for a duration in between these two.

Ensuring that schedules are perfectly serializable when using predicate locks can get expensive, since the locks can cover a lot of rows, and reduce concurrency. To mitigate this, DBMSs allow us to trade isolation for performance - allowing/disallowing phantom reads and non-repeatable reads to allow more transactions to execute in parallel. This can be done with SET TRANSACTION READ ONLY ISOLATION_LEVEL or SET TRANSACTION READ WRITE ISOLATION_LEVEL where ISOLATION_LEVEL is one of:

Writes (update/insert/delete) always need to hold long duration predicate write locks (regardless of isolation level).


The CURSOR STABILITY isolation level is a non-standard isolation level that is somewhere between READ COMMITTED and REPEATABLE READ. This is the same as READ COMMITTED, but with additional locks for cursors.

With CURSOR STABILITY, transactions can read data that has been committed by other transactions - repeatable reads while cursor isn't moved and phantom reads possible, Reads use short duration read locks, but accessing rows through cursor uses medium duration lock (locks rows until the cursor is moved).

The purpose of isolation levels is to allow us to selectively give up certain guarantees in exchange for better concurrency. Therefore, we want to use the lowest (most concurrent) isolation level we can get away with - for each transaction, we should look at what other transactions might be running concurrently, and check whether we care about repeatable reads or phantom reads.

Computers can crash at any time, for a number of reasons - hardware failure, software crashes, etc. Databases should be able to fail gracefully without any loss of data, in as many situations as possible.

To model this, we will assume that we have access to stable storage, a block-based storage device that can be written to atomically, and all data on this device we will consider safe. Stable storage doesn't exist in real life - RAM is volatile, while hard drves can't be written atomically, and are prone to failure.

Transaction-local failure is a failure of a single transaction, like a write failure. System-wide failure is a failure of the entire system, like an operating system crash, Media failure is a failure of the non-volatile storage.


A block/physical record/page is an atomic unit of storage. When we open a file, we allocate a buffer to go with it, which buffers the blocks used in the file.

\(\operatorname{input}(X)\) represents transferring a block containing data item \(X\) from storage to the buffer. \(\operatorname{output}(X)\) represents transferring a block containing data item \(X\) from the buffer to storage.

\(\operatorname{read}(X, v)\) represents setting a variable \(v\) to the value of data item \(X\), in the buffer (automatically runs \(\operatorname{input}\) to bring \(X\) into the buffer if necessary). \(\operatorname{write}(X, v)\) represents assigning the value of data item \(X\) to \(v\), in the buffer (automatically runs \(\operatorname{input}\) to bring \(X\) into the buffer if necessary).

These operations are usually managed by the OS, but we can also directly control them when necessary. For example, \(\operatorname{output}\) can be done using a buffer flush.

A log file is a sequential file stored on stable storage. Log file based techniques can let us recover from system failures at any time.

In incremental logging with deferred update, whenever we need to make a write, we record the exact changes (transaction, data item, old value, new value) to the global log file, and whenever we start a trasaction or commit a transaction we record that operation, ensuring that no data items are \(\operatorname{output}\) until all log file entries are \(\operatorname{output}\). After the system fails before log file entries are output, then we are guaranteed that the writes didn't go through either, since writes are always output after log file entries are output - the transaction didn't execute at all. After the system fails after log file entries are written and the writes didn't succeed, we can redo all the writes in the failed transaction by replaying all the writes in the log (this is called the RDU algorithm) - the transaction is executed from the log file entries. This ensures transactions either execute or don't, and if commited, are on stable storage - atomicity and durability.

In incremental logging with immediate update, we do the same thing, but we only need to guarantee that no data items are \(\operatorname{output}\) until its corresponding log file entries are \(\operatorname{output}\) (rather than all log file entries needing to be output first). After the system fails at any time, we can rollback uncommited transactions by undoing all the writes of uncommited transactions in reverse order that they were written to the log, and then redoing all the writes of committed transactions that happened with or after the first uncommited transaction in order. This tends to work better than incremental logging with deferred update in terms of performance, since we can do log writes while also performing operations on the database.

Searching through the entire log after a system failure can be slow, especially when the log gets longer To get around this, we use checkpoints - periodically, when the database is in a known good state (all modified data items and log entries are output), we can write a checkpoint log entry, which contains a list of uncommitted transactions. When recovering after a system failure, we can simply go back to the last checkpoint to get the transactions that we need to replay - there is no need to search through the entire log.

We can also recover from media failures using log files. Periodically, when the database is in a known good state and no transactions are running, we can make a backup of the database onto stable storage (and afterwards add a log entry noting that the backup took place). After a media failure, we can restore the (possibly outdated) backup and replay transactions in the log to get the current version of the database.


Database security is the act of protecting data against unwanted disclosure or leakage. This allows us to prevent potentially malicious users from being able to make changes or read secret data. Security is generally managed by the DBA (database administrator).

Discretionary access control is a security model in which users have privileges, which are granted and removed by a DBA. There are two types/levels of privileges - the account level (privileges independent of database schema, like creating tables/views), and relation/table level (privileges controlling access to individual tables/views). Every relation/table/view in the database has one and only one owner account, which has all privileges, and can grant any privileges to other accounts.

Granting and revoking is itself a privilege, which means we can grant/revoke the ability for other users to grant/revoke. For example, account A can run GRANT SELECT, INSERT ON EMPLOYEE TO B allows account B to select/insert, while GRANT SELECT, INSERT ON EMPLOYEE TO B WITH GRANT OPTION allows B to additionally grant select/insert privileges to other accounts. If A then runs REVOKE SELECT, INSERT ON EMPLOYEE FROM B, B no longer has access, while if A runs REVOKE SELECT, INSERT ON EMPLOYEE FROM B CASCADE, all accounts granted select/insert privileges from B also have those privileges revoked. We can also grant/revoke on a single column, with GRANT SELECT ON EMPLOYEE(SALARY) TO B.

Creative Commons License This work by Anthony Zhang is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. Copyright 2013-2017 Anthony Zhang.