{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "6defcca5-57ca-4917-8dfd-a6789100f07b",
   "metadata": {},
   "source": [
    "## Announcements\n",
    "\n",
    "- Hw#1 due this thursday at midnight\n",
    "- Lecture exercise 5 is due tonight (mistake in setting the deadline by me!)\n",
    "- Lecture exercise 6 is available at 2pm today due on wednesday at midnight\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1835a6a3-09e1-4364-843b-b8d132e80848",
   "metadata": {},
   "source": [
    "\n",
    "## SuperKey\n",
    "\n",
    "Given a relation R and a set of fds F, the set X is a superkey if \n",
    "X+ contains all the attributes in R.\n",
    "\n",
    "## Key\n",
    "\n",
    "Given a relation R and a set of fds F, the set X is a key if \n",
    "1. X+ contains all the attributes in R, and\n",
    "2. X is minimal (no subset X1 of X has the property that X1+ contains all the attributes)\n",
    "\n",
    "\n",
    "R(A,B,C,D,E,F)    F={AC->DE, BD->AF, EC-> A}\n",
    "\n",
    "BC+ = {B,C}  \n",
    "ABC+ ={A,B,C,D,E,F}  \n",
    "BCD+ = {A,B,C,D,E,F}  \n",
    "BCE+ = {B,C,E,A,D,F}\n",
    "\n",
    "Keys: ABC, BCD, BCE  \n",
    "\n",
    "## Prime Attribute\n",
    "\n",
    "An attribute in a key is called a prime attribute.\n",
    "\n",
    "For the relation  R(A,B,C,D,E,F)    F={AC->DE, BD->AF, EC-> A}  \n",
    "Keys: ABC, BCD, BCE   \n",
    "Prime attributes: A,B,C,D,E  \n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6eac708f-1ad7-4c0e-a9c3-eb9d4f070a3a",
   "metadata": {},
   "source": [
    "# Normal Forms\n",
    "\n",
    "## Boyce-Codd Normal Form (BCNF)\n",
    "\n",
    "Given a relation R and a set of functional dependencies F, we say that R is in BCNF (R is in Boyce-Codd Normal Form) iff for every functional dependency X->Y in F, one of the following is true:\n",
    "1. X is a superkey, or\n",
    "2. X->Y is trivial.\n",
    "\n",
    "\n",
    "\n",
    "Students(RIN, Email, Name, Address)  \n",
    "RIN-> Email, Name, Address  \n",
    "Email -> RIN  \n",
    "Email RIN -> Name  \n",
    "Email Name -> Name  \n",
    "Name Address -> Name\n",
    "\n",
    "Keys: RIN or Email  \n",
    "In BCNF (All fd follow the rules)\n",
    "\n",
    "\n",
    "\n",
    "StudentMajors(RIN, Major, AdvisorRIN, AdvisorName)\n",
    "RIN Major -> AdvisorRIN AdvisorName\n",
    "AdvisorRIN -> AdvisorName\n",
    "\n",
    "Key: RIN Major  \n",
    "Not in BCNF, because AdvisorRIN -> AdvisorName violates it\n",
    "\n",
    "RIN  Major   AdvisorRIN   AdvisorName\n",
    "1    CSCI    456          Sibel\n",
    "2    CSCI    456          Sibel\n",
    "3    CSCI    234          Malik\n",
    "4    GSAS    235          Mei\n",
    "\n",
    "StudentHobbies(RIN, Name, Hobby)\n",
    "RIN-> Name  \n",
    "\n",
    "Key: RIN, Hobby  \n",
    "Not in BCNF, RIN is not a superkey and RIN->Name is not trivial\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2c40d4ce-7ec6-4467-b89b-6936ba53e237",
   "metadata": {},
   "source": [
    "## Third Normal Form (3NF)\n",
    "\n",
    "Given a relation R and a set of functional dependencies F, we say that R is in 3NF (R is in Third Normal Form) iff for every functional dependency X->Y in F, one of the following is true:\n",
    "1. X is a superkey, or\n",
    "2. X->Y is trivial, or\n",
    "3. All attributes in Y are prime attributes.\n",
    "\n",
    "All relations in BCNF are also in 3NF.\n",
    "\n",
    "R(A,B,C,D,E,F)   F={ABC->DEF, F->AB, CDE->CD}\n",
    "\n",
    "\n",
    "Keys: ABC, FC  \n",
    "Prime attributes: A, B, C, F  \n",
    "\n",
    "Not in BCNF because F is not a superkey and F->AB is not trivial  \n",
    "Check 3NF:  \n",
    "\n",
    "- ABC->DEF,  ok because ABC is a superkey\n",
    "- F->AB, ok, F is not a superkey and A and B both are prime attributes\n",
    "- CDE->CD,  ok trivial\n",
    "\n",
    "\n",
    "Address(Street, State, City, Zip)  \n",
    "Street State City -> Zip  \n",
    "City Zip -> State  \n",
    "\n",
    "Key: Street State City, Street City Zip\n",
    "Prime attributes: all attributes"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "378489e5-bd94-45f5-8589-06587db59433",
   "metadata": {},
   "source": [
    "## Basis\n",
    "\n",
    "A set of functional dependencies F is said to be in basis form, if there is a single attribute on the right hand side of all functional dependencies.\n",
    "\n",
    "If F is not a basis, we can quickly put it in basis form by decomposition.\n",
    "\n",
    "## Minimal Basis \n",
    "\n",
    "A set of functional dependencies F in basis form, is said to be minimal if there is simplification F1 of F obtained by either removing a functional dependency or an attribute from a functional dependency such that F1+ = F+\n",
    "\n",
    "## Algorithm for finding minimal basis\n",
    "\n",
    "Given a set of functional dependencies F\n",
    "\n",
    "1. Put it in basis form (decompose the right hand sides)\n",
    "2. Remove all trivial functional dependencies\n",
    "3. Remove a functional dependency X->Y such that after removing X->Y, X+ still contains Y.\n",
    "   Suppose F' = F - {X->Y},  then X+ in F' contains Y.\n",
    "4. Suppose XZ->Y in F, I can simplify this to X->Y is the closure does not change!\n",
    "   Given F,\n",
    "   F' = F-{XZ->Y} union {X->Y} (assuming XZ->Y is in F)\n",
    "   Check if X+ is the same in F and F', then this simplication is possible.\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4f406cd9-7e73-45df-a4a0-30e3bf911424",
   "metadata": {},
   "source": [
    "R(A,B,C,D,E,F,G)\n",
    "\n",
    "F={AB->C, BCE->DG, ABC-> BDE, D->AH, ABE-> BH}\n",
    "\n",
    "1. Put it in basis form\n",
    "\n",
    "F={AB->C, BCE->D, BCE->G, ABC-> B, ABC-> D, ABC-> E, D->A,  D->H, ABE-> B, ABE->H}\n",
    "\n",
    "2. Remove all trivial fds\n",
    "\n",
    "F={AB->C, BCE->D, BCE->G, ABC-> D, ABC-> E, D->A,  D->H, ABE->H}\n",
    "\n",
    "3. Remove X->Y if X->Y is implied by the rest of the functional dependencies\n",
    "\n",
    "Can I remove BCE->D?  \n",
    "F1={AB->C, BCE->G, ABC-> D, ABC-> E, D->A,  D->H, ABE->H}  \n",
    "\n",
    "BCE+ = {B,C,E,G}, no!  \n",
    "\n",
    "Can I remove ABC->D  \n",
    "F'={AB->C, BCE->D, BCE->G, ABC-> E, D->A,  D->H, ABE->H}\n",
    "\n",
    "ABC+ = {A,B,C,E,D,H}, yes!\n",
    "\n",
    "Can I remove D->H?  \n",
    "F''={AB->C, BCE->D, BCE->G, ABC-> E, D->A, ABE->H}  \n",
    "\n",
    "D+ = {D,A}, no!  \n",
    "\n",
    "Can I remove ABE->H?\n",
    "F''={AB->C, BCE->D, BCE->G, ABC-> E, D->A,  D->H}  \n",
    "\n",
    "ABE+ = {A,B,E,C,D,G,H}, yes!\n",
    "\n",
    "\n",
    "F={AB->C, BCE->D, BCE->G, ABC-> E, D->A,  D->H}  \n",
    "\n",
    "4. Can we simplify the right hand side?\n",
    "\n",
    "For example (changing ABC->E to AB->E)\n",
    "F={AB->C, BCE->D, BCE->G, ABC-> E, D->A,  D->H}    AB+ = {A,B,C,E,D,G}  \n",
    "F'={AB->C, BCE->D, BCE->G, **AB-> E**, D->A,  D->H}  AB+ = {A,B,E,C,D,G}  \n",
    "Yes!\n",
    "\n",
    "Can I remove E from BCE->D  \n",
    "F={AB->C, BCE->D, BCE->G,AB-> E, D->A,  D->H}  BC+={B,C}\n",
    "F'={AB->C, **BC->D**, BCE->G,AB-> E, D->A,  D->H}  BC+={B,C,D,A,H,E} \n",
    "No!\n",
    "\n",
    "Can I remove C from BCE->D  \n",
    "F={AB->C, BCE->D, BCE->G,AB-> E, D->A,  D->H}  BE+={B,E}\n",
    "F'={AB->C, **BE->D**, BCE->G,AB-> E, D->A,  D->H}  BE+={B,E,D,A,C,H}  \n",
    "No!\n",
    "\n",
    "No other removals!\n",
    "\n",
    "F={AB->C, BCE->D, BCE->G,AB-> E, D->A,  D->H}  **Minimal basis**\n",
    "\n",
    "Or combine fds with the same left hand side!\n",
    "\n",
    "F={AB->CE, BCE->DG, D->AH}  **Minimal cover**\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "28d688ce-a5f2-4963-8dca-f3d90aad8e90",
   "metadata": {},
   "source": [
    "R(A,B,C)  F={A->B, B->C, A->C}  \n",
    "\n",
    "R(A,B,C)  F1={B->C, A->C}, F1+ ?= F+\n",
    "\n",
    "R(A,B,C)  F2={A->B, A->C}, F2+ ?= F+ \n",
    "\n",
    "R(A,B,C)  F3={A->B, B->C}, F3+ ?= F+  ,   Yes!\n",
    "\n",
    "Removed A->C, In F3, A+ ={A,B,C}, C is in A+, then F3 implies A->C\n",
    "\n",
    "\n",
    "R(A,B,C)  F={AB->C}  \n",
    "Can I remove B?  \n",
    "\n",
    "F'={A->C}   F={AB->C}  \n",
    "A+ = {A,C}  A+ = {A}  \n",
    "\n",
    "Can I remove A?  \n",
    "F'={B->C}   F={AB->C}  \n",
    "B+ = {B,C}  B+ = {B}  \n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4b7fc12e-744e-4214-a91f-8adc58345ba4",
   "metadata": {},
   "source": [
    "## Decomposition of relations\n",
    "\n",
    "A decomposition of a relation R with functional dependencies F is given by a set\n",
    "of relations R1, R2,.., Rn such that all attributes in R1, R2,.., Rn are a subset of attributes in R\n",
    "\n",
    "If you had an instance of R(A,B,C,D,E), and you decompose into R1(A,B,C) and R2(C,D,E)\n",
    "\n",
    "Find R1 = Project_(A,B,C) (R)  \n",
    "     R2 = Project_(C,D,E) (R)  \n",
    "\n",
    "A decomposition is **lossless** if R1*R2 = R  (natural join of all decomposed relations is guaranteed to be identical to the original relation.\n",
    "\n",
    "R(A,B,C)  F={A->B}  \n",
    "\n",
    "**A   B   C**  \n",
    "a   b1  c  \n",
    "a2  b   c  \n",
    "\n",
    "R1(A,C)  \n",
    "\n",
    "**A C**  \n",
    "a c  \n",
    "a2 c  \n",
    "\n",
    "R2(B,C)\n",
    "\n",
    "**B  C**  \n",
    "b1 c  \n",
    "b  c \n",
    "\n",
    "R' = R1*R2  \n",
    "\n",
    "R'  \n",
    "**A   B   C**  \n",
    "a   b1  c  \n",
    "a   b   c  \n",
    "a2  b1  c  \n",
    "a2  b   c  \n",
    "\n",
    "Since R' is not equal to R, this decomposition is lossy!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d2db2598-93d5-492e-bfa9-c58f57bca5b0",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.13.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
