{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "cc737301-0f1a-4c4a-9873-a16a205cd1f5",
   "metadata": {},
   "source": [
    "## Announcements\n",
    "\n",
    "- Hw#1 due today at midnight\n",
    "- Lecture exercise #7 out at 2pm today, due midnight on saturday\n",
    "- Hw#2 to be handed out tomorrow, due next thursday at midnight\n",
    "- Exam #1 on Monday September 29 and 12pm\n",
    "  - Please send all accommodations by next week monday midnight\n",
    "\n",
    "- Exam #2 to be moved from November 6 Thursday to November 10 Monday (during class)\n",
    "   "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0aa894b8-4cb8-4095-8160-a350109dadf6",
   "metadata": {},
   "source": [
    "## Decomposition of relations\n",
    "\n",
    "A decomposition of a relation R with functional dependencies F is given by a set\n",
    "of relations R1, R2,.., Rn such that all attributes in R1, R2,.., Rn are a subset of attributes in R\n",
    "\n",
    "If you had an instance of R(A,B,C,D,E), and you decompose into R1(A,B,C) and R2(C,D,E)\n",
    "\n",
    "Find R1 = Project_(A,B,C) (R)  \n",
    "     R2 = Project_(C,D,E) (R)  \n",
    "\n",
    "A decomposition is **lossless** if R1*R2 = R  (natural join of all decomposed relations is guaranteed to be identical to the original relation.\n",
    "\n",
    "## CHASE Algorithm for checking lossless decompositions\n",
    "\n",
    "Given the decomposed relations, is it possible to reconstruct the original relation?\n",
    "\n",
    "- For each decomposed relation, assume a tuple in the \"canonical database\" where the missing\n",
    "  attributes are variables (variables: have subscripts, known values: constant, no subscript)\n",
    "- Apply functional dependencies X->Y to fill in missing values\n",
    "  - If two tuples have the same value for X, if Y is known in one tuple, make the other value also known (otherwise set the two to the same variable)\n",
    "- Continue until\n",
    "  1. There is a tuple with no subscripts, all variavles are known. Then this is a lossless decomposition.\n",
    "  2. No more changes are possible and there is no tuple without a subscript, then this is a lossy decomposition and the resulting relation is a counter example of why it is lossy!\n",
    "\n",
    "R(A,B,C,D,E,F)   F={D->A, AD->E, AE->F, BC->A}\n",
    "R1(A,B,C)  \n",
    "R2(B,C,D)  \n",
    "R3(D,E,F)\n",
    "\n",
    "Is it lossless?   \n",
    "\n",
    "\n",
    "A  B  C  D  E  F  \n",
    "a  b  c  d1 e1 f1  #for R1  \n",
    "a2 b  c  d  e2 f2  #for R2  \n",
    "a3 b3 c3 d  e  f   #for R3 \n",
    "\n",
    "Given D->A  \n",
    "\n",
    "A  B  C  D  E  F  \n",
    "a  b  c  d1 e1 f1  #for R1  \n",
    "a2 b  c  d  e2 f2  #for R2  \n",
    "a2 b3 c3 d  e  f   #for R3 \n",
    "\n",
    "Given AD->E\n",
    "\n",
    "A  B  C  D  E  F  \n",
    "a  b  c  d1 e1 f1  #for R1  \n",
    "a2 b  c  d  e f2  #for R2  \n",
    "a2 b3 c3 d  e  f   #for R3 \n",
    "\n",
    "Given AE->F\n",
    "\n",
    "A  B  C  D  E  F  \n",
    "a  b  c  d1 e1 f1  #for R1  \n",
    "a2 b  c  d  e  f  #for R2  \n",
    "a2 b3 c3 d  e  f   #for R3 \n",
    "\n",
    "Given BC->A\n",
    "\n",
    "A  B  C  D  E  F  \n",
    "a  b  c  d1 e1 f1  #for R1  \n",
    "a  b  c  d  e  f  #for R2  \n",
    "a2 b3 c3 d  e  f   #for R3 \n",
    "\n",
    "\n",
    "R(A,B,C)  F={A->B}  \n",
    "\n",
    "R1(A,C)  \n",
    "R2(B,C)  \n",
    "\n",
    "A  B  C  \n",
    "a  b1 c\n",
    "a2 b  c\n",
    "\n",
    "No further changes, this decomposition is lossy!\n",
    "\n",
    "R1(A,C)  \n",
    "\n",
    "**A C**  \n",
    "a c  \n",
    "a2 c  \n",
    "\n",
    "R2(B,C)\n",
    "\n",
    "**B  C**  \n",
    "b1 c  \n",
    "b  c \n",
    "\n",
    "R' = R1*R2  \n",
    "\n",
    "R'  \n",
    "**A   B   C**  \n",
    "a   b1  c  \n",
    "a   b   c  \n",
    "a2  b1  c  \n",
    "a2  b   c  \n",
    "\n",
    "Since R' is not equal to R, this decomposition is lossy!\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "19bdfb66-d840-4337-84bd-f0ad4b26e189",
   "metadata": {},
   "source": [
    "## Dependency preserving decompositions\n",
    "\n",
    "R(A,B,C) F={A->B}\n",
    "\n",
    "R1(A,C) F1={}  \n",
    "R2(B,C) F2={}  \n",
    "\n",
    "\n",
    "R(A,B,C,D) F={A->B,B->C,C->D}\n",
    "\n",
    "R1(A,C) F1={A->C}  \n",
    "R2(B,D) F2={B->D}  \n",
    "R3(B,C) F3={B->C}  \n",
    "\n",
    "F1 union F2 union F3 ?equivalent= F\n",
    "F'= {A->C,B->D,B->C}   \n",
    "\n",
    "Is $F'\\equiv F$?\n",
    "\n",
    "We already know, everything in F' is in F+.\n",
    "\n",
    "Is everything in F implied by F'?\n",
    "\n",
    "A->B, A+ in F' = {A,C} so no, B is not in A+\n",
    "B->C, in F'  \n",
    "C->D, C+ in F'= {C} so no, D is not in C+\n",
    "\n",
    "Hence, $F'\\not\\equiv F$? and This is **NOT** a dependency preserving decomposition\n",
    "\n",
    "\n",
    "R(A,B,C,D) F={A->B,B->C,C->D}  \n",
    "R4(A,B) F5={A->B}  \n",
    "R5(B,C) F6={B->C}  \n",
    "R6(C,D) F7={C->D}  \n",
    "\n",
    "F5 UNION F6 UNION F7 = F, hence\n",
    "**This is a dependency preserving decomposition**\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4109a078-bed7-4db7-a647-35fa5e650e8e",
   "metadata": {},
   "source": [
    "## Projecting functional dependencies to a decomposition\n",
    "\n",
    "Given a relation R and functional dependency F, the projection of F onto a decomposed relation R1 is\n",
    "the set F1 of all functional dependendices in F+ that only include attributes in R1.\n",
    "\n",
    "R(A,B,C,D) F={A->B,B->C,C->D}\n",
    "\n",
    "R1(A,B,D) \n",
    "A->ABD  \n",
    "B->BD  \n",
    "D->D  \n",
    "AB->ABD  \n",
    "AD->ABD  \n",
    "BD->BD  \n",
    "\n",
    "F1={A->B, B->D}  #projection of F into R1\n",
    "\n",
    "\n",
    "\n",
    "## Dependency preserving decompositions\n",
    "\n",
    "Suppose you are given a relation R and functional dependency F, a decomposition of R into R1, ...Rn where F1,...Fn are the projection of F onto R1,...,Rn, then this decomposition is **dependency preserving** if \n",
    "\n",
    "(F1 union F2 union ... union Fn)+ = F+.\n",
    "\n",
    "**Decompositions should always be lossless. It is desirable (but not necessary) that they are also dependency preserving.**"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a46563d5-2f7b-41d3-a126-115dfe719bf7",
   "metadata": {},
   "source": [
    "## 3NF Decomposition\n",
    "Given a relation R and a set of functional dependencies F **in minimal form**, the 3NF decomposition is computed as follows:\n",
    "\n",
    "- For each fd X->Y, create a new relation with attributes (X union Y)\n",
    "- If there is no relation with all the attributes in one of the keys, then create one relation.\n",
    "- If one resulting relation R1 has all the attributes in R2 (and more), then remove R2.\n",
    "\n",
    "**3NF Decomposition results in relations in 3NF and is lossless and functional dependency preserving.**\n",
    "\n",
    "MusicGroups(Artist, DoB, Group, DateFormed, DJoined, DLeft, Genre)\n",
    "Artist -> DoB  \n",
    "Group -> DateFormed  \n",
    "Artist Group -> DJoined DLeft  \n",
    "\n",
    "Key: Artist, Group, Genre  \n",
    "\n",
    "3NF Decomposition:\n",
    "\n",
    "R1(Artist, DoB)   F={Artist->DoB}   \n",
    "R2(Group, DateFormed)  F2={Group -> DateFormed}  \n",
    "R3(Artist, Group, DJoined, DLeft) F3 = {Artist Group -> DJoined DLeft }  \n",
    "R4(Artist, Group, Genre)  F={}  \n",
    "\n",
    "\n",
    "R(A,B,C,D,E)     F={AB->C, C->A, CD->E}    Keys: ABD, BCD\n",
    "\n",
    "R1(A,B,C)  {AB->C, C->A}  \n",
    "R2(A,C)    {C->A}  <-- subset of R1, get rid of it!\n",
    "R3(C,D,E)  {CD->E}    \n",
    "R4(B,C,D)  {}\n",
    "\n",
    "\n",
    "R1(A,B,C)  {AB->C, C->A}   Key: AB, BC,  not in BCNF  \n",
    "R3(C,D,E)  {CD->E}     Key: CD, in BCNF  \n",
    "R4(B,C,D)  {}   Key: B,C,D  , in BCNF  \n",
    "\n",
    "\n",
    "R(A,B,C,D)  {A->B, A->C, A->D}\n",
    "\n",
    "(A,B)  \n",
    "(A,C)  \n",
    "(A,D)  \n",
    "\n",
    "R(A,B,C,D)  {A->BCD}  !in 3NF\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "15478f00-016a-49d5-bcb1-ad814b4595a3",
   "metadata": {},
   "source": [
    "## BCNF decomposition\n",
    "\n",
    "Suppose a relation R and set F is not in BCNF, find X->Y in F that violates BCNF! Create two relations:\n",
    "1. Take out X->Y by constructing a new relation with attributes X+\n",
    "2. Create a new relation containing all attributes in R except for (X+ - X)\n",
    "\n",
    "Check if the resulting relations are in BCNF, if not apply BCNF decomposition **recursively** to each relation that is not in BCNF form.\n",
    "\n",
    "**BCNF Decomposition is always lossless but not necessarily dependency preserving.**\n",
    "\n",
    "R(A,B,C,D) F={A->B,B->C,C->D}  Key: A\n",
    "\n",
    "B->C violates BCNF, take it out!\n",
    "\n",
    "B+ = {B,C,D}\n",
    "\n",
    "R1(B,C,D)  F1={B->C, C->D}  \n",
    "   Key: B  not in BCNF (C is not a superkey,C->D not in BCNF)\n",
    " \n",
    "R2(A,B)    F2={A->B}  (all attributes of R except B+-B={C,D})  \n",
    "   Key: A, in BCNF  \n",
    "\n",
    "\n",
    "Decompose R1 using C->D\n",
    "\n",
    "R3(C,D)  {C->D}, key: C in BCNF\n",
    "R4(B,C}  {B->C}, key: B in BCNF\n",
    "\n",
    "Done!\n",
    "\n",
    "(A,B)  \n",
    "(B,C)  \n",
    "(C,D)  \n",
    "\n",
    "What if we did not use the closure rule!\n",
    "\n",
    "R(A,B,C,D) F={A->B,B->C,C->D}  Key: A  \n",
    "Took B->C  (not using closure)  \n",
    "\n",
    "(B,C)   {B->C} in BCNF\n",
    "(A,B,D)  {A->B, B->D} not in BCNF B->D\n",
    "Take out B->D\n",
    "(A,B)   {A->B}  \n",
    "(B,D)   {B->D}  \n",
    "\n",
    "Final result:\n",
    "(B,C)   {B->C}\n",
    "(A,B)   {A->B}  \n",
    "(B,D)   {B->D}  \n",
    "\n",
    "Not dependency preserving!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "02419f3f-c812-46a2-940c-b567da96fade",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.13.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
