{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "f513a74b",
   "metadata": {},
   "source": [
    "### Lecture 25 - Query Optimization and Concurrency in Transactions\n",
    "\n",
    "### Today's topics: \n",
    "- Query Optimization\n",
    "- Concurrency:\n",
    "  - Definition of serial and serializable schedules\n",
    "  - Checking of serializable schedules\n",
    "  - 2 Phase Locking\n",
    "\n",
    "### Announcements\n",
    "\n",
    "- Office hours \n",
    "  - **My office hours today are from 2:15-3:15PM**\n",
    "  - **I will hold office hours next week as usual: 2:30PM-5PM**\n",
    "  - Any additional office hours will be posted on Bulletin Board\n",
    "  - Please ask questions on Bulletin board\n",
    "  \n",
    "- Remaining lectures:\n",
    "  - Today: Query Optimization and  Concurrency in Transactions \n",
    "  - Thursday: Concurrency in Transactions continued and Database Tuning \n",
    "  \n",
    "- Remaining assignments:\n",
    "  - Hw6: Due on today midnight\n",
    "  - Lecture Exercise 24: due on Friday 12/12\n",
    "  - Optional Lecture Exercise 3: due on Friday 12/12\n",
    "  - Lecture Exercise 25: out today, due on Friday 12/12\n",
    "\n",
    "- We will drop the lowest 5 lecture exercises (despite the weird naming convention I used!). So two of these exercises can be treated as optional. \n",
    "\n",
    "- Use lecture exercises to study! Just because we drop some does not mean the material in them is optional. They are there for you to study. \n",
    "\n",
    "- Final Exam is on 12/18, 11:30AM-2:30PM on DCC 308\n",
    "  - If you need extra time, expect to take the exam between 11:30AM-4PM\n",
    "  - Notify me of any conflicts if you have not done so "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "574cd2c9",
   "metadata": {},
   "source": [
    "### Query optimization\n",
    "\n",
    "- Parse the query and simplify conditions if possible\n",
    "- Consider different query trees\n",
    "  - Join ordering \n",
    "    - ( R join S vs S join R), \n",
    "    - (R join S) join T vs (R join T) join S\n",
    "  - Consider pushing selections down joins \n",
    "    - Potentially reduce size/cost of joins and allow for index scans\n",
    "    -  select_C (R join S) = select_C(R) join select_C(S)\n",
    "  - Sorting may be beneficial (so consider sorting even if in the intermediate steps it appears too costly)\n",
    "  \n",
    "- Query optimization works as a search algorithm:\n",
    "  - Consider all possible 2- way joins (and selections/sorts applicable)\n",
    "  - Add a third relation, and eliminate any 2-way joins that are too costly (compared to a 3-way join)\n",
    "  - Continue until the whole query is implemented"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "25d69776",
   "metadata": {},
   "source": [
    "![Query Optimization 1](lecture25_files/page1.png)\n",
    "![Join Ordering](lecture25_files/page2.png)\n",
    "![Pushing Selections Down](lecture25_files/page3.png)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4482ab44",
   "metadata": {},
   "source": [
    "### Transactions (xact) - Concurrency - Isolation\n",
    "\n",
    "\n",
    "\n",
    "- A transaction has an beginning and an end point\n",
    "\n",
    "```\n",
    "BEGIN TRANSACTION ;\n",
    "select ...\n",
    "insert ...\n",
    "delete ...\n",
    "COMMIT ;\n",
    "\n",
    "```\n",
    "\n",
    "- Atomicity of transactions:\n",
    "  - Either all the transaction succeeds and all the changes it made are final, or the transaction has no effect in the database.\n",
    "  - This includes all the impacts of the transaction that becomes a part of the transaction: e.g. foreign key cascade/set null, triggers.\n",
    "  \n",
    "  \n",
    "- Isolation of transactions:\n",
    "  - We write xacts assuming it is the only one executing or even if other xacts are executing, it will not cause a problem.\n",
    "  \n",
    "- There is no problem if only one transaction executes in the database.\n",
    "  - So if the xacts executed serially!, then the result would be fine.\n",
    "  \n",
    "- A parallel execution of xacts if also fine! (no problem), if the result is the same as one of a serial order of xacts.\n",
    "  - This is called a serializable execution.\n",
    "  \n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d1e4a984",
   "metadata": {},
   "source": [
    "**Let us see two different transactions and the result of their\n",
    "potential serial executions.**\n",
    "\n",
    "![Transaction model 1](lecture25_files/page4.png)\n",
    "![Transaction model 2](lecture25_files/page5.png)\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "36a8f08f",
   "metadata": {},
   "source": [
    "Now let us see a different execution which does not produce the same results as any serial execution.\n",
    "\n",
    "![Transaction model 1](lecture25_files/page6.png)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "39f71c03",
   "metadata": {},
   "source": [
    "**Serializable Transactions**\n",
    "\n",
    "- Simplify transactions into read/write operations\n",
    "\n",
    "  - T1: r1(x) r1(y) w1(x) w1(y) c\n",
    "  - T2: r2(x) w2(x) c\n",
    "  \n",
    "- Simplify the execution of operations into schedules (ordering of the transaction operations):\n",
    "  - s1: r1(x) r1(y) w1(x) w1(y) r2(x) w2(x) \n",
    "  - s2: r2(x) w2(x) r1(x) r1(y) w1(x) w1(y) \n",
    "  - s3: r1(x) r1(y) r2(x) w2(x) w1(x) w1(y) \n",
    "  - s4: r1(x) r1(y) w1(x) r2(x) w1(y) w2(x) \n",
    "\n",
    "- A schedule is serializable if there exists a serial order of transactions S such that all reads are guaranteed to have the same value as serial order S and if all writes occur in the same order as the serial order S.\n",
    "\n",
    "- A conflict in a schedule is two operations by two different transactions on the same item and one of them is a write operation.\n",
    "- Conflicts:\n",
    "  - r1(x) .... w2(x)\n",
    "  - w1(x) .... r2(x)\n",
    "  - w1(x) .... w2(x)\n",
    "  \n",
    "- If the ordering of a conflicting operation is changed, the final result of the transaction may change.  \n",
    "  - r1(x) .... w2(x)    vs..   w2(x) ...  r1(x)\n",
    "  - w1(x) .... w2(x)     vs.   w2(x) ...  w1(x)\n",
    "  \n",
    "- A schedule is serializable if there exists a serial order of transactions S such that all conflicting operations in the schedule occur in the same order as a serial schedule.\n",
    "\n",
    "  - s1: r1(x) r1(y) w1(x) w1(y) r2(x) w2(x) \n",
    "  - s2: r2(x) w2(x) r1(x) r1(y) w1(x) w1(y) \n",
    "\n",
    "\n",
    "\n",
    "- Given s4: r1(x) r1(y) w1(x) r2(x) w1(y) w2(x) \n",
    "list all conflicts:\n",
    "\n",
    "r1(x) w2(x)\n",
    "w1(x) r2(x)\n",
    "w1(x) w2(x)\n",
    "\n",
    "- Given s1: r1(x) r1(y) w1(x) w1(y) r2(x) w2(x) \n",
    "list all conflicts:\n",
    "\n",
    "r1(x) w2(x)\n",
    "w1(x) r2(x)\n",
    "w1(x) w2(x)\n",
    "\n",
    "- All conflicts in s4 occur in the same order as s1. s1 is a serial schedule (T1 T2). Hence, s4 is serializable. \n",
    "\n",
    "- Given s3: r1(x) r1(y) r2(x) w2(x) w1(x) w1(y) \n",
    "list all conflicts:  \n",
    "\n",
    "\n",
    "r1(x) w2(x)\n",
    "r2(x) w1(x)\n",
    "w2(x) w1(x)\n",
    "\n",
    "- There is no serial order that has this ordering!\n",
    "  - T1 T2 (r2(x) w1(x), w2(x) w1(x) not possible)\n",
    "  - T2 T1 (r1(x) w2(x) not possible)\n",
    "  \n",
    "  \n",
    "**Conflict Graph**\n",
    "\n",
    "- Each transaction is a node\n",
    "- For every conflict  T1...T2 (such as r1(x) w2(x) or w1(x) w2(x)), there is an edge from T1 to T2\n",
    "\n",
    "- If the graph has a cycle, then the given schedule is not serializable\n",
    "- If the graph has no cycles, then the given schedule is serializable and we can find the corresponding serial schedule using topological search on the conflict graph\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d132612a",
   "metadata": {},
   "source": [
    "![Conflict Graph](lecture25_files/page7.png)\n",
    "![Conflict Graph](lecture25_files/page8.png)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "68c30c92",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.13.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
