# SQL - Procedural Programming

## Overview

- Queries and expressions in SQL are ideal for large set-at-a-time
  operations:

  - Operations that need to search the whole relation to answer a
    complex combination of tuples.
  - Using SQL allows database to use the query optimizer to find the
    best query execution plan for the query.

- However, SQL is not efficient for some programming tasks:

  - Sequence of code: A block of queries that must be executed in an order
  - Conditionals: if/then/else statements (some simple ones can be
    handled, but cannot be combined with sequence of code
  - Loops
  - Store values in and use values from program variables
  - Error handling

## Example Program

- We will see an example program that is hard to write using an SQL
  query, but trivial with a procedural language.

- Find the top 3 most frequent y for each x in table R(x,y,z).

  - Not very easy!

  - However, suppose we have a query:

    ```
    SELECT x,y, count(*) num FROM R GROUP BY x,y ORDER BY x,y, num;
    ```

    Now, we can write a simple program to compute this as follows:

    ```
    Algorithm Top 3 y for each x:
       Run query:
            SELECT x,y, count(*) num
            FROM R
            GROUP BY x,y
            ORDER BY x,y, num;
       In a loop read each tuple such that:
           if a new x is found:
               read the next three y values
               (error handling if less than three y values are found)
               skip remaining x values
    ```

## Procedural SQL

- To enable the use of SQL for costly queries, while making it
  possible to write code/procedures on top of it, databases support a
  number of options.

- The options belong in two main options:

  - Server side
  - Client side

### Server side

- Languages make it possible to define procedures, functions and
  triggers
- These programs are compiled and stored within the database server
- They can also be called by queries

### Client side

- Languages allow programs to integrate querying of the database with
  a procedural language
- Coding in a host language with db hooks (C, C++, Java, Python, etc.)
  using the data structures of these languages
- Coding in frameworks with their own data models (Java, Python, etc)
  with similar db hooks as in above.

## Programming with SQL

- All programming paradigms support:

  - Methods to execute queries and update statements

    Executing any SQL statement and catching the outcome

  - Interpret errors returned if any

  - Input values from variables into queries, output query results
    into variables

  - Loop over query results (for queries returning multiple tuples)

  - Raise exceptions (which results in rollback of transactions)

  - Store and reuse queries in the shape of cursors

  - Starting and committing transactions

- Client side programs also support:

  - Opening/closing connections, allocating/releasing database
    resources for queries

- The actual syntax of the methods for doing this vary, but the
  principle is the same.

### Server side language examples

- Generally database specific

- Postgresql:

  - pl/pgsql: generic procedural language for postgresql
  - pl/pyhton: procedural language that is an extension of python
    (also see pl/tcl, pl/perl)

### Client side language examples

- Database specific extensions of host languages, for example

  - libpq: C library for postgresql, uses library calls for specific
    to postgresql
  - OCCI: Oracle library for C

  Write code in C with hooks to the database in the form of
  functions and libraries, and compile using C compiler

  - ECPG: embedded programming in SQL, based on the embedded
    programming standard with a postgresql specific pre-compiler, an
    the standard C compiler

- Standard libraries for connecting to the database

  - JDBC is a Java standard library, all databases implement the same
    functions (some non-standard data types will be different)

    To compile code with JDBC functions, the specific adapter
    (library) for a database must be used.

  - No specific standard for libraries based on C, but ODBC is a
    non-language specific standard, .NET tries to achieve the same

  - Python DB-API is a database independent framework for Python

    Similar to Java, a module for each database management system is
    needed. The postgresql module is called psycopg2.

## Frameworks

- Frameworks are based on specific design principles for developing
  database backed applications

- Examples:

  - Object-relational-mapping used by Rails, Hibernate, Django,
    WebObjects, .NET (different frameworks have different models)
  - Note that the frameworks can be built on top of other languages
    (such as Java + JDBC)


## Client-side programming

- Client-side programming languages have constructs to connect to a
  database, keep a pool of open connections and close connections, in
  additions to constructors for executing queries.

### Running and Debugging

- Depending on how tight the integration with the database, error
  checking becomes an issue.

- Two sources of syntax errors:

  - Host language syntax errors
  - SQL syntax errors: note that often these are not checked until you
    send the query to the server. So, it is difficult to debug.

- Some client-side programming paradigms are a hybrid of host language
  and additional constructs.

  - Example: Embedded SQL is an SQL standard for writing programs in a
    host language.
  - There is a precompilation phase for only the SQL code and
    variables with better error checking.
  - SQL calls are changed to calls in the host language after
    precompilation.
  - Though writing programs in this paradigm can be a bit awkward and
    it is a less popular platform as a result.

### Performance

- One must be especially careful about sending large data sets over
  the network when writing client code.

- When writing client-side code, you must balance the work that must
  be done in the server side and client side.

  - Very complex queries slow down performance.
  - Very simple queries lead to very large data sets being sent over
    the network.

- Your programs must balance (A) the cost of query execution at the
  server, (B) cost of data transfer and (C) cost of processing the
  data at the client application.

  - In most CS courses, you learn how to improve C, run time cost.
  
  - In this course, we are learning about A and B in detail, to be
    coupled with C.


## Call level interfaces for client-side programming

- Supplies the constructs for opening connections, running queries,
  looping over them etc.
- SQL is completely handled with special function calls in the host
  language
- JDBC is an industry standard for Java, supported by all databases
  using drivers.
- Other database specific examples: OCCI for C in Oracle, Libpq for C
  in Postgresql

## Python DB-API

- DB-API is a generic db interface for python (like JDBC).

- psycopg2 is a python adapter that implements DB-API.

  ```
  import psycopg2 as dbapi2

  db = dbapi2.connect (database="python", user="python", password="python")
  cur = db.cursor()
  cur.execute ("INSERT INTO versions VALUES ('2007-10-18', '2.4.4', 'stable')")
  conn.commit ()
  cur.close()
  db.close()
  ```

- Example 2:

  ```
  import psycopg2 as dbapi2

  db = dbapi2.connect (database="python", user="python", password="python")
  cur = db.cursor()

  cur.execute ("SELECT * FROM versions");
  rows = cur.fetchall()
  for i, row in enumerate(rows):
      print ("Row", i, "value = ", row)

  try:
      cur.execute ("""UPDATE versions SET status='stable' where version='2.6.0' """)
      cur.execute ("""UPDATE versions SET status='old' where version='2.4.4' """)
      db.commit()
  except Exception, e:
      db.rollback()
  ```

- Note that in all these examples, there is no support for database
  types and any SQL level error checking. Programmer has to use the
  correct methods and debug errors which requires a high degree of
  expertise! Frameworks try to make this easier on programmers by
  providing a tighter connectivity!

- What these examples do not show is the level of exception processing
  required to make these useful. 

- Let's see another example. Note that `psycopg` is the newest
  implementation of the DB-API connection module.

   ```
   import psycopg
   from psycopg import DatabaseError, OperationalError, ProgrammingError, IntegrityError

   def main():
      conn = None
      try:
          # Connect to the PostgreSQL database
          conn = psycopg.connect(
              dbname="mydatabase",
              user="myuser",
              password="mypassword",
              host="localhost",
              port="5432"
          )

          # Create a cursor
          cur = conn.cursor()

          # --- SELECT statement ---
          print("Fetching employees...")
          cur.execute("SELECT id, name, salary FROM employees ORDER BY id;")
          rows = cur.fetchall()
          for row in rows:
              print(f"ID: {row[0]}, Name: {row[1]}, Salary: {row[2]}")

          # --- UPDATE statement ---
          print("\nUpdating salary for employee with ID=1...")
          cur.execute("UPDATE employees SET salary = salary * 1.1 WHERE id = 1;")
          print(f"{cur.rowcount} row(s) updated.")

          # Commit the transaction
          conn.commit()

      except OperationalError as e:
          print("Operational error (e.g., connection issue):", e)

      except ProgrammingError as e:
          print("Programming error (e.g., SQL syntax issue):", e)

      except IntegrityError as e:
          print("Integrity error (e.g., constraint violation):", e)

      except DatabaseError as e:
          print("General database error:", e)

      except Exception as e:
          print("Unexpected error:", e)

      finally:
          # Close the cursor and connection
          if conn is not None:
              try:
                 conn.close()
                  print("\nConnection closed.")
              except Exception as e:
                  print("Error closing connection:", e)


    if __name__ == "__main__":
       main()
    ```   