Basic SQL Interview Questions

Table of Contents

11. What is a cursor?

When we want to retrieve certain records, we use SELECT statements. They retrieve all the matching records for the filter conditions in WHERE clause. But when this SELECT statement used in the PL/SQL block is restricted to return only one row. This is not always welcome in a program. There will be situations where we need to select multiple rows by using SELECT query. In order do this; we use cursors where we write a SELECT query to retrieve multiple rows. These cursors are the named SELECT query and the result of which will be traversed one by one in the program using loops.

Therefore, a cursor is a named SELECT statement or query which can be used anywhere in the PL/SQL program. It will be declared and defined at the beginning, at declaration section.

12. How do you find duplicate records in a table?

One of the methods to find the duplicate records in a table is by using GROUP BY clause. Suppose we need to find number of jobs that an employee has performed during his carrier. We will get different jobs that an employee has taken during the past in JOB_HISTORY table. We need to find number of jobs taken by the employee. Hence we need to group on EMPLOYEE_ID and find number of times the EMPLOYEE_ID present in this table. This will give the number of times an employee has changed his job.

SELECT EMPLOYEE_ID, COUNT (1) AS NUMBER_OF_JOBS
 FROM JOB_HISTORY
GROUP BY EMPLOYEE_ID
HAVING COUNT (1)>1;

We can write simple SELECT statement to get these duplicate values – without using GROUP BY Clause. Suppose employees who change their do not take up same job over the time. Imagine they are getting always promotion to new job than getting demotion to their previous jobs. In such case, we can use self-join on JOB_HISTORY to compare with the EMPLOYEE_ID which is same in both these tables but their JOB_IDs are different.

SELECT DISTINCT jh.EMPLOYEE_ID 
FROM JOB_HISTORY jh, JOB_HISTORY jh_dup
WHERE jh_dup.EMPLOYEE_ID = jh.EMPLOYEE_ID
AND jh_dup.JOB_ID <> jh.JOB_ID;

Here if we do not use DISTINCT, then it will show EMPLOYEE_IDs as many times as they have changed their job.

13. How do you remove duplicate records in a table?

There are multiple ways to delete duplicate records based on the record and users assumptions. Below is one of the methods to delete the duplicate records from JOB_HISTORY table, using START_DATE of the Employee’s job. That means we are going to delete the duplicate record with newer jobs.

DELETE FROM JOB_HISTORY jh
WHERE EXISTS (SELECT 1 FROM JOB_HISTORY jh_dup
      WHERE jh_dup.EMPLOYEE_ID = jh.EMPLOYEE_ID
        AND jh_dup.JOB_ID <> jh.JOB_ID
        AND jh_dup.START_DATE < jh.START_DATE);

Suppose we need to delete the older job record from the JOB_HISTORY table. Then we need to change the condition on START_DATE of jh_dup as greater than.

DELETE FROM JOB_HISTORY jh
WHERE EXISTS (SELECT 1 FROM JOB_HISTORY jh_dup
      WHERE jh_dup.EMPLOYEE_ID = jh.EMPLOYEE_ID
        AND jh_dup.JOB_ID <> jh.JOB_ID 
        AND jh_dup.START_DATE > jh.START_DATE);

Above duplicate records are not actually duplicate records. They are details of the same employee based on their different jobs. Hence we can delete them using START_DATE. But imagine that we have real duplicate records in a table, say JOBS.

Here again we have many ways to delete the duplicate records.

Using ROWID : One of the simplest ways of deleting the duplicate records using ROWID is by using sub query. Let JOB_ID be the unique key column using which we can identify duplicate records (in the case where unique key is not present in the table, then we might have to use all the columns of the table in the GROUP BY Clause below). That means, if we query JOBS based on JOB_ID, then we will get all of its duplicate records. If we are able to select the MIN or MAX of ROWID for each of the duplicate records and delete them, then we have deleted the duplicate records of the table. i.e.;

DELETE FROM JOBS
WHERE ROWID NOT IN (SELECT MIN (ROWID) FROM JOBS
		   GROUP BY JOB_ID);

Above query is finding the minimum of ROWID for each of the duplicate records based on its JOB_ID and deletes all other ROWIDs of the record from the table. This delete is simple one.

Using Self-Join : This method is also same as above method, but we do not use GROUP BY clause to get the duplicates. Instead, here we join with the same table and exclude MAX or MIN value from deleting.

DELETE FROM JOBS j1
	WHERE ROWID NOT IN (SELECT MIN (ROWID) FROM JOBS j2
				WHERE j1.JOB_ID = j2.JOB_ID);

Here while self-joining, we have used j1.JOB_ID = j2.JOB_ID, which will get us the all the duplicate records for the same JOB_ID and then find the minimum ROWID out of it. The outer DELETE statement will delete all of its duplicate records excluding minimum value.

Using ROW_NUMBER () : ROW_NUMBER is the function which returns the integer number to each row based on the column on which we are grouping the duplicates. In our example here, JOB_ID is the key on which we have to group the data first- in ROW_NUMBER terms partitioning the records. Once we group based on JOB_ID and number each duplicate records, we will delete all the records except the first record.

Below table shows how row number is calculated in the case of JOB_HISTORY table for EMPLOYEE_ID.

Hence, our delete statement for duplicate records would be:

DELETE FROM JOBS
WHERE ROWID IN 
     (SELECT RWID FROM
         (SELECT ROWID RWID,
	      ROW_NUMBER () OVER (PARTITION BY JOB_ID ORDER BY JOB_ID) RW_NUM
	   FROM JOBS)
	WHERE RW_NUM>1); -- Retains first record and sends all other duplicates for delete

Using DENSE_RANK () : This is similar to ROW_NUMBER. The only difference here is while sorting the data. Here we sort it based on the ROWID instead of JOB_ID.

DELETE FROM JOBS
	WHERE ROWID IN 
		(SELECT RWID FROM
		       (SELECT ROWID RWID,
			DENSE_RANK () OVER (PARTITION BY JOB_ID ORDER BY ROWID) RNK
		         FROM JOBS)
		WHERE RNK >1); -- Retains first record and sends all other duplicates for delete

While deleting duplicate records, extra caution is to be taken to:

1. First identify the duplicates using select.
2. Double verify if they are actual ‘duplicates’ or not
3. Take backup of records if necessary
4. Apply COMMIT only if you are sure that it is correct.

14. What is a Procedure? Write one simple procedure?

A procedure is a named PL/SQL block which executes one or more related task. Standard syntax for a procedure is as below:

CREATE [OR REPLACE] PROCEDURE procedure_name
    [ (parameter [,parameter]) ]
IS
    [declaration_section]
BEGIN
    executable_section
[EXCEPTION
    exception_section]
END [procedure_name];

Where Parameter is a one or more values input for the procedure, for which we need to perform the task or use them to perform the task. In our above example, STUDENT_ID is the input parameter. For the ID input to procedure, the marks and totals are displayed. Parameters being entered are of 3 types – IN OUT and INOUT
IN Parameters values cannot be modified by the procedure/function. It is like a read only value input to procedure/function.
OUT parameter act as a result parameter. It will not have any input value, but the result of the procedure/function will be passed back to the calling program.
INOUT parameters will input some value to the procedure/function and the result of the tasks within will be sent back to the calling program using the same parameter.
Note: If parameter type is not mentioned explicitly, it will take IN as default parameter type.
Declaration_Section will have set of variables declared which are local to the procedure and are used to perform the various tasks.
Executable_Section is the actual section of the procedure where the one or more steps of tasks are performed to meet the goal. In our example above, in this section, we would fetch the marks of student, calculate his total and display the result in a report.
Exception_Section is the error handling section of the procedure. Suppose there is an error while calculating the tasks, say there was no data found for the entered Student, then no need to perform rest of the actions. It will throw errors and procedure will fail. But user will not be happy to see that there is some error on the page. If proper error message is displayed, it will please them. Hence, when there is a pre-defined error situation, we capture them and proper message/alternative set of action will be defined.

Let us put our example in a procedure.

CREATE OR REPLACE Procedure sp_getReport (STD_ID IN NUMBER) 
IS
-- Declaration Section
   ln_Subject1 NUMBER;
   ln_Subject2 NUMBER;
   ln_Subject3 NUMBER;
   ln_Subject4 NUMBER;
   ln_Subject5 NUMBER;
   ln_total NUMBER;   
   lv_Std_name varchar2 (50);
   
-- Executable Section
BEGIN
-- Retrieve the marks of a student 
    SELECT s.STUDENT_NAEM, m.subject1, m.subject2, 
	m.subject3, m.subject4, m.subject5
    INTO lv_Std_name,
    	 ln_Subject1, 
    	 ln_Subject2, 
    	 ln_Subject3, 
    	 ln_Subject4, 
    	 ln_Subject5
    FROM MARKS m, STUDENT s
    WHERE m.STUDENT_ID = s.STUDENT_ID
    AND m.STUDENT_ID = STD_ID; -- If no records are found by this query, it will jump to exception block below
    
-- Calculate the Total Marks
    ln_total:= ln_Subject1 + ln_Subject2 + ln_Subject3 + ln_Subject4 + ln_Subject5;
    
-- Display the Report
    dbms_output.put_line ('Student Report');
    dbms_output.put_line ('-----------------------');
    dbms_output.put_line ('Student name:'|| lv_std_name);
    dbms_output.put_line ('Student Id:'|| STD_ID);
    dbms_output.put_line ('SUBJECT1   SUBJECT2 SUBJECT3 SUBJECT4 SUBJECT5   TOTAL');
    dbms_output.put_line (ln_Subject1 || ''||ln_Subject2 || '   '||ln_Subject3 || '   '||ln_Subject4 || '   '||ln_Subject5 || '   '||ln_total);

-- Exception Section
EXCEPTION
  WHEN NO_DATA_FOUND THEN
     dbms_output.put_line ('Student Report');
     dbms_output.put_line ('-----------------------');
     dbms_output.put_line ('No Records are found!');
END sp_getReport; -- End of the Procedure

Procedure can be executed in 2 ways:

From the SQL prompt

EXECUTE [or EXEC] procedure_name;

It can be called from an anonymous pl/sql block or any other procedure.

procedure_name;

If we need to remove the procedure from the database, we fire DROP command.

DROP PROCEDURE procedure_name;

e.g.;

DROP PROCEDURE sp_getReport;

15. What is the difference between Procedure and Function?

Function is similar to procedure but it will return the result to the calling program. It is different from procedure that it will always return some value to the calling program; whereas procedure may or may not return value.

CREATE [OR REPLACE] FUNCTION function_name [parameters]
RETURN return_datatype;  -- Datatype of result
IS 
Declaration_section 
BEGIN 
Execution_section
Return return_variable; 
EXCEPTION 
Exception section 
Return return_variable; 
END;

Suppose we want get the first name and last name of employee.

CREATE OR REPLACE FUNCTION fn_emp_name (EMP_ID NUMBER)
   RETURN VARCHAR (20);
IS 
   lv_emp_name VARCHAR (20); 
BEGIN 
	SELECT e.EMP_FIRST_NAME || ' '||e.EMP_LAST_NAME 
		INTO lv_emp_name
	FROM EMPLOYEE e WHERE e.EMPLOYEE_ID = EMP_ID;
	
	RETURN lv_emp_name; -- Returns the employee name to calling program
 END;
 /

How do we execute these functions? We have 3 methods to execute this function.

We can directly assign it to a variable, if there is only one return value.

lv_emp_name:= fn_emp_name (103);

We can call it in a select statement directly

SELECT fn_emp_name (100) FROM DUAL;

We can call it inside the display function as below

dbms_output.put_line (fn_emp_name (100));

Functions can be removed from the database just like a procedure.

DROP FUNCTION function_name;