SQL Project Planning

  • + 1 comment

    Steps:

    1. Find the previous end date of a task using LAG(end_date) OVER(ORDER BY end_date) window function
    2. Assigning values to individual rows to form groups(new column name = group_check) 1. Assign 1 -> if the start_date and prev_end does not match or prev_end is NULL. This means it is the start of a new project 2. Assign 0 -> if the start_date and prev_end matches. This means it is a a task of the same project
    3. Use the cummulative_sum SUM(group_check) OVER(ORDER BY end_date) on the group_check_column 3.1 Intuition - As long as cummulative sum stays same, it indicates that tasks are part of same project, else different projects
    4. SELECT MIN(start_date) and MAX(end_date) from each group order by the cummulative sum value(which is essentially group_id)
    5. ORDER BY DATEDIFF(MAX(end_date) - MIN(end_date)) => essentially means ordering by no of days
    with cte AS
    (
        SELECT Task_ID, Start_Date, End_Date, LAG(end_date) OVER(ORDER BY end_date) as              prev_end
        FROM 
        projects
    ),
    groupTasks AS
    (
        SELECT Start_Date, End_Date,
        (CASE WHEN Start_Date > prev_end OR prev_end IS NULL THEN 1 ELSE 0 END) as                  group_check
        FROM
        cte
    )
    SELECT MIN(A.start_date) as start_date, MAX(end_date) AS end_date
    FROM
    (SELECT *,SUM(group_check) OVER(ORDER BY end_date) as cummulative_sum 
    FROM 
    groupTasks) as A
    GROUP BY A.cummulative_sum
    ORDER BY DATEDIFF(MAX(A.end_date),MIN(A.start_date)) ASC;