MEDIUM
DATA STRUCTURES AND ALGORITHMS
Written By
Adam Bhula

Employee Hierarchy Introduction

The Employee Hierarchy problem (also known as the Employee Importance problem) involves writing a function to calculate the score for a given employee, given an array of employee IDs including who they report to. It's important to watch out for circular reporting structures as well as consider how we can achieve better runtime through memoization when looking at real-world applications of the problem.

Employee Hierarchy Problem

Given an array of employee IDs including who they report to, write a function to calculate the score for a given employee. The employee score for an employee equals "Total number of direct and indirect employees report to that employee, then plus 1."

The “plus one" here means adding the employee itself as self-reporting.

An employee without any other employees reporting to it, will have employee score 1.

Each employee has a unique eid (employee_id). Given a direct report map, where key is an eid, value is an array of eids who directly report to key. This map should contain all employees. The map could contain cycles.

Here is an example of direct report map: {123: [234, 345], 234: [456, 789], 345:[], 456:[], 789:[]}

Your solution should have better runtime when called multiple times.

Employee Hierarchy Solutions

To calculate the employee score, we have a function called calculateEmployeeScore. This function takes an employee ID, a map of direct reports, and a memoization dictionary to store the calculated scores. It starts by checking if the employee score for the given ID is already memoized. If it is, we simply return the memoized score. If the employee has no direct reports, their score is 1 (including themselves). Otherwise, we recursively calculate the scores for each direct report, sum them up, and add 1 for the employee themselves. This way, we consider both direct and indirect reports in the score calculation.

To ensure the direct reports map is valid, we have a function called validateDirectReports. It checks three conditions:

  1. all employees are listed in the direct reports map
  2. no employee reports to themselves
  3. there are no circular reporting hierarchies where employees form a loop of reporting relationships.

To detect circular dependencies, we use a helper function called hasCycle. It performs a depth-first search (DFS) traversal of the direct reports graph, keeping track of visited nodes and maintaining a stack of the current path. If a visited node is encountered again in the current path, it indicates the presence of a cycle. By using these functions together, we can accurately calculate the employee scores while ensuring the direct reports map is valid and free of circular dependencies.

def calculateEmployeeScore(eid, directReports, memo):
    # Check if the employee score is already memoized
    if eid in memo:
        return memo[eid]
    
    # Check if the employee has any direct reports
    if eid not in directReports:
        # No direct reports, employee score is 1 (including self)
        memo[eid] = 1
        return 1
    
    # Calculate the employee score recursively for each direct report
    score = 0
    for report in directReports[eid]:
        score += calculateEmployeeScore(report, directReports, memo)
    
    # Add 1 to the score to include the employee itself
    score += 1
    
    # Memoize the calculated score
    memo[eid] = score
    
    return score


def validateDirectReports(directReports):
    # Validate condition 1: All employees are in the direct reports map
    allEmployees = set(directReports.keys())
    for reports in directReports.values():
        allEmployees.update(reports)
    if len(allEmployees) != len(directReports):
        return False
    
    # Validate condition 2: No employee reports to themselves
    for eid, reports in directReports.items():
        if eid in reports:
            return False
    
    # Validate condition 3: No circular reporting hierarchy
    visited = set()
    for eid in directReports:
        if eid not in visited:
            if hasCycle(eid, directReports, visited, set()):
                return False
    
    return True


def hasCycle(eid, directReports, visited, stack):
    visited.add(eid)
    stack.add(eid)
    
    for report in directReports.get(eid, []):
        if report not in visited:
            if hasCycle(report, directReports, visited, stack):
                return True
        elif report in stack:
            return True
    
    stack.remove(eid)
    return False


# Example usage
directReports = {
    123: [234, 345],
    234: [456, 789],
    345: [],
    456: [],
    789: []
}

# Validate the direct reports map
if validateDirectReports(directReports):
    # Calculate and print the employee scores
    memo = {}
    for eid in directReports:
        score = calculateEmployeeScore(eid, directReports, memo)
        print(f"Employee {eid}: Score = {score}")
else:
    print("Invalid direct reports map.")

Time/Space Complexity Analysis

  • Time Complexity:

First time: O(n)

The time complexity of the first-time run is dependent on the depth of the reporting hierarchy for the given employee ID. In the worst case, if there are n employees reporting to the given employee ID, the time complexity can be considered O(n), as each direct report needs to be traversed. Additionally, if there are cycles in the reporting hierarchy, the time complexity can be higher.

Second time: O(1)

The subsequent runs for the same employee ID will have a time complexity of O(1) since the employee score has been memoized in the memo map. The score can be directly retrieved from the memoized values.

  • Space Complexity:

First time: O(n)

The space complexity of the first-time run is determined by the depth of the recursion stack. In the worst case, if the reporting hierarchy is a straight line without cycles, the space complexity can be considered O(n), as the recursion stack will have n frames.

Second time: O(n)

The space complexity remains the same as the first-time run since the memo map is used to store the calculated scores for all employee IDs. The space complexity is still O(n) in the worst case.

It's important to note that the space complexity mentioned here assume that the direct reports map is valid, without any circular reporting hierarchy or missing employee IDs.

About interviewing.io

interviewing.io is a mock interview practice platform. We've hosted over 100K mock interviews, conducted by senior engineers from FAANG & other top companies. We've drawn on data from these interviews to bring you the best interview prep resource on the web.

We know exactly what to do and say to get the company, title, and salary you want.

Interview prep and job hunting are chaos and pain. We can help. Really.