## What Are Sets?

A set is a collection of distinct objects of similar type. They are a fundamental construct in mathematics. Groups of elements, such as numbers, be it integers, real numbers, floats, and even null types, are commonly grouped as sets when discussing set theory.

Given their fundamental stature in the world of integers and floats, it was only natural that programming languages would have data structures that reflect set properties in the world of bits and bytes. In Python, you can define sets (`{}`

) using curly braces or the set() function. Sets in Python are mutable and can be modified using various built-in methods. Java also provides a built-in Set interface, with variant implementations such as HashSet, TreeSet, and LinkedHashSet. In C++, the Standard Template Library (STL) provides the set container, which allows for efficient insertion, deletion, and searching of elements. Similarly, in C#, the HashSet and SortedSet classes can be used to implement sets.

## Properties of Sets

**Uniqueness**: Sets contain only unique elements, meaning that there are no duplicate elements within a set.**Unordered**: Sets do not have any inherent order to their elements, meaning that the elements can be listed in any order.**Mutability**: Fundamentally, sets should be mutable. Most programming languages have base set implementations that allow you to add or remove elements from a set.**Subsets**: A set can be a subset of another set if all of its elements are also contained within the other set. The empty set is a fundamental subset of all sets. It is mathematically defined as`∅`

**Equality**: Two sets are equal if they contain the same elements, regardless of the order in which they are listed.

## Common Set Operations

Operation | Description | Time Complexity |
---|---|---|

union(set1, set2) | Returns a set containing all elements from both set1 and set2 | O(len(set1) + len(set2)) |

intersection(set1, set2) | Returns a set containing only elements that are in both set1 and set2 | O(min(len(set1), len(set2))) |

difference(set1, set2) | Returns a set containing only elements that are in set1 but not in set2 | O(len(set1)) |

symmetric_difference(set1, set2) | Returns a set containing only elements that are in set1 but not in set2 | O(len(set1) + len(set2)) |

insertion(element, set) | Returns a set containing only elements that are in either set1 or set2, but not both | O(1) |

deletion(element, set) | Removes element from set | O(1) |

lookup(element, set) | Returns True if element is in set, False otherwise | O(1) |

update(set1, set2) | Updates set1 with elements from set2 | O(len(set2)) |

The examples below illustrate these operations:

Operation | Set 1 | Set 2 | Result |
---|---|---|---|

union | {1, 2, 3} | {3, 4, 5} | {1, 2, 3, 4, 5} |

intersection | {1, 2, 3} | {3, 4, 5} | {3} |

difference | {1, 2, 3} | {3, 4, 5} | {1, 2} |

symmetric_difference | {1, 2, 3} | {3, 4, 5} | {1, 2, 4, 5} |

insertion | 4, {1, 2, 3} | - | {1, 2, 3, 4} |

deletion | 3, {1, 2, 3, 4} | - | {1, 2, 4} |

lookup | 3, {1, 2, 3, 4} | - | True |

update | {1, 2}, {2, 3, 4} | - | {1, 2, 3, 4} |

## Sets Implementation in Various Programming Languages

**English**: According to Shakespeare, …Okay, I’m kidding!

Overall, set implementation in programming languages varies slightly, but the basic principles and functionality remain the same.

Language | Syntax |
---|---|

Python | set1 = {1, 2, 3} or set1 = set([1, 2, 3]) |

Java | Set<Integer> set1 = new HashSet<>(); |

C++ | std::set<int> set1; |

C# | HashSet<int> set1 = new HashSet<int>(); |

Javascript | let set1 = new Set([1, 2, 3]); |

## When to Use Sets in Interviews

Two common scenarios should always trigger your set antennas:

**Quick lookups**: Sets have an amortized constant lookup, insertion, and deletion time complexity. Any time you have to track elements, be it when pathfinding or performing deduplication tasks, requiring a look back to what elements you have seen before, use sets. Their lookup, insertion, and deletion complexities guarantee excellent performance.**Unique elements**: Homogeneity entails all elements being of the same type. Be it Integers, strings, floats, etc, any time you have a collection of similar element types and need to ensure no duplicates, using a set is the way to go.

## Common Mistakes in Interviews Featuring Sets

### Using Hashtables Instead of Sets

By far the most common mistake candidates make is simply failing to use sets. Hashtables also give you quick lookups, but they come with a lot of overhead. Each insertion requires both a key and a value; thus, a hashtable, by default, will always use at least twice the memory a set would track the same unique collection of elements. Similarly, a single insertion operation into a set would require two in a hashtable, as they require both the key and values set. Anytime you need to track unique elements and do not need to save values, occurrence frequencies, or such, a set should be the go-to data structure.

### Using List/Array Operations on Sets

Lists will usually allow you to append elements to the end of the list. They also allow you to perform splices and even mix element types. This is because lists have an inherent order. Sets, however, do not. Any operations that leverage the order of elements will usually not be available in base set implementations. You can add elements to a set, but unless it is an ordered set, you are not guaranteed that it will be at the end. As such, you cannot pop from a set. Instead, you `delete`

an element from the set. Below is a table of common operations on sets and lists that get mixed up and set specific idiosyncrasies to keep an eye out for.

List Operation | Set Operation | Notes |
---|---|---|

list.append(elem) | set.add(elem) | list.append() adds an element to the end of the list. set.add() will add elem to the set but does not guarantee order, only presence. |

list.pop() | set.pop() | list.pop() removes and returns the last element of the list. set.pop() removes and returns an arbitrary element of the set. |

list.extend(lst) | set.update(other_set) | list.extend() appends the elements of list to the end of the list. set.update() adds the elements of other_set to the set in no inherent order. |

list.remove(elem) | set.remove(elem) | list.remove() removes the first occurrence of elem from the list. set.remove(elem) removes elem from the set, or raises an error if elem is not in the set. |

Do note that the above is based on Python, but the behavior tends to be similar in most languages. Confirm the nuances by reading through the documentation of your preferred language. This section primarily aims to spotlight how differently common operations behave.

### Trying to Use an Index as a Look Up Element

Candidates often make the mistake of trying to access elements in the set using indices because lists and sets are similar. Inherently, sets do not have an order, and so elements aren’t assigned an index. As such, you cannot retrieve elements by their position. That said, lookups in sets are `O(1)`

, so simply searching for the element should be performant enough.

### Mixing Element Types

When working with sets, it is important to ensure that you add elements of the same type. Candidates often forget to check the data types of the elements they add to the set or to standardize the data type when type-casting elements. This is especially common when converting lists to sets. Adding elements of different types to a set can result in unpredictable behavior or type errors. Sets are designed to contain only elements of a single data type. To avoid this mistake, make it a habit to explicitly cast elements to a standard type (The string type is a good fallback as most objects and data types can be cast to string).

## What to Say in Interviews to Show Mastery Over Sets

### Clarify Element Type

During interviews, ensure you explicitly declare what data type you will be adding to your set. Type check or even cast when inserting if necessary, but make sure the type is clear. The goal is to avoid type errors while also showcasing a fundamental understanding of the data type homogeneity property of sets.

### Complexity of Operations

In your interviews, be mindful of set operation complexities. Most “advanced” set operations are expensive, but candidates don’t realize it. Talk to your interviewer about the worst-case as this is the expected benchmark when discussing performance. An example:

*To perform the intersection operation, we need to see if each element in one set is in the other set. Lookup being O(1) means that the intersection will cost at least O(min(len(set1), len(set(2)))), assuming we are going through elements of the smaller set and looking up their presence in the larger set. The space complexity is also O(min(len(set1), len(set(2)))) as in the worst case, all elements of the smaller set will be in the large set.*

### Knowing When to Use Sets vs. Hash Tables

While you can get away with using either of these two in most situations, try to use the right structure for the job. A simple heuristic to use is:

**Do I need to remember any values?** Examples are frequency counting or tracking occurrences or positions. If yes, a hashtable is the way to go. Otherwise, a set should suffice.

### Partition Equal Subset Sum

Given an array of positive numbers, determine if the array can be split such that the two partition sums are equal.

### Transformation Dictionary

Given a dictionary of words, determine whether it is possible to transform a given word into another with a fixed number of characters.

## About the Author

Githire (Brian) is a backend and ML engineer with 7 YoE ranging from startups to major corporations. He has worked on tech serving a wide demographic ranging from mobile money in his homeland Kenya, embedded tech with Kakao in South Korea to MLE at Microsoft. Brian has also worked as a teacher and has a knack for writing technical articles

## About interviewing.io

interviewing.io is a **mock interview practice platform**. We've hosted over 100K mock interviews, conducted by senior engineers from FAANG & other top companies. We've drawn on data from these interviews to bring you the best interview prep resource on the web.