diff --git a/book_roadmap.md b/book_roadmap.md deleted file mode 100644 index 12b31f5..0000000 --- a/book_roadmap.md +++ /dev/null @@ -1,39 +0,0 @@ -## This is a list of recommended software related books. - -I will attempt to update this list as I also move forward in my career. Take this as a good rough guide to some very good books. I perfer to read books that are less dry than your average textbook. - -### The Basics -Data structures and algothrims are the bread and butter of creating good code, so are good design and clean structure. -- Cracking the coding interview -- Elements of programming interviews -- Clean Code by Robert Martin -- Head First Design Patterns: A Brain-Friendly Guide - -### Python -- Python Tricks: A Buffet of Awesome Python Features - -## C++ -TODO - -### Linux -Linux is a very powerful operating system and its the most common OS used in the software industry. Thats because its the most customizable, flexible OS there is. Maintained by many contributors, very few bugs/crash, high uptime (no need to restart after updates), and less prone to viruses due to permissions. Its important to learn how to develop around linux as you will stumble upon some flavor of linux during your software career. Best to learn it early. -- The Linux Command Line by William Shotts -- How Linux Works, 2nd Edition: What Every Superuser Should Know -- Wicked Cool Shell Scripts, 2nd Edition: 101 Scripts for Linux, OS X, and UNIX Systems - -### Architecture Design -Once you get a few months of being a programmer, its important to learn how to architect your code. This will allow you to merge what you know about design patterns into code that is scalable and maintainable throughout the years to come. 
-- Domain-Driven Design: Tackling Complexity in the Heart of Software -- Clean Architecture: A Craftsman's Guide to Software Structure and Design - -### DevOps: Continuous Integration and Continuous Deployment -You don't need to necessarily understand the full scope of DevOps as a programmer but quite franking you should at-least understand the perspective of a DevOps engineer you are coding for. Utlimately, your code has to be executed and if its coded in a way that is hard to configure, deploy, compile or test its going to making the operation teams' lives difficult. Additionally, too many companies do not have a good software development pipeline in place. They do not follow the idea of continuous testing per commit nor do they have automatic ways to midgating risk of deploying bad code to production. -- The DevOps Handbook: How to Create World-Class Agility, Reliability, and Security in Technology Organizations - -### Life Skills -Life skills is rather important, as software engineers tend to solo themselves rather than work together with the team. As you age, you will find that you can't finish the project alone and you have to leverage your teammates at some point to get the job done. How you communicate with your peers is just as important as how you code. You have to work with people in this world, there is no avoiding it. -- How to Win Friends & Influence People -- The Schmuck in My Office: How to Deal Effectively with Difficult People at Work - -### Management -- Principles by Ray Dalio diff --git a/good_coding_style_tips.md b/good_coding_style_tips.md index 61b7ca0..ba3a160 100644 --- a/good_coding_style_tips.md +++ b/good_coding_style_tips.md @@ -3,10 +3,12 @@ Ideal, we want to write functions that always return something back. It gets confusing when you have a list of functions and there isn't a clear distinction on whether if passing an object into that function modifies your object or not. 
So that is why its generally a good rule of thumb to instead return an object versus passing an object to modify it. -There are some exception during interviews, it tends to deal with recursion. Sometimes there can be a problem where you are storing a temporary result and until some thing like reaching the end of the index, do you save it into the global result. -In this scenario, you will have to pass in a global_result object list for each recursion because you won't know when the save trigger occurs. +There are some exceptions during interviews; they tend to involve recursion. +Sometimes a problem requires you to accumulate a temporary result, and only upon reaching a base case, like the end of a list, do you save that temporary result into the final result. +In this scenario, you will have to pass a mutable object into each recursive call because you won't know in advance when the save trigger occurs. -However, outside of recursion, I would recommend to always return an object over passing in an object and modifying it. +However, outside of recursion, I would recommend always returning an object instead of passing one in and modifying it. +There are sometimes exceptions to this, depending on the question. # SOLID Principles ### Single Responsiblity Principle diff --git a/leetcode/#3_longest_substring_without_repeats.md b/leetcode/archive/#003_longest_substring_without_repeats.md similarity index 100% rename from leetcode/#3_longest_substring_without_repeats.md rename to leetcode/archive/#003_longest_substring_without_repeats.md diff --git a/leetcode/#127_word_ladder.md b/leetcode/archive/#127_word_ladder.md similarity index 100% rename from leetcode/#127_word_ladder.md rename to leetcode/archive/#127_word_ladder.md diff --git a/leetcode/#138. Copy List With Random Pointer.md b/leetcode/archive/#138. Copy List With Random Pointer.md similarity index 100% rename from leetcode/#138. Copy List With Random Pointer.md rename to leetcode/archive/#138. 
Copy List With Random Pointer.md diff --git a/leetcode/#139. Word Break.md b/leetcode/archive/#139. Word Break.md similarity index 100% rename from leetcode/#139. Word Break.md rename to leetcode/archive/#139. Word Break.md diff --git a/leetcode/#140._word_break_ii.md b/leetcode/archive/#140._word_break_ii.md similarity index 100% rename from leetcode/#140._word_break_ii.md rename to leetcode/archive/#140._word_break_ii.md diff --git a/leetcode/#146_LRU_Cache.md b/leetcode/archive/#146_LRU_Cache.md similarity index 100% rename from leetcode/#146_LRU_Cache.md rename to leetcode/archive/#146_LRU_Cache.md diff --git a/leetcode/#198. House Robber.md b/leetcode/archive/#198. House Robber.md similarity index 100% rename from leetcode/#198. House Robber.md rename to leetcode/archive/#198. House Robber.md diff --git a/leetcode/#200. Number of Islands.md b/leetcode/archive/#200. Number of Islands.md similarity index 100% rename from leetcode/#200. Number of Islands.md rename to leetcode/archive/#200. Number of Islands.md diff --git a/leetcode/#212._word_search_ii.md b/leetcode/archive/#212._word_search_ii.md similarity index 100% rename from leetcode/#212._word_search_ii.md rename to leetcode/archive/#212._word_search_ii.md diff --git a/leetcode/#215. Kth Largest Element in an Array.md b/leetcode/archive/#215. Kth Largest Element in an Array.md similarity index 100% rename from leetcode/#215. Kth Largest Element in an Array.md rename to leetcode/archive/#215. Kth Largest Element in an Array.md diff --git a/leetcode/#218_skyline_problem.md b/leetcode/archive/#218_skyline_problem.md similarity index 100% rename from leetcode/#218_skyline_problem.md rename to leetcode/archive/#218_skyline_problem.md diff --git a/leetcode/#221. Maximal Square.md b/leetcode/archive/#221. Maximal Square.md similarity index 100% rename from leetcode/#221. Maximal Square.md rename to leetcode/archive/#221. 
Maximal Square.md diff --git a/leetcode/#22_generate_parentheses.md b/leetcode/archive/#22_generate_parentheses.md similarity index 100% rename from leetcode/#22_generate_parentheses.md rename to leetcode/archive/#22_generate_parentheses.md diff --git a/leetcode/#23. Merge k Sorted Lists.md b/leetcode/archive/#23. Merge k Sorted Lists.md similarity index 100% rename from leetcode/#23. Merge k Sorted Lists.md rename to leetcode/archive/#23. Merge k Sorted Lists.md diff --git a/leetcode/#289._game_of_life.md b/leetcode/archive/#289._game_of_life.md similarity index 100% rename from leetcode/#289._game_of_life.md rename to leetcode/archive/#289._game_of_life.md diff --git a/leetcode/#295_find_median_from_data_stream.md b/leetcode/archive/#295_find_median_from_data_stream.md similarity index 100% rename from leetcode/#295_find_median_from_data_stream.md rename to leetcode/archive/#295_find_median_from_data_stream.md diff --git a/leetcode/#355. Design Twitter b/leetcode/archive/#355. Design Twitter similarity index 100% rename from leetcode/#355. Design Twitter rename to leetcode/archive/#355. Design Twitter diff --git a/leetcode/#367. Valid Perfect Square.md b/leetcode/archive/#367. Valid Perfect Square.md similarity index 100% rename from leetcode/#367. Valid Perfect Square.md rename to leetcode/archive/#367. Valid Perfect Square.md diff --git a/leetcode/#399_evaluate_division.md b/leetcode/archive/#399_evaluate_division.md similarity index 100% rename from leetcode/#399_evaluate_division.md rename to leetcode/archive/#399_evaluate_division.md diff --git a/leetcode/#402. Remove K Digits.md b/leetcode/archive/#402. Remove K Digits.md similarity index 100% rename from leetcode/#402. Remove K Digits.md rename to leetcode/archive/#402. Remove K Digits.md diff --git a/leetcode/#42. Trapping Rain Water b/leetcode/archive/#42. Trapping Rain Water similarity index 100% rename from leetcode/#42. Trapping Rain Water rename to leetcode/archive/#42. 
Trapping Rain Water diff --git a/leetcode/#44. Wildcard Matching.md b/leetcode/archive/#44. Wildcard Matching.md similarity index 100% rename from leetcode/#44. Wildcard Matching.md rename to leetcode/archive/#44. Wildcard Matching.md diff --git a/leetcode/#5. Longest Palindromic Substring.md b/leetcode/archive/#5. Longest Palindromic Substring.md similarity index 100% rename from leetcode/#5. Longest Palindromic Substring.md rename to leetcode/archive/#5. Longest Palindromic Substring.md diff --git a/leetcode/#516. Longest Palindromic Subsequence.md b/leetcode/archive/#516. Longest Palindromic Subsequence.md similarity index 100% rename from leetcode/#516. Longest Palindromic Subsequence.md rename to leetcode/archive/#516. Longest Palindromic Subsequence.md diff --git a/leetcode/#518. Coin Change 2.md b/leetcode/archive/#518. Coin Change 2.md similarity index 100% rename from leetcode/#518. Coin Change 2.md rename to leetcode/archive/#518. Coin Change 2.md diff --git a/leetcode/#542._01_matrix.md b/leetcode/archive/#542._01_matrix.md similarity index 100% rename from leetcode/#542._01_matrix.md rename to leetcode/archive/#542._01_matrix.md diff --git a/leetcode/#56. Merge Intervals.md b/leetcode/archive/#56. Merge Intervals.md similarity index 100% rename from leetcode/#56. Merge Intervals.md rename to leetcode/archive/#56. Merge Intervals.md diff --git a/leetcode/#587.Erect_the_fence.md b/leetcode/archive/#587.Erect_the_fence.md similarity index 100% rename from leetcode/#587.Erect_the_fence.md rename to leetcode/archive/#587.Erect_the_fence.md diff --git a/leetcode/#6. ZigZag Conversion.md b/leetcode/archive/#6. ZigZag Conversion.md similarity index 100% rename from leetcode/#6. ZigZag Conversion.md rename to leetcode/archive/#6. 
ZigZag Conversion.md diff --git a/leetcode/archive/076_minimum_window_substring.md b/leetcode/archive/076_minimum_window_substring.md new file mode 100644 index 0000000..f3e086a --- /dev/null +++ b/leetcode/archive/076_minimum_window_substring.md @@ -0,0 +1,178 @@ +# 76. Minimum Window Substring + +## Two Pointer with Map Solution +- Runtime: O(S * T) + O(T) but O(S * (S+T)) + O(T) due to string slicing +- Space: O(S) + O(T) +- S = Number of characters in string S +- T = Number of characters in string T + +The problem requires the substring to contain all characters in T, and it wants the smallest such substring. +With that, we have to begin the solution by finding the first instance of such a substring. +Once we have that, we need to prune it, removing the extra characters from this substring to get the minimum substring. +Finally, we keep repeating this until we reach the end. +We can traverse the string via two pointers, a left and a right iterator. +Right is bound within the given string S and left is bound within the substring created by the right pointer. +This is an example of a sliding window. + +To figure out if we have all the characters in this substring, we would have to count T via a dictionary. +This is our known count that is required for each substring. +Then we have to keep a dynamic counter, decrementing and incrementing character counts as we move across the string. +With this, after we move the right pointer to the right, we can use these two dictionaries to check if this is a substring that meets the requirements. +If so, we can then prune with the left pointer, moving it toward the right pointer until the substring no longer meets the requirement. + +For the run-time, worst case, we will be visiting each character twice. + +You may think that the run-time for this is exponential, especially when we are checking the two dictionaries. 
+However, don't be mistaken, the comparison actually runs in constant O(T) time; it doesn't change based on S, only on T. +There is one slight problem: python's implementation of string concatenation is actually **O(N)**. +When the question wants the actual substring and not a count, even using a deque will not solve this problem. +So this implementation is technically **O(S * (S+T))** due to python. + +We could use indexes instead of string slices; however, the code gets complex. +When implementing these types of two pointer questions, I recommend to **avoid using indexes as much as possible and use iterators**. +It is very easy to get an off-by-one error doing these and within a 30 minute timeframe, it is very risky. +Just talk about how to refactor with indexes. + +``` +from collections import defaultdict +from collections import Counter + +class Solution: + def minWindow(self, s: str, t: str) -> str: + all_ch_counts = Counter(t) + ch_to_n_counts = defaultdict(int) + str_builder, min_substr, found = '', s, False + for right_ch in s: + ch_to_n_counts[right_ch] += 1 + str_builder += right_ch + if chars_occur_ge(ch_to_n_counts, all_ch_counts): + for left_ch in str_builder: + if chars_occur_ge(ch_to_n_counts, all_ch_counts): + found = True + if len(str_builder) < len(min_substr): + min_substr = str_builder + else: + break + ch_to_n_counts[left_ch] -= 1 + str_builder = str_builder[1:] + return min_substr if found else '' + +def chars_occur_ge(ch_to_n_counts, all_ch_counts): + for ch, count in all_ch_counts.items(): + if ch_to_n_counts[ch] < count: + return False + return True +``` + +## Two Pointer with Map Solution (Optimized) +- Runtime: O(S) + O(T) but O(S * S) + O(T) due to string slicing +- Space: O(S) + O(T) +- S = Number of characters in string S +- T = Number of characters in string T + +We can further improve the solution by optimizing the way we check if it's a valid substring. 
+We can still use a dictionary to count the occurrences, but we can also keep a separate count for the unique characters in T. +This will represent the number of unique valid characters of T. + +If T = 'abcc', then there are 3 keys in the dictionary. +Whenever we increment a key in the dictionary, we can then compare dictionary T's count with dictionary S's count. +If S's count equals exactly what T's count is, then we just got one of the 3 keys validated. +If we decrement, and S's count is T's count-1, then we just invalidated one of those keys. + +This will remove the need to traverse all the keys whenever we need to revalidate the substring. +Hence, this validation will run in O(1). String slicing will still slow us down however. + +``` +from collections import defaultdict +from collections import Counter + +class Solution: + def minWindow(self, s: str, t: str) -> str: + char_counter = CharacterCounter(t) + str_builder, min_substr, found = '', s, False + for right_ch in s: + char_counter += right_ch + str_builder += right_ch + if char_counter.is_valid: + for left_ch in str_builder: + if char_counter.is_valid: + found = True + if len(str_builder) < len(min_substr): + min_substr = str_builder + else: + break + char_counter -= left_ch + str_builder = str_builder[1:] + return min_substr if found else '' + +class CharacterCounter: + def __init__(self, source_str): + self._ch_to_n_counts = defaultdict(int) + self._source_counts = Counter(source_str) + self._n_valid_chars = 0 + + def __iadd__(self, char): + self._ch_to_n_counts[char] += 1 + if char in self._source_counts and self._ch_to_n_counts[char] == self._source_counts[char]: + self._n_valid_chars += 1 + return self + + def __isub__(self, char): + self._ch_to_n_counts[char] -= 1 + if char in self._source_counts and self._ch_to_n_counts[char] == self._source_counts[char]-1: + self._n_valid_chars -= 1 + return self + + @property + def is_valid(self): + return self._n_valid_chars == len(self._source_counts.keys()) 
+``` + +## Two Pointer with Map Solution (No String slicing) +- Runtime: O(S) + O(T) +- Space: O(S) + O(T) +- S = Number of characters in string S +- T = Number of characters in string T + +``` +from collections import defaultdict +from collections import Counter + +class Solution: + def minWindow(self, s: str, t: str) -> str: + char_counter = CharacterCounter(t) + left_i, left_i_result, right_i_result = 0, 0, len(s)-1 + found = False + for right_i, right_ch in enumerate(s): + char_counter += right_ch + if char_counter.is_valid: + while left_i <= right_i and char_counter.is_valid: + found = True + if right_i-left_i < right_i_result-left_i_result: + right_i_result, left_i_result = right_i, left_i + char_counter -= s[left_i] + left_i += 1 + return s[left_i_result:right_i_result+1] if found else '' + +class CharacterCounter: + def __init__(self, source_str): + self._ch_to_n_counts = defaultdict(int) + self._source_counts = Counter(source_str) + self._n_valid_chars = 0 + + def __iadd__(self, char): + self._ch_to_n_counts[char] += 1 + if char in self._source_counts and self._ch_to_n_counts[char] == self._source_counts[char]: + self._n_valid_chars += 1 + return self + + def __isub__(self, char): + self._ch_to_n_counts[char] -= 1 + if char in self._source_counts and self._ch_to_n_counts[char] == self._source_counts[char]-1: + self._n_valid_chars -= 1 + return self + + @property + def is_valid(self): + return self._n_valid_chars == len(self._source_counts.keys()) +``` diff --git a/leetcode/archive/README.md b/leetcode/archive/README.md new file mode 100644 index 0000000..2d52663 --- /dev/null +++ b/leetcode/archive/README.md @@ -0,0 +1,3 @@ +# Archive + +This folder represents solutions that are no longer of high enough quality. \ No newline at end of file diff --git a/leetcode/easy/053_maximum_subarray.md b/leetcode/easy/053_maximum_subarray.md new file mode 100644 index 0000000..c67f715 --- /dev/null +++ b/leetcode/easy/053_maximum_subarray.md @@ -0,0 +1,25 @@ +# 53. 
Maximum Subarray + +## Kadane's Algorithm +- Run-time: O(N) +- Space: O(1) +- N = Number of elements in the array + +Kadane's Algorithm is a good thing to know. +There are proofs out there explaining why this works if you are interested. +Overall, it is a dynamic programming solution at its core. + +Because the question is asking for a contiguous subarray, we can use the previous sums to find our max sum for a given n. +The main idea is that there are two choices to be made: whether (previous sum + n) or (n) is larger. + +``` +class Solution: + def maxSubArray(self, nums: List[int]) -> int: + if len(nums) == 0: + return 0 + max_sum, prev_sum = nums[0], nums[0] + for n in nums[1:]: + prev_sum = max(n, n+prev_sum) + max_sum = max(max_sum, prev_sum) + return max_sum +``` diff --git a/leetcode/easy/101_symmetric_tree.md b/leetcode/easy/101_symmetric_tree.md new file mode 100644 index 0000000..cb4f126 --- /dev/null +++ b/leetcode/easy/101_symmetric_tree.md @@ -0,0 +1,54 @@ +# 101. Symmetric Tree + +## Recursive Solution +- Runtime: O(N) +- Space: O(N) +- N = Number of nodes in tree + +I recommend drawing out a tree of height 4 to find the intuition for the solution. +You should notice that every time you want to go left, you care about the right-most node of the tree. +The tough part is figuring out what to do with the inner nodes. + +Have your recursion pass in two nodes at a time, a left node and a right node, and traverse down the tree one level at a time. +You can then call the recursion twice; one of the recursive calls will go left then right while the other will go right then left. +This will allow you to pair up nodes that are across the span of the tree. + +You can also think of this by dividing the tree in two parts, say a left tree and a right tree. +You give your recursion two nodes or "two root nodes" and go down the tree at the same time. +Just remember the question is about being a mirror. 
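As a quick sanity check of the pairing idea above, here is a hedged sketch (the minimal TreeNode class is an assumption; LeetCode normally supplies it) that lists which node pairs the mirrored traversal compares on a small symmetric tree:

```
class TreeNode:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def mirror_pairs(n1, n2, pairs):
    # Collect the (left, right) value pairs the mirrored traversal compares.
    if n1 is None or n2 is None:
        return
    pairs.append((n1.val, n2.val))
    mirror_pairs(n1.left, n2.right, pairs)   # outer children
    mirror_pairs(n1.right, n2.left, pairs)   # inner children

#      1
#     / \
#    2   2
#   / \ / \
#  3  4 4  3
root = TreeNode(1,
                TreeNode(2, TreeNode(3), TreeNode(4)),
                TreeNode(2, TreeNode(4), TreeNode(3)))
pairs = []
mirror_pairs(root.left, root.right, pairs)
print(pairs)  # [(2, 2), (3, 3), (4, 4)]
```

Each compared pair sits mirrored across the center line, which is exactly what the recursion below checks.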
+ +``` class Solution: + def isSymmetric(self, root: TreeNode) -> bool: + def is_symmetric_helper(root, other_node): + if root is None and other_node is None: + return True + if root is None or other_node is None: + return False + return root.val == other_node.val \ + and is_symmetric_helper(root.left, other_node.right) \ + and is_symmetric_helper(root.right, other_node.left) + return is_symmetric_helper(root, root) +``` + +## Iterative Solution +- Runtime: O(N) +- Space: O(N) +- N = Number of nodes in tree + +``` +class Solution: + def isSymmetric(self, root: TreeNode) -> bool: + stack = list([(root, root)]) + while len(stack) != 0: + n1, n2 = stack.pop() + if n1 is None and n2 is None: + continue + if n1 is None or n2 is None: + return False + if n1.val != n2.val: + return False + stack.append((n1.left, n2.right)) + stack.append((n1.right, n2.left)) + return True +``` diff --git a/leetcode/easy/110_balanced_binary_tree.md b/leetcode/easy/110_balanced_binary_tree.md new file mode 100644 index 0000000..82e9948 --- /dev/null +++ b/leetcode/easy/110_balanced_binary_tree.md @@ -0,0 +1,31 @@ +# 110. Balanced Binary Tree + +## Recursive solution + +- Runtime: O(N) +- Space: O(H) +- N = Number of nodes in tree +- H = Height of tree + +The definition of a balanced tree is that, for each node, the difference between its children's heights is never greater than 1. We can perform a post-order traversal and compare the heights returned by both children. If we encounter an imbalance, we skip any further traversals and head back up to the root node, using an arbitrary value like -1 to signal the imbalance. + +Worst case is if the tree is balanced and we have to visit every node. +However, no matter what, we will end up using at most O(H) space to traverse the tree. 
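To see the -1 sentinel in action before the full solution, here is a hedged sketch (the TreeNode class and helper name are my own; LeetCode normally supplies TreeNode) run on a balanced and an unbalanced tree:

```
class TreeNode:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def height_or_sentinel(root):
    # Post-order: return the subtree height, or -1 as soon as any imbalance is seen.
    if root is None:
        return 0
    left = height_or_sentinel(root.left)
    if left == -1:
        return -1
    right = height_or_sentinel(root.right)
    if right == -1:
        return -1
    return max(left, right) + 1 if abs(left - right) <= 1 else -1

balanced = TreeNode(1, TreeNode(2), TreeNode(3))
skewed = TreeNode(1, TreeNode(2, TreeNode(3, TreeNode(4))))  # left-only chain
print(height_or_sentinel(balanced))  # 2
print(height_or_sentinel(skewed))    # -1
```

The sentinel short-circuits the traversal: once any subtree reports -1, every ancestor returns -1 without exploring its other child.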
+ +``` class Solution: + def isBalanced(self, root: TreeNode) -> bool: + + def balance_helper(root): + if root is None: + return 0 + left = balance_helper(root.left) + if left == -1: + return -1 + right = balance_helper(root.right) + if right == -1: + return -1 + return max(left, right)+1 if abs(left-right) <= 1 else -1 + + return balance_helper(root) != -1 +``` diff --git a/leetcode/easy/235_lowest_common_ancestor_of_a_binary_search_tree.md b/leetcode/easy/235_lowest_common_ancestor_of_a_binary_search_tree.md new file mode 100644 index 0000000..aeb22a5 --- /dev/null +++ b/leetcode/easy/235_lowest_common_ancestor_of_a_binary_search_tree.md @@ -0,0 +1,35 @@ +# 235. Lowest Common Ancestor of a Binary Search Tree + +## Iterative Solution + +- Runtime: O(H) +- Space: O(1) +- H = Height of the tree + +I recommend doing question 236 on a binary tree first before doing this question. +Compared to a binary tree, there are some optimizations we can do to improve the solution. + +Since this is a BST, we can easily figure out where p and q are in relation to the root node. +Using this, we can build the following intuition. +- If p and q exist on separate sub-trees, left and right, then the root must be the LCA. +- If p and q both exist on the right, we should traverse right. +- If p and q both exist on the left, we should traverse left. +- If the root is p or q (say, root is p), then since we are traversing towards both p and q, q must be in one of root's sub-trees, which means root must be the LCA. 
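The rules above can be exercised on a small BST; this is a hedged sketch (the TreeNode class, helper name, and sample tree are assumptions for illustration, not the question's scaffolding):

```
class TreeNode:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def lca_bst(root, p, q):
    # Walk down from the root, applying the BST ordering rules.
    if p.val > q.val:  # keep p as the smaller of the two
        p, q = q, p
    curr = root
    while curr:
        if p.val < curr.val < q.val:  # p and q split across both sub-trees
            return curr
        if curr is p or curr is q:    # reached one node on the path to the other
            return curr
        curr = curr.left if q.val < curr.val else curr.right
    return root

#        6
#      /   \
#     2     8
#    / \
#   0   4
n0, n4 = TreeNode(0), TreeNode(4)
n2 = TreeNode(2, n0, n4)
n8 = TreeNode(8)
root = TreeNode(6, n2, n8)
print(lca_bst(root, n2, n8).val)  # 6 (p and q split across the root's sub-trees)
print(lca_bst(root, n0, n4).val)  # 2 (both live under node 2)
```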
+ +``` class Solution: + def lowestCommonAncestor(self, root: 'TreeNode', p: 'TreeNode', q: 'TreeNode') -> 'TreeNode': + curr = root + if p.val > q.val: # p always smaller than q + p, q = q, p + while curr: + if p.val < curr.val and q.val > curr.val: # p and q exist on both sub-trees + return curr + elif curr is p or curr is q: # root is p or q + return curr + elif p.val < curr.val and q.val < curr.val: # p and q both exist on the left sub-tree + curr = curr.left + else: # p and q both exist on the right sub-tree + curr = curr.right + return root +``` diff --git a/leetcode/easy/496_next_greater_element_I.md b/leetcode/easy/496_next_greater_element_I.md new file mode 100644 index 0000000..f1b31b5 --- /dev/null +++ b/leetcode/easy/496_next_greater_element_I.md @@ -0,0 +1,36 @@ +# 496. Next Greater Element I + +## Stack Solution +- Runtime: O(N) +- Space: O(N) +- N = Number of elements in both lists + +This is an example of where a monotonic stack is useful. +Monotonic stacks only contain elements that are increasing or decreasing in value. + +Therefore, in this case, we can keep a stack whose values decrease from bottom to top. +For each number in nums2, if we encounter an element larger than the one on top of the stack, we can pop off the top of the stack and map the popped element to the current number as its next greater element. +We continue popping and mapping until the stack is empty or its top is larger, then we add the current number onto the stack. +At the end, we will have a mapping for each number in nums2 to its next greater element using this method. 
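To make the monotonic-stack behavior concrete, here is a hedged sketch (the tiny sample array and function name are my own) that records the stack after each push, showing it stays in decreasing order from bottom to top:

```
def next_greater_trace(nums):
    stack, mapping, snapshots = [], {}, []
    for num in nums:
        # Pop every smaller element; num is their next greater element.
        while stack and num > stack[-1]:
            mapping[stack.pop()] = num
        stack.append(num)
        snapshots.append(list(stack))  # record the stack state
    return mapping, snapshots

mapping, snapshots = next_greater_trace([2, 1, 3])
print(mapping)    # {1: 3, 2: 3}
print(snapshots)  # [[2], [2, 1], [3]]
```

Pushing 3 pops both 1 and 2, so each snapshot remains decreasing from bottom to top, and anything left on the stack at the end has no greater element (hence the defaultdict of -1 in the solution below).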
+ +``` +from collections import defaultdict + +class Solution: + def nextGreaterElement(self, nums1: List[int], nums2: List[int]) -> List[int]: + + def get_greater_map(): + stack = list() + greater_map = defaultdict(lambda:-1) + for num in nums2: + while len(stack) and num > stack[-1]: + greater_map[stack.pop()] = num + stack.append(num) + return greater_map + + greater_map = get_greater_map() + result = list() + for num in nums1: + result.append(greater_map[num]) + return result +``` diff --git a/leetcode/easy/937_reorder_data_in_log_files.md b/leetcode/easy/937_reorder_data_in_log_files.md new file mode 100644 index 0000000..45072c7 --- /dev/null +++ b/leetcode/easy/937_reorder_data_in_log_files.md @@ -0,0 +1,22 @@ +# 937. Reorder Data in Log Files + +## Sorted In Place Solution +- Runtime: O(Nlog(N)) +- Space: O(1) +- N = Number of logs + +This is a common Amazon question asked during online assessment tests. + +``` +class Solution: + def reorderLogFiles(self, logs: List[str]) -> List[str]: + def getKey(log): + identifier, words = log.split(' ', 1) + if words[0].isdigit(): + return (1, 0, 0) + else: + return (0, words, identifier) + + logs.sort(key=getKey) + return logs +``` diff --git a/leetcode/hard/004_median_of_two_sorted_arrays.md b/leetcode/hard/004_median_of_two_sorted_arrays.md new file mode 100644 index 0000000..392878c --- /dev/null +++ b/leetcode/hard/004_median_of_two_sorted_arrays.md @@ -0,0 +1,51 @@ +# 4. Median of Two Sorted Arrays + +## Linear Solution +- Run-time: O(N) +- Space: O(1) +- N = Number of elements in both lists + +For the first solution, let's dissect what it means to find a median in two sorted lists. + +Some properties we know are: +- Total length of both lists. +- Both lists are sorted. + +If you had one list, it would be as easy as computing mid = total length // 2; the median is list[mid] if the total length is odd, else the average of list[mid-1] and list[mid]. +Since we know the total length, we can still determine how many steps we need to traverse until we hit the median. 
If we imagine both lists merged into one, we can treat that merged list as having a left array and a right array. +Knowing how big the left array would be and the fact that both lists are sorted, we can traverse the left side of both lists until we have traversed enough indexes. +Once the left-most indexes of both lists are found, index + 1 in each list would be part of the right array. + +``` +class Solution: + def findMedianSortedArrays(self, nums1: List[int], nums2: List[int]) -> float: + + def get_right_most_indexes_of_array1(): + n_moves_to_med = (len(nums1) + len(nums2)) // 2 + left1_idx = left2_idx = -1 + while n_moves_to_med and left1_idx+1 < len(nums1) and left2_idx+1 < len(nums2): + n_moves_to_med -= 1 + if nums1[left1_idx+1] < nums2[left2_idx+1]: + left1_idx += 1 + else: + left2_idx += 1 + while n_moves_to_med and left1_idx+1 < len(nums1): + n_moves_to_med -= 1 + left1_idx += 1 + while n_moves_to_med and left2_idx+1 < len(nums2): + n_moves_to_med -= 1 + left2_idx += 1 + return (left1_idx, left2_idx) + + left1_idx, left2_idx = get_right_most_indexes_of_array1() + # left most number of array2 + n2 = min(nums1[left1_idx+1] if 0 <= left1_idx+1 < len(nums1) else float('inf'), + nums2[left2_idx+1] if 0 <= left2_idx+1 < len(nums2) else float('inf')) + if (len(nums1) + len(nums2)) % 2 == 0: # is even? + # right most number of array1 + n1 = max(nums1[left1_idx] if 0 <= left1_idx < len(nums1) else float('-inf'), + nums2[left2_idx] if 0 <= left2_idx < len(nums2) else float('-inf')) + return (n1 + n2) / 2 + return n2 # is odd +``` diff --git a/leetcode/hard/072_edit_distance.md b/leetcode/hard/072_edit_distance.md new file mode 100644 index 0000000..9074d20 --- /dev/null +++ b/leetcode/hard/072_edit_distance.md @@ -0,0 +1,72 @@ +# 72. Edit Distance + +## Levenshtein Distance +- Run-time: O(M*N) +- Space: O(M*N) +- M = Length of word1 +- N = Length of word2 + +The solution is called the "Levenshtein Distance". 
+ +To build some intuition, we should be able to notice that recursion is possible. +Since there are three choices (insert, delete, replace), the run-time would equate to 3^max(length of word1, length of word2). +Since that recursion is exponential, let's come up with a dynamic programming solution instead. + +A 2d array should come to mind, columns as word1, rows as word2 for each letter of each word. +Besides the three operations, there is one more consideration: whether the letters from each word match. +Whether or not the letters match, what we care about is the previous minimum number of operations. +Using the 2d array, we can figure out the previous minimum operations. +For any given dp element, the left, top and top-left values are what we care about. + +``` +Columns = word1 +Rows = word2 + +Insert: + '' a b +'' 0 1 2 +a 1 0 1 +b 2 1 0 +c 3 2 1 + +Delete: + '' a b c +'' 0 1 2 3 +a 1 0 1 2 +b 2 1 0 1 + +Replace: + '' a b c +'' 0 1 2 3 +a 1 0 1 2 +b 2 1 0 1 +d 3 2 1 1 +``` + +So for any given dp element, dp[i][j] = 1 + min(dp[i-1][j-1], dp[i-1][j], dp[i][j-1]). +The only important thing to consider is when the letters match. +For that scenario, dp[i-1][j-1] + 1 does not apply; dp[i][j] is simply dp[i-1][j-1], with no additional operation needed. 
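The exponential recursion mentioned above can also be tamed with memoization; this is a hedged sketch (function names are my own) that applies the same three choices top-down and agrees with the recurrence:

```
from functools import lru_cache

def edit_distance(word1, word2):
    @lru_cache(maxsize=None)
    def helper(i, j):
        # i, j = number of leading characters of word1/word2 still unmatched.
        if i == 0:
            return j  # insert the rest of word2
        if j == 0:
            return i  # delete the rest of word1
        if word1[i - 1] == word2[j - 1]:
            return helper(i - 1, j - 1)  # match: no operation needed
        return 1 + min(helper(i - 1, j - 1),  # replace
                       helper(i, j - 1),      # insert
                       helper(i - 1, j))      # delete
    return helper(len(word1), len(word2))

print(edit_distance('horse', 'ros'))  # 3
print(edit_distance('ab', 'ab'))      # 0
```

The cache turns the 3^N branching into at most M*N distinct (i, j) states, the same table the bottom-up version fills explicitly.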
+ +``` +class Solution: + def minDistance(self, word1: str, word2: str) -> int: + + def create_dp(): + dp = [[0] * (len(word1) + 1) for _ in range(len(word2) + 1)] + for idx in range(len(word1) + 1): + dp[0][idx] = idx + for idx in range(len(word2) + 1): + dp[idx][0] = idx + return dp + + dp = create_dp() + for col_idx, ch1 in enumerate(word1, 1): + for row_idx, ch2 in enumerate(word2, 1): + top_left = dp[row_idx-1][col_idx-1] + if ch1 == ch2: + top_left -= 1 + dp[row_idx][col_idx] = 1 + min(top_left, # top left corner + dp[row_idx][col_idx-1], # left + dp[row_idx-1][col_idx]) # above + return dp[-1][-1] +``` diff --git a/leetcode/hard/076_minimum_window_substring.md b/leetcode/hard/076_minimum_window_substring.md index f3e086a..26d3e16 100644 --- a/leetcode/hard/076_minimum_window_substring.md +++ b/leetcode/hard/076_minimum_window_substring.md @@ -1,178 +1,42 @@ # 76. Minimum Window Substring -## Two Pointer with Map Solution -- Runtime: O(S * T) + O(T) but O(S * (S+T)) + O(T) due to string slicing -- Space: O(S) + O(T) -- S = Number of characters in string S -- T = Number of characters in string T +## Sliding Window with Dictionary Solution +- Runtime: O(N) +- Space: O(N) +- N = Number of characters in S -So the definition of a result requires the substring to contain all characters in T. It then wants the smallest substring. -With that, we have to begin the solution by finding the first instance of that substring. -Once we have that, we need to basically prune it, remove the extra characters from this substring to get the minimum substring. -Finally, we keep repeating this until we reach the end. -We can traverse the string via. two pointers, a left and right iterator. -Right is bound within the given string S and left is bound within the substring created by the right pointer. -This is an example of a sliding window. +Let's focus on how to figure out if a sub-string has all characters of T. 
+You can do this with a dictionary counter, keeping occurrences of each character of T. +To avoid re-checking whether every character in the dictionary has dropped to zero or below, we can keep a separate variable for the number of remaining characters still needed. -To figure out if we have all the characters in this substring, we would have to count T via. a dictionary. -This is our known count that is required for each substring. -Then we have to keep a dynamic counter, decrementing and incrementing character counts as we move across the string. -With this, after we moved the right pointer to the right, we can use these two dictionaries to check if this is a substring that meets the requirements. -If so, we can then try pruning with the left pointer all the way to the right pointer or if it doesn't meet the requirement. +Next is the idea of a sliding window: if we iterate from left to right, we can find every sub-string containing T using the check above. +Once such a sub-string is found, it is a matter of dropping characters from its left side until some character of T is needed again. +As we drop them, we update the dictionary and the remaining-character count accordingly. -For the run-time, worst case, we will be visiting each character twice. - -You may think that the run-time for this is exponential, especially when we are checking the two dictionaries. -However, don't be mistaken, the comparison is actually a constant T run-time, it doesn't change based on S, but rather on T. -There is one slight problem, python's implementation of string concatention is actually **O(N)**. -When the question wants the actual substring and not a count, even using a deque will not solve this problem. -So this implementation is technically **O(S * (S+T))** due to python. - -We could use indexes instead of string slices, however, the code gets complex.
-When implementing these type of two pointer questions, I recommend to **avoid using indexes as much as possible and use iterators**. -It is very easy to get a one off error doing these and within a 30 minute timeframe, it is very risky. -Just talk about how to refactor with indexes. - -``` -from collections import defaultdict -from collections import Counter - -class Solution: - def minWindow(self, s: str, t: str) -> str: - all_ch_counts = Counter(t) - ch_to_n_counts = defaultdict(int) - str_builder, min_substr, found = '', s, False - for right_ch in s: - ch_to_n_counts[right_ch] += 1 - str_builder += right_ch - if chars_occur_ge(ch_to_n_counts, all_ch_counts): - for left_ch in str_builder: - if chars_occur_ge(ch_to_n_counts, all_ch_counts): - found = True - if len(str_builder) < len(min_substr): - min_substr = str_builder - else: - break - ch_to_n_counts[left_ch] -= 1 - str_builder = str_builder[1:] - return min_substr if found else '' - -def chars_occur_ge(ch_to_n_counts, all_ch_counts): - for ch, count in all_ch_counts.items(): - if ch not in all_ch_counts or ch_to_n_counts[ch] < count: - return False - return True -``` - -## Two Pointer with Map Solution (Optimized) -- Runtime: O(S) + O(T) but O(S * S) + O(T) due to string slicing -- Space: O(S) + O(T) -- S = Number of characters in string S -- T = Number of characters in string T - -We can further improve the solution by optimizing the way we check if its a valid substring. -We can still use a dictionary to count the occurances, but we can also keep a separate count for the unique characters in T. -This will represent the number of unique valid characters of T. - -If T = 'abcc', then there are 3 keys in the dictionary. -Whenever we increment a key in the dictionary, we can then compare the dictionary T's count with dictionary S's count. -If S's count equals exactly what T's count is, then we just got one of the 3 keys validated. 
-If we decrement, and S's count is T's count-1, then we just unvalidated one of those keys. - -This will remove the need to traverse all the keys whenever we need to revalidate the substring. -Hence, this validation will run at O(1). String slicing will still slow us down however. - -``` -from collections import defaultdict -from collections import Counter - -class Solution: - def minWindow(self, s: str, t: str) -> str: - char_counter = CharacterCounter(t) - str_builder, min_substr, found = '', s, False - for right_ch in s: - char_counter += right_ch - str_builder += right_ch - if char_counter.is_valid: - for left_ch in str_builder: - if char_counter.is_valid: - found = True - if len(str_builder) < len(min_substr): - min_substr = str_builder - else: - break - char_counter -= left_ch - str_builder = str_builder[1:] - return min_substr if found else '' - -class CharacterCounter: - def __init__(self, source_str): - self._ch_to_n_counts = defaultdict(int) - self._source_counts = Counter(source_str) - self._n_valid_chars = 0 - - def __iadd__(self, char): - self._ch_to_n_counts[char] += 1 - if char in self._source_counts and self._ch_to_n_counts[char] == self._source_counts[char]: - self._n_valid_chars += 1 - return self - - def __isub__(self, char): - self._ch_to_n_counts[char] -= 1 - if char in self._source_counts and self._ch_to_n_counts[char] == self._source_counts[char]-1: - self._n_valid_chars -= 1 - return self - - @property - def is_valid(self): - return self._n_valid_chars == len(self._source_counts.keys()) ``` - -## Two Pointer with Map Solution (No String slicing) -- Runtime: O(S) + O(T) -- Space: O(S) + O(T) -- S = Number of characters in string S -- T = Number of characters in string T - -``` -from collections import defaultdict from collections import Counter class Solution: def minWindow(self, s: str, t: str) -> str: - char_counter = CharacterCounter(t) - left_i, left_i_result, right_i_result = 0, 0, len(s)-1 - found = False - for right_i, right_ch in 
enumerate(s): - char_counter += right_ch - if char_counter.is_valid: - while left_i <= right_i and char_counter.is_valid: - found = True - if right_i-left_i < right_i_result-left_i_result: - right_i_result, left_i_result = right_i, left_i - char_counter -= s[left_i] - left_i += 1 - return s[left_i_result:right_i_result+1] if found else '' - -class CharacterCounter: - def __init__(self, source_str): - self._ch_to_n_counts = defaultdict(int) - self._source_counts = Counter(source_str) - self._n_valid_chars = 0 - - def __iadd__(self, char): - self._ch_to_n_counts[char] += 1 - if char in self._source_counts and self._ch_to_n_counts[char] == self._source_counts[char]: - self._n_valid_chars += 1 - return self + char_counter = Counter(t) + left_idx, n_chars_needed = 0, len(t) + result = (-1, -1) # left and right result indexes - def __isub__(self, char): - self._ch_to_n_counts[char] -= 1 - if char in self._source_counts and self._ch_to_n_counts[char] == self._source_counts[char]-1: - self._n_valid_chars -= 1 - return self - - @property - def is_valid(self): - return self._n_valid_chars == len(self._source_counts.keys()) + for right_idx, ch in enumerate(s): + if ch in char_counter: + char_counter[ch] -= 1 + if char_counter[ch] >= 0: + n_chars_needed -= 1 + + while n_chars_needed == 0: + if result[0] == -1 or result[1]-result[0] > right_idx-left_idx: + result = (left_idx, right_idx) + left_ch = s[left_idx] + if left_ch in char_counter: + char_counter[left_ch] += 1 + if char_counter[left_ch] == 1: + n_chars_needed += 1 + left_idx += 1 + + return s[result[0]:result[1]+1] if result[0] != -1 else '' ``` diff --git a/leetcode/hard/099_recover_binary_search_tree.md b/leetcode/hard/099_recover_binary_search_tree.md new file mode 100644 index 0000000..dd6abc5 --- /dev/null +++ b/leetcode/hard/099_recover_binary_search_tree.md @@ -0,0 +1,37 @@ +# 99. 
Recover Binary Search Tree + +## In-order Solution +- Runtime: O(N) +- Space: O(H) +- N = Number of nodes in BST +- H = Height of BST + +This was asked during an Amazon phone screen interview. + +This can be difficult to come up with; you would have to know that of the three depth-first traversals, an in-order traversal visits a BST in sorted order. +Using this, we can traverse the BST in-order and compare the current node to the previous node. +If a pair is out of order, we can determine which nodes were swapped. + +``` +import sys + +class Solution: + prev = TreeNode(-sys.maxsize-1) + node1 = None + node2 = None + + def recoverTree(self, root: TreeNode) -> None: + + def recover_helper(root): + if root is None: + return + recover_helper(root.left) + if self.node1 is None and self.prev.val >= root.val: + self.node1 = self.prev + if self.node1 is not None and self.prev.val >= root.val: + self.node2 = root + self.prev = root + recover_helper(root.right) + + recover_helper(root) + self.node1.val, self.node2.val = self.node2.val, self.node1.val + return root +``` diff --git a/leetcode/hard/1032_stream_of_characters.md b/leetcode/hard/1032_stream_of_characters.md new file mode 100644 index 0000000..9c5f741 --- /dev/null +++ b/leetcode/hard/1032_stream_of_characters.md @@ -0,0 +1,59 @@ +# 1032. Stream of Characters + +## Trie Solution +- Run-time: O(C * Q) +- Space: O(C) + O(N) +- N = Number of characters in stream +- Q = Number of queries +- C = Number of characters in word list + +When dealing with single characters, it's worth considering the trie data structure. +If we create a trie of the word list, we can figure out the existence of words in the stream up to the most recent query. +However, you may notice that if we built the trie beginning from left to right, it would result in a slower run-time. +This is because the new letter from the stream is at the right-most position while the trie structure starts at the left-most letter of each word in the word list.
+Instead of building it from the left to right, we can build the trie structure in reverse. +That means both the trie and the stream of letters are traversed from right to left together. + +Each query will result in O(C) run-time; since we have Q queries, this totals O(C * Q). +However, there are still worst-case inputs, like a word list of ['baaaaaaaaaaaaaa'] and a stream of all a's. +But in the general case, since the trie is built in reverse, if the most recent letter in the stream doesn't exist at the root, each query costs less than O(C). + +``` +from collections import defaultdict + +class StreamChecker: + + def __init__(self, words: List[str]): + self.root = TrieNode.create_tries(words) + self.stream = list() + + def query(self, letter: str) -> bool: + self.stream.append(letter) + curr = self.root + for ch in reversed(self.stream): + if ch not in curr.next: + return False + curr = curr.next[ch] + if curr.is_word: + return True + return False + +class TrieNode(object): + + def __init__(self): + self.next = defaultdict(TrieNode) + self.is_word = False + + def __repr__(self): + return '{} {}'.format(self.next.keys(), self.is_word) + + @staticmethod + def create_tries(words): + root = TrieNode() + for word in words: + curr = root + for ch in reversed(word): + curr = curr.next[ch] + curr.is_word = True + return root +``` diff --git a/leetcode/hard/124_binary_tree_maximum_path_sum.md b/leetcode/hard/124_binary_tree_maximum_path_sum.md new file mode 100644 index 0000000..fcd0ccf --- /dev/null +++ b/leetcode/hard/124_binary_tree_maximum_path_sum.md @@ -0,0 +1,38 @@ +# 124. Binary Tree Maximum Path Sum + +## Recursive Solution +- Runtime: O(N) +- Space: O(H) +- N = Nodes in tree +- H = Height of tree + +Consider the solution from the perspective of a single node. +If you wanted to figure out which path is the largest, there are 4 cases: +1. Only my node. +2. The left path and my node. +3. The right path and my node. +4.
The left and right paths and my node. + +After that, you should always return the best downward path to your parent node: your node's value plus the larger of the left or right path (or your node alone). + +``` +class Solution: + def __init__(self): + self.max_path = float('-inf') + + def maxPathSum(self, root: TreeNode) -> int: + def path_sum_helper(root): + if root is None: + return 0 + left_path = path_sum_helper(root.left) + right_path = path_sum_helper(root.right) + self.max_path = max(self.max_path, + root.val, + left_path+root.val, + right_path+root.val, + left_path+root.val+right_path) + return max(left_path+root.val, right_path+root.val, root.val) + + path_sum_helper(root) + return self.max_path +``` diff --git a/leetcode/hard/128_longest_consecutive_sequence.md b/leetcode/hard/128_longest_consecutive_sequence.md new file mode 100644 index 0000000..7109d81 --- /dev/null +++ b/leetcode/hard/128_longest_consecutive_sequence.md @@ -0,0 +1,29 @@ +# 128. Longest Consecutive Sequence + +## Best Solution +- Runtime: O(N) +- Space: O(N) +- N = Number of elements in array + +The most intuitive solution is to sort the array, then iterate over the numbers to find the longest consecutive range. +However, that would be O(Nlog(N)) run-time. + +To improve the run-time, we can store all the numbers in a set, then check whether the left (n-1) and right (n+1) neighbors of each number are in the set. +On its own that is still O(N^2) in the worst case; you would need a visited set to avoid re-walking the same run for every number, which brings it down to O(N). + +You can further improve the space by only starting at the left-most number of each run (a number whose n-1 is not in the set); starting at the right-most works symmetrically. +That way you only traverse in one direction, avoiding the need for a visited set. +This solution requires two passes.
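For reference, the O(Nlog(N)) sort-based approach mentioned above might look like this (function name is my own):

```python
def longest_consecutive_sorted(nums):
    # O(Nlog(N)) baseline: sort the unique values, then scan for the longest run.
    if not nums:
        return 0
    ordered = sorted(set(nums))
    longest = length = 1
    for prev, curr in zip(ordered, ordered[1:]):
        # Extend the current run or start a new one.
        length = length + 1 if curr == prev + 1 else 1
        longest = max(longest, length)
    return longest
```

Sorting dominates the cost; the set-based solution below trades that sort for O(N) extra space.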
+ +``` +class Solution: + def longestConsecutive(self, nums: List[int]) -> int: + num_set, longest_length = set(nums), 0 + for n in nums: + if n-1 not in num_set: # left-most number + length, curr = 0, n + while curr in num_set: + curr, length = curr+1, length+1 + longest_length = max(longest_length, length) + return longest_length +``` diff --git a/leetcode/hard/145_binary_tree_postorder_traversal.md b/leetcode/hard/145_binary_tree_postorder_traversal.md index 91fbe24..a3c35da 100644 --- a/leetcode/hard/145_binary_tree_postorder_traversal.md +++ b/leetcode/hard/145_binary_tree_postorder_traversal.md @@ -6,7 +6,7 @@ - N = Number of elements in tree - H = Height of tree -Post order is left, right, node. +Post order is (Left -> Right -> Node). The recursive solution is fairly easy. Most of the heavy lifting is abstracted away by the recursion call. @@ -27,32 +27,30 @@ class Solution: ## Iterative Solution - Runtime: O(N) -- Space: O(H) +- Space: O(N) - N = Number of elements in tree -- H = Height of tree -The iterative solution for post order is fairly diffucult to come up with on your own. -It requires two stacks. -The first stack is used to traverse the tree but in the opposite direction (node -> right -> left). -During the traversal, the 1st stack will transfer its nodes to the 2nd stack, this will place the nodes in the reverse order or post-order (left -> right -> node) when they are popped off the stack later. -I recommend drawing this out, as its important to understand the relationships and responsibilities. +Take a look back at how a preorder is done (Node -> Left -> Right). +Compared to postorder (Left -> Right -> Node), what are some similarities? +You may notice that you can perform a postorder with an inverted preorder traversal. + +Another way to look at it: since postorder is (Left -> Right -> Node), we can go (Node -> Right -> Left) and reverse the result at the end to get the postorder. + +So we can achieve an iterative postorder traversal via.
an inverted preorder traversal. + +Since we need to use an additional stack/list, which we then reverse as the result, we cannot get O(H) additional space. +The best we can achieve is O(N) space. ``` class Solution: def postorderTraversal(self, root: TreeNode) -> List[int]: - if root is None: - return [] - stack1, stack2 = list([root]), list() - result = list() - while len(stack1) > 0: - node = stack1.pop() - stack2.append(node) - if node.left is not None: - stack1.append(node.left) - if node.right is not None: - stack1.append(node.right) - while len(stack2) > 0: - node = stack2.pop() - result.append(node.val) # <-- Business logic goes here - return result + stack = list([root]) + inverted_preorder = list() + while stack: + node = stack.pop() + if node: + inverted_preorder.append(node.val) + stack.append(node.left) + stack.append(node.right) + return inverted_preorder[::-1] ``` diff --git a/leetcode/hard/212_word_search_II.md b/leetcode/hard/212_word_search_II.md new file mode 100644 index 0000000..fd326fa --- /dev/null +++ b/leetcode/hard/212_word_search_II.md @@ -0,0 +1,70 @@ +# 212. Word Search II + +## Trie + DFS Solution +- Run-time: O((R \* C)^2) +- Space: O(W) +- R = Number of Rows +- C = Number of Columns +- W = Number of characters in the word list + +To figure out whether a word exists on the board, some sort of traversal is required; generally a DFS will do here. +Secondly, by using a trie, we can traverse the board and trie together one character at a time. + +Each DFS will, in the worst case, traverse the longest word in the word list. +For example, a board full of a's and a word list of a-strings of different lengths. +The longest word could end up being as long as all the elements on the board. +So the run-time totals O((R \* C)^2), but the average case will generally be lower thanks to the trie.
+ +``` +from collections import defaultdict + +class Solution: + def findWords(self, board: List[List[str]], words: List[str]) -> List[str]: + + def dfs(trie, r, c, word=list(), visited=set()): # shared mutable defaults act as scratch space across calls + if (r, c) in visited or board[r][c] not in trie.next: + return + visited.add((r, c)) + word.append(board[r][c]) + trie = trie.next[board[r][c]] + if trie.is_word: + results.append(''.join(word)) + trie.is_word = False # avoid duplicates + for _r, _c in get_neighbors(r, c): + dfs(trie, _r, _c, word, visited) + word.pop() + visited.remove((r, c)) + + def get_neighbors(r, c): + dirs = [(1, 0), (0, 1), (-1, 0), (0, -1)] + for _r, _c in dirs: + _r += r + _c += c + if 0 <= _r < len(board) and 0 <= _c < len(board[0]): + yield (_r, _c) + + root = TrieNode.create_tries(words) + results = list() + for r, row in enumerate(board): + for c in range(len(row)): + dfs(root, r, c) + return results + +class TrieNode(object): + + def __init__(self): + self.next = defaultdict(TrieNode) + self.is_word = False + + def __repr__(self): + return 'Next: {}, IsWord: {}'.format(self.next.keys(), self.is_word) + + @staticmethod + def create_tries(words): + root = TrieNode() + for word in words: + curr = root + for ch in word: + curr = curr.next[ch] + curr.is_word = True + return root +``` diff --git a/leetcode/hard/295_find_median_from_data_stream.md b/leetcode/hard/295_find_median_from_data_stream.md new file mode 100644 index 0000000..7e98534 --- /dev/null +++ b/leetcode/hard/295_find_median_from_data_stream.md @@ -0,0 +1,70 @@ +# 295. Find Median from Data Stream + +## Sort solution +- Runtime: O(N) for addNum() and O(1) for findMedian(), in total O(N\*N) worst case +- Space: O(N) +- N = Number of elements in array + +This solution is fairly simple: as long as we keep the numbers in sorted order, we can find the median quickly.
+ +However, it is important to note that if we instead appended each value to the list and only sorted when findMedian() was called, it would actually cause a slower run-time, O(N * Nlog(N)). +The worst case is when we call findMedian() after each newly inserted number. +We would basically be sorting the entire array N times. +That is because the list keeps growing between sorts, and we aren't utilizing the fact that it is already sorted. +With an already sorted list, we can just perform a binary search and insert the new number instead of resorting an already sorted list for each number. + +``` +import bisect + +class MedianFinder: + + def __init__(self): + """ + initialize your data structure here. + """ + self._nums = list() + + def addNum(self, num: int) -> None: + bisect.insort(self._nums, num) + + def findMedian(self) -> float: + if len(self._nums) == 0: + return 0 + median_index = len(self._nums) // 2 + return self._nums[median_index] if len(self._nums) % 2 \ + else (self._nums[median_index-1] + self._nums[median_index]) / 2 +``` + +## Two Heap Solution +- Runtime: O(log(N)) for addNum() and O(1) for findMedian(), in total O(Nlog(N)) worst case +- Space: O(N) +- N = Number of elements in array + +This second approach is rather innovative to say the least. +It uses two heaps, one that keeps a set of large numbers and another set of smaller numbers. +You can think of this as a divide and conquer approach. + +The only tricky part is keeping the two heaps balanced, meaning their sizes cannot differ by more than 1. +Secondly, we need to keep the two heap property of smaller and larger sets. + +Once these two properties are met, finding the median can be done by averaging the two values on top of the heaps if both heap sizes are the same, or taking the top value of the larger heap. + +``` +import heapq + +class MedianFinder: + + def __init__(self): + """ + initialize your data structure here.
+ """ + self.max_heap = list() + self.min_heap = list() + + def addNum(self, num: int) -> None: + heapq.heappush(self.max_heap, -heapq.heappushpop(self.min_heap, num)) + if len(self.max_heap) > len(self.min_heap): + heapq.heappush(self.min_heap, -heapq.heappop(self.max_heap)) + + def findMedian(self) -> float: + return (self.min_heap[0] + -self.max_heap[0]) / 2 \ + if len(self.min_heap) == len(self.max_heap) else self.min_heap[0] +``` diff --git a/leetcode/hard/297_serialize_and_deserialize_binary_tree.md b/leetcode/hard/297_serialize_and_deserialize_binary_tree.md new file mode 100644 index 0000000..4a54dfc --- /dev/null +++ b/leetcode/hard/297_serialize_and_deserialize_binary_tree.md @@ -0,0 +1,64 @@ +# 297. Serialize and Deserialize Binary Tree + +## DFS Solution +- Run-time: O(N) +- Space: O(N) +- N = Number of nodes in tree + +We cannot reuse the solutions from question #105 or #106 using two traversals because duplicate values are allowed. +We would need a different method. + +A DFS method is probably the easiest to come up with. +I chose to use preorder, but you can implement this using any of the traversals. + +Let's start with serialize(); we can simply traverse via. preorder and construct a list. +This part is straightforward; the one caveat relates to space and how we should represent None nodes. +We could use a string like 'None' but we can save more space by using an empty string instead, though this requires a delimiter. + +For deserialize(), using the delimiters, we can get the list of nodes in preorder. +It would simply be a preorder traversal to reconstruct the tree. + +There are some further optimizations that we could've done, like condensing consecutive empty strings together. +It would be over-engineering at this point, but it is good to mention. + +``` +class Codec: + + curr_idx = 0 + def serialize(self, root): + """Encodes a tree to a single string.
+ + :type root: TreeNode + :rtype: str + """ + def preorder_encode(root): + if root is None: + result.append('') + return + result.append(str(root.val)) + preorder_encode(root.left) + preorder_encode(root.right) + + result = list() + preorder_encode(root) + return ','.join(result) + + def deserialize(self, data): + """Decodes your encoded data to tree. + + :type data: str + :rtype: TreeNode + """ + def preorder_decode(): + if self.curr_idx >= len(tokens) or tokens[self.curr_idx] == '': + return None + root = TreeNode(int(tokens[self.curr_idx])) + self.curr_idx += 1 + root.left = preorder_decode() + self.curr_idx += 1 + root.right = preorder_decode() + return root + + self.curr_idx = 0 + tokens = data.split(',') + return preorder_decode() +``` diff --git a/leetcode/hard/316_remove_duplicate_letters.md b/leetcode/hard/316_remove_duplicate_letters.md new file mode 100644 index 0000000..0cd5906 --- /dev/null +++ b/leetcode/hard/316_remove_duplicate_letters.md @@ -0,0 +1,56 @@ +# 316. Remove Duplicate Letters + +## Stack Solution +- Run-time: O(N) +- Space: O(1) or 26 +- N = Number of characters in S + +To gather the intuition for this solution let's look at a few examples. + +``` +Example 1: +abc -> abc + +Example 2: +cba -> cba + +Example 3: +aba -> ab + +Example 4: +bab -> ab + +Example 5: +xyzabczyx -> abczyx +``` + +Examples 1 and 2 don't really tell us much but combining them with examples 3 and 4 can. +Notice that example 3 doesn't care about the last letter 'a', while example 4 doesn't care about the first letter 'b'. + +Instead of figuring out which letter to delete, we can think of this as building the word from scratch, from left to right. +With this frame of reference, we can reword the question as finding the smallest word in lexicographic order that contains each unique letter exactly once. + +From examples 3 and 4, we can deduce that if a letter isn't the last occurrence of that letter, we are free to drop it now and use a later occurrence, giving a word closer to lexicographic order.
+So by using a monotonic stack, increasing by nature, we can achieve examples 1 and 3. +A set should be obvious to avoid adding duplicate values into the stack. +However, to achieve examples 2 and 5 with the monotonic stack, we have to add another invariant where we want last occurring letters. +We can skip letters we have already seen because the stack is monotonic: the letters already in the stack are in the best positions so far (this achieves example 3). + +In summary, we can build the word from left to right using an increasing monotonic stack. +When it comes to popping off the stack, we will continue to pop from the stack if the new letter is smaller than what's on top of the stack AND what's on top of the stack isn't the last occurrence of that letter. + +``` +class Solution: + def removeDuplicateLetters(self, s: str) -> str: + stack = list() + seen = set() + ch_to_last_idx = {ch: idx for idx, ch in enumerate(s)} + for idx, ch in enumerate(s): + if ch not in seen: + while stack and ch < stack[-1] and idx < ch_to_last_idx[stack[-1]]: + seen.discard(stack.pop()) + seen.add(ch) + stack.append(ch) + return ''.join(stack) +``` diff --git a/leetcode/hard/329_longest_increasing_path_in_a_matrix.md b/leetcode/hard/329_longest_increasing_path_in_a_matrix.md new file mode 100644 index 0000000..e04dfa6 --- /dev/null +++ b/leetcode/hard/329_longest_increasing_path_in_a_matrix.md @@ -0,0 +1,47 @@ +# 329. Longest Increasing Path in a Matrix + +## Memo and DFS Solution +- Run-time: O(V + E) +- Space: O(V) +- V = Vertices +- E = Edges + +A simple solution is to perform a DFS from each element in the matrix. +That would be an O(V \* (V + E)) run-time. +We can further improve the run-time by using memoization to avoid recalculating the same element over again. +Therefore, each element in the matrix will only have one DFS performed on it. +The memoization will keep the longest increasing path for each (i, j).
+ +Unlike a traditional DFS that uses a visited set, we can save more space by not using one here. +As the question is worded, we only care about visiting elements that are increasing. +Since the path must be strictly increasing, a cycle is not possible, so there is no need for a visited set. + +``` +class Solution: + def longestIncreasingPath(self, matrix: List[List[int]]) -> int: + + def get_neighbors(r, c): + dirs = [(0, 1), (1, 0), (0, -1), (-1, 0)] + for _r, _c in dirs: + _r += r + _c += c + if 0 <= _r < len(matrix) and 0 <= _c < len(matrix[0]): + yield (_r, _c) + + def dfs(i, j): + if (i, j) in memo: + return memo[(i, j)] + longest = 1 + for _i, _j in get_neighbors(i, j): + if matrix[_i][_j] > matrix[i][j]: + longest = max(longest, dfs(_i, _j) + 1) + memo[(i, j)] = longest + return longest + + longest = 0 + memo = dict() + for i in range(len(matrix)): + for j in range(len(matrix[0])): + longest = max(longest, dfs(i, j)) + return longest +``` diff --git a/leetcode/hard/480_sliding_window_median.md b/leetcode/hard/480_sliding_window_median.md new file mode 100644 index 0000000..452ff40 --- /dev/null +++ b/leetcode/hard/480_sliding_window_median.md @@ -0,0 +1,47 @@ +# 480. Sliding Window Median + +## Sort Solution +- Run-time: O(N*K) +- Space: O(K) +- N = Number of elements in nums +- K = Given k value + +Similar to question 295. + +By keeping a sorted array of size K, we can use binary search for each new number at O(logK) run-time. +However, due to the requirement of the sliding window, we need to find the previous value that is outside of the sliding window and remove it from the sorted array. This takes O(logK) with binary search to find but O(K) to rebuild the array after deletion. +We would then have to do this N times, therefore O(N*K) overall run-time.
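As a small illustration of that maintenance step (the window values here are made up), the outgoing value is located with binary search, while removal and insertion each shift up to K elements:

```python
import bisect

window = [1, 3, 5, 7]                      # sorted window of size k=4
bisect.insort(window, 4)                   # insert incoming value: O(logK) search + O(K) shift
window.pop(bisect.bisect_left(window, 5))  # remove outgoing value 5 the same way
# window is now [1, 3, 4, 7]
```

The full solution below does exactly this once per element of nums.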
+ +``` +import bisect + +class Solution: + def medianSlidingWindow(self, nums: List[int], k: int) -> List[float]: + window, results = list(), list() + median_idx = k // 2 + for idx, n in enumerate(nums): + bisect.insort(window, n) + if len(window) > k: + window.pop(bisect.bisect_left(window, nums[idx-k])) + if len(window) == k: + results.append(window[median_idx] if k % 2 \ + else (window[median_idx-1] + window[median_idx]) / 2) + return results +``` + +Slightly better performance but same big O run-time. +``` +import bisect + +class Solution: + def medianSlidingWindow(self, nums: List[int], k: int) -> List[float]: + window, results = list(nums[0:k-1]), list() + window.sort() + median_idx = k // 2 + for idx, n in enumerate(nums[k-1:], k-1): + bisect.insort(window, n) + results.append(window[median_idx] if k % 2 \ + else (window[median_idx-1] + window[median_idx]) / 2) + window.pop(bisect.bisect_left(window, nums[idx-k+1])) + return results +``` diff --git a/leetcode/hard/727_minimum_window_subsequence.md b/leetcode/hard/727_minimum_window_subsequence.md new file mode 100644 index 0000000..0d82061 --- /dev/null +++ b/leetcode/hard/727_minimum_window_subsequence.md @@ -0,0 +1,131 @@ +# 727. Minimum Window Subsequence + +## Multiple Pointer Solution +- Runtime: O(S^2) +- Space: O(1) +- S = Number of characters in S + +We keep a pointer into T and a pointer to the start of the sub-string. +We can then iterate over each character in S and compare it to T to keep the ordering. +Once we have found all of T, we now have a sub-string that has all of T in it and has the ordering we want. +However, we need to trim it to make it the minimum subsequence, so we would have to traverse the sub-string in reverse order. +The trim stage would need to compare the sub-string and T both in reverse order to find the minimum subsequence. + +The worst case can be when we have an input like S = 'aaaaaa', T = 'aa'.
+We would have to attempt a trim stage for every character, making it O(S^2). + +Some edge cases to consider: +``` +S = 'abcabdcd' +T = 'abcd' +Expected Output = 'abdcd' +``` + +``` +class Solution: + def minWindow(self, S: str, T: str) -> str: + + def trim(start): + t_idx = len(T)-1 + while t_idx >= 0: + if S[start] == T[t_idx]: + t_idx -= 1 + if t_idx < 0: + return start + start -= 1 + return 0 + + window = S + ' ' + t_idx = s_idx = 0 + while s_idx < len(S): + if S[s_idx] == T[t_idx]: + t_idx += 1 + if t_idx == len(T): + end = s_idx + s_idx = trim(s_idx) + t_idx = 0 + window = min(window, S[s_idx:end+1], key=lambda x: len(x)) + s_idx += 1 + return window if len(window) <= len(S) else '' # the answer may be all of S +``` + +## Dynamic Programming Solution +- Runtime: O(S\*T) +- Space: O(T) +- S = Number of characters in S +- T = Number of characters in T + +The intuition for this solution comes from using a 1d array as a way to build a pseudo linked chain. +We keep a 1d array DP of size T, one element per character of T. +Any given DP[n] holds the index in S where a valid sub-string matching T[0..n] begins, i.e. it points back at a possible position of T[0]. +If a character of S exists in T, then we set the appropriate element of DP to the previous DP element, DP[n] = DP[n-1]. +This builds the chain-like property that fulfills the ordering requirement of the sub-string. + +The only exception to this rule is when the given character of S is the first character of T (T[0]); we should update all duplicate occurrences of that character in T (highest index first) before changing DP[0] to the current index. This starts a new possible chain link that can ripple across the other DP elements if the right ordering occurs. + +The worst case is when S and T repeat the same character, like S = 'aaaaaaa' and T = 'aa'. +This makes us update every DP element of T for each character of S.
+
+Given S = 'abcabdcd', T = 'abcd'
+```
+abcabdcd
+
+
+DP:
+a b c d
+-1 -1 -1 -1
+--------------
+
+abcabdcd
+^^^
+
+DP:
+a b c d
+0 0 0 -1
+--------------
+
+abcabdcd
+   ^
+
+DP:
+a b c d
+3 0 0 -1
+--------------
+
+abcabdcd
+    ^^
+
+DP:
+a b c d
+3 3 0 0
+--------------
+
+abcabdcd
+      ^^
+
+DP:
+a b c d
+3 3 3 3
+```
+
+```
+from collections import defaultdict
+
+class Solution(object):
+    def minWindow(self, S, T):
+        dp = [-1] * len(T)
+        ch_to_t_idx = defaultdict(list)
+        min_substr = S + ' '
+        for idx, t in enumerate(T):
+            ch_to_t_idx[t].append(idx)
+        for idx, ch in enumerate(S):
+            if ch in ch_to_t_idx:
+                for t_idx in ch_to_t_idx[ch][::-1]:
+                    if t_idx == 0: # first letter of T
+                        dp[t_idx] = idx
+                    else:
+                        dp[t_idx] = dp[t_idx-1]
+                    if t_idx == len(T)-1 and dp[t_idx] != -1: # last letter of T, save min
+                        min_substr = min(min_substr, S[dp[t_idx]:idx+1], key=lambda x: len(x))
+        return min_substr if len(min_substr) < len(S) else ''
+```
diff --git a/leetcode/hard/778_swim_in_rising_water.md b/leetcode/hard/778_swim_in_rising_water.md
new file mode 100644
index 0000000..7906685
--- /dev/null
+++ b/leetcode/hard/778_swim_in_rising_water.md
@@ -0,0 +1,62 @@
+# 778. Swim in Rising Water
+
+## Heap Solution
+- Runtime: O(T * Nlog(N))
+- Space: O(N)
+- N = Number of elements in grid
+- T = Time
+
+You may have thought to use BFS for this question, but the problem is that we cannot swim to an element in the grid until the water is at its level.
+So we have a second issue: figuring out which elements are up to the water's level.
+Note that we are not exactly finding the shortest path, just the shortest time needed to reach grid[N-1][N-1].
+
+To find the elements within the water's level, a min heap comes as the obvious choice.
+We also need to avoid adding the same element to the heap twice, so a visited set will be required.
+
+The logic goes as follows:
+1. As long as there are elements in the heap:
+2. Increment time by 1.
+3. 
Continue popping off the min heap until the top of the heap is > time.
+4. For each popped off element, if it is the bottom-right element of the grid, return the current time.
+5. Add all neighbors of the popped elements into the min heap. Keep a visited set so duplicate elements are not added to the heap.
+6. Repeat.
+
+```
+import heapq
+
+class Square(object):
+
+    def __init__(self, elevation, x, y):
+        self.elevation = elevation
+        self.x = x
+        self.y = y
+
+    def __lt__(self, other):
+        return self.elevation < other.elevation
+
+class Solution(object):
+    def swimInWater(self, grid):
+
+        def get_neighbors(x, y):
+            directions = [(0, 1), (1, 0), (-1, 0), (0, -1)]
+            for _x, _y in directions:
+                _x += x
+                _y += y
+                if _x >= 0 and _x < len(grid) and _y >= 0 and _y < len(grid):
+                    yield (_x, _y)
+
+        if len(grid) == 0 or len(grid[0]) == 0:
+            return 0
+        min_heap = list([Square(grid[0][0], 0, 0)])
+        visited = set([(0, 0)])
+        time = 0
+        while len(min_heap) != 0:
+            time += 1
+            while len(min_heap) != 0 and min_heap[0].elevation <= time:
+                popped_square = heapq.heappop(min_heap)
+                if popped_square.x == len(grid)-1 and popped_square.y == len(grid)-1:
+                    return time
+                for _x, _y in get_neighbors(popped_square.x, popped_square.y):
+                    if (_x, _y) not in visited:
+                        visited.add((_x, _y))
+                        heapq.heappush(min_heap, Square(grid[_x][_y], _x, _y))
+        return 0
+```
diff --git a/leetcode/medium/003_longest_substring_without_repeating_characters.md b/leetcode/medium/003_longest_substring_without_repeating_characters.md
new file mode 100644
index 0000000..a7d0d30
--- /dev/null
+++ b/leetcode/medium/003_longest_substring_without_repeating_characters.md
@@ -0,0 +1,46 @@
+# 3. Longest Substring Without Repeating Characters
+
+## Sliding Window Solution
+- Run-time: O(N) or 2N
+- Space: O(N)
+- N = Number of characters in string
+
+Using a sliding window and a set, we can identify when a duplicate character is found, when to move the left side of the sliding window, and when to stop. 
+
+This is actually a two pass solution: because we have to remove elements on the left side from the set, each element can be visited twice.
+
+```
+class Solution:
+    def lengthOfLongestSubstring(self, s: str) -> int:
+        seen = set()
+        left_idx = longest_substr = 0
+        for idx, ch in enumerate(s):
+            while ch in seen and left_idx <= idx:
+                seen.remove(s[left_idx])
+                left_idx += 1
+            seen.add(ch)
+            longest_substr = max(longest_substr, len(seen))
+        return longest_substr
+```
+
+## One Pass Solution
+- Run-time: O(N)
+- Space: O(N)
+- N = Number of characters in string
+
+To perform a one pass solution, we can use a dictionary where the key is the character and the value is its index.
+Similar to the set, we can use the dictionary to check if we have a duplicate in our sliding window.
+If that is true, we can immediately move the left side to the duplicated character's index + 1, but only if that is ahead of the current left side; a stale duplicate behind the window must not move it backwards.
+
+```
+class Solution:
+    def lengthOfLongestSubstring(self, s: str) -> int:
+        seen = dict()
+        left_idx = longest_substr = 0
+        for idx, ch in enumerate(s):
+            if ch in seen:
+                # never move the window backwards for a stale duplicate (e.g. 'abba')
+                left_idx = max(left_idx, seen[ch] + 1)
+            seen[ch] = idx
+            longest_substr = max(longest_substr, idx - left_idx + 1)
+        return longest_substr
+```
diff --git a/leetcode/medium/011_container_with_most_water.md b/leetcode/medium/011_container_with_most_water.md
new file mode 100644
index 0000000..3e3cb1f
--- /dev/null
+++ b/leetcode/medium/011_container_with_most_water.md
@@ -0,0 +1,32 @@
+# 11. Container With Most Water
+
+## Two pointer one pass solution
+
+- Runtime: O(N)
+- Space: O(1)
+- N = Number of elements in list
+
+The intuition can be gained by thinking about how to carry the most water in an increasing or decreasing input like [1,2,3,4] or [4,3,2,1].
+If you begin with the left and right, you can figure out how much water you can have with just those two inputs.
+
+The next step is to figure out which direction you should move inwards, either the left or the right, squeezing into the middle. 
+You can figure that out by using peak and valley inputs, like [1,2,3,2,1] or [3,2,1,2,3].
+What you begin to notice is that the only way we can increase the amount of water is by moving the lowest height and finding a larger height.
+If both left and right heights are the same, we can pick an arbitrary side to move.
+
+```
+class Solution:
+    def maxArea(self, height: List[int]) -> int:
+        left, right = 0, len(height)-1
+        max_area = 0
+        while left < right:
+            lowest_height = min(height[left], height[right])
+            length = right - left
+            max_area = max(max_area, lowest_height * length)
+            if height[left] <= height[right]:
+                left += 1
+            else:
+                right -= 1
+        return max_area
+```
diff --git a/leetcode/medium/015_3Sum.md b/leetcode/medium/015_3Sum.md
new file mode 100644
index 0000000..5113cfb
--- /dev/null
+++ b/leetcode/medium/015_3Sum.md
@@ -0,0 +1,59 @@
+## Solution
+
+- Runtime: O(N^2)
+- Space: O(1)
+- N = Number of elements in array
+
+Since the array can have duplicates and the solution only wants unique triplets, the difficulty here is to figure out a way to avoid duplicate results.
+
+First, let's take a look at how we would find a triplet.
+If this was a combination problem, the brute force would be to traverse all the possible sums in the array with three pointers.
+That would take O(N^3) run-time.
+
+Instead of thinking about the solution as a triplet, let's consider just two sums.
+If you just wanted to figure out whether two numbers exist that add up to a target in an array, how would that be done?
+Again, brute force would be O(N^2).
+However, we can improve that to linear time.
+If we first sort the array, O(Nlog(N)), we can have two pointers starting at the left and right of the array, incrementing the left if the sum is too low and decrementing the right if the sum is too high. 
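As a standalone illustration of the sorted-array two pointer scan described above (the function name and sample inputs here are illustrative, not part of the 3Sum solution that follows):

```python
def two_sum_sorted(nums, target):
    """Return indices (left, right) with nums[left] + nums[right] == target,
    or None if no such pair exists. Assumes nums is sorted ascending."""
    left, right = 0, len(nums) - 1
    while left < right:
        pair_sum = nums[left] + nums[right]
        if pair_sum == target:
            return left, right
        elif pair_sum < target:
            left += 1   # sum too low, move the left pointer up
        else:
            right -= 1  # sum too high, move the right pointer down
    return None

print(two_sum_sorted([1, 2, 4, 7, 11], 9))  # (1, 3) since 2 + 7 == 9
```

Each step permanently discards one element, which is why the scan is linear.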
+
+Now if we apply this 2 sum solution to the 3 sum problem, we can first sort the array, then select a target or a pivot, then use that pivot with the two sum solution to find zero.
+This improves the run-time: since the two sum is linear and we have to select N pivots, the run-time is O(N^2).
+The sort has a lower big O, so it is ignored.
+
+Now the tricky part is to avoid duplicate triplets.
+
+Given input: [-1,-1,-1,0,0,0,1,1,1]
+
+We notice that if the pivot was -1, we would traverse everything after the first element.
+But during the two sum solution, we should not use the same number again, so it is important to move the two pointers to the next unique number every time.
+Also, once a pivot is selected, we will have exhausted all combinations using that pivot, so it is also important to select a unique pivot every time.
+
+```
+class Solution:
+    def threeSum(self, nums: List[int]) -> List[List[int]]:
+        nums.sort()
+        results = list()
+        for index, n in enumerate(nums[:-2]):
+            if index == 0 or n != nums[index-1]:
+                self.find_two_sums(nums, index+1, n, results)
+        return results
+
+    def find_two_sums(self, nums, start_index, pivot, results):
+        left = start_index
+        right = len(nums)-1
+        while left < right:
+            n = pivot + nums[left] + nums[right]
+            if n == 0:
+                results.append([pivot, nums[left], nums[right]])
+                curr_right = nums[right]
+                while nums[right] == curr_right and left < right:
+                    right -= 1
+                curr_left = nums[left]
+                while nums[left] == curr_left and left < right:
+                    left += 1
+            elif n < 0:
+                left += 1
+            elif n > 0:
+                right -= 1
+```
diff --git a/leetcode/medium/022_generate_parentheses.md b/leetcode/medium/022_generate_parentheses.md
new file mode 100644
index 0000000..e74e0bb
--- /dev/null
+++ b/leetcode/medium/022_generate_parentheses.md
@@ -0,0 +1,42 @@
+# 22. 
Generate Parentheses
+
+## Recursive Solution
+- Run-time: Less than O(2^2N)
+- Space: O(2N)
+- N = Given N
+
+Any time we deal with different choices, recursion should be the first thing to come to mind.
+This problem has a simple decision tree: whether to add a parenthesis or not.
+To figure that out, we need a few more items; we need to know the number of open versus closed parentheses already used to form the result.
+Therefore, we are only considering valid parentheses as we recur down the decision tree.
+
+The run-time can be figured out by thinking about the number of branches or children each node of the tree will have, as well as the depth of the tree.
+So you can use the equation O(B^D) for most recursions.
+Since there are 2 decisions/branches and a depth of 2N, the run-time can be considered O(2^2N).
+
+However, unlike other decision trees, this particular approach only generates valid parentheses.
+So a result like '((((' or '))((' cannot be created, as it could if we were to instead traverse the entire decision tree.
+So a run-time of O(2^2N) isn't exactly correct; it is actually faster, again because we only generate valid parentheses.
+It may be difficult to come up with the actual run-time during an interview, but you should at least mention this.
+
+```
+class Solution:
+    def generateParenthesis(self, n: int) -> List[str]:
+
+        def gen_helper(n_open, n_closed, stack):
+            if n_open == n and n_closed == n:
+                results.append(''.join(stack))
+                return
+            if n_open != n:
+                stack.append('(')
+                gen_helper(n_open + 1, n_closed, stack)
+                stack.pop()
+            if n_open > n_closed and n_closed != n:
+                stack.append(')')
+                gen_helper(n_open, n_closed + 1, stack)
+                stack.pop()
+
+        results = list()
+        gen_helper(0, 0, [])
+        return results
+```
diff --git a/leetcode/medium/039_combination_sum.md b/leetcode/medium/039_combination_sum.md
new file mode 100644
index 0000000..2ac73d4
--- /dev/null
+++ b/leetcode/medium/039_combination_sum.md
@@ -0,0 +1,29 @@
+# 39. 
Combination Sum
+
+## DFS Recursion Solution
+
+- Runtime: TBD
+- Space: O(T)
+- N = Number of elements in array
+- T = Target number
+
+```
+import copy
+
+class Solution:
+    def combinationSum(self, candidates: List[int], target: int) -> List[List[int]]:
+
+        def dfs(curr_idx, curr_sum, stack, results):
+            if curr_sum > target:
+                return
+            elif curr_sum == target:
+                results.append(copy.deepcopy(stack))
+                return
+            for idx in range(curr_idx, len(candidates)):
+                stack.append(candidates[idx])
+                dfs(idx, curr_sum+candidates[idx], stack, results)
+                stack.pop()
+
+        results = list()
+        candidates.sort()
+        dfs(0, 0, [], results)
+        return results
+```
diff --git a/leetcode/medium/049_group_anagrams.md b/leetcode/medium/049_group_anagrams.md
new file mode 100644
index 0000000..70ee11d
--- /dev/null
+++ b/leetcode/medium/049_group_anagrams.md
@@ -0,0 +1,58 @@
+# 49. Group Anagrams
+
+## Sort Solution
+
+- Runtime: O(NSlog(S))
+- Space: O(NS)
+- N = Number of strings in list
+- S = Longest string
+
+We can use a dictionary to group anagrams together.
+To figure out whether two strings should be grouped together, we can create the same key for both by sorting them.
+
+```
+from collections import defaultdict
+
+class Solution:
+    def groupAnagrams(self, strs: List[str]) -> List[List[str]]:
+        grouped_anagrams = defaultdict(list)
+        for word in strs:
+            key = ''.join(sorted(word))
+            grouped_anagrams[key].append(word)
+        return grouped_anagrams.values()
+```
+
+## Hash Solution
+
+- Runtime: O(NS)
+- Space: O(NS)
+- N = Number of strings in list
+- S = Longest string
+
+We can improve upon the previous solution by figuring out a better way to create the exact same key to group by.
+We will have to create a good hash function.
+Since we know the actual range, that is, 26 letters, we can just create an array of size 26 and place the counts of each letter into each of the 26 buckets. 
+We have to create a tuple version of this list because you cannot hash a mutable object in Python, only immutable objects.
+
+It is also very important to note that there MUST be some sort of delimiter in between each count/bucket of the hash code.
+Otherwise you can have an input where the hash code is a string of 1s, and you can't tell whether a character has a count of 1, 11, 111, etc., creating a bad hash code.
+
+```
+from collections import defaultdict
+
+class Solution:
+    def groupAnagrams(self, strs: List[str]) -> List[List[str]]:
+        def get_hash(word):
+            buckets = [0] * 26
+            for ch in word:
+                idx = ord(ch) - ord('a')
+                buckets[idx] += 1
+            return tuple(buckets)
+
+        grouped_anagrams = defaultdict(list)
+        for word in strs:
+            key = get_hash(word)
+            grouped_anagrams[key].append(word)
+        return grouped_anagrams.values()
+```
diff --git a/leetcode/medium/075_sort_colors.md b/leetcode/medium/075_sort_colors.md
index 51b441a..4d26449 100644
--- a/leetcode/medium/075_sort_colors.md
+++ b/leetcode/medium/075_sort_colors.md
@@ -1,6 +1,6 @@
 # 75. Sort Colors
 
-## Two pointer one pass solution
+## Three pointer one pass solution
 
 - Runtime: O(N)
 - Space: O(1)
@@ -8,7 +8,7 @@
 
 Since we know the range of numbers, 0, 1 and 2, we can just use two pointers for each end of the array.
 The left pointer will point to the first unsorted number != 0 and the right pointer will point to the last unsorted number != 2.
-We will then traverse the array from left to right swapping the values with the current pointer to either the left or right depending if its a 0 or 2, ignoring 1s, then incrementing the left or right pointer.
+We will then traverse the array from left to right, swapping the values with a third pointer to either the left or right depending on whether it's a 0 or 2, ignoring 1s, then incrementing the left or right pointer.
 Eventually, when we reach the right pointer, it would have sorted the entire array and all the 1s will be in the middle. 
``` @@ -17,15 +17,8 @@ class Solution: """ Do not return anything, modify nums in-place instead. """ - left, right = 0, len(nums)-1 - while left > right: - if nums[left] == 0: - left += 1 - elif nums[right] == 2: - right -= 1 - else: - break - curr_index = left + right = len(nums)-1 + curr_index = left = 0 while curr_index <= right: if nums[curr_index] == 0: nums[curr_index], nums[left] = nums[left], nums[curr_index] diff --git a/leetcode/medium/078_subsets.md b/leetcode/medium/078_subsets.md index 549bd65..45e6e35 100644 --- a/leetcode/medium/078_subsets.md +++ b/leetcode/medium/078_subsets.md @@ -5,6 +5,9 @@ - Space: O(N) - N = Number of elements in array +The intuition is to recognize that for each number, we can either add it or not. +Therefore, using recursion, we can easily backtrack the solution and try choice #2. + Using the ability for recursion to backtrack will allow us to populate the result. During each recursion, we will loop through the given array, during this loop, the number represent a choosen number for the subset. The numbers that were not choosen yet will be passed to the next recursion to be choosen again, hence, creating the result. @@ -23,17 +26,19 @@ Input: [1,2,3] 9. 
R1: Select 3 -> [3] -> Pop 3 -> Done
 ```
-class Solution:
-    def subsets(self, nums: List[int]) -> List[List[int]]:
+class Solution(object):
+    def subsets(self, nums):
+
+        def subset_helper(idx, results, path=[]):
+            if idx == len(nums):
+                results.append(path)
+                return
+            # add
+            subset_helper(idx+1, results, path+[nums[idx]])
+            # don't add
+            subset_helper(idx+1, results, path)
+
         results = list()
-        results.append([])
-        self.subset_helper(nums, [], results)
+        subset_helper(0, results)
         return results
-
-    def subset_helper(self, nums, curr_result, results):
-        for index, n in enumerate(nums):
-            curr_result.append(n)
-            results.append([str(n) for n in curr_result])
-            self.subset_helper(nums[index+1:], curr_result, results)
-            curr_result.pop()
 ```
diff --git a/leetcode/medium/091_decode_ways.md b/leetcode/medium/091_decode_ways.md
new file mode 100644
index 0000000..de8b5d1
--- /dev/null
+++ b/leetcode/medium/091_decode_ways.md
@@ -0,0 +1,87 @@
+# 91. Decode Ways
+
+## Brute Force Recursion Solution
+
+- Runtime: O(2^N)
+- Space: O(N)
+- N = Number of characters in string
+
+We should recognize that we can basically ask two questions for each character: can we decode this character on its own, or can we decode it together with the previous character?
+With this, you can build a recursive function to solve the problem.
+
+```
+class Solution(object):
+    def numDecodings(self, s):
+
+        def decode_helper(s):
+            if len(s) == 0:
+                return 1
+            n_ways = 0
+            if int(s[0]) != 0:
+                n_ways += decode_helper(s[1:])
+            if len(s) >= 2 and 10 <= int(s[:2]) <= 26:
+                n_ways += decode_helper(s[2:])
+            return n_ways
+
+        if len(s) == 0:
+            return 0
+        return decode_helper(s)
+```
+
+## Dynamic Programming Solution
+
+- Runtime: O(N)
+- Space: O(N)
+- N = Number of characters in string
+
+Since we can recognize that there exists a recursion function, this tells us that there is also a dynamic programming solution. 
+We already know the two sub-problems: whether to decode the current character or the current + previous character.
+
+With any dynamic programming solution, we need some sort of array.
+We can see that a 1d array can be used to represent the number of ways to decode a character.
+With this, we can store the previously calculated numbers and check them as we go left to right in the string.
+
+So given s='123', dp[0] will represent an empty string, dp[1] will represent '1', dp[2] will represent '12' and so forth.
+We can then deduce that at any given character, dp[n] = dp[n-1] + dp[n-2].
+
+```
+class Solution(object):
+    def numDecodings(self, s):
+        if len(s) == 0:
+            return 0
+        s = '0' + s # represent empty string
+        n_ways = [0] * (len(s))
+        n_ways[0] = 1 # set empty string
+        for idx, ch in enumerate(s[1:], 1):
+            if int(ch) != 0:
+                n_ways[idx] += n_ways[idx-1]
+            if 10 <= int(s[idx-1:idx+1]) <= 26:
+                n_ways[idx] += n_ways[idx-2]
+        return n_ways[-1]
+```
+
+## Optimal Solution
+
+- Runtime: O(N)
+- Space: O(1)
+- N = Number of characters in string
+
+You can further optimize space by keeping just two variables: one representing the previous character and one for the character before that. We don't need the entire array of size N; once we move past the old DP elements, they are never used again. 
+ +``` +class Solution(object): + def numDecodings(self, s): + if len(s) == 0: + return 0 + s = '0' + s # represent empty string + prev_prev, prev = 0, 1 + for idx, ch in enumerate(s[1:], 1): + result = 0 + if int(ch) != 0: + result += prev + if 10 <= int(s[idx-1:idx+1]) <= 26: + result += prev_prev + prev_prev = prev + prev = result + return prev +``` diff --git a/leetcode/medium/094_binary_tree_inorder_traversal.md b/leetcode/medium/094_binary_tree_inorder_traversal.md index 1879a68..e4f75db 100644 --- a/leetcode/medium/094_binary_tree_inorder_traversal.md +++ b/leetcode/medium/094_binary_tree_inorder_traversal.md @@ -47,15 +47,13 @@ class Solution: def inorderTraversal(self, root: TreeNode) -> List[int]: stack, result = list(), list() curr_node = root - while True: - if curr_node is not None: # Going down the tree + while stack or curr_node: + if curr_node: stack.append(curr_node) - curr_node = curr_node.left - elif len(stack) > 0: # Going up the tree (backtracking) - curr_node = stack.pop() - result.append(curr_node.val) - curr_node = curr_node.right + curr_node = curr_node.left # go left else: - break + curr_node = stack.pop() # go up + result.append(curr_node.val) + curr_node = curr_node.right # go right return result ``` diff --git a/leetcode/medium/1008_construct_binary_search_tree_from_preorder_traversal.md b/leetcode/medium/1008_construct_binary_search_tree_from_preorder_traversal.md new file mode 100644 index 0000000..18bc6c7 --- /dev/null +++ b/leetcode/medium/1008_construct_binary_search_tree_from_preorder_traversal.md @@ -0,0 +1,34 @@ +# 1008. Construct Binary Search Tree from Preorder Traversal + +## Recursive Solution with Ranges + +- Runtime: O(N) +- Space: O(N) +- N = Number of elements in array + +I've selected this question because it would be good to know at least one reconstruction method of a BST. +Preorder traversal was selected because it is one of the easier ones to understand. 
+Unlike the other traversals, preorder has the property of knowing what the root node is by looking at the first element of the list.
+
+We can use the fact that we know what the root node is and set an upper and lower bound for the values in the left and right recursion calls. When we are out of bounds for either recursion, we know we have reached the furthest possible value in the list and backtrack up to the parent node.
+
+```
+class Solution:
+    curr_idx = 0
+    def bstFromPreorder(self, preorder: List[int]) -> TreeNode:
+        def bst_creator(lower, upper):
+            if self.curr_idx >= len(preorder):
+                return None
+            root_val = preorder[self.curr_idx]
+            if not lower <= root_val < upper:
+                return None
+            new_node = TreeNode(root_val)
+            self.curr_idx += 1
+            left_node = bst_creator(lower, root_val)
+            right_node = bst_creator(root_val, upper)
+            new_node.left = left_node
+            new_node.right = right_node
+            return new_node
+
+        self.curr_idx = 0 # reset in case of repeated calls
+        return bst_creator(float('-inf'), float('inf'))
+```
diff --git a/leetcode/medium/1048_longest_string_chain.md b/leetcode/medium/1048_longest_string_chain.md
new file mode 100644
index 0000000..8eb24e7
--- /dev/null
+++ b/leetcode/medium/1048_longest_string_chain.md
@@ -0,0 +1,38 @@
+# 1048. Longest String Chain
+
+## Memoization with Recursion Solution
+
+- Runtime: O(NC) + O(C^C)
+- Space: O(C)
+- N = Number of words in list
+- C = Length of longest word
+
+We can come up with a recursive solution quite easily by removing each letter and calling the next recursion function on each newly formed word.
+However, this would equate to a run-time of O(C^C); since we have to do this N times, it would then be O(N) \* O(C^C).
+
+We can improve our run-time by using memoization: instead of redoing checks, we check each path once and save that result.
+So if we can only build a chain of 3 with 'abcd', then when given 'abcde' and it comes to removing 'e', we don't need to check each character of 'abcd' again; we can just return 3. 
+This would mean our run-time goes down tremendously to O(NC) + O(C^C).
+This is because we only need to do the O(C^C) check once and not for every one of the N words.
+
+```
+class Solution:
+    def longestStrChain(self, words: List[str]) -> int:
+
+        def chain_helper(word):
+            if word in memo:
+                return memo[word]
+            longest_chain = 1
+            for idx in range(len(word)):
+                new_word = word[:idx] + word[idx+1:]
+                if new_word in word_set:
+                    longest_chain = max(chain_helper(new_word)+1, longest_chain)
+            memo[word] = longest_chain
+            return longest_chain
+
+        memo = dict()
+        word_set = set(words)
+        for word in words:
+            chain_helper(word)
+        return max(memo.values(), default=0)
+```
diff --git a/leetcode/medium/105_construct_binary_tree_from_preorder_and_inorder_traversal.md b/leetcode/medium/105_construct_binary_tree_from_preorder_and_inorder_traversal.md
new file mode 100644
index 0000000..9c1570d
--- /dev/null
+++ b/leetcode/medium/105_construct_binary_tree_from_preorder_and_inorder_traversal.md
@@ -0,0 +1,37 @@
+# 105. Construct Binary Tree from Preorder and Inorder Traversal
+
+## Recursive Solution
+- Runtime: O(N)
+- Space: O(N) (Due to hash table)
+- N = Number of elements in list
+
+The preorder traversal is the pivotal element here.
+Its key property is that the first element is always the root of the tree.
+
+Given a preorder of [A,B,D,E,C,F,G] and an inorder of [D,B,E,A,F,C,G]:
+Using the preorder, we can see that A is the root of the entire tree.
+Using A and looking at the inorder, we know that the left-subtree is [D,B,E] and the right-subtree is [F,C,G].
+Going back to the preorder with this information, we can see that [B,D,E] is our left-subtree and [C,F,G] is our right.
+
+Now we can use recursion to further build the nodes; we can take preorder [B,D,E] and inorder [D,B,E] to build the left-subtree, and similarly for the right.
+Using the same rules applied above, we know B is the root node, [D] is on the left and [E] is on the right. 
+
+Finding the inorder index associated with a given preorder value would take O(N); however, using an enumerated hash table makes this O(1).
+
+```
+class Solution:
+    def buildTree(self, preorder: List[int], inorder: List[int]) -> TreeNode:
+
+        def build_helper(preorder_start, preorder_end, inorder_start, inorder_end):
+            if preorder_start > preorder_end or inorder_start > inorder_end:
+                return None
+            inorder_root_idx = val_to_inorder_idx[preorder[preorder_start]]
+            left_size = inorder_root_idx - inorder_start
+            node = TreeNode(preorder[preorder_start])
+            node.left = build_helper(preorder_start+1, preorder_start+1+left_size, inorder_start, inorder_root_idx-1)
+            node.right = build_helper(preorder_start+1+left_size, preorder_end, inorder_root_idx+1, inorder_end)
+            return node
+
+        val_to_inorder_idx = {val: i for i, val in enumerate(inorder)}
+        return build_helper(0, len(preorder)-1, 0, len(inorder)-1)
+```
diff --git a/leetcode/medium/106_construct_binary_tree_from_inorder_and_postorder_traversal.md b/leetcode/medium/106_construct_binary_tree_from_inorder_and_postorder_traversal.md
new file mode 100644
index 0000000..414c6a5
--- /dev/null
+++ b/leetcode/medium/106_construct_binary_tree_from_inorder_and_postorder_traversal.md
@@ -0,0 +1,31 @@
+# 106. Construct Binary Tree from Inorder and Postorder Traversal
+
+## Recursive Solution
+- Runtime: O(N)
+- Space: O(N) (Due to hash table)
+- N = Number of elements in list
+
+Similar to question 105.
+Postorder allows us to know which is the root node by using the last element of the array.
+With this, we can figure out the left and right sub-trees in the inorder traversal.
+Using recursion, we can continue to break up the sub-trees once we know which is the root of each sub-tree using this method.
+
+To allow for quicker look-ups of roots, we can build an enumerated hash table to find the index of each value in the inorder traversal. 
+ +``` +class Solution: + def buildTree(self, inorder: List[int], postorder: List[int]) -> TreeNode: + + def build_helper(inorder_start, inorder_end, postorder_start, postorder_end): + if inorder_start > inorder_end or postorder_start > postorder_end: + return None + inorder_root_idx = val_to_inorder_idx[postorder[postorder_end]] + right_size = inorder_end - inorder_root_idx + node = TreeNode(postorder[postorder_end]) + node.left = build_helper(inorder_start, inorder_root_idx-1, postorder_start, postorder_end - right_size - 1) + node.right = build_helper(inorder_root_idx + 1, inorder_end, postorder_end - right_size, postorder_end - 1) + return node + + val_to_inorder_idx = {val: i for i, val in enumerate(inorder)} + return build_helper(0, len(inorder)-1, 0, len(postorder)-1) +``` diff --git a/leetcode/medium/1091_shortest_path_in_binary_matrix.md b/leetcode/medium/1091_shortest_path_in_binary_matrix.md index cb6da2a..daa399e 100644 --- a/leetcode/medium/1091_shortest_path_in_binary_matrix.md +++ b/leetcode/medium/1091_shortest_path_in_binary_matrix.md @@ -1,7 +1,7 @@ # 1091. Shortest Path in Binary Matrix ## BFS Solution -- Runtime: O(N) +- Run-time: O(N) - Space: O(N) - N = Number of elements in grid @@ -33,7 +33,7 @@ class Solution: bfs_queue.append((_x, _y)) grid[_x][_y] = -1 return -1 - + def get_neighbors(self, grid, x, y): axises = [(1,0),(0,1),(0,-1),(-1,0),(-1,-1),(1,-1),(-1,1),(1,1)] for _x, _y in axises: @@ -41,21 +41,61 @@ class Solution: _y += y if self.within_bounds(_x, _y, grid) and grid[_x][_y] == 0: yield (_x, _y) - + def within_bounds(self, x, y, grid): return x >= 0 and x < len(grid) and y >= 0 and y < len(grid[0]) ``` ## Bi-directional BFS Best Solution -- Runtime: O(N) +- Run-time: O(N) - Space: O(N) - N = Number of elements in grid We can further improve the run-time of the BFS by using a bi-directional BFS. If you imagine a circle, every time you expand the circle, the number of neighbors being visited almost doubles. 
Now if you used two circles and expanded them both simultaneously, when they touch, it would have visited less neighbors compared to using one circle. -This increases the run-time even more for some inputs, worst case, is the same as the first BFS solution. ``` +from collections import deque + +class Solution(object): + def shortestPathBinaryMatrix(self, grid): + + def get_neighbors(row_idx, col_idx): + dirs = [(0,1),(1,0),(0,-1),(-1,0),(1,1),(-1,-1),(-1,1),(1,-1)] + for r, c in dirs: + r += row_idx + c += col_idx + if 0 <= r < len(grid) and 0 <= c < len(grid[0]): + yield (r, c) + def bfs_once(queue, our_num, other_num): + for _ in range(len(queue)): + node = queue.pop() + for x, y in get_neighbors(node[0], node[1]): + if grid[x][y] == other_num: # intersection + return True + elif grid[x][y] == 0: + grid[x][y] = our_num + queue.appendleft((x, y)) + return False + + if not grid or len(grid) == 0 or len(grid[0]) == 0: # is grid empty? + return -1 + elif len(grid) == 1 and len(grid[0]) == 1: # does grid only contain one element? + return 1 + elif grid[0][0] == 1 or grid[-1][-1] == 1: # are start or end blocked? + return -1 + queue1 = deque([(0, 0)]) + queue2 = deque([(len(grid)-1, len(grid[0])-1)]) + grid[0][0], grid[-1][-1] = 2, 3 # numbers represent two visited sets + n_moves = 2 + while queue1 and queue2: + if bfs_once(queue1, 2, 3): + return n_moves + n_moves += 1 + if bfs_once(queue2, 3, 2): + return n_moves + n_moves += 1 + return -1 ``` diff --git a/leetcode/medium/1135_connecting_cities_with_minimum_cost.md b/leetcode/medium/1135_connecting_cities_with_minimum_cost.md new file mode 100644 index 0000000..5ae2bce --- /dev/null +++ b/leetcode/medium/1135_connecting_cities_with_minimum_cost.md @@ -0,0 +1,74 @@ +# 1135. Connecting Cities With Minimum Cost + +## Prim's Algorithm Solution with Heaps +- Run-time: O((V + E)logV) +- Space: O(V) +- V = Vertices +- E = Edges + +This question requires a minimum spanning tree (MST) algorithm. 
+I chose to use Prim's algorithm over Kruskal's algorithm because it is similar to Dijkstra's algorithm; both are greedy algorithms.
+Knowing one will help you learn the other.
+
+The pseudocode could be represented as such:
+1. Select an arbitrary vertex to start with and add it to the heap.
+2. While the heap isn't empty:
+    - Select the min edge/weight of an unvisited vertex from top of heap.
+    - Add the selected vertex to the visited vertices.
+    - Take note of the cost or path taken.
+    - Add/Update selected vertex's neighboring edges/weights of unvisited vertices into the heap.
+
+As of the time of writing this, Python 3.7 doesn't have an adaptable priority queue implementation.
+If there were a real adaptable priority queue, there would be a feature to update a specific city's cost in the heap and bubble up and down the heap to maintain the heap property.
+To read more about adaptable priority queues, check out [Data Structures and Algorithms in Python 1st Edition](https://www.amazon.com/Structures-Algorithms-Python-Michael-Goodrich/dp/1118290275).
+
+This example is more of a lazy heap implementation; the heapq documentation shows a similar implementation of a lazy adaptable priority queue.
+This will help us get O((V + E)logV) run-time.
+Without performing the lazy delete technique with heapq, we would get O((V + E)^2logV) run-time.
+That is because in an undirected dense graph, we would be adding duplicate entries of the same nodes into the heap.
+There are a lot of incorrect implementations of Prim's algorithm using heapq out there that do not perform lazy deletes, so be warned. 
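In isolation, the lazy delete pattern can be sketched like this (a minimal sketch with made-up names, separate from the full solution below):

```python
import heapq

# Each heap entry is a mutable list [cost, key, removed]. Instead of deleting
# an outdated entry (which heapq cannot do efficiently), we flag it and skip
# it when it surfaces at the top of the heap.
heap = []
current_entry = {}  # key -> its live heap entry

def push(key, cost):
    if key in current_entry:
        if cost >= current_entry[key][0]:
            return  # existing entry is already at least as good
        current_entry[key][2] = True  # lazy delete the worse entry
    entry = [cost, key, False]
    current_entry[key] = entry
    heapq.heappush(heap, entry)

def pop():
    while heap:
        cost, key, removed = heapq.heappop(heap)
        if not removed:  # skip entries that were lazily deleted
            return cost, key
    return None

push('a', 5)
push('b', 3)
push('a', 2)  # supersedes the cost-5 entry for 'a', which is now flagged
print(pop())  # (2, 'a')
```

The flagged cost-5 entry stays in the heap but is discarded the moment it reaches the top, which is what keeps each heap operation at O(logV).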
+
+```
+import heapq
+from collections import defaultdict
+
+class Solution:
+    def minimumCost(self, N: int, connections: List[List[int]]) -> int:
+
+        def create_adj_list():
+            adj_list = defaultdict(list)
+            for n in range(1, N + 1): # in case there are cities with no connections
+                adj_list[n]
+            for city1, city2, cost in connections:
+                adj_list[city1].append((city2, cost))
+                adj_list[city2].append((city1, cost))
+            return adj_list
+
+        def add_city(new_city, new_cost):
+            new_node = [new_cost, new_city, False]
+            city_to_heap_node[new_city] = new_node
+            heapq.heappush(min_heap, new_node)
+
+        adj_list = create_adj_list()
+        start_node = [0, 1, False] # cost, city, remove
+        min_heap = list([start_node]) # start at city #1
+        city_to_heap_node = dict({1 : start_node})
+        visited, min_cost = set(), 0
+        while min_heap:
+            cost, city, remove = heapq.heappop(min_heap)
+            if remove: # lazy delete
+                continue
+            visited.add(city)
+            min_cost += cost
+            for next_city, next_cost in adj_list[city]:
+                if next_city in visited:
+                    continue
+                elif next_city in city_to_heap_node:
+                    heap_node = city_to_heap_node[next_city]
+                    if next_cost < heap_node[0]: # lower cost?
+                        heap_node[2] = True # lazy delete
+                        add_city(next_city, next_cost)
+                else:
+                    add_city(next_city, next_cost)
+            city_to_heap_node.pop(city)
+        return min_cost if len(visited) == N else -1
+```
diff --git a/leetcode/medium/133_clone_graph.md b/leetcode/medium/133_clone_graph.md
new file mode 100644
index 0000000..bc201d0
--- /dev/null
+++ b/leetcode/medium/133_clone_graph.md
@@ -0,0 +1,32 @@
+# 133. Clone Graph
+
+## Dictionary and DFS Solution
+- Runtime: O(V + E)
+- Space: O(V)
+- V = Vertices
+- E = Edges
+
+We can use a dictionary as a map from each node in the graph to its clone; then, using DFS, we can look up the clone that corresponds to any node.
+As we visit each node with DFS, we create a clone of each neighbor and append it to the current node's clone via the dictionary.
+After that, we call DFS on each neighbor.
+
+```
+class Solution:
+    def cloneGraph(self, node: 'Node') -> 'Node':
+
+        def dfs_clone(curr):
+            if curr in visited:
+                return
+            visited.add(curr)
+            for neighbor in curr.neighbors:
+                if neighbor not in node_to_clone:
+                    node_to_clone[neighbor] = Node(neighbor.val, [])
+                node_to_clone[curr].neighbors.append(node_to_clone[neighbor])
+                dfs_clone(neighbor)
+
+        node_to_clone = dict()
+        node_to_clone[node] = Node(node.val, [])
+        visited = set()
+        dfs_clone(node)
+        return node_to_clone[node]
+```
diff --git a/leetcode/medium/142_linked_list_cycle_II.md b/leetcode/medium/142_linked_list_cycle_II.md
new file mode 100644
index 0000000..a702df3
--- /dev/null
+++ b/leetcode/medium/142_linked_list_cycle_II.md
@@ -0,0 +1,38 @@
+# 142. Linked List Cycle II
+
+## Solution
+- Runtime: O(N)
+- Space: O(1)
+- N = Nodes in linked list
+
+By using a fast and slow pointer, where slow moves once while fast moves twice, we can figure out if there is a cycle.
+However, there is a second property to this: the point where slow and fast meet inside the cycle is exactly as many steps (K) from the start of the cycle as the head is.
+
+The reason this works, no matter where the cycle begins or whether the number of nodes is even or odd, is that:
+1. Slow and fast will always end up moving an even number of times.
+2. Fast will always move twice as much as slow.
+3. Fast will have gone around the cycle at least once.
+
+```
+class Solution(object):
+    def detectCycle(self, head):
+        """
+        :type head: ListNode
+        :rtype: ListNode
+        """
+        if head is None:
+            return None
+        slow = fast = head
+        while fast is not None and fast.next is not None:
+            fast = fast.next.next
+            slow = slow.next
+            if fast is slow:
+                break
+        else:
+            return None
+        slow = head
+        while fast is not slow:
+            slow = slow.next
+            fast = fast.next
+        return slow
+```
diff --git a/leetcode/medium/146_lru_cache.md b/leetcode/medium/146_lru_cache.md
new file mode 100644
index 0000000..3344147
--- /dev/null
+++ b/leetcode/medium/146_lru_cache.md
@@ -0,0 +1,124 @@
+# 146. LRU Cache
+
+## Solution
+- Runtime: O(1)
+- Space: O(C)
+- C = Capacity
+
+This is a very popular interview question.
+
+Since both get() and put() need to be O(1), this can be achieved by using a dictionary and a queue.
+
+We will need some type of ordering to figure out which keys are no longer needed, and a dictionary to determine whether a key exists.
+Combining both will help us achieve O(1).
+
+get() reorders nodes in the queue: if we get a key whose node is currently in the middle of the queue, we need a way to move it to the front.
+To do that, each dictionary value will be a pointer to that node in the queue.
+For the dictionary, we can use the key as the key and the value will be the node in the queue.
+Another thing to note is that each node in the queue will need to be part of a doubly linked list.
+During the removal of a node from the queue, we need to know its previous and next nodes.
+
+put() will simply add to the front of the queue and update the dictionary.
+If the capacity has been reached, we can remove what is at the tail of the queue and remove the corresponding key from the dictionary.
+
+As an added bonus, you can use a circular linked list as your queue, although it is not required.
+You can also utilize a dummy node to eliminate the four cases of removing and adding nodes to a linked list.
+
+If you plan to use OrderedDict or some other library, the interviewer will likely ask you to implement it from scratch.
+So don't rely on outside libraries for your hash table and queue.
+
+```
+class LRUCache:
+
+    def __init__(self, capacity: int):
+        self.queue = CircularLinkedList()
+        self.key_to_node = dict()
+        self.cap = capacity
+
+    def get(self, key: int) -> int:
+        if key in self.key_to_node:
+            node = self.key_to_node[key]
+            self.queue.remove(node)
+            self.queue.add(node) # move the node back to the front of the queue
+            return node.val
+        return -1
+
+    def put(self, key: int, value: int) -> None:
+        if key in self.key_to_node:
+            node_to_del = self.key_to_node[key]
+            self.queue.remove(node_to_del)
+        new_node = Node(key, value)
+        self.queue.add(new_node)
+        self.key_to_node[key] = new_node
+        if len(self.key_to_node) > self.cap:
+            node_to_del = self.queue.tail
+            self.queue.remove(node_to_del)
+            del self.key_to_node[node_to_del.key]
+
+class CircularLinkedList(object):
+    def __init__(self):
+        self.dummy_node = Node(0, 0)
+        self.dummy_node.next = self.dummy_node
+        self.dummy_node.prev = self.dummy_node
+
+    def add(self, new_node):
+        if new_node is None or new_node is self.dummy_node:
+            return
+        new_node.prev = self.dummy_node
+        new_node.next = self.dummy_node.next
+        self.dummy_node.next.prev = new_node
+        self.dummy_node.next = new_node
+
+    def remove(self, node):
+        if node is None or node is self.dummy_node:
+            return
+        p = node.prev
+        n = node.next
+        p.next = n
+        n.prev = p
+
+    @property
+    def tail(self):
+        if self.dummy_node.prev is not self.dummy_node:
+            return self.dummy_node.prev
+        return None
+
+class Node(object):
+    def __init__(self, key, val):
+        self.prev = None
+        self.next = None
+        self.val = val
+        self.key = key
+```
+
+## Solution with OrderedDict
+- Runtime: O(1)
+- Space: O(C)
+- C = Capacity
+
+For those interested in the implementation with an OrderedDict.
Here is an example.
+
+```
+from collections import OrderedDict
+
+class LRUCache:
+
+    def __init__(self, capacity: int):
+        self.cap = capacity
+        self.cache = OrderedDict()
+
+    def get(self, key: int) -> int:
+        if key in self.cache:
+            val = self.cache.pop(key)
+            self.cache[key] = val # reinsert to mark the key as most recently used
+            return val
+        return -1
+
+    def put(self, key: int, value: int) -> None:
+        if key in self.cache:
+            self.cache.pop(key)
+        self.cache[key] = value
+        if len(self.cache) > self.cap:
+            self.cache.popitem(last=False) # evict the least recently used key
+```
diff --git a/leetcode/medium/148_sort_list.md b/leetcode/medium/148_sort_list.md
new file mode 100644
index 0000000..b784161
--- /dev/null
+++ b/leetcode/medium/148_sort_list.md
@@ -0,0 +1,42 @@
+# 148. Sort List
+
+## Merge Sort Solution
+- Run-time: O(Nlog(N))
+- Space: O(N)
+- N = Number of list Nodes
+
+By replicating merge sort with a linked list, we can achieve a solution.
+We need to make use of the fast/slow technique to find the middle node of the list.
+Once found, we can sever the link at the middle node and create two linked lists.
+Then it's a matter of calling merge sort on these two lists and merging them together after.
+Each merge sort call returns a sorted linked list, so you will end up with two sorted linked lists that you need to merge and return back up the call.
+ +``` +class Solution: + def sortList(self, head: ListNode) -> ListNode: + + def merge_sort_ll(head): + if head is None or head.next is None: + return head + prev, fast, slow = None, head, head + while fast and fast.next: + fast = fast.next.next + prev = slow + slow = slow.next + prev.next = None + l1 = merge_sort_ll(head) + l2 = merge_sort_ll(slow) + return merge(l1, l2) + + def merge(l1, l2): + curr = dummy = ListNode(0) + while l1 and l2: + if l1.val < l2.val: + l1, curr.next, curr = l1.next, l1, l1 + else: + l2, curr.next, curr = l2.next, l2, l2 + curr.next = l1 or l2 + return dummy.next + + return merge_sort_ll(head) +``` diff --git a/leetcode/medium/200_number_of_islands.md b/leetcode/medium/200_number_of_islands.md index d8a3d00..c32b651 100644 --- a/leetcode/medium/200_number_of_islands.md +++ b/leetcode/medium/200_number_of_islands.md @@ -5,134 +5,74 @@ - Space: O(N) - N = Number of elements in grid -Using a visited set and recursion to achieve a solution. -``` -class Solution: - def numIslands(self, grid: List[List[str]]) -> int: - def traverse_helper(grid): - n_islands = 0 - visited = set() - for y_index, y in enumerate(grid): - for x_index, x in enumerate(y): - if (x_index, y_index) not in visited and x == '1': - n_islands += 1 - traverse_islands_dfs_recursion(x_index, y_index, grid, visited) - return n_islands - - def traverse_islands_dfs_recursion(x, y, grid, visited): - if not within_bounds(x, y, grid) or (x,y) in visited or grid[y][x] == '0': - return - visited.add((x,y)) - for neighbor_x, neighbor_y in get_neighbors_gen(x, y, grid): - traverse_islands_dfs_recursion(neighbor_x, neighbor_y, grid, visited) - - def get_neighbors_gen(x, y, grid): - yield x, y-1 # top - yield x, y+1 # bottom - yield x-1, y # left - yield x+1, y # right - - def within_bounds(x, y, grid): - if y >= 0 and y < len(grid) and x >= 0 and x < len(grid[0]): - return True - return False - - return traverse_helper(grid) -``` +By going around the grid, we can DFS on the first '1' 
encountered per island.
+The DFS will only recurse on a neighboring element if that element is a '1'; otherwise it is skipped.
-## BFS Recursive Solution
-- Runtime: O(N)
-- Space: O(N)
-- N = Number of elements in grid
+We can save some space by reusing the given grid.
+You should ask the interviewer if you are allowed to modify the original grid.
+We can then use another number such as "2" to represent an already visited island, therefore no longer needing a visited set during our BFS or DFS.
```
+class Solution(object):
+    def numIslands(self, grid):
+
+        def dfs(r, c):
+            grid[r][c] = '2'
+            for x, y in get_neighbors(r, c):
+                if grid[x][y] == '1':
+                    dfs(x, y)
+
+        def get_neighbors(x, y):
+            dirs = [(1,0),(0,1),(-1,0),(0,-1)]
+            for _x, _y in dirs:
+                _x += x
+                _y += y
+                if 0 <= _x < len(grid) and 0 <= _y < len(grid[0]):
+                    yield (_x, _y)
+
+        n_islands = 0
+        for r, row in enumerate(grid):
+            for c, col in enumerate(row):
+                if col == '1':
+                    n_islands += 1
+                    dfs(r, c)
+        return n_islands
```

## BFS Iterative Solution
-- Runtime: O(N)
-- Space: O(N)
-- N = Number of elements in grid
+- Runtime: O(N * M)
+- Space: O(N * M)
+- N = Number of Rows
+- M = Number of Columns
+
+Remember that BFS uses a queue: gather all the neighboring elements and add them into the queue.
-Similar concept to the previous DFS recursion but now using BFS and a stack.
-``` -class Solution: - def numIslands(self, grid: List[List[str]]) -> int: - def traverse_helper(grid): - n_islands = 0 - visited = set() - for y_index, y in enumerate(grid): - for x_index, x in enumerate(y): - if (x_index, y_index) not in visited and x == '1': - n_islands += 1 - traverse_islands_bfs_iterative(x_index, y_index, grid, visited) - return n_islands - - def traverse_islands_bfs_iterative(x, y, grid, visited): - stack = list() - stack.append((x,y)) - while len(stack) > 0: - x_index, y_index = stack.pop() - visited.add((x_index, y_index)) - for x_neighbor, y_neighbor in get_neighbors_gen(x_index, y_index, grid): - if within_bounds(x_neighbor, y_neighbor, grid) \ - and (x_neighbor, y_neighbor) not in visited \ - and grid[y_neighbor][x_neighbor] == '1': - stack.append((x_neighbor, y_neighbor)) - - def get_neighbors_gen(x, y, grid): - yield x, y-1 # top - yield x, y+1 # bottom - yield x-1, y # left - yield x+1, y # right - - def within_bounds(x, y, grid): - if y >= 0 and y < len(grid) and x >= 0 and x < len(grid[0]): - return True - return False - - return traverse_helper(grid) ``` +from collections import deque -## O(1) Space BFS Iterative Solution -- Runtime: O(N) -- Space: O(1) -- N = Number of elements in grid +class Solution(object): + def numIslands(self, grid): -We can achieve O(1) space by reusing the given the grid. You should ask the interviewer if you are allowed to modify the original grid. We can then use another number such as "-1" to represented an already visited island, therefore, no longer needing a visited set during our BFS or DFS. 
-```
-class Solution:
-    def numIslands(self, grid: List[List[str]]) -> int:
-        def traverse_helper(grid):
-            n_islands = 0
-            for y_index, y in enumerate(grid):
-                for x_index, x in enumerate(y):
-                    if x == '1':
-                        n_islands += 1
-                        traverse_islands_bfs_iterative(x_index, y_index, grid)
-            return n_islands
-
-        def traverse_islands_bfs_iterative(x, y, grid):
-            stack = list()
-            stack.append((x,y))
-            while len(stack) > 0:
-                x_index, y_index = stack.pop()
-                grid[y_index][x_index] = '-1'
-                for x_neighbor, y_neighbor in get_neighbors_gen(x_index, y_index, grid):
-                    if within_bounds(x_neighbor, y_neighbor, grid) \
-                    and grid[y_neighbor][x_neighbor] == '1':
-                        stack.append((x_neighbor, y_neighbor))
-
-        def get_neighbors_gen(x, y, grid):
-            yield x, y-1 # top
-            yield x, y+1 # bottom
-            yield x-1, y # left
-            yield x+1, y # right
-
-        def within_bounds(x, y, grid):
-            if y >= 0 and y < len(grid) and x >= 0 and x < len(grid[0]):
-                return True
-            return False
-
-        return traverse_helper(grid)
+        def get_neighbors(x, y):
+            dirs = [(1,0),(0,1),(-1,0),(0,-1)]
+            for _x, _y in dirs:
+                _x += x
+                _y += y
+                if 0 <= _x < len(grid) and 0 <= _y < len(grid[0]):
+                    yield (_x, _y)
+
+        n_islands = 0
+        for r, row in enumerate(grid):
+            for c, col in enumerate(row):
+                if col == '1':
+                    n_islands += 1
+                    grid[r][c] = '2'
+                    queue = deque([(r, c)])
+                    while queue:
+                        x, y = queue.pop()
+                        for _x, _y in get_neighbors(x, y):
+                            if grid[_x][_y] == '1':
+                                grid[_x][_y] = '2'
+                                queue.appendleft((_x, _y))
+        return n_islands
```
diff --git a/leetcode/medium/207_course_schedule.md b/leetcode/medium/207_course_schedule.md
index 71cdea4..5dffdbd 100644
--- a/leetcode/medium/207_course_schedule.md
+++ b/leetcode/medium/207_course_schedule.md
@@ -11,39 +11,38 @@ The core of the problem is to find a cycle in the graph, as example 2 of the pro
We will need to create a graph; as it is not provided to us, it can be an adjacency list or a matrix, it doesn't matter.
For any DFS, you will need a global visited and a local visited.
-The global visited will tell us if we need to dfs starting at this node, this is to reduce run-time, else it will be O(N^N).
+The global visited set tells us whether we still need to DFS starting at a node; this reduces the run-time, which would otherwise be O(N^2).
The local visited set is for when we are traversing the graph via DFS and looking for cycles.
-I decided to use a dictionary to simplify the code, -1 will be used during the dfs, then after the dfs, changed into a 1, showing that its already visited and has no cycles. You can always use two separate visited sets but I find the code gets clunky.
-
```
from collections import defaultdict

class Solution:
    def canFinish(self, numCourses: int, prerequisites: List[List[int]]) -> bool:
-        adj_list = self.create_adj_list(prerequisites)
-        visited = defaultdict(int)
-        for node in adj_list:
-            if not self.dfs(node, adj_list, visited):
-                return False
-        return True
-
-    def dfs(self, node, adj_list, visited):
-        if visited[node] == -1: # currently visiting, cycle
-            return False
-        if visited[node] == 1: # already visited, no cycle
+        def create_graph():
+            graph = defaultdict(list)
+            for course, prereq in prerequisites:
+                graph[course].append(prereq)
+                graph[prereq]
+            return graph
+
+        def dfs(course, graph, visited, global_visited):
+            if course in visited:
+                return False # found cycle
+            if course in global_visited:
+                return True
+            visited.add(course)
+            global_visited.add(course)
+            for prereq in graph[course]:
+                if not dfs(prereq, graph, visited, global_visited):
+                    return False
+            visited.remove(course)
            return True
-        visited[node] = -1
-        for neighbor in adj_list[node]:
-            if not self.dfs(neighbor, adj_list, visited):
+
+        graph = create_graph() # key: course, val: list of prereqs
+        global_visited = set()
+        for course in graph:
+            if not dfs(course, graph, set(), global_visited): # cycle
                return False
-        visited[node] = 1
        return True
-
-    def create_adj_list(self, prereqs):
-        adj_list = defaultdict(list)
-        for course, prereq in prereqs:
-
adj_list[course].append(prereq)
-        adj_list[prereq]
-        return adj_list
```
diff --git a/leetcode/medium/210_course_schedule_II.md b/leetcode/medium/210_course_schedule_II.md
new file mode 100644
index 0000000..2f05696
--- /dev/null
+++ b/leetcode/medium/210_course_schedule_II.md
@@ -0,0 +1,51 @@
+# 210. Course Schedule II
+
+## Topological Sort
+- Runtime: O(V + E)
+- Space: O(V)
+- V = Vertices
+- E = Edges
+
+There are plenty of Topological Sort explanations.
+I find this one most helpful: [Topological Sort Video](https://www.youtube.com/watch?v=eL-KzMXSXXI&t=671s)
+
+You can think of topological sort as an extension of DFS.
+
+Also, the question has a misleading edge case: given numCourses = 1 and prerequisites = [], the expected output is [0].
+They are just saying that there exists one course and that course is course 0; essentially islands in the graph, or courses with no prerequisites.
+
+```
+from collections import defaultdict
+
+class Solution:
+    def findOrder(self, numCourses: int, prerequisites: List[List[int]]) -> List[int]:
+        def get_adj_list():
+            adj_list = defaultdict(list)
+            for course, prereq in prerequisites:
+                adj_list[course].append(prereq)
+            for n in range(numCourses):
+                adj_list[n]
+            return adj_list
+
+        def top_sort(node, visited=None):
+            visited = visited if visited is not None else set() # avoid a shared mutable default argument
+            if node in visited: # cycle
+                return False
+            if node in global_visited:
+                return True
+            visited.add(node)
+            global_visited.add(node)
+            for neighbor in adj_list[node]:
+                if not top_sort(neighbor, visited):
+                    return False
+            ordering.append(node)
+            visited.remove(node)
+            return True
+
+        adj_list = get_adj_list()
+        global_visited = set()
+        ordering = list()
+        for node in adj_list:
+            if not top_sort(node):
+                return []
+        return ordering
+```
diff --git a/leetcode/medium/236_lowest_common_ancestor_of_a_binary_tree.md b/leetcode/medium/236_lowest_common_ancestor_of_a_binary_tree.md
new file mode 100644
index 0000000..b582c0d
--- /dev/null
+++ b/leetcode/medium/236_lowest_common_ancestor_of_a_binary_tree.md
@@ -0,0 +1,49 @@
+# 236. Lowest Common Ancestor of a Binary Tree
+
+## Recursive Solution
+
+- Runtime: O(N)
+- Space: O(H)
+- N = Number of nodes in tree
+- H = Height of tree
+
+From the perspective of a root node, there are only three options to consider.
+- The LCA exists on the left sub-tree.
+- The LCA exists on the right sub-tree.
+- The LCA is the root node.
+
+To figure out whether an LCA exists, we need to find p and q in the sub-trees.
+If either is found, we have to let the parent know of its existence.
+
+The other question is when to evaluate these conditions.
+We generally don't want to traverse a sub-tree if the LCA is already found, so if the recursive call returns the LCA, we should just pass it back up the tree.
+Secondly, if the LCA has not been found on either side, we need to know whether p or q were found.
+So a count is returned to the parent node.
+With this count, we can check if our root is p or q and add that to the returned count.
+If the count happens to be 2, we have found the LCA.
+In summary, we need a post-order traversal recursion call.
+
+The worst case is that we have to traverse the entire tree to find p and q. However, we will never need more than O(H) space, the height of the tree, to find p or q.
+
+```
+from collections import namedtuple
+
+LCA = namedtuple('LCA', ['n_found', 'lca'])
+
+class Solution:
+    def lowestCommonAncestor(self, root: 'TreeNode', p: 'TreeNode', q: 'TreeNode') -> 'TreeNode':
+
+        def LCA_helper(root):
+            if root is None:
+                return LCA(n_found=0, lca=None)
+            left = LCA_helper(root.left)
+            if left.n_found == 2:
+                return left
+            right = LCA_helper(root.right)
+            if right.n_found == 2:
+                return right
+            n_found = left.n_found + right.n_found + (1 if root is p or root is q else 0)
+            return LCA(n_found=n_found, lca=root if n_found == 2 else None)
+
+        return LCA_helper(root).lca
+```
diff --git a/leetcode/medium/240_search_a_2D_matrix_II.md b/leetcode/medium/240_search_a_2D_matrix_II.md
new file mode 100644
index 0000000..98335be
--- /dev/null
+++ b/leetcode/medium/240_search_a_2D_matrix_II.md
@@ -0,0 +1,75 @@
+# 240. Search a 2D Matrix II
+
+## Pseudo-Binary Search
+
+- Runtime: O(Rlog(C)) or O(Clog(R))
+- Space: O(1)
+- R = Number of rows
+- C = Number of columns
+
+If we binary search each row, we can find whether a number exists on that row fairly easily.
+However, the worst case is that we have to binary search every row for the target.
+You can also binary search in the opposite direction, by each column; you will end up with the same solution.
+
+```
+class Solution:
+    def searchMatrix(self, matrix, target):
+
+        def binary_search(nums):
+            left = 0
+            right = len(nums)-1
+            last_index = -1
+            while left <= right:
+                mid = left + (right - left) // 2
+                if nums[mid] == target:
+                    return mid
+                elif nums[mid] < target: # go right
+                    last_index = mid
+                    left = mid+1
+                else: # go left
+                    right = mid-1
+            return last_index
+
+        if len(matrix) == 0 or len(matrix[0]) == 0:
+            return False
+        for row in matrix:
+            if row[0] <= target <= row[-1]:
+                col_idx = binary_search(row)
+                if row[col_idx] == target:
+                    return True
+        return False
+```
+
+## Best Solution
+
+- Runtime: O(R+C)
+- Space: O(1)
+- R = Number of rows
+- C = Number of columns
+
+The previous solution partially used the properties of the sorted matrix, but not to their fullest extent.
+The starting point matters; for this example, we start at the top-right element of the matrix.
+We can ask: does the target exist on this row?
+If the current element is greater than the target, we move left; if it is smaller, we move down one row.
+We repeat this until we find the target or exhaust the possible numbers.
+
+This method works because the current element tells us whether the portion below it is worth searching.
+Couple that with moving off the current row once we reach a number larger than the target.
+These two cases eliminate our search space over this sorted matrix.
+
+```
+class Solution:
+    def searchMatrix(self, matrix, target):
+        if not any(matrix):
+            return False
+        row_idx, col_idx = 0, len(matrix[0])-1 # start at top-right most element
+        while row_idx < len(matrix) and col_idx >= 0:
+            if matrix[row_idx][col_idx] == target:
+                return True
+            elif matrix[row_idx][col_idx] > target: # go left
+                col_idx -= 1
+            elif matrix[row_idx][col_idx] < target: # go down
+                row_idx += 1
+        return False
+```
diff --git a/leetcode/medium/253_meeting_rooms_II.md b/leetcode/medium/253_meeting_rooms_II.md
new file mode 100644
index 0000000..0005dbc
--- /dev/null
+++ b/leetcode/medium/253_meeting_rooms_II.md
@@ -0,0 +1,33 @@
+# 253. Meeting Rooms II
+
+## Sorting Solution
+- Runtime: O(Nlog(N))
+- Space: O(N)
+- N = Number of total start and end points in intervals
+
+By separating the start and end points of each interval, we can sort them based on their time; if the times are the same, we break ties depending on whether it's a start or an end point.
+
+This allows us to perform one pass over the sorted points to find the number of occupied rooms.
+If it's a start point, we increment the number of rooms, and vice versa.
+
+The one edge case to consider is an input like [[0,5], [5,10]], which results in 1 room needed.
+Notice that the start and end points overlap between the two intervals. This means that when sorting the points, we should give precedence to end points over start points during tiebreakers.
+
+```
+from collections import namedtuple
+
+Point = namedtuple('Point', ['time', 'is_start'])
+
+class Solution:
+    def minMeetingRooms(self, intervals: List[List[int]]) -> int:
+        points = [Point(time=x[0], is_start=1) for x in intervals] + [Point(time=x[1], is_start=0) for x in intervals]
+        points.sort(key=lambda x: (x.time, x.is_start))
+        n_used_rooms = max_rooms = 0
+        for point in points:
+            if point.is_start:
+                n_used_rooms += 1
+            else:
+                n_used_rooms -= 1
+            max_rooms = max(max_rooms, n_used_rooms)
+        return max_rooms
+```
diff --git a/leetcode/medium/300_longest_increasing_subsequence.md b/leetcode/medium/300_longest_increasing_subsequence.md
index a0f63db..bedd027 100644
--- a/leetcode/medium/300_longest_increasing_subsequence.md
+++ b/leetcode/medium/300_longest_increasing_subsequence.md
@@ -1,33 +1,31 @@
# 300. Longest Increasing Subsequence

-## Iterative Solution
-- Runtime: O(N^2)
+## Dynamic Programming Solution
+- Run-time: O(N^2)
- Space: O(N)
- N = Number of elements in array

If we were to start from the left to the right, we would have seen the longest subsequence on the left side as we are going to the right.
-Inorder for us to know what those longest subsequences were, we will need a way to store that, hello dynamic programming.
+In order for us to know what those longest subsequences were, we will need a way to store that; hello, dynamic programming.

-For each element, we would need look at the numbers less than the current element on the left side.
+For each element, we need to look at the numbers on the left that are less than the current element.
Now that we know which numbers those are, we can then look at their corresponding longest subsequence in the dynamic programming array.
-From this list, get the longest subsequence.
-This will tell us what to set as our longest subsequence for this current element + 1.
-
-We are basically building the longest increasing subsequence from the bottom up.
+Therefore, dp[i] = 1 + max(dp[j] for all j < i where nums[j] < nums[i]), or 1 if no such j exists.

**Example:**
+```
+Input(I): [10,9,2,5,3,7,101,18]

-I(Input): [10,9,2,5,3,7,101,18]
-
-DP: [1,1,1,1,1,1,1,1]
-
-1. DP[1] = 1 + max([]) **(I[1] not > I[0])**
-2. DP[2] = 1 + max([]) **(I[2] not > I[0],I[1])**
-3. DP[3] = 1 + max(DP[2]) **(I[3] > I[2] and I[3] not > I[0],I[1])**
-4. DP[4] = 1 + max(DP[2]) **(I[4] > I[2] and I[4] not > I[0],I[1],I[3])**
-5. DP[5] = 1 + max(DP[1], DP[2], DP[3]) **(I[5] > I[1],I[2],I[3] and I[5] not > I[0])**
-6. DP[6] = 1 + max(DP[0], DP[1], DP[2], DP[3], DP[4], DP[5]) **(I[6] > all prev numbers)**
-7. DP[7] = 1 + max(DP[0], DP[1], DP[2], DP[3], DP[4], DP[5]) **(I[7] > all prev numbers except I[6])**
+DP:
+10  [1, 1, 1, 1, 1, 1, 1, 1]
+9   [1, 1, 1, 1, 1, 1, 1, 1] -> DP[1] = 1 + max([]) (I[1] not > I[0])
+2   [1, 1, 1, 1, 1, 1, 1, 1] -> DP[2] = 1 + max([]) (I[2] not > I[0],I[1])
+5   [1, 1, 1, 2, 1, 1, 1, 1] -> DP[3] = 1 + max(DP[2]) (I[3] > I[2] and I[3] not > I[0],I[1])
+3   [1, 1, 1, 2, 2, 1, 1, 1] -> DP[4] = 1 + max(DP[2]) (I[4] > I[2] and I[4] not > I[0],I[1],I[3])
+7   [1, 1, 1, 2, 2, 3, 1, 1] -> DP[5] = 1 + max(DP[1], DP[2], DP[3]) (I[5] > I[1],I[2],I[3] and I[5] not > I[0])
+101 [1, 1, 1, 2, 2, 3, 4, 1] -> DP[6] = 1 + max(DP[0], DP[1], DP[2], DP[3], DP[4], DP[5]) (I[6] > all prev numbers)
+18  [1, 1, 1, 2, 2, 3, 4, 4] -> DP[7] = 1 + max(DP[0], DP[1], DP[2], DP[3], DP[4], DP[5]) (I[7] > all prev numbers except I[6])
+```
```
class Solution:
@@ -37,3 +35,58 @@ class Solution:
            dp[index] = 1 + max([dp[j] for j in range(index) if nums[index] > nums[j]], default=0)
        return max(dp, default=0)
```
+
+## Binary Search Solution
+- Run-time: O(Nlog(N))
+- Space: O(N)
+- N = Number of elements in array
+
+If we instead use a sorted array to keep a subsequence, we can decide with binary search whether or not we can create a larger subsequence.
+For each n, we can binary search the sorted array for the left-most number that is larger than or equal to n.
+If that number is found, we replace it with n.
+If no number is found, we append n to the end of the array; this means we were able to extend the longest subsequence by one.
+
+Since we replace numbers in the sorted array, the sorted array does not represent the 'actual' longest subsequence.
+We need to replace numbers in the array because we can potentially create a larger subsequence later.
+For example, given an input of [1,2,3,99,4,5,6], when we reach number 5, if we hadn't already replaced 99 with 4, the sorted array would still be [1,2,3,99] instead of [1,2,3,4], and we could never extend the subsequence.
+
+```
+Input: [1,3,6,7,9,4,10,5,6]
+
+[]
+[1]
+[1, 3]
+[1, 3, 6]
+[1, 3, 6, 7]
+[1, 3, 6, 7, 9]
+[1, 3, 4, 7, 9]
+[1, 3, 4, 7, 9, 10]
+[1, 3, 4, 5, 9, 10]
+[1, 3, 4, 5, 6, 10]
+```
+
+```
+class Solution(object):
+    def lengthOfLIS(self, nums):
+
+        def binary_search_insertion_idx(n):
+            left, right = 0, len(sorted_subseq) - 1 # right is -1 when the list is empty
+            insert_idx = -1
+            while left <= right:
+                mid_idx = left + ((right - left) // 2)
+                if sorted_subseq[mid_idx] >= n:
+                    insert_idx = mid_idx
+                    right = mid_idx - 1 # go left
+                else: # go right
+                    left = mid_idx + 1
+            return insert_idx
+
+        sorted_subseq = list()
+        for n in nums:
+            idx = binary_search_insertion_idx(n)
+            if idx == -1: # no number found greater than n
+                sorted_subseq.append(n)
+            else: # found a number that is greater than n
+                sorted_subseq[idx] = n
+        return len(sorted_subseq)
+```
diff --git a/leetcode/medium/307_range_sum_query_mutable.md b/leetcode/medium/307_range_sum_query_mutable.md
new file mode 100644
index 0000000..93d7143
--- /dev/null
+++ b/leetcode/medium/307_range_sum_query_mutable.md
@@ -0,0 +1,88 @@
+# 307.
Range Sum Query - Mutable
+
+## Segment Tree Solution
+- Run-time: create_tree() is O(N), sumRange() is O(logN), update() is O(logN)
+- Space: O(N)
+- N = Number of given nums
+
+A segment tree is similar to a binary tree; the main difference is that a segment tree is traversed by ranges of indexes.
+Each node of the tree, instead of storing a single value, stores an aggregate (here, a sum) over a range of indexes.
+Segment trees are good if you have a lot of numbers and need to find the max, min, sum, etc. of a given range.
+
+Naive methods require an O(N) traversal to find the result for a given range.
+Other methods use a 2d array of O(N^2) space and O(N^2) run-time to pre-process every range, but give O(1) look-ups after.
+These methods don't work well when there are billions of numbers, so segment trees are a great alternative.
+
+```
+class NumArray:
+
+    def __init__(self, nums: List[int]):
+        self.tree = SegmentTree(nums)
+
+    def update(self, i: int, val: int) -> None:
+        self.tree.update(i, val)
+
+    def sumRange(self, i: int, j: int) -> int:
+        return self.tree.get_range(i, j)
+
+class SegmentTree(object):
+
+    def __init__(self, nums):
+        self.root = self._create_tree(nums)
+
+    def _create_tree(self, nums):
+
+        def tree_builder(left, right):
+            if left > right:
+                return None
+            if left == right:
+                return Node(nums[left], left, right)
+            mid_idx = (left + right) // 2
+            left_node = tree_builder(left, mid_idx)
+            right_node = tree_builder(mid_idx + 1, right)
+            total_sum = left_node.sum if left_node is not None else 0
+            total_sum += right_node.sum if right_node is not None else 0
+            new_node = Node(total_sum, start=left, end=right, left=left_node, right=right_node)
+            return new_node
+
+        return tree_builder(0, len(nums) - 1)
+
+    def update(self, idx, val):
+
+        def update_helper(curr):
+            if curr.start_idx == curr.end_idx == idx: # leaf
+                curr.sum = val
+                return
+            mid_idx = (curr.start_idx + curr.end_idx) // 2
+            if idx <= mid_idx: # go left
+                update_helper(curr.left)
+            else: # go right
+                update_helper(curr.right)
+            curr.sum = curr.left.sum + curr.right.sum
+
+        update_helper(self.root)
+
+    def get_range(self, l, r):
+
+        def sum_helper(curr, left, right):
+            if left == curr.start_idx and right == curr.end_idx: # total overlap
+                return curr.sum
+            mid_idx = (curr.start_idx + curr.end_idx) // 2
+            if right <= mid_idx: # range is only on the left subtree?
+                return sum_helper(curr.left, left, right)
+            elif left >= mid_idx + 1: # range is only on the right subtree?
+                return sum_helper(curr.right, left, right)
+            # ranges are in both left and right subtrees
+            return sum_helper(curr.left, left, mid_idx) + sum_helper(curr.right, mid_idx + 1, right)
+
+        return sum_helper(self.root, l, r)
+
+class Node(object):
+
+    def __init__(self, _sum, start, end, left=None, right=None):
+        self.start_idx = start
+        self.end_idx = end
+        self.sum = _sum
+        self.left = left
+        self.right = right
+```
diff --git a/leetcode/medium/322_coin_change.md b/leetcode/medium/322_coin_change.md
index 5c1ccaf..f5c8a96 100644
--- a/leetcode/medium/322_coin_change.md
+++ b/leetcode/medium/322_coin_change.md
@@ -30,3 +30,21 @@ class Solution:
                                                         amount_to_min_n_coins[curr_amount-coin]+1)
        return amount_to_min_n_coins[amount] if amount_to_min_n_coins[amount] <= amount else -1
```
+
+```
+import sys
+
+class Solution:
+    def coinChange(self, coins: List[int], amount: int) -> int:
+        coins = list(filter(lambda x: x <= amount, coins)) # remove coins out of range
+        min_amounts = [-1] * (amount+1)
+        min_amounts[0] = 0
+        for a in range(1, amount+1):
+            curr_min = sys.maxsize
+            for coin in coins:
+                prev_min = a - coin
+                if prev_min >= 0 and min_amounts[prev_min] != -1:
+                    curr_min = min(curr_min, min_amounts[prev_min]+1)
+            min_amounts[a] = curr_min if curr_min != sys.maxsize else -1
+        return min_amounts[amount]
+```
diff --git a/leetcode/medium/347_top_k_frequent_elements.md b/leetcode/medium/347_top_k_frequent_elements.md
new file mode 100644
index 0000000..528bc3c
--- /dev/null
+++ b/leetcode/medium/347_top_k_frequent_elements.md
@@ -0,0 +1,28 @@
+## Heap Solution
+
+- Runtime: O(Nlog(K))
+- Space: O(K)
+- N = Number of elements in array
+- K = K frequent elements
+
+We can first iterate through the numbers and count their occurrences, storing them in a dictionary of key: number and value: occurrence.
+Then we can iterate a second time, but over the dictionary.
+Since the question asks for the K most frequent elements, we could sort them, but instead of sorting the entirety of the dictionary elements, we only need to sort K of them.
+This can be achieved by using a heap of size K.
+If we find an occurrence that is greater than what is on top of the heap, we can pop the top off and add our new element.
+This means a min heap is a great fit here.
+
+```
+import heapq
+from collections import Counter
+
+class Solution:
+    def topKFrequent(self, nums: List[int], k: int) -> List[int]:
+        counter_map = Counter(nums)
+        min_heap = list()
+        for num, counter in counter_map.items():
+            if len(min_heap) == k:
+                heapq.heappushpop(min_heap, (counter, num))
+            else:
+                heapq.heappush(min_heap, (counter, num))
+        return [num for counter, num in min_heap] # return a list, not a lazy map object
+```
diff --git a/leetcode/medium/348_design_tic-tac-toe.md b/leetcode/medium/348_design_tic-tac-toe.md
new file mode 100644
index 0000000..c2eeb9f
--- /dev/null
+++ b/leetcode/medium/348_design_tic-tac-toe.md
@@ -0,0 +1,70 @@
+# 348. Design Tic-Tac-Toe
+
+## Solution
+
+- Runtime: O(1)
+- Space: O(N)
+- N = Given N
+
+This is a fair example of production-quality code.
+Make sure to break your logic down into methods when you code, or you may fail this question.
+
+To achieve O(1) run-time per move, we can use a summation system for each row, column and the two diagonals.
+Player 1 will increment by 1 and player 2 will decrement by 1.
+This means that for any given row, column or diagonal, if its sum equals N or -N, one player owns that entire straight line.
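As a quick standalone sanity check of the ±1 bookkeeping described above, here is a hypothetical 3x3 trace (separate from the full class below; the `move` helper and the move sequence are made up for illustration):

```python
# Standalone check of the +-1 summation idea on a hypothetical 3x3 board.
n = 3
row_sum = [0] * n
col_sum = [0] * n
diag1 = diag2 = 0

def move(row, col, player):
    """Apply a move and return the winner (0 if none), using the summation idea."""
    global diag1, diag2
    inc = 1 if player == 1 else -1
    row_sum[row] += inc
    col_sum[col] += inc
    if row == col:
        diag1 += inc
    if row + col == n - 1:
        diag2 += inc
    if n in (abs(row_sum[row]), abs(col_sum[col]), abs(diag1), abs(diag2)):
        return player
    return 0

# Player 1 fills row 0 while player 2 plays elsewhere; only the last move wins.
moves = [(0, 0, 1), (1, 1, 2), (0, 1, 1), (2, 2, 2), (0, 2, 1)]
results = [move(r, c, p) for r, c, p in moves]
```

After the fifth move, `row_sum[0]` reaches 3 == n, so only the final result is non-zero.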
+ +``` +# (0,0) (0,1) (0,2) (0,3) +# (1,0) (1,1) (1,2) (1,3) +# (2,0) (2,1) (2,2) (2,3) +# (3,0) (3,1) (3,2) (3,3) + +class TicTacToe: + + def __init__(self, n: int): + """ + Initialize your data structure here. + """ + self.row_sum = [0] * n + self.col_sum = [0] * n + self.diagonal1 = 0 + self.diagonal2 = 0 + self.n = n + + def move(self, row: int, col: int, player: int) -> int: + """ + Player {player} makes a move at ({row}, {col}). + @param row The row of the board. + @param col The column of the board. + @param player The player, can be either 1 or 2. + @return The current winning condition, can be either: + 0: No one wins. + 1: Player 1 wins. + 2: Player 2 wins. + """ + increment = (1 if player == 1 else -1) + self.row_sum[row] += increment + self.col_sum[col] += increment + if row == col: + self.diagonal1 += increment + if row + col == self.n-1: + self.diagonal2 += increment + if any([self.check_horizontal(player, row), \ + self.check_vertical(player, col), \ + self.check_diagonals(player, row, col)]): + return player + return 0 + + def check_horizontal(self, player, row): + return abs(self.row_sum[row]) == self.n + + def check_vertical(self, player, col): + return abs(self.col_sum[col]) == self.n + + def check_diagonals(self, player, row, col): + return abs(self.diagonal1) == self.n or abs(self.diagonal2) == self.n + +# Your TicTacToe object will be instantiated and called as such: +# obj = TicTacToe(n) +# param_1 = obj.move(row,col,player) +``` diff --git a/leetcode/medium/380_insert_delete_getRandom_O(1).md b/leetcode/medium/380_insert_delete_getRandom_O(1).md new file mode 100644 index 0000000..5198643 --- /dev/null +++ b/leetcode/medium/380_insert_delete_getRandom_O(1).md @@ -0,0 +1,59 @@ +# 380. Insert Delete GetRandom O(1) + +## Solution +- Run-time: O(1) +- Space: O(N) +- N = Number of values + +When we insert or delete at O(1), we think of a dictionary. +When we getRandom() at O(1) we think of a randomized index from an array. 
+If we merge the two, we can achieve O(1).
+We just have to do some clever swapping and popping from the last element of the array.
+
+We use the dictionary to store each value's index in the array.
+When we insert, we append the new value to the end of the array and record its value-to-index relationship in the dictionary.
+
+When it comes to removing the value, we can fetch the value's corresponding index, then swap it with the last element in the array.
+Then we can just pop the last element from the array.
+This helps us achieve O(1) run-time.
+
+```
+import random
+
+class RandomizedSet:
+
+    def __init__(self):
+        """
+        Initialize your data structure here.
+        """
+        self.val_to_idx = dict()
+        self.values = list()
+
+    def insert(self, val: int) -> bool:
+        """
+        Inserts a value to the set. Returns true if the set did not already contain the specified element.
+        """
+        if val in self.val_to_idx:
+            return False
+        self.values.append(val)
+        self.val_to_idx[val] = len(self.values) - 1
+        return True
+
+
+    def remove(self, val: int) -> bool:
+        """
+        Removes a value from the set. Returns true if the set contained the specified element.
+        """
+        if val not in self.val_to_idx:
+            return False
+        idx = self.val_to_idx[val]
+        self.val_to_idx[self.values[-1]] = idx
+        self.values[idx], self.values[-1] = self.values[-1], self.values[idx]
+        del self.val_to_idx[val]
+        self.values.pop()
+        return True
+
+    def getRandom(self) -> int:
+        """
+        Get a random element from the set.
+        """
+        return self.values[random.randrange(0, len(self.values))]
+```
diff --git a/leetcode/medium/394_decode_string.md b/leetcode/medium/394_decode_string.md
new file mode 100644
index 0000000..04d36f7
--- /dev/null
+++ b/leetcode/medium/394_decode_string.md
@@ -0,0 +1,65 @@
+# 394. Decode String
+
+## Iterative Stack Solution
+
+- Runtime: O(N)
+- Space: O(N)
+- N = Number of characters in string
+
+This was actually a Google onsite interview question given to me once.
+
+Given "3[a]2[bc]".
This is how the stack should look going from left to right.
+
+```
+['3']
+['3', '[']
+['3', '[', 'a']
+['aaa']
+['aaa', '2']
+['aaa', '2', '[']
+['aaa', '2', '[', 'b']
+['aaa', '2', '[', 'b', 'c']
+['aaa', 'bcbc']
+```
+
+Assuming you understand the stack, we can just go left to right and store each character into the stack.
+When we reach a ']', we have to concatenate or calculate the sub-string.
+The reason we need to store the '[' is to know when to stop and gather the number for each string.
+The '[' will also help when we have k[encoded_string] patterns nested within one another.
+
+For example: "3[a2[c]]"
+
+```
+['3']
+['3', '[']
+['3', '[', 'a']
+['3', '[', 'a', '2']
+['3', '[', 'a', '2', '[']
+['3', '[', 'a', '2', '[', 'c']
+['3', '[', 'a', 'cc']
+['accaccacc']
+```
+
+As in recursion, we need to concatenate each sub-answer together in the end, since we can have multiple sub-strings in our stack.
+
+```
+class Solution:
+    def decodeString(self, s: str) -> str:
+        stack = list()
+        for ch in s:
+            if ch == ']':
+                string = ''
+                while len(stack) != 0 and stack[-1] != '[':
+                    string = stack.pop() + string
+                stack.pop() # pop '['
+                num = ''
+                while len(stack) != 0 and stack[-1].isdigit():
+                    num = stack.pop() + num
+                if num == '':
+                    num = '1'
+                stack.append(string * int(num))
+            else:
+                stack.append(ch)
+        return ''.join(stack)
+```
diff --git a/leetcode/medium/399_evaluate_division.md b/leetcode/medium/399_evaluate_division.md
new file mode 100644
index 0000000..dcc3876
--- /dev/null
+++ b/leetcode/medium/399_evaluate_division.md
@@ -0,0 +1,87 @@
+# 399. Evaluate Division
+
+## DFS Solution
+- Runtime: O(N)
+- Space: O(N)
+- N = Number of unique nodes
+
+The first intuition is to see this relationship: if A/B = 2 then B/A = 1/2.
+This is rather simple math.
+Once this is noticed, you can see that there is a relationship pattern here and a graph approach is possible.
+After that you can build a graph out of the equations and values given and perform a DFS or BFS for the solution. + +``` +from collections import defaultdict + +class Solution: + def calcEquation(self, equations: List[List[str]], values: List[float], queries: List[List[str]]) -> List[float]: + + def create_graph(): + graph = defaultdict(dict) + for equation, val in zip(equations, values): + start, end = equation + graph[start][end] = val + graph[end][start] = 1.0 / val + return graph + + def dfs(graph, start, end, visited): + if start in visited or start not in graph: + return -1.0 + if start == end: + return 1.0 + visited.add(start) + for neighbor, val in graph[start].items(): + result = dfs(graph, neighbor, end, visited) + if result > 0: + return result * val + return -1.0 + + graph = create_graph() + results = list() + for start, end in queries: + results.append(dfs(graph, start, end, set())) + return results +``` + +## BFS Solution +- Runtime: O(N) +- Space: O(N) +- N = Number of unique nodes + +``` +from collections import deque +from collections import defaultdict + +class Solution: + def calcEquation(self, equations: List[List[str]], values: List[float], queries: List[List[str]]) -> List[float]: + def create_graph(): + graph = defaultdict(dict) + for eq, val in zip(equations, values): + start, end = eq + graph[start][end] = val + graph[end][start] = 1.0 / val + return graph + + def bfs(start, end, results): + queue = deque([(start, 1.0)]) + visited = set([start]) + while len(queue) != 0: + node, curr_prod = queue.pop() + if node not in graph: + continue + if node == end: + results.append(curr_prod) + break + for neighbor, val in graph[node].items(): + if neighbor not in visited: + visited.add(neighbor) + queue.appendleft((neighbor, curr_prod * val)) + else: + results.append(-1.0) + + graph = create_graph() + results = list() + for start, end in queries: + bfs(start, end, results) + return results +``` diff --git 
a/leetcode/medium/416_partition_equal_subset_sum.md b/leetcode/medium/416_partition_equal_subset_sum.md
new file mode 100644
index 0000000..8274902
--- /dev/null
+++ b/leetcode/medium/416_partition_equal_subset_sum.md
@@ -0,0 +1,107 @@
+# 416. Partition Equal Subset Sum
+
+## Recursive Solution
+- Run-time: O(2^N)
+- Space: O(N)
+- N = Number of Nums
+
+Splitting an array into two equal parts can be rephrased as: find a sub-array whose sum is half of the total sum.
+With this we can generate a solution that focuses on finding a combination of numbers that adds up to such a sum.
+
+To find this target sum, we can use recursion and simply ask two questions at each number: do we use this number or not?
+
+Lastly, there are some optimizations we can do. Consider an input of [1,1,1,...,1,1,100] that cannot be partitioned.
+We would end up repeating recursion calls for each 1, asking whether to keep it or not.
+Whenever we chose not to use a number, we would ask the same two questions again for each duplicate.
+This causes a time limit exceeded issue, which we can fix by skipping repeated numbers when we decide not to use a number.
+
+```
+class Solution:
+    def canPartition(self, nums: List[int]) -> bool:
+
+        def does_subset_sum_exist(target, idx=0):
+            if target == 0:
+                return True
+            if target < 0 or idx > len(nums)-1:
+                return False
+            if does_subset_sum_exist(target-nums[idx], idx+1):
+                return True
+            while idx+1 < len(nums) and nums[idx] == nums[idx+1]: # skip duplicate elements
+                idx += 1
+            return does_subset_sum_exist(target, idx+1)
+
+        if len(nums) == 0:
+            return True
+        num_sum = sum(nums)
+        if num_sum % 2: # odd
+            return False
+        return does_subset_sum_exist(num_sum//2)
+```
+
+## Dynamic Programming Solution
+- Run-time: O(T * N)
+- Space: O(T * N)
+- N = Number of Nums
+- T = Sum of Nums divided by 2
+
+The sub-problem of finding the subset sum is essentially a 0/1 knapsack problem.
+This particular variant of the knapsack problem can be solved using a 2d array of booleans.
+What identifies this as a 0/1 knapsack is the fact that we cannot split the numbers; if we could, a greedy algorithm would work here. If splitting were allowed, we could sort the numbers, take the smallest ones until we reach the target, and split the difference for the result.
+
+We can construct a solution starting at sum 0 up to the target sum.
+Each column will represent the nums and each row will represent a sum.
+
+```
+Given [1,2,3,4]
+Rows: sums from 0 to target sum
+Columns: numbers
+
+Initial DP:
+     0  1  2  3  4
+0  [[T, F, F, F, F],
+1   [F, F, F, F, F],
+2   [F, F, F, F, F],
+3   [F, F, F, F, F],
+4   [F, F, F, F, F],
+5   [F, F, F, F, F]]
+
+Final DP:
+     0  1  2  3  4
+0  [[T, T, T, T, T],
+1   [F, T, T, T, T],
+2   [F, F, T, T, T],
+3   [F, F, T, T, T],
+4   [F, F, F, T, T],
+5   [F, F, F, T, T]]
+```
+
+The intuition comes from the two questions we asked earlier: whether to use this number or not.
+
+Not using this number means looking at the same row (the current sum) in the previous number's column.
+That is because if we skip this number, the previous numbers alone must already add up to the current sum, so the previous column needs to be True.
+
+Using this number means looking at the previous number's column in the row for current sum - current number.
+That would mean some subset of nums[:i] must have summed to current sum - nums[i], i.e. been True, for us to have sum(subset of nums[:i]) + nums[i] == current sum.
+ +dp\[curr_sum][curr_num] = dp\[curr_sum][prev_num] or dp\[curr_sum-curr_num][prev_num] + +``` +class Solution: + def canPartition(self, nums: List[int]) -> bool: + + def does_subset_sum_exist(target): + dp = [[False] * (len(nums)+1) for _ in range(target+1)] + dp[0][0] = True + for curr_sum in range(0, target+1): + for n_idx, num in enumerate(nums, 1): + dp[curr_sum][n_idx] = dp[curr_sum][n_idx-1] or (dp[curr_sum-num][n_idx-1] if curr_sum-num >= 0 else False) + return dp[-1][-1] + + if len(nums) == 0: + return True + num_sum = sum(nums) + if num_sum % 2: # odd + return False + return does_subset_sum_exist(num_sum//2) +``` diff --git a/leetcode/medium/449_serialize_and_deserialize_BST.md b/leetcode/medium/449_serialize_and_deserialize_BST.md new file mode 100644 index 0000000..dd65236 --- /dev/null +++ b/leetcode/medium/449_serialize_and_deserialize_BST.md @@ -0,0 +1,44 @@ +# 449. Serialize and Deserialize BST + +## Solution +- Runtime: O(N) +- Space: O(N) +- N = Number of nodes in tree + +We can build and save the tree with either preorder or postorder traversals. +This solution reuses the solution from question 1008. 
+
+```
+class Codec:
+
+    curr_idx = 0
+
+    def serialize(self, root):
+
+        def get_preorder(root):
+            if root is None:
+                return ''
+            return ','.join([str(root.val), get_preorder(root.left), get_preorder(root.right)])
+
+        return get_preorder(root)
+
+
+    def deserialize(self, data):
+
+        def get_bst_from_preorder(preorder):
+
+            def bst_builder(left_bound=float('-inf'), right_bound=float('inf')):
+                if self.curr_idx >= len(preorder) or not left_bound < preorder[self.curr_idx] <= right_bound:
+                    return None
+                val = preorder[self.curr_idx]
+                root = TreeNode(val)
+                self.curr_idx += 1
+                root.left = bst_builder(left_bound, val)
+                root.right = bst_builder(val, right_bound)
+                return root
+
+            self.curr_idx = 0 # reset so deserialize can be called more than once
+            preorder = [int(x) for x in preorder.split(',') if x != '']
+            return bst_builder()
+
+        return get_bst_from_preorder(data)
+```
diff --git a/leetcode/medium/494_target_sum.md b/leetcode/medium/494_target_sum.md
index 97450ea..a249e0f 100644
--- a/leetcode/medium/494_target_sum.md
+++ b/leetcode/medium/494_target_sum.md
@@ -1,7 +1,7 @@ # 494.
Target Sum ## Recursive Brute Force -- Runtime: 2^N +- Run-time: 2^N - Space: 2^N - N = Number of elements in array @@ -12,22 +12,19 @@ Noticed how the run-time is not big O of 2^N, its because this brute force will class Solution: def findTargetSumWays(self, nums: List[int], S: int) -> int: - def find_sum_ways_helper(nums, curr_sum, start_i): - if curr_sum == S and start_i >= len(nums): + def sum_helper(curr_sum, idx): + if curr_sum == S and idx >= len(nums): return 1 - elif start_i >= len(nums): + if idx >= len(nums): return 0 - n_sums = 0 - n_sums += find_sum_ways_helper(nums, curr_sum + nums[start_i], start_i+1) - n_sums += find_sum_ways_helper(nums, curr_sum - nums[start_i], start_i+1) - return n_sums + return sum_helper(curr_sum+nums[idx], idx+1) + sum_helper(curr_sum-nums[idx], idx+1) - return find_sum_ways_helper(nums, 0, 0) + return sum_helper(0, 0) ``` ## Iterative Solution with Map -- Runtime: O(N^2) -- Space: O(N^2) +- Run-time: O(2^N) +- Space: O(2^N) - N = Number of elements in array We can use a dictionary to keep track of the sums and how many paths there are for each sum. @@ -35,11 +32,22 @@ We just need to maintain a rolling dictionary as we traverse across the numbers. Each traversal we will create new sums and add them into a new dictionary. We will move the values across from the old dictionary as well. -For the run-time, you may think that the run-time hasn't changed. -Why is this an improvement? -An input like [1,10,100,1000,10000...] will achieve N^2 run time. -However, given any other input, since its add and subtract and not multiply or divide, its unlikely and its more likely we will have overlapping sums. -So the run time is actually less than O(2^N) on most cases while the brute force solution above will always be ran at 2^N. +The dictionary is used to exploit the fact that there can be overlapping sums. 
+You can imagine the dictionary being used at each height/level of the recursion tree, gathering all the sums from the previous level and reusing them to recalculate the sums for the current height.
+
+```
+ Sums for each height, Key: sum, Val: n_paths
+          1           {1: 1, -1: 1}
+        +/ \-
+    1         1       {2: 1, 0: 2, -2: 1}
+  +/ \-     +/ \-
+ 1     1   1     1    {3: 1, 1: 3, -1: 3, -3: 1}
+```
+
+You may think to yourself that the run-time hasn't changed.
+You are correct: given a set of numbers that creates unique sums at each height of the tree, this would still be O(2^N).
+However, since this question uses addition and subtraction, it is likely there will be overlapping sums.
+So the run-time is actually less than O(2^N) in the average case.
 
 ```
 from collections import defaultdict
diff --git a/leetcode/medium/518_coin_change_2.md b/leetcode/medium/518_coin_change_2.md
new file mode 100644
index 0000000..e4a2fc7
--- /dev/null
+++ b/leetcode/medium/518_coin_change_2.md
@@ -0,0 +1,78 @@
+# 518. Coin Change 2
+
+## Sub-Optimal Dynamic Programming Solution
+- Runtime: O(A(C^2))
+- Space: O(AC)
+- C = Number of Coins
+- A = Amount
+
+In this first attempt, I will use an example to illustrate the eventual optimal solution.
+To understand the solution, let's look at this example:
+
+Given coins: [1,2,5]
+```
+Amounts : Coin Combos : Total Combos
+0: [0] = 1
+1: [1] = 1
+2: [1+1, 2] = 2
+3: [1+1+1, 2+1] = 2
+4: [1+1+1+1, 2+1+1, 2+2] = 3
+5: [1+1+1+1+1, 2+1+1+1, 2+2+1, 5] = 4
+```
+
+You should notice a pattern here: if we are at amount 5, we can figure out the combinations by using the previous combinations.
+So if we are at amount=5 and coin=2, we can take 5-2=3, which is a previous amount.
+We find that at amount=3 and coin=2, there is 2+1 as a combination, so we can create 2+1+2 as a new combination.
+
+You can also think of this as two questions: given a previous amount, what is the number of combinations with my coin?
+Also, what is the number of combinations without my coin?
+
+Since the question asks for the number of combinations, we can store the number of combinations per coin for each amount in a dynamic programming 2d array.
+Rows are the amounts and the coins are the columns.
+
+However, you will notice that we have to sum up the previous combinations across all the coins for every cell.
+This eats up our run-time.
+
+```
+class Solution:
+    def change(self, amount: int, coins: List[int]) -> int:
+        coins = list(filter(lambda x: x <= amount, coins))
+        n_combos = [[0] * (len(coins)+1) for _ in range(amount+1)]
+        n_combos[0][0] = 1
+        for a in range(amount+1):
+            for c_idx, coin in enumerate(coins, 1):
+                if coin == a:
+                    n_combos[a][c_idx] += 1
+                elif coin < a: # guard against a negative previous amount
+                    prev_amount = a - coin
+                    n_combos[a][c_idx] += sum(n_combos[prev_amount][c] for c in range(0, c_idx+1))
+        return sum(n_combos[-1])
+```
+
+## Optimal Dynamic Programming Solution
+- Runtime: O(AC)
+- Space: O(A)
+- C = Number of Coins
+- A = Amount
+
+The previous solution was essentially recalculating the combinations of every previous coin at each cell.
+
+Instead of building a 2d array, we can just build a 1d array, where each element represents the number of combinations for one amount from 0 to the total amount.
+
+By condensing each row from the previous solution, we no longer need to sum up all previous combinations.
+We can just take all combinations found so far from the last coin and build upon them for the next coin.
+
+So it is important that the loops are in this order: instead of iterating by amounts first, we will iterate by coins.
+Remember, the idea is to build the next combinations from the previous coin. If we start at coin 1, we look back at the base case (no coins) to see what its combinations were; then we move on to coin 2, which looks back at what coin 1 did; then coin 5 looks at coin 2, and so forth.
+
+```
+class Solution:
+    def change(self, amount: int, coins: List[int]) -> int:
+        coins = list(filter(lambda x: x <= amount, coins))
+        n_combos = [0] * (amount+1)
+        n_combos[0] = 1
+        for coin in coins:
+            for a in range(coin, amount+1):
+                n_combos[a] += n_combos[a-coin]
+        return n_combos[amount]
+```
diff --git a/leetcode/medium/560_subarray_sum_equals_k.md b/leetcode/medium/560_subarray_sum_equals_k.md
new file mode 100644
index 0000000..8341bbb
--- /dev/null
+++ b/leetcode/medium/560_subarray_sum_equals_k.md
@@ -0,0 +1,63 @@
+# 560. Subarray Sum Equals K
+
+## Brute-Force Solution
+
+- Runtime: O(N^2)
+- Space: O(1)
+- N = Number of elements in array
+
+Simple iteration over all possible sub-arrays.
+
+```
+class Solution:
+    def subarraySum(self, nums: List[int], k: int) -> int:
+        n_subarrays = 0
+        for start in range(len(nums)):
+            rolling_sum = 0
+            for end in range(start, len(nums)):
+                rolling_sum += nums[end]
+                if rolling_sum == k:
+                    n_subarrays += 1
+        return n_subarrays
+```
+
+## Dictionary Solution
+
+- Runtime: O(N)
+- Space: O(N)
+- N = Number of elements in array
+
+To visualize this solution, let's take this example.
+The dashes represent a solution set; the result is 3.
+Let's look at one of the solution sets.
+
+```
+k=1
+[-1,1,-1,1]
+    ------  <-- k
+ ---------  <-- sum
+ --         <-- x
+
+x = sum - k
+-1 = 0 - 1
+```
+
+To find x, we need to take the current rolling sum and subtract k from it.
+So if we use a hash map (Key: sum, Value: occurrence) and iterate from the beginning to the end, storing the rolling sums in the hash map,
+we can then check whether x exists in our hash map to tell that we can make a sub-array.
+
+```
+from collections import defaultdict
+
+class Solution:
+    def subarraySum(self, nums: List[int], k: int) -> int:
+        sum_map = defaultdict(int)
+        rolling_sum = n_subarrays = 0
+        sum_map[0] = 1
+        for n in nums:
+            rolling_sum += n
+            if sum_map[rolling_sum-k] != 0:
+                n_subarrays += sum_map[rolling_sum-k]
+            sum_map[rolling_sum] += 1
+        return n_subarrays
+```
diff --git a/leetcode/medium/621_task_scheduler.md b/leetcode/medium/621_task_scheduler.md
index bf2bce2..d9696d1 100644
--- a/leetcode/medium/621_task_scheduler.md
+++ b/leetcode/medium/621_task_scheduler.md
@@ -1,10 +1,9 @@
 # 621. Task Scheduler
 
 ## Heap solution
-- Runtime: O(Nlog(U))
-- Space: O(U)
+- Runtime: O(N) or O(N log(26))
+- Space: O(1) or at most 26 entries
 - N = Number of elements in array
-- U = Number of unique elements in array
 
 This question requires a greedy algothrim.
 We want to use the task that occurs the most first so we can reduce the amount of idle time there is.
@@ -30,27 +29,33 @@
 ABCADEAFGA--A--A
 16
 ```
+
+We can first count the occurrences of each character and place them into a heap.
+Each element in the heap will represent a different character of A-Z.
+We can then pop off at most n+1 items from the heap.
+Each popped-off item will have its occurrence count decremented and be placed back into the heap if non-zero.
+We can repeat this process until there is nothing left in the heap.
+
+Since we will have at most 26 characters in the heap, due to the restriction of the A-Z character range, we can assume that sorting the heap has a constant run-time of O(log(26)) or O(1).
+However, we will have to re-sort the heap O(N) times, once for each element in the input.
+You can also think of it this way: since the heap holds just the occurrences of each letter, those occurrences add up to the N elements of the input array.
+
 ```
 from collections import Counter
 
 class Solution:
     def leastInterval(self, tasks: List[str], n: int) -> int:
-        n_intervals = 0
-        ch_to_count = Counter(tasks)
-        max_heap = [-count for count in ch_to_count.values()]
+        counter = Counter(tasks)
+        max_heap = list([-freq for freq in counter.values()])
         heapq.heapify(max_heap)
-        while len(max_heap) > 0:
+        n_intervals = 0
+        while len(max_heap):
             popped_items = list()
-            for _ in range(n+1):
-                if len(max_heap) > 0:
-                    popped_items.append(heapq.heappop(max_heap))
-                else:
-                    break
-
-            max_heap += [count+1 for count in popped_items if count+1 != 0]
-            heapq.heapify(max_heap)
-
-            n_intervals += len(popped_items) if len(max_heap) == 0 else n+1
-
+            for _ in range(min(len(max_heap), n+1)):
+                popped_items.append(heapq.heappop(max_heap))
+            for freq in popped_items:
+                if freq != -1:
+                    heapq.heappush(max_heap, freq+1)
+            n_intervals += n+1 if len(max_heap) else len(popped_items)
         return n_intervals
 ```
diff --git a/leetcode/medium/743_network_delay_time.md b/leetcode/medium/743_network_delay_time.md
new file mode 100644
index 0000000..59c5822
--- /dev/null
+++ b/leetcode/medium/743_network_delay_time.md
@@ -0,0 +1,104 @@
+# 743. Network Delay Time
+
+## Simple Dijkstra's Algorithm Solution
+- Run-time: O(V^2)
+- Space: O(V)
+- V = Number of Vertices
+
+This version of Dijkstra is fairly straightforward.
+For each iteration from 1 to N, we find the unvisited vertex V with the minimum distance, mark it as visited, and check whether there is a better distance from K to each neighbor of V.
+Distances are kept in a dictionary where the key is the node and the value is its distance from K, all initialized to infinity except K, which starts at 0.
+If the distance from K to V plus the weight of the edge from V to a neighbor is less than the neighbor's current shortest distance, then we have found a shorter path to that neighbor.
+
+We end up with a dictionary where each distance represents the shortest path from K to that node.
+ +[Dijkstra's Algorithm Video](https://www.youtube.com/watch?v=pVfj6mxhdMw&t) + +``` +from collections import defaultdict + +class Solution: + def networkDelayTime(self, times: List[List[int]], N: int, K: int) -> int: + + def create_adj_list(times): + adj_list = defaultdict(list) + for source, to, weight in times: + adj_list[source].append((to, weight)) + return adj_list + + distances = defaultdict(lambda: float('inf')) + for n in range(1, N+1): + distances[n] + distances[K] = 0 + visited = set() + adj_list = create_adj_list(times) + while len(visited) != N: + distance, vertex = min([(d, v) for v, d in distances.items() if v not in visited]) + visited.add(vertex) + for neighbor, weight in adj_list[vertex]: + distances[neighbor] = min(distances[neighbor], distances[vertex] + weight) + result = max(distances.values()) + return result if result != float('inf') else -1 +``` + +## Dijkstra's Algorithm Solution with Heaps + +- Run-time: O((V + E)logV) +- Space: O(V) +- V = Vertices +- E = Edges + +As explained in question #1135, there is no adaptable heap implementation in Python 3.7. +Instead, we will be using a lazy deletion method, similar to question #1135. + +The only difference between Prim vs. Dijkstra is adding/updating with the previous weights in the heap nodes. +In Prim's we only care about the next weight. +In Dijkstra, since we want the shortest paths, we need to reevaluate the weights by checking if we can produce an even smaller weight. +Therefore, we need to compare between the shortest path to neighboring vertex vs. the current vertex's shortest path + current weight to neighboring vertex. 
+ +``` +from collections import defaultdict + +class Solution: + def networkDelayTime(self, times: List[List[int]], N: int, K: int) -> int: + + def create_adj_list(): + adj_list = defaultdict(list) + for source, target, time in times: + adj_list[source].append([target, time]) + return adj_list + + def add_target(target, time): + new_node = [time, target, False] + target_to_heap_node[target] = new_node + heapq.heappush(min_heap, new_node) + + adj_list = create_adj_list() + start_node = [0, K, False] # time, source, remove + target_to_heap_node = dict({K: start_node}) # key=target, val=time + min_heap = list([start_node]) + times = defaultdict(lambda: float('inf')) # key=source, val=time + times[K] = 0 + visited = set() + while min_heap: + time, source, remove = heapq.heappop(min_heap) + if remove: # lazy delete + continue + visited.add(source) + for next_target, t in adj_list[source]: + if next_target in visited: + continue + next_time = t + time + if next_target in target_to_heap_node: + node = target_to_heap_node[next_target] + if next_time < node[0]: + node[2] = True # lazy delete + add_target(next_target, next_time) + else: + add_target(next_target, next_time) + target_to_heap_node.pop(source) + times[source] = min(times[source], time) + return max(times.values()) if len(times) == N else -1 +``` diff --git a/leetcode/medium/767_reorganize_string.md b/leetcode/medium/767_reorganize_string.md new file mode 100644 index 0000000..92f3c27 --- /dev/null +++ b/leetcode/medium/767_reorganize_string.md @@ -0,0 +1,50 @@ +# 767. Reorganize String + +## Greedy Heap Solution +- Runtime: O(Nlog(N)) +- Space: O(N) +- N = Number of characters in string + +Playing around with different examples like 'aabb', 'aabbc', 'aabbcc' or 'aaabbc'. +We can see a pattern where a greedy approach can be taken. +We can build the string by taking the most occurring element. +This leads us to using a max heap to determine this. 
+We will need a dictionary to count the occurrences and use that to store tuples of (occurrences, character) pairs in the heap.
+
+Another thing we notice is that the most occurring element on top of the heap can be the same as the last character we just used to build the string, for example, 'aaaabb'. This means we need to pop two elements from the heap to guarantee that we can use at least one of these characters.
+
+The last case is when we cannot build a valid string.
+After the loop above has built the longest valid string possible, we can detect this by checking whether the one element left in the heap has more than one occurrence remaining.
+
+```
+from collections import Counter
+
+class Solution:
+    def reorganizeString(self, S: str) -> str:
+        counter = Counter(S)
+        max_heap = list((-v, k) for k, v in counter.items())
+        heapq.heapify(max_heap)
+        str_builder = list()
+        while len(max_heap) >= 2:
+            val1, ch1 = heapq.heappop(max_heap)
+            val2, ch2 = heapq.heappop(max_heap)
+            if len(str_builder) == 0 or (len(str_builder) and str_builder[-1] != ch1):
+                str_builder.append(ch1)
+                if val1 != -1:
+                    heapq.heappush(max_heap, (val1+1, ch1))
+            else:
+                heapq.heappush(max_heap, (val1, ch1))
+            if len(str_builder) and str_builder[-1] != ch2:
+                str_builder.append(ch2)
+                if val2 != -1:
+                    heapq.heappush(max_heap, (val2+1, ch2))
+            else:
+                heapq.heappush(max_heap, (val2, ch2))
+        if len(max_heap): # last node in heap
+            val, ch = heapq.heappop(max_heap)
+            if val != -1:
+                return ''
+            else:
+                str_builder.append(ch)
+        return ''.join(str_builder)
+```
diff --git a/leetcode/medium/912_sort_an_array.md b/leetcode/medium/912_sort_an_array.md
new file mode 100644
index 0000000..c49c705
--- /dev/null
+++ b/leetcode/medium/912_sort_an_array.md
@@ -0,0 +1,102 @@
+# 912.
Sort an Array
+
+## Quick Sort In-Place
+
+- Runtime: O(Nlog(N)), worst case O(N^2)
+- Space: O(log(N))
+- N = Number of elements in list
+
+It is important to know the inner workings of a quicksort and how to implement it.
+
+Quick sort works by selecting a pivot, usually the last index of the array.
+Each number in the partition is then compared to the pivot.
+We keep a separate index pointing to the last unsorted number in the array.
+If we encounter a number that is less than or equal to the pivot, we swap it with the last unsorted number and increment the last unsorted index.
+Lastly, we swap the pivot into the last unsorted index; the pivot is now the only truly sorted number in the entire array and sits in its correct place.
+We then split the array into two halves around the pivot and recurse on each half, which will figure out the correct placement of the next pivots.
+This is where the log(N) comes into play, since each recursive call works on roughly half of the array.
+
+If the pivot is always selected to be the largest number of the partition, we get the worst case scenario of O(N^2), since we never create two partitions.
+
+Note: This in-place quick sort is unstable; the O(log(N)) space comes from the recursion stack.
+
+```
+class Solution(object):
+    def sortArray(self, nums):
+
+        def quick_sort(l_idx, r_idx):
+            if l_idx >= r_idx:
+                return
+            pivot = nums[r_idx]
+            last_unsorted_idx = l_idx
+            for idx in range(l_idx, r_idx):
+                if nums[idx] <= pivot:
+                    nums[last_unsorted_idx], nums[idx] = nums[idx], nums[last_unsorted_idx]
+                    last_unsorted_idx += 1
+            nums[last_unsorted_idx], nums[r_idx] = nums[r_idx], nums[last_unsorted_idx]
+            quick_sort(l_idx, last_unsorted_idx-1)
+            quick_sort(last_unsorted_idx+1, r_idx)
+
+        quick_sort(0, len(nums)-1)
+        return nums
+```
+
+## Merge Sort
+
+- Runtime: O(Nlog(N))
+- Space: O(N)
+- N = Number of elements in list
+
+Merge sort first breaks up the list into elements of one.
+Then it merges those small elements back together by comparing each left and right list.
+Each left and right list that gets returned is in sorted order, so it's a simple two pointer solution to merge the two lists into a larger sorted list. Return this larger sorted list up the recursion and repeat until the entire subset is sorted.
+
+Merge sort is considered a stable sort.
+
+```
+class Solution:
+    def sortArray(self, nums: List[int]) -> List[int]:
+
+        def merge_sort(nums):
+            if len(nums) <= 1:
+                return nums
+            mid_idx = len(nums) // 2
+            left = merge_sort(nums[:mid_idx])
+            right = merge_sort(nums[mid_idx:])
+            left_idx = right_idx = 0
+            merged = list()
+            while left_idx < len(left) and right_idx < len(right):
+                if left[left_idx] <= right[right_idx]: # <= keeps equal elements in order (stable)
+                    merged.append(left[left_idx])
+                    left_idx += 1
+                else:
+                    merged.append(right[right_idx])
+                    right_idx += 1
+            merged += left[left_idx:]
+            merged += right[right_idx:]
+            return merged
+
+        return merge_sort(nums)
+```
+
+## Bubble Sort
+
+- Runtime: O(N^2)
+- Space: O(1)
+- N = Number of elements in list
+
+Each pass of bubble sort essentially pushes the max value of the unsorted portion to the back of the array.
+
+The advantage of bubble sort is its constant memory usage.
+
+```
+class Solution:
+    def sortArray(self, nums: List[int]) -> List[int]:
+        for i in reversed(range(0, len(nums))):
+            for curr in range(0, i):
+                if nums[curr] > nums[curr+1]:
+                    nums[curr], nums[curr+1] = nums[curr+1], nums[curr]
+        return nums
+```
diff --git a/reading_list.md b/reading_list.md
new file mode 100644
index 0000000..1cebb02
--- /dev/null
+++ b/reading_list.md
@@ -0,0 +1,45 @@
+## This is a list of recommended software related books.
+
+I will attempt to update this list as I move forward in my career. Take this as a rough guide to some very good books. I prefer books that are less dry than your average textbook.
+
+### The Basics
+Data structures and algorithms are the bread and butter of creating good code, as are good design and clean structure.
+- Cracking the Coding Interview
+- Elements of Programming Interviews
+- Clean Code by Robert Martin
+- Head First Design Patterns: A Brain-Friendly Guide
+
+### Python
+- Python Tricks: A Buffet of Awesome Python Features
+
+### C++
+TODO
+
+### Linux
+Linux is a very powerful operating system and it's the most common OS used in the software industry. It's the most customizable, flexible OS there is: maintained by many contributors, with few bugs/crashes, high up-time (no need to restart after updates), and less vulnerability to viruses thanks to its permission model. You will stumble upon some flavor of Linux during your software career.
+- The Linux Command Line by William Shotts
+- How Linux Works, 2nd Edition: What Every Superuser Should Know
+- [UNIX and Linux System Administration Handbook (5th Edition)](https://www.amazon.com/gp/product/0134277554/ref=ox_sc_act_title_1?smid=ATVPDKIKX0DER&psc=1)
+
+### Architecture and System Design
+Once you have a few months of programming experience, it's important to learn how to architect your code.
+This will allow you to merge what you know about design patterns into code that is scalable and maintainable throughout the years to come.
+It's also important to dig into designing the system as a whole, a broader picture of the project and the various issues that come with distributed systems.
+- Clean Architecture: A Craftsman's Guide to Software Structure and Design
+- [Designing Data-Intensive Applications Book](https://www.amazon.com/gp/product/1449373321?pf_rd_p=183f5289-9dc0-416f-942e-e8f213ef368b&pf_rd_r=NZSW6YF36GPNR9EM27XB)
+
+### DevOps: Continuous Integration and Continuous Deployment
+You don't necessarily need to understand the full scope of DevOps as a programmer, but quite frankly you should at least understand the perspective of the DevOps engineer you are coding for. Ultimately, your code has to be executed, and if it's coded in a way that is hard to configure, deploy, compile or test, it's going to make the operations team's lives difficult. Additionally, too many companies do not have a good software development pipeline in place. They do not follow the idea of continuous testing per commit, nor do they have automatic ways of mitigating the risk of deploying bad code to production.
+- The DevOps Handbook: How to Create World-Class Agility, Reliability, and Security in Technology Organizations
+
+### Life Skills
+Life skills are rather important, as software engineers tend to silo themselves rather than work together with the team.
+You can't finish the project alone and you have to leverage your teammates at some point to get the job done.
+How you communicate with your peers is just as important as how you code.
+- How to Win Friends & Influence People
+- The Schmuck in My Office: How to Deal Effectively with Difficult People at Work
+
+### Management
+Idea meritocracy, an environment where the best idea wins, is an interesting topic indeed.
+It might be too idealistic and hard to achieve, but it is something that would make a team very efficient if applied in modest doses.
+- Principles by Ray Dalio
diff --git a/real_interview_questions/Google/sort_a_partially_sorted_array.md b/real_interview_questions/Google/sort_a_partially_sorted_array.md
new file mode 100644
index 0000000..a45fc84
--- /dev/null
+++ b/real_interview_questions/Google/sort_a_partially_sorted_array.md
@@ -0,0 +1,88 @@
+# Question
+Given an array of positive integers (possibly with duplicates) that has been sorted only by the 28 most significant bits, sort the array completely.
+
+Example 1:
+
+Input: [0, 15, 12, 17, 18, 19, 33, 32]
+Output: [0, 12, 15, 17, 18, 19, 32, 33]
+
+```
+Explanation:
+The integers in their binary representation are:
+ 0 = 0000 0000 0000 0000 0000 0000 0000 0000
+15 = 0000 0000 0000 0000 0000 0000 0000 1111
+12 = 0000 0000 0000 0000 0000 0000 0000 1100
+17 = 0000 0000 0000 0000 0000 0000 0001 0001
+18 = 0000 0000 0000 0000 0000 0000 0001 0010
+19 = 0000 0000 0000 0000 0000 0000 0001 0011
+33 = 0000 0000 0000 0000 0000 0000 0010 0001
+32 = 0000 0000 0000 0000 0000 0000 0010 0000
+
+In sorted order:
+ 0 = 0000 0000 0000 0000 0000 0000 0000 0000
+12 = 0000 0000 0000 0000 0000 0000 0000 1100
+15 = 0000 0000 0000 0000 0000 0000 0000 1111
+17 = 0000 0000 0000 0000 0000 0000 0001 0001
+18 = 0000 0000 0000 0000 0000 0000 0001 0010
+19 = 0000 0000 0000 0000 0000 0000 0001 0011
+32 = 0000 0000 0000 0000 0000 0000 0010 0000
+33 = 0000 0000 0000 0000 0000 0000 0010 0001
+```
+
+```
+Example 2:
+
+Input: [100207, 100205, 100204, 100206, 100203]
+Output: [100203, 100204, 100205, 100206, 100207]
+Explanation:
+The integers in their binary representation are:
+100207 = 0000 0000 0000 0001 1000 0111 0110 1111
+100205 = 0000 0000 0000 0001 1000 0111 0110 1101
+100204 = 0000 0000 0000 0001 1000 0111 0110 1100
+100206 = 0000 0000 0000 0001 1000 0111 0110 1110
+100203 = 0000 0000 0000 0001 1000 0111 0110 1011
+
+In sorted order:
+100203 = 0000 0000 0000 0001 1000 0111 0110 1011
+100204 = 0000 0000 0000 0001 1000 0111 0110 1100
+100205 = 0000 0000 0000 0001 1000 0111 0110 1101
+100206 = 0000 0000 0000 0001 1000 0111 0110 1110
+100207 = 0000 0000 0000 0001 1000 0111 0110 1111
+```
+
+Expected O(n) time solution.
+
+## Solution
+
+- Runtime: O(N)
+- Space: O(1), assuming the result doesn't use space
+- N = Number of elements in array
+
+The idea is to use bucket sort, since the array is already partially sorted by the first 28 bits.
+We iterate from left to right, placing each number into 1 of 16 buckets based on its last 4 bits.
+When the first 28 bits change, we flush the buckets into the result and reset them.
+With this approach, we never need more than 16 buckets at a given time.
+
+```
+def sort_partial_sorted_28b(nums):
+    if len(nums) == 0:
+        return nums
+    mask = ~0 << 4
+    curr_28b = nums[0] & mask
+    buckets = [0] * 16 # 16 buckets -> n occurrences
+    results = list()
+    for num in nums:
+        if (num & mask) != curr_28b: # start sort
+            for bucket, occurrence in enumerate(buckets):
+                for _ in range(occurrence):
+                    results.append(curr_28b | bucket)
+            curr_28b = num & mask # set to next 28 bit group
+            buckets = [0] * 16 # reset
+        # add to buckets
+        buckets[num & 15] += 1
+    for bucket, occurrence in enumerate(buckets):
+        for _ in range(occurrence):
+            results.append(curr_28b | bucket)
+    return results
+```
diff --git a/real_interview_questions/Other/diff_two_strings.md b/real_interview_questions/Other/diff_two_strings.md
new file mode 100644
index 0000000..ad96eb2
--- /dev/null
+++ b/real_interview_questions/Other/diff_two_strings.md
@@ -0,0 +1,57 @@
+# Diff Between Two Strings
+
+Given two strings of uppercase letters source and target, list (in string form) a sequence of edits to convert from source to target that uses the least edits possible.
+
+For example, with strings source = "ABCDEFG", and target = "ABDFFGH" we might return: ["A", "B", "-C", "D", "-E", "F", "+F", "G", "+H"]
+
+More formally, for each character C in source, we will either write the token C, which does not count as an edit; or write the token -C, which counts as an edit.
+
+Additionally, between any tokens that we write, we may write +D where D is any letter, which counts as an edit.
+
+At the end, when reading the tokens from left to right, and not including tokens prefixed with a minus-sign, the letters should spell out target (when ignoring plus-signs.)
+
+In the example, the answer of A B -C D -E F +F G +H has a total of 4 edits (the minimum possible), and ignoring subtraction-tokens, spells out A, B, D, F, +F, G, +H which represents the string target.
+
+If there are multiple answers, use the answer that favors removing from the source first.
+
+Constraints:
+
+[time limit] 5000ms
+
+[input] string source
+2 ≤ source.length ≤ 12
+
+[input] string target
+2 ≤ target.length ≤ 12
+
+[output] array.string
+
+# Solution
+
+The trickiest part of all this is building the intuition for how to handle the fork.
+Notice that when a character from target and source do not match, there are only two possible moves: delete from source or add from target.
+We have to traverse the entire solution space because we don't know ahead of time which choice builds a smaller list.
+After that, it's a simple recursive call, returning the smaller of the two lists.
+
+```
+def diffBetweenTwoStrings(source, target):
+    def diff_helper(s, t):
+        if len(t) == 0 and len(s) > 0:
+            return ['-' + ch for ch in s]
+        elif len(t) > 0 and len(s) == 0:
+            return ['+' + ch for ch in t]
+        elif len(t) == 0 and len(s) == 0:
+            return []
+        if s[0] == t[0]:
+            return [s[0]] + diff_helper(s[1:], t[1:])
+        # s[0] != t[0]
+        result1 = diff_helper(s[1:], t) # skip s, delete s
+        result2 = diff_helper(s, t[1:]) # skip t, add t
+        if len(result1) <= len(result2):
+            return ['-' + s[0]] + result1
+        else:
+            return ['+' + t[0]] + result2

+    return diff_helper(source, target)
+```
diff --git a/system_design/README.md b/system_design/README.md
index 8b67dc6..4105529 100644
--- a/system_design/README.md
+++ b/system_design/README.md
@@ -1,14 +1,23 @@
-### Grokking the System Design Interview
-Start here.
-- https://www.educative.io/collection/5668639101419520/5649050225344512
+I'll be using this section to fill in some content that may not have been covered enough.
+Other than that, this will just be a reference to other material that will explain system design better than I can.
-### YouTube suggestions
-I recommend checking this guy's channel (Success in Tech). I would actually avoid Tushar Roy's channel, the quality of system design isn't on par. Another possible channel is Gaurav Sen.
-- https://www.youtube.com/channel/UC-vYrOAmtrx9sBzJAf3x_xw
-- https://www.youtube.com/channel/UCRPMAqdtSgd0Ipeef7iFsKw
+I would recommend starting with the Grokking the System Design Interview eCourse.
+It is worth the money and will fill in a lot of the gaps for a beginner to system design.
+Use the YouTube channels as a supplement and Google's SRE eBook for context on some real-life scenario debugging.
+I highly recommend picking up the "Designing Data-Intensive Applications" book, a really good read and to the point.
-### Rest Api and HTTP -- https://www.youtube.com/watch?v=rhTkRK53XdQ +## Start Here +- [Grokking the System Design Interview](https://www.educative.io/collection/5668639101419520/5649050225344512) +- https://github.com/donnemartin/system-design-primer -### Other Resources +## YouTube suggestions +I recommend checking out "Success in Tech" and "Gaurav Sen" on YouTube. +I would avoid Tushar Roy's YouTube channel, the quality of system design isn't on par, his channel is only good for DS & Algos. +- [Success in Tech's System Design YouTube Playlist](https://www.youtube.com/watch?v=0163cssUxLA&list=PLA8lYuzFlBqAy6dkZHj5VxUAaqr4vwrka) +- [Gaurav Sen's System Design YouTube Playlist](https://www.youtube.com/watch?v=quLrc3PbuIw&list=PLMCXHnjXnTnvo6alSjVkgxV-VH6EPyvoX) + +## Other Resources +- [Designing Data-Intensive Applications Book](https://www.amazon.com/gp/product/1449373321?pf_rd_p=183f5289-9dc0-416f-942e-e8f213ef368b&pf_rd_r=NZSW6YF36GPNR9EM27XB) - http://highscalability.com/ +- [Free Google SRE eBook](https://landing.google.com/sre/sre-book/toc/) +- [Kubernetes Beginner Comic](https://cloud.google.com/kubernetes-engine/kubernetes-comic/) diff --git a/system_design/System_design.md b/system_design/System_design.md deleted file mode 100644 index 820605a..0000000 --- a/system_design/System_design.md +++ /dev/null @@ -1,36 +0,0 @@ -# CAP Theroem -This was created by Eric Brewer in the 2000s to help model distrubuted system design. -His idea was that a system can provide three basic needs to a user, consistency, avaliability and partition tolerance. -However, a system can only optimized two of the three while the third would have to be relaxed or ignored completely. - -### Consistency -A system operates fully or not at all. -Every read receives the most recent write or an error. - -### Avaliability -A system is always able to answer a request. -Every request recieves a non-error response. 
- -### Partition Tolerance -If a node or more fails, the system can continue to function. -The system continues to operate despite an arbitrary number of messages being dropped by the network between nodes. - -# ACID vs BASE -### ACID -- Atomicity - - If the task or tasks in a transaction must sucessed entirely or all the tasks must fail. -- Consistency - - The transaction must meet all protocols or rules during the beginning and the end of a transaction -- Isolation - - No transaction can access other transactions that are in an unfinished state. Therefore, each transaction is independent from each other. -- Durability - - Once a transaction is complete, it will stay persist even due to failures like power loss or system breakdowns. - -For example, an SQL server may follow the ACID properities. -Banks usually use SQL servers and therefore follow the ACID properities. -If there was a tranasction and a user's account balance was not updated, any requests for that balance would be denied or postponed until that update completes. - -### BASE -- Basically Avaliable -- Soft State -- Evenutal Consistency diff --git a/system_design/Thought_process.md b/system_design/Thought_process.md deleted file mode 100644 index 31aa4e9..0000000 --- a/system_design/Thought_process.md +++ /dev/null @@ -1,7 +0,0 @@ -### 80/20 Rule -Almost every service you are providing will have this 80/20 rule applied to it. The 80/20 rule means, 80% of the most used features will make up 20% of your possible services. -For example, Chrome browser, many people use it for the browsing, gmail, youtube, bookmarks, refresh, print feature, back button, etc... But the average user won't care much about the options, extensions, or the fact that you can open 10 tabs with one button etc... -If you think of the problem this way, you can then focus a lot of your attention on the 20% of your services. How you can setup your system design around this 20%? 
-
-If most of your traffic is concentrated in one area of your services, then you should essentially separate these into two different architectures.
-For example, if an average user posts something on their feed, it may or may not be that important that your friends may get that new message 1 min from now or 5 mins from now. But if a famous celebrity were to post something, a lot more people will notice that feeds would come at different times if your design was setup this way. So this would become a real issue and one way to solve this is to dedicate a network of databases and applications for just the very famous or ones will over X million followers.
diff --git a/system_design/bloom_filters.md b/system_design/bloom_filters.md
new file mode 100644
index 0000000..2f6c39b
--- /dev/null
+++ b/system_design/bloom_filters.md
@@ -0,0 +1,38 @@
+# Bloom Filters
+
+## Use Case
+Given a word, figure out if it already exists or not.
+
+On a system design level, a hash table can work.
+However, if you have a lot of words, say a billion+ words, you start running into performance issues.
+You cannot store all of this in memory, so there will be some overhead with disk input/output and storage.
+You could try to optimize as much as you can, like sharding the data into buckets with sub-hash tables, but this doesn't 100% solve the latency issue.
+
+This is where bloom filters come in; they are popular with databases.
+Imagine an API like check(word) that returns True or False.
+The API is probabilistic: if it gives you a False, it is 100% accurate; if it returns True, it is only roughly 90% accurate, depending on how the filter is tuned.
+The difference is that the bloom filter uses a lot less memory than the hash table method.
+
+## How it works
+1. Start with a bit array of a set size, say 00000000 (8 bits).
+2. Given a word, "cat", we will run this past multiple hash functions, each hash function outputs an index.
+For example, two hash functions hash1('cat') and hash2('cat') give us two indexes, 2 and 5.
+We will then set the bits to 00100100 with respect to those indexes.
+3. Then given another word, "dog", we will run it past the hash functions as well, giving us indexes 7 and 2.
+Again, we set the bit array accordingly, to 00100101.
+4. If we wanted to check if the word "bird" exists, we would run it past the hash functions; for example, it would return indexes 5 and 1.
+Since index 1 isn't set, we know "bird" does not exist.
+5. Similarly, if we tried another word like "lion" and the hash functions returned 2 and 7, the API would believe that the word "lion" exists even though we never saved it.
+
+This is why bloom filters will always accurately report when something doesn't exist but cannot 100% predict that a word does exist.
+To increase the likelihood of being correct, bloom filters use many hash functions; this increases the chance of hitting an index that still contains a zero.
+
+Lastly, since bloom filters use a bit array, we can store the bit array as a string, each character encoding 8, 16 or 32 bits depending on your system.
+This results in something like 'A90bhl158', which represents all the set bits in a condensed manner.
+
+## Limitations
+Bloom filters require a rough estimate of how many unique elements will be stored, since the size of the bit array must be determined beforehand.
+Once the bit array is set, it is hard to change.
+Similarly, once we add an element into the bit array, it is there forever and can never be removed.
+However, there is something called an invertible bloom filter, which can be used to determine which bits to remove.
+I won't be discussing this topic here as it shouldn't be needed for interviews.
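The walk-through above can be sketched in a few lines of Python. This is a toy illustration, not production code: the 64-bit array size, the three salted SHA-256 hashes, and the `BloomFilter` name are all assumptions chosen for demonstration.

```python
import hashlib

class BloomFilter:
    def __init__(self, size=64, n_hashes=3):
        self.size = size          # number of bits in the bit array
        self.n_hashes = n_hashes  # more hashes -> fewer false positives (up to a point)
        self.bits = 0             # the bit array, packed into one integer

    def _indexes(self, word):
        # Simulate multiple hash functions by salting one strong hash.
        for salt in range(self.n_hashes):
            digest = hashlib.sha256(f"{salt}:{word}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, word):
        for idx in self._indexes(word):
            self.bits |= 1 << idx  # set each bit the hashes point at

    def check(self, word):
        # False is always correct; True may be a false positive.
        return all(self.bits & (1 << idx) for idx in self._indexes(word))

bf = BloomFilter()
bf.add('cat')
bf.add('dog')
print(bf.check('cat'))  # True -- 'cat' was added
```

A real deployment would size the bit array and pick the hash count from the expected number of elements and the target false-positive rate.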
diff --git a/system_design/databases.md b/system_design/databases.md
deleted file mode 100644
index fe6e5cf..0000000
--- a/system_design/databases.md
+++ /dev/null
@@ -1,11 +0,0 @@
-# Databases
-
-## Indexes
-
-## SQL vs NoSQL
-
-## ACID vs. BASE
-
-## Read Heavy (Master-Slave Architecture)
-
-## Write Heavy (Database Sharding Architecture)
diff --git a/system_design/edge_cases.md b/system_design/edge_cases.md
new file mode 100644
index 0000000..6803a9c
--- /dev/null
+++ b/system_design/edge_cases.md
@@ -0,0 +1,67 @@
+# Scope out your requirements
+There is no secret sauce or one size fits all design for any problem.
+However, it's good to point out constraints before beginning your design.
+The following are some questions you should always ask in every system design interview to scope out your requirements and constraints.
+Keep the CAP theorem in mind, as there is no single design that is 100% bulletproof for all cases.
+
+- Volume of reads
+- Volume of writes
+- Volume of data stored
+- Complexity of data
+- Response time
+- Access patterns
+
+# 80/20 Rule
+Almost every service you are providing will have this 80/20 rule applied to it.
+The 80/20 rule can be defined in many different ways; it usually means 80% of effects come from 20% of causes.
+
+For example, 80% of the most used features will make up 20% of your possible services, or 20% of your users will generate 80% of the traffic.
+Take the Chrome browser: many people use it for browsing, Gmail, YouTube, bookmarks, refresh, the print feature, the back button, etc...
+But the average user won't care much about the options, extensions, or the fact that you can open 10 tabs with one button etc...
+If you think of the problem this way, you can then focus a lot of your attention on the 20% of your services.
+How can you set up your system design around this 20%?
+
+Another case is when most of your traffic is concentrated in one area of your services; you should then separate these into two different architectures.
+For example, if an average user posts something on their feed, it may or may not be that important whether your friends get that new message 1 min from now or 5 mins from now.
+But if a famous celebrity were to post something, a lot more people would notice that feeds arrive at different times if your design was set up this way.
+So this would become a real issue, and one way to solve it is to dedicate a network of databases and applications to just the very famous, say the ones with over X million followers.
+
+Another case is that maybe 20% of your data is accessed by 80% of your users. You can then consider whether it is possible to store that 20% in a memcache for high availability, while having a separate cache that acts more like a normal LRU cache for the rest of the data.
+
+The 80/20 rule is a great thing to think about after you have laid out your design. This is when you want to further optimize the design.
+
+# Race Conditions
+Identify any part of your system that requires more than one component to share some piece of data.
+For example, if two services need to book a room at a hotel, there can be a scenario where two users want to book the same room. You would need a locking service to block/hold any other requests for a particular room until payment is successfully processed.
+
+If you are making a web crawler, you need to keep track of visited webpages.
+Instead of allowing the web crawlers to manage the visited set, reverse the responsibilities and dedicate a component like a URL manager to distribute the URLs to the web crawlers while keeping track of the visited URLs.
+This eliminates any race conditions between crawlers, since the visited set is handled by one entity.
+
+# Fault Tolerance
+During the interview, a good interviewer will generally start asking the question of "What if this dies?".
+This is to move you to start thinking about single points of failure.
+
+A general response would be to use a master-slave architecture.
+However, there is always the question "What if both die?".
+This is generally the case when a service that was processing a request suddenly dies, losing the request it was working on.
+
+You would then need to start moving your design into a more persistent state, so you can revive the system from scratch using the data stored.
+Another option is to have a second component handle the responses and resend them if there is a timeout.
+However, latency may increase due to the need to write each response to disk, or due to the additional complexity.
+
+# Avoid Tightly Coupled Architecture
+If you were making a financial account aggregator service like Mint.com, you would likely need to interact with many financial APIs to collect data for various accounts.
+
+You should dedicate a section of the design to this task.
+Something like aggregator services and a message queue service with its own message DB for fault tolerance.
+All aggregators interact with a proxy server.
diff --git a/system_design/load_balancing.md b/system_design/load_balancing.md deleted file mode 100644 index d7acc16..0000000 --- a/system_design/load_balancing.md +++ /dev/null @@ -1,54 +0,0 @@ -# Load Balancing -Load balancers are dedicated servers or machines made to reroute traffic or messages across a cluster of machines. -It is generally used in many large systems for this purpose. - -The network of systems can be very complex, having different sub clusters of machines to allow redundancy and avoid any single point of failures. -Load balancers are used to interweave between these clusters to distribute load evenly. -So it is very common to have multiple load balancers, at least one per cluster of machines. -If one machine is getting many tasks sent to it, it is up to the load balancer to elevate that problem. - -Generally, when a request comes in, the load balancer hashes a code, that can be a userID or messageID which will route the request to a machine that the load balancer is responsible for. - -## Techniques to load balancing - -### Random Selection -Generally the most unpopular and least efficient of the approaches. - -#### Pros -- Simple to implement on a small cluster of not that important services. - -#### Cons -- Chance of getting uneven load between the servers depending on the sequence of random selection. - -### Round Robin - -#### Pros - -#### Cons - -### LFU or LRU - -### Pros - -### Cons - -### Modding approach -When a request is sent, generally a number, usually the number of servers available, the request will be hashed aganist the mod. -The result of the mod is the server number the request will be sent to. - -#### Pros -- Fairly common among smaller services with low-medium traffic. - -#### Cons -- Cannot scale well if number of servers change, it will create uneven load during this period of time. Caches for several servers will become useless due to the change and requests will need to recomputed. 
If you were to remove a server, then your algothrim may shift the requests one across, making each server which was dedicated to a specific user, suddenly dealing with different set of users. - -## Consistent Hashing -You can imagine a circle, the circle represents the possible hashes when you hash the request key. -For example, a possible of 100 hashes and 3 servers. you can have hash 0 to 33 to server 1, 34 to 66 to server 2, and 67 to 100 to server 3. -When a request comes in, if the hash lands on any of these ranges, you can determine which server to send to. - -When you add a new server to the group, making a total 4, for example, the circle changes to 0 to 33 to server 1, 34 to 66 to server 2, 67 to 80 to server 3 and 81 to 100 to server 4. Notice that server 1 and server 2 become unchange. The only ones affected are server 3 and server 4, which will require their caches to be redone. Similar affect if you were to remove a server. - -You may notice that this approach creates uneven load, to elevate this problem, you can rehash multiple times to distribute or create more ranges on the ring. Therefore, reducing the range gap between servers when adding or removing servers. - -During a removal or addition of a server, it is generally done one at a time. For example, adding a new server then waiting a duration say an hour to allow the cache to get to a reasonable state before adding another server. diff --git a/system_design/osi_model.md b/system_design/osi_model.md new file mode 100644 index 0000000..0af4f70 --- /dev/null +++ b/system_design/osi_model.md @@ -0,0 +1,31 @@ +# 7 Layer OSI Model +Acronym : "All People Seem To Need Data Processing" + +## 7. Application +- Simple Mail Transfer Protocol(SMTP) File Transfer Protocol(FTP), Telnet, HyperText Transfer Protocol(HTTP), Secure Socket Layer(SSL), +- "Language" that the applications and servers use to communicate with one another. + +## 6. 
Presentation
+- Data formats and character set conversion (e.g. ASCII), encryption.
+
+## 5. Session
+- Establishing and terminating connections.
+
+## 4. Transport
+- TCP, UDP, Ports
+- Data integrity checking, source and destination ports, and specs for breaking application data into packets.
+
+## 3. Network
+- IPv4, IPv6, Routers
+- Defines how to move packets from a source host to a destination host.
+
+## 2. Data Link
+- MAC addresses, switches
+- How physical addresses are added to the data with the use of frames.
+
+## 1. Physical
+- Ethernet, modem
+- How to send raw data across a physical medium.
+
+### Other Resources
+- [OSI Model Youtube Video](https://www.youtube.com/watch?v=LANW3m7UgWs)
diff --git a/system_design/storage.md b/system_design/storage.md
new file mode 100644
index 0000000..56604b9
--- /dev/null
+++ b/system_design/storage.md
@@ -0,0 +1,26 @@
+# Block Storage
+- Data are chopped up into blocks of bytes; the system does not know what is stored until all the blocks are put back together.
+- Data can be retrieved at low latency. Very high performance. However, this means the storage nodes cannot be far apart from each other; generally the servers will be in the same physical location.
+- Cannot deal with many users editing the same file. No locking ability among data.
+- No metadata. Very little overhead.
+- NAS index tables have a max size, limiting the amount of data that can be indexed or stored before a performance hit. Therefore, scalability is limited.
+- Requires backup to an offsite location for redundancy.
+
+# File Storage
+- Stores data in a file hierarchy, known as directories and sub-directories. Similar to how Linux/UNIX systems organize their files.
+- Has a fixed, non-customizable set of metadata for each file: things like file name, creation date, file type, etc.
+- Allows many users to edit the same data.
Has locking features; however, locking is handled by the operating system and not by the file system itself.
+- Designed to be on a local or remote network. Flexible on location, though latency grows the farther away the servers are; performance is not its main concern.
+- NAS index tables have a max size, limiting the amount of data that can be indexed or stored before a performance hit. Therefore, scalability is limited.
+- Requires backup to an offsite location for redundancy.
+
+# Object Storage
+- Stores data in objects; each object has a unique ID, metadata and the actual data itself. Each object is then stored into a bucket, a group of objects, and the user can decide which bucket each object is placed in.
+- Meant to organize unstructured data, whether that is videos, music, documents, or pictures, into a flat organization with flexibly sized buckets.
+- Limits the number of users allowed to edit the same file at a time. If many users do edit the same file, object storage will instead create different versions of that object.
+- Buckets can be stored in multiple nodes and geographic locations. This creates built-in redundancy and improves performance.
+- Allows custom metadata, which enables filtering or processing to find the correct data by storing custom attributes for each object. For example, YouTube could use a category field in this metadata to find cat videos versus dog videos.
+- Since object storage uses a GUID for each object, we can scale the number of objects easily instead of relying on NAS, which uses complex file paths to determine where data are.
+
+## Resources
+- [Block vs. File vs. Object Storage Video](https://www.youtube.com/watch?v=qTGDhvbdPzo)
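The object model described above — GUIDs, buckets, and custom metadata — can be sketched as a toy in-memory store. The `ObjectStore` class and its methods are invented for illustration; real services like S3 expose this model through HTTP APIs.

```python
import uuid

class ObjectStore:
    """Toy model of object storage: each object gets a GUID, user-supplied
    metadata, and lives in a named bucket."""

    def __init__(self):
        self.buckets = {}  # bucket name -> {guid: (metadata, data)}

    def put(self, bucket, data, **metadata):
        # Every object is addressed by a GUID, not a file path.
        guid = str(uuid.uuid4())
        self.buckets.setdefault(bucket, {})[guid] = (metadata, data)
        return guid

    def get(self, bucket, guid):
        return self.buckets[bucket][guid][1]

    def find(self, bucket, **wanted):
        # Filter on custom metadata, e.g. category='cat'.
        for guid, (meta, _) in self.buckets.get(bucket, {}).items():
            if all(meta.get(k) == v for k, v in wanted.items()):
                yield guid

store = ObjectStore()
store.put('videos', b'...', category='cat', title='Cat jumps')
store.put('videos', b'...', category='dog', title='Dog runs')
cat_ids = list(store.find('videos', category='cat'))
print(len(cat_ids))  # 1
```

The metadata filter is what the YouTube example above relies on: finding cat videos means querying the category field rather than walking a directory tree.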