Master advanced string processing techniques from naive approaches to linear-time algorithms. Learn KMP, Rabin-Karp, Z-Algorithm, Manacher's, and string hashing with complete C# implementations and real LeetCode solutions.
Before diving into advanced algorithms, understand the fundamental properties of strings in C# and how they impact algorithm design and performance.
Strings in C# are immutable—once created, they cannot be changed. Every string modification creates a new string object.
string text = "hello"; string upper = text.ToUpper(); // Creates NEW string // Original text still "hello", upper is "HELLO" string concat = text + " world"; // Creates NEW string
char is a single 16-bit Unicode character, while string is a sequence of chars. Understanding this distinction is crucial for efficient string manipulation.
// char operations are O(1) char c = 'a'; bool isLetter = char.IsLetter(c); // O(1) // string indexing is O(1) string s = "hello"; char first = s[0]; // Direct access, O(1)
C# uses Unicode (UTF-16) for characters. ASCII is a 7-bit subset (0-127). Understanding character encoding is important for pattern matching and hashing.
// Convert char to ASCII value char c = 'A'; int ascii = (int)c; // 65 // Check if ASCII (0-127) bool isAscii = ascii < 128; // Convert ASCII value back to char char fromAscii = (char)65; // 'A'
Use StringBuilder when building strings in loops to avoid O(n²) concatenation overhead.
// BAD: O(n²) due to string immutability string result = ""; for (int i = 0; i < 1000; i++) { result += "x"; // Creates new string each time } // GOOD: O(n) with StringBuilder StringBuilder sb = new StringBuilder(); for (int i = 0; i < 1000; i++) { sb.Append("x"); // Efficient } string final = sb.ToString();
The simplest approach: check every position in the text to see if the pattern matches. Easy to understand but inefficient for large texts with long patterns.
The algorithm is straightforward: for each position i in the text, check if the pattern matches starting at position i.
/// <summary> Brute force pattern search O(n*m) </summary> public static int BruteForceSearch(string text, string pattern) { int n = text.Length; int m = pattern.Length; // Try each position in text for (int i = 0; i <= n - m; i++) { bool match = true; // Check if pattern matches at position i for (int j = 0; j < m; j++) { if (text[i + j] != pattern[j]) { match = false; break; } } if (match) return i; // Found at position i } return -1; // Not found }
Find "BAC" in "ABABAC":
Linear-time pattern matching O(n + m) by avoiding redundant comparisons using a failure function that encodes pattern structure.
The failure function (or LPS—Longest Proper Prefix which is also Suffix) tells us: when we have a mismatch at position j, what is the longest prefix of the pattern that is also a suffix of pattern[0..j-1]? This lets us skip redundant comparisons.
The LPS array is built using a two-pointer technique:
/// <summary> Build LPS (Longest Proper Prefix which is also Suffix) </summary> private static int[] BuildLPS(string pattern) { int m = pattern.Length; int[] lps = new int[m]; int len = 0; // Length of previous longest prefix suffix int i = 1; while (i < m) { if (pattern[i] == pattern[len]) { // Characters match, extend the prefix lps[i] = len + 1; len++; i++; } else { if (len != 0) { // Fall back to previous state len = lps[len - 1]; } else { lps[i] = 0; i++; } } } return lps; }
Use the LPS array to skip comparisons when mismatches occur:
/// <summary> KMP pattern search O(n + m) </summary> public static int KMPSearch(string text, string pattern) { int n = text.Length; int m = pattern.Length; int[] lps = BuildLPS(pattern); int i = 0; // Index for text int j = 0; // Index for pattern while (i < n) { if (pattern[j] == text[i]) { i++; j++; } if (j == m) { // Pattern found at index i - m return i - m; } else if (i < n && pattern[j] != text[i]) { // Use LPS to skip redundant checks if (j != 0) j = lps[j - 1]; else i++; } } return -1; // Not found }
Find "BAC" in "ABABAC":
Hash-based pattern matching: compute rolling hash of text and pattern windows. Excel at finding multiple patterns in text; vulnerable to hash collisions.
Instead of comparing strings character-by-character, compute a hash of each substring and compare hashes. When sliding the window, update the hash efficiently using the rolling hash formula:
new_hash = (old_hash - first_char * base^(m-1)) * base + new_char
/// <summary> Rabin-Karp pattern search with rolling hash </summary> public static int RabinKarpSearch(string text, string pattern) { int n = text.Length; int m = pattern.Length; const long BASE = 256; // Prime for better distribution const long MOD = 101; // Prime modulus to avoid overflow long patternHash = 0; long textHash = 0; long power = 1; // BASE^(m-1) % MOD // Compute BASE^(m-1) % MOD for (int i = 0; i < m - 1; i++) power = (power * BASE) % MOD; // Compute hash of pattern and first window of text for (int i = 0; i < m; i++) { patternHash = (patternHash * BASE + pattern[i]) % MOD; textHash = (textHash * BASE + text[i]) % MOD; } // Slide the window for (int i = 0; i <= n - m; i++) { // Hash match found if (patternHash == textHash) { // Verify actual match (handle hash collisions) bool match = true; for (int j = 0; j < m; j++) { if (text[i + j] != pattern[j]) { match = false; break; } } if (match) return i; } // Calculate hash of next window if (i < n - m) { textHash = (textHash - text[i] * power) % MOD; textHash = (textHash * BASE + text[i + m]) % MOD; // Handle negative modulo in C# if (textHash < 0) textHash += MOD; } } return -1; }
Linear-time pattern matching using a Z-array that stores the length of the longest substring starting from position i that matches a prefix of the string.
The Z-array Z[i] represents the length of the longest substring starting from index i that matches a prefix of the string. Built using the "Z-box" concept for O(n) time.
/// <summary> Build Z-array: Z[i] = longest substring from i matching prefix </summary> private static int[] BuildZArray(string s) { int n = s.Length; int[] z = new int[n]; int l = 0, r = 0; // [l, r] is the rightmost Z-box for (int i = 1; i < n; i++) { if (i > r) { // Outside rightmost Z-box, compute from scratch l = r = i; while (r < n && s[r - l] == s[r]) r++; z[i] = r - l; r--; } else { // Inside rightmost Z-box, use previously computed values int k = i - l; // Corresponding index in Z-box if (z[k] < r - i + 1) { z[i] = z[k]; } else { l = i; while (r < n && s[r - l] == s[r]) r++; z[i] = r - l; r--; } } } return z; }
Concatenate pattern and text with a separator, build Z-array. If Z[i] == m, pattern is found at position i - m - 1 in text.
/// <summary> Z-Algorithm pattern search O(n + m) </summary> public static int ZAlgorithmSearch(string text, string pattern) { // Create concatenated string: pattern + separator + text string combined = pattern + "#" + text; int[] z = BuildZArray(combined); int m = pattern.Length; // Search for pattern in Z-array for (int i = m + 1; i < combined.Length; i++) { if (z[i] == m) { // Found: return position in original text return i - m - 1; } } return -1; }
For string "ABAB": Z = [0, 0, 2, 0]
Find longest palindromic substring in O(n) time by expanding around centers efficiently using a radius array and center tracking.
To handle both odd and even-length palindromes uniformly, insert a sentinel character between each character: "aba" becomes "#a#b#a#".
/// <summary> Find longest palindromic substring using Manacher's algorithm </summary> public static string LongestPalindrome(string s) { if (s.Length <= 1) return s; // Transform string to handle even-length palindromes StringBuilder transformed = new StringBuilder("#"); foreach (char c in s) { transformed.Append(c); transformed.Append('#'); } string t = transformed.ToString(); int n = t.Length; int[] p = new int[n]; // Radius of palindrome at each position int center = 0; // Center of rightmost palindrome found int right = 0; // Right edge of rightmost palindrome int maxLen = 0; int centerIndex = 0; for (int i = 0; i < n; i++) { // Mirror of i around center int mirror = 2 * center - i; // If i is within the right edge, use mirror's radius if (i < right) { p[i] = Math.Min(right - i, p[mirror]); } // Expand around center try { while (i + 1 + p[i] < n && i - 1 - p[i] >= 0 && t[i + 1 + p[i]] == t[i - 1 - p[i]]) { p[i]++; } } catch { } // Update center and right if (i + p[i] > right) { center = i; right = i + p[i]; } // Track longest palindrome if (p[i] > maxLen) { maxLen = p[i]; centerIndex = i; } } // Extract original palindrome from transformed string int start = (centerIndex - maxLen) / 2; return s.Substring(start, maxLen); }
Use polynomial rolling hash to compare substrings in O(1) time after O(n) preprocessing. Essential for competitive programming and substring problems.
Hash of string s = s[0]·base^(n-1) + s[1]·base^(n-2) + ... + s[n-1], all mod MOD.
/// <summary> String hashing with polynomial rolling hash </summary> public class StringHash { private readonly long[] _hash; private readonly long[] _pow; private const long BASE = 31; private const long MOD = 1000000007; public StringHash(string s) { int n = s.Length; _hash = new long[n + 1]; _pow = new long[n + 1]; _pow[0] = 1; // Precompute powers of base for (int i = 1; i <= n; i++) { _pow[i] = (_pow[i - 1] * BASE) % MOD; } // Precompute prefix hashes for (int i = 0; i < n; i++) { _hash[i + 1] = (_hash[i] * BASE + (s[i] - 'a' + 1)) % MOD; } } /// <summary> Get hash of substring s[l..r] in O(1) </summary> public long GetHash(int l, int r) { long h = (_hash[r + 1] - _hash[l] * _pow[r - l + 1]) % MOD; return h < 0 ? h + MOD : h; } /// <summary> Check if two substrings are equal in O(1) </summary> public bool AreEqual(int l1, int r1, int l2, int r2) { if (r1 - l1 != r2 - l2) return false; return GetHash(l1, r1) == GetHash(l2, r2); } }
/// <summary> Find longest duplicate substring using binary search + hashing </summary> public static string LongestDuplicateSubstring(string s) { int left = 1, right = s.Length - 1; string result = ""; while (left <= right) { int mid = (left + right) / 2; string duplicate = FindDuplicateOfLength(s, mid); if (duplicate != "") { result = duplicate; left = mid + 1; } else { right = mid - 1; } } return result; } private static string FindDuplicateOfLength(string s, int len) { StringHash sh = new StringHash(s); var seen = new HashSet<long>(); for (int i = 0; i <= s.Length - len; i++) { long hash = sh.GetHash(i, i + len - 1); if (seen.Contains(hash)) return s.Substring(i, len); seen.Add(hash); } return ""; }
Tree structure for storing and searching strings. Essential for autocomplete, prefix matching, and word searches. Space-time tradeoff: O(n) space for O(m) search.
/// <summary> Trie node with children and word flag </summary> public class TrieNode { public Dictionary<char, TrieNode> children = new(); public bool isWord = false; } /// <summary> Trie for efficient string search and prefix matching </summary> public class Trie { private TrieNode root = new TrieNode(); /// <summary> Insert word into trie O(m) where m = word length </summary> public void Insert(string word) { TrieNode node = root; foreach (char c in word) { if (!node.children.ContainsKey(c)) node.children[c] = new TrieNode(); node = node.children[c]; } node.isWord = true; } /// <summary> Search for exact word O(m) </summary> public bool Search(string word) { TrieNode node = FindNode(word); return node != null && node.isWord; } /// <summary> Check if any word starts with prefix O(m) </summary> public bool StartsWith(string prefix) { return FindNode(prefix) != null; } private TrieNode FindNode(string prefix) { TrieNode node = root; foreach (char c in prefix) { if (!node.children.ContainsKey(c)) return null; node = node.children[c]; } return node; } /// <summary> Get all words with given prefix </summary> public List<string> GetWordsWithPrefix(string prefix) { var result = new List<string>(); TrieNode node = FindNode(prefix); if (node != null) DFS(node, prefix, result); return result; } private void DFS(TrieNode node, string current, List<string> result) { if (node.isWord) result.Add(current); foreach (var pair in node.children) DFS(pair.Value, current + pair.Key, result); } }
Suffix array stores all suffixes sorted lexicographically. LCP (Longest Common Prefix) array helps find repeated substrings and string patterns efficiently.
/// <summary> Build suffix array by sorting all suffixes </summary> public static int[] BuildSuffixArray(string s) { int n = s.Length; var suffixes = new List<(string, int)>(); // Create all suffixes with their original indices for (int i = 0; i < n; i++) suffixes.Add((s.Substring(i), i)); // Sort lexicographically suffixes.Sort((a, b) => a.Item1.CompareTo(b.Item1)); int[] result = new int[n]; for (int i = 0; i < n; i++) result[i] = suffixes[i].Item2; return result; } /// <summary> Build LCP array from suffix array </summary> public static int[] BuildLCP(string s, int[] sa) { int n = s.Length; int[] lcp = new int[n]; int[] rank = new int[n]; // Build rank array (position of suffix in sorted order) for (int i = 0; i < n; i++) rank[sa[i]] = i; int h = 0; // Kasai algorithm for LCP construction for (int i = 0; i < n; i++) { if (rank[i] > 0) { int j = sa[rank[i] - 1]; while (i + h < n && j + h < n && s[i + h] == s[j + h]) h++; lcp[rank[i]] = h; if (h > 0) h--; } } return lcp; }
Find the longest substring that appears at least twice. The answer is the maximum value in the LCP array.
/// <summary> Find longest repeated substring using suffix array </summary> public static string LongestRepeatedSubstring(string s) { int[] sa = BuildSuffixArray(s); int[] lcp = BuildLCP(s, sa); int maxLen = 0; int maxIndex = 0; for (int i = 0; i < lcp.Length; i++) { if (lcp[i] > maxLen) { maxLen = lcp[i]; maxIndex = sa[i]; } } return s.Substring(maxIndex, maxLen); }
15+ LeetCode problems with complete C# solutions using techniques from this chapter.
Find the index of the first occurrence of substring.
// Solution using KMP (optimal O(n+m)) public int StrStr(string haystack, string needle) { if (needle.Length == 0) return 0; return KMPSearch(haystack, needle); }
Solution: Use Manacher's algorithm for O(n) optimal solution.
Check if string is made of repeated pattern using KMP LPS trick.
public bool RepeatedSubstringPattern(string s) { string doubled = s + s; return doubled.IndexOf(s, 1) < s.Length; }
Solution: Use Manacher's algorithm for O(n) optimal solution.
Check if string is made of repeated pattern using KMP LPS trick.
public bool RepeatedSubstringPattern(string s) { // Trick: if s is made of repeated pattern, // then s appears in (s+s)[1..2n-2] string doubled = s + s; return doubled.IndexOf(s, 1) < s.Length; }
Add minimum characters to make string palindrome using KMP on (pattern + '#' + text).
Use KMP failure function directly—answer is LPS[n-1].
public string LongestPrefix(string s) { int[] lps = BuildLPS(s); return s.Substring(0, lps[s.Length - 1]); }
Count distinct subsequences of S that equals T using DP.
Sliding window with character frequency map.
Classic DP: dp[i][j] = edit distance between s1[0..i-1] and s2[0..j-1].
Find all pairs that form palindrome. Use Trie + KMP trick.
DP: check if s3 is interleaving of s1 and s2.
Simulate the look-and-say sequence.
Parse integer from string handling edge cases and whitespace.
Use suffix array with LCP array or DP approach.
DP or greedy with * and ? wildcards.
DP with . and * metacharacters.
Quick reference to choose the right string algorithm for your problem.
| Algorithm | Time | Space | Best For |
|---|---|---|---|
| Brute Force | O(n·m) | O(1) | Small patterns, learning |
| KMP | O(n+m) | O(m) | Single pattern search, optimal |
| Rabin-Karp | O(n+m) | O(1) | Multiple patterns, hashing |
| Z-Algorithm | O(n+m) | O(n+m) | All occurrences, pattern analysis |
| Manacher | O(n) | O(n) | Longest palindrome, optimal |
| Trie | O(m) | O(n·m) | Autocomplete, prefix matching |
| Suffix Array | O(n log n) | O(n) | All substrings, advanced queries |
| String Hash | O(n) | O(n) | Substring comparison, duplicate finding |
Patterns, mistakes, and strategies for string algorithm questions.