Neetcode - Word Break
Every line is a seed - plant wisely.
Word Break
Given a string s and a dictionary of strings wordDict, return true if s can be segmented into a space-separated sequence of dictionary words.
You are allowed to reuse words in the dictionary an unlimited number of times. You may assume all dictionary words are unique.
Input: s = "neetcode", wordDict = ["neet","code"]
Output: true
Explanation: Return true because "neetcode" can be split into "neet" and "code".
Input: s = "applepenapple", wordDict = ["apple","pen","ape"]
Output: true
Explanation: Return true because "applepenapple" can be split into "apple", "pen" and "apple". Notice that we can reuse words and also not use all the words.
Input: s = "catsincars", wordDict = ["cats","cat","sin","in","car"]
Output: false
This problem is similar to Neetcode - Maximum Product Subarray , where you try all possible combinations. However, it differs in that you can use each word from the dictionary an unlimited number of times. For example, with wordDict = ["a", "b"], you can use "a" as many times as you want.
In my first attempt, I only tried using each word once, which made the problem unsolvable for me. Although I considered combinations and string replacements, I didn’t fully combine these approaches.
To solve this correctly, you need to check each word’s length, move the current position along the string accordingly, and treat each new position as a new start point for further exploration.
Here’s an example to illustrate:
Given the string s = "applepenapple" and a word dictionary wordDict, the DFS process proceeds as follows:
dfs(0, s, wordDict);
Start: 0
Matched word: "apple"
-> dfs(5, s, wordDict);
Start: 5
Matched words: "apple" (no match), then "pen"
-> dfs(8, s, wordDict);
Start: 8
Matched word: "apple"
-> dfs(13, s, wordDict);
Finally, the result is propagated upward. To avoid time limit exceeded (TLE) errors, memoization is applied.
class Solution {
unordered_map <int, bool> memo;
public:
bool wordBreak(string s, vector<string>& wordDict) {
return dfs(0, s, wordDict);
}
bool dfs(int start, string s, vector<string>& wordDict) {
if (start == s.size()) return true;
if (memo.count(start)) return memo[start];
for (const string& word : wordDict) {
int len = word.length();
if (start + len <= s.size() && s.substr(start, len) == word) {
if (dfs(start + len, s, wordDict)) {
return memo[start] = true;
}
}
}
return memo[start] = false;
}
};
DFS with Memoization - CSY
Here, a bottom-up dynamic programming approach is also provided.
class Solution {
public:
bool wordBreak(string s, vector<string>& wordDict) {
int n = s.size();
vector<bool> dp(n + 1, false);
dp[n] = true;
for (int i = n - 1; i >= 0; --i) {
for (const auto& w: wordDict) {
int wn = w.size();
if ((i + wn) <= n && s.substr(i, wn) == w) {
dp[i] = dp[i + wn];
}
if (dp[i]) break;
}
}
return dp[0];
}
};Bottom-Up DP - Neetcode
The procedure would look like this:
Start from i = 12 → down to i = 0
s = "applepenapple"
- index 8 → matches "apple" → dp[13] = true → set dp[8] = true
- index 5 → matches "pen" → dp[8] = true → set dp[5] = true
- index 0 → matches "apple" → dp[5] = true → set dp[0] = true
This task is essentially designed to train those skills. However, it seems I still haven't fully grasped the overlapping substructure involved. In addition, my string manipulation skills are not yet strong enough. These two solutions don’t differ much; it’s mainly a matter of traversing from left to right versus right to left.
Conclusion
Actually, this problem helps strengthen skills essential for natural language processing tasks such as tokenization and sequence planning. Although it may seem like it only tests recursive thinking, it contains deeper insights than it initially appears to.