Word Break I, II & Concatenated Words

NO IMAGE
1 Star2 Stars3 Stars4 Stars5 Stars 給文章打分!
Loading...

Word Break

連結:https://leetcode.com/problems…

這種找一個詞由多個片語成的題,是拿dp或者dfs來解,dp本質上其實也是dfs。這道題要判斷輸入的詞能否由字典裡的單片語成,那麼可以用個boolean的dp陣列。

initialize dp[s.length() 1], dp[0] = true

dp function: dp[i] = dp[j] & (s[j, i] in dict)

result: dp[s.length()]

第二步的dp function,兩種找的方法,一個是j從0到i – 1迴圈,還有一種是traverse整個dict,j = i – word.length()。當字典很大,s不長的時候,用第一種,當字典不大,但是s很長的時候用第二種。這題現在給的dict是個list不是set沒法O(1)判斷s[j, i] in dict,所以用第二種來寫。

public class Solution {
public boolean wordBreak(String s, List<String> wordDict) {
/* boolean dp[s.length()   1]
* 1. initial: dp[0] = true
* 2. function: dp[i] = dp[j] & (s[j, i] in dict)
* 3. result: dp[s.length()]
*/
boolean[] dp = new boolean[s.length()   1];
dp[0] = true;
for(int i = 1; i < dp.length; i  ) {
for(String word : wordDict) {
int j = i - word.length();
if(j >= 0 && dp[j] && s.substring(j, i).equals(word)) {
dp[i] = true;
break;
}
}
}
return dp[s.length()];
}
}

Word Break II

連結:https://leetcode.com/problems…

和上一題不同的是,這道題要返回所有可能的組合。所以現在dp[i]裡面應該存可以使長度為i所有可能的String裡的最後一個word。dp有兩種寫法,一個就是直接寫成陣列:List[]的形式,不能形成的dp[i] = null。還有一個是用個hashmap:Map<Integer, List>,用hashmap的好處是如果s很長而且用dict能組合成的長度不是很多的話,map用的空間相對少。
dp結束之後,第二步就是通過dp裡面儲存的word,一步一步回溯找到所有結果。

public class Solution {
public List<String> wordBreak(String s, List<String> wordDict) {
/* dp: 
* map<Integer, List<String>> dp
* dp function: put(i, word) if s[j, i] = word & j is a key of dp
* result: dp[s.length()] backtracking
*/
dp.put(0, new ArrayList());
dp.get(0).add("");
for(int i = 1; i < s.length()   1; i  ) {
for(String word : wordDict) {
int j = i - word.length();
if(j >= 0 && s.substring(j, i).equals(word) && dp.containsKey(j)) {
if(!dp.containsKey(i)) dp.put(i, new ArrayList());
dp.get(i).add(word);
}
}
}
List<String> result = new ArrayList();
if(!dp.containsKey(s.length())) return result;
// backtracking
dfs(result, s.length(), "");
return result;
}
Map<Integer, List<String>> dp = new HashMap();
private void dfs(List<String> result, int pos, String cur) {
// base case
if(pos == 0) {
result.add(cur);
return;
}
for(String word : dp.get(pos)) {
int j = pos - word.length();
if(j >= 0 && dp.containsKey(j)) {
dfs(result, j, word   (cur.equals("") ? "" : " "   cur));
}
}
}
}

dp本質上就是dfs,這題可以dfs來做。直接的dfs會超時,考慮記憶化搜尋。兩種方式:一個是用dp[i]來記錄(i,n)有valid的string,參考這個人的部落格:
http://www.cnblogs.com/grandy…

public class Solution {
public List<String> wordBreak(String s, List<String> wordDict) {
List<String> result = new ArrayList();
dp = new boolean[s.length()   1];
Arrays.fill(dp, true);
// backtracking
dfs(result, 0, "", s, wordDict);
return result;
}
boolean[] dp;
private void dfs(List<String> result, int pos, String cur, String s, List<String> wordDict) {
// base case
if(pos == s.length()) {
result.add(cur);
return;
}
if(!dp[pos]) return;
for(String word : wordDict) {
int i = pos   word.length();
if(i <= s.length() && s.substring(pos, i).equals(word)) {
int size = result.size();
dfs(result, i, (cur.equals("") ? "" : cur   " ")   word, s, wordDict);
if(size == result.size()) dp[i] = false;
}
}
}
}

還有一種,直接拿hashmap記錄走過的路,這樣就不會重複搜尋了,和dp其實是一樣的,但是這裡把整個路徑都儲存了,之後就不需要再backtracking找路徑了,參考discussion裡面給的:

public class Solution {
public List<String> wordBreak(String s, List<String> wordDict) {
// backtracking
return dfs(s, wordDict);
}
Map<String, List<String>> map = new HashMap();
private List<String> dfs(String s, List<String> wordDict) {
if(map.containsKey(s)) return map.get(s);
// bottom up
List<String> result = new ArrayList();
if(s.length() == 0) {
result.add("");
return result;
}
for(String word : wordDict) {
int i = word.length();
if(i <= s.length() && s.substring(0, i).equals(word)) {
List<String> subs = dfs(s.substring(i), wordDict);
for(String sub : subs) {
result.add(word   (sub.equals("") ? "" : " "   sub));
}
}
}
map.put(s, result);
return result;
}
}

這種記憶化dfs的寫法原理和path sum的有點像。

Concatenated Words

連結:https://leetcode.com/problems…

這道題可以用dp的方法,和word break一樣,多加個迴圈,複雜度是O(N^3),這道題注意下,字典比較大,用第二種來寫dp function會超時,只能用第一種。

public class Solution {
public List<String> findAllConcatenatedWordsInADict(String[] words) {
List<String> result = new ArrayList();
Set<String> set = new HashSet(Arrays.asList(words));
for(String word : words) {
set.remove(word);
if(wordBreak(set, word)) result.add(word);
set.add(word);
}
return result;
}
private boolean wordBreak(Set<String> words, String s) {
if(s == null || s.length() == 0) return false;
boolean[] dp = new boolean[s.length()   1];
dp[0] = true;
for(int i = 1; i < dp.length; i  ) {
for(int j = i-1; j >= 0; j--) {
if(dp[j] && words.contains(s.substring(j, i))) {
dp[i] = true;
break;
}
}
}
return dp[s.length()];
}
}

看了discussion裡面的優化,感覺很好,思路是一個word要想被其他片語成,其他詞的長度必然是<這個詞的。所以事先對words排序。這個lc裡面一開始沒加“word.length() > 1”的條件,測試裡面會出現一個字母的結果,很神奇啊,到現在也不知道錯在哪。。

public class Solution {
public List<String> findAllConcatenatedWordsInADict(String[] words) {
List<String> result = new ArrayList();
Arrays.sort(words, (a, b) -> a.length() - b.length());
Set<String> set = new HashSet();
for(String word : words) {
if(word.length() > 1 && wordBreak(set, word)) result.add(word);
set.add(word);
}
return result;
}
private boolean wordBreak(Set<String> set, String s) {
if(s == null || s.length() == 0 || set.isEmpty()) return false;
boolean[] dp = new boolean[s.length()   1];
dp[0] = true;
for(int i = 1; i < dp.length; i  ) {
for(int j = i-1; j >= 0; j--) {
if(dp[j] && set.contains(s.substring(j, i))) {
dp[i] = true;
break;
}
}
}
return dp[s.length()];
}
}

不用set儲存word,用trie tree一個一個往裡加word和查詢,其他都和前一種方法一樣。

public class Solution {
public List<String> findAllConcatenatedWordsInADict(String[] words) {
tree = new Trie();
List<String> result = new ArrayList();
Arrays.sort(words, (a, b) -> a.length() - b.length());
for(String word : words) {
if(word.length() > 1 && dfs(word)) result.add(word);
tree.addWord(word);
}
return result;
}
Trie tree;
private boolean dfs(String s) {
if(s.length() == 0) return true;
for(int i = 1; i <= s.length(); i  ) {
if(tree.search(s.substring(0, i))) {
if(dfs(s.substring(i))) return true;
}
}
return false;
}
class TrieNode {
TrieNode[] children = new TrieNode[26];
boolean isWord;
}
class Trie {
TrieNode root;
private int getIndex(char c) {
return c - 'a';
}
public Trie() {
root = new TrieNode();
}
public Trie(String[] words) {
root = new TrieNode();
for(String word : words) addWord(word);
}
public void addWord(String word) {
TrieNode node = root;
for(int i = 0; i < word.length(); i  ) {
if(node.children[getIndex(word.charAt(i))] == null) node.children[getIndex(word.charAt(i))] = new TrieNode();
node = node.children[getIndex(word.charAt(i))];
}
node.isWord = true;
}
public boolean search(String word) {
TrieNode node = root;
for(int i = 0; i < word.length(); i  ) {
if(node.children[getIndex(word.charAt(i))] == null) return false;
node = node.children[getIndex(word.charAt(i))];
}
return node.isWord;
}
}
}

直接用trie tree, 沒有優化,結果stackoverflow了。。

public class Solution {
public List<String> findAllConcatenatedWordsInADict(String[] words) {
tree = new Trie(words);
List<String> result = new ArrayList();
for(String word : words) {
if(word.length() > 1 && dfs(word, 0)) result.add(word);
}
return result;
}
Trie tree;
private boolean dfs(String s, int pos) {
if(s.length() == pos) return true;
for(int i = pos; i <= s.length(); i  ) {
if(pos == 0 && i == s.length()) return false;
if(tree.search(s.substring(pos, i))) {
if(dfs(s, i)) return true;
}
}
return false;
}
class TrieNode {
TrieNode[] children = new TrieNode[26];
boolean isWord;
}
class Trie {
TrieNode root;
private int getIndex(char c) {
return c - 'a';
}
public Trie() {
root = new TrieNode();
}
public Trie(String[] words) {
root = new TrieNode();
for(String word : words) addWord(word);
}
public void addWord(String word) {
TrieNode node = root;
for(int i = 0; i < word.length(); i  ) {
if(node.children[getIndex(word.charAt(i))] == null) node.children[getIndex(word.charAt(i))] = new TrieNode();
node = node.children[getIndex(word.charAt(i))];
}
node.isWord = true;
}
public boolean search(String word) {
TrieNode node = root;
for(int i = 0; i < word.length(); i  ) {
if(node.children[getIndex(word.charAt(i))] == null) return false;
node = node.children[getIndex(word.charAt(i))];
}
return node.isWord;
}
}
}

相關文章

程式語言 最新文章