Class Trie


  • public class Trie
    extends java.lang.Object
    Trie(https://en.wikipedia.org/wiki/Trie) is an efficient information retrieval data structure that we can use to search a word in O(M) time, where M is maximum string length. However the penalty is on trie storage requirements.
    Author:
    Umberto
    • Constructor Summary

      Constructors 
      Constructor Description
      Trie​(boolean caseSensitive, java.nio.charset.Charset charset)
      Constructor.
    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      void add​(java.lang.String word)
      Inserts a word into the trie.
      int countWordStartsWith​(java.lang.String prefix)
      Return how many words starting with prefix.
      java.nio.charset.Charset getCharset()  
      java.util.stream.Stream<Node> getLeafNodes​(Node node)
      Return a List containing the Leaf Nodes starting from a node using Recursive Depth-first search (DFS).
      java.util.List<Node> getLeafNodesJava7​(Node node)
      Return a List containing the Leaf Nodes starting from a node using Recursive Depth-first search (DFS)
      int getNumberOfWords()  
      java.util.Map<java.lang.String,​java.lang.Integer> getSimilarityMap​(java.lang.String word, int maxDistance)
      The search function returns a list of all words that are less than the given maximum distance from the target word, using Levenshtein distance References: http://stevehanov.ca/blog/index.php?id=114 https://en.wikipedia.org/wiki/Levenshtein_distance
      java.util.stream.Stream<java.lang.String> getWordStartsWith​(java.lang.String prefix)
      Return words starting with prefix.
      java.util.List<java.lang.String> getWordStartsWithJava7​(java.lang.String prefix)
      Return words starting with prefix
      void initFalse​(Node node)
      Set to unvisited all the Tries's node.
      boolean isCaseSensitive()  
      void printVector​(java.util.Vector<java.lang.Integer> vec)  
      java.util.Map<java.lang.String,​java.lang.Integer> RecursiveLevenshteinDistance​(Node node, char letter, java.lang.String word, java.util.Vector<java.lang.Integer> previousRow, java.util.Map<java.lang.String,​java.lang.Integer> results, int maxDistance)  
      boolean remove​(java.lang.String word)
      Removes a word from the trie.
      boolean search​(java.lang.String word)
      Returns if the word is in the trie.
      boolean search​(java.lang.String word, boolean doPreprocess)
      Returns if the word is in the trie.
      void setCharset​(java.nio.charset.Charset charset)  
      void show()
      Show The Trie.
      java.lang.String similarity​(java.lang.String word, int maxDistance)
      Returns the word most similar to the target word.
      boolean startsWith​(java.lang.String prefix)
      Returns if there is any word in the trie that starts with the given prefix.
      boolean startsWith​(java.lang.String prefix, boolean doPreprocess)
      Returns if there is any word in the trie that starts with the given prefix.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • Trie

        public Trie​(boolean caseSensitive,
                    java.nio.charset.Charset charset)
        Constructor.
        Parameters:
        caseSensitive - set if this is a case sensitive trie
        charset -
    • Method Detail

      • add

        public void add​(java.lang.String word)
        Inserts a word into the trie.
        Parameters:
        word -
      • remove

        public boolean remove​(java.lang.String word)
        Removes a word from the trie.
        Parameters:
        word -
        Returns:
      • startsWith

        public boolean startsWith​(java.lang.String prefix)
        Returns if there is any word in the trie that starts with the given prefix.
        Parameters:
        prefix -
        Returns:
        true|false
      • startsWith

        public boolean startsWith​(java.lang.String prefix,
                                  boolean doPreprocess)
        Returns if there is any word in the trie that starts with the given prefix.
        Parameters:
        prefix -
        doPreprocess -
        Returns:
        true|false
      • search

        public boolean search​(java.lang.String word)
        Returns if the word is in the trie.
        Parameters:
        word -
        Returns:
        true|false
      • search

        public boolean search​(java.lang.String word,
                              boolean doPreprocess)
        Returns if the word is in the trie.
        Parameters:
        word -
        doPreprocess -
        Returns:
        true|false
      • countWordStartsWith

        public int countWordStartsWith​(java.lang.String prefix)
        Return how many words starting with prefix.
        Parameters:
        prefix -
        Returns:
        how many words starting with prefix
      • getWordStartsWith

        public java.util.stream.Stream<java.lang.String> getWordStartsWith​(java.lang.String prefix)
        Return words starting with prefix.
        Parameters:
        prefix -
        Returns:
        a Stream containing words starting with prefix
      • getLeafNodes

        public java.util.stream.Stream<Node> getLeafNodes​(Node node)
        Return a List containing the Leaf Nodes starting from a node using Recursive Depth-first search (DFS).
        Parameters:
        node -
        Returns:
        a Stream containing the Leaf Nodes
      • getWordStartsWithJava7

        public java.util.List<java.lang.String> getWordStartsWithJava7​(java.lang.String prefix)
        Return words starting with prefix
        Parameters:
        prefix -
        Returns:
        a list containing words starting with prefix
      • getLeafNodesJava7

        public java.util.List<Node> getLeafNodesJava7​(Node node)
        Return a List containing the Leaf Nodes starting from a node using Recursive Depth-first search (DFS)
        Parameters:
        node -
        Returns:
        List containing the Leaf Nodes
      • similarity

        public java.lang.String similarity​(java.lang.String word,
                                           int maxDistance)
        Returns the word most similar to the target word.
        Parameters:
        word -
        maxDistance -
        Returns:
      • getSimilarityMap

        public java.util.Map<java.lang.String,​java.lang.Integer> getSimilarityMap​(java.lang.String word,
                                                                                        int maxDistance)
        The search function returns a list of all words that are less than the given maximum distance from the target word, using Levenshtein distance References: http://stevehanov.ca/blog/index.php?id=114 https://en.wikipedia.org/wiki/Levenshtein_distance
        Parameters:
        word -
        maxDistance -
      • RecursiveLevenshteinDistance

        public java.util.Map<java.lang.String,​java.lang.Integer> RecursiveLevenshteinDistance​(Node node,
                                                                                                    char letter,
                                                                                                    java.lang.String word,
                                                                                                    java.util.Vector<java.lang.Integer> previousRow,
                                                                                                    java.util.Map<java.lang.String,​java.lang.Integer> results,
                                                                                                    int maxDistance)
        Parameters:
        node -
        letter -
        word -
        previousRow -
        results -
        maxDistance -
        Returns:
      • initFalse

        public void initFalse​(Node node)
        Set to unvisited all the Tries's node.
        Parameters:
        node -
      • show

        public void show()
        Show The Trie.
      • getNumberOfWords

        public int getNumberOfWords()
      • isCaseSensitive

        public boolean isCaseSensitive()
      • getCharset

        public java.nio.charset.Charset getCharset()
      • setCharset

        public void setCharset​(java.nio.charset.Charset charset)
      • printVector

        public void printVector​(java.util.Vector<java.lang.Integer> vec)