Class AlgoCM_ClaSP


  • public class AlgoCM_ClaSP
    extends java.lang.Object
    An implementation of the ClaSP algorithm, proposed by A. Gomariz et al. in 2013.
    NOTE: Depending on what the user chooses, this implementation either saves each pattern to a file as soon as it is found or keeps the patterns in memory.
    Copyright Antonio Gomariz Peñalver 2013
    This file is part of the SPMF DATA MINING SOFTWARE (http://www.philippe-fournier-viger.com/spmf). SPMF is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
    SPMF is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
    You should have received a copy of the GNU General Public License along with SPMF. If not, see .
    Author:
    agomariz
    • Field Summary

      Fields 
      Modifier and Type Field Description
      protected Trie FrequentAtomsTrie
      Trie root that starts with the empty pattern and from which all the frequent patterns generated by ClaSP can be reached
      long joinCount  
      protected long mainMethodEnd
      Start and End points in order to calculate the time taken by the main part of the ClaSP algorithm
      protected long mainMethodStart
      Start and End points in order to calculate the time taken by the main part of the ClaSP algorithm
      protected double minSupAbsolute
      The absolute minimum support threshold, i.e. the minimum number of sequences in which a pattern must appear
      long overallEnd
      Start and End points in order to calculate the overall time taken by the algorithm
      long overallStart
      Start and End points in order to calculate the overall time taken by the algorithm
      protected long postProcessingEnd
      Start and End points in order to calculate the time taken by the post-processing method of the ClaSP algorithm
      protected long postProcessingStart
      Start and End points in order to calculate the time taken by the post-processing method of the ClaSP algorithm
    • Constructor Summary

      Constructors 
      Constructor Description
      AlgoCM_ClaSP​(double support, AbstractionCreator abstractionCreator, boolean findClosedPatterns, boolean executePruningMethods)
      Constructor of the class that runs the ClaSP algorithm.
    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      protected void claSP​(SequenceDatabase database, long minSupAbsolute, boolean keepPatterns, boolean verbose, boolean findClosedPatterns, boolean executePruningMethods)
      The actual method for extracting frequent sequences.
      void clear()
      Clears all the attributes of the AlgoCM_ClaSP class
      Trie getFrequentAtomsTrie()
      Get the trie (internal structure used by ClaSP).
      int getNumberOfFrequentClosedPatterns()  
      int getNumberOfFrequentPatterns()  
      long getRunningTime()
      Gets the time spent by the algorithm in its execution.
      java.lang.String printStatistics()
      Summarizes the search for frequent sequences performed by the ClaSP algorithm
      void runAlgorithm​(SequenceDatabase database, boolean keepPatterns, boolean verbose, java.lang.String outputFilePath, boolean outputSequenceIdentifiers)
      Runs the ClaSP algorithm.
      Saver runAlgorithm_saver​(SequenceDatabase database, boolean keepPatterns, boolean verbose, java.lang.String outputFilePath, boolean outputSequenceIdentifiers)  
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • minSupAbsolute

        protected double minSupAbsolute
        The absolute minimum support threshold, i.e. the minimum number of sequences in which a pattern must appear
      • overallStart

        public long overallStart
        Start and End points in order to calculate the overall time taken by the algorithm
      • overallEnd

        public long overallEnd
        Start and End points in order to calculate the overall time taken by the algorithm
      • mainMethodStart

        protected long mainMethodStart
        Start and End points in order to calculate the time taken by the main part of the ClaSP algorithm
      • mainMethodEnd

        protected long mainMethodEnd
        Start and End points in order to calculate the time taken by the main part of the ClaSP algorithm
      • postProcessingStart

        protected long postProcessingStart
        Start and End points in order to calculate the time taken by the post-processing method of the ClaSP algorithm
      • postProcessingEnd

        protected long postProcessingEnd
        Start and End points in order to calculate the time taken by the post-processing method of the ClaSP algorithm
      • FrequentAtomsTrie

        protected Trie FrequentAtomsTrie
        Trie root that starts with the empty pattern and from which all the frequent patterns generated by ClaSP can be reached
      • joinCount

        public long joinCount
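
        The start/end field pairs above suggest a simple pattern: record a timestamp before and after each phase, then subtract. The sketch below is a hypothetical stand-in, not the SPMF class. It assumes the timestamps come from System.currentTimeMillis() (the unit is not stated on this page) and that the main part and the post-processing run one after the other.

        ```java
        // Hypothetical stand-in that mirrors the timing field names; not the SPMF class.
        final class TimingFieldsSketch {
            long overallStart, overallEnd;
            long mainMethodStart, mainMethodEnd;
            long postProcessingStart, postProcessingEnd;

            /** Elapsed time between a start/end pair, in the unit of the clock used. */
            static long elapsed(long start, long end) {
                return end - start;
            }

            /** Runs both phases in order and records a timestamp around each one. */
            void run(Runnable mainPhase, Runnable postProcessing) {
                overallStart = System.currentTimeMillis(); // assumption: millisecond wall clock

                mainMethodStart = System.currentTimeMillis();
                mainPhase.run();
                mainMethodEnd = System.currentTimeMillis();

                postProcessingStart = System.currentTimeMillis();
                postProcessing.run();
                postProcessingEnd = System.currentTimeMillis();

                overallEnd = System.currentTimeMillis();
            }

            public static void main(String[] args) {
                TimingFieldsSketch t = new TimingFieldsSketch();
                t.run(() -> { /* mining would go here */ }, () -> { /* post-processing */ });
                System.out.println("main part: " + elapsed(t.mainMethodStart, t.mainMethodEnd) + " ms");
                System.out.println("post-processing: " + elapsed(t.postProcessingStart, t.postProcessingEnd) + " ms");
                System.out.println("overall: " + elapsed(t.overallStart, t.overallEnd) + " ms");
            }
        }
        ```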
    • Constructor Detail

      • AlgoCM_ClaSP

        public AlgoCM_ClaSP​(double support,
                            AbstractionCreator abstractionCreator,
                            boolean findClosedPatterns,
                            boolean executePruningMethods)
        Constructor of the class that runs the ClaSP algorithm.
        Parameters:
        support - Absolute minimum support
        abstractionCreator - the abstraction creator
        findClosedPatterns - flag indicating whether we are interested only in closed patterns
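
        To illustrate what the findClosedPatterns flag refers to: a frequent pattern is closed when no longer pattern that contains it has the same support. The sketch below applies that definition by brute force to a small hand-made set of patterns. It is only an illustration of the definition, not the procedure used by this class; the class name ClosedPatternSketch and its methods are hypothetical, and patterns are simplified to sequences of single items.

        ```java
        import java.util.ArrayList;
        import java.util.Arrays;
        import java.util.LinkedHashMap;
        import java.util.List;
        import java.util.Map;

        // Hypothetical illustration of the "closed pattern" definition; not SPMF code.
        final class ClosedPatternSketch {

            /** True if every item of small appears in large, in the same order. */
            static boolean isSubsequence(List<String> small, List<String> large) {
                int matched = 0;
                for (String item : large) {
                    if (matched < small.size() && small.get(matched).equals(item)) {
                        matched++;
                    }
                }
                return matched == small.size();
            }

            /** Keeps the patterns that have no longer super-pattern with equal support. */
            static List<List<String>> closedOnly(Map<List<String>, Integer> frequent) {
                List<List<String>> closed = new ArrayList<>();
                for (Map.Entry<List<String>, Integer> p : frequent.entrySet()) {
                    boolean absorbed = false;
                    for (Map.Entry<List<String>, Integer> q : frequent.entrySet()) {
                        if (q.getKey().size() > p.getKey().size()
                                && q.getValue().equals(p.getValue())
                                && isSubsequence(p.getKey(), q.getKey())) {
                            absorbed = true; // q carries the same support and more items
                            break;
                        }
                    }
                    if (!absorbed) {
                        closed.add(p.getKey());
                    }
                }
                return closed;
            }

            public static void main(String[] args) {
                // Made-up patterns and supports, for illustration only.
                Map<List<String>, Integer> frequent = new LinkedHashMap<>();
                frequent.put(Arrays.asList("a"), 3);
                frequent.put(Arrays.asList("b"), 2);
                frequent.put(Arrays.asList("a", "b"), 2);
                System.out.println(closedOnly(frequent)); // prints [[a], [a, b]]
            }
        }
        ```

        Here the pattern b is dropped because the longer pattern a, b has the same support, while a is kept because its support is higher than that of any pattern containing it.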
    • Method Detail

      • runAlgorithm

        public void runAlgorithm​(SequenceDatabase database,
                                 boolean keepPatterns,
                                 boolean verbose,
                                 java.lang.String outputFilePath,
                                 boolean outputSequenceIdentifiers)
                          throws java.io.IOException
        Runs the ClaSP algorithm. The output can be either kept or ignored. When we choose to keep the patterns found, we can keep them either in a file or in main memory.
        Parameters:
        database - Original database in which we want to search for the frequent patterns.
        keepPatterns - Flag indicating if we want to keep the output or not
        verbose - Flag for debugging purposes
        outputFilePath - Path of the file in which we want to store the frequent patterns. If this value is null, we keep the patterns in main memory. This argument is taken into account only when keepPatterns is activated.
        outputSequenceIdentifiers - indicates if sequence ids should be output with each pattern found.
        Throws:
        java.io.IOException
      • runAlgorithm_saver

        public Saver runAlgorithm_saver​(SequenceDatabase database,
                                        boolean keepPatterns,
                                        boolean verbose,
                                        java.lang.String outputFilePath,
                                        boolean outputSequenceIdentifiers)
                                 throws java.io.IOException
        Throws:
        java.io.IOException
      • claSP

        protected void claSP​(SequenceDatabase database,
                             long minSupAbsolute,
                             boolean keepPatterns,
                             boolean verbose,
                             boolean findClosedPatterns,
                             boolean executePruningMethods)
                      throws java.io.IOException
        The actual method for extracting frequent sequences.
        Parameters:
        database - The original database
        minSupAbsolute - the absolute minimum support
        keepPatterns - flag indicating if we are interested in keeping the output of the algorithm
        verbose - Flag for debugging purposes
        Throws:
        java.io.IOException
      • printStatistics

        public java.lang.String printStatistics()
        Summarizes the search for frequent sequences performed by the ClaSP algorithm
        Returns:
      • getNumberOfFrequentPatterns

        public int getNumberOfFrequentPatterns()
      • getNumberOfFrequentClosedPatterns

        public int getNumberOfFrequentClosedPatterns()
      • getRunningTime

        public long getRunningTime()
        Gets the time spent by the algorithm in its execution.
        Returns:
      • clear

        public void clear()
        Clears all the attributes of the AlgoCM_ClaSP class
      • getFrequentAtomsTrie

        public Trie getFrequentAtomsTrie()
        Get the trie (internal structure used by ClaSP).
        Returns:
        the trie
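
        As a rough picture of the structure described for FrequentAtomsTrie (a root standing for the empty pattern, from which every frequent pattern can be reached), the sketch below builds a tiny trie and enumerates its patterns by walking down from the root. The class PatternTrieSketch and its methods are hypothetical and are not the Trie API of SPMF; patterns are again simplified to sequences of single items.

        ```java
        import java.util.ArrayList;
        import java.util.Arrays;
        import java.util.List;
        import java.util.Map;
        import java.util.TreeMap;

        // Hypothetical illustration of a pattern trie; not the SPMF Trie class.
        final class PatternTrieSketch {
            // Each edge is labelled with an item; the path from the root spells a pattern.
            final Map<String, PatternTrieSketch> children = new TreeMap<>();

            /** Adds a pattern, creating one node per item along the path. */
            void insert(List<String> pattern) {
                PatternTrieSketch node = this;
                for (String item : pattern) {
                    node = node.children.computeIfAbsent(item, k -> new PatternTrieSketch());
                }
            }

            /** Returns every pattern reachable from this node, shortest prefixes first. */
            List<List<String>> allPatterns() {
                List<List<String>> out = new ArrayList<>();
                collect(new ArrayList<>(), out);
                return out;
            }

            private void collect(List<String> prefix, List<List<String>> out) {
                for (Map.Entry<String, PatternTrieSketch> e : children.entrySet()) {
                    prefix.add(e.getKey());
                    out.add(new ArrayList<>(prefix));   // the path so far is itself a pattern
                    e.getValue().collect(prefix, out);  // then visit its extensions
                    prefix.remove(prefix.size() - 1);   // backtrack
                }
            }

            public static void main(String[] args) {
                PatternTrieSketch root = new PatternTrieSketch(); // the empty pattern
                root.insert(Arrays.asList("a", "b"));
                root.insert(Arrays.asList("a", "c"));
                root.insert(Arrays.asList("b"));
                System.out.println(root.allPatterns()); // prints [[a], [a, b], [a, c], [b]]
            }
        }
        ```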