Interface GraphEditDistanceSimilarityAlgorithm<D extends org.processmining.models.graphbased.directed.DirectedGraph<?,​?>>

  • All Known Implementing Classes:
    AbstractGraphEditDistanceSimilarityAlgorithm, GraphEditDistanceSimilarityAStar, GraphEditDistanceSimilarityExhaustive, GraphEditDistanceSimilarityGreedy, GraphEditDistanceSimilarityLexical, GraphEditDistanceSimilarityProcessHeuristic

    public interface GraphEditDistanceSimilarityAlgorithm<D extends org.processmining.models.graphbased.directed.DirectedGraph<?,​?>>
    Graph Edit Distance Similarity based on Dijkman et al.: Graph Matching Algorithms for Business Process Similarity Search The measure is based on the graph-representation of business process models where connector/gateway nodes are removed. Similarity is calculated based on the amount of necessary transformation operations:
    • Node Substitution
    • Node Insertion / Deletion
    • Edge Insertion / Deletion
    There is a cost function for every transformation. The cost of substituting node n1 with node n2 is calculated by 1 - sim(n1, n2). The default similarity metric for node similarity is LevenshteinSimilarity. Similarity of models M1, M2 with activities A1, A2 and edges E1, E2 is then calculated based on the set of substituted nodes subn, inserted/deleted nodes skipn, and inserted/deleted edges skipe as follows: fskipn = |skipn|/(|A1| + |A2|), fskipe = |skipe|/(|E1| + |E2|), fsubn = 2 * sum(1 - sim(a1,a2)) sim(M1, M2) = (wskipn*fskipn + wskipe*fskipe + wsubn*fsubn) / (wskipn + wskipe + wsubn) B. Hompes: This plug-in was copied from the ProM v5 plug-in, and modified to support Petrinets that have non-unique place labels.
    Author:
    rdijkman, b.f.a.hompes, svzelst
    • Method Detail

      • compute

        double compute​(D sg1,
                       D sg2)
        Given two graphs, returns a value by which graphs can be sorted for relevance, lowest value first. E.g. the value can be: - an edit distance (lower edit distance means better match between graphs) - 1.0 - similarity score (lower value means higher similarity score, means better match between graphs)
        Parameters:
        sg1 - A graph.
        sg2 - A graph.
        Returns:
        A value, where a lower value represents a more relevant match between graphs.