Pattern matching with wildcards and length constraints using maximum network flow
Abstract
In 2006 Chen et al. introduced an interesting text pattern matching problem with unique features. A pattern is described by a sequence of letters (literals) separated by a number of wildcards. For each pair of consecutive literals in the pattern description, a local length constraint specifies the minimum and maximum number of wildcards. Also, the pattern must occur in the text within a window for which the minimum and maximum lengths are specified by the global length constraint. Text letters are exclusively assigned to occurrences (the one-off condition). The objective is to find the maximum number of occurrences of the pattern in the text with all the constraints. For the problem (in this paper we call it Problem 1), there is a polynomial time online algorithm which does not guarantee an optimal solution. There are also polynomial time heuristic algorithms for the problem, and for some of its restricted cases. It is not known if Problem 1 is NP-hard. In this work, for two special cases of Problem 1, we give reductions to the maximum flow problem which can be solved in polynomial time.