The GrammarViz approach to time series pattern discovery is based on two algorithms with linear time and space complexity: (i) Symbolic Aggregate approXimation (SAX), which discretizes the input time series into a string, and (ii) Sequitur, which induces a context-free grammar (CFG) from that string. Recently, we added a second grammar induction algorithm, Re-Pair, which is somewhat slower than Sequitur but powers an additional capability of our tool: automated selection of the discretization parameters.
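To make the discretization step concrete, here is a minimal sketch of how a single subsequence can be turned into a SAX word: z-normalize it, average it into equal-width segments (PAA), and map each segment mean to a letter using breakpoints that split the standard normal distribution into equiprobable regions. This is an illustration only, not the GrammarViz API: the class and method names (`SaxSketch`, `toSaxWord`, and so on) are hypothetical, the sketch assumes the series length is divisible by the PAA size, and it uses the population standard deviation for normalization.

```java
import java.util.Arrays;

// Illustrative sketch of SAX discretization; names are hypothetical, not GrammarViz classes.
final class SaxSketch {

    // Breakpoints splitting N(0,1) into equiprobable regions, for alphabet sizes 2..5.
    private static final double[][] CUTS = {
        {0.0},
        {-0.43, 0.43},
        {-0.67, 0.0, 0.67},
        {-0.84, -0.25, 0.25, 0.84}
    };

    // Rescale the series to zero mean and unit (population) standard deviation.
    static double[] zNormalize(double[] x) {
        double mean = Arrays.stream(x).average().orElse(0.0);
        double variance = Arrays.stream(x).map(v -> (v - mean) * (v - mean)).sum() / x.length;
        double sd = Math.sqrt(variance);
        double[] out = new double[x.length];
        if (sd < 1e-9) {
            return out; // a flat series is mapped to all zeros
        }
        for (int i = 0; i < x.length; i++) {
            out[i] = (x[i] - mean) / sd;
        }
        return out;
    }

    // Piecewise Aggregate Approximation: the mean of each equal-width segment.
    static double[] paa(double[] x, int segments) {
        if (segments <= 0 || x.length % segments != 0) {
            throw new IllegalArgumentException("length must be divisible by the PAA size");
        }
        int width = x.length / segments;
        double[] out = new double[segments];
        for (int s = 0; s < segments; s++) {
            double sum = 0.0;
            for (int i = s * width; i < (s + 1) * width; i++) {
                sum += x[i];
            }
            out[s] = sum / width;
        }
        return out;
    }

    // Convert a subsequence into a SAX word over the letters 'a', 'b', ...
    static String toSaxWord(double[] series, int paaSize, int alphabetSize) {
        if (alphabetSize < 2 || alphabetSize > 5) {
            throw new IllegalArgumentException("this sketch supports alphabet sizes 2..5");
        }
        double[] cuts = CUTS[alphabetSize - 2];
        StringBuilder word = new StringBuilder();
        for (double value : paa(zNormalize(series), paaSize)) {
            int index = 0;
            for (double cut : cuts) {
                if (value >= cut) {
                    index++; // count the breakpoints at or below the segment mean
                }
            }
            word.append((char) ('a' + index));
        }
        return word.toString();
    }

    public static void main(String[] args) {
        double[] rising = {1, 2, 3, 4, 5, 6, 7, 8};
        System.out.println(toSaxWord(rising, 4, 4)); // prints abcd
    }
}
```

Sliding a window along the series and applying `toSaxWord` to each window yields the sequence of words from which a grammar is then induced.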
By exploiting the hierarchical structure of the CFG, GrammarViz can identify rare and frequent grammar rules in real time, that is, while the signal is being acquired. We associate rare rules with anomalous subsequences and frequent rules with recurrent ones. Thanks to SAX numerosity reduction and the CFG hierarchy, our approach discovers patterns of both types, and these patterns can be of variable length.
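The role of numerosity reduction can be illustrated with a short sketch. It assumes the simplest strategy: when consecutive sliding-window positions produce the same SAX word, only the first occurrence is kept, together with its offset in the series. Because each retained word then stands for a run of one or more windows, a grammar rule covering a fixed number of words can map back to subsequences of different lengths. The class and method names below are hypothetical and are not taken from the GrammarViz code base.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch of numerosity reduction; names are hypothetical.
final class NumerosityReductionSketch {

    // Input: one SAX word per sliding-window position (index = offset in the series).
    // Output: offset -> word, keeping only the first word of each run of equal words.
    static Map<Integer, String> reduce(List<String> wordsPerWindow) {
        Map<Integer, String> kept = new LinkedHashMap<>();
        String previous = null;
        for (int offset = 0; offset < wordsPerWindow.size(); offset++) {
            String word = wordsPerWindow.get(offset);
            if (!word.equals(previous)) {
                kept.put(offset, word); // a new run starts here
                previous = word;
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("abc", "abc", "abd", "abd", "abd", "abc");
        System.out.println(reduce(words)); // prints {0=abc, 2=abd, 5=abc}
    }
}
```

Keeping the offsets is what allows a rule found in the reduced string to be mapped back to positions in the original time series.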
Please find details about our techniques in these publications:
GrammarViz 3.0 is developed in Java and runs on any platform with a Java runtime. It can be used as a stand-alone GUI application, called from the command line, or linked as a library. For the GUI, we followed the Model–view–controller (MVC) pattern, which allows the code to be reused, for example in a web application. Our code is hosted on GitHub, we maintain it actively, and we use Travis CI to track our builds.
Our software is under active development. Among other things, we are investigating the performance of alternative grammar induction (GI) algorithms, researching the possibility of leveraging a grammar's hierarchy for pattern weighting, and working on the system's performance and usability.