Tense patterns

Temporal (Negative) Sequential pattern mining algorithm



TENSE patterns -- TEmporal Negative Sequential patterns

TENSE is an algorithm that extracts negative sequential patterns from a dataset of sequences. A negative sequential pattern is a pattern that contains negative itemsets, i.e. itemsets specifying the absence of events. Negative temporal patterns add numerical temporal constraints on the durations between successive events.
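
To make the notion of a negative itemset concrete, here is a minimal Python sketch that checks whether a sequence supports a three-part pattern "first, then no `absent`, then last". It is not the TENSE or NegPSpan implementation: the function name `supports_negative` is hypothetical, and the semantics chosen (at least one occurrence of the positive part with the negative item absent from every itemset strictly in between) is only one of several possible semantics for negative patterns.

```python
from typing import List, Set

Sequence = List[Set[int]]  # a sequence is an ordered list of itemsets


def supports_negative(sequence: Sequence, first: int, absent: int, last: int) -> bool:
    """Return True if `first` occurs, `last` occurs later, and `absent`
    appears in no itemset strictly between these two occurrences.

    Assumption: one valid occurrence is enough (existential semantics).
    """
    for i, left in enumerate(sequence):
        if first not in left:
            continue
        for j in range(i + 1, len(sequence)):
            if last not in sequence[j]:
                continue
            between = sequence[i + 1:j]
            if all(absent not in itemset for itemset in between):
                return True
    return False


# Four small illustrative sequences.
dataset = [
    [{1}, {3}, {2}, {5}, {4}],
    [{1}, {2, 3}, {5}, {4}],
    [{1}, {2}, {5}, {4}],
    [{1}, {5}, {4}],
]

# Support of the pattern "1, then no 3, then 5": number of supporting sequences.
support = sum(supports_negative(s, 1, 3, 5) for s in dataset)
print(support)
```

With this semantics, only the last two sequences support the pattern, because item 3 occurs between 1 and 5 in the first two.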

The TENSE algorithm combines the mining of negative temporal patterns (NegPSpan) with a density-based clustering algorithm (MAFIA). It also supports maxgap constraints.

Our software makes it possible to evaluate alternative configurations of TENSE pattern extraction:

  • use of eNSP instead of NegPSpan to extract the negative patterns
  • different configurations of the NegPSpan algorithm (different semantics of negative patterns)

Input format

The algorithm processes a dataset of sequences. Possible file extensions are .dat or .txt.

The input format of a dataset is the IBM format. Each line of the file represents an itemset. The line gives, in this order, the sequence id, the itemset timestamp, the size of the itemset and the items of the itemset.

The following example illustrates the encoding of a dataset (see embeddings example). The four sequences listed first are encoded by the lines that follow them:

  • 1 3 2 5 4
  • 1 (2 3) 5 4
  • 1 2 5 4
  • 1 5 4

    0 1 1 1
    0 2 1 3
    0 3 1 2
    0 4 1 5
    0 5 1 4
    1 1 1 1
    1 2 2 2 3
    1 3 1 5
    1 4 1 4
    2 1 1 1
    2 2 1 2
    2 3 1 5
    2 4 1 4
    3 1 1 1
    3 2 1 5
    3 3 1 4
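
As an illustration of the format described above, the following Python sketch writes and reads such lines. The function names `encode_ibm` and `parse_ibm` are hypothetical helpers, not part of TENSE, and the sketch assumes, as in the example, that sequence ids start at 0 and that timestamps are the 1-based positions of the itemsets.

```python
from typing import Dict, List, Tuple


def encode_ibm(sequences: List[List[List[int]]]) -> List[str]:
    """Encode sequences (lists of itemsets) as IBM-format lines:
    sequence id, timestamp, itemset size, then the items."""
    lines = []
    for seq_id, sequence in enumerate(sequences):
        # Assumption: the timestamp is the 1-based position of the itemset.
        for timestamp, itemset in enumerate(sequence, start=1):
            items = " ".join(str(item) for item in itemset)
            lines.append(f"{seq_id} {timestamp} {len(itemset)} {items}")
    return lines


def parse_ibm(lines: List[str]) -> Dict[int, List[Tuple[int, List[int]]]]:
    """Parse IBM-format lines into {sequence id: [(timestamp, items), ...]}."""
    sequences: Dict[int, List[Tuple[int, List[int]]]] = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip empty lines
        seq_id, timestamp, size = (int(value) for value in fields[:3])
        items = [int(value) for value in fields[3:3 + size]]
        sequences.setdefault(seq_id, []).append((timestamp, items))
    return sequences


# The four sequences of the example above.
example = [
    [[1], [3], [2], [5], [4]],
    [[1], [2, 3], [5], [4]],
    [[1], [2], [5], [4]],
    [[1], [5], [4]],
]
encoded = encode_ibm(example)
print("\n".join(encoded))
```

The itemset (2 3) of the second sequence becomes the single line "1 2 2 2 3": sequence 1, timestamp 2, size 2, items 2 and 3.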

Parameters

  • cneg: use NegPSpan (eNSP otherwise)
  • f: minimum support (number of transactions); for eNSP, this is the minimum support of positive patterns
  • n: minimum support of negative patterns for the eNSP algorithm
  • mg: max gap constraint
  • MAFIA parameters
    • MW: maximum number of windows
    • mW: minimum number of windows
    • alpha: density threshold for dense unit
    • beta: merging threshold
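
The interplay between a minimum support and a max gap constraint can be sketched as follows in Python. This is only an illustration for a two-item positive pattern, not the code run by the app: the function names are hypothetical, and measuring the gap as the difference between the timestamps of two successive events is an assumption about how mg is interpreted.

```python
from typing import List, Set, Tuple

TimedSequence = List[Tuple[int, Set[int]]]  # (timestamp, itemset) pairs


def occurs_within_maxgap(sequence: TimedSequence, first: int, second: int, maxgap: int) -> bool:
    """True if `second` occurs after `first` with a timestamp difference
    of at most `maxgap` (assumed definition of the gap)."""
    for t1, items1 in sequence:
        if first not in items1:
            continue
        for t2, items2 in sequence:
            if second in items2 and 0 < t2 - t1 <= maxgap:
                return True
    return False


def support(dataset: List[TimedSequence], first: int, second: int, maxgap: int) -> int:
    """Number of sequences containing the pattern under the gap constraint."""
    return sum(occurs_within_maxgap(s, first, second, maxgap) for s in dataset)


# Small illustrative dataset with timestamps equal to itemset positions.
dataset = [
    [(1, {1}), (2, {3}), (3, {2}), (4, {5}), (5, {4})],
    [(1, {1}), (2, {2, 3}), (3, {5}), (4, {4})],
    [(1, {1}), (2, {2}), (3, {5}), (4, {4})],
    [(1, {1}), (2, {5}), (3, {4})],
]

f = 2  # minimum support threshold
for mg in (1, 2, 3):
    count = support(dataset, 1, 5, mg)
    print(f"mg={mg}: support={count}, frequent={count >= f}")
```

Tightening the gap constraint can only lower the support of a pattern, so a small mg prunes patterns that a larger mg would keep.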

Limitations

This online version prevents overly long processes and excessive memory usage (for fair use of our servers). The following additional settings cannot be modified:

  • maximum pattern length is 4
  • maximum size of negative itemsets is 2

References

PrefixSpan: Mining Sequential Patterns Efficiently by Prefix-Projected Pattern Growth
Jian Pei, Jiawei Han, Behzad Mortazavi-Asl, Helen Pinto, Qiming Chen, Umeshwar Dayal and Mei-Chun Hsu, IEEE Computer Society, 2001, page 215

e-NSP: Efficient Negative Sequential Pattern Mining
Longbing Cao, Xiangjun Dong and Zhigang Zheng, Artificial Intelligence, 2016, 235:156–182

Mining Non-Redundant Time-Gap Sequential Patterns
S.-J. Yen and Y.-S. Lee, Applied Intelligence, 2013, 39(4):727–738

16/03/2018 : Version 0.1, initial version

How to use our REST API:

Check your private token in your account first. More details are available in our documentation tab.

The id of this app is 175.

The following curl command creates a job and returns the job URL, along with the average execution time.

The files and dataset fields are optional; remove them if they are not needed.
curl -H 'Authorization: Token token=<your_private_token>' -X POST \
  -F 'job[webapp_id]=175' \
  -F 'job[param]=' \
  -F 'job[queue]=standard' \
  -F 'files[0]=@test.txt' \
  -F 'files[1]=@test2.csv' \
  -F 'job[file_url]=<my_file_url>' \
  -F 'job[dataset]=<my_dataset_name>' \
  https://allgo.inria.fr/api/v1/jobs

Then query your job to get the URLs of the output files:

curl -H 'Authorization: Token token=<your_private_token>' -X GET 'https://allgo.inria.fr/api/v1/jobs/<job_id>'
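
The same job query can be sketched in Python with the standard library only. The helper names `build_job_status_request` and `fetch_job` are hypothetical; the URL and the Authorization header are taken from the curl command above, and the sketch makes no assumption about the format of the response, which is returned as raw text.

```python
import urllib.request

# Endpoint taken from the curl commands above.
API_ROOT = "https://allgo.inria.fr/api/v1/jobs"


def build_job_status_request(job_id: int, token: str) -> urllib.request.Request:
    """Build the GET request for a job, with the token-based header."""
    return urllib.request.Request(
        f"{API_ROOT}/{job_id}",
        headers={"Authorization": f"Token token={token}"},
        method="GET",
    )


def fetch_job(job_id: int, token: str) -> str:
    """Send the request and return the response body as text.

    Assumption: the body is UTF-8 text; its structure is not parsed here.
    """
    request = build_job_status_request(job_id, token)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


# Building the request does not contact the server; call fetch_job to do so.
request = build_job_status_request(42, "my_private_token")
print(request.get_method(), request.full_url)
```

Separating the construction of the request from its execution makes it easy to inspect the URL and headers before using a real token and job id.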