Dynamic Markov Compression (DMcompress)

Bioinformatics Lab

School of Computer Science and Technology, Harbin Institute of Technology


About DMcompress

DMcompress is a compression tool for FASTA and multi-FASTA files. It aims to exploit the intrinsic properties of each sequence to improve the compression ratio. The first-order entropy of a sequence is used as a measure of data complexity to determine the orders of the Markov models. Experimental results on the latest complete bacterial genome data show that the method compresses genomes effectively and achieves better compression results.
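The description above does not spell out how the entropy is computed or how it maps to a model order, so the following is only a rough sketch of the general idea, not DMcompress's actual procedure. It parses FASTA text and estimates the empirical conditional entropy (bits per symbol) of the next base given the previous `order` bases. The function names `read_fasta_sequences` and `conditional_entropy` are hypothetical and do not come from the tool.

```python
import math
from collections import Counter


def read_fasta_sequences(text):
    """Yield (header, sequence) pairs from FASTA or multi-FASTA text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def conditional_entropy(seq, order=1):
    """Empirical entropy of the next symbol given the previous `order` symbols.

    Illustrative only: this is an assumed complexity measure, not the
    formula used inside DMcompress.
    """
    if len(seq) <= order:
        return 0.0
    context_counts = Counter()
    pair_counts = Counter()
    for i in range(order, len(seq)):
        context = seq[i - order:i]
        context_counts[context] += 1
        pair_counts[(context, seq[i])] += 1
    total = len(seq) - order
    entropy = 0.0
    for (context, _symbol), n in pair_counts.items():
        # P(context, symbol) * log2 P(symbol | context)
        entropy -= (n / total) * math.log2(n / context_counts[context])
    return entropy
```

A lower value for a given order suggests that contexts of that length predict the next base well, which is one plausible way a compressor could compare candidate Markov orders for a sequence.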



Software Download

Input files:

Test data download: test.fasta.

User Manual

Prerequisites

  • DMcompress must be run on a Linux system.

Running DMcompress

1: The input file must be in FASTA or multi-FASTA format.

2: sudo chmod 777 DMcompressC

3: sudo chmod 777 DMcompressD

4: Then execute DMcompress by:

DMcompressC inputFile

e.g.: DMcompressC test.fasta

(This generates a file named test.fasta.c in the same directory.)

DMcompressD inputFile

e.g.: DMcompressD test.fasta.c

(This generates a file named test.fasta.c.d in the same directory. test.fasta.c.d will be identical to the original input file test.fasta.)
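Since the decompressed file is expected to match the original input, one way to confirm a lossless round trip is to compare the two files byte for byte. The sketch below does this by hashing each file with SHA-256. It is an independent check, not part of DMcompress, and the helper names `file_sha256` and `round_trip_ok` are hypothetical.

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def round_trip_ok(original_path, restored_path):
    """True if the restored file has exactly the same bytes as the original."""
    return file_sha256(original_path) == file_sha256(restored_path)
```

After running the compression and decompression commands on the test data, calling `round_trip_ok("test.fasta", "test.fasta.c.d")` should return True if the round trip was lossless.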

General options include:

o: output file (default: outputFile.c);

v: (version);

t: maxProcs (default: 2);

Citation

Contact

Correspondence regarding DMcompress should be directed to Rongjie Wang at rjwang.hit@gmail.com.