
.:: Extra BLAST::.
Description
Extra BLAST is a Perl script that appends a column to a given tabular NCBI BLAST output, with information
about the subject entry. For example, it turns this:
Contig1 gi|110083942|gb|DQ822993.1| 90.74 497 46 0 179 675 141 637 7e-174 620
Contig1 gi|110631513|gb|DQ784686.1| 88.89 495 55 0 181 675 136 630 3e-151 545
Contig1 gi|109900625|gb|DQ640309.1| 88.89 495 55 0 181 675 136 630 3e-151 545
Contig3 gi|124222179|dbj|AK224773.2| 93.37 392 17 2 82 472 1 384 1e-158 569
Contig3 gi|164508136|emb|AM412177.1| 100.00 89 0 0 539 627 1871 1959 2e-40 176
Contig3 gi|161085627|dbj|AB305024.1| 100.00 89 0 0 539 627 89 1 2e-40 176
Contig4 gi|1679608|emb|X62395.1|NTLTP1 87.32 142 18 0 277 418 1541 1682 5e-29 139
Into this:
Contig1 gi|110083942|gb|DQ822993.1| 90.74 497 46 0 179 675 141 637 7e-174 620 Solanum phureja pectin methylesterase inhibitor isoform mRNA, complete cds.
Contig1 gi|110631513|gb|DQ784686.1| 88.89 495 55 0 181 675 136 630 3e-151 545 Capsicum annuum cultivar Hanbyul pectin methlyesterase inhibitor protein 1 (PMEI1) gene, complete cds.
Contig1 gi|109900625|gb|DQ640309.1| 88.89 495 55 0 181 675 136 630 3e-151 545 Capsicum annuum pectin methlyesterase inhibitor protein 1 (PMEI1) mRNA, complete cds.
Contig3 gi|124222179|dbj|AK224773.2| 93.37 392 17 2 82 472 1 384 1e-158 569 Solanum lycopersicum cDNA, clone: FC14BG04, HTC in fruit.
Contig3 gi|164508136|emb|AM412177.1| 100.00 89 0 0 539 627 1871 1959 2e-40 176 Phytophthora cinnamomi tub1 gene for alpha-tubulin, strain Pr120.
Contig3 gi|161085627|dbj|AB305024.1| 100.00 89 0 0 539 627 89 1 2e-40 176 Vaccinium ciliatum DNA, microsatellite marker VM32.
Contig4 gi|1679608|emb|X62395.1|NTLTP1 87.32 142 18 0 277 418 1541 1682 5e-29 139 N.tabacum ltp1 gene for lipid transferase.
Or even into this (sequences truncated here for display):
Contig1 gi|110083942|gb|DQ822993.1| 90.74 497 46 0 179 675 141 637 7e-174 620 ggatcacactcaactagctttagttattgagaaacaaaac...
Contig1 gi|110631513|gb|DQ784686.1| 88.89 495 55 0 181 675 136 630 3e-151 545 gcacgaggaaagaattcattttttttaaaagaaaggctca...
Contig1 gi|109900625|gb|DQ640309.1| 88.89 495 55 0 181 675 136 630 3e-151 545 gcacgaggaaagaattcattttttttaaaagaaaggctca...
Contig3 gi|124222179|dbj|AK224773.2| 93.37 392 17 2 82 472 1 384 1e-158 569 tattcgggttgcagatggcggtgttgccgcgttcctcaac...
Contig3 gi|164508136|emb|AM412177.1| 100.00 89 0 0 539 627 1871 1959 2e-40 176 aggcaccagcattcttgttggccacaacttcaaggcatgg...
Contig3 gi|161085627|dbj|AB305024.1| 100.00 89 0 0 539 627 89 1 2e-40 176 tttaggtgacactatagaatactcaagctatgcatccaac...
Contig4 gi|1679608|emb|X62395.1|NTLTP1 87.32 142 18 0 277 418 1541 1682 5e-29 139 tgaacttattaaccttttgataacatgacgtcaacttaat...
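The transformation shown above can be sketched in a few lines. The following is only an illustration in Python, not the script's actual Perl code. The names `subject_accession`, `append_column` and the `lookup` callable are hypothetical. It assumes tab-separated columns, subject IDs in the `gi|number|db|accession|` form shown above, and comment lines (as produced by `-m 9`) starting with `#`.

```python
def subject_accession(subject_id):
    """Return the accession.version from an ID like 'gi|110083942|gb|DQ822993.1|'."""
    parts = subject_id.split("|")
    # Assumed layout: gi | number | database | accession.version | optional locus
    if len(parts) >= 4 and parts[0] == "gi":
        return parts[3]
    return subject_id  # fall back to the raw ID for other layouts


def append_column(lines, lookup):
    """Yield each tabular BLAST line with one extra tab-separated column.

    `lookup` is a hypothetical callable mapping an accession to the text to append.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            yield line  # keep blank and comment lines untouched
            continue
        fields = line.split("\t")
        # Column 2 of the tabular format holds the subject ID
        fields.append(lookup(subject_accession(fields[1])))
        yield "\t".join(fields)
```

Passing the lookup as a callable keeps the column logic independent of where the extra information comes from (a download, a local cache, or a test stub).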
Output
Extra BLAST writes the output file in a tabular format readable by most spreadsheet programs (e.g. Calc or Excel)
and creates a folder (cache) containing the downloaded GenBank entries.
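A cache folder of this kind presumably lets later runs reuse entries instead of downloading them again. Below is a rough sketch of that idea in Python; it is not the script's own code. The file naming scheme (`<accession>.gb`), the function name `cached_entry` and the `fetch` callable are all assumptions made for illustration.

```python
import os


def cached_entry(accession, cache_dir, fetch):
    """Return the GenBank text for `accession`, downloading only on a cache miss.

    `fetch` is a hypothetical callable that takes an accession and returns the entry text.
    """
    os.makedirs(cache_dir, exist_ok=True)
    path = os.path.join(cache_dir, accession + ".gb")  # assumed naming scheme
    if os.path.exists(path):
        with open(path) as handle:
            return handle.read()  # cache hit: no download needed
    text = fetch(accession)
    with open(path, "w") as handle:
        handle.write(text)  # store the entry for later runs
    return text
```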
Usage
First, set the execute permission with:
$> chmod +x extra_blast
And then just run it as:
$> ./extra_blast BLAST_FILE [FILE_OUT [FIELD [CACHE_DIR]]]
Where:
- BLAST_FILE [file in]: The BLAST output file in tabular format (-m 8 or -m 9).
- FILE_OUT [file out]: The file with the extra column, "extra_out.csv" by default.
- FIELD [str]: The field to append to the BLAST tabular file, "DEFINITION" by default.
The available fields are: DEFINITION, ACCESSION, VERSION, KEYWORDS, COMMENT and ORIGIN. Note that
you can also use the fields SOURCE, REFERENCE and FEATURES, but these may produce unexpected behavior.
- CACHE_DIR [dir out]: The folder where the cached entries are stored, "cache" by default.
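To illustrate what the FIELD values refer to: in a GenBank flat-file entry, top-level keywords such as DEFINITION start at the beginning of a line, and their continuation lines are indented. A minimal Python sketch of pulling one such field out of an entry is shown below. It is an assumption about how this could be done, not the script's implementation, and `genbank_field` is a hypothetical name.

```python
def genbank_field(entry_text, field="DEFINITION"):
    """Return the text of the first top-level `field` in a GenBank flat-file entry."""
    collected = []
    inside = False
    for line in entry_text.splitlines():
        if not inside and line.startswith(field):
            inside = True
            collected.append(line[len(field):].strip())
        elif inside and line[:1].isspace():
            collected.append(line.strip())  # indented continuation line
        elif inside:
            break  # next top-level keyword (or '//') ends the field
    return " ".join(part for part in collected if part)
```

With this naive approach, fields that contain nested sub-entries or occur several times (for instance SOURCE, REFERENCE and FEATURES) come out flattened into a single string or only partially.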
Examples
You can take a look at the examples folder in the downloaded Extra BLAST package. See below the
command line used to generate each file. phureja.fasta is a multi-FASTA file not included in the package.
$> # examples/blastnphureja_nr.csv
$> blastall -p blastn -i phureja.fasta -o examples/blastnphureja_nr.csv -d nr -m 8
$> # examples/blastnphureja_nr_definition.csv
$> ./extra_blast examples/blastnphureja_nr.csv examples/blastnphureja_nr_definition.csv
The examples in the Description section above are the first 7 lines of the files in the examples folder,
except the last one, which can be generated by:
$> ./extra_blast examples/blastnphureja_nr.csv examples/blastnphureja_nr_definition.csv SEQUENCE
Developer
Luis M. Rodriguez