NCBI Blastn accessing the NCBI Qblast system

NCBI's Disclaimer and Copyright notice.

Standard nucleotide-nucleotide BLAST

Takes nucleotide sequences and compares them against the NCBI nucleotide databases. It is better than MEGABLAST at finding sequences that are similar, but not identical, to your query.

The BLAST nucleotide algorithm finds similar sequences by generating an indexed table or dictionary of short subsequences called words for both the query and the database. The program can then rapidly find initial exact matches to the query words by simply looking up a particular word in the database dictionary. These initial matches serve as starting points for longer alignments that are generated in several steps, ending with a final gapped alignment.
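The word lookup described above can be illustrated with a minimal Python sketch. It is not NCBI's implementation: it indexes only the subject sequence and slides along the query, and the function names and example sequences are hypothetical.

```python
from collections import defaultdict


def build_word_index(sequence, word_size):
    """Map each word (substring of length word_size) to its start positions."""
    index = defaultdict(list)
    for start in range(len(sequence) - word_size + 1):
        index[sequence[start:start + word_size]].append(start)
    return index


def find_seeds(query, subject, word_size=11):
    """Return (query_pos, subject_pos) pairs where a query word matches exactly.

    These pairs correspond to the initial exact matches that a real BLAST
    search would go on to extend into longer alignments.
    """
    subject_index = build_word_index(subject, word_size)
    seeds = []
    for q_pos in range(len(query) - word_size + 1):
        word = query[q_pos:q_pos + word_size]
        # A dictionary lookup replaces a scan of the whole subject.
        for s_pos in subject_index.get(word, []):
            seeds.append((q_pos, s_pos))
    return seeds


# Hypothetical sequences, for illustration only.
subject = "TTGACCGTAGGCTAACGTTAGCAT"
query = "CCGTAGGCTAACG"
print(find_seeds(query, subject, word_size=11))
```

The extension and gapped alignment steps are omitted; the sketch covers only the seed-finding stage.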

One of the important parameters governing the sensitivity of BLAST searches is the length of the initial words (word size). The most important reason that blastn is more sensitive than MEGABLAST is that it uses a shorter default word size. Because of this, blastn is better than MEGABLAST at finding alignments to related nucleotide sequences from other organisms since the initial exact match can be shorter. The word size is adjustable in blastn and can be reduced from the default value of 11 to a minimum of 7 to increase sensitivity. This word size can also be increased to increase the search speed and limit the number of database hits.
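To see why a shorter word size increases sensitivity, consider the following sketch. It assumes a hypothetical subject sequence and a query that differs from it at every ninth base, so the longest run of identical bases is 8. The helper name has_seed is an illustration, not part of BLAST.

```python
def has_seed(query, subject, word_size):
    """True if query and subject share at least one exact word of word_size."""
    subject_words = {subject[i:i + word_size]
                     for i in range(len(subject) - word_size + 1)}
    return any(query[i:i + word_size] in subject_words
               for i in range(len(query) - word_size + 1))


# Hypothetical sequences: the query is the subject with a mismatch
# at every ninth base (positions 9, 18, and 27, counting from one).
subject = "ACGTTGCAAGCTTAGGCATCGATCCGTA"
query = "ACGTTGCATGCTTAGGCCTCGATCCGGA"

for word_size in (11, 7):
    print(word_size, has_seed(query, subject, word_size))
```

With the default word size of 11 no initial exact match exists, so no alignment would be started; with a word size of 7 the related sequence is seeded.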

Search for short and near exact matches

This search is useful for primer or short nucleotide motif searches. Short sequences (less than 20 bases) will often not find any significant matches to the database entries under the standard nucleotide-nucleotide BLAST settings. The usual reasons are that the significance threshold, governed by the expect value parameter, is too stringent and the default word size is too high. You can adjust both the word size and the expect value in the parameter table to work with short sequences.

Program                              Word Size  Filter Setting  Expect Value
Standard Nucleotide BLAST            11         On              10
Search for short/near exact matches  7          Off             1000
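The two rows of the table above could be captured as presets, as in the following sketch. The preset names and the choose_preset helper are hypothetical, and "On"/"Off" are the table's labels, not necessarily the literal values the service expects for FILTER. The 20-base cutoff follows the text above.

```python
# Values copied from the table; the preset names are illustrative only.
PRESETS = {
    "standard": {"WORD_SIZE": 11, "FILTER": "On", "EXPECT": 10},
    "short_near_exact": {"WORD_SIZE": 7, "FILTER": "Off", "EXPECT": 1000},
}


def choose_preset(query, short_cutoff=20):
    """Pick the short-query settings for sequences under short_cutoff bases."""
    name = "short_near_exact" if len(query) < short_cutoff else "standard"
    return name, PRESETS[name]


# A hypothetical 18-base primer selects the short/near exact settings.
print(choose_preset("ACGTTGCAAGCTTAGGCA"))
```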

A common use of this is to check the specificity of primers used in the polymerase chain reaction (PCR) or hybridization. A useful way to check a pair of PCR primers is to concatenate them and search them as one sequence. The forward primer and the reverse primer can simply be pasted together with a string of ten or more N's between the two sequences. Since BLAST looks for local alignments and searches both strands, there is no need to reverse complement one of the primers before doing the concatenation or the search.
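The primer concatenation described above might be scripted as follows. This is a minimal sketch: the function name and the example primers are hypothetical, and the only rule taken from the text is the spacer of ten or more N's.

```python
def concatenate_primers(forward, reverse, spacer_length=10):
    """Join two primers with a run of N's so they can be searched as one query.

    No reverse complementing is done, since the search covers both strands.
    """
    if spacer_length < 10:
        raise ValueError("use a spacer of ten or more N's")
    return forward.strip() + "N" * spacer_length + reverse.strip()


# Hypothetical primer pair, for illustration only.
query = concatenate_primers("ACGTTGCAAGCTTAGGCA", "TCGATCCGTAAGGCTTGA")
print(query)
```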

Notes

Nucleotide-nucleotide searches are not the recommended way to find homologous protein coding regions in other organisms. It is better to perform searches at the protein level, either with translations of the nucleotide sequences or by direct protein-protein BLAST. This is because of the degeneracy of the genetic code, the greater information available in amino acid sequence, and the more sophisticated algorithm in protein-protein BLAST.

The query sequence should contain no ambiguous bases. Consensus motifs with degenerate bases will not work for this type of search.
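A query can be screened for ambiguous bases before submission. The sketch below assumes that only A, C, G, and T count as unambiguous; the function name is hypothetical. The allowed parameter lets a caller permit N, for example for the spacer in a concatenated primer pair.

```python
def find_ambiguous_bases(sequence, allowed="ACGT"):
    """Return (one-based position, base) for each character not in allowed."""
    return [(position, base)
            for position, base in enumerate(sequence.upper(), start=1)
            if base not in allowed]


# Hypothetical degenerate motif: R, Y, and N are ambiguity codes.
print(find_ambiguous_bases("ACGTRYACGN"))
```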

Parameters Setting

[COMPOSITION_BASED_STATISTICS] Run the search with the tweak parameter set to true. This automatically performs a gapped alignment, so also setting UNGAPPED_ALIGNMENT is unnecessary and will trigger a warning message from NCBI rather than generating results.

[DATABASE] Valid database name

[EXPECT] The expectation value (E value) threshold for statistical significance. If the statistical significance ascribed to a match is greater than the E value, the match will not be reported. Lower E values are more stringent, leading to fewer chance matches being reported.

[ENTREZ_QUERY] Entrez query to limit Blast search

[FILTER] Sequence filter identifier

[GAP_OPEN_COSTS] Gap open costs

[GAP_EXTEND_COSTS] Gap extend costs

[HITLIST_SIZE] Number of hits to keep

[LCASE_MASK] Enable masking of lower case in query

[NUCL_PENALTY] Penalty for a nucleotide mismatch (blastn only)

[NUCL_REWARD] Reward for a nucleotide match (blastn only)

OTHER_ADVANCED

    *[DROPOFF] Blast extensions in bits (default if Zero), not applicable for megablast

    *[FINAL_X_DROPOFF] Final X dropoff value for gapped alignment (in bits), not applicable for megablast

    *[DB_LENGTH] Effective length of the database (use Zero for real size)

[PROGRAM] Blast program name

[QUERY_BELIEVE_DEFLINE] Whether to believe defline in FASTA query

[QUERY_FROM] Start of subsequence (one-based offset)

[QUERY_TO] End of subsequence (one-based offset)

[SEARCHSP_EFF] Effective length of the search space 

[SERVICE] Blast service which needs to be performed

[THRESHOLD] Threshold for extending hits

[UNGAPPED_ALIGNMENT] Should the ungapped alignment be performed? Note that this parameter should not be set to TRUE or YES when using COMPOSITION_BASED_STATISTICS since that will automatically perform a gapped alignment; if this parameter is on, it will trigger a warning message from NCBI rather than generating results.

[WORD_SIZE] The search word size
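The parameters above could be collected and checked before a request is sent, as in the following sketch. It only builds and encodes a parameter set and does not contact NCBI. The QUERY key, the default values ("blastn", "nt"), the underscore spelling of the parameter names, and the build_parameters helper are assumptions for illustration. The conflict check follows the UNGAPPED_ALIGNMENT note above.

```python
from urllib.parse import urlencode


def build_parameters(query, **settings):
    """Assemble a parameter dictionary and reject the conflicting combination."""
    # Assumed defaults; WORD_SIZE and EXPECT follow the standard settings.
    params = {"PROGRAM": "blastn", "DATABASE": "nt",
              "EXPECT": 10, "WORD_SIZE": 11}
    params.update(settings)
    params["QUERY"] = query  # assumed key name for the query sequence

    enabled = {"TRUE", "YES"}
    ungapped = str(params.get("UNGAPPED_ALIGNMENT", "")).upper() in enabled
    composition = str(
        params.get("COMPOSITION_BASED_STATISTICS", "")).upper() in enabled
    if ungapped and composition:
        raise ValueError("UNGAPPED_ALIGNMENT must not be enabled together "
                         "with COMPOSITION_BASED_STATISTICS")
    return params


# Short/near exact settings for a hypothetical 18-base primer.
params = build_parameters("ACGTTGCAAGCTTAGGCA", WORD_SIZE=7, EXPECT=1000)
print(urlencode(params))
```

Rejecting the conflicting pair locally avoids submitting a search that would return a warning message instead of results.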


Copyright © 1999-2005 INCOGEN Inc., All rights reserved.