
help page


Contents

Introduction

The alphabet

The alignment block

The background distribution

The translation

Troubleshooting


Back to H-BloX Back to the table of contents


Introduction

H-BloX is a web-based JavaScript application that allows one to calculate and visualize the Shannon information or the relative entropy (Kullback-Leibler "distance") of sequence alignment blocks. Various residue alphabets may be defined and useful modifications of the alignment block and the alphabet can be applied before calculation.
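
As a rough illustration of the two quantities named above, the following Python sketch computes the Shannon entropy and the relative entropy (Kullback-Leibler "distance") of a single alignment column. It uses the textbook definitions in bits. The function names, the column-wise treatment and the uniform background are assumptions made for this example; H-BloX itself may normalize or report its values differently.

```python
import math
from collections import Counter

def column_frequencies(column):
    """Relative frequency of each symbol in one alignment column."""
    counts = Counter(column)
    total = sum(counts.values())
    return {symbol: n / total for symbol, n in counts.items()}

def shannon_entropy(column):
    """Shannon entropy of a column in bits: -sum(p * log2(p))."""
    freqs = column_frequencies(column)
    return -sum(p * math.log2(p) for p in freqs.values())

def relative_entropy(column, background):
    """Relative entropy in bits: sum(p * log2(p / q)).

    `background` maps each symbol to its background probability q
    (hypothetical input format). Symbols that do not occur in the
    column contribute nothing to the sum.
    """
    freqs = column_frequencies(column)
    return sum(p * math.log2(p / background[s]) for s, p in freqs.items())

# A column holding the residues N, D, G, N, N (one per sequence).
column = "NDGNN"
# Assumption: an even background over the 20 amino acids.
uniform = {aa: 1 / 20 for aa in "ACDEFGHIKLMNPQRSTVWY"}
print(shannon_entropy(column))
print(relative_entropy(column, uniform))
```

A fully conserved column has an entropy of zero, and its relative entropy against an even background of n symbols equals log2(n).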



The alphabet

You can either select a predefined alphabet, type in a custom alphabet or use the "smart" option. Currently there are three predefined alphabets:
1. The amino acid single letter code (ACDEFGHIKLMNPQRSTVWY)
2. The nucleotides of DNA (ATCG)
3. The nucleotides of RNA (AUCG)
As soon as you edit the displayed alphabet manually, the selection switches to "custom". Custom alphabets are automatically sorted in lexicographical order, and duplicate entries are removed. All characters with ASCII codes between 32 and 255 are accepted as alphabet letters. The "smart" option is a special case: when it is selected, H-BloX scans the alignment block and extracts the alphabet from the symbols used there, and it updates the alphabet automatically whenever the alignment block changes. The number of different symbols in the alphabet is displayed directly beneath the alphabet selection boxes and cannot be edited.
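
The handling of "custom" and "smart" alphabets described above can be sketched in Python as follows. The function names are hypothetical, and sorting by character code is an assumption about what "lexicographical order" means here; the actual H-BloX code may differ.

```python
def normalize_custom_alphabet(text):
    """Keep characters with codes 32..255, drop duplicates, sort by code."""
    allowed = {ch for ch in text if 32 <= ord(ch) <= 255}
    return "".join(sorted(allowed))

def smart_alphabet(block_lines):
    """Collect every distinct symbol that occurs in the alignment block."""
    return normalize_custom_alphabet("".join(block_lines))

print(normalize_custom_alphabet("GATTACA"))   # ACGT
print(smart_alphabet(["NFDW", "DFDL"]))       # DFLNW
print(len(smart_alphabet(["NFDW", "DFDL"])))  # 5 (number of different symbols)
```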



The alignment block

Type in an alignment block. The block should consist only of symbols that are present in the alphabet. You can enforce this by choosing the "smart" option in the alphabet selection box. However, when Protein, DNA or RNA sequences are processed, the following characters are ignored in the block: " " (space), the digits "0" to "9", "_" (underscore), ":" (colon), "," (comma) and ";" (semicolon). Be aware that this changes the relative position of all symbols that follow one of these characters in the same sequence! For Protein, DNA or RNA sequences the block may be written in upper case, lower case or a mixture of both. For "custom" and "smart" alphabets, case does matter!
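
The effect of the ignored characters can be seen in a small Python sketch. The function name and the "protein"/"dna"/"rna" labels are hypothetical, and folding the sequence to upper case is only an assumption about how mixed case is accepted for the predefined alphabets.

```python
# Characters ignored for Protein, DNA and RNA sequences.
IGNORED = set(" 0123456789_:,;")

def clean_sequence(sequence, alphabet_kind):
    """Strip ignored characters for predefined alphabets only."""
    if alphabet_kind in ("protein", "dna", "rna"):
        kept = "".join(ch for ch in sequence if ch not in IGNORED)
        return kept.upper()  # assumption: case is folded to upper case
    return sequence          # "custom" and "smart": everything matters

raw = "nfd wsn_yhg 16"
print(clean_sequence(raw, "protein"))  # NFDWSNYHG
print(clean_sequence(raw, "custom"))   # nfd wsn_yhg 16
```

In the cleaned sequence the "W" sits at position 4 instead of position 5, which is the shift in relative position mentioned above.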

Since version 1.4, H-BloX can read the FASTA and BLOCKS formats. The FASTA format is very simple: each sequence starts with a header line, which begins with ">" followed by an identifier. The following line(s) contain the sequence. Empty lines are ignored. Here is an example:


>BBP_PIEBR/16
NFDWSNYHGKWWEVA
>ICYA_MANSE/17
DFDLSAFAGAWHEIA
>LACB_BOVIN/25
GLDIQKVAGTWYSLA
>MUP2_MOUSE/27
NFNVEKINGEWHTII
>RETB_BOVIN/14
NFDKARFAGTWYAMA
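
Reading this format takes only a few lines. The Python sketch below follows the rules stated above (header lines start with ">", sequences may span several lines, empty lines are ignored); the function name is hypothetical and the code is not taken from H-BloX. It treats every line that starts with ">" as a header, so it does not attempt to handle sequences that themselves contain ">".

```python
def parse_fasta(text):
    """Return a list of (identifier, sequence) pairs from FASTA text."""
    records = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue                    # empty lines are ignored
        if line.startswith(">"):
            records.append([line[1:].strip(), ""])
        elif records:                   # lines before the first header are skipped
            records[-1][1] += line      # a sequence may span several lines
    return [(name, seq) for name, seq in records]

example = """
>BBP_PIEBR/16

NFDWSNYHGKWWEVA

>ICYA_MANSE/17
DFDLSAFAGAWHEIA
"""
for name, seq in parse_fasta(example):
    print(name, seq)
```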

The BLOCKS format is produced by the BLOCKS program, which is available at http://blocks.fhcrc.org/blocks/. The BLOCKS program searches for ungapped alignment blocks using two methods (MOTIF and GIBBS). H-BloX accepts the output of only one method at a time. N.B. A single BLOCKS output can contain more than one block. All blocks are combined for the H-BloX calculations, although the individual blocks can be distinguished again in the output. A valid input to H-BloX would be, for example:


              **BLOCKS from MOTIF**

>Lipocal Five tough ones from Lawrence et al, Science 262:208-214...
5 sequences are included in 2 blocks

            LipocalA, width = 15 LipocalB, width = 11
 BBP_PIEBR    16 NFDWSNYHGKWWEVA (  70)   101 VLSTDNKNYII
ICYA_MANSE    17 DFDLSAFAGAWHEIA (  73)   105 VLATDYKNYAI
LACB_BOVIN    25 GLDIQKVAGTWYSLA (  70)   110 VLDTDYKKYLL
MUP2_MOUSE    27 NFNVEKINGEWHTII ( 101)   143 DLSSDIKERFA
RETB_BOVIN    14 NFDKARFAGTWYAMA (  77)   106 IIDTDYETFAV
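
One way to pull the block segments out of such input is sketched below in Python. It relies only on the layout visible in the example (sequence name, then pairs of start position and upper-case segment, with distances in parentheses between them). The regular expressions and function names are assumptions made for illustration and may not cover every BLOCKS output or match how H-BloX parses it.

```python
import re

# A sequence row: name, start position, then an upper-case segment.
ROW = re.compile(r"^\s*(\S+)\s+\d+\s+[A-Z]+")
# One block segment: start position followed by the residues.
SEGMENT = re.compile(r"(\d+)\s+([A-Z]+)")

def parse_blocks_rows(text):
    """Return {sequence name: [segment of block 1, segment of block 2, ...]}."""
    rows = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            rest = line[match.end(1):]  # everything after the name
            rows[match.group(1)] = [seg for _, seg in SEGMENT.findall(rest)]
    return rows

def combine_blocks(rows):
    """Concatenate all blocks of each sequence into one string."""
    return {name: "".join(segments) for name, segments in rows.items()}

sample = """5 sequences are included in 2 blocks
            LipocalA, width = 15 LipocalB, width = 11
 BBP_PIEBR    16 NFDWSNYHGKWWEVA (  70)   101 VLSTDNKNYII
MUP2_MOUSE    27 NFNVEKINGEWHTII ( 101)   143 DLSSDIKERFA
"""
print(combine_blocks(parse_blocks_rows(sample)))
```

Keeping the segments in a list per sequence before joining them preserves the block boundaries, so the single blocks can still be told apart after the combined calculation.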

 





The background distribution

A background distribution of the alphabet symbols can be defined. Presets are available for the different alphabets, and an even background distribution can be chosen for any alphabet; the correct format string is then created instantly. The "custom" option allows you to define your own background distribution. The format is as follows:
"character(s)->percent[,character(s)->percent[,...]]",
where "character(s)" stands for one or more ASCII characters with ASCII codes between 32 and 255. Arguments in square brackets [] can be omitted. The arrow "->", consisting of the two ASCII characters with codes 45 and 62, separates "character(s)" from "percent". Finally, "percent" must be a string that the JavaScript function "eval()" can process to produce a number. For example, the number "5" and the string "100/5" are both valid. The background distribution string is processed from left to right, so an existing percent value for a character is overwritten by any later entry for the same character further to the right.
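
To make the format concrete, here is a minimal Python sketch of how such a string could be interpreted. It is not the H-BloX implementation: H-BloX uses JavaScript "eval()", whereas this sketch substitutes a small arithmetic evaluator that accepts only numbers and the operators + - * /. The function names are hypothetical, and entries whose "character(s)" part contains a comma or the arrow itself are not handled.

```python
import ast
import operator

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}

def evaluate_percent(expression):
    """Evaluate a plain arithmetic expression such as "5" or "100/5"."""
    def walk(node):
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError(f"unsupported percent expression: {expression!r}")
    return float(walk(ast.parse(expression.strip(), mode="eval").body))

def parse_distribution(spec):
    """Map each character to its percent value; later entries win."""
    percents = {}
    for entry in spec.split(","):
        characters, arrow, percent = entry.partition("->")
        if not arrow:
            raise ValueError(f"missing '->' in entry: {entry!r}")
        value = evaluate_percent(percent)
        for ch in characters:
            percents[ch] = value  # overwrites any earlier value
    return percents

print(parse_distribution("AT->30,CG->100/5,A->40"))
# {'A': 40.0, 'T': 30.0, 'C': 20.0, 'G': 20.0}
```

The last entry shows the left-to-right rule: "A" first receives 30 together with "T" and is then overwritten with 40.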



The translation

Every symbol of the alphabet and the block can be translated to any character. The format for a translation expression is
"character(s)->new_symbol[,character(s)->new_symbol[,...]]",
where "character(s)" stands for one or more ASCII characters with ASCII codes between 32 and 255. Arguments in square brackets [] are optional. The arrow "->", consisting of the two ASCII characters with codes 45 and 62, separates "character(s)" from "new_symbol". Finally, "new_symbol" must be an ASCII character (ASCII code between 32 and 255). If "new_symbol" consists of more than one character, only the first one is used for the translation. The translation string is processed from left to right, so a translation of a character is overwritten by any later translation of the same character further to the right.
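
The translation rules can be sketched in Python in the same spirit. The function names are hypothetical and the code is an illustration of the rules stated above, not the H-BloX source; entries whose "character(s)" part contains a comma or the arrow itself are not handled.

```python
def parse_translation(spec):
    """Map each source character to its replacement; later entries win."""
    table = {}
    for entry in spec.split(","):
        characters, arrow, new_symbol = entry.partition("->")
        if not arrow or not new_symbol:
            raise ValueError(f"malformed translation entry: {entry!r}")
        for ch in characters:
            table[ch] = new_symbol[0]  # only the first character is used
    return table

def translate(sequence, table):
    """Apply the table to a sequence; unmapped symbols stay unchanged."""
    return "".join(table.get(ch, ch) for ch in sequence)

table = parse_translation("ILV->h,DE->n,KR->pos,I->b")
print(translate("DIVEKG", table))  # nbhnpG
```

Here "I" is first mapped to "h" together with "L" and "V" and then remapped to "b" by the later entry, while "KR->pos" uses only the "p". Translating several residues to one symbol is a simple way to work with a reduced alphabet.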



Troubleshooting

H-BloX does not react at all

Be sure that your browser supports "JavaScript Version 1.2". Please refer to the documentation of your browser software in order to find out how to achieve this.

The H-BloX page looks funny

Be sure that "Style Sheets" are enabled in your browser. Please refer to the documentation of your browser software in order to find out how to achieve this.

Where are the results?

The results are printed in a separate window, which is created the first time H-BloX calculates the information content or the relative entropy of an alignment block. If an output window already exists from a previous use of H-BloX, it is reused. The results window may therefore be hidden behind other windows on your screen.

It takes ages until the results appear

The time until the results appear depends only on the computing power of your machine, because H-BloX is written entirely in JavaScript and all operations are performed locally. The more sequences the alignment block contains, the longer it takes to process. Please note that some versions of Microsoft "Internet Explorer" take quite a long time just to display the results table; at that point the main calculations of H-BloX are already done.

There are several "custom" distributions

This occurs in all browsers other than "Netscape". Unfortunately, these browsers cannot delete options from a selection box while H-BloX is running. The general function of the program is not affected by this behaviour. Just select any one of the "custom" options and continue.

Known bugs

Alphabets that contain the symbols "-" (ASCII code: 45) and ">" (ASCII code: 62), but no symbols with ASCII codes between 45 and 62, cause trouble with the "even" background distribution. Workaround: define a "custom" background distribution with a different order of the alphabet symbols in the "character(s)" string, then translate either "-" or ">" to another character.

Blocks in FASTA format that contain the symbol ">" (ASCII code: 62) are not handled well, because this symbol is also used as an identifier for sequence names. However, the program tries its best to distinguish between the different meanings of the symbol in such a case.

When using Microsoft "Internet Explorer", close the output window before reloading the H-BloX application in a browser window. Otherwise the browser will report an error. If this happens, just close the output window and restart the calculation.

When using Microsoft "Internet Explorer", the output option "Fit output table(s) into window" in Step 6 should not be checked. This problem is not always present, so try it first...
