Intron, Exon & UTR explorer...

Frequently Asked Questions

Question 1.
What is the Intron, Exon and UTR explorer?

Intron, Exon and UTR explorer (IEUe) is an online tool that isolates the individual intron, exon and UTR (IEU) sequences from a gene sequence. It does so using the starting position of the entire gene, followed by the ending position (co-ordinate) of each IEU.

Question 2.
What other features does this tool have?

Besides extracting individual IEU sequences, this online tool has several other features: 1. It identifies missing nucleotides. 2. It reports the base composition and the percentage of each base for each IEU. 3. It automatically constructs the entire intron, mRNA, CDS and UTR sequences by combining the individual IEUs, then analyses their base composition and reports the presence of any missing nucleotides.
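As an illustration of the third feature, the following is a minimal Python sketch of how extracted segments might be recombined. It is not the tool's own code (IEUe is written in PHP). The function name `assemble` and the (kind, sequence) input layout are hypothetical, and the sketch assumes that the mRNA is built from the exon and UTR segments in order, and the CDS from the exon segments only.

```python
def assemble(pieces):
    """Combine ordered (kind, sequence) segments into whole gene elements.

    kind is 'I' (intron), 'E' (exon) or 'U' (UTR).
    Assumption: mRNA = exons + UTRs in order, CDS = exons only.
    """
    def join(kinds):
        # Concatenate, in their original order, all segments of the given kinds.
        return "".join(seq for kind, seq in pieces if kind in kinds)

    return {
        "intron": join({"I"}),
        "utr": join({"U"}),
        "cds": join({"E"}),
        "mrna": join({"E", "U"}),
    }


# Toy segments, invented purely for illustration.
pieces = [("U", "AA"), ("E", "CCC"), ("I", "GG"), ("E", "TT"), ("U", "A")]
elements = assemble(pieces)
print(elements["mrna"], elements["cds"], elements["intron"], elements["utr"])
```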

Question 3.
How does the presence of missing nucleotides affect the base composition?

It is impossible to predict the exact bases in a region of missing nucleotides. So when calculating the base composition of any IEU or gene element (CDS / mRNA / Intron / Exon), IEUe automatically excludes the missing regions and performs the calculations on the remaining regions with known bases.
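The calculation described above can be sketched in Python as follows. This is only an illustration, not the tool's PHP code. The function name `base_composition` and the layout of the returned dictionary are hypothetical. The percentages use the count of known bases (A/T/G/C) as the denominator, so positions reported as 'N' do not dilute them.

```python
from collections import Counter


def base_composition(sequence):
    """Count A/T/G/C and report percentages over known bases only."""
    counts = Counter(sequence.upper())
    known = sum(counts[base] for base in "ATGC")   # length with 'N' excluded
    percent = {
        base: (100.0 * counts[base] / known if known else 0.0)
        for base in "ATGC"
    }
    return {
        "counts": {base: counts[base] for base in "ATGC"},
        "missing": counts["N"],        # number of missing nucleotides
        "known_length": known,
        "percent": percent,
    }


# Ten positions, two of them missing: percentages are taken over 8 bases.
report = base_composition("AATTGGCCNN")
print(report["missing"], report["known_length"], report["percent"]["A"])
```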

Question 4.
How are missing nucleotides identified?

Missing nucleotides are generally reported as 'N', so this tool looks for the presence of 'N' in the sequence. If the sequence contains any character other than A/T/G/C/N, the tool detects it before starting any calculations and asks for the user's input. The user is cautioned about the possible impact of such characters or symbols. Sequences can be provided in lower case, upper case or a combination of both; the system automatically converts them to upper case.
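A minimal Python sketch of this check is given below. It is illustrative only: the function name `check_sequence` is hypothetical, and removing whitespace and line breaks before the check is an assumption rather than documented behaviour of the tool.

```python
def check_sequence(raw):
    """Normalise a sequence and flag missing or unexpected characters.

    Returns (sequence, n_positions, unexpected), where n_positions holds the
    1-based positions of 'N' and unexpected lists any other non-A/T/G/C symbol.
    """
    # Assumption: whitespace and line breaks are not part of the sequence.
    seq = "".join(raw.split()).upper()
    n_positions = [i + 1 for i, base in enumerate(seq) if base == "N"]
    unexpected = sorted({base for base in seq if base not in "ATGCN"})
    return seq, n_positions, unexpected


seq, n_positions, unexpected = check_sequence("acgN\ntx")
if unexpected:
    # The tool asks the user how to proceed; here we only print a caution.
    print("Unexpected characters found:", unexpected)
print(seq, n_positions)
```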

Question 5.
What are the input requirements to use this tool?

This tool requires exactly two input files: 1. The file containing the gene sequence. 2. The file containing the IEU co-ordinates.

Question 6.
What is the format of the input sequence?

Sequences must be provided either in fasta format without any sequence identifier (i.e. GI number, accession number, sequence name or species name) or as plain continuous text. The sequence file must contain only the exact gene sequence from the beginning, i.e. the very 1st base in the file must be part of the gene. The sequence file must be saved in text format (.txt). Unfortunately no other format such as .pdf / .rtf / .doc / .docx is allowed.

Question 7.
How are the IEU details provided?

The input file format is the same as for the sequence file, i.e. .txt. However, the IEU co-ordinates are written in a particular way. The file must begin with a starting co-ordinate written as NS, where N is an integer equal to the starting nucleotide position of the gene. Every following IEU must be reported as XI, YE or ZU (where X, Y and Z are all integers). Each IEU or S entry must end with a comma (,), for example 112S,230I,445E,560U, etc. The IEUs can be written in a single line, each on a separate line, or in a combination of both. Spaces between IEU entries do not matter; the system takes care of them by itself. The most important points are that the file starts with the starting co-ordinate given as NS, and that every IEU entry ends with a comma (,), including the last one. Another important requirement is that the position co-ordinates must be in increasing order, i.e. 112S, 230I, 445E, 560U, where 112<230<445<560.
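The rules above can be sketched as a small Python parser. This is only an illustration and not the tool's PHP code. The function names `parse_ieu_coordinates` and `split_by_coordinates` are hypothetical. The slicing step also assumes that the first base of the sequence file corresponds to position N, and that each IEU runs from the position after the previous end co-ordinate up to and including its own end co-ordinate.

```python
import re


def parse_ieu_coordinates(text):
    """Parse text such as '112S,230I,445E,560U,' into (start, [(end, kind), ...])."""
    compact = re.sub(r"\s+", "", text)      # spaces and line breaks are ignored
    if not compact.endswith(","):
        raise ValueError("every entry, including the last one, must end with a comma")
    entries = []
    for token in compact[:-1].split(","):
        match = re.fullmatch(r"(\d+)([SIEU])", token)
        if match is None:
            raise ValueError(f"malformed entry: {token!r}")
        entries.append((int(match.group(1)), match.group(2)))
    if entries[0][1] != "S":
        raise ValueError("the first entry must be the starting co-ordinate (NS)")
    if any(kind == "S" for _, kind in entries[1:]):
        raise ValueError("only the first entry may be a starting co-ordinate")
    positions = [position for position, _ in entries]
    if any(a >= b for a, b in zip(positions, positions[1:])):
        raise ValueError("co-ordinates must be in increasing order")
    return entries[0][0], entries[1:]


def split_by_coordinates(sequence, start, elements):
    """Cut the gene sequence into (kind, first, last, subsequence) tuples.

    Assumption: sequence[0] is the base at position `start`, and both ends
    of each segment are inclusive.
    """
    pieces = []
    first = start
    for end, kind in elements:
        lo, hi = first - start, end - start + 1   # 0-based slice bounds
        if hi > len(sequence):
            raise ValueError(f"co-ordinate {end} lies beyond the end of the sequence")
        pieces.append((kind, first, end, sequence[lo:hi]))
        first = end + 1
    return pieces


# Toy data, invented for illustration: a 10-base gene starting at position 5.
start, elements = parse_ieu_coordinates("5S, 7U,\n10E,12I,14E,")
for piece in split_by_coordinates("AAACCCGGTT", start, elements):
    print(piece)
```

Rejecting malformed input early, rather than guessing at the intended co-ordinates, is a design choice of this sketch; how the tool itself reacts to such input is not stated here.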

Question 8.
What are the output options?

After both essential files have been supplied successfully, the user is offered two output options. Outputs can be viewed instantly in the browser or downloaded as a text (.txt) file to analyse later. One can also download the output after viewing it in the browser, and vice versa. The downloadable file is laid out so that its data can be copied directly into any standalone spreadsheet application (Microsoft Office Excel or LibreOffice Calc) for further analysis or comparison.

Question 9.
How long are the outputs stored on the server?

The host server does not store any output files. Outputs are generated on the fly, both for viewing in the browser and for download. Once the user has left the site or closed the browser, the input files are no longer recognized, so the user must supply the sequence and IEU data files again in order to view or download the output. If the user has not left the site but has been inactive for a long time, the server keeps the input files for a maximum of 30 minutes, which should generally be more than enough to analyse any sequence. After 30 minutes the input files are deleted automatically, and the user must start the entire process again to view or download the output. This time limit is imposed to make optimum use of storage and bandwidth. Although both costs are very minor in today's terms, there is no point in storing these files unnecessarily.

Question 10.
What is composition analysis?

Composition analysis is an additional feature of this online tool. Given any nucleotide sequence, it reports the base composition, the proportion of each base and the presence of any missing nucleotides.

Question 11.
Why is IEU explorer called an online tool rather than a web-server?

The distinction between present-day academic/research online tools and web-servers is not clearly defined. IEU explorer is designed simply to analyse the composition of any polynucleotide sequence, identify the separate IEU sequences and recombine them in a specific order. The main aim behind its development is to minimize human labour, saving the time otherwise spent on tedious, repetitive work, and to minimize the human errors that are common when working repeatedly with specific positions in sequence data that has little variety (a sequence contains only continuous stretches of A/T/G/C, and it is really hard to find a single 'N' among thousands of lines of A/T/G/C). The code is of course hosted on a web server: being written in PHP and HTML, it must be processed by a server application, and the interface is a web browser. However, given the complexity (or rather simplicity) of the code, the authors consider the term web-tool or online tool more suitable for IEU explorer than web-server.