Prediction Page

This page allows users to input their FASTA sequences to predict protein functions.


Input:

Users must enter the following values on the Prediction page:

  • Minimum Confidence: This parameter specifies the minimum confidence level required for a term to be considered during output generation.
  • FASTA Sequences: The sequences to be predicted, provided in FASTA format. Each task can include up to 50 sequences and is limited to 100,000 characters.
  • Add Security Password: Checkbox that determines whether the task will be protected by a password. If it is not selected, all system users will be able to view the predicted proteins and terms. If the user selects this option, a unique password is generated for the submitted task. This password is required to access the results, so be sure to save it.
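
The effect of the Minimum Confidence parameter can be pictured with a minimal sketch. It assumes predictions are held as a mapping from GO term ID to confidence and that the threshold is inclusive; neither detail is stated on this page, and `filter_terms` is a hypothetical name used only for illustration. The confidence values are made up.

```python
def filter_terms(predictions: dict[str, float], min_confidence: float) -> dict[str, float]:
    """Keep only the terms whose confidence reaches the threshold.

    Assumption: the comparison is inclusive (>=); the tool may differ.
    """
    return {term: conf for term, conf in predictions.items() if conf >= min_confidence}


# Illustrative values only, not real predictions.
example = {"GO:0005737": 0.91, "GO:0005634": 0.32}
kept = filter_terms(example, min_confidence=0.5)
```

A higher threshold yields fewer, more reliable terms; a lower one yields more terms, including low-confidence ones.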

Errors:

If the input is empty or does not contain sequences in FASTA format, a notification will appear indicating this error:

Form Without Fasta

Notification for a submission with empty input.

If the input has more than 100,000 characters, a notification will appear indicating this error:

Form With Long Sequences

Notification for a submission with more than 100,000 characters.
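
To check an input before submitting it, the limits described on this page can be approximated locally. The sketch below is not the server's validation logic: it assumes the 100,000-character limit applies to the whole text pasted into the form, and it treats any input whose first non-empty line does not start with ">" as not being FASTA. The function name and messages are hypothetical.

```python
MAX_SEQUENCES = 50        # limit stated on this page
MAX_CHARACTERS = 100_000  # limit stated on this page


def validate_fasta(text: str) -> list[str]:
    """Return a list of problems found; an empty list means the input looks acceptable."""
    # Assumption: the character limit counts the entire submitted text.
    if len(text) > MAX_CHARACTERS:
        return ["input has more than 100,000 characters"]

    lines = [line.strip() for line in text.splitlines() if line.strip()]
    headers = [line for line in lines if line.startswith(">")]

    # Empty input, no leading header, or headers with no sequence lines at all.
    if not lines or not lines[0].startswith(">") or len(headers) == len(lines):
        return ["input is empty or not in FASTA format"]

    if len(headers) > MAX_SEQUENCES:
        return ["input has more than 50 sequences"]

    return []
```

For example, `validate_fasta(">protein1\nMKTAYIAKQR\n")` returns an empty list, while `validate_fasta("")` reports the empty-input problem.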


Submission:

If no errors are found during the analysis of the submission, the task will be added to the queue for execution.

If the user did not choose to add a security password, a notification will indicate the task ID:

Task ID

Notification of task ID if no security password is added.

If the user chose to add a security password, the notification will show both the task ID and the password needed to access the results after processing. Save this password, as it is required to access the results:

Form Security Password

Notification of task ID and password if a security password is added.

After submission, a window indicates the ID of the submitted task. With this ID, users can track their request on the Queue page.

Queue Page

After submitting a task, users can monitor the execution queue and task completion on the Queue page.


Pending Tasks

If the task is in the queue for processing, it will appear in the Pending Tasks section. In this section, all tasks are displayed in order of their position in the queue. Each task takes approximately 5 minutes to execute, with the estimated execution time displayed alongside the task ID:

Queue Pending Tasks

Example of a task in the Pending Tasks section.
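
As a rough guide to how long a queued task may wait, the 5-minute figure can be multiplied by the task's position in the queue. This is a back-of-the-envelope sketch, assuming tasks run one at a time and that position 1 is the next task to run; the page does not state how the displayed estimate is computed, and `estimated_completion` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

MINUTES_PER_TASK = 5  # approximate execution time stated on this page


def estimated_completion(position: int, now: datetime) -> datetime:
    """Estimate when the task at a 1-based queue position would finish.

    Assumption: tasks run sequentially, each taking about 5 minutes.
    """
    if position < 1:
        raise ValueError("position must be 1 or greater")
    return now + timedelta(minutes=MINUTES_PER_TASK * position)


# A task third in the queue would finish roughly 15 minutes from now.
eta = estimated_completion(3, datetime.now(timezone.utc))
```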


Processed Tasks

After execution, tasks are moved to the Processed Tasks section. Tasks remain available for viewing in this section for 24 hours, with times based on the UTC-3 timezone.
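
Users in other timezones may want to work out when a task stops being available. The following sketch assumes the 24-hour window starts when the task finishes, which this page does not state explicitly; `expires_at` is a hypothetical helper, and it expects a timezone-aware datetime.

```python
from datetime import datetime, timedelta, timezone

SERVER_TZ = timezone(timedelta(hours=-3))  # UTC-3, as stated on this page
RETENTION = timedelta(hours=24)            # availability window stated on this page


def expires_at(finished_at: datetime) -> datetime:
    """Return the expiry moment, expressed in UTC-3.

    Assumption: the window is counted from the moment the task finished.
    `finished_at` must be timezone-aware.
    """
    return finished_at.astimezone(SERVER_TZ) + RETENTION


# A task that finished at 15:00 UTC (12:00 in UTC-3) on 1 January.
expiry = expires_at(datetime(2024, 1, 1, 15, 0, tzinfo=timezone.utc))
# Convert to the machine's local timezone for display.
local_expiry = expiry.astimezone()
```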

If the task was executed without any issues, the results can be accessed using the corresponding Details button:

Queue Processed Tasks

Example of a task successfully executed in the Processed Tasks section.

If the user chose to add a security password, they will be required to enter the provided password when attempting to access the results:

Task Password

Modal window to input the security password.

Output Page

After the task is executed, the results page is available for 24 hours. At the top of the results page, users can download the results in JSON or CSV format, or download all the results together, including the JSON and CSV files and the figures generated during the evaluation:

Output Header

Top of the results page with the download options.

Below the header, users can view the results on the screen. For each protein, the page displays the protein name, protein sequence, and the predicted functions with the confidence levels for Biological Process, Cellular Component, and Molecular Function:

Output Results

Example of results for Cellular Component.

Following the results table for each ontology and protein, an image illustrates how the predictions are organized according to the Gene Ontology DAG. In the images, each predicted term is represented by a node, and the relationships between terms are drawn as edges from the parent node to the child node. If the prediction confidence is equal to or greater than 0.75, the node is displayed in green. If the confidence is at least 0.5 but lower than 0.75, the node is displayed in yellow. If the confidence is lower than 0.5, the node is displayed in red:

Output Image

Example of predicted terms displayed in the organization of the Gene Ontology DAG for Cellular Component.
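
The color rule for the nodes can be written as a small function, which may be handy when post-processing the downloaded results. This is only a sketch of the thresholds described above, not the code used to draw the figures; `node_color` is a hypothetical name, and a confidence of exactly 0.5 is treated as yellow because only values lower than 0.5 are described as red.

```python
def node_color(confidence: float) -> str:
    """Map a prediction confidence to the node color used in the DAG images."""
    if confidence >= 0.75:
        return "green"
    if confidence >= 0.5:
        return "yellow"
    return "red"


# Illustrative confidences only.
colors = [node_color(c) for c in (0.9, 0.6, 0.2)]
```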

Contact Information

If you have any questions or need further assistance, please contact us at gabriel.oliveira@ic.unicamp.br.