EuGene is an integrative annotation software for eukaryotic genomes. Its final prediction integrates statistical analyses (statistical content, splice sites...), similarity data (proteins, transcripts, ESTs, RNA-Seq...) or conservation between genomes, existing predictions, and predictions of repeated or non-functional regions.
Like most existing gene finders, EuGene can exploit probabilistic models such as Markov models to discriminate coding from non-coding sequences, or to discriminate true splice sites from false ones (using various mathematical models). Beyond this, EuGene can integrate information from several signal prediction programs (splice sites, translation starts...), similarity with existing sequences (ESTs, mRNAs, 5'/3' ESTs from full-length mRNAs, proteins, homologous genomic sequences), and the output of existing gene finders. Based on all the available information, EuGene outputs the prediction of maximal score, i.e., the one that is maximally consistent with the information provided.
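To make the coding/non-coding discrimination concrete, here is a minimal C++ sketch of a log-odds score computed under two first-order Markov models. It is a deliberate simplification for illustration, not EuGene's actual model: the names `MarkovModel`, `uniformModel`, `nucleotideIndex` and `logOddsScore` are hypothetical, and the transition probabilities would have to be estimated from training sequences.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <string>

// Hypothetical first-order Markov model over nucleotides (A=0, C=1, G=2, T=3):
// model[a][b] is the probability of seeing b right after a.
using MarkovModel = std::array<std::array<double, 4>, 4>;

// Returns a model in which every transition has probability 0.25.
MarkovModel uniformModel() {
    MarkovModel m;
    for (auto& row : m) row.fill(0.25);
    return m;
}

// Maps a nucleotide to its index, or -1 for any other character (e.g. 'N').
int nucleotideIndex(char c) {
    switch (c) {
        case 'A': case 'a': return 0;
        case 'C': case 'c': return 1;
        case 'G': case 'g': return 2;
        case 'T': case 't': return 3;
        default: return -1;
    }
}

// Sums log(P_coding / P_noncoding) over all adjacent nucleotide pairs.
// A positive total favours the coding model, a negative one the non-coding model.
double logOddsScore(const std::string& seq,
                    const MarkovModel& coding,
                    const MarkovModel& noncoding) {
    double score = 0.0;
    for (std::size_t i = 1; i < seq.size(); ++i) {
        const int prev = nucleotideIndex(seq[i - 1]);
        const int cur = nucleotideIndex(seq[i]);
        if (prev < 0 || cur < 0) continue;  // skip pairs with ambiguous bases
        // Zero probabilities would make the logarithm undefined.
        assert(coding[prev][cur] > 0.0 && noncoding[prev][cur] > 0.0);
        score += std::log(coding[prev][cur] / noncoding[prev][cur]);
    }
    return score;
}
```

Working with log-odds rather than raw probabilities keeps the contributions additive, which is convenient when scores from several sources are later combined into a single prediction score.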
Each source of information is integrated into EuGene by a small independent software component called a "plugin". A plugin is responsible for integrating its information and for plotting it on EuGene's graphical output (if needed), and it can also analyze the inconsistencies between the final prediction and the information provided.
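The division of responsibilities described above can be sketched as an abstract C++ interface. This is an illustrative assumption, not EuGene's actual plugin API: the class names `Plugin` and `GcContentPlugin`, the method names, and the function `combinedCodingScore` are all hypothetical, and plotting is left out to keep the sketch short.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical plugin interface; EuGene's real classes and signatures differ.
class Plugin {
public:
    virtual ~Plugin() = default;
    virtual std::string name() const = 0;
    // Additive score contribution for "position pos is coding".
    virtual double codingScore(const std::string& seq, std::size_t pos) const = 0;
    // Number of positions where the final prediction contradicts this evidence.
    virtual std::size_t countInconsistencies(
        const std::string& seq, const std::vector<bool>& predictedCoding) const = 0;
};

// Toy plugin (an assumption for illustration): G and C count as coding evidence.
class GcContentPlugin : public Plugin {
public:
    std::string name() const override { return "GcContent"; }

    double codingScore(const std::string& seq, std::size_t pos) const override {
        assert(pos < seq.size());
        const char c = seq[pos];
        return (c == 'G' || c == 'C') ? 1.0 : -1.0;
    }

    std::size_t countInconsistencies(
        const std::string& seq, const std::vector<bool>& predictedCoding) const override {
        assert(seq.size() == predictedCoding.size());
        std::size_t n = 0;
        for (std::size_t i = 0; i < seq.size(); ++i) {
            const bool evidenceSaysCoding = codingScore(seq, i) > 0.0;
            if (evidenceSaysCoding != predictedCoding[i]) ++n;
        }
        return n;
    }
};

// Sums the contributions of all plugins at one position.
double combinedCodingScore(const std::vector<std::unique_ptr<Plugin>>& plugins,
                           const std::string& seq, std::size_t pos) {
    assert(pos < seq.size());
    double total = 0.0;
    for (const auto& plugin : plugins) total += plugin->codingScore(seq, pos);
    return total;
}
```

Because the caller only depends on the abstract interface, a new source of information can be added by writing one more derived class, without touching the code that combines the scores.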
A large variety of plugins already exists, and users can extend EuGene further if needed, using one of two approaches. The simple approach is to use the "Annotastruct" plugin, which injects information into EuGene from a GFF file. For more advanced users, it is possible to write a new plugin directly (in C++) and load it dynamically into EuGene, without recompiling EuGene.
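As an illustration of the kind of data a GFF file carries, the following C++ sketch parses one line of the standard nine-column, tab-separated GFF layout (sequence, source, type, start, end, score, strand, phase, attributes). It is not the Annotastruct plugin's code, and it makes no claim about which feature types that plugin expects: the names `GffRecord`, `parseCoordinate`, `parseGffLine` and `featureLength` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <sstream>
#include <string>
#include <vector>

// One GFF feature; coordinates are 1-based and inclusive.
struct GffRecord {
    std::string seqid, source, type;
    long start = 0;
    long end = 0;
    std::string score;
    char strand = '.';
    std::string phase, attributes;
};

// Parses a strictly numeric, positive coordinate; returns false otherwise.
bool parseCoordinate(const std::string& text, long& value) {
    try {
        std::size_t used = 0;
        const long parsed = std::stol(text, &used);
        if (used != text.size() || parsed < 1) return false;
        value = parsed;
        return true;
    } catch (const std::exception&) {
        return false;  // not a number, or out of range
    }
}

// Fills 'out' from one GFF line. Returns false (leaving 'out' untouched) for
// comments, blank lines, and lines that do not have nine valid columns.
bool parseGffLine(const std::string& line, GffRecord& out) {
    if (line.empty() || line[0] == '#') return false;

    std::vector<std::string> fields;
    std::stringstream stream(line);
    std::string field;
    while (std::getline(stream, field, '\t')) fields.push_back(field);
    if (fields.size() != 9) return false;

    GffRecord record;
    if (!parseCoordinate(fields[3], record.start)) return false;
    if (!parseCoordinate(fields[4], record.end)) return false;
    if (record.start > record.end) return false;

    record.seqid = fields[0];
    record.source = fields[1];
    record.type = fields[2];
    record.score = fields[5];
    record.strand = fields[6].empty() ? '.' : fields[6][0];
    record.phase = fields[7];
    record.attributes = fields[8];
    out = record;
    return true;
}

// Length in nucleotides of a parsed feature.
long featureLength(const GffRecord& record) {
    assert(record.start >= 1 && record.start <= record.end);
    return record.end - record.start + 1;
}
```

For example, a line describing an exon from position 100 to 250 on the forward strand would parse into a record with `start == 100`, `end == 250` and `strand == '+'`, for a feature length of 151 nucleotides.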