Ph.D.
Dipartimento di Comunicazione ed Economia, Università di Modena e Reggio Emilia, Viale Allegri, 9 - 42121 Reggio Emilia (Italy) and Dipartimento di Matematica e Informatica, Università di Ferrara, Via Machiavelli, 35 - 44121 Ferrara (Italy)
maria.federico AT unimore.it and fdrmra AT unife.it
Maria Federico is currently a professor on a temporary appointment, teaching two courses at the University of Modena and Reggio Emilia and one course at the University of Ferrara.
Her main research interests include bioinformatics, multimedia content accessibility, web accessibility, computational linguistics,
combinatorial algorithms and machine learning.
She graduated cum laude in Information Technologies
at the University of Pisa in 2006 with a thesis
titled "Notions of Maximality for Motifs in Biological Sequences".
She received her PhD in March 2011 from the Doctorate School in Multiscale Modelling, Computational Simulations and Characterization in Material and Life Sciences of the University of Modena and Reggio Emilia.
From January 2007 to December 2007 she was a research fellow at the
Institute of Computational Linguistics of the National
Research Council (CNR) in Pisa, where her main activity was the development of stochastic
models based on the Maximum Entropy method applied to Natural Language Processing.
Detailed information can be found in her CV: English version, Italian version.
A detailed account of the activities attended during the PhD is available here.
The main goal of the doctorate research activity was the development of algorithms and combinatorial methods to solve problems inspired by biology. The main problems we investigated are:
The goal of this research project is to compare the two main exact approaches (the direct approach using suffix trees and the double-stage approach) to the extraction of structured motifs (i.e., groups of simple motifs occurring relatively near each other) approximated under the Hamming distance. The comparison is meant to yield a set of conditions on the parameter sets that lets biologists choose the more suitable tool for a specific motif search. Since, to the best of our knowledge, no tool based on the double-stage approach finds structured motifs composed of any number of boxes, we developed a tool that implements this approach.
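As a rough illustration only (this is not the tool described above), the following Python sketch shows what an occurrence of a structured motif under the Hamming distance looks like. It assumes the boxes are already given, whereas real motif extraction has to infer them, and it scans the sequence naively instead of using a suffix tree. The names `hamming`, `box_occurrences` and `structured_occurrences`, and the single `min_gap`/`max_gap` range shared by all boxes, are hypothetical choices made for the example.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def box_occurrences(seq: str, box: str, max_mismatches: int) -> list:
    """Start positions where `box` matches `seq` with at most `max_mismatches` substitutions."""
    k = len(box)
    return [i for i in range(len(seq) - k + 1)
            if hamming(seq[i:i + k], box) <= max_mismatches]


def structured_occurrences(seq, boxes, max_mismatches, min_gap, max_gap):
    """Chain per-box occurrences whose gaps lie in [min_gap, max_gap].

    Stage 1 locates each box on its own; stage 2 combines the positions.
    The gap is measured from the end of one box to the start of the next.
    """
    chains = [[p] for p in box_occurrences(seq, boxes[0], max_mismatches)]
    for prev_box, box in zip(boxes, boxes[1:]):
        positions = box_occurrences(seq, box, max_mismatches)
        chains = [chain + [p]
                  for chain in chains
                  for p in positions
                  if min_gap <= p - (chain[-1] + len(prev_box)) <= max_gap]
    return chains


if __name__ == "__main__":
    # Two boxes separated by a gap of four characters.
    print(structured_occurrences("ACGTTTTGCA", ["ACG", "GCA"], 0, 2, 5))
```

Locating the boxes separately and then combining their positions mirrors, in a very simplified form, the idea of working in two stages; the quadratic combination step is acceptable only for toy inputs.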
This research project consists of the development of an ab initio tool that finds long repeats in a single sequence or in two or more sequences. In particular, it is able to find repeats longer than 100 characters such that each pair of occurrences shows substitutions, insertions or deletions in up to 10 to 15% of their length. The proposed method pre-processes the input sequences with an optimized version of the TUIUIU filter, which discards fragments of the input sequences that are guaranteed not to contain any sought repeat, and uses information collected during the filtering phase to speed up a subsequent dynamic-programming-based alignment step that finds the repeats. To the best of our knowledge, no other tool can deal with repeats that possibly occur several times, that are hundreds or thousands of bases long, and whose occurrences may differ in more than 10% of their positions in terms of substitutions and indels.
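To make the error criterion concrete, here is a minimal Python sketch of the kind of dynamic-programming check that can decide whether two fragments are close enough to count as occurrences of the same repeat. It is a textbook edit-distance computation, not the alignment step of the tool described above, and it does not implement the TUIUIU filter. Normalizing the distance by the longer fragment, and the names `edit_distance` and `is_approximate_repeat`, are assumptions made for the example.

```python
def edit_distance(a: str, b: str) -> int:
    """Minimum number of substitutions, insertions and deletions turning `a` into `b`."""
    prev = list(range(len(b) + 1))          # distances from the empty prefix of `a`
    for i, ca in enumerate(a, start=1):
        curr = [i]                          # deleting the first i characters of `a`
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,            # deletion
                            curr[j - 1] + 1,        # insertion
                            prev[j - 1] + cost))    # match or substitution
        prev = curr
    return prev[-1]


def is_approximate_repeat(a: str, b: str, max_error_rate: float = 0.15) -> bool:
    """True if the two fragments differ in at most `max_error_rate` of the longer length."""
    longest = max(len(a), len(b))
    return longest > 0 and edit_distance(a, b) <= max_error_rate * longest


if __name__ == "__main__":
    print(is_approximate_repeat("ACGTACGTAC", "ACGAACGTAC"))
```

The full computation costs time proportional to the product of the two lengths, which is why discarding hopeless fragments before alignment matters for repeats that are hundreds or thousands of bases long.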
The goal of this research project is to extend notions of maximality already defined for exact simple motifs to the case of approximate motifs (according to the Hamming distance). For each of these notions, we gave a characterization on the suffix tree data structure. This allowed us to show how to adapt a whole class of suffix-tree-based algorithms, for which tools are already available, so that they infer maximal motifs only. The additional computational cost of checking the maximality requirements on the fly is constant or negligible. We developed a suffix-tree-based tool for the extraction of maximal motifs, which confirmed the theoretical results when applied to biological data.
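For readers unfamiliar with the idea, the sketch below illustrates one common notion of maximality for exact motifs: a motif is kept only if every one-character extension, to the left or to the right, occurs strictly fewer times. This is an assumption chosen for illustration; the project considers several notions and extends them to approximate motifs, which this example does not do. The check is done by brute-force counting rather than on a suffix tree, and the names `occurrences` and `is_maximal` are hypothetical.

```python
def occurrences(seq: str, motif: str) -> list:
    """Start positions of all (possibly overlapping) exact occurrences of `motif` in `seq`."""
    return [i for i in range(len(seq) - len(motif) + 1)
            if seq.startswith(motif, i)]


def is_maximal(seq: str, motif: str, alphabet: str = "ACGT") -> bool:
    """True if no one-character extension of `motif` keeps all of its occurrences."""
    count = len(occurrences(seq, motif))
    if count == 0:
        return False
    for c in alphabet:
        right = len(occurrences(seq, motif + c))
        left = len(occurrences(seq, c + motif))
        if right == count or left == count:
            return False        # the longer motif carries the same information
    return True


if __name__ == "__main__":
    seq = "ACGTACGA"
    for m in ("AC", "CG", "ACG"):
        print(m, is_maximal(seq, m))
```

In this toy sequence, "AC" and "CG" are not maximal because each can be extended to "ACG" without losing occurrences, while "ACG" itself is maximal.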
The aim of this research project is the development of a method for accurate prediction of protein interface residues using sequence information and descriptors of physical-chemical properties of amino acids.