Mining protein interaction data and its context from the scientific literature
PhD studentship in bioinformatics/text mining, University of Manchester
re-posted from psb.vib-ugent.be, 21.05.2010
The School of Computer Science invites applications for a four-year BBSRC-funded CASE studentship
commencing in the academic year 2010/11.
The studentship is open to
UK/EU applicants and will pay tuition fees in addition to a starting
stipend of £15,790 pa for UK students or
£13,000 pa for EU students.
It will also involve a research placement with the industrial CASE
partner, Pfizer Global R&D in Sandwich, Kent.
The project will involve research on the context of protein
interaction data from the scientific literature. The main archive of
life sciences literature currently contains more than 17 million
references and grows by approximately 2,000 articles every day. This
biomedically relevant information is invaluable and represents a rich
source of knowledge. However, its current size, let alone its future size, is
making it virtually impossible for individual scientists to keep
pace with publications in their own area, let alone related ones.
This has led to the generation of secondary databases that mine
specific information from the published literature. For example, much
emphasis has been placed on using text mining (often manually) to
identify protein interactions. However, little attempt is made to
capture the context of such information: how reliable it is, what
the nature of the interaction is, etc. This project will study the way
findings, experiments and knowledge about protein interactions are
presented in the literature, and in particular how contextual
information that details protein interactions is encoded and
presented. To do this, we will implement a state-of-the-art text
mining framework to extract protein interaction contextual information
from full-text articles, and to link and contrast it with data in other
(structured) resources, in order to support informed decisions for understanding
the complexity of interactions and identifying potential drug
targets.
To be relevant to the industrial partner (Pfizer R&D), focus
will be placed on pharmaceutically relevant protein interaction data
sets, for example those for pathogens such as HIV, hepatitis viruses and
malaria. The knowledge extracted will be characterised by quantitative
measures that may be indicative of its quality or relevance for a
specific interaction (bibliometrics such as number of citations and
mentions; peaks and changes over time; association with specific
entities such as experimental methods, model systems, drug
associations, outcomes, etc.). Importantly, the general framework
developed for placing biomedical 'facts' in context will be applicable
to other text mining domains.
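As a rough illustration of the kind of bibliometric measure mentioned above (mentions of an interaction and their peaks over time), the following minimal Python sketch counts, per publication year, the sentences that mention both partners of an interaction. Everything in it is an assumption made for illustration: the toy records, the placeholder protein names ProtA and ProtB, the function names, and the naive substring matching, which a real text mining framework would replace with proper entity recognition.

```python
from collections import Counter

# Hypothetical toy records: (publication year, sentence) pairs.
# ProtA and ProtB are placeholder names, not real proteins.
MENTIONS = [
    (2005, "ProtA interacts with ProtB in infected cells."),
    (2007, "We confirm that ProtA binds ProtB."),
    (2007, "ProtB forms a complex with ProtA."),
    (2008, "ProtC is unrelated to the above."),
]


def mentions_per_year(records, protein_a, protein_b):
    """Count sentences per year that mention both proteins.

    Uses case-insensitive substring matching, which is only a
    stand-in for real named-entity recognition.
    """
    counts = Counter()
    for year, sentence in records:
        text = sentence.lower()
        if protein_a.lower() in text and protein_b.lower() in text:
            counts[year] += 1
    return counts


def peak_year(counts):
    """Return the year with the most mentions (earliest on ties), or None."""
    if not counts:
        return None
    return max(sorted(counts), key=lambda year: counts[year])


counts = mentions_per_year(MENTIONS, "ProtA", "ProtB")
print(dict(counts), peak_year(counts))
```

The same per-year counts could be extended to track associations with experimental methods or model systems, which is where the project's notion of context goes beyond simple mention counting.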
Qualifications and experience
Applicants should ideally have experience in computational biology,
bioinformatics, computer science or a related subject area. Knowledge
of a programming language and text and/or data mining would be an advantage.
For details on how to apply, go to link.
If you require further details, please contact:
Dr. David Robertson, Faculty of Life Sciences, University of Manchester or
Dr. Goran Nenadic, School of Computer Science, University of Manchester.