Genome Evolution Course 2011
www.yanaiweb.com/genome
Itai Yanai, Technion – Israel Institute of Technology
Michal Levin, Teaching assistant
Problem Set #4, assigned May 9th, 2011
To be submitted by email as a PDF or Word document, in English or Hebrew,
by midnight May 23rd, 2011 to Michal Levin (mlevin@tx.technion.ac.il).
Evolution by Domains. Solve the following five problems.
Problem 1: Evolution of structures. Offer two competing hypotheses that
explain why two superfamilies may share the same
fold.
Problem 2: Domain architecture network. Identify the network for the
protein domains S_TKc, SH3, PDZ, PH, and GED. Each domain is a node, and
each edge represents one or more instances where the two domains are found
in the same protein (without regard to the organism in which it is
present). Use the SMART database (http://smart.embl-heidelberg.de/). Go to
“Genomic mode”. Enter your queries on the right side of the site, in the
section labeled ‘Architecture Analysis’. For example, enter the query
‘SH3 AND S_TKc’ and then click “Architecture query”; to display the domain
architectures graphically, click ‘ALL (52)’. The network should be
submitted as a network figure, as shown in the lecture. For each edge, note
one example protein and the organism from which it comes (for example
“AGAP001683-PA - Anopheles gambiae”).
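The node and edge definition above can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not a substitute for the SMART queries: the function name `cooccurrence_edges` and the protein names and architectures in `example` are made-up placeholders, not real SMART results.

```python
from itertools import combinations

# Domains of interest, taken from the problem statement.
DOMAINS = {"S_TKc", "SH3", "PDZ", "PH", "GED"}

def cooccurrence_edges(architectures):
    """Map each pair of co-occurring domains to the proteins containing both."""
    edges = {}
    for protein, domains in architectures.items():
        # Keep only the domains of interest; repeated copies count once.
        present = sorted(set(domains) & DOMAINS)
        for a, b in combinations(present, 2):
            edges.setdefault((a, b), []).append(protein)
    return edges

# Placeholder architectures (hypothetical, NOT real SMART results).
example = {
    "protein_1": ["SH3", "S_TKc"],
    "protein_2": ["PH", "SH3", "SH3"],
    "protein_3": ["PDZ"],
}

edges = cooccurrence_edges(example)
for (a, b), proteins in sorted(edges.items()):
    # One example protein per edge, as the problem asks.
    print(f"{a} -- {b}: e.g. {proteins[0]}")
```

An edge exists as soon as a single protein carries both domains, so the list stored for each edge doubles as the pool of example proteins to cite.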
Problem 3: Protein similarity networks. In the network below the nodes
correspond to protein sequences and edges correspond to sequence similarity
between a pair of sequences. Assume that whenever two proteins have a
domain in common, this can be detected by sequence similarity. Given that
there are 5 domains altogether among the sequences
(domain = recurrent sequence), what is the distribution of domains among the
proteins? To answer this question, for each domain (domain A, domain B, domain
C, domain D, and domain E) state in which proteins (1-8) it occurs.
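One way to check a proposed answer is to compute the similarity edges that a domain assignment implies and compare them with the edges in the network. The sketch below assumes that every pair of proteins sharing a domain is linked, and that no other edges exist. The function name `implied_edges` and the toy data are illustrative assumptions and are unrelated to the network in this problem.

```python
from itertools import combinations

def implied_edges(domain_to_proteins):
    """Return the similarity edges implied by a domain assignment."""
    edges = set()
    for proteins in domain_to_proteins.values():
        # Every pair of proteins sharing this domain is linked by an edge.
        for p, q in combinations(sorted(proteins), 2):
            edges.add((p, q))
    return edges

# Toy example (hypothetical): two domains distributed over four proteins.
candidate = {"A": {1, 2, 3}, "B": {3, 4}}
observed = {(1, 2), (1, 3), (2, 3), (3, 4)}

# A candidate assignment is consistent if it reproduces exactly the observed edges.
consistent = implied_edges(candidate) == observed
print(consistent)
```

Under this assumption each domain corresponds to a fully connected group of proteins, so a protein that belongs to several such groups must carry several domains.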

Problem 4: Domain Age. Get to know the InterPro
database. Using the phylogenetic distribution associated with each InterPro domain (in the Taxonomy
section), find an example of each of the following
distributions:
For example, RanBP1
is a eukaryote-specific domain.
Problem 5: Domain Rearrangements. I have prepared
a list of human-fly-worm orthologs with links
to the InterPro database. Each line represents a set
of human, fly and worm orthologous genes. Describe two instances
where domain architecture has evolved in any of the three lineages for a given
orthologous group. For example: Human gene Q9Y3E5 and Drosophila
gene O97067 have a “Peptidyl-tRNA hydrolase II”
domain, while the C. elegans gene O76387
also has a “Ubiquitin-associated/translation elongation factor
EF1B, N-terminal” domain.
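The comparison in this example can be written as a small set operation. The sketch below uses only the three genes and two domain names quoted above. The function name `lineage_specific_domains` is an illustrative assumption, and treating an architecture as an unordered set is a simplification that ignores domain order and copy number.

```python
def lineage_specific_domains(architectures):
    """For each gene, list the domains not shared by every member of the group."""
    shared = set.intersection(*(set(d) for d in architectures.values()))
    return {gene: sorted(set(d) - shared) for gene, d in architectures.items()}

# The orthologous group from the example in the problem statement.
group = {
    "Q9Y3E5 (human)": ["Peptidyl-tRNA hydrolase II"],
    "O97067 (fly)": ["Peptidyl-tRNA hydrolase II"],
    "O76387 (worm)": [
        "Peptidyl-tRNA hydrolase II",
        "Ubiquitin-associated/translation elongation factor EF1B, N-terminal",
    ],
}

diff = lineage_specific_domains(group)
for gene, extra in diff.items():
    print(gene, "->", extra if extra else "no lineage-specific domains")
```

A non-empty list flags an orthologous group worth inspecting in InterPro. Whether the difference reflects a domain gain in one lineage or a loss in the others still has to be argued from the phylogeny.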