Drosophila Genomics I
An opportunity for students to integrate different concepts of GENE and MAP, and to explore the different kinds of maps, the power of databases, and the strengths and weaknesses of some biological database searching tools.
Drosophila Genomics II
An introductory exploration of protein sequence conservation and its relationship to conserved protein structures using the Pax-6/Aniridia protein as an example.
Sequence Conservation and Information Theory
An introduction to information theoretic analysis of conserved nucleotide sequences. Introduction to information measurement, weight matrices, and relative entropy using simple and large data set examples.
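A minimal sketch of the information measurement mentioned above, assuming a uniform background (2 bits maximum per DNA position) and no small-sample correction. The function name `column_information` and the four toy sites are illustrative assumptions, not course data.

```python
import math
from collections import Counter

def column_information(sites):
    """Return the information content (bits) of each column of equal-length DNA sites."""
    length = len(sites[0])
    info = []
    for i in range(length):
        counts = Counter(site[i] for site in sites)
        total = sum(counts.values())
        # Shannon entropy of the observed base frequencies in this column
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        # 2 bits is the maximum for a 4-letter alphabet with a uniform background
        info.append(2.0 - entropy)
    return info

# Toy aligned sites, made up for illustration only
sites = ["TATAAT", "TATGAT", "TACAAT", "TATATT"]
info = column_information(sites)
```

Fully conserved columns score 2 bits; columns with mixed bases score less. Summing the per-column values gives one simple measure of the total information in the site.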
DNA, RNA & Protein as Information Molecules
An introduction to thinking of DNA, RNA, and proteins as information molecules: sequence polarity, complementarity, codons, and degeneracy. Introduction to molecular machines that follow rules/algorithms;
Open Reading Frame finder is used as an example;
Introduction to pseudocode as an effective way to describe algorithms;
Introduction to the notion of special consensus sequences for particular functions (the Kozak/Cavener sequence for the translation start is used as an example).
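The open reading frame example can be sketched as a rule-following scan. This is a simplified illustration, not the ORF Finder tool itself: it assumes an uppercase DNA string, ATG as the only start codon, and the three standard stop codons. The names `find_orfs` and `min_codons` are hypothetical.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def find_orfs(seq, min_codons=2):
    """Return (strand, start, end, orf) for ATG..stop ORFs in all six frames.

    Coordinates are 0-based on the scanned strand; min_codons counts the stop codon.
    """
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens an ORF
                elif start is not None and codon in STOPS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3, s[start:i + 3]))
                    start = None  # resume searching after the stop codon
    return orfs

print(find_orfs("CCATGAAATGACC"))
```

The scan reads each frame in steps of three, which is the polarity and codon structure described above made explicit.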
Sequence Alignment/Conservation: CLUSTAL W and BLASTP
Introduction to sequence alignment;
Alignment of protein sequences reveals conserved motifs (homeodomain example);
Exploring the relationship between protein sequence conservation and function;
Relating sequence conservation to conserved structure; examining 3D structure using the Cn3D program.
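As a way into sequence alignment, here is a sketch of pairwise global alignment by dynamic programming (Needleman-Wunsch). It is a stand-in for illustration: CLUSTAL W builds multiple alignments progressively and BLASTP uses a heuristic local search, and neither is reproduced here. The scoring values (match 1, mismatch -1, gap -2) are arbitrary assumptions; real protein alignments use substitution matrices.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for a global alignment of a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1]
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the table: each cell is the best of diagonal, up, and left moves
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))

print(needleman_wunsch("ACGT", "AGT"))
```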
RNA Secondary Structure Prediction with Mfold
An example of structure prediction from sequence:
Prediction of 2-D structure from 1-D sequence (tRNA example);
Prediction algorithm minimizing free energy;
Comparison with known structure (using RasMol program).
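A sketch of 2-D structure prediction from 1-D sequence. Note the substitution: Mfold minimizes free energy using thermodynamic parameters, whereas this example uses the simpler Nussinov recurrence, which only maximizes the number of base pairs. The allowed pairs (Watson-Crick plus G-U) and the minimum hairpin loop of 3 are assumptions chosen for illustration.

```python
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def nussinov_max_pairs(seq, min_loop=3):
    """Return the maximum number of nested base pairs in an RNA sequence."""
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] holds the maximum pair count for the subsequence seq[i..j]
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: position j is unpaired
            # case 2: j pairs with some k, leaving at least min_loop bases between
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]

print(nussinov_max_pairs("GGGAAAUCC"))
```

The same table-filling structure underlies energy-based methods; they replace the "+1 per pair" score with free-energy contributions for stacks and loops.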
EST Sequence Assembly
Sequence assembly from short sequence runs: assembling cDNA sequences used as example;
The sequence assembly is misled by sequence duplications within the genome (two paralogs);
Comparison with genomic sequences (using BLAST algorithm) used to resolve sequence assembly problem;
Strategies to identify sequence overlaps considered.
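One simple strategy for identifying overlaps, sketched under strong assumptions: error-free reads, all on the same strand, and exact suffix-prefix matches. The functions `overlap` and `greedy_merge` and the three toy reads are hypothetical, not the assembler used in the module.

```python
def overlap(a, b, min_len=3):
    """Return the length of the longest suffix of a that equals a prefix of b."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_merge(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap."""
    reads = list(reads)
    while len(reads) > 1:
        best = (0, None, None)
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        if k == 0:
            break  # no remaining overlaps; leave the contigs separate
        merged = reads[i] + reads[j][k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (i, j)] + [merged]
    return reads

# Toy reads, made up for illustration only
print(greedy_merge(["ATGGCGT", "GCGTACC", "TACCTGA"]))
```

A greedy merge like this joins any two reads that share a long enough overlap, which is why duplicated sequences in the genome can produce a chimeric assembly.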
Drosophila Splice Site Database
Introduction to the power of relational databases to analyze large-scale genomic data; splice sites used as example;
Introduction to information theory as an approach to analyzing consensus sequences;
Students work with a dataset of 10,057 splice sites;
Using database stored procedures, students compare splice site information near long introns and short introns;
Uses the IGS Drosophila splice site database.
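A sketch of the kind of relational query involved in comparing splice sites near long and short introns. Everything here is an assumption for illustration: the table `splice_site`, its columns, the five toy rows, and the 100-nucleotide cutoff are invented, and an in-memory SQLite query stands in for the stored procedures of the actual database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: donor-site sequence and the length of its intron
conn.execute("CREATE TABLE splice_site (donor_seq TEXT, intron_length INTEGER)")
conn.executemany(
    "INSERT INTO splice_site VALUES (?, ?)",
    [("GTAAGT", 62), ("GTGAGT", 70), ("GTAAGT", 58),
     ("GTAAGA", 1500), ("GTATGT", 2300)],  # toy rows, not real data
)

def base_counts(position, threshold=100):
    """Count bases at a 1-based donor-site position, split by intron size class."""
    query = """
        SELECT CASE WHEN intron_length < ? THEN 'short' ELSE 'long' END AS size_class,
               substr(donor_seq, ?, 1) AS base,
               COUNT(*) AS n
        FROM splice_site
        GROUP BY size_class, base
        ORDER BY size_class, base
    """
    return conn.execute(query, (threshold, position)).fetchall()

print(base_counts(3))
```

The per-class base counts returned by such a query are the input to an information calculation for each position, which allows the two intron classes to be compared.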
Microarray Clustering Analysis
Topics include data normalization, distance measures, and using clustering algorithms (k-means, hierarchical clustering, SOMs). Comparisons are made between microchip and slide data.
An introduction to genetic nets using object-oriented programming.
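For the normalization and distance measures listed under Microarray Clustering Analysis, a minimal sketch comparing Euclidean distance with a correlation-based distance. The expression profiles are made-up toy values, and `zscore` assumes each profile is non-constant (otherwise the standard deviation is zero).

```python
import math

def zscore(profile):
    """Normalize a profile to mean 0 and standard deviation 1 (population SD)."""
    mean = sum(profile) / len(profile)
    sd = math.sqrt(sum((x - mean) ** 2 for x in profile) / len(profile))
    return [(x - mean) / sd for x in profile]

def euclidean(p, q):
    """Euclidean distance: sensitive to the magnitude of expression."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def pearson_distance(p, q):
    """1 - Pearson correlation: sensitive only to the shape of the profile."""
    zp, zq = zscore(p), zscore(q)
    r = sum(a * b for a, b in zip(zp, zq)) / len(p)
    return 1.0 - r

# Toy profiles: gene_b has the same shape as gene_a at ten times the level
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [10.0, 20.0, 30.0, 40.0]
gene_c = [4.0, 3.0, 2.0, 1.0]
print(euclidean(gene_a, gene_b), pearson_distance(gene_a, gene_b))
print(euclidean(gene_a, gene_c), pearson_distance(gene_a, gene_c))
```

The choice of distance changes which genes a clustering algorithm groups together: Euclidean distance separates `gene_a` from `gene_b`, while the correlation distance treats them as nearly identical.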