MolGen Class “A Practical Course in Programming for Biologists”
Philip M. Kim and Gary D. Bader

This course is designed to teach experimental biologists the basics of bioinformatics programming and to give them hands-on experience with it. Today, most graduate students in Molecular Genetics will encounter situations where they have to use computational tools and deal with large amounts of data. The main objective of this class is to give students the power of automation via bioinformatics programming. The class teaches by example and aims to make students comfortable with basic programming in Perl, adapting existing programs to their needs, and interfacing with standard bioinformatics software such as BLAST or ClustalW. High-level views of BioPerl and R are covered as well.

The class is a standard MolGen module covering six weeks. Each week there will be a total of 2.5 hours of class, split into two 1.25-hour sessions. Roughly one homework assignment will be handed out each week. Completing the assignments is the best way to learn programming; they are an integral component of the class, and a large portion of the grade will be based on them. We emphasize that this will be a relatively time-consuming class, but the effort will be worth it.

Each 45 min lecture is preceded by 30 minutes of recitation section, during
which the solution to the homework assignment will be discussed. Both the
lecture and the recitation section are meant to be highly interactive, which
is why the class initially is limited to 15 students.

There will be in-class labs and exercises, so each student is expected to bring a laptop computer. Users of Mac OS X or Linux computers should ensure that a terminal application (e.g., "Terminal" for OS X or "xterm" for Linux) is installed. Users of Windows computers are asked to install the Cygwin environment. All students should make sure that Perl is installed on their computers (typing "perl -h" in a terminal window prints a usage summary if it is) and should install a text editor application. Recommended choices are Komodo Edit or GNU Emacs.

We recommend the textbook "Learning Perl, 5th Edition". However, a number of very good (and free) online resources also exist, such as perldoc.perl.org and Beginning Perl. A number of handy reference tables also exist, such as this little cheatsheet.

Final Project:
Date and Time:
Instructor contact info:
TA contact info: TBA
General course email (email your homework assignments here): progclass@kimlab.org

Syllabus:

Week 1: Goal: getting comfortable with basic tools
Lecture 1 - Introduction to programming. Why programming? Typical problems.
Data to be used: class1 data
Lecture 2 - Perl Statements, Basic Syntax and Variables
Class 2 notes and sample code: class2
In-class exercises: inclass2

Week 2: Basic Perl
Lecture 3 - Arrays and lists - File input/output (I/O)
Class 3 notes and sample code: class3
Assignment: Lab2
Solution: Solution lab2
Lecture 4 - More on arrays and lists

Week 3: Writing more complex scripts
Lecture 5 - More on loops and flow control. Common programming patterns
Class 5 notes and sample code: class5
In-class exercises: inclass5
Solution: Solution Lab 3
Lecture 6 - More on Hashes, more on string manipulation - Writing more complex programs: Devising subroutines for common tasks
Class 6 notes and sample code: class6
In-class exercises: inclass6

Week 4: Parsing and advanced regular expressions
Lecture 7 - Regular expressions, REALLY manipulating strings
Please see this link for an introduction to regular expressions: http://perldoc.perl.org/perlretut.html
Class 7 sample code: Class7sampleCode Inclass-solutions LectureSlides
Assignment: Lab4
Solution: Solution 4
Lecture 8 - Continuing regular expressions - Interfacing with external programs, recipes, reading directories
Class 8 sample code: Class8sampleCode Inclass-solutions LectureSlides

Week 5: Intro to the R programming language (statistical programming language)
Lecture 9 - Intro to R, variables and constructs in R
Class 9 notes and sample code: class9
In-class exercises: inclass9 LectureSlides9
Assignment: Lab5
Solution: Solution 5
Additional projects can be found here.
Lecture 10 - Basic visualization in R
Samples of fancy R plots here
Class 10 notes and sample code: class10
In-class exercises: inclass10 LectureSlides10

Week 6: The R programming language (continued)
Lecture 11 - Intro to Bioconductor in R - Installing and using Bioconductor modules in R - Basic Bioconductor usage, working with expression data
Note: Make sure you install Bioconductor (HOWTO: Install Bioconductor) and also the affy package, by running the following commands.
source("http://www.bioconductor.org/biocLite.R")
Other packages may need to be installed in class this week over the internet, so ensure your computer is connected when in class.
Class 11 sample code and data solution code Slides
Assignment: Lab6
Lecture 12 - Bioconductor in R (continued) - Continuing with Bioconductor - working with sequence data
Class 12 sample code and data additionalSampleCode Slides solution-code
