Q1) GREP FUNCTION
The program grep (on Unix) or find (on Windows) is useful for writing out the lines in a file that match a specified text pattern.
Write a script that takes a text pattern and a collection of file names as command-line arguments,
grep.py pattern file1 file2 file3
and writes out the lines in the listed files that match the pattern. The program should also print the line number of each match. As an example, running something like:
grep.py "iter=" case*.res
with the input files case1.res, case2.res and case3.res, may result in the output
case1.res    4: iter=12 eps=1.2956E-06
case2.res   76: iter=9 eps=7.1111E-04
case2.res 1435: iter=4 eps= 9.2886E-04
That is, each line that matches the pattern is printed with a prefix containing the filename and the line number (nicely aligned in columns, as shown).
Hint 1): You can use sys.argv to process the input file list.
Hint 2): Use the grep command in Ubuntu to see the required format of the output. Your grep.py script should imitate
grep "iter=" case*.res
in the Linux/Unix terminal.
Hint 3): Your script should consist of a main body and two functions: grep and usage.
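One possible layout for grep.py, following Hint 3, is sketched below. It is a starting point and not the required solution. It assumes that the pattern may be treated as a regular expression (as the Unix grep does) and that the shell has already expanded case*.res into a list of file names. The column widths are simply taken from the longest file name and the largest line number among the matches.

```python
import re
import sys


def usage():
    """Print a short reminder of how to call the script."""
    print("Usage: grep.py pattern file1 [file2 ...]", file=sys.stderr)


def grep(pattern, filenames):
    """Return (filename, line_number, line) for every line matching pattern."""
    regex = re.compile(pattern)  # assumption: pattern is a regular expression
    matches = []
    for filename in filenames:
        with open(filename) as infile:
            for line_number, line in enumerate(infile, start=1):
                if regex.search(line):
                    matches.append((filename, line_number, line.rstrip("\n")))
    return matches


# Main body: sys.argv[1] is the pattern, sys.argv[2:] is the file list.
if __name__ == "__main__":
    if len(sys.argv) < 3:
        usage()
    else:
        found = grep(sys.argv[1], sys.argv[2:])
        if found:
            # Column widths chosen so file names and line numbers line up.
            name_width = max(len(name) for name, _, _ in found)
            number_width = max(len(str(number)) for _, number, _ in found)
            for name, number, line in found:
                print(f"{name:<{name_width}} {number:>{number_width}}: {line}")
```

Keeping grep free of print statements makes it easy to check the matches separately from the column formatting done in the main body.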
Q2) RADIUS OF GYRATION and MORE…
You will write a program to calculate the radius of gyration of a given protein. Go to www.rcsb.org and choose 4 proteins. A PDB code starts with a digit, followed by three letters or digits. Send me your PDB codes by email.
The residue count (number of amino acids) should be between 50 and 300. You can see the residue count on the left side of each protein's page. Check it before choosing your protein. Download the .pdb file of each of your proteins (from the right side of the screen).
The program will work like this:
Read a user provided list of PDB files into your program. You will use the system module for
From the linux shell:
python protein_calc.py protein1.pdb protein2.pdb protein3.pdb protein4.pdb
The input PDB files should be downloaded into your local directory and then parsed by your program. You will open each .pdb file as a text file, using the file reading methods we discussed in class.
Calculate the rCOM (center of mass) of the protein (formula below).
Calculate Rg (radius of gyration) of the protein using the coordinates of the alpha carbons (see below for more information).
Output the radius of gyration as shown in the table. It should print the name of the .pdb file, Rg and sequence length. PLEASE SEE THE RADIUS OF GYRATION PART
Using matplotlib module, plot as a function of Save the graph as RG_plot.png
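The steps above could be sketched roughly as follows. Several things here are assumptions and not part of the assignment text: the function names are made up for illustration, every alpha carbon is given the same mass (so the center of mass is just the mean position and Rg = sqrt((1/N) * sum |r_i - rCOM|^2)), the sequence length is taken to be the number of alpha carbons found, and the plot is assumed to be Rg against sequence length. The column positions used for slicing follow the fixed-width ATOM record layout of the PDB format.

```python
import math
import sys


def read_alpha_carbons(filename):
    """Return a list of (x, y, z) tuples, one per alpha carbon (CA) ATOM record."""
    coords = []
    with open(filename) as pdb_file:
        for line in pdb_file:
            if line.startswith("ENDMDL"):
                break  # keep only the first model of multi-model (NMR) files
            if (line.startswith("ATOM")
                    and line[12:16].strip() == "CA"
                    and line[16] in " A"):  # skip alternate locations other than A
                coords.append((float(line[30:38]),
                               float(line[38:46]),
                               float(line[46:54])))
    return coords


def center_of_mass(coords):
    """Equal-mass center of mass: the mean of the coordinates."""
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))


def radius_of_gyration(coords):
    """Root-mean-square distance of the atoms from their center of mass."""
    com = center_of_mass(coords)
    squared = [sum((c[i] - com[i]) ** 2 for i in range(3)) for c in coords]
    return math.sqrt(sum(squared) / len(coords))


def plot_rg(lengths, rg_values, output="RG_plot.png"):
    """Scatter plot of Rg against sequence length (assumed axes), saved to a file."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file without needing a display
    import matplotlib.pyplot as plt
    plt.figure()
    plt.plot(lengths, rg_values, "o")
    plt.xlabel("Sequence length (residues)")
    plt.ylabel("Rg (Å)")
    plt.savefig(output)


# Main body: python protein_calc.py protein1.pdb protein2.pdb ...
if __name__ == "__main__" and len(sys.argv) > 1:
    lengths, rg_values = [], []
    print(f"{'File':<20} {'Rg (A)':>10} {'Length':>8}")
    for filename in sys.argv[1:]:
        coords = read_alpha_carbons(filename)
        if not coords:
            print(f"{filename:<20} no CA atoms found")
            continue
        rg = radius_of_gyration(coords)
        print(f"{filename:<20} {rg:>10.2f} {len(coords):>8}")
        lengths.append(len(coords))
        rg_values.append(rg)
    if rg_values:
        plot_rg(lengths, rg_values)
```

If the formula given for rCOM uses real atomic masses, the mean in center_of_mass would need to become a mass-weighted mean. With alpha carbons only, both give the same result.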
Inside protein_calc.py write a function that analyzes a PDB file (filename provided on the command-line!) to find potential cross-linked leucine
The rigid cross-linker is approximately 11 Å
Your function should first compute the distance between all pairs of leucine residues.
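A minimal sketch of such a function is given below. The text above does not say which atoms the distance should be measured between, so this example assumes the alpha carbon (CA) of each leucine, and it treats the 11 Å cross-linker length as an upper limit on the distance. The names read_atoms and leucine_pairs are hypothetical.

```python
import itertools
import math
import sys


def read_atoms(filename, res_name, atom_name):
    """Return (chain, residue_number, (x, y, z)) for matching ATOM records."""
    atoms = []
    with open(filename) as pdb_file:
        for line in pdb_file:
            if line.startswith("ENDMDL"):
                break  # first model only
            if (line.startswith("ATOM")
                    and line[17:20] == res_name
                    and line[12:16].strip() == atom_name
                    and line[16] in " A"):
                xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
                atoms.append((line[21], int(line[22:26]), xyz))
    return atoms


def leucine_pairs(filename, linker_length=11.0):
    """Return (residue_a, residue_b, distance) for LEU pairs within linker_length.

    Assumption: distances are measured between alpha carbons.
    """
    leucines = read_atoms(filename, "LEU", "CA")
    pairs = []
    # combinations() visits every unordered pair exactly once.
    for a, b in itertools.combinations(leucines, 2):
        distance = math.sqrt(sum((p - q) ** 2 for p, q in zip(a[2], b[2])))
        if distance <= linker_length:
            pairs.append((a[:2], b[:2], distance))
    return pairs


if __name__ == "__main__" and len(sys.argv) > 1:
    for res_a, res_b, distance in leucine_pairs(sys.argv[1]):
        print(f"LEU {res_a[0]}{res_a[1]:<5} LEU {res_b[0]}{res_b[1]:<5} {distance:6.2f}")
```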
Just as in 7), write a second function that analyzes a PDB file to find potential disulfide bridges. Print possible disulfide bridge pairs and distance between
1) The distance between two sulfur atoms of CYS residues should be in the range of 2.5 Å.
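The disulfide search could be sketched along the same lines. This example assumes that the sulfur atom of a cysteine is the one named SG in the PDB file, and it reads the 2.5 Å criterion as an upper limit on the S-S distance. The function names are hypothetical.

```python
import itertools
import math
import sys


def read_cys_sulfurs(filename):
    """Return (chain, residue_number, (x, y, z)) for each SG atom of a CYS residue."""
    sulfurs = []
    with open(filename) as pdb_file:
        for line in pdb_file:
            if line.startswith("ENDMDL"):
                break  # first model only
            if (line.startswith("ATOM")
                    and line[17:20] == "CYS"
                    and line[12:16].strip() == "SG"
                    and line[16] in " A"):
                xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
                sulfurs.append((line[21], int(line[22:26]), xyz))
    return sulfurs


def find_disulfide_bridges(filename, cutoff=2.5):
    """Return (residue_a, residue_b, distance) for CYS pairs with S-S distance <= cutoff."""
    sulfurs = read_cys_sulfurs(filename)
    bridges = []
    for a, b in itertools.combinations(sulfurs, 2):
        distance = math.sqrt(sum((p - q) ** 2 for p, q in zip(a[2], b[2])))
        if distance <= cutoff:  # assumption: 2.5 Å is an upper limit
            bridges.append((a[:2], b[:2], distance))
    return bridges


if __name__ == "__main__" and len(sys.argv) > 1:
    for res_a, res_b, distance in find_disulfide_bridges(sys.argv[1]):
        print(f"CYS {res_a[0]}{res_a[1]:<5} CYS {res_b[0]}{res_b[1]:<5} {distance:6.2f}")
```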