CR Y-complex modelling

To model the CR Y-complex, this tutorial uses Assembline to calculate fit libraries and then run a global optimization. The directory scnpc_tutorial/CR_Y_complex contains source code files, input files and precalculated modelling results for the CR Y-complex:

- parameters file & configuration file for global optimization of wt CR Y-complex
- sequence fasta file, input_PDB, EM (input data)
- CR_Y_complex_final_model.pdb, out (output modelling results)
- systematic_fits (directory):

    - parameters file for systematic fitting
    - em_maps, PDB (input data)
    - result_fits_chimera (output fitting results)
  1. First, activate your virtual environment and enter the CR_Y_complex/systematic_fits directory:

    source activate Assembline
    
    cd CR_Y_complex/systematic_fits/
    

    or depending on your computer setup:

    conda activate Assembline
    
    cd CR_Y_complex/systematic_fits/
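
After activating the environment, it can help to confirm that the command-line tools used in the later steps are actually on your PATH. The following is a minimal sketch: `check_tools` is a hypothetical helper, not part of Assembline, and the tool names are simply those that appear in this tutorial.

```shell
# Hypothetical helper (not part of Assembline): report tools missing from PATH.
# Returns 0 if every tool is found, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "not found on PATH: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Tool names taken from the steps of this tutorial
check_tools fit.py genpval.py genPDBs_many.py assembline.py \
    extract_scores.py rebuild_atomic.py setup_analysis.py imp_sampcon \
    || echo "Activate the Assembline environment before continuing" >&2
```

If any name is reported as missing, the environment is probably not active or the installation is incomplete.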
    
  2. Generate the fit libraries for the CR Y-complex rigid bodies (results are written to the directory systematic_fits/result_fits_chimera/CR_with_Y_no_env_relative.mrc/):

    fit.py systematic_fitting_parameters.py
    
  3. The fit libraries have been precalculated and analysed in the directory systematic_fits/result_fits_chimera/CR_with_Y_no_env_relative.mrc/. To analyse the fit results yourself, run the following from the systematic_fits/ directory:

    genpval.py result_fits_chimera
    
  4. To generate the top five fits for each input rigid body (i.e. the top five fits from each fit library), run the following:

    # run this only after fit.py and genpval.py have completed successfully
    cd result_fits_chimera/search100000_metric_cam_rad_600_inside0.3_res_40
    
    genPDBs_many.py -n5 top5 */*/solutions.csv
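
The glob `*/*/solutions.csv` in the command above only matches fit libraries that were written two directory levels below the current directory. As a quick sanity check before running genPDBs_many.py, the sketch below counts those files. `count_fit_libraries` is a hypothetical helper, and it assumes a `find` that supports `-mindepth` and `-maxdepth` (GNU and BSD versions do).

```shell
# Hypothetical helper: count solutions.csv files that the glob
# */*/solutions.csv would match when run from inside the given directory.
count_fit_libraries() {
    find "$1" -mindepth 3 -maxdepth 3 -type f -name solutions.csv \
        | wc -l | tr -d ' '
}

# Run from the directory in which genPDBs_many.py will be called
echo "fit libraries found: $(count_fit_libraries .)"
```

A count of zero suggests that fit.py has not finished or that you are in the wrong directory.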
    
  5. After calculating the fit libraries (or if you are using the precalculated results), return to the main project directory (i.e. CR_Y_complex) and run the global optimization:

    cd CR_Y_complex
    
    # this submits 20000 global optimization runs to a Slurm cluster (for options, parameters or local runs, see the Assembline manual)
    assembline.py --traj --models -o out --multi --start_idx 0 --njobs 20000 config.json params.py
    

    Note

    An output directory CR_Y_complex/out already exists. If you want to run the modelling yourself, rename the existing out/ directory first, because the run above will overwrite it.
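
One way to follow the note above is to move the precalculated out/ directory to a timestamped name before starting your own run. This is a minimal sketch: `backup_dir` and the `_backup_<timestamp>` naming are hypothetical choices, not an Assembline convention.

```shell
# Hypothetical helper: rename an existing directory to <name>_backup_<timestamp>.
# Prints the new name; does nothing if the directory does not exist.
backup_dir() {
    dir="${1%/}"
    [ -d "$dir" ] || return 0
    backup="${dir}_backup_$(date +%Y%m%d_%H%M%S)"
    if [ -e "$backup" ]; then
        echo "backup target already exists: $backup" >&2
        return 1
    fi
    mv "$dir" "$backup" && echo "$backup"
}

# Usage, from inside CR_Y_complex/ and before running assembline.py:
#   backup_dir out
```

The precalculated results then remain available for comparison with your own run.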

  6. Enter the out/ directory, generate the output score lists and rebuild the atomic structures of the models:

    cd out
    
    extract_scores.py # this should create several files, including all_scores_sorted_uniq.csv
    
    rebuild_atomic.py --top 10 --project_dir <full path to the original project directory CR_Y_complex> config.json all_scores_sorted_uniq.csv
    
  7. While in the out/ directory, run the following command to prepare your output models for analysis with the imp-sampcon tool from IMP:

    setup_analysis.py -s all_scores.csv -o analysis -d density.txt
    

    Note

    The density.txt file is provided in CR_Y_complex/out. To generate it yourself, see the Assembline analysis section.

  8. Run the imp-sampcon exhaust tool (a command-line tool provided with IMP) to perform the sampling analysis:

    cd analysis
    
    imp_sampcon exhaust -n <prefix for output files> \
    --rmfA sample_A/sample_A_models.rmf3 \
    --rmfB sample_B/sample_B_models.rmf3 \
    --scoreA scoresA.txt --scoreB scoresB.txt \
    -d ../density.txt \
    -m cpu_omp \
    -c <int for cores to process> \
    -gp \
    -g <float with clustering threshold step>
    

    Note

    For further descriptions of the imp_sampcon settings, see Sampling exhaustiveness and precision with Assembline.
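
Before launching the imp_sampcon command in step 8, you may want to confirm that the input files it refers to are present. The sketch below checks the paths exactly as they appear in that command, assuming it is run from inside the analysis/ directory. `check_sampcon_inputs` is a hypothetical helper, not part of IMP or Assembline.

```shell
# Hypothetical helper: verify the input files named in the imp_sampcon
# command of step 8. Paths are relative to the analysis/ directory.
check_sampcon_inputs() {
    status=0
    for f in sample_A/sample_A_models.rmf3 sample_B/sample_B_models.rmf3 \
             scoresA.txt scoresB.txt ../density.txt; do
        if [ ! -f "$f" ]; then
            echo "missing input: $f" >&2
            status=1
        fi
    done
    return "$status"
}

# Usage, from inside out/analysis/:
#   check_sampcon_inputs && echo "all inputs present"
```

A missing file usually means that setup_analysis.py in step 7 did not complete, or that density.txt is not in the out/ directory.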