[Debian Wiki] Update of "BOINC/Server/Projects/AutoDock" by PaulWise

Debian Wiki debian-www at lists.debian.org
Sun Jun 14 06:56:16 UTC 2015


Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Debian Wiki" for change notification.

The "BOINC/Server/Projects/AutoDock" page has been changed by PaulWise:
https://wiki.debian.org/BOINC/Server/Projects/AutoDock?action=diff&rev1=71&rev2=72

+ #language en
+ #pragma section-numbers 2
+ #pragma keywords BOINC, AutoDock, Vina, Debian, virtual screening, molecular docking
+ #pragma description Automated setup of BOINC projects for AutoDock Vina.
  
+ = BOINC Project Setup for Virtual Drug Screening =
+ 
+ <<TableOfContents(3)>>
+ 
+ This page introduces the use of BOINC to orchestrate tasks for docking small chemical compounds to a protein. Commonly, a flexible ligand is fitted to a rigid structure, or to a set of structures that e.g. capture the protein at various moments of a molecular dynamics simulation. Several projects have run docking on BOINC before, foremost the World Community Grid's [[http://fightaidsathome.scripps.edu/|FightAids@Home]] and others of the [[http://www.worldcommunitygrid.org|WCG]] realm, but there is also [[http://docking.cis.udel.edu|Docking@Home]], and the [[http://www.rosettacommons.org|Rosetta]] team could prepare a docking experiment at any time.
+ 
+ Our ambition is to bring all components for high-throughput virtual drug screening directly into Debian and thus onto everyone's desktops and servers. The authors of this page run their own in-house BOINC-based AutoDock project, with all components available as Debian packages. Development is still ongoing, and this documentation in particular is work in progress. To join in, please contact us. Our emphasis is on the tools of Scripps' Molecular Graphics Lab in San Diego, namely AutoDock, Vina and Raccoon. Volunteers with additional ideas are welcome.
+ 
+ == Conceptual Overview ==
+ 
+ The project is centered around Debian as the sole source of all software tools required for the project and for an automated retrieval of data. The BOINC client has shipped with Debian proper for a long time. We decided to leave the BOINC server in the experimental section of Debian, so we can update this publicly exposed software without interfering with otherwise stable releases.
+ 
+ Unlike the SETI and Milkyway client applications, for which Debian provides packages, this docking project has no dedicated package. The binary, i.e. AutoDock Vina in this first developmental stage, is wrapped, and the wrapper and its piggyback vina application are both sent once to every participating client. Ideas for optimisations should be sent to the (very responsive) upstream authors at Scripps.
+ 
+ {{attachment:BOINC_Server_AutoDock_Overview.png}}
+ 
+ In the following, we describe the yellow arrow of the figure above, i.e. how to get from the boinc-server-autodock Debian package, with the help of what ships with boinc-server-maker, to a web site that invites users to contribute and to a repository of ligand evaluations that can be interpreted by Raccoon. All scripts described below, unless linked directly, are available from the [[http://anonscm.debian.org/gitweb/?p=pkg-boinc/boinc-server-autodock.git;a=summary|git repository]].
+ 
+ == Preparation of BOINC side ==
+ 
+ The AutoDock BOINC project is first of all a regular BOINC project. All the tools one knows for setting up BOINC projects work exactly the same. We prepared a script that sets up the AutoDock project at a predefined location with no human intervention. As convenient as that is, please take the extra time to follow the [[BOINC/ServerGuide]] mentally. It conveys a bit, not too much, of extra understanding of how BOINC works internally, which we expect will help you to help us improve the workflow. The current implementation works solely with AutoDock Vina. Support for the classical AutoDock 4.x was once described on [[BOINC/ServerGuide/AutoDockApp]] and has yet to be updated and incorporated into our scripts.
+ 
+ The executable scripts all reside in /usr/share/boinc-server-autodock/bin. All BOINC-specific preparation is completed with the script "[[http://anonscm.debian.org/gitweb/?p=pkg-boinc/boinc-server-autodock.git;a=blob;f=bin/install.sh|install.sh]]" in that directory. Caveat: this script first cleans the database, so do not run it inadvertently. It then invokes separate scripts to perform steps 2.2 and 2.3 as described below.
+ 
+ === Technical preparation ===
+ 
+ Below, we propose to keep ligand files in directories. There are plenty of those ligand files, possibly millions. Classical file systems like 'ext3' will not handle those easily. Which alternative to choose is a long debate in the community. The file systems 'xfs', 'jfs' and 'ext4' should perform well. A problem remains with tools like 'ls' that attempt to read through the whole file list. We suggest reserving half a terabyte for the project, which allows keeping several folders with ligands for input and results alike.
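
One way to keep individual directories small is to spread the ligand files over subfolders. The following is a minimal sketch and not part of the boinc-server-autodock package: the function names `shard_of`, `shard_ligands` and `count_ligands`, and the choice of the last two characters of the ligand name as the subfolder, are assumptions made for illustration only.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the boinc-server-autodock package.

# Print the bucket (last two characters of the ligand name) for a .pdbqt file.
shard_of() {
    name=$(basename "$1" .pdbqt)
    printf '%s\n' "${name#"${name%??}"}"
}

# Move every *.pdbqt file directly under directory $1 into $1/<bucket>/.
shard_ligands() {
    find "$1" -maxdepth 1 -type f -name '*.pdbqt' | while IFS= read -r f; do
        bucket=$(shard_of "$f")
        [ -n "$bucket" ] || continue     # names shorter than two characters stay put
        mkdir -p "$1/$bucket"
        mv "$f" "$1/$bucket/"
    done
}

# Count ligand files below $1 without sorting the listing as 'ls' would.
count_ligands() {
    find "$1" -type f -name '*.pdbqt' | wc -l
}
```

With ZINC identifiers this yields up to 100 subfolders per ligand directory. `find` streams the entries instead of sorting them first, which is why it is used here in place of `ls`.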
+ 
+ === Docking-specific configuration ===
+ 
+ The script [[http://anonscm.debian.org/gitweb/?p=pkg-boinc/boinc-server-autodock.git;a=blob;f=bin/install.sh|install.sh]] expects a set of environment variables to be defined. By default, all values are retrieved from the script [[http://anonscm.debian.org/gitweb/?p=pkg-boinc/boinc-server-autodock.git;a=blob;f=share/autodockvina_set_config.sh|autodockvina_set_config.sh]].
+ 
+ === Install BOINC web server ===
+ 
+ The tool [[http://anonscm.debian.org/gitweb/?p=pkg-boinc/boinc-server-autodock.git;a=blob;f=bin/autodockvina_install_project.sh|autodockvina_install_project.sh]] is invoked by install.sh to complete the server setup:
+  * create BOINC project with BOINC's ''make_project'' script
+  * setup of crontab - ''TODO: clarify what user, description of crontab''
+  * finalise the main project's web pages
+  * create forums
+  * set permissions for upload directories
+  
+ At this point, users can subscribe to the project and wait for work units. The install scripts do it all for you, but please read through the README file generated for your project, which summarises what has been done.
+ 
+ === Prepare AutoDock Vina binaries and inform BOINC about them ===
+ 
+ The install.sh script invokes [[http://anonscm.debian.org/gitweb/?p=pkg-boinc/boinc-server-autodock.git;a=blob;f=bin/autodockvina_install_apps.sh|autodockvina_install_apps.sh]] to continue with
+  * retrieval of wrappers for Linux (from Debian) and Windows (readily made available from the BOINC download page)
+  * retrieval of the application for Linux (from Debian) and Windows (from Scripps - verify that it is not the self-compiled one)
+  * registration of the binaries with BOINC
+  * installation of templates for workunits
+  * preparation of configuration files for the assimilator (see 4.1) and the validator
+ 
+ The project is functional now, except for the missing workunits.
+ 
+ === Comments ===
+ 
+ In a perfect world there would be no need for wrappers. Instead, we would patch the AutoDock application to use the BOINC file descriptors directly. It should also report its progress in a file for BOINC to display to the user. For single ligands, though, AutoDock Vina is so quick that this does not seem to be needed for a good user experience. When multiple ligands are executed within the same work unit, the progress indication has the granularity of the percentage of ligands evaluated.
+ 
+ 
+ == Preparation of Docking side ==
+ 
+ It may be convenient to keep the AutoDock Vina configuration files and the receptor and ligand models on the same machine that runs the BOINC server, to avoid large data transmissions over the network. PDBQT files for the receptors and ligands of interest may be prepared with utilities from AutoDockTools. A number of prepared ligand sets can be downloaded from http://zinc.docking.org/pdbqt/. To automate the retrieval, install the [[wiki.debian.org/getData|getData]] Debian package.
+ 
+ The one-time configuration of the receptor for the docking is performed in the same way as it is for every docking project with AutoDock and supported by the [[http://mgltools.scripps.edu|MGLTools]]. Debian provides a package for the CADD tool "Raccoon" (mgltools-cadd), which also features an interface to perform this preparation.
+ 
+ === Make a database of receptor models for screening ===
+ 
+ For the bash commands below we will suppose that receptor files, in PDBQT format, are kept in ''/home/boincadm/my_autodock_vina_library/receptors''. 
+ Example:{{{
+ $ cd /home/boincadm/my_autodock_vina_library/receptors
+ $ ls
+ pdb_human_orig.pdbqt
+ pdb_human_energyminimised.pdbqt
+ pdb_mouse_threaded.pdbqt
+ }}}
+ === Make a database of ligand models for screening ===
+ 
+ For the bash commands below we will suppose that ligand files, also in PDBQT format, are kept in ''/home/boincadm/my_autodock_vina_library/ligands'' or its subfolders. 
+ Example:{{{
+ $ cd /home/boincadm/my_autodock_vina_library/ligands
+ $ ls
+ ZINC12341708.pdbqt
+ ZINC12494196.pdbqt
+ ZINC15848260.pdbqt
+ }}}
+ The .pdbqt files may be downloaded directly from some providers of ligands, or they may be generated from other formats. Among the tools performing such translations are OpenBabel and the prepare_ligand script of the MGLTools. An example of using the latter is given in the Miscellanea section.
+ 
+ === Set configuration parameters for docking ===
+ 
+ Configuration files can be prepared as described at [[http://vina.scripps.edu/manual.html#config|http://vina.scripps.edu/manual.html#config]]. The pairing of a receptor file with a ligand file and a configuration file will be performed at the final stage of workunit creation. For the bash commands below we will suppose that configuration files are kept in ''/home/boincadm/my_autodock_vina_library/configs''. 
+ Example:{{{
+ $ cd /home/boincadm/my_autodock_vina_library/configs
+ $ ls
+ pdb_human_orig.conf
+ pdb_human_energyminimised.conf
+ pdb_mouse_threaded.conf
+ }}}
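
Before work is generated, it can help to confirm that every receptor has a configuration file. The sketch below assumes that a receptor `<name>.pdbqt` is paired with `<name>.conf`, as the example listings suggest; that naming rule and the function name `receptors_without_config` are illustrative assumptions, not something the package enforces.

```shell
#!/bin/sh
# Hypothetical helper, not part of the boinc-server-autodock package.
# Assumption: receptor <name>.pdbqt belongs to configuration <name>.conf.

# Print the names of receptors in directory $1 that lack a config in directory $2.
receptors_without_config() {
    for r in "$1"/*.pdbqt; do
        [ -f "$r" ] || continue          # skip the unexpanded glob if no receptor exists
        name=$(basename "$r" .pdbqt)
        [ -f "$2/$name.conf" ] || printf '%s\n' "$name"
    done
}

# Example call for the directory layout used on this page:
# receptors_without_config /home/boincadm/my_autodock_vina_library/receptors \
#                          /home/boincadm/my_autodock_vina_library/configs
```

An empty output means that every receptor found has a configuration file of the same base name.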
+ == Management of running project ==
+ 
+ Results are collected without human intervention. The challenge is to create the right set of work units.
+ 
+ === Assimilator program for collecting docking results ===
+ 
+ We suggest using the [[http://boinc.berkeley.edu/trac/wiki/AssimilateIntro#Thesampleassimilator|sample assimilator]] that BOINC itself provides and that the ''install.sh'' script described above has already installed. It collects the output files returned from the BOINC clients into a folder named ''sample_results'' under the main project directory. For the default directory structure proposed here, this is
+ ''/var/lib/boinc-server-autodock-vina/autodockvina/sample_results''.
+ 
+ With literally millions of ligands tested against several structures, this takes considerable disk space. As an optimisation, a more sophisticated assimilator could possibly extract only the predicted energy values and coordinate files. Alternatively, one just buys another hard disk - we decided not to care and to be happy about the possibility to characterise the involvement of side chains in the binding and otherwise interpret the molecular data.
+ 
+ === BASH script to generate workunits ===
+ 
+ Every scientific application has its own, sometimes multiple, ways in which it may be used. For AutoDock Vina, the script
+ [[http://anonscm.debian.org/gitweb/?p=pkg-boinc/boinc-server-autodock.git;a=blob;f=bin/autodockvina_generatework.sh|autodockvina_generatework.sh]]
+ automates the generation of work units. It expects the ligand and receptor libraries to have been prepared and to reside
+ in the location configured as described above. Below, we propose to run the 'hts' variant to create workunits
+ for every ligand in a directory.
+ 
+ The workunit template [[http://anonscm.debian.org/gitweb/?p=pkg-boinc/boinc-server-autodock.git;a=blob;f=share/autodockvina_templates/raccoon-autodockvina_wu_template.xml|raccoon-autodockvina_wu_template.xml]] specifies the receptor as invariant between compute jobs to avoid redundant downloads. The short compute time of single ligands with Vina suggests submitting multiple ligands together as a single work unit. For the moment this is not implemented: the current approach works, and the handling of results on the server side seems easier. Also, when changing the technology, e.g. back to the classical AutoDock 4.x, compute time is likely to increase again.
+ 
+ The following example performs a bulk submission of workunits. We strongly encourage starting it from a virtual terminal, e.g. as provided by the utility DebianPkg:screen, as it may take several hours to complete: {{{
+ $ /usr/bin/autodockvina_generatework_hts somePath/receptor.pdbqt sameOrOtherPath/docking.conf yetAnotherPath/directoryWithPdbqtLigands
+ I: Workunits were successfully created for batch #5
+ }}}
+ 
+ Work can be generated as any user, but that user needs write access to the directory from which the files are eventually downloaded, i.e. $BOINC_INSTALLROOT/$BOINC_PROJECTNAME/download . When different users generate work, then, depending on how their umasks and user groups are set, one user may be unable to write to subdirectories created by another.
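
A quick check before a long submission can save a failed run. This is a minimal sketch; the function name `can_write_download_dir` is hypothetical and the check is not performed by the packaged scripts as far as this page describes them.

```shell
#!/bin/sh
# Hypothetical helper, not part of the boinc-server-autodock package.

# Succeed if the current user can create files in the download directory $1.
can_write_download_dir() {
    [ -d "$1" ] && [ -w "$1" ] && [ -x "$1" ]
}

# Example call, assuming BOINC_INSTALLROOT and BOINC_PROJECTNAME are set as for install.sh:
# can_write_download_dir "$BOINC_INSTALLROOT/$BOINC_PROJECTNAME/download" \
#     || echo "E: no write access to the download directory" >&2
```

For several users sharing the work generation, one common arrangement is a shared group, the setgid bit on the download directory (`chmod g+ws`) and a umask of 002. Whether that suits a given installation is left to the administrator.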
+ 
+ === Setting limits for workunit compute resources ===
+ 
+ Estimating the resources required for the computation remains somewhat tricky. The workunit template defines hard limits. Those are not expressed in absolute times, like seconds, but as fpops (floating point operations). That way, the effective limit for every machine depends on its own performance as benchmarked at the start of the BOINC client:{{{
+ $ grep fpops autodockvina_wu_template.xml
+         <rsc_fpops_est>1e11</rsc_fpops_est>
+         <rsc_fpops_bound>1e12</rsc_fpops_bound>
+ }}}
+ "est" stands for "estimated", which is the reference for the blue bar proceeding over time in the BOINC manager. The "bound" value is the upper limit. These default values are fine for regular ligands and well-defined binding pockets, but even the speedy AutoDock Vina may need longer when the task is less constrained. With e.g. 3136 MIPS for floating points reported by the BOINC benchmarks, dividing the fpops value by that speed gives the effective compute time in seconds that is estimated or allowed:{{{
+ $ bc
+ 10^11/3136/10^6
+ 31
+ }}}
+ That is about half a minute as the estimate and, since the bound is ten times larger, about five minutes as the maximum. Aiming at, e.g., 30 minutes with a maximum of 1 day, we would compute {{{
+ $ bc
+ 30*60*3136*10^6
+ 5644800000000
+ 24*60*60*3136*10^6
+ 270950400000000
+ }}}
+ The R suite, with its scripting flavour DebianPkg:littler, renders the result easier to read {{{
+ echo "print(270950400000000)" | r --vanilla
+ [1] 2.709504e+14
+ }}}
+ but, to be on the safe side, we suggest entering the numbers as plain integers rather than in scientific notation, to avoid parsing errors.
+ 
+ Apply these changes directly to the workunit template in $BOINC_INSTALLROOT/$BOINC_PROJECTNAME/templates/autodockvina_wu_template.xml.
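
The conversion from a target run time to an fpops value can be wrapped in a small function. This is a sketch only: the name `fpops_for_seconds` is hypothetical, and it relies on the shell doing 64-bit integer arithmetic, which bash and dash do on current 64-bit Debian systems.

```shell
#!/bin/sh
# Hypothetical helper, not part of the boinc-server-autodock package.

# Print the fpops value for a run time of $1 seconds on a host
# benchmarked at $2 floating point MIPS, as a plain integer.
fpops_for_seconds() {
    echo $(( $1 * $2 * 1000000 ))
}

fpops_for_seconds 1800 3136     # candidate rsc_fpops_est for 30 minutes
fpops_for_seconds 86400 3136    # candidate rsc_fpops_bound for 1 day
```

The two calls reproduce the 30 minute and 1 day values for the 3136 MIPS example host, printed as plain integers that can be pasted into the template.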
+ 
+ === Fine tuning workunit parameters ===
+ 
+ The page http://boinc.berkeley.edu/trac/wiki/JobIn explains all parameters that further describe the workunits. Of particular importance is
+ 
+  * ''delay_bound'' to declare (in seconds) the maximum time for the client to report back to the server. We set it to two weeks (1209600=14*24*60*60 seconds) so that misestimates of the expected compute time can accumulate and the machine can still be given a rest overnight.
+ 
+ == Result Collection ==
+ 
+ {{attachment:Figure_BOINC_Wiki_Dataflow.png}}
+ 
+ The docking proceeds quickly, and patience is often not a prime attribute of contributors. They want to know how well their computers have performed. Thus, there is a script to identify the best performing ligands on a routine basis. For the interpretation of the findings, one wants one's favourite datasets analysed on more than a single structure. These could be representative states in a molecular dynamics simulation, or the structures known or predicted for orthologous receptors, i.e. the functional equivalents in other species. One may also be in doubt about, e.g., protonation states, and some structures in PDB have alternatives indicated in their coordinates. This adds to the already impressive combinatorial might of the challenge.
+ 
+ After the first runs, one will look for patterns in ligands that are associated with success. Ligands featuring those patterns, possibly despite a mediocre performance in an initial screen on a first structure, will be granted a chance to perform on a larger variety of structures in subsequent runs.
+ 
+ === Filtering docking results ===
+ 
+ The default assimilator places all returned results in separate files.
+ This is nicely compatible with an automated processing by shell scripts - the alternative would be to fill a database while performing the assimilation. 
+ The script [[http://anonscm.debian.org/gitweb/?p=pkg-boinc/boinc-server-autodock.git;a=blob;f=bin/autodockvina_get_top_energies.sh|autodockvina_get_top_energies.sh]] retrieves the compound names together with predicted binding affinities from the output files.
+ The script by default expects to be executed in the directory with output files.
+ It produces a table with ligand names and corresponding binding affinities.
+ 
+ This data may be the first of interest for the project owner, and the script may easily be extended to extract more detailed information about docking results from the output files and the database.
+ To keep attracting users, it is suggested to routinely retrieve the list of top ligands together with the names of users whose computers found them, and to display this info at the project website.
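
To illustrate the idea, the sketch below builds such a table from Vina output files. It is not the packaged autodockvina_get_top_energies.sh: the function name `top_energies` is hypothetical, and it assumes that each file in the result directory is a Vina output in PDBQT format whose first `REMARK VINA RESULT:` line carries the best predicted affinity, and that the file name identifies the ligand.

```shell
#!/bin/sh
# Hypothetical helper, not the packaged autodockvina_get_top_energies.sh.
# Assumption: files in $1 are Vina PDBQT outputs; the first
# "REMARK VINA RESULT:" line of each holds the best affinity in kcal/mol.

# Print "<file name> <affinity>" per result file, strongest binder first.
top_energies() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        awk -v name="$(basename "$f")" \
            '/^REMARK VINA RESULT:/ { print name, $4; exit }' "$f"
    done | sort -k2,2n
}

# Example call for the default result directory, top ten binders:
# top_energies /var/lib/boinc-server-autodock-vina/autodockvina/sample_results | head -n 10
```

More negative affinities mean stronger predicted binding, so an ascending numeric sort puts the best candidates at the top. With millions of result files, the loop over a shell glob becomes slow, and a `find`-based variant would be preferable.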
+ 
+ == Conclusion ==
+ 
+ Following this description one should get the BOINC project running and ready to reach out to volunteers. Once the first results are returned, the main work is on interpreting those findings:
+  * geometric clustering of good binders on the receptor surface
+  * rerun of good binders with increased accuracy for Vina
+  * adding of additional structures to dock against to further distinguish and separate good binders
+  * evaluation and displaying of contributions from different clients
+   * averages
+   * comparison of performances for the same ligands to confirm technical correctness
+ 
+ While the provided package installs the distributed infrastructure for virtual screening, the project owners need support for the interpretation of results, which may be specific to each screening project. Significant support for workflow preparation and results analysis comes with the MGLTools in their graphical user interface Raccoon2. The workflows connecting BOINC and Raccoon2 are still being developed. Please get in touch with us developers for an update.
+ 
+ == Miscellanea ==
+ 
+  * Preparing many ligands for docking - mol2 or pdb transformed to pdbqt {{{
+ cd folderwithmol2
+ mkdir ../pdbqt
+ cd ../pdbqt
+ # convert every .mol2 or .pdb file found; the .pdbqt files are written to the current directory
+ find ../folderwithmol2 \( -name "*.mol2" -o -name "*.pdb" \) | xargs -r -n 1 python /usr/lib/python2.7/dist-packages/AutoDockTools/Utilities24/prepare_ligand4.py -v -l
+ }}} Please make sure to settle on one of the two input formats.
+ 
+ == See also ==
+ 
+  * MGLTools http://mgltools.scripps.edu
+  * BOINC    http://boinc.berkeley.edu
+  
+ 


