Getting Started with the Cambridge Structural Database


Access to the Cambridge Structural Database (CSD) is restricted. To request access, complete the CSD permission form and mail or fax it to the PSC. Once approved, you will be added to the list of approved users; only users on this list can access the CSD. You must also have a PSC account that allows you to use tourney, the Sequence Analysis Resource computer.

SSH to tourney.psc.edu and log in. Then set up the appropriate path and other initialization data with the command:

source  /biomed/db/cambridge/setup_csd

If you intend to use the CSD regularly, put this command in your .login file so that it runs each time you log in.

Start the query program, called quest (for question), with the command:

questv5 -j query

where query is a unique name for the CSD session. All of the results files from the session will be named query.ext, where "ext" takes different values for different types of results. For example, query.jnl will hold the summaries of all of the hits from the search and query.dat will contain the coordinates.
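
The naming convention can be sketched in Python. This is only an illustration: the helper name session_files is hypothetical, and only the two extensions named in this guide (.jnl and .dat) are listed; quest may write other files that are not covered here.

```python
from pathlib import Path

# Extensions named in this guide; quest may write others not listed here.
KNOWN_EXTENSIONS = {
    "jnl": "summaries of all hits from the search",
    "dat": "coordinates",
}


def session_files(session_name, directory="."):
    """Map each known extension to the file expected for this session name."""
    base = Path(directory)
    return {ext: base / f"{session_name}.{ext}" for ext in KNOWN_EXTENSIONS}


if __name__ == "__main__":
    for ext, path in session_files("query").items():
        print(f"{path}: {KNOWN_EXTENSIONS[ext]}")
```

Because every file shares the session name, choosing a new name for each session keeps earlier results from being confused with later ones.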

Next, instruct the program to save the results with the command:

save fdat

QUERY ONE - queries using the names of chemical compounds.


The next task is to formulate a query. We want to retrieve all of the structures that contain zinc atoms, imidazole rings, and waters of hydration. A "name" term must be specified for each of the desired groups. That is accomplished by entering the following three commands:

t1 *name zinc
t2 *name imidazole
t3 *name hydrate

Next we combine the terms into a formal query or question with the command:

ques t1 .and. t2 .and. t3
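
The pattern above (one numbered *name term per word, then a question joining them with .and.) is regular enough to generate. The Python sketch below only produces the text of the commands shown in this guide; the function name name_query_commands is hypothetical, and the sketch does not talk to quest in any way.

```python
def name_query_commands(words):
    """Return quest command lines requiring every word in the compound name."""
    if not words:
        raise ValueError("at least one name term is required")
    # One numbered term per word: t1, t2, t3, ...
    terms = [f"t{i} *name {word}" for i, word in enumerate(words, start=1)]
    # Join the term labels with .and. to form the question.
    labels = [f"t{i}" for i in range(1, len(words) + 1)]
    question = "ques " + " .and. ".join(labels)
    return terms + [question]


if __name__ == "__main__":
    print("\n".join(name_query_commands(["zinc", "imidazole", "hydrate"])))
```

Called with the three words used here, it reproduces the four commands of Query One.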

At this point, you will get a notice that the database is being searched. Every time a compound is encountered that has these terms in the name, a summary of the compound will be displayed on the screen and you will be asked to either keep or reject that particular compound in the results of the search. A sample summary is shown below.

---------+---------+---------+---------+---------+---------+---------+
PAFWEY10
(mu!2$-Imidazole-N,N')-bis(diethylenetriamine-4-acetato)-copper-zinc
perchlorate hydrate
C15 H31 Cu1 N8 O4 Zn1 1+,Cl1 O4 1-,2.5(H2 O1)
Zong-Wan Mao,Dong Chen,Wen-Xia Tang,Kai-Bei Yu,Li Liu
Chin.J.Chem.(Huaxue Xuebao)(Engl.), 10, 45,1992
---------+---------+---------+---------+---------+---------+---------+

QUERY TWO - queries using the names of chemical compounds with ambiguity.


The question can be modified to accept either the term "hydrate" or the term "aquo" in the compound name with the following additional commands:

t4 *name aquo
ques (t1 .and. t2) .and. (t3 .or. t4)
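
The logic of this question can be checked by hand with a short Python sketch. It assumes that a *name term matches when the word appears anywhere in the compound name, ignoring case; this guide does not describe quest's actual matching rules, so treat that as an assumption. The first name is the sample hit shown earlier; the second is a hypothetical name invented only to exercise the "aquo" branch.

```python
def name_has(name, word):
    """Assumed matching rule: case-insensitive substring test."""
    return word.lower() in name.lower()


def matches_query_two(name):
    """Evaluate (t1 .and. t2) .and. (t3 .or. t4) for one compound name."""
    t1 = name_has(name, "zinc")
    t2 = name_has(name, "imidazole")
    t3 = name_has(name, "hydrate")
    t4 = name_has(name, "aquo")
    return (t1 and t2) and (t3 or t4)


# Compound name from the sample summary (its two lines joined with a space).
SAMPLE_NAME = (
    "(mu!2$-Imidazole-N,N')-bis(diethylenetriamine-4-acetato)"
    "-copper-zinc perchlorate hydrate"
)
# Hypothetical name, not taken from the database.
HYPOTHETICAL_NAME = "aquo-bis(imidazole)-zinc"

if __name__ == "__main__":
    for name in (SAMPLE_NAME, HYPOTHETICAL_NAME, "bis(imidazole)-zinc chloride"):
        print(matches_query_two(name), name)
```

The parentheses matter: without them, the relative precedence of .and. and .or. would decide the grouping, and this guide does not state what that precedence is.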

QUERY THREE - queries using the elements present in chemical compounds.


The next question is based on the chemical elements present in the compound. The two commands necessary to retrieve all compounds containing zinc, carbon, oxygen, nitrogen, and hydrogen are:

t5 *elem zn  +  c  +  o  +  n  +  h
ques t5
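
As an illustration of what "containing" means here, the Python sketch below extracts the element symbols from a formula line like the one in the sample summary and checks that every requested element is present. It treats the listed elements as a minimum requirement; whether *elem also excludes compounds with additional elements is not stated in this guide, so that reading is an assumption. The function names are hypothetical.

```python
import re


def elements_in(formula):
    """Collect element symbols from a formula such as 'C15 H31 Cu1 N8'."""
    # A symbol is one capital letter optionally followed by one lowercase letter.
    return set(re.findall(r"[A-Z][a-z]?", formula))


def contains_all(formula, required):
    """True if every required element (any letter case) is in the formula."""
    wanted = {symbol.capitalize() for symbol in required}
    return wanted <= elements_in(formula)


# Formula line from the sample summary shown earlier.
SAMPLE_FORMULA = "C15 H31 Cu1 N8 O4 Zn1 1+,Cl1 O4 1-,2.5(H2 O1)"

if __name__ == "__main__":
    print(sorted(elements_in(SAMPLE_FORMULA)))
    print(contains_all(SAMPLE_FORMULA, ["zn", "c", "o", "n", "h"]))
```

Under this reading the sample compound satisfies the query even though it also contains copper and chlorine.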

QUERY FOUR - queries using the chemical formula of a compound.


A specific chemical formula can also be used as a query with the following set of commands:

t6 *form c4 h8 n2 o8 zn1
ques t6
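
Before typing a formula, it can help to check it against the composition you have in mind. The Python sketch below parses the whitespace-separated element/count notation used in the command above into a dictionary, so that two spellings of the same formula can be compared regardless of order or letter case. It is a simplification: it assumes every token is a one- or two-letter symbol followed by an integer count, and it does not handle the charges or multipliers seen in the sample summary. The function names are hypothetical.

```python
import re


def parse_formula(text):
    """Turn 'c4 h8 n2 o8 zn1' into {'C': 4, 'H': 8, 'N': 2, 'O': 8, 'Zn': 1}."""
    counts = {}
    for token in text.split():
        match = re.fullmatch(r"([A-Za-z]{1,2})(\d+)", token)
        if match is None:
            raise ValueError(f"unrecognized formula token: {token!r}")
        symbol = match.group(1).capitalize()
        # Sum repeated symbols so 'o4 o4' is treated the same as 'o8'.
        counts[symbol] = counts.get(symbol, 0) + int(match.group(2))
    return counts


def same_formula(a, b):
    """Compare two formulas, ignoring token order and letter case."""
    return parse_formula(a) == parse_formula(b)


if __name__ == "__main__":
    print(parse_formula("c4 h8 n2 o8 zn1"))
    print(same_formula("c4 h8 n2 o8 zn1", "Zn1 O8 N2 H8 C4"))
```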

QUERY FIVE - queries for specific peptides using the names of amino acids.


The final example is a specialized query to find all of the crystal structures that contain the tripeptide glycyl-glycyl-glycine. There is a special query language for peptides in the CSD. The commands are:

t7 *pept
pseq gly-gly-gly
ques t7
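
If a peptide is written in one-letter codes, the pseq line can be assembled with a small Python sketch like the one below. The three-letter abbreviations are the standard ones for the 20 common amino acids, but this guide only demonstrates gly, so whether quest accepts every abbreviation in this lowercase, hyphen-separated form is an assumption. The function name pseq_command is hypothetical.

```python
# Standard one-letter to three-letter amino acid codes (lowercase, as in 'gly').
THREE_LETTER = {
    "A": "ala", "R": "arg", "N": "asn", "D": "asp", "C": "cys",
    "Q": "gln", "E": "glu", "G": "gly", "H": "his", "I": "ile",
    "L": "leu", "K": "lys", "M": "met", "F": "phe", "P": "pro",
    "S": "ser", "T": "thr", "W": "trp", "Y": "tyr", "V": "val",
}


def pseq_command(one_letter_sequence):
    """Build a 'pseq' line such as 'pseq gly-gly-gly' from 'GGG'."""
    residues = []
    for letter in one_letter_sequence.upper():
        if letter not in THREE_LETTER:
            raise ValueError(f"unknown amino acid code: {letter!r}")
        residues.append(THREE_LETTER[letter])
    if not residues:
        raise ValueError("sequence is empty")
    return "pseq " + "-".join(residues)


if __name__ == "__main__":
    print(pseq_command("GGG"))
```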

USING the X-WINDOWS interface.


All of the above queries and more can be specified using a point-and-click X Window interface. This interface displays each hit in the database as a structural drawing, which is often convenient for screening the results. Start the query program with the command:

questv5 -j query

Instruct the program to use the full menu version of the X Window graphical display with these two commands:

terminal x11
menu full

An X Window display should then appear on your workstation. Place the cursor inside this window and click the left mouse button to begin. To begin developing queries, click on the TO-SEARCH button in the upper right-hand corner of the screen. Then click on the FDAT marker in the SAVE commands region on the right side of the screen.

To develop the first query using chemical names, begin by clicking on the TEXT... sub-menu label. Then click on the *NAME label in the sub-menu. The *NAME label will be highlighted by a red box. Type the word "zinc" (without quotation marks); it will appear after the *NAME label, and the red box will disappear. Click on the *NAME label in the sub-menu again, and this time type "imidazole" (again without quotation marks). Click on the *NAME label a third time and enter "hydrate". You have now described the three terms of the first query.

Return to the main search menu by clicking on the TO-SEARCH button in the upper right-hand corner of the screen. From here, go to the quest (or question) sub-menu by clicking on the QUEST... marker found toward the bottom of the right-hand edge of the window. The three *NAME terms you have developed will appear in the main part of the window. Click on the first one (T1 *NAME zinc).

Next, click on the .AND. operator in the "Question-Logic" region of the sub-menu. In order, click on the "imidazole" term, the .AND. operator, and the final "hydrate" term. As you click on these terms and operators, the question will be built in the bottom part of the window. When complete, it should read "QUEST T1 .AND. T2 .AND. T3".

At this point, you can submit the query by clicking on the START-SEARCH marker found toward the bottom of the right-hand edge of the window. You will first see a window indicating that the search has started. When a database entry that matches the query is found, another window will appear showing a structural diagram. This display can be customized. One region of the window is marked as the CONTROL section. Clicking the options in this section allows you to keep the compound for further examination, reject it from further consideration, or quit searching. After you keep or reject a compound, the search continues.
