PubChem Databases




Conversion date: 25.04.2016
PubChem
PubChem (1) is designed to provide information on the biological activities of small molecules, generally those with a molecular weight of less than 500 daltons (2). PubChem's integration with NCBI's Entrez (3) information retrieval system provides substructure and structure-similarity searching, bioactivity data, and links to biological property information in PubMed and NCBI's Protein 3D Structure Resource.
PubChem Databases
PubChem comprises three linked databases:

  • PubChem Compound
  • PubChem Substance
  • PubChem BioAssay
PubChem Compound (unique structures with computed properties)
PubChem Compound (4) is a searchable database of chemical structures with validated chemical depiction information provided to describe substances in PubChem Substance. Structures stored within PubChem Compound are pre-clustered and cross-referenced by identity and similarity groups. PubChem Compound includes over 5M compounds.


  • Molecular Name Searches (e.g., Tylenol, Benzene) allow searching with a variety of chemical synonyms.
  • Chemical Property Range Searches (e.g., Molecular Weight between 100 and 200, Hydrogen Bond Acceptor Count between 3 and 5) allow searching for compounds with a variety of physical/chemical properties and descriptors.
  • Simple Elemental Searches (e.g., all compounds containing Gallium) allow searching with specific element restrictions.
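
The property range searches above can be written as Entrez-style range terms of the form low:high[Field]. The following is a minimal sketch of how such terms might be assembled; the helper names `range_term` and `all_of` are hypothetical, and the field names `MolecularWeight` and `HydrogenBondAcceptorCount` are assumptions that should be checked against the list of field aliases (10).

```python
def range_term(field: str, low: float, high: float) -> str:
    """Format an Entrez-style range term: low:high[Field]."""
    if low > high:
        raise ValueError("low must not exceed high")
    return f"{low}:{high}[{field}]"


def all_of(*terms: str) -> str:
    """Join terms with AND so that every restriction must hold."""
    return " AND ".join(f"({t})" for t in terms)


# Field names are assumptions; confirm them in the PubChem index list.
query = all_of(
    range_term("MolecularWeight", 100, 200),
    range_term("HydrogenBondAcceptorCount", 3, 5),
)
print(query)
```

The resulting string is intended to be pasted into the PubChem Compound search box; wrapping each term in parentheses keeps the grouping unambiguous when more restrictions are added.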

PubChem Substance (deposited structures)


PubChem Substance (5) is a searchable database containing descriptions of chemical samples, from a variety of sources, and links to PubMed citations, protein 3D structures, and biological screening results available in PubChem BioAssay. PubChem Substance includes over 8M records. Substances with known content are linked to PubChem Compound.


  • Molecule Synonym Searches (e.g. all substances with 'deoxythymidine' as a name fragment, or substances that contain 3'-Azido-3'-deoxythymidine).
  • Biology Links Search (e.g. substances with tested, active or inactive bioassays).
  • Combined Searches (e.g. substances that are 'Active in any BioAssay' and contain the element Ruthenium).

PubChem BioAssay


PubChem BioAssay (6) is a searchable database containing bioactivity screens of chemical substances described in PubChem Substance. PubChem BioAssay includes over 180 bioassays. Each bioassay has a searchable description that covers the screening procedure, conditions and readouts.


  • Search for BioAssay data sets (e.g. HIV growth inhibition).
  • Browse or download PubChem BioAssay results (e.g. NCI AIDS Antiviral Assay).

Searching PubChem



PubChem Text Search
PubChem Text Search searches by compound name, synonym or ID and defaults to PubChem Compound. The search results page offers a pull-down 'databases' menu that allows searching in PubChem Substance, PubChem BioAssay and a variety of other Entrez databases.


PubChem Chemical Structure Search
PubChem Chemical Structure Search (7) has the following options:

  • Search SMILES (including SMARTS or InChI) or Formula, which includes a 'Sketch' link to a drawing program that converts structural diagrams to SMILES (exact), SMARTS (substructure) or InChI (exact) strings for searching. Clicking 'Done' in the structure editor converts the structural diagram to the appropriate string and transfers it to the search box.
  • Select Structure File allows importation of standard and common chemical file formats (8).
  • Specify Search Type allows restriction to: same compound, similar compounds (9), formula or substructure.
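
The 'similar compounds' search type relies on a Tanimoto score computed from the binary fingerprints of two structures (9). As a rough illustration of that score only, the sketch below represents each fingerprint as the set of its 'on' bit positions; the bit positions are made up for the example, and PubChem's actual fingerprint definition is not reproduced here.

```python
def tanimoto(fp_a: set[int], fp_b: set[int]) -> float:
    """Tanimoto score: shared 'on' bits divided by bits set in either fingerprint."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(fp_a & fp_b) / union


# Hypothetical fingerprints, given as positions of the bits that are set.
a = {1, 4, 7, 9}
b = {1, 4, 8, 9, 12}
score = tanimoto(a, b)
print(f"{score:.2f}")  # 3 shared bits out of 6 distinct bits -> 0.50
```

A score of 1.0 means the two fingerprints are identical, and lower scores mean fewer shared features; a percent similarity is this score multiplied by 100.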


PubChem Indexes and Index Search
PubChem Indexes and Index Search allow fielded/range searching from either the PubChem homepage or the Entrez search page. An extensive list of field aliases and examples of range searching are provided (10).
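
A fielded or range term can also be placed in a URL rather than typed into the search box. The sketch below builds a request URL for NCBI's E-utilities `esearch` endpoint. That endpoint is not mentioned in this handout and is used here only as an assumption about how such a search might be scripted. The database name `pccompound` comes from the URLs in the references, the field name `MolecularWeight` should be checked against the field alias list (10), and `esearch_url` is a hypothetical helper. No request is sent.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# E-utilities search endpoint (an assumption: not part of the original handout).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(db: str, term: str, retmax: int = 20) -> str:
    """Return a URL-encoded esearch request for one Entrez database."""
    params = {"db": db, "term": term, "retmax": retmax}
    return f"{ESEARCH}?{urlencode(params)}"


# "MolecularWeight" is an assumed field name; confirm it in the index list.
url = esearch_url("pccompound", "100:200[MolecularWeight]")

# Decode the query string again to confirm the term survived encoding.
decoded = parse_qs(urlsplit(url).query)
print(decoded["term"][0])
```

Using `urlencode` matters because the colon and square brackets in a range term are reserved characters in a URL and need to be percent-encoded.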

PubChem Search Results (11)


PubChem Compound
PubChem Compound results are derived from PubChem Substance records that provide structures. Since compounds are structurally unique, one compound may link to multiple substances.

The default display is a compound summary with thumbnails and cross links (12) to each PubChem database, other NCBI databases, and depositors' databases.


Clicking either the structure or the CID link gives the full display, which includes the compound's property data, description, related substance information, neighboring structures, and cross links.



PubChem Substance


PubChem Substance holds unique records even when the structure is not known or not supplied. Examples include sulfated polymannuroguluronate, a novel anti-acquired immune deficiency syndrome (AIDS) drug candidate, and other natural products.
The PubChem Substance Summary Record,
---------------------------------------------------------------------------------------------------------
SID: 3724242
Links
Sulfated polymannuroguluronate, AIDS218087 ...
Source: NIAID(218087)
---------------------------------------------------------------------------------------------------------
is linked to the full record by clicking on the SID number (PubChem's substance identifier). This displays the full substance record, which includes links to PubMed and the source; the Medical Subject Annotation (MeSH Substance Name) and a MeSH PubMed search link; and depositor-supplied synonyms and comments.

PubChem BioAssay


The PubChem BioAssay Summary Record,
-------------------------------------------------------------------------------
AID: 179
Links
NCI AIDS Antiviral Assay
Source: DTP/NCI
15 Readouts, 37678 substances tested
-------------------------------------------------------------------------------
is linked to the full record by clicking on the AID number (PubChem's assay (protocol) identifier). This displays the full bioassay record, which includes links to the substances tested (all, active, inactive, inconclusive) and to related PubMed, Protein, Taxonomy, OMIM and BioAssay records, as well as a description of the assay, possibly with protocols and comments.

References:


1a. PubChem

http://pubchem.ncbi.nlm.nih.gov/
1b. PubChem - Overview

http://pubchem.ncbi.nlm.nih.gov/help.html#PubChem_Overview
1c. PubChem FAQ

http://pubchem.ncbi.nlm.nih.gov/help.html#faq
1d. PubChem Glossary

http://pubchem.ncbi.nlm.nih.gov/help.html#Glossary
1e. PubChem - Help

http://pubchem.ncbi.nlm.nih.gov/help.html

"Provides tips and examples for searches of the three PubChem databases by text term/keyword, as well as tips for searching PubChem Compound by chemical properties. The help documents for structure search provide tips on using chemical information for basic and advanced structure search options in the PubChem Structure Search."


2. NIH Roadmap for Medical Research. Molecular Libraries and Imaging.

http://nihroadmap.nih.gov/molecularlibraries
3. Entrez Databases

http://www.ncbi.nlm.nih.gov/About/tools/restable_mol.html
4a. PubChem Compound

http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pccompound

"Compound -- Chemical representatives in a substance. Chemical structure presented in a compound is standardized through PubChem's data pipeline. A mixture substance may have several standardized compounds." Since compounds are structurally unique, one compound may link to many substances. CID is PubChem's compound identifier.


4b. PubChem Compound Database - search examples

http://pubchem.ncbi.nlm.nih.gov/help.html#PubChem_Compound_Database
5a. PubChem Substance

http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pcsubstance

"Substance -- Individual record object collected from depositors, representing a sample used at bioassay."


5b. PubChem Substance Database - search examples

http://pubchem.ncbi.nlm.nih.gov/help.html#PubChem_Substance_Database
6a. PubChem BioAssay

http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pcassay
6b. PubChem BioAssay Database - search and display examples

http://pubchem.ncbi.nlm.nih.gov/help.html#PubChem_BioAssay_Database
7a. PubChem Structure Search

http://pubchem.ncbi.nlm.nih.gov/search/
7b. PubChem Structure Search Help

http://pubchem.ncbi.nlm.nih.gov/search/help_simplesearch.htm
7c. PubChem Advanced Structure Search Help

http://pubchem.ncbi.nlm.nih.gov/search/help_search.htm
8. PubChem Structure Search Help. Upload Query File

http://pubchem.ncbi.nlm.nih.gov/search/help_simplesearch.htm#EntrezTerm

"Most (if not all) standard and common chemical file formats may be used, including "MOL", "SDF" (both v2000 and v3000), "CDX", "SKC", "MOL", "MOL2", "JME", and "SK2". You may also use a text file with your choice of a "SMILES", "SMARTS", or "SLN" string."


9. PubChem Help. Similar Compounds/Substances Link.

http://pubchem.ncbi.nlm.nih.gov/help.html#xSimilar_Compounds

"The different percent similarities are determined using a Tanimoto score relative to the "binary fingerprint" calculated for two different chemical structures."


10. PubChem Indexes and Index Search

http://pubchem.ncbi.nlm.nih.gov/help.html#PubChem_index
11. PubChem Summary Display

http://pubchem.ncbi.nlm.nih.gov/help.html#PubChem_Summary
12. PubChem Cross Links

http://pubchem.ncbi.nlm.nih.gov/help.html#PubChem_Links

>1) Go to the PubChem "Compound" or "Substance" pages, depending on whether
>you want unique structure records only or all deposited structures. The URLs
>are:
>
>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pccompound (1 hit for the
>InChI below)
>
>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pcsubstance
>(5 hits for the InChI below)
>
>2) Paste in your InChI, for example:
>
>"InChI=1/C17H14O4S/c1-22(19,20)14-9-7-12(8-10-14)15-11-21-17(18)16(15)13-5-3
>-2-4-6-13/h2-10H,11H2,1H3"
>
>Note that the QUOTES ARE REQUIRED, and there must be no carriage return or
>line feed in the string, despite appearances on this email. Note also that
>the current text query system does not actually recognize the numbers and
>punctuation characters, just the count. It seems to identify the correct
>structures most all the time, nonetheless. I am told a proper recognition
>system for InChI's (a structure decoder) is in the works. Anyone
>interested in this should contact NLM/NCBI directly.
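
The two requirements in the note above (surrounding quotes, and no carriage return or line feed inside the string) can be enforced before pasting. The following is a minimal sketch; `entrez_inchi_term` is a hypothetical helper name, and the input is the line-wrapped InChI exactly as it appears in the email.

```python
def entrez_inchi_term(raw: str) -> str:
    """Strip quotes and all whitespace (including line breaks), then re-quote."""
    cleaned = "".join(raw.replace('"', "").split())
    if not cleaned.startswith("InChI="):
        raise ValueError("expected a string starting with 'InChI='")
    return f'"{cleaned}"'


# The InChI as it arrived in the email, broken across two lines.
wrapped = """InChI=1/C17H14O4S/c1-22(19,20)14-9-7-12(8-10-14)15-11-21-17(18)16(15)13-5-3
-2-4-6-13/h2-10H,11H2,1H3"""

term = entrez_inchi_term(wrapped)
print(term)
```

Removing all whitespace is safe here because an InChI string contains no internal spaces, so any whitespace found in it was introduced by line wrapping or copying.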

Entrez cross-database search page
Titled "Entrez: The life sciences search engine", this page is a useful gateway for searching all NCBI databases, including PubMed, PubChem, Genome projects and more. Users can enter terms and click 'GO' to run the search against all the databases, click a database name or icon to go directly to the search page for that database, or click the question mark for a short explanation of that database.

http://www.ncbi.nlm.nih.gov/gquery/gquery.fcgi

http://www.emolecules.com/doc/index.htm

eMolecules discovers sources of chemical data by searching the internet, and receives submissions from data providers such as chemical suppliers and academic researchers.


This is the most comprehensive overhaul of eMolecules since our launch in November 2005. We rebuilt the eMolecules database from the ground up, starting from updated databases and catalogs. Then we added four million new entries from dozens of new sources. In addition, the results pages were redesigned for a cleaner, more compact presentation.

  • Over 5.5 million unique molecules from over 16 million sources
  • Over 500,000 CAS numbers
  • Over 100 chemical suppliers
  • New government and academic databases, such as NIST and NCI, with direct links to their data
  • Tens of thousands of trade names and common names
  • Over 2 million IUPAC and other names



We provide links to real molecules, those available for purchase, and to chemical properties databases with real information, whenever possible, and without ambiguity in the stereochemistry.

