Property Subsets

Ready-to-download subsets of ZINC by property are available below. To download, click on the subset name. More about ZINC subsets is here. You can also create your own minisubset. ZINC may be used free of charge for research by individuals and institutions. Whereas you are free to share the results of a ZINC search or a screen of molecules from ZINC, you may not redistribute major portions of ZINC without the express written permission of John Irwin.
Columns: Subset and description (click to download); Compounds (click to browse); Last update; Selection criteria; Only one source; T<.9; T<.8; T<.7; T<.6
 lead-like
(#1)

Teague, Davis, Leeson, Oprea, Angew Chem Int Ed Engl. 1999 Dec 16;38(24):3743-3748.

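Compounds: 1147326
Last update: 2008-06-25
Selection criteria: p.xlogp < 3.5 and p.mwt < 350 and p.n_h_donors <= 3 and p.n_h_acceptors <= 6 and p.mwt > 150
Only one source: 648581
T<.9, T<.8, T<.7, T<.6 (values run together): 196924838113178610133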
 fragment-like
(#2)

Carr RA, Congreve M, Murray CW, Rees DC, Drug Discov Today. 2005 Jul 15;10(14):987

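Compounds: 67489
Last update: 2008-06-25
Selection criteria: p.xlogp <=2.5 and p.mwt <=250 and 150 <= p.mwt and p.rb <=3 and p.n_h_donors <=2 and p.n_h_acceptors <=4
Only one source: 26117
T<.9, T<.8, T<.7, T<.6 (values run together): 312711593779243712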
 drug-like
(#3)

Lipinski, J Pharmacol Toxicol Methods. 2000 Jul-Aug;44(1):235-49.

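Compounds: 5348215
Last update: 2008-07-09
Selection criteria: p.xlogp <= 5 and p.mwt <= 500 and p.mwt > 150 and p.rb < 8 and p.psa < 150 and p.n_h_acceptors <= 10
Only one source: 3645012
T<.9, T<.8, T<.7, T<.6 (values run together): 146196667972961811343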
 all-purchasable
(#6)

Purchasable chemical space

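Compounds: 8204014
Last update: 2008-07-09
Selection criteria: none listed
Only one source: 6011385
T<.9, T<.8, T<.7, T<.6 (values run together): 145566636462678210083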
 newton-hit-like
(#7)

Roger Newton's (Maybridge) informed tweak of Teague/Oprea's Lead-like concept (ref Lecture at UCSF Dec 05)

Compounds: 136008
Last update: 2008-06-25
Selection criteria: p.xlogp>1 and p.xlogp<3 and p.mwt>200 and p.mwt<350
Only one source: 99119
T<.9, T<.8, T<.7, T<.6: N/A, N/A, N/A, N/A
 everything
(#10)

All public molecules in ZINC, including around 1M compounds not in subset #6 that may not be available commercially

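Compounds: 8490191
Last update: 2008-07-09
Selection criteria: none listed
Only one source: 5801569
T<.9, T<.8, T<.7, T<.6 (values run together): 4757671921976994824766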
 neutral-fragments
(#17)

As Subset #1, but with no charge

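Compounds: 66644
Last update: 2008-06-25
Selection criteria: p.xlogp <=3 and -2 <= p.xlogp and p.mwt <=250 and 150 <= p.mwt and p.rb<=3 and p.n_h_donors <=2 and p.n_h_acceptors<=4 and p.net_charge = 0
Only one source: 24845
T<.9, T<.8, T<.7, T<.6 (values run together): 308301538576503611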
 CNS permeable
(#29)

more likely to pass the blood-brain barrier

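Compounds: 209748
Last update: 2008-06-25
Selection criteria: p.psa <60 and p.psa > 0 and p.mwt < 400 and p.xlogp < 2.7 and p.xlogp > 1.5 and p.mwt > 150
Only one source: 121336
T<.9, T<.8, T<.7, T<.6 (values run together): 1349975647418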
 stiff leads
(#30)

no comment

Compounds: 68097
Last update: 2008-06-25
Selection criteria: p.rb <= 1 and p.mwt < 350 and p.xlogp<4 and p.xlogp > -2
Only one source: 21230
T<.9, T<.8, T<.7, T<.6: N/A, N/A, N/A, N/A
 monoanions
(#31)

small and negative

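Compounds: 40613
Last update: 2008-06-25
Selection criteria: p.net_charge = -1 and p.mwt>200 and p.mwt < 300 and p.xlogp < 4 and p.xlogp > -2
Only one source: 18934
T<.9, T<.8, T<.7, T<.6 (values run together): 226214791002582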
 monocations
(#32)

small and positive

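Compounds: 81531
Last update: 2008-06-25
Selection criteria: p.net_charge = 1 and p.mwt>200 and p.mwt < 300 and p.xlogp < 4 and p.xlogp > -2
Only one source: 79730
T<.9, T<.8, T<.7, T<.6 (values run together): 388124941550879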
 goldilocks
(#33)

not too big, not too small, not too polar, not too greasy, uncharged, just right

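Compounds: 472103
Last update: 2008-06-25
Selection criteria: p.net_charge = 0 and p.mwt>200 and p.mwt < 300 and p.xlogp < 4 and p.xlogp > -2
Only one source: 228166
T<.9, T<.8, T<.7, T<.6 (values run together): 432370317262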
 Piotr
(#38)

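Compounds: 637036
Last update: 2008-06-25
Selection criteria: p.mwt <= 300 and p.n_h_donors <= 5 and p.n_h_acceptors <= 5 and p.net_charge <=1 and -1 <= p.net_charge
Only one source: 299058
T<.9, T<.8, T<.7, T<.6 (values run together): 7056435932159666077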
 leads-frags
(#41)

Teague, Davis, Leeson, Oprea, Angew Chem Int Ed Engl. 1999 Dec 16;38(24):3743-3748.

Compounds: 1180048
Last update: 2008-06-25
Selection criteria: p.xlogp<4 and p.xlogp> -2 and p.mwt < 350 and p.n_h_donors <= 3 and p.n_h_acceptors <= 6 and p.mwt > 250
Only one source: 698876
T<.9, T<.8, T<.7, T<.6: N/A, N/A, N/A, N/A
 kerim-like
(#42)

less floppy fragments for AmpC project

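Compounds: 135716
Last update: 2008-06-25
Selection criteria: p.xlogp <=3 and -2 <= p.xlogp and p.mwt <= 250 and 150 <= p.mwt and p.rb <=3
Only one source: 52281
T<.9, T<.8, T<.7, T<.6 (values run together): 15471059719490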
 abram1
(#49)

RNA

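Compounds: 112891
Last update: 2008-06-25
Selection criteria: p.mwt <= 500 and p.net_charge in (0,1,2) and p.n_h_donors in (0,1,2,3,4,5,6,7,8,9,10) and p.n_h_acceptors in (0,1,2,3,4,5,6,7,8,9,10) and p.rb in (0,1,2,3,4,5,6,7,8,9,10)
Only one source: 15648
T<.9, T<.8, T<.7, T<.6 (values run together): 1035870712554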
 stiff-soluble
(#50)

soluble and rigid fragments

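Compounds: 19284
Last update: 2008-06-25
Selection criteria: p.xlogp<=2 and p.rb<=2 and p.n_h_acceptors<=3 and p.n_h_donors <=1 and p.mwt<250
Only one source: 6909
T<.9, T<.8, T<.7, T<.6 (values run together): 10031549528211440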
 stiffs
(#51)

rigid, lead-like-ish molecules

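Compounds: 101570
Last update: 2008-06-25
Selection criteria: p.xlogp<=4 and p.rb<=1 and p.mwt<400
Only one source: 40091
T<.9, T<.8, T<.7, T<.6 (values run together): 3877120267103705126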
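
The selection criteria above read as boolean filters over per-molecule properties. As a minimal sketch, assuming each molecule's properties are available as a Python dict keyed by the names used in the criteria with the `p.` prefix dropped, the lead-like (#1) and fragment-like (#2) rules could be checked as follows. The dict layout, the function names, and the sample record are illustrative assumptions, not part of ZINC.

```python
def is_lead_like(p):
    """Subset #1: xlogp < 3.5, 150 < mwt < 350, donors <= 3, acceptors <= 6."""
    return (
        p["xlogp"] < 3.5
        and 150 < p["mwt"] < 350
        and p["n_h_donors"] <= 3
        and p["n_h_acceptors"] <= 6
    )


def is_fragment_like(p):
    """Subset #2: xlogp <= 2.5, 150 <= mwt <= 250, rb <= 3,
    donors <= 2, acceptors <= 4."""
    return (
        p["xlogp"] <= 2.5
        and 150 <= p["mwt"] <= 250
        and p["rb"] <= 3
        and p["n_h_donors"] <= 2
        and p["n_h_acceptors"] <= 4
    )


# Hypothetical property record, made up for illustration only.
example = {
    "xlogp": 1.8,
    "mwt": 210.0,
    "rb": 2,
    "n_h_donors": 1,
    "n_h_acceptors": 3,
}

print(is_lead_like(example), is_fragment_like(example))
```

The same pattern extends to the other subsets by translating each `and` clause into one comparison.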

The following limits are in effect:
  • Molecules per mol2 or SDF format file: 100,000
  • Entries per flexibase file: 20,000
  • Typical size of mol2/sdf/flexibase files: 20 to 100 MB
  • Number of large files that may be downloaded concurrently: 2
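
To illustrate the per-file caps listed above, here is a small sketch that splits a list of molecule identifiers into slices no larger than a given limit. How ZINC actually slices its files is not described here; the function name and the made-up identifiers are assumptions for illustration.

```python
MAX_MOLECULES_PER_FILE = 100_000    # mol2 / SDF cap stated in the list above
MAX_ENTRIES_PER_FLEXIBASE = 20_000  # flexibase cap stated in the list above


def split_into_files(items, limit):
    """Yield consecutive slices of `items`, each holding at most `limit` entries."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    for start in range(0, len(items), limit):
        yield items[start:start + limit]


# Hypothetical identifiers standing in for real molecules.
ids = [f"mol{i}" for i in range(250_000)]
sizes = [len(chunk) for chunk in split_into_files(ids, MAX_MOLECULES_PER_FILE)]
print(sizes)
```

With 250,000 placeholder entries, the mol2/SDF cap yields three slices, the last one partially filled.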
In the downloads above, we offer four formats: SMILES, mol2, SDF, and flexibase. Most docking programs we know of use mol2. We urge you not to download Flexibase unless you are using DOCK3.5.54.

We have generated protonated forms in 3 pH ranges. The reference structure ("ref") is a single representative structure at pH 7. The middle pH range ("mid") contains additional protonated forms obtained by titrating protons with pKa between 5.75 and 8.25. The low pH range ("lo") contains additional protonated forms between pH 4.5 and 7, and the high pH range ("hi") contains additional protonated forms between pH 7 and 9.5.

Databases are divided into chunks for easier download. mol2 and sdf slices are 100MB or less uncompressed (often 30MB compressed, containing 20K to 30K molecules each). Flexibase slices are 200MB or less uncompressed.

To download all files in the usual pH range (5.75-8.25), use "Usual". To download databases for metalloenzymes, use "Metal", which adds the pH range 7-9.5 to "Usual". For the entire pH range 4.5-9.5, use "All".
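
The relationship between the three pH ranges and the three download options can be sketched as a lookup table. The numeric ranges come from the text above; encoding "All" as the union of "lo", "mid", and "hi", treating range endpoints as inclusive, and the function name `covers` are assumptions made for this illustration.

```python
# pH ranges as described in the text.
PH_RANGES = {
    "lo": (4.5, 7.0),
    "mid": (5.75, 8.25),
    "hi": (7.0, 9.5),
}

# Download options expressed as lists of range names (assumed encoding).
DOWNLOAD_SETS = {
    "Usual": ["mid"],
    "Metal": ["mid", "hi"],
    "All": ["lo", "mid", "hi"],
}


def covers(option, ph):
    """Return True if `ph` lies inside any pH range included in `option`."""
    for name in DOWNLOAD_SETS[option]:
        low, high = PH_RANGES[name]
        if low <= ph <= high:
            return True
    return False


for option in DOWNLOAD_SETS:
    print(option, covers(option, 9.0))
```

For example, a pH of 9.0 falls outside "Usual" but inside both "Metal" and "All".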

We also offer tables of molecular properties, purchasing information, and InChI files.

Last updated: Jan 30, 2007 by jji at cgl.ucsf.edu.

A product of BCIRC, the Bioinformatics and Chemical Informatics Research Center @ UCSF. Last updated May 6, 2008. Bug reports to support at docking.org; comments to comments at docking.org; questions and discussion to blaster-fans at docking.org.