S PN=5577239 OR PN=5950192 OR PN=4642762

               2  PN=5577239
               1  PN=5950192
               1  PN=4642762
      S2       3  PN=5577239 OR PN=5950192 OR PN=4642762
?
 
 
 TYPE 2/2/ALL


  2/2/1     (Item 1 from file: 653) 
DIALOG(R)File 653:US Patents Fulltext
(c) format only 2001 The Dialog Corp. All rts. reserv.
 
             01566805
Utility
STORAGE AND RETRIEVAL OF GENERIC CHEMICAL STRUCTURE REPRESENTATIONS
 
PATENT NO.:  4,642,762
ISSUED:      February 10, 1987 (19870210)
INVENTOR(s): Fisanick, William, Columbus, OH (Ohio), US (United States of
             America)
ASSIGNEE(s): American Chemical Society, (A  U.S. Company or Corporation ),
             Washington, DC (District of Columbia), US (United States of
             America)
EXTRA INFO:  Expired, effective February 10, 1999 (19990210), recorded in
             O.G. of April 20, 1999 (19990420)
             Reinstated, effective June 28, 1999 (19990628), recorded in
             O.G. of July 27, 1999 (19990727)
APPL. NO.:   6-614,219
FILED:       May 25, 1984 (19840525)
U.S. CLASS:  707-3 cross ref: 707-104
INTL CLASS:  [4] G06F 15-40
FIELD OF SEARCH: 364-200MSFILE; 364-900MSFILE
                             References Cited
 
                          U.S. PATENT DOCUMENTS
 
    4,473,890    9/1984   Araki                                  364-900
 
PRIMARY EXAMINER: Zache, Raulfe B.
ATTORNEY, AGENT, OR FIRM: Pollick, Philip J.
CLAIMS:           31
EXEMPLARY CLAIM:  1
DRAWING PAGES:    22
DRAWING FIGURES:  22
ART UNIT:         232
FULL TEXT:        1514 lines
 
 
  2/2/2     (Item 1 from file: 654) 
DIALOG(R)File 654:US PAT.FULL.
(c) format only 2001 The Dialog Corp. All rts. reserv.
 
             02999303
Utility
RELATIONAL  DATABASE  MANGEMENT  SYSTEM  FOR  CHEMICAL  STRUCTURE  STORAGE,
SEARCHING AND RETRIEVAL
 
PATENT NO.:  5,950,192
ISSUED:      September 07, 1999 (19990907)
INVENTOR(s): Moore, Jeffrey, Timonimun, MD (Maryland), US (United States of
             America)
             Brazil, Joanne, White Hall, MD (Maryland), US (United States
             of America)
             Hoover, Jeffrey R., Baltimore, MD (Maryland), US (United
             States of America)
ASSIGNEE(s): Oxford Molecular Group, Inc , (A U.S. Company or Corporation),
             Towson, MD (Maryland), US (United States of America)
APPL. NO.:   8-883,165
FILED:       June 26, 1997 (19970626)
 
  This application is a continuation of application Ser. No. 08-715,708,
filed  Sep. 19, 1996, now abandoned, which is a continuation application of
Ser.  No. 08-288,503, filed Aug. 10, 1994, now U.S. Pat. No. 5,577,239, the
entire disclosure of which is incorporated herein by reference.
 
U.S. CLASS:  707-3 cross ref: 702-27
INTL CLASS:  [6] G06F 17-30
FIELD OF SEARCH: 395-496; 395-497; 395-499; 395-600; 395-603; 707-3; 707-2;
             707-1; 707-100; 707-102; 707-104; 707-22; 707-27; 707-19;
             707-20; 702-22; 702-27; 702-19; 702-20
 
                             References Cited
 
                          U.S. PATENT DOCUMENTS
 
    4,642,762    2/1987   Fisanick                                 707-3
    4,811,217    3/1989   Tokizane et al.                        364-300
    4,855,931    8/1989   Saunders                               364-499
    5,025,388    6/1991   Cramer, III et al.                     364-496
    5,056,035   10/1991   Fujita                                 364-497
    5,259,137   11/1993   Wilson et al.                          364-496
    5,367,058   11/1994   Pitner et al.                        530-391.9
    5,379,234    1/1995   Wilson et al.                          364-496
    5,386,507    1/1995   Teig et al.                            395-161
    5,418,944    5/1995   DiPace et al.                          395-600
    5,463,564   10/1995   Agrafiotis et al.                      364-496
    5,577,239   11/1996   Moore et al.                           395-603
 
                         NON-U.S. PATENT DOCUMENTS
 
    090 895 A2   10/1983   EP (European Patent Office)
    213 483 A2    3/1987   EP (European Patent Office)
 
                             OTHER REFERENCES
 
 
Viking Instruments Corp. (Hewlett Packard); SpectraTrak Transportable GC/MS
Systems; (brochure)-No Date.
 
Chemical Structures, The International Language of Chemistry; Wendy A. Warr
(Ed.); "Interfacing DARC--Oracle" AJCM (Juus) de Jong (1988).
 
J.  Chem.  Inf.  Comput.  Sci.  (1983)  , vol. 23, No. 3; pp. 102-108; DARC
Substructure  Search  System: A New Approach to Chemical Information; Roger
Attias.
 
J.  Chem. Inf. Comput. Sci. (1987), vol. 27, No. 2; pp. 74-82; DARC System:
Notions  of Defined and Generic Substructures. Filiation and Coding of FREL
Substructure (SS) Classes; Jacques-Emile Dubois et al.
 
J.   Chem.  Inf.  Comput.  Sci.  (1990),  vol.  30,  No.  2;  pp.  191-199,
Substructure  Search Systems. 1. Performance Comparison of the MACCS, DARC,
HTSS,  and CAS Registry MVSSS, and S4 Substructure Search System; Martin G.
Hicks & Clemens.
 
J.  Chem.  Inf.  Comput.  Sci.  (1988),  vol.  28,  No.  4; pp. 221-226; An
Efficient Graph Approach to Matching Chemical Structures, O. Owolabi.
 
J.  Chem.  Inf. Comput. Sci. (1990), vol. 30, No. 4; pp. 332-339; Reactions
in the Beilstein Information System: Nonaporic Organic Synthesis; Martin G.
Hicks.
 
Analytica  Chimica Acta, 235 (1990), pp. 87-92; Substructure Search Systems
for Large Chemical Data Bases; Martin G. Hicks et al.
 
J.  Chem.  Inf.  Comput.  Sci.  (1991),  vol.  31,  No. 2; pp. 320-326; The
Beilstein Structure Registry System. 1. General Design; Laszlo Domokos.
 
J. Chem. Inf. Comput. Sci. (1989), vol. 29, No. 4; pp. 255-260; 3DSearch; A
System for Three-Dimensional Substructure Searching; Robert P. Sheridan, et
al.
 
Substructure  Searches  of  Chemical  Structure  Files;  (Jan.  23,  1973);
Strategic   Considerations   in  the  Design  of  a  Screening  System  for
Substructure  Searches  of  Chemical Structure Files; George W. Adamson, et
al.
 
Chemical  Structure  Searching;  (Jan.  21,  1975); An Efficient Design for
Chemical Structure Searching. I. The Screens; Alfred Feldman et al.
 
J. Chem. Inf. Comput. Sci. (1982), vol. 22, No. 4; The Third BASIC Fragment
Search Dictionary; W. Graf, H. K. Kaindl, et al.
 
J.  Chem.  Inf.  Comput. Sci. (1983), vol. 23, No. 3; The CAS Online Search
System.  1.  General  System  Design  and Selection, Generation, and Use of
Search Screens; P. G. Dittmar, et al.
 
Computer  Chemical,  (1991),  vol.  15,  No. 2; pp. 103-107; A Central Atom
Based  Algorithm  and Computer Program for Substructure Search; Alf Dengler
and Ivar Ugi.
 
J.  Chem.  Inf. Comput. Sci. (1993), vol. 33, No. 4; pp. 545-547; Structure
Searching  in  Chemical  Databases  by  Direct  Lookup  Methods; Bradley D.
Christie et al.
 
 
PRIMARY EXAMINER: Von Buhr, Maria N.
ATTORNEY, AGENT, OR FIRM: Dickstein Shapiro Morin & Oshinsky
CLAIMS:           14
EXEMPLARY CLAIM:  1
DRAWING PAGES:    7
DRAWING FIGURES:  12
ART UNIT:         277
FULL TEXT:        798 lines
 
 
  2/2/3     (Item 2 from file: 654) 
DIALOG(R)File 654:US PAT.FULL.
(c) format only 2001 The Dialog Corp. All rts. reserv.
 
             02592280
Utility
CHEMICAL STRUCTURE STORAGE, SEARCHING AND RETRIEVAL SYSTEM
 
PATENT NO.:  5,577,239
ISSUED:      November 19, 1996 (19961119)
INVENTOR(s): Moore, Jeffrey, 12 Breezy Tree Ct., Timonimun, MD (Maryland),
             US (United States of America), 21093
             Brazil, Joanne, 4500 Jolly Acres Rd., White Hall, MD
             (Maryland), US (United States of America), 21161
             Hoover, Jeffrey R., 8639 Willow Oak Rd., Baltimore, MD
             (Maryland), US (United States of America), 21234
             [Assignee Code(s): 68000]
EXTRA INFO:  Assignment transaction [Reassigned], recorded October 12,
             1994 (19941012)
             Assignment transaction [Reassigned], recorded January 21,
             1997 (19970121)
APPL. NO.:   8-288,503
FILED:       August 10, 1994 (19940810)
U.S. CLASS:  707-3 cross ref: 702-27
INTL CLASS:  [6] G06F 17-30
FIELD OF SEARCH: 364-DIG.1; 364-DIG.2; 364-496; 364-497; 364-499; 395-600
 
                             References Cited
 
                          U.S. PATENT DOCUMENTS
 
    4,642,762    2/1987   Fisanick                               364-300
    4,811,217    3/1989   Tokizane et al.                        364-300
    4,855,931    8/1989   Saunders                               364-499
    5,025,388    6/1991   Cramer, III et al.                     364-496
    5,056,035   10/1991   Fujita                                 364-497
    5,249,137    2/1993   Wilson et al.                          364-496
    5,367,058   11/1994   Pitner et al.                        530-391.9
    5,379,234    1/1995   Wilson et al.                          364-496
    5,386,507    1/1995   Teig et al.                            395-161
    5,418,944    5/1995   DiPace et al.                          395-600
    5,463,564   10/1995   Agrafiotis et al.                      364-496
 
                             OTHER REFERENCES
 
 
Viking  Instruments  Corp.  (Hewlett  Packard);  Spectra Trak Transportable
GC/MS System; (brochure), No date.
 
Chemical Structure, The International Language of Chemistry; Wendy A. Warr
(Ed.); "Interfacing DARC-Oracle" AJCM (Juus) de Jong (1988).
 
J.  Chem.  Inf.  Comput.  Sci.  (1983),  vol.  23,  No. 3 pp. 102-108; DARC
Substructure  Search  System; A New Approach to Chemical Information; Roger
Attias.
 
J.  Chem. Inf. Comput. Sci. (1987), vol. 27, No. 2; pp. 74-82; DARC System;
Notions  of Defined and Generic Substructures. Filiation and Coding of FREL
Substructure (SS) Classes; Jacques-Emile Dubois et al.
 
J.   Chem.  Inf.  Comput.  Sci.  (1990),  vol.  30,  No.  2;  pp.  191-199,
Substructure  Search Systems, 1, Performance Comparison of the MACCS, DARC,
HTSS,  CAS  Registry  MVSSS,  and S4 Substructure Search Systems; Martin G.
Hicks.
 
J.  Chem.  Inf.  Comput.  Sci.  (1988),  vol.  28,  No.  4; pp. 221-226; An
Efficient Graph Approach to Matching Chemical Structures, O. Owolabi.
 
J.  Chem.  Inf. Comput. Sci. (1990), vol. 30, No. 4; pp. 332-339; Reactions
in the Beilstein Information System: Nonaporic Organic Synthesis; Martin G.
Hicks.
 
Analytica  Chimica Acta, 235 (1990), pp. 87-92; Substructure Search Systems
for Large Chemical Data Bases; Martin G. Hicks et al.
 
J.  Chem.  Inf.  Comput.  Sci.  (1991),  vol.  31,  No. 2; pp. 320-326; The
Beilstein Structure Registry System, 1, General Design; Laszlo Domokos.
 
J. Chem. Inf. Comput. Sci. (1989), vol. 29, No. 4; pp. 255-260; 3DSearch; A
System for Three-Dimensional Substructure Searching; Robert P. Sheridan, et
al.
 
Substructure  Searches  of  Chemical  Structure  Files;  (Jan.  23,  1973);
Strategic   Considerations   in  the  Design  of  a  Screening  System  for
Substructure  Searches  of  Chemical Structure Files; George W. Adamson, et
al.
 
Chemical  Structure  Searching;  (Jan.  21,  1975); An Efficient Design for
Chemical Structure Searching, I, The Screens; Alfred Feldman et al.
 
J. Chem. Inf. Comput. Sci. (1982), vol. 22, No. 4; The Third BASIC Fragment
Search Dictionary; W. Graf, H. K. Kaindl, et al.
 
J.  Chem.  Inf.  Comput. Sci. (1983), vol. 23, No. 3; The CAS ONLINE Search
System,  1,  General  System  Design  and Selection, Generation, and Use of
Search Screens; P. G. Dittmar, et al.
 
Computer  Chemical,  (1991),  vol.  15,  No. 2; pp. 103-107; A Central Atom
Based  Algorithm  and Computer Program for Substructure Search; Alf Dengler
and Ivar Ugi.
 
J.  Chem.  Inf. Comput. Sci. (1993), vol. 33, No. 4; pp. 545-547; Structure
Searching  in  Chemical  Databases  by  Direct  Lookup  Methods; Bradley D.
Christie et al.
 
J.   Chem.  Inf.  Comput.  Sci.  (1993);  vol.  33,  No.  4;  pp.  539-541;
Substructure  Searching  on  Very  Large  Files  by  Using Multiple Storage
Techniques; Alexander Bartmann et al.
 
 
PRIMARY EXAMINER: Black, Thomas G.
ASST. EXAMINER:   Von Buhr, Maria N.
ATTORNEY, AGENT, OR FIRM: Dickstein Shapiro Morin & Oshinsky LLP
CLAIMS:           12
EXEMPLARY CLAIM:  1
DRAWING PAGES:    7
DRAWING FIGURES:  12
ART UNIT:         237
FULL TEXT:        791 lines
?
 
 
 TYPE 2/2,EM,SU/ALL


  2/2,EM,SU/1     (Item 1 from file: 653) 
DIALOG(R)File 653:US Patents Fulltext
(c) format only 2001 The Dialog Corp. All rts. reserv.
 
             01566805
Utility
STORAGE AND RETRIEVAL OF GENERIC CHEMICAL STRUCTURE REPRESENTATIONS
 
PATENT NO.:  4,642,762
ISSUED:      February 10, 1987 (19870210)
INVENTOR(s): Fisanick, William, Columbus, OH (Ohio), US (United States of
             America)
ASSIGNEE(s): American Chemical Society, (A  U.S. Company or Corporation ),
             Washington, DC (District of Columbia), US (United States of
             America)
EXTRA INFO:  Expired, effective February 10, 1999 (19990210), recorded in
             O.G. of April 20, 1999 (19990420)
             Reinstated, effective June 28, 1999 (19990628), recorded in
             O.G. of July 27, 1999 (19990727)
APPL. NO.:   6-614,219
FILED:       May 25, 1984 (19840525)
U.S. CLASS:  707-3 cross ref: 707-104
INTL CLASS:  [4] G06F 15-40
FIELD OF SEARCH: 364-200MSFILE; 364-900MSFILE
                             References Cited
 
                          U.S. PATENT DOCUMENTS
 
    4,473,890    9/1984   Araki                                  364-900
 
PRIMARY EXAMINER: Zache, Raulfe B.
ATTORNEY, AGENT, OR FIRM: Pollick, Philip J.
CLAIMS:           31
EXEMPLARY CLAIM:  1
DRAWING PAGES:    22
DRAWING FIGURES:  22
ART UNIT:         232
FULL TEXT:        1514 lines
 
 
                                  FIELD
 
  This  invention  relates  to  a method for storing and retrieving generic
chemical  structure  representations (Markush formulations) and information
associated   with  them.  It  is  directed  especially  to  development  of
specific(real)-atom   and  generic-group  representations  of  the  Markush
formulation  that are used in atom-by-atom and group-by-group comparison of
query  and  file  representations  and  the  use of screening techniques to
eliminate  a  high  percentage  of irrelevant file representations prior to
group-by-group   and   atom-by-atom   comparison   of   generic-group   and
specific-atom representations.
 
                               BACKGROUND
 
  The  ability  to  effectively  retrieve  information  on generic chemical
structures,  i.e.,  so-called  Markush  structures,  has  been a problem of
varying  magnitude  and  complexity  since  the inception of the use of the
Markush  claim  by  the  Patent  Office  in  the  1920's.  Many  manual and
mechanized  information  retrieval  systems have been developed to meet the
challenge  of  this problem but the known techniques for such retrieval are
imprecise  and  often  place  a  premium  on  the knowledge, intuition, and
cognitive skills of the searcher.
 
  The  basic  system for dealing with Markush structures is a manual system
in   which  individual  documents  containing  the  Markush  structure  are
classified   according  to  a  highly  refined  classification  system  and
physically  grouped  according  to  the classification scheme into a search
file. In making a search, the searcher proceeds by classifying the document
(query)  in  hand  and  then  goes to the appropriately classified physical
group of documents in the search file and manually searches those documents
for relevant retrievals. Such a system places a high premium on the correct
initial  classification of search file documents, correct classification of
the  query,  physical search-file integrity, and highly-developed cognitive
skills  of  the  searcher.  Moreover,  because  the  Markush  may represent
thousands  or  even  millions  of  compounds,  it  often  is  impossible to
promulgate   copies   of   the   document  into  all  of  the  search  file
classifications represented by the Markush formulation. A weakness in any
of  the  aforementioned  areas  is  likely to produce unsatisfactory search
results.  (U.S.  Department  of  Commerce,  "Development  and Use of Patent
Classification Systems", U.S. Government Printing Office, Washington, D.C.,
1966.)
 
  Another  technique  used  in  both  manual and mechanized systems for the
handling   of   Markush   structures  involves  the  use  of  a  system  of
fragmentation  codes  that  are  in  effect  generic  or  real-atom "group"
representations  of  portions  of  a  particular  Markush  formulation. For
example,  that portion of the formulation containing chains of carbon atoms
might  be  generically  encoded  as  alkyl,  or  OH  group as an alcohol or
hydroxide,  and  F,  Cl,  Br,  and I as a halide. Real-atom groups, such as
methyl  for  CH sub 3 --, ethyl for CH sub 3 CH sub 2 --, and phenyl for C
sub  6 H sub 5 --, are also typically used. (Balent, M. Z.; Emberger, J. M.
"A  Unique Chemical Fragmentation System for Indexing Patent Literature" J.
Chem.  Inf.  Comput.  Sci.  1975,  15,  100-104.  Kaback,  S.  M. "Chemical
Structure Searching in Derwent's World Patents Index" J. Chem. Inf. Comput.
Sci.  1980,  20, 1-6. Rossler, S.; Kolb, A. "The GREMAS System, an Integral
Part  of the IDC System for Chemical Documentation" J. Chem. Doc. 1970, 10,
128-134. Rowlett, R. J. "Gleaning Patents with Chemical Abstracts" Chemtec.
1979,  June,  348-349.  Silk,  J.  A.  "Present  and  Future  Prospects for
Structural  Searching  of the Journal and Patent Literature." J. Chem. Inf.
Comput.  Sci.  1979,  19,  195-198.) However, the inter-relationships among
these  groups  in  a  Markush  formulation  are typically not encoded. As a
result,  such  systems tend to have good recall, i.e., most of the relevant
search file answers are retrieved but, because the inter-relationships among
the groups cannot be specified and because of the reliance on generic terminology,
such  systems  have  a pronounced tendency to lack precision, i.e., many of
the  answers  retrieved  are  irrelevant  to  the query. Precision has been
improved  by  incorporation  of  a  higher  degree  of specificity into the
fragmentation codes, but only at a price paid in terms of higher complexity
and  difficulty  in  file  encoding  and  search  profile formulation and a
resulting higher potential for error.
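 
  The recall and precision terms used above can be made concrete with a
short sketch. The function name and the document identifiers below are
illustrative assumptions and are not taken from any of the cited systems.

```python
def recall_and_precision(retrieved, relevant):
    """Return (recall, precision) for a retrieved answer set.

    recall    = fraction of the relevant documents that were retrieved
    precision = fraction of the retrieved documents that are relevant
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    recall = len(hits) / len(relevant) if relevant else 1.0
    precision = len(hits) / len(retrieved) if retrieved else 1.0
    return recall, precision


# Hypothetical fragment-code search: every relevant document is found,
# but many irrelevant documents are returned as well.
relevant_docs = {"D1", "D2", "D3"}
retrieved_docs = {"D1", "D2", "D3", "D4", "D5", "D6", "D7", "D8", "D9"}
recall, precision = recall_and_precision(retrieved_docs, relevant_docs)
```

  In this invented case recall is complete while only a third of the
answers are relevant, which is the good-recall, poor-precision behavior
attributed to fragmentation-code systems.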
 
  Mechanized  specific  atom-by-atom  structure  matching of query and file
structural  representations  is  a well-known commercial technique that has
been  available  since  the  1960s  and  has  demonstrated  high recall and
precision  as  a search and retrieval technique. (Wigington, R. L. "Machine
Methods for Accessing Chemical Abstracts Service Information" in Proceedings
of  the  IBM  Symposium  on  Computers  and Chemistry; IBM Data Processing
Division:  White  Plains, NY, 1969. Eakin, D. R. "The ICI CROSSBOW System,"
in  Ash,  J.  E.;  Hyde, E., Eds. Chemical Information Systems, Chichester,
Horwood,  1975.  Dubois,  J.  E.  "DARC  System  in Chemistry", in Computer
Representation  and  Manipulation  of  Chemical  Information, Wipke, W. T.;
Heller,  S.; Feldman, R.; Hyde, E., Eds., Wiley, New York, 1974. Schenk, H.
R.;  Wegmuller,  F. "Substructure Search by Means of the Chemical Abstracts
Service  Chemical  Registry II System" J. Chem. Inf. Comput. Sci. 1976, 16,
153-161.   Feldman,   R.  J.  "Interactive  Graphic  Chemical  Substructure
Searching"   in   Computer  Representation  and  Manipulation  of  Chemical
Information,  Wipke, W. T.; Heller, S.; Feldman, R.; Hyde, E., Eds., Wiley,
New  York,  1974.)  Because atom-by-atom structure matching is a relatively
slow  process, screening techniques have been developed to eliminate a high
percentage of irrelevant file representations. Typically screening involves
capturing key features of the file representations such as atom environment
and  atom  sequences  and  then  matching similar key features of the query
representation  to give a set of answers that are then used in atom-by-atom
structure  matching.  (Dittmar, P. G.; Farmer, N. A.; Fisanick, W.; Haines,
R.  C.;  Mockus, J. "The CAS ONLINE Search System. 1. General System Design
and Selection, Generation, and Use of Search Screens" J. Chem. Inf. Comput.
Sci.  1983,  23, 93-102. Attias, R. "DARC Substructure Search System: A New
Approach  to  Chemical  Information"  J.  Chem. Inf. Comput. Sci. 1983, 23,
102-108.)  Unfortunately,  structure matching techniques tend to be limited
to  files  containing  representations  of  unique individual compounds and
queries  have been limited to specific structural representations that must
exactly   match   the   structural  representation  of  the  file  compound
(full-structure  search)  or  be  embedded within it (substructure search).
Structure  matching  techniques  have  been applied to Markush formulations
which  represent  a  relatively  small  number  of specific compounds using
queries  that  contain  only real atoms. (Meyer, E. "Topological Search for
Classes   of   Compounds  in  Large  Files--even  of  Markush  Formulas--at
Reasonable  Machine  Cost"  in  Computer Representation and Manipulation of
Chemical  Information,  Wipke,  W.  T.;  Heller, S.; Feldman, R.; Hyde, E.,
Eds.,  Wiley,  New  York,  1974.) However, in attempting to apply structure
matching  techniques  to  query  and file structures represented by Markush
formulations  of  the  type  often  found  in  broad  patent claims, one is
immediately  faced  with  the problem that a single Markush formulation may
literally represent millions of specific compounds. When one considers that
the  file  size of the current large commercial structural matching systems
is  a little less than seven million specific compounds, an appreciation is
gained  for the difficulty in using structure matching techniques to search
Markush structures effectively. Although proposals have been made to apply
structure  matching  techniques  to  broad  Markush formulations, no viable
system  for searching such Markush formulations that gives a high degree of
recall  and precision has yet been achieved. (Lynch, M. F.; Barnard, J. M.;
Welford,  S.  M.  "Computer  Storage  and  Retrieval  of  Generic  Chemical
Structures  in Patents. 1. Introduction and General Strategy" J. Chem. Inf.
Comput.  Sci.  1981, 21, 148-150. Barnard, J. M.; Lynch, M. F.; Welford, S.
M.  "Computer  Storage  and  Retrieval  of  Generic  Chemical Structures in
Patents.  2.  GENSAL,  a  Formal  Language  for  the Description of Generic
Chemical Structures" J. Chem. Inf. Comput. Sci. 1981, 21, 151-161. Welford,
S.  M.;  Lynch,  M.  F.;  Barnard, J. M. "Computer Storage and Retrieval of
Generic Chemical Structures in Patents. 3. Chemical Grammars and their Role
in  the  Manipulation  of  Chemical  Structures" J. Chem. Inf. Comput. Sci.
1981,  21,  161-168. Barnard, J. M.; Lynch, M. F.; Welford, S. M. "Computer
Storage  and  Retrieval  of  Generic  Chemical Structures in Patents. 4. An
Extended Connection Table Representation (ECTR) for Generic Structures." J.
Chem.  Inf.  Comput.  Sci.  1982,  22,  160-164. Nakayama, T.; Fujiwara, Y.
"Computer  Representation  of  Generic  Chemical  Structures by an Extended
Block-Cutpoint Tree" J. Chem. Inf. Comput. Sci. 1983, 23, 80-87. Kudo, Y.;
Chihara  H.  "Chemical  Substance  Retrieval  System  for Searching Generic
Representations.  1.  A  Prototype System for the Gazetted List of Existing
Chemical  Substances  of  Japan"  J.  Chem.  Inf.  Comput.  Sci.  1983, 23,
109-117.)
 
                                 SUMMARY
 
  A typical Markush storage and retrieval process according to the present
invention   comprises   the   steps   of   forming  a  file  of  structural
representations  of  Markush formulations in which each Markush formulation
is represented by a single specific atom multiple connectivity node (SpMCN)
representation  in which the formal valence requirements of requisite atoms
are  relaxed  to  allow for the attachment of all atoms and groups of atoms
depicted   in   the   Markush   formulation  and,  as  a  result,  gives  a
representation  containing  all  implicit specific atom structures found in
the  Markush  formulation.  The  SpMCN  is  then converted to an associated
generic group multiple connectivity node (GnMCN) representation through the
use  of a generic-group hierarchy. A query Markush formulation is similarly
converted   to   SpMCN   and   GnMCN   representations.   The  query  GnMCN
representation  then  is  compared on a group-by-group basis with each file
GnMCN  in  such fashion so that a match is found when at least one implicit
generic  structure  representation  (IGSR)  of the query GnMCN is identical
with  (overlaps)  or is contained in (embedded in) at least one IGSR of the
file  GnMCN.  The  query  SpMCN  representation  then  is  compared  on  an
atom-by-atom  basis with the file SpMCN representations associated with the
file  GnMCN  representations  (answers)  obtained  in  the  previous  query
GnMCN/file  GnMCN  matching  step  in such fashion so that a match is found
when at least one implicit specific atom structure representation (ISSR) of
the  query  SpMCN structure is identical with (overlaps) or is contained in
(embedded  in)  at  least one ISSR of the file SpMCN. An indexing system is
used to identify IGSRs and ISSRs for the matching process and to manipulate
large or complex GnMCN and SpMCN representations.
 
  As  a  further  refinement,  generic  features  of  the  original Markush
formulation are captured by using the generic-group hierarchy as a means of
representing  generic features of the Markush formulation in both the SpMCN
and  GnMCN. To ensure high recall, a roll-back feature is used to allow for
the  exchange  of  generic-group and specific-atom representations in SpMCN
matching so that all real atom file or query structural features implied in
the  generic structural features of the file or query SpMCN are matched. In
addition,  specific features of the SpMCN and specifically identified parts
of  generic  features of the original Markush formulation, such as specific
atoms,  type  of  bonding, ring size, etc. are associated with each generic
group  of  the file GnMCN as group attributes and are matched against group
attributes  of  the  generic  groups  of  the  query  GnMCN  prior to SpMCN
matching.
 
  As  a  further refinement, screening techniques are applied to both SpMCN
and  GnMCN  representations  in  order  to  eliminate  a  large  number  of
irrelevant  file  representations prior to the more exacting group-by-group
and atom-by-atom comparisons. In order to achieve a high level of recall, a
Boolean  strategy  is  used  in  the  query screen logic expression whereby
special,  "diagnostic"  generic-group  screens are used as alternatives for
sets  of  specific-atom  screens  in  order  to  retrieve  file  answers in
situations  where  real-atom  structures  of  the SpMCN query structure are
implied  in  the  generic  portions  of the file SpMCNs that originate from
generic  features  of  the original Markush formulation and for which there
are no real-atom counterparts.
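 
  The Boolean screen logic described above might be sketched as follows.
This is a minimal illustration under stated assumptions: the screen names,
the three file records, and the function passes_screens are hypothetical,
and the specification does not prescribe this encoding.

```python
def passes_screens(file_screens, query_expr):
    """Evaluate a query screen expression against one file record.

    query_expr is an AND of clauses. Each clause is an OR of alternatives,
    and each alternative is a set of screens that must all be present in
    the file record's screen set.
    """
    return all(
        any(alternative <= file_screens for alternative in clause)
        for clause in query_expr
    )


# One clause: either the specific-atom screens for a chlorine substituent,
# or the "diagnostic" generic-group screen that may imply it.
query_expr = [
    [{"atom:Cl", "bond:C-Cl"}, {"generic:halide"}],
]

# Hypothetical file records and the screens generated for each.
file_records = {
    "A": {"atom:Cl", "bond:C-Cl", "ring:6"},   # chlorine named explicitly
    "B": {"generic:halide", "ring:6"},         # only "halide" stated
    "C": {"atom:O", "ring:6"},                 # neither
}

survivors = [name for name, screens in file_records.items()
             if passes_screens(screens, query_expr)]
```

  Record B survives screening even though it carries no specific chlorine
screen, which is the recall-preserving effect the generic alternative is
meant to provide; record C is eliminated before any detailed comparison.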
 
                          DETAILED DESCRIPTION
 
  A simple Markush formulation is set forth in structure Ia of FIG. 1. This
formulation  consists of a fixed structure portion to which is attached the
variable  groups  R  sub 1 and R sub 2. As indicated in the text portion of
the  formulation,  R sub 1 may be chlorine (Cl) or bromine (Br) and R sub 2
may  be  ethyl  (CH  sub  3 CH sub 2) or methyl (CH sub 3). Implicit in the
Markush  formulation  is  the  representation  of  four distinct individual
compound representations, Ia1-Ia4, that are, in effect, all of the possible
individual  structures  resulting from the combinations of fragments in the
variable groups denoted by R sub 1 and R sub 2.
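 
  The four implicit compounds follow from taking every combination of the
variable-group members. A minimal sketch, assuming the fragments are
written as plain strings (the variable and function names are
illustrative only):

```python
from itertools import product

# Variable groups as given in the text part of formulation Ia.
R1 = ["Cl", "Br"]
R2 = ["CH3CH2", "CH3"]


def enumerate_compounds(r1_options, r2_options):
    """List every R1/R2 combination implied by the Markush formulation."""
    return [{"R1": r1, "R2": r2} for r1, r2 in product(r1_options, r2_options)]


compounds = enumerate_compounds(R1, R2)
for compound in compounds:
    print(compound)
```

  The count is the product of the group sizes (2 x 2 = 4 here), which is
why formulations with many variable groups can represent thousands or
millions of compounds.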
 
  In  representation  Ia,  it  is  noted  that  carbon  (C) typically has a
connectivity (valence) of four, i.e., is capable of attaching or connecting
itself  to  four  other entities or to fewer than four other entities via a
multiple  bond  to  one  or  more of the entities. Specifically in the ring
system  of  representation Ia, each carbon is bound to a second carbon by a
multiple  (double)  bond, to a third carbon atom by a single bond, and to a
hydrogen atom (H) or to a variable group node (R sub 1,R sub 2) by a single
bond to give the usual carbon valence of four. As is shown in structures
Ia1-Ia4, it is common practice in the chemical arts to omit the hydrogen
atoms and to designate the alternating single and double bonds between
carbon atoms in the ring as a circle; the latter convention is felt to
represent more realistically a delocalized bonding situation in which there
are more nearly one and a half bonds between adjacent carbon atoms. Except
where  noted,  these  common  conventions  will  be followed throughout the
remainder of the specification and drawings.
 
  Structure  Ib  of FIG. 1 is a multiple connectivity node (MCN) structure.
In  it,  all of the fragments belonging to the variable groups described in
the  text  part  of  the  Markush  formulation  have been attached to their
respective  nodes  or points of variability (as shown in the structure part
of  the  Markush  formulation  Ia)  giving rise to nodes of abnormally high
connectivity  and  hence  the multiple connectivity node (MCN) designation.
Since  Ib  represents  all  of the specific atoms identified in the Markush
formulation, it is designated as a specific-atom multiple connectivity node
(SpMCN)   representation.  It  should  be  noted  that  the  four  distinct
individual  compound  representations,  Ia1-Ia4,  are  also implicit in the
SpMCN.  These  individual  implicit  representations  are  referred  to  as
implicit specific-atom structural representations (ISSRs).
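 
  One way to picture an SpMCN as a data structure is sketched below. It is
an assumption-laden illustration, not the representation used by the
invention: the class name, its fields, and the string labels are
hypothetical, and real structures would carry atoms and bonds rather than
text labels.

```python
from dataclasses import dataclass
from itertools import product
from math import prod
from typing import Dict, Iterator, List


@dataclass
class SpMCN:
    fixed_part: str               # label for the fixed structure portion
    nodes: Dict[str, List[str]]   # node name -> every fragment attached to it

    def issr_count(self) -> int:
        """Number of implicit structures, computed without listing them."""
        return prod(len(fragments) for fragments in self.nodes.values())

    def iter_issrs(self) -> Iterator[Dict[str, str]]:
        """Lazily yield one implicit structure (one fragment per node)."""
        names = list(self.nodes)
        for choice in product(*(self.nodes[name] for name in names)):
            yield dict(zip(names, choice))


# Structure Ib: both members of each variable group hang off the same node.
ib = SpMCN("carbon ring", {"R1": ["Cl", "Br"], "R2": ["CH3CH2", "CH3"]})
```

  Only the single SpMCN is held; the individual structures are produced on
demand by iter_issrs, so nothing has to be stored per implicit compound.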
 
  By using common generic terminology, it is possible to simplify further
the  specific  multiple  connectivity  node structure (SpMCN). For example,
carbon  ring  structures containing only carbon atoms in the ring are often
given the generic description of carbocycles; linear chains of carbon atoms
are  generically  termed  alkyls; and chlorine and bromine are often called
halides.  Using this basic generic terminology, it is possible to transform
the  SpMCN  representation shown in Ib to the generic multiple connectivity
node  representation  (GnMCN)  shown  in Ic. In transforming the SpMCN to a
GnMCN,  the  bonding level between the generic groups is preserved. In this
particular  example,  only  a single bond exists between the carbocycle and
the  variable  groups.  If,  however,  a  multiple  bond exists between the
generic  representations,  such bonding is indicated in the GnMCN. Implicit
within  the  GnMCN representation are four implicit generic group structure
representations  (IGSRs),  Ic1-Ic4,  corresponding  to  the  four  distinct
compound  representations,  ISSRs, implicit in the original Markush and the
SpMCN.  It  is  critical  to  note  that  IGSRs  and  ISSRs  are  used  for
illustrative  purposes only. This invention does not anticipate the storage
of  all  ISSRs  and  IGSRs  associated with the respective SpMCN and GnMCN.
Rather the invention is directed at the individual ISSRs and IGSRs as they
are   implicitly   contained   within  the  SpMCN  and  GnMCN.  The  actual
representation  and  processing  uses  only  the  explicit  SpMCN and GnMCN
representations.  The  ISSRs and IGSRs are used only as they are implicitly
found within the SpMCN and GnMCN representations.
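 
  The SpMCN-to-GnMCN conversion described above can be sketched as a table
lookup. The mapping below contains only the generic terms named in the
text; the dictionary layout, the function name, and the choice to keep one
generic entry per specific fragment (so that four IGSRs still correspond to
the four ISSRs) are assumptions made for illustration. Bond levels between
groups are not modeled.

```python
# Generic-group lookup built from the terminology in the text above.
GENERIC_GROUP = {
    "carbon ring": "carbocycle",
    "CH3": "alkyl",
    "CH3CH2": "alkyl",
    "Cl": "halide",
    "Br": "halide",
}


def to_gnmcn(fixed_part, nodes):
    """Replace every specific fragment by its generic-group name."""
    generic_fixed = GENERIC_GROUP[fixed_part]
    generic_nodes = {
        name: [GENERIC_GROUP[fragment] for fragment in fragments]
        for name, fragments in nodes.items()
    }
    return generic_fixed, generic_nodes


# SpMCN Ib, written as a fixed part plus fragments per variable node.
ic = to_gnmcn("carbon ring", {"R1": ["Cl", "Br"], "R2": ["CH3CH2", "CH3"]})
```

  A fuller generic-group hierarchy would have several levels rather than
the flat table assumed here.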
 
  FIGS.  2,  2',  3,  3'  and  4  illustrate the use of the GnMCN and SpMCN
representations  in  file  searching  and  retrieval.  In  FIGS.  2 and 2',
representations  IIa-VIa are illustrative file Markush formulations as they
might  appear in patent documents, IIb-VIb are SpMCN representations of the
corresponding Markush formulation, and IIc-VIc are GnMCN representations of
the corresponding SpMCN representations. The query Markush formulation VIIa
is  also  shown as a SpMCN representation (VIIb) and a GnMCN representation
(VIIc). As shown in FIGS. 3 and 3', a file search is initiated by comparing
each query IGSR (VIIc1 and VIIc2) with each file IGSR (IIc1-IIc5,
IIIc1-IIIc3,   IVc1-IVc3,   Vc1-Vc4,   and  VIc1-VIc4;  identical  implicit
structures  are  shown  only  once). As seen, query IGSR VIIc1 matches with
file  IGSRs IIc1-IIc5 and Vc4; VIIc2 matches with Vc2-Vc3 and VIc1-VIc4. At
this point, representations III and IV have been eliminated from the search
and,  as  shown  in  FIG.  4, matching now proceeds between the query ISSRs
VIIb1-VIIb2  and  the  file  ISSRs IIb1-IIb6, Vb1-Vb4, and VIb1-VIb4; query
ISSR  VIIb1  matches  only with file ISSR IIb1 and query ISSR VIIb2 matches
nothing,  specifically  illustrating that only one ISSR need match one file
ISSR  to  give  an  answer.  To  complete  the search, relevant information
associated with the Markush formulation IIa such as, but not limited to, an
abstract, patent number, or patent document is retrieved for the searcher.
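
  The two-stage strategy just described (generic IGSR matching first,
specific ISSR matching only for the survivors) can be sketched as follows.
This is a minimal illustration, not the claimed implementation: structures
are stood in for by plain strings, string equality stands in for structure
matching, and the file names and fragment labels are hypothetical.

```python
# Toy two-stage Markush search. Strings stand in for structures and
# equality stands in for structure matching (both are assumptions).

def two_stage_search(query, files):
    """Return (files surviving the generic stage, files giving an answer)."""
    survivors, answers = [], []
    for name, record in files.items():
        # Stage 1: at least one query IGSR must match one file IGSR.
        if not (query["igsrs"] & record["igsrs"]):
            continue
        survivors.append(name)
        # Stage 2: at least one query ISSR must match one file ISSR.
        if query["issrs"] & record["issrs"]:
            answers.append(name)
    return survivors, answers


# Hypothetical query and file records.
query = {"igsrs": {"Cb-Ft", "Cb-Ak"}, "issrs": {"phenyl-Cl", "phenyl-CH3"}}
files = {
    "A": {"igsrs": {"Cb-Ft"}, "issrs": {"phenyl-Cl", "phenyl-Br"}},
    "B": {"igsrs": {"Hc-Ft"}, "issrs": {"pyridyl-Cl"}},
    "C": {"igsrs": {"Cb-Ak"}, "issrs": {"phenyl-C2H5"}},
}
survivors, answers = two_stage_search(query, files)
print(survivors, answers)
```

  In this toy file, record B is eliminated at the generic level, record C
survives the generic level but fails at the specific level, and only
record A gives an answer.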
 
  FIG. 5 illustrates the two types of matching criteria that a searcher may
use  in  carrying  out  a  search. Representation VIII is a single compound
representation  in  which  the ISSR is identical with the actual structure.
This structure matches exactly with the file representation X which is also
a single compound representation. The exact matching of all characteristics
of the query representation with those of the file representation is termed
"overlap" or full-structure search. Exact matching may be relaxed such that
the   query   representation   need  only  be  contained  within  the  file
representation.  Thus,  although query representation VIII does not exactly
match  or  "overlap"  the  single  file  representation XI, it is contained
within  representation  XI.  Such  containment  of the query representation
within  the  file  representation  is  termed  "embedment"  or substructure
search.  Systems  for  both  full-structure  and  substructure  search  are
commercially  available,  e.g.,  CAS  ONLINE:  The  Registry File, Chemical
Abstracts Service, Columbus, Ohio.
 
  Atom-by-atom  searching involves the comparison of a query structure with
a file structure using a path-tracing technique. Typically the path-tracing
technique  involves selecting a starting atom (node) of the query structure
(usually a noncarbon atom) and comparing it with the first atom of the file
structure. If the atoms do not match, the file structure is advanced to the
next atom (node) until a match with the starting query node is obtained. If
a match is obtained, the query proceeds to the next connected atom which is
compared  with the next connected atom of the file structure. If these next
atoms  do  not  match,  the  file  structure is backtracked to the original
matching  atom  and  another  connected  atom  is  selected for match. This
advancing/comparing/backtracking routine is continued until all atoms match
or all atom sequences of the query are exhausted. Overlap requires that all
atoms  of  the  query  match  with  all  atoms  of the file structure while
embedment  requires  that all of the atoms of the query be contained within
the  file  structure.  A  description  of atom-by-atom matching is given in
Lynch, M. F.; Harrison, J. M.; Town, W. G.; Ash, J. E. Computer Handling of
Chemical  Information,  MacDonald,  London  and American Elsevier Inc., New
York, 1971 at pp. 73-74, all of which is herein incorporated by reference.
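
  The advancing/comparing/backtracking routine can be sketched as a small
recursive matcher. The sketch below is illustrative only and rests on
several assumptions not taken from the cited reference: hydrogens and bond
orders are ignored, a structure is a pair (elements, bonds), and the
function names are hypothetical.

```python
def make_structure(elements, edges):
    """elements: {atom_id: symbol}; edges: iterable of (atom_id, atom_id)."""
    bonds = {a: set() for a in elements}
    for a, b in edges:
        bonds[a].add(b)
        bonds[b].add(a)
    return elements, bonds


def embeds(query, file):
    """True if the query is contained within the file structure (embedment)."""
    q_elem, q_bonds = query
    f_elem, f_bonds = file
    # Start with noncarbon atoms, which are usually the most selective.
    order = sorted(q_elem, key=lambda a: q_elem[a] == "C")
    mapping = {}  # query atom -> file atom

    def extend():
        if len(mapping) == len(order):
            return True
        q = order[len(mapping)]
        for f in f_elem:
            if f in mapping.values() or f_elem[f] != q_elem[q]:
                continue
            # Bonds to already-matched query atoms must exist in the file.
            matched = [n for n in q_bonds[q] if n in mapping]
            if all(mapping[n] in f_bonds[f] for n in matched):
                mapping[q] = f
                if extend():
                    return True
                del mapping[q]  # backtrack and try another file atom
        return False

    return extend()


def overlaps(query, file):
    """True if query and file match atom for atom and bond for bond."""
    same_atoms = len(query[0]) == len(file[0])
    q_ends = sum(len(n) for n in query[1].values())
    f_ends = sum(len(n) for n in file[1].values())
    return same_atoms and q_ends == f_ends and embeds(query, file)


# Heavy-atom skeletons only (hydrogens ignored).
ethanol = make_structure({1: "C", 2: "C", 3: "O"}, [(1, 2), (2, 3)])
propanol = make_structure({1: "C", 2: "C", 3: "C", 4: "O"},
                          [(1, 2), (2, 3), (3, 4)])
propane = make_structure({1: "C", 2: "C", 3: "C"}, [(1, 2), (2, 3)])
```

  Under these assumptions the C--C--O skeleton overlaps itself, is
embedded in (but does not overlap) the C--C--C--O skeleton, and is not
embedded in the all-carbon chain.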
 
  It  is  an  object  of  this  invention  to  extend  both the overlap and
embedment  matching  concepts to Markush searching. Thus if the query SpMCN
IXa  search is limited to overlap only, the query ISSR IXa1 will match only
with  file  ISSR  XIIa2. If the matching criterion is relaxed to embedment,
ISSR  XIIIa1  is  also  a  valid  match.  It  is not necessary to limit the
searching  of  a  Markush  query  to a Markush file, e.g., the ISSRs of the
SpMCN  representation  also  can  be  compared  with both specific compound
representations such as X and XI and the ISSRs of the SpMCN representations
XIIa  and  XIIIa.  At the overlap level of search, query IXa retrieves file
representations  X  and  XIIa;  at  the  embedment  level  of  search, file
representations  X,  XI,  XIIa,  and  XIIIa  are retrieved. Single specific
compound  queries also can be searched against the Markush file, e.g., VIII
matches  with  XIIa1  (overlap)  and  with XIIIa1 (embedment). Although not
illustrated,  embedment  and  overlap criteria are also used at the generic
level   of  searching.  Thus  an  implicit  generic  query  representation,
alkyl-halide,   overlaps   an   implicit   generic   file   representation,
alkyl-halide,  and  is embedded in an implicit generic file representation,
carbocycle-alkyl-halide. Finally it is noted that the overlap criterion can
be  applied  to  the  entire  SpMCN  representation  itself.  Such  a match
condition  requires  all structural elements of the file SpMCN be identical
to all structural elements of the query SpMCN, i.e., all ISSRs of the file
and   query   SpMCNs   must   be  identical.  Requiring  the  entire  SpMCN
representation IXa to match at the overlap level permits only the retrieval
of  file  SpMCNs that are identical to it, i.e., contain both IXa1 and IXa2
but   only   those  two  implicit  representations.  For  an  entire  SpMCN
representation   to   match   at   the  embedment  level,  the  file  SpMCN
representation must contain all ISSRs of the query representation.
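
  When the overlap and embedment criteria are applied to an entire SpMCN
representation, they reduce to set comparisons over the implicit ISSRs.
The following sketch assumes that each ISSR has been reduced to a
canonical string key, which is an assumption made for illustration; the
keys shown are hypothetical.

```python
def any_issr_matches(query_issrs, file_issrs):
    """Ordinary search: one query ISSR matching one file ISSR is enough."""
    return bool(query_issrs & file_issrs)


def spmcn_overlap(query_issrs, file_issrs):
    """Whole-SpMCN overlap: both sets of ISSRs must be identical."""
    return query_issrs == file_issrs


def spmcn_embedment(query_issrs, file_issrs):
    """Whole-SpMCN embedment: the file must contain every query ISSR."""
    return query_issrs <= file_issrs


# Hypothetical canonical keys for the implicit structures.
query = {"s1", "s2"}
file_same = {"s1", "s2"}
file_larger = {"s1", "s2", "s3"}
file_partial = {"s1", "s4"}
```

  Here file_partial is retrieved by an ordinary ISSR-level search but
satisfies neither whole-SpMCN criterion, while file_larger satisfies
embedment but not overlap.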
 
  In order to convert SpMCN representations to GnMCN representations, it is
highly  desirable  to have a classification scheme that uses a small number
of  controlled-vocabulary  hierarchical terms that permit classification of
all  groups  of atoms likely to be encountered in a specific substance or a
Markush  formulation.  FIG. 6 illustrates such a classification scheme. The
overall  structure  of  the classification scheme consists of breaking each
less-specific  group into two mutually exclusive, more specific groups. The
general  group "G" is used to handle groups of atoms that can not be easily
associated   with   a   more   specific   group  classification,  e.g.,  an
electron-withdrawing  group,  a group containing nitrogen, etc. The G group
is  classified further into two mutually exclusive groups: any cyclic group
(Cy)  or  any acyclic group (Ay). The cyclic group (Cy) is broken down into
any  carbocycle  group  (Cb)  or any heterocycle group (Hc). The carbocycle
group  (Cb)  characterizes any ring system containing only carbon atoms and
any  attached  hydrogen  atoms.  The  Cb group may be attached to any other
group,  including itself, or it may stand alone. The heterocycle group (Hc)
characterizes  any  ring  system  containing one or more hetero (noncarbon)
atoms and any attached hydrogen atoms. Similar to Cb, Hc may be attached to
any  group, including Hc, or it may stand alone. A fused ring system, i.e.,
two or more rings that share two or more atoms with each other, is
considered a single group, while two rings joined to each other by an
acyclic bond are considered two groups. Thus a naphthalene ring system is
designated  as Cb while a biphenyl system would be characterized as Cb--Cb.
A  quinoline  ring  system,  which  consists  of  a  carbocycle  fused to a
heterocycle, is considered as a single heterocycle group, Hc.
 
  Moving  to  the  acyclic side of the hierarchy, the acyclic group (Ay) is
broken  down  into  any  acyclic  carbon  (chain) group (Ch) or any acyclic
noncarbon  (functional)  group  (Fg).  The  acyclic noncarbon group (Fg) is
further broken down into any acyclic noncarbon connecting group (Fc) or any
acyclic   noncarbon  terminal  group  (Ft).  The  terminal  group  (Ft)  is
characterized as a single atom that is neither carbon nor hydrogen but may
be  attached  to  one  or  more  hydrogens.  The  Ft  group may stand alone
(unattached to any other group), e.g. NH sub 3, H sub 2 O, Cu, or it may be
attached  to  one and only one other group where the other group may be any
other  group  including  Ft  except that the Ft group cannot be bound to an
alkyl  group  (Ak)  by a multiple bond since, by definition, an alkyl group
bound  to a Ft group by a multiple bond is a Cg group. Thus C sub 6 H sub 5
--NH  sub  2  transforms  to Cb--Ft while an aldehyde such as CH sub 3 --CH
double  bond  O  transforms to Cg double bond Ft and not Ak double bond Ft.
See infra Cg and Ak. The acyclic noncarbon connecting group (Fc) is defined
as a single atom that is neither carbon nor hydrogen but may be attached to
one  or  more  hydrogens  and  must be attached to two or more other groups
including  itself,  e.g.,  phenyl-O-phenyl  is  expressed as Cb--Fc--Cb. By
definition, Fc may neither stand by itself nor be attached to only one
other group.
 
  The  acyclic  carbon  group  (Ch)  is further broken down into an acyclic
carbon group (Cg) attached to an acyclic noncarbon terminal group (Ft) by a
multiple bond, or any other acyclic carbon group (Ak) not defined as Cg. By
definition,  the  Cg  group  can not stand alone. It must be attached to at
least  one Ft group by a multiple bond and it may also be attached to other
groups, except Ak or Cg. The Ak group consists of a group of acyclic carbon
atoms  and  any  attached  hydrogen  atoms  that  may stand alone or may be
attached  to  any group, except Cg or Ak. When a Cg is attached to an Ak or
another  Cg or when an Ak is attached to a Cg or another Ak, the two groups
merge into the appropriate single group, e.g., Ft double bond Cg--Cg double
bond  Ft  becomes  Ft  double bond Cg double bond Ft, Ft double bond Cg--Ak
becomes Ft double bond Cg, and Ak--Ak becomes Ak. CH sub 3 --CH double bond
O  becomes Cg double bond Ft; CH sub 3 --OH becomes Ak--Ft. The compound CH
sub  3  --CH  sub  3 is not represented Ak--Ak but rather as simply Ak. The
compound  O  double  bond  CH--CH sub 2 CH sub 2 -- CH double bond O is not
represented  as  Ft double bond Cg--Ak--Ak--Cg double bond Ft but rather as
Ft  double  bond  Cg  double  bond Ft. Table I illustrates the hierarchical
scheme  using  classification  notation.  FIG. 7 illustrates the use of the
hierarchy to transform SpMCN representations to GnMCN representations.
 
              TABLE I
 Hierarchical Generic Group Classification
 G   any chemical group
Cy  any cyclic group
Cb  any all carbon cyclic group
Hc  any cyclic group containing at least one noncarbon atom
Ay  any acyclic group
Ch  any carbon acyclic group
Cg  any carbon acyclic group attached by a multiple bond to
    a terminal non-carbon acyclic group (Ft)
Ak  any carbon acyclic group not defined by Cg
Fg  any noncarbon acyclic group
Ft  any noncarbon atom standing alone or attached to only one
    other group
Fc  any noncarbon atom attached to two or more other groups
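
  The merging rule for adjacent acyclic carbon groups (Ak--Ak becomes Ak,
and any Ak or Cg adjacent to a Cg becomes Cg) can be sketched for a simple
linear chain of group symbols. This is a simplified illustration: bond
orders are omitted, branching is not handled, and the function name is
hypothetical.

```python
CARBON_CHAIN_GROUPS = {"Ak", "Cg"}


def merge_chain(groups):
    """Collapse adjacent Ak/Cg symbols in a linear chain of generic groups."""
    merged = []
    for g in groups:
        if merged and merged[-1] in CARBON_CHAIN_GROUPS and g in CARBON_CHAIN_GROUPS:
            # Two adjacent acyclic carbon groups merge into one group;
            # the result is Cg if either partner is Cg, otherwise Ak.
            merged[-1] = "Cg" if "Cg" in (merged[-1], g) else "Ak"
        else:
            merged.append(g)
    return merged


# O=CH--CH2CH2--CH=O written as a chain of generic groups.
dialdehyde = merge_chain(["Ft", "Cg", "Ak", "Ak", "Cg", "Ft"])
ethane = merge_chain(["Ak", "Ak"])
diphenyl_ether = merge_chain(["Cb", "Fc", "Cb"])
print(dialdehyde, ethane, diphenyl_ether)
```

  The dialdehyde chain collapses to Ft, Cg, Ft and ethane collapses to a
single Ak, in agreement with the examples given in the text; groups such
as Cb and Fc are left untouched.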
 
  The acyclic carbon group (Ak) can be further divided into two groups, one
group  in which at least one multiple carbon-to-carbon bond must be present
and  another  group  in  which  multiple  carbon-to-carbon  bonding  is not
allowed.  Alternatively,  the  Ak  group can be divided on the basis of the
number  of  carbon  atoms  in the group. For a generalized file, separation
into  two  groups in which one group contains six or fewer carbon atoms and
another group containing more than six carbon atoms gives a relatively even
number of file compounds to the two groups. Other groups, such as Cb and Hc
can also be divided into more-specific, mutually-exclusive groups. E.g., Cb
could  be  divided  into fused and non-fused groups or into groups based on
the number of rings within the group.
 
  The above hierarchy is designed to meet the requirements of a search file
consisting  of  a broad range of chemical compounds likely to be found in a
broad  definition  of the chemical arts. However, this is not to imply that
this  invention  is  limited to this hierarchy since it is anticipated that
other similar hierarchies may be defined to meet the needs of files limited
to  particular  kinds  of  compounds,  e.g.,  inorganic  compounds,  cyclic
compounds, etc.
  The  classification  hierarchy  also  is  used  to  represent the generic
language  used  in  the  original  Markush  formulation so as to enable the
capture  of  this  information  in the SpMCN and GnMCN representations. For
example,  as  seen  in structure XVIIIa of FIG. 8, Markush formulations may
contain terms that are in themselves generic, e.g., alkyl group,
electron-withdrawing group, and heterocyclyl. Because the hierarchy defines all
possible   chemical   groups,  generic  terminology  used  in  the  Markush
formulation  can be captured and used in the matching process. For example,
in  going  from  structure  XVIIIa to structure XVIIIb in FIG. 8, the alkyl
becomes Ak, the electron-withdrawing group becomes the general group G, and
the heterocyclyl group becomes Hc. The appearance of generic groups from
two  sources  in  the  GnMCN  (structure  XVIIIc),  i.e.,  from the Markush
formulation  itself and from the real atoms in the SpMCN (structure XVIIIb)
requires  additional processing considerations if high levels of recall are
to  be  obtained.  To afford this additional processing, it is necessary to
distinguish  generic  groups generated from real atoms from those generated
from the original Markush formulation in the GnMCN representation. As shown
in  GnMCN  representation XVIIIc, generic groups generated from the Markush
formulation  are  designated  with  a  prime  (')  to distinguish them from
generic groups generated from the real atoms shown in structure XVIIIb.
 
  FIG.  8  also illustrates other aspects of a Markush formulation that are
anticipated  by  this invention. For example, the group R sub 2 is noted to
have  no  specific  point of attachment in the Markush formulation (XVIIIa)
and  in  fact such formulation is intended to express the attachment of any
of  the  groups  designated  by R sub 2 in any of the open positions on the
six-membered ring. To handle such a formulation, each of the R sub 2
groups is propagated to each of the open nodes of the benzene ring in the
SpMCN  representation (XVIIIb). The use of "n" to express an alkyl chain of
variable  length  in the Markush formulation (XVIIIa) is transformed to the
SpMCN  representation (XVIIIb) by separate expression of each unique moiety
found  in  the  "n"  formulation.  Finally  it  is  noted  that  a  Markush
formulation may contain "limiting" logic, e.g., "If R sub 1 =OMe, then R
sub 2 =Cl." Such "limiting" logic is associated and stored with the SpMCN
representation and is verified after query SpMCN and file SpMCN matching is
complete.
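
  The verification of "limiting" logic after structure matching can be
sketched as a simple filter over candidate substituent assignments. The
rule format, the function name, and the candidate values other than OMe
and Cl are assumptions made for illustration only.

```python
def satisfies_limits(assignment, rules):
    """assignment: {variable: fragment}; rules: (if_var, if_val, then_var, then_val)."""
    for if_var, if_val, then_var, then_val in rules:
        # A rule is violated when its condition holds but its consequence does not.
        if assignment.get(if_var) == if_val and assignment.get(then_var) != then_val:
            return False
    return True


# "If R1 = OMe, then R2 = Cl", stored alongside the SpMCN representation.
rules = [("R1", "OMe", "R2", "Cl")]

# Hypothetical structure matches awaiting verification.
candidates = [
    {"R1": "OMe", "R2": "Cl"},
    {"R1": "OMe", "R2": "Br"},
    {"R1": "H", "R2": "Br"},
]
valid = [c for c in candidates if satisfies_limits(c, rules)]
print(valid)
```

  Only the second candidate is rejected, since it is the only one in which
the condition of the rule holds and its consequence does not.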
 
  In  order  to  apply  known structure matching techniques used for single
compound  matching to Markush searching in a straight-forward manner, it is
necessary  to  simplify complex SpMCN and GnMCN structures that result from
the  representation  of  Markush  formulations.  Structure  XIXb1 of FIG. 9
illustrates  two  such complexities, i.e., multiple ring formation when the
point  of  variability  is  within  a  ring  itself  and  the  formation of
"pseudo-rings"  when  the point of variability is within a chain structure.
By moving the point of variability to a node (atom) outside of the ring, it
is possible to eliminate such complex structural features (structures XIXb2
and  XIXc).  Occasionally,  as shown in FIG. 10, the structural features of
the  SpMCN  may not permit the shifting of the point of variability outside
of  the  ring  structure.  This  situation  requires the use of a "null" or
"dummy"  node, or, alternatively, the use of Boolean logic to represent the
various structural features. Representation XXb1 illustrates the use of the
null  node  (Nu)  while  representations  XXb2-XXb4  illustrate  the use of
individual  representations  linked  by  "OR"  Boolean logic. The null node
representation  or  Boolean  logic  representations  are also used with the
GnMCN representations (XXc1 and XXc2-XXc4, respectively).
 
  One  other feature that must be considered in the generation of the SpMCN
and  GnMCN  is  the  concept  of  variable hydrogen (H sub v). As was noted
earlier,  it  is a common practice to disregard the presence of hydrogen in
representing  chemical  structures,  especially  cyclic  structures such as
representations Ia1-Ia4 of FIG. 1, and, in fact, such a convention often is
used  in  commercial  structure-search  systems.  This is done typically to
minimize  storage requirements. The hydrogen atoms attached to carbon atoms
are  represented  implicitly,  i.e.,  the number of hydrogens attached to a
carbon  atom  can  be determined from the number of bonds and the number of
non-hydrogen  attachments.  Although  it  is  common notation to explicitly
represent  hydrogen  in  acyclic  structures,  such specific representation
usually  is  not used in commercial structure search systems. Thus hydrogen
is  typically  not  explicitly  used  in either cyclic or acyclic structure
search  systems.  See  for  example, "The CAS ONLINE Search System; General
System  Design  and Selection, Generation, and Use of Search Screens" by P.
G.  Dittmar,  N.  A.  Farmer,  W. Fisanick, R. C. Haines, and J. Mockus (J.
Chem. Inf. Comput. Sci., 1983, 23, 93-102), all of which is hereby
incorporated by reference. However, in Markush formulations, hydrogen often
is  used  as  one  of  the  fragments in a variable group, and, in order to
achieve  high  recall, it is useful to express the hydrogen atom explicitly
in  such  cases.  For  example, in Markush formulation XXIa of FIG. 11, the
variable  portion  of the formulation contains the groups H, OH, and Cl. If
the  H  is  not  recorded in the SpMCN and GnMCN, the basic ISSR, XXIb1, is
"lost"  to  the  implicit  structure matching process at both the SpMCN and
GnMCN  levels.  To  avoid  such  a  loss,  the  variable  group hydrogen is
explicitly  represented  in  the  SpMCN  as H sub v so that the appropriate
ISSR,  XXIb1,  can  be anticipated in the SpMCN structure matching process.
Likewise,  variable  hydrogen  also  is explicitly represented in the GnMCN
representation so that the appropriate basic IGSR can be anticipated in the
GnMCN structure matching process, e.g., IGSR, XXId1', of FIG. 11.
 
  FIG.   11'  illustrates  the  representation  of  H  sub  v  and  various
combinations  of  generic  groups  that are forbidden by the classification
scheme,  e.g.,  Ak--Cg,  Ak--Ak.  In SpMCN representation XXIIb, H sub v is
explicitly  expressed. On transformation to the GnMCN representation XXIIc,
several  "forbidden"  group  combinations are observed, e.g., Ak--Cg, Ak--H
sub v, and Ak--Ak. These groups are appropriately combined according to the
hierarchical rules to give the IGSRs, Ak(XXIId1'), Cg double bond
Ft(XXIId3'),  and  Ak--Fc--Ak(XXIId4').  Alternatively, the use of variable
hydrogen  in  the  GnMCN  can be avoided by merging it with the CH sub 2 to
give  CH  sub 3 and shifting the node of variability to the first carbon as
shown  in  SpMCN  representation XXIIe. It should be noted that the node of
variability  is  shifted  and  not  an individual substituent, i.e., if one
substituent  is shifted then all substituents must be shifted. The shifting
of  only  one  variable substituent of a set of variable substituents gives
rise  to  extraneous  structures  not  implicit  in  the  original  Markush
formulation.  SpMCN  representation XXIIe is transformed to GnMCN XXIIf and
the forbidden group combinations eliminated in the IGSRs.
 
  Because  of  the  broad nature of the hierarchical classification used in
capturing  the  generic features of the original Markush formulation and in
transforming   the   SpMCN  representation  to  the  GnMCN  representation,
considerable  precision  is  lost  in  GnMCN structure matching if only the
hierarchical  groups  are  used  to  capture  these  generic  features. For
example, in FIG. 12, generic moieties a, b, and c and the specific moiety d
represent varying levels of atom and bond specificity that are lost if such
moieties  are  captured  only  as  the  generic  group Hc. To improve GnMCN
matching precision, attributes of any generic expressions from the original
Markush  formulation and real-atom moieties in the SpMCN are captured in an
attributes  table  and  used  in  an  additional  matching step immediately
following  the  matching  of  a  query  and  file group in query/file GnMCN
structure  match.  Alternatively,  attribute  matching  can be applied as a
separate  step  following  query/file GnMCN structure match for those cases
where  a  structure  match  has  been  found. As shown in FIG. 12, suitable
attributes for the Hc group are: (1) the identity of the hetero atom(s) and
their  number, (2) the size of the ring, and (3) the type of bonding in the
ring.  As  shown in FIG. 12, moiety (a), heterocycle, conveys no additional
attributes  not  found  in  the Hc group itself; moiety (b), N-heterocycle,
conveys  the  fact that one or more of the heteroatoms (noncarbon atoms) in
the  heterocycle  are nitrogen atoms; moiety (c), a 6-membered heterocycle,
conveys the fact that there are six atoms in the cyclic structure, and (d),
a  specific  atom  structure,  conveys  the  identity  and  exact number of
heteroatoms  (one nitrogen atom), the size of the ring (six atoms), and the
number  and  type  of  bonds  found in the cyclic structure (six normalized
bonds).  Attributes  are  derived  from  generic  features described in the
original  Markush  (moieties  (a)-(c))  as  well  as from the specific atom
structure  found  in  the SpMCN representation (moiety (d)). The use of the
attribute  table  in  the  query/file  comparison process is illustrated in
FIGS.  13  and 13' where the SpMCN, GnMCN, and Hc Attribute Table are given
for  file representations XXIII-XXV and query representation XXVI. FIGS. 13
and 13' illustrate a set of file compounds in which the IGSR XXVIb has been
found  to  match  at  least  one  IGSR  of each of the file representations
(XXIIIb-XXVb).  When  a  match between the Hc group in the query and the Hc
group  in  the  file  representation is obtained during IGSR comparison, or
alternatively  in  the step after GnMCN comparison, the query Hc attributes
of  the appropriate IGSR are compared with Hc attributes of the appropriate
file  IGSR.  As is evident, the query Hc attributes XXVIc are included only
within  the  file  Hc attributes XXVc. Query attributes of the other groups
making  up  the  query  IGSR  also  are  compared  with  attributes  of the
corresponding  groups of the file IGSRs. If all attributes of all groups of
at least one query IGSR are included within all of the corresponding groups
of at least one IGSR of a file GnMCN, the ISSRs of the associated query and
file  SpMCN  representations  are  compared  in the next step. In comparing
attributes,  only those attributes are compared for which a value exists in
both the query and file list. If a particular attribute in either the query
or  file  attribute  list  contains  no  information, no attempt is made to
compare  that  particular  attribute. Thus in comparing the query attribute
list   XXVIc   with  the  file  attribute  list  XXIVc,  only  attribute  1
(heteroatoms)  was compared; no attempt was made to compare attributes 2 or
3. Depending on the precision desired, additional detail can be used in
describing a particular attribute. For example, if the attribute is derived
from  a real-atom configuration it is typically a closed set, i.e., limited
to  a specific value, while an attribute derived from a generic term may be
an open set where additional or alternate values are possible. If attribute
1  of  file  representation XXIVc is considered as an alternate expression,
i.e.,  one  or  more  nitrogen  or one or more oxygen atoms, then the query
attributes  XXVIc  would also be included within its attributes. Such level
of  detail  in  attribute  description  is  typically  handled by using the
appropriate Boolean logic.
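
  The attribute comparison rule (compare an attribute only when both the
query and the file carry a value for it) can be sketched as follows. The
sketch assumes that each attribute is stored as a set of permitted
alternative values, with None meaning that no information is available,
and that "included within" means the query values form a subset of the
file values. The attribute names and values are hypothetical and are not
taken from the figures.

```python
def attributes_match(query_attrs, file_attrs):
    """True if every comparable query attribute is included in the file's."""
    for name, q_values in query_attrs.items():
        f_values = file_attrs.get(name)
        if q_values is None or f_values is None:
            continue  # no information on one side: attribute is not compared
        if not q_values <= f_values:
            return False
    return True


# Hypothetical Hc attribute lists.
query = {"heteroatoms": {"N"}, "ring_size": {6}, "bonding": None}
file_closed = {"heteroatoms": {"O"}, "ring_size": None, "bonding": None}
file_open = {"heteroatoms": {"N", "O"}, "ring_size": None, "bonding": None}
file_wrong_size = {"heteroatoms": {"N"}, "ring_size": {5}, "bonding": None}
```

  With these values the query fails against file_closed on the heteroatom
attribute, succeeds against file_open because nitrogen is one of the
permitted alternatives and ring size is not compared, and fails against
file_wrong_size on ring size.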
 
  FIGS. 2-4 illustrate the general search strategy of this invention, i.e.,
matching  of  IGSRs followed by matching of ISSRs. The IGSRs of FIGS. 3 and
3' and the ISSRs of FIG. 4 were developed on the basis of an intuitive logic,
i.e.,   following   from  the  Markush  formulation  itself.  Although  the
development  of the ISSRs of FIG. 4 is aided significantly by the fact that
each ISSR must follow the usual connectivity rules of the chemical arts, at
the  generic level, such connectivity rules are no longer available for the
development  of  IGSRs.  One  approach to the problem is to break the query
GnMCN  into  its simplest structural units and compare these units with the
file  GnMCN  representations.  Thus  in  FIG. 3', query representation VIIc
breaks  down  into carbocycle-halide and carbocycle-alkyl. Comparing either
of  these  groups  with  the  file GnMCN representations results in all but
structure  IVc  giving a match. At this point, two things are noted: (1) it
is  not necessary to generate the implicit file GnMCN structures--the GnMCN
structure itself can be used to see if the fragment is contained within the
structure,  and (2) by using a less specific implicit query representation,
an   additional   answer,   irrelevant  to  be  sure,  is  obtained,  i.e.,
representation  IIIc.  Representation  IIIc  does, of course, drop from the
search at the SpMCN level of matching.
 
  In  order  to  obtain  the highest level of specificity for the ISSRs and
IGSRs  implicit in the respective SpMCN and GnMCN structures and to obviate
the  need  to expand each file and query SpMCN and GnMCN structure into its
respective  ISSRs  and  IGSRs,  it  is highly desirable to have an indexing
system  that  allows  for  the recognition of all ISSR and IGSRs within the
respective  query  and  file  SpMCN  and GnMCN structures. Such an indexing
system  is illustrated in FIG. 14. Each atom and group is assigned a "flag"
(f  sub  n) that indicates the level of variability (zero level in the case
of atoms or groups in the base structure) and, if appropriate, the k value
of  the  substituent,  i.e., r sub k. As seen in Table II, such an indexing
system  makes  it possible to easily track all possible combinations of the
substituent groups.
 
  One possible isolation scheme is to first select a fragment in a variable
group  of  the  first  level of variability, for example, Br (r(1)f(1)) and
then  "prune"  or  discard other alternative fragments in the same variable
group  and take all but one fragment in any other variable group as denoted
by  the appropriate flags. Such a procedure leads directly to a single ISSR
or  IGSR structure. By keeping track of which fragments have been used, all
ISSRs  or  IGSRs  of  the  respective  SpMCN or GnMCN representation can be
generated.
 
              TABLE II
 f(0)            r(1)f(1)  r(2)f(1)   f(3)f(2)
 ISSRs
 Base structure  Br        N(CH3)2    --
 Base structure  Br        NHCH2CH2   OCH3
 Base structure  Br        NHCH2CH2   SCH3
 Base structure  Cl        N(CH3)2    --
 Base structure  Cl        NHCH2CH2   OCH3
 Base structure  Cl        NHCH2CH2   SCH3
 Base structure  OCH3      N(CH3)2    --
 Base structure  OCH3      NHCH2CH2   OCH3
 Base structure  OCH3      NHCH2CH2   SCH3
 IGSRs
 AkHc            Ft        FcAk2      --
 AkHc            Ft        FcAk       FcAk
 AkHc            Ft        FcAk       FcAk
 AkHc            Ft        FcAk2      --
 AkHc            Ft        FcAk       FcAk
 AkHc            Ft        FcAk       FcAk
 AkHc            FcAk      FcAk2      --
 AkHc            FcAk      FcAk       FcAk
 AkHc            FcAk      FcAk       FcAk
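
  The ISSR rows of Table II can be reproduced by a short recursive
enumeration in which one alternative is selected from each variable group
and any variable group nested inside the selected alternative is expanded
in turn. The nested-list data layout and the function name are assumptions
made for illustration; the flags of FIG. 14 are represented here only
implicitly by the nesting depth.

```python
from itertools import product


def expand_group(group):
    """group: list of alternatives, each (label, [nested variable groups])."""
    results = []
    for label, nested in group:
        # Take one expansion from every nested group, in all combinations.
        for combo in product(*(expand_group(g) for g in nested)):
            results.append((label,) + tuple(x for part in combo for x in part))
    return results


# Second level of variability: substituent on the NHCH2CH2 fragment.
r3 = [("OCH3", []), ("SCH3", [])]
# First level of variability: two variable groups on the base structure.
r1 = [("Br", []), ("Cl", []), ("OCH3", [])]
r2 = [("N(CH3)2", []), ("NHCH2CH2", [r3])]

issrs = expand_group([("Base structure", [r1, r2])])
for row in issrs:
    print(row)
```

  The enumeration yields nine combinations in the same order as the ISSR
rows of Table II, beginning with Br, N(CH3)2 and ending with OCH3,
NHCH2CH2, SCH3.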
 
   In  its  most  basic  form,  such an index system requires that all r(k)
values  be  different.  Thus in representation XXVIIc, the generation of an
implicit  structure  with  two Ft groups attached to Hc is forbidden, since
such  a  structure would have two r(1) groups. The index system can also be
used  to  handle  complex Markush formulations by using the node flags as a
way of breaking down a complex structure into separate structures linked by
appropriate  Boolean logic. In such a treatment, it is necessary to capture
some  atom  and  group  detail  from  the previous and next higher level of
representation  so  as  to  preserve  suitable  detail  for  atom and group
matching techniques.
 
  In  capturing  as  much  detail  of  the Markush formulation as possible,
generic  features  of  the  Markush  formulation  are captured in the SpMCN
representation  as  generic  groups, e.g., representation XVIIIb of FIG. 8.
The  use of such generic features in the SpMCN representation can result in
a  matching  failure  if  appropriate steps are not taken to allow for this
situation  at  the  SpMCN  matching level. To allow for appropriate generic
group  matching  at the SpMCN matching level, "roll-back" to the associated
GnMCN  level  of  representation  is  used.  The  "roll-back"  technique is
illustrated  in  FIG. 15. The query SpMCN representation XXVIIIa of FIG. 15
contains  a  phenyl group (uppermost portion of the representation) that is
implied  in  the  generic  group  Cb  of the file compound XXIXa (uppermost
portion of the representation).
  Atom-by-atom  structure matching will fail to match these two terms since
the  query  segment  contains carbon atom (C) nodes which do not match with
the  generic  group  (Cb)  node.  On  identification of a mismatch due to a
comparison  of  a  real  atom  node  against a generic group node (as noted
earlier,  generic groups in the SpMCN representation are distinguished by a
prime  (')  since they must originate from a generic feature in the Markush
formulation; alternatively, such generic groups can be identified through a
table  lookup),  the  appropriate  real-atom  part  of  the query (or file)
substance  is  rolled-back to its associated generic group node and the two
generic  group  nodes  are  compared.  Thus,  in the example, the real-atom
carbon  atoms  are rolled-back to the generic group node Cb and the Cb node
used to match the Cb node in the file representation. In the second segment
of  the  molecule, the query contains a generic feature (Ak) while the file
compound  contains  real  atoms (CH sub 2 CH sub 2). In this case, the file
segment  is  rolled  back  to  the  generic  group  (Ak) and Ak used in the
matching process.
 
  In  using  generic  groups, one additional factor must be considered. For
example,  in  FIG. 15 the middle segment of the query (Ak) was successfully
compared with the rolled-back Ak segment of the file group. If the query
segment  had been a Ch group instead of an Ak group, a match would not have
been  obtained  even though, by hierarchical group definition, the Ak group
is  included  within  the  definition  of  a  Ch group. To account for such
generic  mismatch  at  either  the  GnMCN/GnMCN matching level itself or on
roll-back  to  the  GnMCN group level in SpMCN/SpMCN matching, a particular
group  within the hierarchy is expanded to include as alternate node values
all  groups  above it in the hierarchy (thus expanding Ch to include Ay and
G) and all lower groups in the hierarchy (Cg and Ak). By using such
an  expansion for all file generic groups, a generic group node, such as Ch
or  Ay,  in  the query will successfully match an expanded Ak group node in
the file substance and vice-versa.
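
  The expansion of a generic group node to include all groups above and
below it in the hierarchy can be sketched with a parent table built from
Table I. The function names are hypothetical, and the sketch covers only
the node-level comparison, not the surrounding structure match.

```python
# Parent of each group in the hierarchy of Table I (G is the root).
PARENT = {
    "Cy": "G", "Ay": "G",
    "Cb": "Cy", "Hc": "Cy",
    "Ch": "Ay", "Fg": "Ay",
    "Cg": "Ch", "Ak": "Ch",
    "Fc": "Fg", "Ft": "Fg",
}


def ancestors(group):
    """All groups above the given group, nearest first."""
    found = []
    while group in PARENT:
        group = PARENT[group]
        found.append(group)
    return found


def expand(group):
    """The group itself plus every group above and below it."""
    descendants = {g for g in PARENT if group in ancestors(g)}
    return {group} | set(ancestors(group)) | descendants


def generic_nodes_match(query_group, file_group):
    """A query node matches if it appears in the expanded file node."""
    return query_group in expand(file_group)


print(sorted(expand("Ch")))
```

  Expanding Ch gives Ch, Ay, G, Cg, and Ak, as stated above, so a query Ch
node matches a file Ak node and vice versa, whereas the sibling groups Cg
and Ak do not match each other.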
 
  Although  the  use  of  GnMCN representation matching and GnMCN attribute
matching  reduces the amount of SpMCN representation matching considerably,
it  is  of  further advantage to eliminate from the file as many irrelevant
GnMCN  and  SpMCN  structures as possible prior to GnMCN group-by-group and
SpMCN atom-by-atom matching since these latter two processes are relatively
slow even when using high-speed computers. Such a reduction in file size is
accomplished  by  use  of  a modified version of known screening techniques
such  as  those  described in "The CAS ONLINE Search System; General System
Design  and  Selection,  Generation,  and  Use  of Search Screens" by P. G.
Dittmar,  N.  A. Farmer, W. Fisanick, R. C. Haines, and J. Mockus (J. Chem.
Inf.  Comput. Sci., 1983, 23, 93-102.), all of which is hereby incorporated
by  reference.  The  generation  of  specific-atom  screens,  generic group
screens,  and  diagnostic screens is illustrated in FIG. 16. Specific-atom
screens are illustrated with representation XXXa of FIG. 16. Augmented Atom
(AA)  screens  are  defined  as  screens  involving an atom and its nearest
neighbor  atoms  and  the  attaching bonds. For example, in a first ISSR of
XXXa  (ISSRs  are  not  explicitly  shown),  the  first  carbon atom (C) is
attached  to  only  one other carbon by a single bond (hydrogen atoms being
ignored).  The  second  carbon  is attached to the first carbon and a third
carbon;  the  third  carbon  is attached to the second carbon and an oxygen
(O);  and the oxygen is attached to a carbon. In a second ISSR of XXXa, the
first carbon is attached to one other carbon, the second carbon is attached
to  the first carbon and to a third carbon, the third carbon is attached to
the  second  carbon  and to a nitrogen (N), and the nitrogen is attached to
the  third  carbon.  In  a  third ISSR, the first carbon is attached to the
second  carbon  and the second carbon is attached to the first carbon and a
third  carbon,  and  the  third  carbon  is  attached to the second carbon;
generic  groups  are  specifically  ignored  in the generation of real atom
screens.  The  general  notation  for  AA screens is to cite the atom under
consideration  first  followed  by each attached atom in alphabetical order
with the bonding immediately preceding each attached atom.
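
  A minimal sketch of the AA notation follows, assuming a structure given
as a list of element symbols and a list of bonds with hydrogens already
omitted. The function name and input layout are hypothetical; decomposition
of larger fragments into smaller ones is not attempted here.

```python
def aa_screens(atoms, bonds):
    """atoms: list of element symbols; bonds: list of (i, j, bond_symbol)."""
    neighbours = {i: [] for i in range(len(atoms))}
    for i, j, bond in bonds:
        neighbours[i].append((atoms[j], bond))
        neighbours[j].append((atoms[i], bond))
    screens = set()
    for i, symbol in enumerate(atoms):
        attached = sorted(neighbours[i])   # alphabetical by attached element
        if attached:
            # Atom under consideration first, then bond + attached atom.
            screens.add(symbol + "".join(b + a for a, b in attached))
    return screens


# First ISSR discussed above: C-C-C-O, all chain bonds ("--").
chain = ["C", "C", "C", "O"]
bonds = [(0, 1, "--"), (1, 2, "--"), (2, 3, "--")]
print(sorted(aa_screens(chain, bonds)))
```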
 
  Another  type  of screen is the Atom Sequence (AS) screen in which linear
sequences  of four to six nonhydrogen atoms are described. For the ISSRs of
XXXa, there are two four-atom sequences, C--C--C--N and C--C--C--O. A third
type  of  screen  is  the  Element  Count  (EC)  screen which indicates the
presence  of  specific elements and for some of the more common elements, a
count of the number of atoms, e.g., six or more carbon atoms.
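
  The AS and EC screens can be sketched in the same hypothetical input
layout. Writing each sequence in the alphabetically smaller of its two
directions is an assumption made here so that a sequence is generated only
once; only chain bonds are considered.

```python
from collections import Counter


def atom_sequences(atoms, bonds, length=4):
    """Linear sequences of `length` atoms, each in one canonical direction."""
    adj = {i: set() for i in range(len(atoms))}
    for i, j in bonds:
        adj[i].add(j)
        adj[j].add(i)

    found = set()

    def walk(path):
        if len(path) == length:
            symbols = [atoms[k] for k in path]
            found.add("--".join(min(symbols, symbols[::-1])))
            return
        for nxt in adj[path[-1]]:
            if nxt not in path:      # a linear sequence never revisits an atom
                walk(path + [nxt])

    for start in range(len(atoms)):
        walk([start])
    return found


def element_counts(atoms):
    """Element Count information: number of atoms of each element."""
    return Counter(atoms)


bonds = [(0, 1), (1, 2), (2, 3)]
print(atom_sequences(["C", "C", "C", "N"], bonds))
print(element_counts(["C", "C", "C", "O"]))
```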
 
  After  generating  the  AA,  EC,  and AS screens, they are looked up in a
screen  dictionary  (Table  III)  to  obtain  their  corresponding  screen
numbers.  The  screen  numbers  correspond to specific bits in a bit string
illustrated  in  FIG.  16.  It is noted that in screen fragment generation,
larger fragments are decomposed into smaller, more generic screen fragments
that are also included in the screen dictionary. For example, the AA screen
fragment,  C--C--O  (#4),  is  decomposed  into C--C (#1) and C--O (#6). As
illustrated  in  FIG.  16,  the  screens  for  the three ISSRs of the SpMCN
representation XXXa turn on bits 1, 2, 4-7, 11-13, 17, and 19.
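
  The dictionary look-up can be illustrated with the entries of Table III.
The fragment lists for the three ISSRs are written out by hand from the
description above (C-C-C-O, C-C-C-N, and C-C-C with the generic group
ignored); the variable and function names are hypothetical.

```python
# Screen fragment -> screen number, copied from Table III.
TABLE_III = {
    "EC": {"C": 11, "N": 12, "O": 13, "Cl": 14},
    "AA": {"C--C": 1, "C--C--C": 2, "C--C--C--C": 3, "C--C--O": 4,
           "C--C--N": 5, "O--C": 6, "N--C": 7, "C*C*C": 8, "Cl--C": 9,
           "C--C--Cl": 10, "C*C": 15},
    "AS": {"C--C--C--O": 17, "C--C--C--C": 18, "C--C--C--N": 19,
           "C--C--C--Cl": 20},
}

# Screen fragments of the three ISSRs of representation XXXa.
ISSRS = [
    {"EC": ["C", "O"], "AS": ["C--C--C--O"],
     "AA": ["C--C", "C--C--C", "C--C--O", "O--C"]},
    {"EC": ["C", "N"], "AS": ["C--C--C--N"],
     "AA": ["C--C", "C--C--C", "C--C--N", "N--C"]},
    {"EC": ["C"], "AS": [], "AA": ["C--C", "C--C--C"]},
]


def bit_string(issrs):
    """Union of the screen numbers of all ISSRs, i.e. the bits turned on."""
    bits = set()
    for issr in issrs:
        for kind, fragments in issr.items():
            bits.update(TABLE_III[kind][f] for f in fragments)
    return bits


print(sorted(bit_string(ISSRS)))
```

The resulting set corresponds to bits 1, 2, 4-7, 11-13, 17, and 19.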
 
              TABLE III
 SCREEN DICTIONARY
SPECIFIC ATOM SCREENS
            SCREEN NUMBER
 EC SCREENS
C             11
O             13
N             12
Cl            14
AA SCREENS
C--C           1
Cl--C          9
N--C           7
O--C           6
C--C--C        2
C--C--Cl      10
C--C--N        5
C--C--O        4
C--C--C--C     3
C*C*C          8
C*C           15
AS SCREENS
C--C--C--C    18
C--C--C--Cl   20
C--C--C--N    19
C--C--C--O    17
  (-- indicates a chain bond; * indicates a ring bond)
 
              TABLE IV
 SCREEN DICTIONARY
GENERIC SCREEN
SCREENS   SCREEN NUMBER
 Ak        58 (51,57,59)
Ak-Ay     37 (36,42,43,46,47,48; 51,57,58,59)
Ak-Cb     34 (33,36,43,46,48,49,50,60,61,62; 51,52,53,
          57,58,59)
Ak-Cy     33 (36,43,46,48,49,50,60; 51,52,57,58,59)
Ak-Fg     32 (36,37,41,42,43,44,45,46,47,48; 51,56,57,58,59)
Ak-Ft     31 (32,36,37,38,39,40,41,42,43,44,45,46,47,48;
          51,55,56,57,58,59)
Ak-G      36 (43,46,48; 51,57,58,59)
Ak-Hc     35 (33,36,43,46,48,49,50,60,63,64,65; 51,52,54,
          57,58,59)
Ay        57 (51)
Ay-Ay     47 (46,48; 51,57)
Ay-Cb     62 (46,48,50,60; 51,52,53,57)
Ay-Ch     42 (43,46,47,48; 51,57,59)
Ay-Cy     50 (46,48,60; 51,52,57)
Ay-Fg     44 (45,46,47,48; 51,56,57)
Ay-Ft     39 (40,45,46,47,48; 51,55,56,57)
Ay-G      48 (46; 51,57)
Ay-Hc     64 (46,48,50,60,63; 51,52,54,57)
Cb        53 (51,52)
Cb-G      63 (46,60; 51,52,53)
Ch        59 (51,57)
Ch-Cb     61 (43,46,48,49,50,60,62; 51,52,53,57,59)
Ch-Cy     49 (43,46,48,50,60; 51,52,57,59)
Ch-Fg     41 (42,43,44,46,47,48; 51,56,57,59)
Ch-Ft     38 (39,40,41,42,43,44,46,47,48; 51,55,56,57,59)
Ch-G      43 (46,48; 51,57,59)
Ch-Hc     65 (43,46,48,49,50,60,63,64; 51,52,54,57,59)
Cy        52 (51)
Cy-G      60 (46; 51,52)
Fg        56 (51,57)
Fg-G      45 (46,48; 51,56,57)
Ft        55 (51,56,57)
Ft-G      40 (45,46,48; 51,55,56,57)
G         51
G-G       46 (51)
G-Hc      63 (46,60; 51,52,54)
Hc        54 (51,52)
 
              TABLE V
 SCREEN DICTIONARY
DIAGNOSTIC SCREENS
SCREENS          SCREEN NUMBER
 Ak'              78 (71, 77, 79)
Ay'              77 (71)
Cb'              73 (71, 72)
Ch'              79 (71, 77)
Cy'              72 (71)
Fg'              76 (71, 77)
Ft'              75 (71, 76, 77)
G'               71
Hc'              74 (71, 72)
 
   For illustrative purposes, the screens and Specific-Atom Bit String used
to  depict  the  ISSRs  of  SpMCN  representation XXXa have been simplified
extensively.  This  is  in  no way intended to limit this invention to this
particular  description;  various other screens described in the article by
Dittmar  et  al.,  ibid.,  e.g., bond sequences, atom counts, etc., used for
specific-atom screening are also anticipated by this invention.
 
  In  developing  screens for the GnMCN representation, it typically is not
necessary to generate as wide an array of screens as is used with the SpMCN
representation   since   many   of   the   atomic  features  of  the  SpMCN
representation   are   compressed   into   a  single  group  in  the  GnMCN
representation. Typically, AA doublets or dumbbells (a generic group joined
to  one  other  generic group) and singlets (a generic group by itself) are
adequate for satisfactory screening at the generic group level. As with the
specific-atom  screens,  this  description  is given in simplified form to
teach  the  invention  and  is not intended to limit the number or types of
screens used for generic group screening.
 
  The  GnMCN  representation XXXb contains two dumbbell screens, Ak--Ft and
Ak--Cy.  Since Ak--Cy implies the more specific screens, Ak--Cb and Ak--Hc,
these  screens  must also be included as possible screens and are generated
at  the time of bit string construction. If these two more specific screens
are  not included, this particular file compound would be screened out when
in fact it is a potential match for a query such as Ak--Hc, i.e., the query
alkyl-heterocycle  (Ak--Hc)  is  implicit  in  the  file alkyl-cyclic group
(Ak--Cy).  In  examining  the  screen  dictionary  given in Table IV, it is
apparent  that  the  Generic-Group screens contain the more generic screens
for  that  screen.  Thus the screen Ay--G includes the more generic doublet
term  G--G  and  the singlets, Ay and G. As a result, the screen dictionary
automatically  accounts  for  the  matching  of  related,  but more generic
fragments  in  which  the  more specific fragment is contained. For generic
representation   XXXb,   bits   31-33  and  36-65  are  turned  on  in  the
Generic-Group Bit String.
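
  The way a dictionary entry carries its more generic terms can be sketched
from the hierarchy alone. The parent links below are inferred from the
singlet entries of Table IV, and naming a doublet by its two groups in
alphabetical order is an assumption of this sketch; the function names are
hypothetical.

```python
from itertools import product

# Child -> parent links, inferred from the singlet entries of Table IV.
PARENT = {"Ay": "G", "Cy": "G", "Ch": "Ay", "Fg": "Ay",
          "Ak": "Ch", "Ft": "Fg", "Cb": "Cy", "Hc": "Cy"}


def up(group):
    """The group itself followed by every more generic group above it."""
    chain = [group]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def pair(a, b):
    return "-".join(sorted((a, b)))


def more_generic_doublets(a, b):
    """Doublets implied by a-b when either end is replaced by a higher group."""
    return {pair(x, y) for x, y in product(up(a), up(b))} - {pair(a, b)}


def singlets(a, b):
    return set(up(a)) | set(up(b))


print(sorted(more_generic_doublets("Ay", "G")), sorted(singlets("Ay", "G")))
print(sorted(more_generic_doublets("Ak", "Ay")))
```

For Ay-G this yields the doublet G-G and the singlets Ay and G, and for
Ak-Ay it yields Ak-G, Ay-Ay, Ay-Ch, Ay-G, Ch-G, and G-G, which correspond to
screens 36, 47, 42, 48, 43, and 46 of the Ak-Ay entry in Table IV.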
 
  For  generic  groups  that  originate  from  generic  expressions  in the
original Markush formulation and, as noted in FIG. 8, denoted with a prime,
a  special  diagnostic bit screen is used. As with the generic screens, the
more  generic  terms  are  included  with the term in the diagnostic screen
dictionary  (Table  V),  e.g.,  Cy'  includes  G';  more specific terms are
generated  at  the  time  of bit string construction, e.g., Cb' and Hc' are
generated  for Cy'. As a result, bits 71-74 are turned on in the Diagnostic
Bit String illustrated in FIG. 16.
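
  A sketch of diagnostic bit generation for a single primed group follows.
The screen numbers are those of Table V; the parent links are inferred from
the more generic terms listed there, and the function names are
hypothetical.

```python
# Diagnostic screen numbers from Table V, keyed by group name without prime.
DIAGNOSTIC_NUMBER = {"G": 71, "Cy": 72, "Cb": 73, "Hc": 74, "Ft": 75,
                     "Fg": 76, "Ay": 77, "Ak": 78, "Ch": 79}

# Child -> parent links, inferred from the terms in parentheses in Table V.
PARENT = {"Ay": "G", "Cy": "G", "Ch": "Ay", "Fg": "Ay",
          "Ak": "Ch", "Ft": "Fg", "Cb": "Cy", "Hc": "Cy"}


def ancestors(group):
    out = set()
    while group in PARENT:
        group = PARENT[group]
        out.add(group)
    return out


def diagnostic_bits(group):
    """Bits set for a primed group: itself, more generic, and more specific."""
    more_generic = ancestors(group)                      # from the dictionary
    more_specific = {g for g in PARENT if group in ancestors(g)}
    return {DIAGNOSTIC_NUMBER[g] for g in {group} | more_generic | more_specific}


print(sorted(diagnostic_bits("Cy")))
```

For Cy' the sketch gives Cy' (72), the more generic G' (71), and the more
specific Cb' (73) and Hc' (74), i.e. bits 71-74.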
 
  In  bit  string  generation  for  both  generic  and  specific  atom file
representations, screens for all possible ISSRs and IGSRs are included in a
single  bit  string. Moreover, the specific screens are generated without a
consideration   of   the  boundaries  which  result  when  real  atoms  are
transformed  into  generic  groups. Thus in representation XXXa of FIG. 16,
the  AA  screen C--C--O (No. 4) involved atoms (C--C) that transformed into
the Ak group and an atom (O) that transformed into the Ft group.
 
  For  query  representations, each ISSR is considered as a separate entity
and  real-atom  screens  are  generated  so  as  to restrict the real atoms
screens  to atoms that transform to a single generic group or generic-group
combination.  To  simplify the illustration of the screening technique, the
real-atom  screens  are  restricted  to  the boundaries of a single generic
group  but  this  is  not  intended to restrict the scope of the invention.
Although  real-atom  screens  must  be  restricted to the boundaries or the
corresponding  generic  counterpart, the generic counterpart may be defined
by  any  number  of  generic-group combinations. In representation XXXIa of
FIG. 17, the real atom screens are limited to those atoms that transform to
a  single  group,  i.e.,  the  screens  are derived from three disconnected
fragments,  CH  sub  3  CH  sub 2 CH sub 2, Cl, and OH. Only the AA screens
C--C--C and C--C are used for the group CH sub 3 CH sub 2 CH sub 2. AA screens,
C--C--Cl  and  C--C--O, and AS screens, C--C--C--Cl and C--C--C--O, are not
used since these screens span two separate generic groups. Correspondingly,
the diagnostic generic screens are limited to the singlets, Ak and Ft.
 
  Query XXXIa of FIG. 17 along with the Specific-Atom and Generic Group Bit
Strings  of  FIG.  16 illustrate the screening technique in its basic form.
The  IGSR Ak--Ft of XXXIb, corresponding to ISSR CH sub 3 CH sub 2 CH sub 2
Cl,  gives  rise  to  two  disconnected  singlets,  Ak  and  Ft, which have
corresponding  screen  numbers  51,  55,  56, 57, 58, and 59. Using Boolean
logic, the query bit requirement is expressed as (Ak: 58 OR 51 OR 57 OR 59)
AND  (Ft: 55 OR 51 OR 56 OR 57). (The Ak and Ft in the logic expression are
present  for  illustrative purposes only.) Examining the file Generic-Group
Bit  String  in  FIG.  16, this initial condition is met. Proceeding to the
ISSRs,  query  ISSR  CH sub 3 CH sub 2 CH sub 2 Cl, which is treated as two
disconnected  fragments,  CH  sub 3 CH sub 2 CH sub 2 and Cl, requires that
the  file  Specific-Atom  Bit  String  contain  the following Boolean logic
query:  1  AND  2  AND  11  AND 14. In examining the file Specific-Atom Bit
String  of  FIG.  16,  it  is apparent that this logic condition is not met
since bit 14 is off.
 
  Proceeding  to  the  next  implicit  structure  (as  is recalled only one
implicit  structure  of  the query must match with an implicit structure of
the file representation), IGSR of CH sub 3 CH sub 2 CH sub 2 OH is the same
as  that  for CH sub 3 CH sub 2 CH sub 2 Cl, i.e., Ak--Ft, and thus it must
also match the Generic-Group Bit String of FIG. 16. The Boolean logic query
expression  1 AND 2 AND 11 AND 13 is obtained for the ISSR, CH sub 3 CH sub
2 CH sub 2 OH, as the disconnected fragments CH sub 3 CH sub 2 CH sub 2 and
OH, and is found in the Specific-Atom Bit String of FIG. 16. Since both the
generic-group  and specific-atom logic expressions of at least one IGSR and
corresponding  ISSR  of  query  XXXI is found in the file Generic-Group and
Specific-Atom  Bit  Strings, the associated GnMCN and SpMCN representations
are then compared using group-by-group and atom-by-atom techniques for each
IGSR  and  ISSR of both the file and query GnMCN and SpMCN representations.
Unlike  group-by-group  and atom-by-atom matching, which requires that each
IGSR and ISSR of both the file and query GnMCN and SpMCN representations be
compared,  screening  is  satisfied  as  long  as  the  generic-group  and
corresponding  specific-atom screens of at least one IGSR and corresponding
ISSR  are  found  in  each  aggregate file Generic-Group and corresponding
Specific-Atom  Bit  String, i.e., screens from all IGSRs and ISSRs for each
file GnMCN and associated SpMCN representation.
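
  The basic screening test for query XXXIa can be sketched with the bit
numbers quoted in the text (specific-atom bits 1, 2, 4-7, 11-13, 17, 19 and
generic-group bits 31-33 and 36-65). The helper names are hypothetical.

```python
FILE_SPECIFIC = {1, 2, 4, 5, 6, 7, 11, 12, 13, 17, 19}
FILE_GENERIC = set(range(31, 34)) | set(range(36, 66))

AK = (58, 51, 57, 59)   # Ak singlet and its more generic terms (Table IV)
FT = (55, 51, 56, 57)   # Ft singlet and its more generic terms (Table IV)


def passes(generic_terms, specific_bits):
    # Each generic term is an OR over its bits; the terms are ANDed together.
    generic_ok = all(set(term) & FILE_GENERIC for term in generic_terms)
    # The specific-atom requirement is an AND over all of its bits.
    specific_ok = set(specific_bits) <= FILE_SPECIFIC
    return generic_ok and specific_ok


implicit_structures = {
    "CH3CH2CH2Cl": ([AK, FT], (1, 2, 11, 14)),
    "CH3CH2CH2OH": ([AK, FT], (1, 2, 11, 13)),
}
results = {name: passes(generic, specific)
           for name, (generic, specific) in implicit_structures.items()}
candidate = any(results.values())   # one matching implicit structure suffices
print(results, candidate)
```

The chloride fails because bit 14 is off, the alcohol passes, and the file
representation therefore goes forward to group-by-group and atom-by-atom
matching.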
 
  ISSRs  XXXIIa  and XXXIIb of FIG. 17 illustrate the handling of a generic
feature   found  in  the  original  Markush  formulation.  Look-up  of  the
individual generic groups of XXXIIb in the generic screen dictionary (Table
IV) results in the logic expression, (Ak: 58 OR 51 OR 57 OR 59) AND (Hc: 54
OR  51  OR  52)  which is found in the Generic-Group Bit String of the file
representation  of  FIG. 16. In formulating a logic expression for an ISSR,
all generic groups are ignored. As a result, ISSR XXXIIa corresponds to the
fragment  CH sub 3 CH sub 2 CH sub 2 and results in the logic expression, 1
AND  2 AND 11, which is also found in the Specific Bit String shown in FIG.
16. IGSR XXXIIb also illustrates the need to expand file representations to
include   all   more   specific   members,   i.e.,  Ak--Cb  and  Ak--Hc  in
representation  XXXb.  If  the file representation had not been expanded to
include  these  members of the generic group, IGSR XXXIIb would have failed
to match.
 
  ISSR  XXXIIIa  and IGSR XXXIIIb of FIG. 17 illustrate the use of the file
Diagnostic  Bit  String  (FIG.  16)  in handling a real-atom segment in the
query  ISSR  when  the  corresponding  file  ISSR  has only a generic group
corresponding  to  that  segment  as  a  result of originating in a generic
feature   of   the  original  Markush  for  which  no  real-atom  structure
information  is  available.  The  individual generic groups of IGSR XXXIIIb
give  rise  to the logic expression, (Ak: 58 OR 51 OR 57 OR 59) AND (Cb: 53
OR  51  OR  52)  which is found in the Generic-Group Bit String of the file
representation  XXXb. The specific atom groups of ISSR XXXIIIa give rise to
the  logic  expression, 1 AND 2 AND 8 AND 15. The logic expression does not
match with the file Specific-Atom Bit String of FIG. 16 since bits 8 and 15
of  the file Bit String are off. However, it is noted that the file ISSR CH
sub  3  CH  sub  2  CH  sub  2 Cy does match the query ISSR. The failure to
recognize  the  match  results from the fact that the query ISSR contains a
real-atom  three-carbon  ring-system  from  which  the screen fragments C*C
(#15)  and  C*C*C  (#8)  are derived while the file ISSR contains a generic
group  Cy  from  the  original  Markush  formulation  for  which the screen
fragments C*C and C*C*C are not generated. In order to ensure that the file
substance  is  not  missed,  the  diagnostic  screens  exist  in  the  file
Diagnostic  Bit  String  to  indicate  when alternatives to real-atom query
screens  must  be  used,  i.e.,  they  indicate when there are no real-atom
screens  in  the  file Specific-Atom Bit String for a particular segment of
the  file  representation.  (This  is  done  since  it  is not practical to
generate and include in the Specific-Atom Bit String all possible real-atom
screens  that  correspond  to the various generic groups.) For example, the
generic  group  corresponding  to  the query cyclic group in representation
XXXIIIa   is   Cb.   On  examining  the  diagnostic  bit  string  for  file
representation  XXXa (FIG. 16), the diagnostic bit for Cb (#73) is "on". As
stated,  the fact that the diagnostic screen is on for Cb means that the Cb
group  originated from a generic feature of the Markush formulation and, as
such,  no  corresponding  real  atom  screens  are  available  in  the file
Specific-Atom  Bit  String for this group. To obtain a match with this file
representation,  it  is necessary to use as an alternative to the real-atom
screens  corresponding  to  Cb in the query logic expression the diagnostic
screens  73  OR 71 OR 72. As a result, screens 8 and 15 are not required to
be  present  for  file  representation  XXXa  if one of the above mentioned
diagnostic screens is present, and the resulting "reduced" real-atom screen
query  logic  expression for XXXIIIa is (1 AND 2 AND 11) AND ((8 AND 15) OR
(Cb': 73 OR 71 OR 72)). Since this expression matches the Specific-Atom and
Diagnostic  Bit  Strings  of  FIG.  16,  the file representation of FIG. 16
becomes  a  candidate  for  GnMCN  group-by-group  and  SpMCN  atom-by-atom
matching.  As  has been seen, the purpose of the file diagnostic screens is
to  provide the necessary alternative to specific-atom screens in the query
logic expression when the file representation contains no real-atom screens
against  which the specific-atom screens of the query expression can match.
The function of this special Boolean strategy is to ensure as high a recall
as possible for the system.
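
  The effect of the diagnostic alternative on the "reduced" expression for
XXXIIIa can be sketched as follows, using the bit numbers given in the text.
The function name is hypothetical.

```python
FILE_SPECIFIC = {1, 2, 4, 5, 6, 7, 11, 12, 13, 17, 19}
FILE_DIAGNOSTIC = {71, 72, 73, 74}


def screen_xxxiiia(specific, diagnostic):
    """(1 AND 2 AND 11) AND ((8 AND 15) OR (73 OR 71 OR 72))."""
    chain_ok = {1, 2, 11} <= specific
    ring_ok = {8, 15} <= specific                  # real-atom ring screens
    cb_diagnostic = bool({73, 71, 72} & diagnostic)  # alternative for Cb
    return chain_ok and (ring_ok or cb_diagnostic)


print(screen_xxxiiia(FILE_SPECIFIC, FILE_DIAGNOSTIC))   # diagnostics present
print(screen_xxxiiia(FILE_SPECIFIC, set()))             # diagnostics absent
```

With the diagnostic bits present the file representation is retained; with
none set, the missing ring screens 8 and 15 would screen it out.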
 
  For  illustrative  purposes,  generic  and  specific  query/file  screen
matching  has been detailed in three steps, i.e., (1) matching of a generic
query  logic expression with the generic file bit string, (2) matching of a
specific  query  logic expression with a specific file bit string, and, (3)
using  alternative  diagnostic screens for specific-atom query screens when
the  appropriate  diagnostic screens are present in the file Diagnostic Bit
String. In practice, all three steps can be combined. A single specific and
generic  logic  expression  is  formulated for all segments of the ISSR and
within  the  specific  logic  expression  diagnostic  screens  are  used as
alternatives  for  the appropriate sets of real-atom screens that have been
derived from fragments corresponding to individual generic groups. Thus the
complete  logic expression for implicit representations XXXIIIa and XXXIIIb
is: (((1 AND 2 AND 11) OR (Ak': 78 OR 71 OR 77 OR 79)) AND (Ak: 58 OR 51 OR
57  OR 59)) AND (((1 AND 8 AND 15) OR (Cb': 73 OR 71 OR 72)) AND (Cb: 53 OR
51  OR  52)).  This  expression is passed against a combined Specific-Atom,
Generic-Group,  and  Diagnostic Bit String in one step with the appropriate
diagnostic  screens  being used according to their presence in the combined
file  Bit  String.  Moreover,  complete logic expressions for all ISSRs and
IGSRs  in  a single SpMCN and associated GnMCN query can be combined into a
single  logic  expression.  Thus  for  SpMCN,  XXXIa, and associated GnMCN,
XXXIb,  the  following  logic expression is obtained: (((1 AND 2 AND 11) OR
(Ak':  78  OR 71 OR 77 OR 79)) AND (Ak: 58 OR 51 OR 57 OR 59)) AND (((13 OR
14) OR (Ft': 75 OR 71 OR 76 OR 77)) AND (Ft: 55 OR 51 OR 56 OR 57)).
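
  A combined expression of this kind can be sketched as a small expression
tree evaluated in one pass against a combined bit string. The AND, OR,
evaluate, and segment helpers are hypothetical; the alkyl real-atom term is
taken as 1 AND 2 AND 11, as derived earlier for the fragment CH sub 3 CH sub
2 CH sub 2.

```python
def AND(*terms):
    return ("AND", terms)


def OR(*terms):
    return ("OR", terms)


def evaluate(expr, bits):
    """Evaluate a nested AND/OR expression of screen numbers against bits."""
    if isinstance(expr, int):
        return expr in bits
    op, terms = expr
    results = (evaluate(t, bits) for t in terms)
    return all(results) if op == "AND" else any(results)


def segment(real_atom, diagnostic, generic):
    # (real-atom screens OR diagnostic alternatives) AND generic-group screens
    return AND(OR(real_atom, OR(*diagnostic)), OR(*generic))


query = AND(
    segment(AND(1, 2, 11), (78, 71, 77, 79), (58, 51, 57, 59)),   # Ak
    segment(OR(13, 14), (75, 71, 76, 77), (55, 51, 56, 57)),      # Ft
)

combined = ({1, 2, 4, 5, 6, 7, 11, 12, 13, 17, 19}        # specific-atom
            | set(range(31, 34)) | set(range(36, 66))     # generic-group
            | {71, 72, 73, 74})                           # diagnostic
print(evaluate(query, combined))
```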
 
  In  formulating  specific atom screens, it is essential to limit specific
atom  screens  to  the  specific  atoms  defined  by  generic groups. As is
apparent,  this  specific  atom  screen  limitation  is necessary so that a
specific-atom  group  in a query can match a corresponding generic group in
the  file  that  has  no  real-atom counterparts using the Boolean strategy
involving diagnostic screens. For illustrative purposes, the boundaries for
real-atom  screens  have  been  limited to a corresponding single generic
group.  In  practice,  combinations  of  generic groups are defined and all
real-atom  screens  within the generic-group combination used. For example,
the  real-atom  screens corresponding to the generic-group pair combination
Ak--Ft  can  be  used  for  the  ISSR  CH sub 3 CH sub 2 CH sub 2 Cl of representation
XXXIa  of  FIG.  17.  In such a situation, the AA real-atom screen C--C--Cl
(#10)  and  AS  screen  C--C--C--Cl  (#20) are used to eliminate additional
irrelevant  structures  from  the  file, i.e., to improve search precision.
However,  in  using  such screens, the diagnostic screen boundaries must be
expanded  by using the pair combination set Ak'--Ft, Ak--Ft', and Ak'--Ft',
i.e.,  Ak'--Ft  OR  Ak--Ft'  OR  Ak'--Ft' as alternatives for the real-atom
screen  set. To afford even greater precision, both generic group pairs and
singlets  can  be used as diagnostic screens. For example, if the real-atom
screens  derived from CH sub 3 CH sub 2 CH sub 2 Cl are denoted as "set a",
the  screens  from  CH sub 3 CH sub 2 CH sub 2 as "set b", and the screens
from  Cl  as "set c", and the diagnostic screens are Ak', Ft', and (Ak'--Ft
OR Ak--Ft' OR Ak'--Ft'); then the logic expression is ((set a OR (Ak'--Ft
OR  Ak--Ft'  OR Ak'--Ft')) AND (set b OR Ak') AND (set c OR Ft')). The "set
a"  screens  are  not used against a file substance with a GnMCN containing
Ak'--Ft,  Ak--Ft',  or  Ak'--Ft',  i.e.,  a GnMCN associated with a SpMCN
which  has  an  Ak' attached to a real-atom group corresponding to a Ft, a
Ft'  attached  to  a  real-atom  group  corresponding  to  an  Ak,  or  an
Ak'--Ft'.
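
  The pair-combination strategy can be sketched with screens identified by
name rather than by number, since no screen numbers are given for the pair
diagnostic screens. The contents chosen for sets a, b, and c, the "EC:"
labels, and the function name are assumptions of this sketch.

```python
PAIR_DIAGNOSTICS = {"Ak'--Ft", "Ak--Ft'", "Ak'--Ft'"}

SET_B = {"C--C", "C--C--C", "EC:C"}                    # from CH3CH2CH2
SET_C = {"EC:Cl"}                                      # from Cl
SET_A = SET_B | SET_C | {"C--C--Cl", "C--C--C--Cl"}    # from CH3CH2CH2Cl


def passes(file_screens):
    """(set a OR pair diagnostics) AND (set b OR Ak') AND (set c OR Ft')."""
    a_ok = SET_A <= file_screens or bool(PAIR_DIAGNOSTICS & file_screens)
    b_ok = SET_B <= file_screens or "Ak'" in file_screens
    c_ok = SET_C <= file_screens or "Ft'" in file_screens
    return a_ok and b_ok and c_ok


real_atom_file = set(SET_A)                          # all real atoms present
generic_ft_file = SET_B | {"Ft'", "Ak--Ft'"}         # Ft from a generic feature
unrelated_file = SET_B | SET_C                       # C and Cl not adjacent
print(passes(real_atom_file), passes(generic_ft_file), passes(unrelated_file))
```

The third file carries the singlet screens but neither the spanning screens
nor a pair diagnostic screen, so the added "set a" requirement eliminates it.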
 
  As  an alternative to the above technique, an inverted file technique can
be used for the storage and searching of the file substance screens rather
than  bit  strings.  In this technique, the screens (specific-atom, generic
group,  and diagnostic) become file index terms and a common identifier for
the  GnMCN  and  SpMCN  representations  associated with a given screen are
"inverted"  under the file index term. The query screen logic expression is
developed  in  the same manner as for bit-string searching and is processed
against the screen index file giving as answers those substance identifiers
having the requisite query screens.
 
                                  TABLE VI
 INVERTED SCREEN FILE
SCREENS
Real Atom           Generic Group Diagnostic
1 2 3 4 5 6 7 8 9 10
                    41
                      42
                        43
                          44
                            45
                              46
                                47
                                  70
                                    71
                                      72
                                        73
 A   A   A A A   A       A   A A A
B B   B             B B B         B B
    C C C C       C     C C       C      C
  D           D D           D     D D
E E E E   E E E E     E E E   E     E    E
  QUERY: ((2 AND 3) OR (71 OR 72)) AND (41 OR 43)
 
For  example,  in  Table  VI,  "A"  is  the identifier for a file SpMCN and
associated  GnMCN  representations from which real-atom, generic-group, and
diagnostic  screens  are  derived.  "B"  is the identifier for another file
SpMCN  and  associated  GnMCN  from  which  real-atom,  generic-group,  and
diagnostic  screens  are  derived.  "C,"  "D,"  and  "E"  are  similar file
representation  identifiers.  Each  file  identifier  is  posted  under the
corresponding  screens  generated  from the SpMCN and GnMCN representation.
Appropriate file identifiers meeting the requisite Boolean logic conditions
of  the  query  logic  expression  are  retrieved  as  answers,  here, file
identifiers  "B"  and "E." These GnMCN and SpMCN representations associated
with   the  answers  are  then  further  processed  in  group-by-group  and
atom-by-atom searching.
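
  Processing a query against an inverted screen file reduces to set unions
and intersections over posting lists. The postings below are invented for
illustration and are not a transcription of Table VI; only the query
expression is taken from the table.

```python
# Screen number -> identifiers posted under that screen (illustrative data).
POSTINGS = {
    2: {"A", "B", "E"},
    3: {"B", "C", "E"},
    41: {"B", "D"},
    43: {"A", "E"},
    71: {"C"},
    72: {"C"},
}


def ids(screen):
    """Identifiers posted under a screen; empty if the screen is unused."""
    return POSTINGS.get(screen, set())


# QUERY: ((2 AND 3) OR (71 OR 72)) AND (41 OR 43)
# AND becomes set intersection (&); OR becomes set union (|).
answers = ((ids(2) & ids(3)) | (ids(71) | ids(72))) & (ids(41) | ids(43))
print(sorted(answers))
```

With these particular postings the answers are identifiers "B" and "E",
which would then be passed on to group-by-group and atom-by-atom searching.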
 
  FIG.   18   illustrates  the  general  operation  and  computer  hardware
facilities  for  the  practice of this invention. A user terminal such as a
Hewlett-Packard  2647A  (Hewlett-Packard  Co.,  Cupertino,  CA) is used for
graphic  structure  input. Query and search control uses the resources of a
large   computer  system  such  as  one  or  more  IBM  3081s  (IBM  Corp.,
Poughkeepsie, NY) and includes a telecommunications interface and query and
search control procedures such as those described in Zeidner, C. R.; Amoss,
J.  O.;  Haines,  R.  C.  "The  CAS  ONLINE  architecture  for Substructure
Searching"  in  "Proceedings  of  the  3rd National Online Meeting" Learned
Information,  Inc.: Medford N.J., 1982; all of which is incorporated herein
by  reference,  software  for  handling graphic structure input such as the
Online  Structure Input System (OLSIS: J. E. Blake, N. A. Farmer, and R. C.
Haines.  "An  interactive  Computer Graphics System for Processing Chemical
Structure Diagrams." J. Chem. Inf. Comput. Sci., 1977, 17, 223-228), all of
which  is  herein incorporated by reference, a data-base management program
such as ADABAS (Software AG, Darmstadt, W. Germany) used during answer data
retrieval to access data bases such as patent files, abstracts files, etc.,
and  output  programs  for  online  answers  and  off-line prints through a
high-speed  printer  such as the Xerox 9700 laser printer (not shown; Xerox
Corporation,  El  Segundo, CA). Initial screen search is performed on a set
of  minicomputers  such as Digital Equipment Corp. (Maynard, MA.) PDP11/45s
while   group-by-group   and  atom-by-atom  searches  and  logic  condition
verification  are  carried  out  on  a set of minicomputers such as Digital
Equipment  Corp.  PDP  11/44s  with  one or more microcomputers such as the
Digital  Equipment  Corp.  PDP 11/55 and PDP 11/44 used for system support,
maintenance, and backup.
 
  In  a  typical  search,  a screen logic expression is formulated from the
query GnMCN and SpMCN structures and includes a special Boolean strategy to
ensure retrieval of all representations based on generic representations in
the  original  Markush  formulation  and  is  passed against the bit screen
files.   Answers   from   the   screening  search  are  passed  to  generic
group-by-group  search  to  determine  if  any  IGSRs  in  the  query GnMCN
representation  are  found in the file GnMCN representations. Attributes of
each  generic-group  node of the query representation are compared with the
attributes  of each matching generic-group node of the file representations
during  the GnMCN comparison. File answers from the generic-group/attribute
match are passed to an atom-by-atom search to determine if any ISSRs in the
query  SpMCN  representation are found in the file SpMCN representations. A
roll-back  technique  is  used  to  achieve  a high degree of recall during
atom-by-atom  comparison by substituting a corresponding generic-group node
for a real-atom group of nodes when such a generic group node occurs in the
query  or file substance. Resulting answers are checked to confirm that all
substituent  logic  conditions  are  met. The answers are returned to query
control  where  answers  from  appropriate  data  bases  are  retrieved and
forwarded to the user along with the substance search answers.
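
  The staged narrowing of a typical search can be sketched as a chain of
filters. The record layout and the three toy predicates are hypothetical
stand-ins for screening, group-by-group matching, and atom-by-atom matching;
they are not the matching procedures themselves.

```python
def search(query, file_records, stages):
    """Pass candidates through each (name, predicate) stage in order."""
    candidates = list(file_records)
    trace = []
    for name, keep in stages:
        candidates = [rec for rec in candidates if keep(query, rec)]
        trace.append((name, len(candidates)))
    return candidates, trace


STAGES = [
    ("screening", lambda q, r: q["bits"] <= r["bits"]),
    ("group-by-group", lambda q, r: q["groups"] <= r["groups"]),
    ("atom-by-atom", lambda q, r: q["atoms"] == r["atoms"]),
]

records = [
    {"id": "A", "bits": {1, 2, 11, 13}, "groups": {"Ak", "Ft"}, "atoms": "CCCO"},
    {"id": "B", "bits": {1, 2, 11, 13}, "groups": {"Ak", "Cy"}, "atoms": "CCC"},
    {"id": "C", "bits": {1, 11}, "groups": {"Ak", "Ft"}, "atoms": "CCO"},
]
query = {"bits": {1, 2, 11}, "groups": {"Ak", "Ft"}, "atoms": "CCCO"}

answers, trace = search(query, records, STAGES)
print([rec["id"] for rec in answers], trace)
```

Ordering the stages from cheapest to most expensive means the slow
atom-by-atom comparison is applied only to the few records that survive
screening and group-by-group matching.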
 
                              APPLICABILITY
 
  This invention provides a powerful tool for Markush structure storage and
searching.  It  is designed to achieve total recall and very high precision
for  the  searching  of  Markush-type  queries  against  a  file of generic
substances  of  the type found in Markush patent claims. The query and file
representations  handled  range from specific structures to those with high
variability  and  generic  features, including those that consist solely of
generic  features.  Both full-structure and substructure searching, at both
the  specific-atom  and  generic-group  level afford a wide range of search
capability.
 
  A   single   multiple   connectivity  node  structure  for  specific-atom
representations   and  an  associated  single  multiple  connectivity  node
structure  for  generic-group representations afford a manageable file size
for   the   Markush   formulations.   An   indexing  system  for  both  the
specific-atom  and  generic-group  multiple  connectivity  node  structures
assures the generation and processing of all implicit structures within the
generic-group  and  specific-atom  multiple connectivity node structures as
well as providing for a means of dealing with highly complex structures.
 
  A  hierarchical  generic-group classification scheme provides a means for
the  input  of  generic  features  for  both  query and file substances and,
more importantly, provides a controlled language for the fail-safe matching
of  generic  groups  against  generic  groups  and real-atom groups against
generic  groups. This is accomplished by mapping all groups, both real-atom
and  generic  groups  in  the query and file substances, to a common set of
generic  groups  as defined by the hierarchy which can be dealt with by the
search routines, especially the initial screening step.
 
  A  screening  technique  is  used to reduce the number of irrelevant file
answers  for  a  particular  query  on  a  rapid and "fail-safe" basis. The
fail-safe result is achieved through a built-in handling of the generic vs.
generic  and  real-atom  vs. generic matching via the common set of generic
groups  as  defined by the hierarchy. Real-atom screens are used to improve
precision  in  the screening process and, in order to preserve a high level
of recall, alternative Boolean diagnostic screens are provided in the query
screen  logic  expression  for  use  when certain real-atom screens are not
appropriate.
 
  High  precision  is  achieved  by matching file and query substances on a
group-by-group  and  atom-by-atom  basis. This type of matching establishes
the  required connectivity and arrangement of matching fragments lacking in
presently  available  systems. Attribute searching provides added precision
to  the  group-by-group  matching  by requiring that the attributes of two matching
groups  also match. In atom-by-atom matching, a roll-back technique is used
to allow for the matching of a generic feature expressed as a generic group
in  the  query or file substance to match against a corresponding real-atom
group in the query or file substance.
 
  While  the  forms  of the invention herein disclosed constitute presently
preferred  embodiments, many others are possible. It is not intended herein
to  mention  all  of  the possible equivalent forms or ramifications of the
invention.  It  is  to  be understood that the terms used herein are merely
descriptive  rather  than  limiting,  and  that various changes may be made
without departing from the spirit or scope of the invention.
 
 
  2/2,EM,SU/2     (Item 1 from file: 654) 
 
             02999303
Utility
RELATIONAL  DATABASE  MANAGEMENT  SYSTEM  FOR  CHEMICAL  STRUCTURE  STORAGE,
SEARCHING AND RETRIEVAL
 
PATENT NO.:  5,950,192
ISSUED:      September 07, 1999 (19990907)
INVENTOR(s): Moore, Jeffrey, Timonium, MD (Maryland), US (United States of
             America)
             Brazil, Joanne, White Hall, MD (Maryland), US (United States
             of America)
             Hoover, Jeffrey R., Baltimore, MD (Maryland), US (United
             States of America)
ASSIGNEE(s): Oxford Molecular Group, Inc , (A U.S. Company or Corporation),
             Towson, MD (Maryland), US (United States of America)
APPL. NO.:   8-883,165
FILED:       June 26, 1997 (19970626)
 
  This  application  is  a continuation of application Ser. No. 08-715,708,
filed  Sep. 19, 1996, now abandoned, which is a continuation application of
Ser.  No. 08-288,503, filed Aug. 10, 1994, now U.S. Pat. No. 5,577,239, the
entire disclosure of which is incorporated herein by reference.
 
U.S. CLASS:  707-3 cross ref: 702-27
INTL CLASS:  [6] G06F 17-30
FIELD OF SEARCH: 395-496; 395-497; 395-499; 395-600; 395-603; 707-3; 707-2;
             707-1; 707-100; 707-102; 707-104; 707-22; 707-27; 707-19;
             707-20; 702-22; 702-27; 702-19; 702-20
 
                             References Cited
 
                          U.S. PATENT DOCUMENTS
 
    4,642,762    2/1987   Fisanick                                 707-3
    4,811,217    3/1989   Tokizane et al.                        364-300
    4,855,931    8/1989   Saunders                               364-499
    5,025,388    6/1991   Cramer, III et al.                     364-496
    5,056,035   10/1991   Fujita                                 364-497
    5,259,137   11/1993   Wilson et al.                          364-496
    5,367,058   11/1994   Pitner et al.                        530-391.9
    5,379,234    1/1995   Wilson et al.                          364-496
    5,386,507    1/1995   Teig et al.                            395-161
    5,418,944    5/1995   DiPace et al.                          395-600
    5,463,564   10/1995   Agrafiotis et al.                      364-496
    5,577,239   11/1996   Moore et al.                           395-603
 
                         NON-U.S. PATENT DOCUMENTS
 
    090 895 A2   10/1983   EP (European Patent Office)
    213 483 A2    3/1987   EP (European Patent Office)
 
                             OTHER REFERENCES
 
 
Viking Instruments Corp. (Hewlett Packard); SpectraTrak Transportable GC/MS
Systems; (brochure)-No Date.
 
Chemical Structures, The International Language of Chemistry; Wendy A. Warr
(Ed.); "Interfacing DARC--Oracle" AJCM (Juus) de Jong (1988).
 
J.  Chem.  Inf.  Comput.  Sci.  (1983)  , vol. 23, No. 3; pp. 102-108; DARC
Substructure  Search  System: A New Approach to Chemical Information; Roger
Attias.
 
J.  Chem. Inf. Comput. Sci. (1987), vol. 27, No. 2; pp. 74-82; DARC System:
Notions  of Defined and Generic Substructures. Filiation and Coding of FREL
Substructure (SS) Classes; Jacques-Emile Dubois et al.
 
J.   Chem.  Inf.  Comput.  Sci.  (1990),  vol.  30,  No.  2;  pp.  191-199,
Substructure  Search Systems. 1. Performance Comparison of the MACCS, DARC,
HTSS,  and CAS Registry MVSSS, and S4 Substructure Search System; Martin G.
Hicks & Clemens.
 
J. Chem. Inf. Comput. Sci. (1988), vol. 28, No. 4; pp. 221-226; An
Efficient Graph Approach to Matching Chemical Structures, O. Owolabi.
 
J.  Chem.  Inf. Comput. Sci. (1990), vol. 30, No. 4; pp. 332-339; Reactions
in the Beilstein Information System: Nonaporic Organic Synthesis; Martin G.
Hicks.
 
Analytica  Chimica Acta, 235 (1990), pp. 87-92; Substructure Search Systems
for Large Chemical Data Bases; Martin G. Hicks et al.
 
J.  Chem.  Inf.  Comput.  Sci.  (1991),  vol.  31,  No. 2; pp. 320-326; The
Beilstein Structure Registry System. 1. General Design; Laszlo Domokos.
 
J. Chem. Inf. Comput. Sci. (1989), vol. 29, No. 4; pp. 255-260; 3DSearch; A
System for Three-Dimensional Substructure Searching; Robert P. Sheridan, et
al.
 
Substructure  Searches  of  Chemical  Structure  Files;  (Jan.  23,  1973);
Strategic   Considerations   in  the  Design  of  a  Screening  System  for
Substructure  Searches  of  Chemical Structure Files; George W. Adamson, et
al.
 
Chemical  Structure  Searching;  (Jan.  21,  1975); An Efficient Design for
Chemical Structure Searching. I. The Screens; Alfred Feldman et al.
 
J.  Chem.  Inf.  Comput.  Sci. (1982), vol. No. 4; The Third BASIC Fragment
Search Dictionary; W. Graf, H. K. Kaindl, et al.
 
J.  Chem.  Inf.  Comput. Sci. (1983), vol. 23, No. 3; The CAS Online Search
System.  1.  General  System  Design  and Selection, Generation, and Use of
Search Screens; P. G. Dittmar, et al.
 
Computer Chemical, (1991), vol. 15, No. 2, pp. 103-107; A Central Atom
Based Algorithm and Computer Program for Substructure Search; Alf Dengler
and Ivar Ugi.
 
J. Chem. Inf. Comput. Sci. (1993), vol. 33, No. 4; pp. 545-547; Structure
Searching in Chemical Databases by Direct Lookup Methods; Bradley D.
Christie et al.
 
 
PRIMARY EXAMINER: Von Buhr, Maria N.
ATTORNEY, AGENT, OR FIRM: Dickstein Shapiro Morin & Oshinsky
CLAIMS:           14
EXEMPLARY CLAIM:  1
DRAWING PAGES:    7
DRAWING FIGURES:  12
ART UNIT:         277
FULL TEXT:        798 lines
 
 
                         FIELD OF THE INVENTION
 
  The  present invention relates to a relational database management system
that  stores, searches and retrieves chemical structure information quickly
and easily.
 
                       BACKGROUND OF THE INVENTION
 
  Chemical  and  pharmaceutical  industries and chemical-related government
agencies  commonly  maintain  large  chemical  substance  databases.  These
entities often provide structure-searching capabilities in association with
such databases. Recently, these organizations have been standardizing their
databases  using relational database management systems (RDBMS) such as the
Oracle  Relational  Database Management System by Oracle Corporation, World
Headquarters, 500 Oracle Pkwy., Redwood Shores, Calif. 94065.
 
  The  advantages  of  integrating  chemical  structure information into an
RDBMS  include:  a  closer  integration  with  other related chemical data,
efficiency  in  both  storage and retrieval of chemical structure data, and
better access to the chemical structure data by other related applications.
 
  Unfortunately, chemical information systems have traditionally been built
using specialized database technology requiring, in many cases, hundreds of
thousands  of lines of custom computer code. Systems of this type are often
both  difficult  to  maintain,  and difficult to adapt to changing hardware
technologies.   These   maintenance   problems,  coupled  with  a  lack  of
portability  of  these  highly  specialized  systems,  often  lead to large
investments  of  time  and  money being allocated to relatively short-lived
systems.
 
  The   introduction   of   relational   database  technology  provides  an
opportunity   to  transfer  a  large  amount  of  the  database  management
responsibility  from  the specialized database systems described above to a
standard  widely-accepted  technology.  However,  relational technology has
typically not been used as the basis for chemical information systems. This
is  due to the fact that there are problems inherent in any attempt to cast
a  chemical  structure  searching  system  problem  into  structured  query
language  (SQL)--the  standard  language  of  relational  databases.  These
problems include difficulty in storing and representing chemical structures
in  a  database.  No  chemical  information system has yet been implemented
using only relational technology as its database component.
 
  Several  systems  have  attempted to achieve this goal but, as more fully
explained  below,  none  have  been  able  to  develop  a purely relational
database  management  system  which is able to search and retrieve chemical
structure information easily and quickly.
 
  For  example,  Molecular  Access System (MACCS) and Integrated Scientific
Information  System  (ISIS)  are both created by Molecular Design Ltd., MDL
Information  Systems,  14600  Catalina  Street,  San Leandro, Calif. 94577.
These  systems  provide  a  stand-alone chemical information system wherein
chemical  structures  are stored as hierarchical structures. However, these
systems  require  large amounts of custom code, and are not maintained in a
relational  database.  Accordingly,  they  do  not  have  the advantages of
relational technology listed above.
 
  While  it  is  true  that these systems can be interfaced to a relational
database  management  system  such as the Oracle Database Management System
noted above, it must be done using additional custom code and software that
converts hierarchical structures to the relational tables needed for such a
database.  Therefore,  it  is  difficult  to  incorporate the advantages of
relational  technology  into  the  MACCS  and  ISIS  systems. Moreover, the
conversion software slows down overall performance speed.
 
  In  summary, these systems do not provide the advantages and capabilities
existing in the present invention.
 
  The   present   invention   overcomes   the   above-listed  problems  and
additionally has the following advantages:
 
  (1) development and maintenance costs will be greatly reduced by using a
commercial database package. Accordingly, development efforts and benefits
can be more effectively directed toward aspects of system design, and
improvements in the underlying database technology will be automatically
transferred to the chemical information system. This shift of focus away
from database development concentrates the development and maintenance
efforts on improving the search strategy and the user interface, which are
the highly visible aspects of the system;
 
  (2) interfacing with other information systems will be simplified since
relational databases are already used to store much of the non-structural
chemical data used in research and commercial settings; and
 
  (3) portability will be much less of a design drawback since the amount
of custom programming is minimal and can easily be adapted to numerous
types of technology. Therefore, the portability responsibilities are mostly
shouldered by the database manufacturer itself, and not by the developer of
the chemical storage system.
 
                        SUMMARY OF THE INVENTION
 
  The present invention overcomes the shortfalls in the art by developing a
chemical structure search system which expands the capabilities of existing
systems by capitalizing on the strengths of relational database technology.
 
  The  present  invention allows the user to optimally store and search the
chemical  structure  information  using  various  search strategies such as
multi-valued  atoms, multi-typed bonds, Markush searching and various other
options in a relational database management system.
 
  Furthermore,  it  provides  a  complete chemical information system which
includes modules for:
 
  (1) exact structure searching;
 
  (2) substructure searching;
 
  (3) key searching;
 
  (4) chemical name searching;
 
  (5) molecular formula searching;
 
  (6) registration of new molecules;
 
  (7) structure import/export; and
 
  (8) data editing.
 
  Additionally,  the  present  invention  allows the routine integration of
chemical  structure  data with other related information such as inventory,
spectroscopic  data  and  clinical  data  via  standard relational database
methods  to allow better usage of all types of chemical information in both
commercial and research settings.
 
  By  taking  advantage of the data manipulation capabilities of relational
technology,  this  system will also introduce dynamic querying capabilities
which  will  allow  the  user  to be notified of any new chemicals that are
entered  into  the  database that are responsive to previously run queries.
This  provides the functionality of relational views for chemical structure
information.
 
  Additionally,  structure  classes can also be implemented which allow the
user  to  store  certain  types  of  information  about particular types of
chemical  structures such as steroids. Accordingly, users can later call up
this  information  in  a  quick and efficient manner without re-entering or
performing previously run queries.
 
  With  these  and  other objects, advantages and features of the invention
that  may  become apparent, the nature of the invention may be more clearly
understood  by  reference  to  the  following  detailed  description of the
invention, the appended claims and the several drawings attached hereto.
 
            DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
 
  The   present   invention  makes  use  of  standard  relational  database
technology  such  as  that  found in the commercial product Oracle which is
marketed  by  Oracle  Corporation  as  noted  above.  All references to the
retrieval  and storage of information will be done in a standard relational
database,  and  will  use  standard  procedures  for  doing  so,  including
structured  query  language (SQL) commands. The operations and functions of
relational databases discussed in this patent application are well known to
those  of ordinary skill in the database management field. Those operations
and  functions  can be found in numerous texts, including Oracle users' and
developers' manuals.
 
I. Hardware
 
  Referring  now  to  FIG.  9,  the  preferred embodiment of the relational
database  management system for chemical structure storage and retrieval is
shown.  A  typical computer workstation 1 will contain a central processing
unit  (CPU)  2,  and main memory 3, and can be coupled to storage devices 4
such  as magnetic disks, an input device such as a keyboard 5 or mouse, and
output devices such as a computer monitor screen 6 and a printer 7. One or
more such storage devices may be utilized.
 
  The preferred embodiment of the relational database management system for
storing,   searching   and   retrieving   chemical   structure  utilizes  a
microprocessor,  such  as  a  Microvax  3100 model 900 operating with a VMS
5.5-2  operating  system  with  at least two gigabytes of disk space and at
least  32  megabytes of RAM. The system can be provided with more memory to
speed  up  throughput  access  rates.  The  system could also be optionally
coupled   to   a   local   area   network  (LAN)  or  other  communications
architecture/environment  in order to link with other computer workstations
and have access to data from other systems.
 
II. Relational Database Interface
 
  As noted above, one of the advantages of using relational databases for a
chemical  structure  search  system is that there is no need on the part of
the  developers  to  be  concerned  with  portability, since the relational
database is a standard unto itself, that requires no special interface from
one type of system to the next.
 
  In the present invention, the use of a standard relational database such
as the Oracle Relational Database Management System minimizes the
portability issues since such databases are available on virtually every
platform. Additionally, the present invention maintains this degree of
portability since it uses standard C with embedded SQL.
 
III. Registering New Structures
 
  As shown in FIG. 6, to register a new structure in the database, the user
simply  enters  the  atoms and bonds that make up the chemical structure by
typing  the  appropriate  keys,  or  selecting the appropriate choices from
menus  22  in  a standard chemical drawing software package such as Kekule,
marketed by PSI INTERNATIONAL, 810 Gleneagles Court, Suite 300, Towson, Md.
21286.
 
  For  each  new  structure  that  is  registered  (added) in the database,
several  steps  must  occur: (a) a connection table must be constructed and
stored  in  the database 24; (b) the system verifies that this structure is
not  a duplicate 26; (c) at least one search key must be created and stored
in  the database 28, and (d) information such as name, formula and registry
key number must be stored in the database 30. Each of these procedures will
be fully described below.
 
  a. Construction of a Connection Table
 
  For  each  structure to be registered in the database, a connection table
is constructed at step 24. This table stores information about each atom in
the  structure  including  its  atomic  number,  the identity of all of the
connected atoms, and the type of bond to each of these connected atoms. For
example,  the connection table for a chemical structure to be added such as
structure S2 as depicted in FIG. 2 is shown in FIG. 3.
 
  The  table  depicts  the  types  of links that are stored between any two
given atoms in a structure. A single bond between two atoms is denoted by a
"1" while a double bond is denoted by a "2" and a triple bond is denoted by
a "3." This table is stored in a relational table along with its associated
registry  number  in  a compressed sparse matrix form. The connection table
will  be  used  for  the  Atom  by  Atom  Matching  (ABAM) process which is
described more fully below.
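
  The connection table described above can be sketched as follows. This is
only an illustration: the atom numbering, the registry number 1001, and the
row layout are assumptions, and structure S2 is taken to be Cl-C-C(=O)-C-C as
implied by the search keys listed for it; FIG. 3 may arrange the table
differently.

```python
# Structure S2 as implied by its search keys: Cl-C-C(=O)-C-C.
# Atom numbering is illustrative only; FIG. 3 may number the atoms differently.
ATOMIC_NUMBERS = {"C": 6, "O": 8, "Cl": 17}

atoms = ["Cl", "C", "C", "O", "C", "C"]      # list index = atom number
bonds = [(0, 1, 1), (1, 2, 1), (2, 3, 2), (2, 4, 1), (4, 5, 1)]  # (i, j, bond order)


def connection_table(atoms, bonds):
    """Return {atom index: (atomic number, {neighbor index: bond order})}."""
    table = {i: (ATOMIC_NUMBERS[sym], {}) for i, sym in enumerate(atoms)}
    for i, j, order in bonds:
        table[i][1][j] = order   # store each bond on both of its atoms
        table[j][1][i] = order
    return table


def sparse_rows(registry_no, bonds):
    """Flatten to (registry_no, atom_i, atom_j, bond_order) rows, one per bond."""
    return [(registry_no, min(i, j), max(i, j), order) for i, j, order in bonds]


table = connection_table(atoms, bonds)
rows = sparse_rows(1001, bonds)   # 1001 is a hypothetical registry number
```

  Only the non-zero entries of the atom-by-atom matrix are kept, which is one
simple reading of "compressed sparse matrix form"; each row could be stored
in a relational table keyed by the registry number.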
 
  b. Search for Duplicates
 
  The  system  then  searches  the  existing  structures  to verify that no
duplicates  exist  in the database at step 26. If the structure has already
been entered into the system, it will not be entered again.
 
  c. Creation of Search Keys
 
  When  a  new  structure  of  N  atoms  is registered in the system, it is
necessary  to  construct N search keys for the structure. These search keys
are stored as data in the relational database. Each search key results from
a  unique  numbering  of  the  atoms with a different atom representing the
starting  point  of  the key. In effect, N different search keys are stored
for each structure of N atoms.
 
  To create effective search keys, it is necessary to derive an unambiguous
string of characters for each atom in the structure or query. The string is
a  representation  of  the  atomic  environment  of  the starting atom. The
ordering  of  characters  in  the string cannot be sensitive to deletion of
portions  of  the structure or query. That is, deletion can remove portions
of  the  string  (with  subsequent replacement by wildcards in a query) but
cannot cause reordering of the remaining characters in the string. One such
algorithm, which builds the string by adding connectivity information using
a  breadth-first  graph  traversal  is  detailed  in FIG. 7, and is used in
subsequent examples.
 
  As shown in FIG. 7, for each starting atom, the following process is used
to generate a search key. The process starts at step 40, and at step 41 all
atoms  are marked as "unranked" and "unused". Next, at step 42 the starting
atom  is  marked  as "used" and added to the key. Additionally, at step 43,
the starting atom is marked as the current atom.
 
  So, for example, when reviewing structure S1 in FIG. 1, and starting with
the Bromine (Br) atom, the search key string would begin with "Br". For
clarity, the first code in the key is shown as the atomic symbol; in
practice, a one byte code is used for this purpose. Additionally, the
Bromine (Br) atom would be marked as "used" and set to the current atom.
 
  At  step  44,  any  unused  neighbors  are examined. In this example, the
Carbon (C sub 1) atom would be unused and, accordingly, the system would
advance to step 45, where unused neighbors are ordered. Because there is
only one neighbor in this portion of the structure, and the ordering is not
terminated  by  an  open  site at step 46, and there is no open site at the
current  atom  at  step 48, the system advances to step 49, where codes for
the neighbors, in order, are added to the key and marked as "used".
 
  In  this example, the letter "c" will be added to the key to indicate the
single  bond  to  the  Carbon (C sub 1) atom, and the Carbon (C sub 1) atom
will be marked as "used". The system next adds an end-of-atom marker to the
key at step 51. The key now reads "Br c .". The current atom (Br) is marked
as "ranked" at step 52.
 
  The  system next verifies that ordering was not terminated, and there was
no  open  site at step 53. The process continues at step 54 by examining if
any atoms in the key are unranked. In this example, the Carbon (C sub 1) is
unranked.  Because  the  key  is  not  too  long  (i.e.,  not longer than a
predefined  length) at step 55, the Carbon (C sub 1) is chosen as the first
unranked  atom  in  step 56. The process repeats itself starting at step 43
with the Carbon (C sub 1) atom as the current atom.
 
  Now,  the  Carbon (C sub 1) is marked as the current atom, and the unused
neighbors  are examined at step 44. Once again, there is only a single bond
to  a  Carbon  (C  sub  2)  atom,  and therefore the ordering at step 45 is
unnecessary.  At  step  49,  a  "c" is added to the key, and at step 51 the
end-of-atom marker is added to the key. Accordingly the key now reads "Br c
. c .".
 
  This  Carbon  (C  sub  1) atom is now marked as "ranked" at step 52. Once
again, the unranked atoms in the key are examined at step 54, and the first
unranked atom in the key is chosen at step 56. This Carbon (C sub 2) is now
set  as  the current atom at step 43, and the unused neighbors are examined
at  step 44. There are two unused neighbors: a double bond to an Oxygen (O)
denoted  by "e", and a single bond to a Carbon (C sub 3) denoted by "c". At
step  45,  these  bonds  are  ordered, with "c" taking precedence over "e".
These codes are then added to the key in order, and the atoms are marked as
"used"  at step 49. Next, an end-of-atom marker is added to the key at step
51. The key now reads "Br c . c . c e .". The Carbon (C sub 2) atom is
marked as "ranked", and the unranked atoms (the Oxygen and Carbon (C sub
3) atom) are examined at step 54.
 
  At  step  56, the first unranked atom in the key (C sub 3) is chosen, and
at  step  43,  it  is  set  as  the  current atom. The process continues by
examining  the  single  bond  to  the  Carbon (C sub 4) emanating from this
Carbon (C sub 3), and accordingly "c" and "." are added to the key. The key
now reads "Br c . c . c e . c .".
 
  The  next  unranked atom (Oxygen) is then set as the current atom at step
43.  Given  that  there  are no unused neighbors at step 44, an end-of-atom
marker is simply added to the key at step 51. The process once again
repeats  with  C  sub  4  as  the current atom. Because there are no unused
neighbors,  "."  is  appended to the string, and the process stops for this
starting  atom.  The final search key would be: "Br c . c . c e . c . . .".
The  same process is repeated for every starting atom in a given structure.
This  process is also shown, step-by-step, in FIG. 12. (Steps 57, 58 and 59
in FIG. 7 will be explained in Section V. below.)
 
  Utilizing  this  key  generation algorithm (FIG. 7) with the illustrative
bonded  atom  codes  shown  in  FIG. 4, the keys generated for structure S1
(FIG. 1) are:
 
  1) Br c . c . c e . c . . .
 
  2) C b c . . c e . c . . .
 
  3) C c c e . b . c . . . .
 
  4) O d . c c . b . c . . .
 
  5) C c c . . c e . b . . .
 
  6) C c . c . c e . b . . .
 
  For structure S2 (FIG. 2), the keys generated are:
 
  1) Cl c . c . c e . c . . .
 
  2) C a c . . c e . c . . .
 
  3) C c c e . a . c . . . .
 
  4) O d . c c . a . c . . .
 
  5) C c c . . c e . a . . .
 
  6) C c . c . c e . a . . .
 
  These  search  keys  are  stored in the database with associated registry
numbers  which  correspond to the registry numbers of the connection tables
and  associated  information.  Keys that are duplicated due to symmetry are
eliminated at registration time.
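
  The breadth-first key generation walked through above can be sketched as
follows. This is a minimal illustration and not the patented implementation.
The code letters are inferred from the worked keys (FIG. 4 is not reproduced
here), the structure S1 is taken to be Br-C-C(=O)-C-C, and the rule used to
order two neighbors that share the same code (comparing the codes one bond
further out) is an assumption that happens to reproduce the keys listed for
S1. The maximum key length check of step 55 is omitted.

```python
# Bonded-atom codes inferred from the worked example: (element, bond order) -> code.
CODES = {("Cl", 1): "a", ("Br", 1): "b", ("C", 1): "c", ("C", 2): "d", ("O", 2): "e"}

S1_ATOMS = ["Br", "C", "C", "O", "C", "C"]   # Br-C1-C2(=O)-C3-C4
S1_BONDS = [(0, 1, 1), (1, 2, 1), (2, 3, 2), (2, 4, 1), (4, 5, 1)]


def adjacency(atoms, bonds):
    adj = {i: {} for i in range(len(atoms))}
    for i, j, order in bonds:
        adj[i][j] = order
        adj[j][i] = order
    return adj


def search_key(atoms, bonds, start):
    """Build one search key by breadth-first traversal from `start`."""
    adj = adjacency(atoms, bonds)

    def code(nbr, order):
        return CODES[(atoms[nbr], order)]

    def rank(cur, nbr):
        # Primary order: the neighbor's own code. Tie-break (an assumption):
        # the sorted codes of the neighbor's other neighbors.
        ahead = sorted(code(n, o) for n, o in adj[nbr].items() if n != cur)
        return (code(nbr, adj[cur][nbr]), ahead)

    key = [atoms[start]]
    used = {start}
    queue = [start]              # atoms in the order they entered the key
    pos = 0
    while pos < len(queue):      # the first unranked atom becomes the current atom
        cur = queue[pos]
        pos += 1
        unused = sorted((n for n in adj[cur] if n not in used),
                        key=lambda n: rank(cur, n))
        for n in unused:
            key.append(code(n, adj[cur][n]))
            used.add(n)
            queue.append(n)
        key.append(".")          # end-of-atom marker
    return " ".join(key)


s1_keys = [search_key(S1_ATOMS, S1_BONDS, start) for start in range(len(S1_ATOMS))]
```

  Under these assumptions, `s1_keys` holds one key per starting atom, in the
atom order Br, C1, C2, O, C3, C4, and each key carries one end-of-atom marker
per atom of the structure.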
 
  The  steps required for processing a search are unaffected by the details
of  the  search key generation process. That is, any key generation process
which  satisfies  the conditions set forth in the opening paragraph of this
section (generation of an unambiguous character string for each atom in the
structure,  etc.) can be utilized without modification of the search engine
software.
 
  An  additional  process,  which  builds  the string by listing structural
features found at each graph theoretical distance (level) from the starting
atom is detailed below.
 
  As  shown in FIG. 11, and using structure S1, the following process could
also be used to generate search strings. Beginning at step 80, all bonds in
a  given  structure are marked as "untraversed". A starting atom is chosen,
which  in  this case is the Bromine (Br) atom, at step 81, and added to the
key at step 82.
 
  Because  there  are  no open sites on this atom, the process continues at
step 85, where the system examines if any untraversed bonds to atoms at the
next level exist. In this case, there is an untraversed path to a Carbon (C
sub  1)  atom. The system next determines that there is no open site at any
atom  at  this  level  at  step  88  and,  if not, continues at step 90, by
ordering all untraversed bonds to all atoms at the next level.
 
  Because  the  wildcard flag has not been determined as having been set at
step  91, the system continues by adding the codes for the ordered paths to
the  key  at  step  93.  Then,  at step 94, the system adds an end-of-level
marker  to  the key. Accordingly, the key now reads "Br c .", and all bonds
that are included in the key are marked as "traversed".
 
  The  system  moves on to the next level with the Carbon (C sub 1) atom at
step 96. Once again, the system determines that there are untraversed paths
to  atoms  at the next level at step 85, and ultimately adds "c" and "." to
the key, at steps 93 and 94, respectively, after going through steps 88-92.
 
  The  system  then marks these bonds as "traversed", and moves to the next
level  beginning  with  the  Carbon (C sub 2) atom. Once again, untraversed
paths  are  found at step 85, namely a double bond to an Oxygen (O) denoted
by "e", and a single bond to a Carbon (C sub 3) denoted by "c". These codes
are ordered at step 90, and are added to the key with "c" taking precedence
over  "e"  at step 93. Again, an end-of-level marker is added to the key at
step 94. The string now reads "Br c . c . c e .".
 
  Next,  the  system  advances  to  the  next  level in this structure, and
repeats the above process with the Carbon (C sub 3) atom. Once again "c"
and "." are added to the string.
 
  Finally,  the  system  moves  to the next level at step 96 using the last
Carbon  (C sub 4) atom. At step 85, the system determines that there are no
untraversed  paths  to  atoms  at the next level, and stops at step 87. The
final key reads: "Br c . c . c e . c ."
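
  The level-by-level alternative can be sketched in the same spirit. The
code letters and the structure S1 (Br-C-C(=O)-C-C) are inferred from the
worked example, and the handling of bonds between two atoms at the same
level (for example, some ring closures) is simply left out, so this is an
illustrative sketch under those assumptions only.

```python
# Bonded-atom codes inferred from the worked example: (element, bond order) -> code.
CODES = {("Br", 1): "b", ("C", 1): "c", ("C", 2): "d", ("O", 2): "e"}

S1_ATOMS = ["Br", "C", "C", "O", "C", "C"]   # Br-C1-C2(=O)-C3-C4
S1_BONDS = [(0, 1, 1), (1, 2, 1), (2, 3, 2), (2, 4, 1), (4, 5, 1)]


def level_key(atoms, bonds, start):
    """Build a key listing the bond codes found at each distance from `start`."""
    adj = {i: {} for i in range(len(atoms))}
    for i, j, order in bonds:
        adj[i][j] = order
        adj[j][i] = order

    key = [atoms[start]]
    seen = {start}
    level = [start]
    while True:
        # Untraversed bonds that lead from the current level to the next one.
        paths = [(CODES[(atoms[n], o)], a, n)
                 for a in level for n, o in adj[a].items() if n not in seen]
        if not paths:
            break                      # no further level: stop
        paths.sort()                   # order the paths by code
        key.extend(code for code, _, _ in paths)
        key.append(".")                # end-of-level marker
        level = []
        for _, _, n in paths:
            if n not in seen:
                seen.add(n)
                level.append(n)
    return " ".join(key)


key_from_br = level_key(S1_ATOMS, S1_BONDS, 0)
```

  Because the marker here closes a level rather than an atom, the key is
shorter than the breadth-first key for the same starting atom.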
 
  d. Associated Information Storage
 
  Additional  information  about  each  structure can also be stored in the
database,  such  as registry key or other unique identifier of a structure,
name,  and  formula. The user may also define any additional information to
store and search using standard RDBMS technology.
 
IV. Implementation Issues
 
  Each  of  the  fragment  codes  comprising the search keys can be made to
occupy  a single byte in the database. There are approximately 313 of these
fragment  types  existing  in a large sample of structures. One byte allows
256  possibilities,  three  of which cannot be used. (Byte 0 cannot be used
due  to its importance in programming, and two bytes used in the relational
database management system for its wildcard operation cannot be used, since
it  is normally difficult to search these characters and use the SQL "Like"
operator in the same statement).
 
  The  remaining  bytes  can  be  divided  into  three  groups:  (1)  those
representing  the most common fragments, (2) those representing atoms whose
presence  alone  (regardless  of  bonding)  is an effective screen, and (3)
those  very  rare  atoms  that  can be grouped together; accordingly, every
search for these atoms is essentially a multi-valued search.
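
  The byte budget behind this grouping can be checked with a few lines of
arithmetic. The sketch assumes that the two reserved wildcard bytes are the
standard SQL "Like" wildcards "%" and "_"; the figure of 313 fragment types
is the approximate count quoted above.

```python
# Bytes that cannot serve as fragment codes: NUL plus the two LIKE wildcards
# (assumed here to be "%" and "_", the standard SQL wildcard characters).
RESERVED = {0, ord("%"), ord("_")}

usable = [b for b in range(256) if b not in RESERVED]

FRAGMENT_TYPES = 313                      # approximate count quoted in the text
shortfall = FRAGMENT_TYPES - len(usable)  # fragment types without a byte of their own
```

  With 253 usable byte values and roughly 313 fragment types, about 60 types
cannot receive a code of their own, which is why the rare atoms are grouped
under shared codes.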
 
V. Processing of Query Substructure Searches
 
  Performing  a  query  against  search keys is a relatively simple matter.
Each  query  structure  generates  one  search  key  for  each  atom in the
structure  that  could be assigned an unambiguous fragment code. The search
keys  of  the  query  structure  are generated by applying exactly the same
rules  to the query structure as those used to generate the database search
key defined in III.c. above, with the only exception being treatment of
wildcards.
 
  When  a  wildcard  (i.e., a site in which no particular atom is necessary
for  the  search) is encountered, the process must either stop, or continue
the  query by identifying all possibilities for the value of that wildcard.
Additionally,   queries  can  easily  accommodate  multi-valued  atoms  and
multi-typed  bonds  or  Markush searches in the same way that wildcards are
handled.   The  advantage  of  this  methodology  over  standard  screening
techniques is that it reflects the specificity of the query. For moderately
specific  queries (i.e., few wild-cards), it enables remarkable selectivity
because of the length of the key.
 
  As  demonstrated  with  the generation of search keys above, the basis of
the  search  process  is  that  the  search  keys  are  generated using one
unambiguous  set  of rules. So long as these rules are applied to the query
structure  in  exactly  the same fashion, and since each database structure
has  a  key  originating  from  each  of  its  atoms,  the  results will be
standardized.
 
  In  order  for the database structure to match the query structure, every
search  key  generated  for  the  query  must match one or more search keys
generated for the database structure. Additionally, if any query search key
fails to retrieve the structure, the query cannot be a substructure of that
structure. These rules make it possible to perform this extremely selective
screening process in a relational database with a single SELECT statement.
 
  To  generate  a query, the user types in a structure in the same way that
new  structures  are  entered,  and  can indicate where there are wildcards
(i.e.,   no  particular  atom  necessary)  and  where  there  are  multiple
acceptable  types  of  atoms  or bonds by indicating the specific atoms and
bonds that are acceptable in any given position.
 
  As  noted  above,  the query keys are generated in the same manner as the
search  keys for the structures. Accordingly, any acceptable key generation
process  may be used. Therefore, when generating a query key for Q1 in FIG.
5, and using the process shown in FIG. 7, the following steps occur.
 
  The  process  begins  at step 40, and at step 41, all atoms are marked as
"unranked"  and  "unused".  A starting atom is chosen and marked as "used",
and  the  atom  code  is  added to the key at step 42. In this example, the
Bromine (Br) atom will be the starting atom.
 
  At  step  43, Bromine (Br) is marked as the current atom, and at step 44,
it  is determined that there are unused neighbors. The process continues at
step 45, with all unused neighbors being ordered.
 
  Because  there are no open sites at step 46, and because there is no open
site  at the current atom at step 48, the codes for the neighbors are added
to  the  key  in  order  and  are  marked as "used" at step 49. The process
continues with an end-of-atom marker being added to the key at step 51. The
key  now  currently  reads "Br c .", and the Bromine (Br) atom is marked as
"ranked" at step 52.
 
  Because  the  ordering  was not terminated, and no open site was found at
the  current atom at step 53, the process continues at step 54 by reviewing
the  unranked atoms represented in the key. The system next verifies that a
maximum  number  of atoms has not been reached (the length of the query key
does  not  exceed  a  predetermined maximum length), and the first unranked
atom (C sub 1) is chosen at step 56.
 
  The process continues at step 43 with the Carbon (C sub 1) atom being set
as a current atom. Again, all unused neighbors are examined at step 44, and
ordered  at  step  45. Because the system has still not encountered an open
site,  the code for this bond (single bond to a Carbon) is added to the key
at  step  49,  with the end of atom marker added to the key at step 51. The
query key now reads "Br c . c .", and C sub 1 is marked as "ranked".
 
  The  process  next repeats itself with C sub 2 as the first unranked atom
in  the  key at step 56. Accordingly, C sub 2 is marked as the current atom
at  step  43, and at step 44 it is determined that there are still "unused"
neighbors. The system attempts to order the unused neighbors at step 45. The
bonds of the neighbors consist of a double bond to an Oxygen (O) denoted by
"e", and a wildcard (*).
 
  The ordering is not terminated by the open site at step 46, because open
sites only terminate this process at step 46 if the open site does not
exist on the current atom. Because the open site exists on the current
atom, the "yes" branch is taken at step 48, and at step 50, codes for the
neighbors are added to the key with wildcard symbols around them. Next an
end-of-atom marker is added to the key at step 51. Accordingly, the current
string reads "Br c . c . % e % ."; and the C sub 2 atom is marked as
"ranked."
 
  At  step  53,  because  an  open  site was found at the current atom, the
system  advances  to  step 57. This string is a query key, so a wildcard is
added  to the end of the key at step 58, and the process is stopped at step
59. Accordingly, the final string reads "Br c . c . % e % . %". The process
is then repeated with all other atoms as starting atoms.
 
  Utilizing  the  key  generation  process  (FIG.  7) with the illustrative
bonded atom codes as shown in FIG. 4, the keys generated for query Q1 (FIG.
5) are:
 
  1) Br c . c . % e % . %
 
  2) C b c . . % e % . %
 
  3) C % c % e % . %
 
  4) O d . % c % . %
 
  The  process  illustrated  in  FIG. 11 can also be used to generate query
strings.  The  only  notable  addition  would  be that when an open site is
encountered  at  step  83  or 88, a wildcard flag is set at steps 84 or 89,
respectively.  Then, from step 91, the system advances to step 92, and adds
wildcard  (%) symbols before and after every code to be added to the string
at  this  step.  Additionally, a wildcard (%) symbol is added to the end of
the string.
 
  When  a  match  exists between a query and a given structure, each of the
query  keys  will  match  one  or more of the search keys of the structure.
Therefore,  any  one  of  the query keys may be used to retrieve a matching
structure.  Statistical  information is stored in the database allowing the
optimal query key to be used as the primary screen.
 
  However, passing the screening phase alone is not enough to indicate that
a  match  has  been  found.  The  system  must  next  verify that the query
structure  is  a  subset  of  the  structure  by performing an atom by atom
matching (ABAM) process. To do this, a connection table is prepared for the
query  structure  in  the  same  way  that  it  is  prepared  for the newly
registered  structures  in  III.a. above (see FIG. 3). These two connection
tables are then compared atom by atom, bond by bond. If every atom and bond
in  the connection table for the query is found in the connection table for
the structure, the system returns a match.
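 
  The atom by atom comparison described above can be sketched as a small
backtracking search. This is only an illustration under stated assumptions:
the dictionary layout of the connection tables, the function names, and the
two test structures are hypothetical and are not taken from the figures. The
open site of the query is simply left out of the query table, since it places
no constraint on the structure.

```python
from typing import Dict, FrozenSet, Optional

Atoms = Dict[int, str]               # atom number -> element symbol
Bonds = Dict[FrozenSet[int], int]    # {atom, atom} -> bond order (1, 2 or 3)


def bond_order(bonds: Bonds, a: int, b: int) -> Optional[int]:
    """Return the bond order between atoms a and b, or None if not bonded."""
    return bonds.get(frozenset((a, b)))


def abam(q_atoms: Atoms, q_bonds: Bonds, s_atoms: Atoms, s_bonds: Bonds) -> bool:
    """Return True if every query atom and bond is found in the structure."""
    q_ids = list(q_atoms)

    def extend(mapping: Dict[int, int]) -> bool:
        if len(mapping) == len(q_ids):
            return True                    # every query atom has been placed
        q = q_ids[len(mapping)]            # next query atom to place
        for s, element in s_atoms.items():
            if element != q_atoms[q] or s in mapping.values():
                continue
            # Each query bond to an already placed atom must exist in the
            # structure with the same order; extra structure bonds are allowed.
            consistent = all(
                bond_order(q_bonds, q, p)
                in (None, bond_order(s_bonds, s, mapping[p]))
                for p in mapping
            )
            if consistent:
                mapping[q] = s
                if extend(mapping):
                    return True
                del mapping[q]             # backtrack and try another atom
        return False

    return extend({})


def pair(a: int, b: int) -> FrozenSet[int]:
    return frozenset((a, b))


# Hypothetical query: Br-C-C(=O)- with the open site omitted.
QUERY_ATOMS = {1: "Br", 2: "C", 3: "C", 4: "O"}
QUERY_BONDS = {pair(1, 2): 1, pair(2, 3): 1, pair(3, 4): 2}

# Hypothetical structure A: Br-C-C(=O)-C (contains the query).
A_ATOMS = {1: "Br", 2: "C", 3: "C", 4: "O", 5: "C"}
A_BONDS = {pair(1, 2): 1, pair(2, 3): 1, pair(3, 4): 2, pair(3, 5): 1}

# Hypothetical structure B: C-C(=O)-Cl (no bromine, so no match).
B_ATOMS = {1: "C", 2: "C", 3: "O", 4: "Cl"}
B_BONDS = {pair(1, 2): 1, pair(2, 3): 2, pair(2, 4): 1}
```

  The worst case cost of such a search grows quickly with structure size,
which is why it is applied only to structures that pass the screen.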
 
  Three cases are possible.
 
  Case 1. The query is a substructure of the retrieval structure. Note that
Q1  (FIG. 5) is a substructure of S1 (FIG. 1). In this case, note that each
Q1  key  matches  an  S1  key  (Q1 keys 1, 2, 3, 4 match S1 keys 1, 2, 3, 4
respectively).
 
  Case  2.  The  chosen key may retrieve a structure for which the query is
not  a substructure. For example, Q1 key 4 would match S2 key 4 even though
Q1  is  not  a  substructure  of S2. However, the ABAM would eliminate this
structure.
 
  Case  3.  The  chosen  query  key  does  not  match  any of the keys of a
particular  structure. For example, Q1 key 1 does not match any of the keys
of S2. Therefore, S2 is eliminated as a match and ABAM is avoided.
 
  Although  one query key is typically used to drive the screening process,
the other query keys can be used as a secondary screen. In Case 2 above, using
Q1 key 1 as a secondary screen would eliminate S2, thus avoiding ABAM.
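 
  Because the wildcard symbol used in the query keys is the same "%" used by
the SQL LIKE operator, the primary and secondary screens can be sketched as
ordinary SELECT statements. The sketch below uses SQLite through Python
rather than Oracle. The table name, the structure identifiers 101 and 102,
the stored keys, and the code "x" (standing for a single bond to chlorine)
are all invented for illustration; they are not the keys of S1 and S2. The
spaces shown in the keys for readability are removed before comparison.

```python
import sqlite3


def compact(key: str) -> str:
    """Drop the spaces used for readability, leaving one code after another."""
    return key.replace(" ", "")


conn = sqlite3.connect(":memory:")
# The atom symbol "C" and the bond code "c" must not be treated as equal.
conn.execute("PRAGMA case_sensitive_like = ON")
conn.execute("CREATE TABLE search_keys (structure_id INTEGER, search_key TEXT)")
conn.executemany(
    "INSERT INTO search_keys VALUES (?, ?)",
    [
        # Hypothetical structure 101: Br-C-C(=O)-C
        (101, compact("Br c . c . c e . . .")),
        (101, compact("O d . c c . b . . .")),
        # Hypothetical structure 102: C-C(=O)-Cl
        (102, compact("O d . c x . . .")),
        (102, compact("C c . e x . . .")),
    ],
)

Q1_KEY_1 = "Br c . c . % e % . %"
Q1_KEY_4 = "O d . % c % . %"


def screen(conn: sqlite3.Connection, query_keys: list) -> list:
    """Return ids of structures that have a matching search key for every query key."""
    sql = " INTERSECT ".join(
        "SELECT DISTINCT structure_id FROM search_keys WHERE search_key LIKE ?"
        for _ in query_keys
    )
    params = [compact(k) for k in query_keys]
    return sorted(row[0] for row in conn.execute(sql, params))


primary_only = screen(conn, [Q1_KEY_4])               # both structures pass
with_secondary = screen(conn, [Q1_KEY_4, Q1_KEY_1])   # 102 is eliminated
```

  Here the primary screen alone would send both structures to atom by atom
matching, while adding the second key removes structure 102 beforehand.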
 
  The  query,  along  with  the  matching  results  and  other  identifying
information  (such  as owner of query and name of query), are stored in the
relational database for later use and viewing.
 
  The user may then step through and view all of the matching structures,
which are displayed on a screen such as that shown in FIG. 10. If no
structures match, the system returns a message indicating this.
 
  Due  to  the  fact  that  most  of  the  work  is done during the initial
screening  phase  (comparing  query  strings to structure search keys), the
time-consuming  atom  by  atom  matching is done only on a relatively small
subset   of  the  total  structures  in  the  database.  Accordingly,  this
methodology  may  be much quicker than other systems which perform the same
function.
 
  Because  all  search  queries  and  results  are stored in the relational
database,  the  user,  through standard relational database procedures, may
also  list previously conducted searches, edit previously defined searches,
update  or  refresh previously run searches, view structures in any search,
and delete previous searches.
 
VI. Exact Structure (Identity) Searching
 
  Identity  searching  involves finding a particular structure within a set
of  database  structures. This operation is performed by users, and is also
needed  at  the  time  of  registration.  Typically,  this  means finding a
structure  in  the  database  that  matches  the query exactly. An identity
search  is  a  special  case of the substructure search outlined above, but
having no open sites in the query.
 
  Thus,  the  current  substructure  search  method  described  above is an
adequate  method  for  implementing identity searching, and as such will be
used  accordingly.  Additionally,  the  meaning  of  "exact  match"  can be
user-definable.  The  default  definition  limits  the  matching process to
element  types  and  bonding.  Users can also specify additional structural
information  such  as  charge and mass values. This is performed at atom by
atom matching time.
 
VII. Chemical Name Searching
 
  Chemical  name  searching has been a problem of special note in the field
of  chemical  information systems. Most chemical names are long and complex
strings  which  are  not  easily searchable by standard substring searching
mechanisms.  This problem is compounded by the fact that most chemicals are
known by many systematic and/or tradenames.
 
  Chemical  name  searching  can  be  accomplished  by storing and indexing
carefully  defined  name fragments, as well as indexing the complex strings
of  the complete chemical names. Searching can be performed on a partial or
complete chemical name query using standard relational database technology.
 
  To  optimize  the  search,  the query is decomposed into its constituent
chemical  terms.  The  terms  are sorted in ascending order by frequency of
occurrence  found by looking up the number of compounds having a particular
term  in a stored table. This stored table is created by scanning all names
of  structures upon registration, and storing frequency information in that
table. Thus, this table acts as an index to chemical name fragments.
 
  Given  this  list  of  chemical  terms,  the  search  can be performed by
intersecting  the  resulting  SELECT  statements  or  using  one to drive a
correlated subquery.
 
  Since the chemical name information is handled entirely by the relational
database,  the data is then easily integrated with the rest of the chemical
information.
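 
  The term ordering and intersection described above might look like the
following sketch, written against SQLite through Python. The table names,
column names and sample name fragments are hypothetical. Whether placing the
rarest term first actually speeds up the intersection depends on the query
optimizer of the database in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE name_terms (compound_id INTEGER, term TEXT);
CREATE TABLE term_frequency (term TEXT PRIMARY KEY, compound_count INTEGER);
""")

# Hypothetical name fragments for four registered compounds.
NAMES = {
    1: ["chloro", "methyl", "benzene"],
    2: ["methyl", "benzene"],
    3: ["chloro", "ethane"],
    4: ["benzene"],
}
for compound_id, terms in NAMES.items():
    conn.executemany(
        "INSERT INTO name_terms VALUES (?, ?)",
        [(compound_id, term) for term in terms],
    )

# Frequency table; built here in one pass instead of at registration time.
conn.execute("""
INSERT INTO term_frequency
SELECT term, COUNT(DISTINCT compound_id) FROM name_terms GROUP BY term
""")


def order_terms(conn: sqlite3.Connection, terms: list) -> list:
    """Sort query terms by ascending number of compounds containing them."""
    counts = dict(conn.execute("SELECT term, compound_count FROM term_frequency"))
    return sorted(terms, key=lambda t: (counts.get(t, 0), t))


def name_search(conn: sqlite3.Connection, terms: list) -> list:
    """Return ids of compounds whose names contain every query term."""
    ordered = order_terms(conn, terms)
    if not ordered:
        return []
    sql = " INTERSECT ".join(
        "SELECT compound_id FROM name_terms WHERE term = ?" for _ in ordered
    )
    return sorted(row[0] for row in conn.execute(sql, ordered))
```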
 
VIII. Molecular Formula Searching and Key Searching
 
  Molecular formula searching can be done by using standard SQL string
search methods on all or part of the formula. Key searching (lookup by
identifier) is a standard SQL operation.
 
IX. Data Integration and Import/Export of Data
 
  A  significant  advantage  of  basing  a chemical information system in a
relational  database  is  the  ease  with  which  the structure data can be
combined  with  related  data,  resulting in a complete, integrated system.
This allows information in other systems to be easily imported and exported
into the RDBMS using standard RDBMS functionality.
 
X. Dynamic Queries
 
  As  is  true  with  all  relational  databases,  the design of the system
decomposes  into  a  series  of entities, relationships, and functions. The
relationships  among  entities  are  rigorously  defined  since referential
integrity  is  the  cornerstone  of  relational database design. A chemical
information system implemented using relational technology must be designed
with these considerations in mind.
 
  A  natural  relationship  exists  between the database structures and the
resulting  substructure  searches  with  each  search resulting in a set of
compound  identifiers. In the present invention, a relational table is used
to store the set of identifiers during each search of the database. This is
implemented  by  creating  a  table  to store general information about the
query (current user, date, query structure, options and search statistics).
A  related  table  is  created to store the identifiers of those structures
matching the query.
 
  As  new  structures  are  registered in the system, the set of structures
identified  as  resulting  from  earlier queries becomes obsolete since the
structure  database  contains  structures  not  present  at the time of the
original  search. This is, in a sense, a violation of referential integrity
because the relationship between structures and queries is not maintained.
 
  In  the  present  invention,  however,  the concept of dynamic queries is
introduced.  When  a  new  structure is registered, the system will examine
those  queries  designated  as  dynamic, and will add the identifier to the
search  result set for each query matching the new structure. This process,
made   simple   by  relational  technology,  allows  the  system  to  offer
functionality never before available in chemical information systems.
 
  Dynamic  queries  are  analogous to relational views. That is, they allow
searches  of  the database to be stored as objects in the database that are
always   current.  The  following  example  will  illustrate  the  type  of
functionality made possible by dynamic queries.
 
  As  shown  in  FIG.  8,  when  a  user  performs  a  search  at  step  60
(substructure,  chemical  name,  molecular  formula), the system stores the
resulting  set  of  compound  identifiers  at step 62. As the user examines
these  compounds,  the  system flags each compound as having been viewed by
the user at step 64. Therefore, the system always knows the search results,
and the extent to which the user has reviewed them.
 
  A  user  interested  in a particular class of structures (e.g., steroids)
would   perform  a  search  once  and  designate  the  search  as  dynamic.
Thereafter,  the  search will be maintained automatically by the system. In
fact,  the  system  would  notify  the  user  whenever  a  new  steroid was
registered  in  the  system  at  step 66. This is done by having the system
perform  all  dynamic  queries  on  any  newly registered molecule as it is
registered  into  the database at step 68 and notifying the user if a match
occurs  at  step 70. The user could then view the previously unseen results
at  step  72  without  having  to  repeat the query or view previously seen
results.
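 
  The flow of FIG. 8 can be sketched with three relational tables. In the
sketch below, which uses SQLite through Python, all table, column and
function names are hypothetical. To keep the example short, a LIKE pattern
on the structure name stands in for the real matching step (screening
followed by atom by atom matching); that substitution is an assumption made
only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE structures (structure_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE queries (
    query_id INTEGER PRIMARY KEY, owner TEXT,
    name_pattern TEXT, is_dynamic INTEGER);
CREATE TABLE query_results (
    query_id INTEGER, structure_id INTEGER, viewed INTEGER DEFAULT 0,
    PRIMARY KEY (query_id, structure_id));
""")


def run_query(conn, owner, name_pattern, dynamic):
    """Run a search once and store its result set; return the query id."""
    query_id = conn.execute(
        "INSERT INTO queries (owner, name_pattern, is_dynamic) VALUES (?, ?, ?)",
        (owner, name_pattern, int(dynamic)),
    ).lastrowid
    conn.execute(
        "INSERT INTO query_results (query_id, structure_id) "
        "SELECT ?, structure_id FROM structures WHERE name LIKE ?",
        (query_id, name_pattern),
    )
    return query_id


def register(conn, name):
    """Register a structure, update dynamic queries, return owners to notify."""
    structure_id = conn.execute(
        "INSERT INTO structures (name) VALUES (?)", (name,)
    ).lastrowid
    hits = conn.execute(
        "SELECT query_id, owner FROM queries "
        "WHERE is_dynamic = 1 AND ? LIKE name_pattern",
        (name,),
    ).fetchall()
    conn.executemany(
        "INSERT INTO query_results (query_id, structure_id) VALUES (?, ?)",
        [(query_id, structure_id) for query_id, _ in hits],
    )
    return sorted({owner for _, owner in hits})


def unseen(conn, query_id):
    """Return stored results that the user has not yet viewed."""
    rows = conn.execute(
        "SELECT structure_id FROM query_results "
        "WHERE query_id = ? AND viewed = 0 ORDER BY structure_id",
        (query_id,),
    ).fetchall()
    return [row[0] for row in rows]


def mark_viewed(conn, query_id):
    conn.execute("UPDATE query_results SET viewed = 1 WHERE query_id = ?", (query_id,))


register(conn, "cortisone")
register(conn, "ethanol")
steroids = run_query(conn, "alice", "cort%", dynamic=True)
first_view = unseen(conn, steroids)      # results of the original search
mark_viewed(conn, steroids)
notified = register(conn, "cortisol")    # new structure matches the dynamic query
second_view = unseen(conn, steroids)     # only the newly registered structure
```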
 
  While  dynamic  queries  would  not  have much importance with relatively
static  databases,  they  would  have  many  uses  in  the system serving a
research environment. Heavy use of dynamic queries could require allocation
of   significant   amounts  of  disk  space  for  storing  search  results.
Additionally,   the  performance  of  the  registration  process  could  be
adversely affected by the presence of a large number of dynamic queries.
 
  These  potential  problems  can  be  controlled  by the introduction of a
resource  allocation  system  with each user being assigned two quotas. The
first  quota  controls the number of dynamic queries that the user can have
active  at  any one time, which will protect performance at the time of the
registration  process.  The  second  quota will control the total number of
structure  identifiers  that  each  user  has  stored by dynamic queries to
conserve disk space.
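 
  A minimal sketch of the two quotas follows. The class name, field names
and limits are hypothetical; in the system described here the counts would
be held in relational tables rather than in memory.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class UserQuota:
    max_dynamic_queries: int          # first quota: active dynamic queries
    max_stored_identifiers: int       # second quota: stored structure identifiers
    # dynamic query name -> structure identifiers stored for it
    dynamic_results: Dict[str, List[int]] = field(default_factory=dict)

    def stored_identifiers(self) -> int:
        return sum(len(ids) for ids in self.dynamic_results.values())

    def can_add_query(self) -> bool:
        return len(self.dynamic_results) < self.max_dynamic_queries

    def can_store(self, n_new: int) -> bool:
        return self.stored_identifiers() + n_new <= self.max_stored_identifiers


def add_dynamic_query(user: UserQuota, name: str, ids: List[int]) -> bool:
    """Store a new dynamic query only if both quotas allow it."""
    if not user.can_add_query() or not user.can_store(len(ids)):
        return False
    user.dynamic_results[name] = list(ids)
    return True


user = UserQuota(max_dynamic_queries=2, max_stored_identifiers=5)
```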
 
  Alternatively, users would be allowed to disable these quota systems, but
this  may  slow  the system during the registration process, or may exhaust
disk space for the database.
 
XI. Structure Classes
 
  The  division  of  chemical  structures into classes based on overlapping
criteria  (e.g., functional groups, ring systems) has long been used as an
organizational   technique   in  chemistry.  Chemical  information  systems
typically  provide for this class system by allowing users to intersect the
results  of  different  searches.  While  this  intersection is a necessary
feature  of  any  chemical  information  system,  it  does  not address the
fundamental  importance of the classification schemes used in chemistry. In
the  present  invention,  a  mechanism will be provided for maintaining any
number  of  classification  schemes  in  the database for structures. These
schemes  or structure classes can be privately defined by individual users,
or can be used as a system-wide search aid.
 
  A  "structure  class"  is  defined  to  be a set of structure identifiers
resulting  from  a substructure search, a chemical name search, a molecular
formula  search,  or  by a combination of these searches. Structure classes
are  an  application  of  dynamic queries used to limit the scope of future
searches.  For  example,  the  system  may  maintain  a structure class for
steroids.  When  a user performs a search, he or she can designate that the
result   should  be  restricted  to  the  members  of  the  steroid  class.
Accordingly, the user could simply query the database for all steroids that
have a particular substructure without drawing the entire steroid ring.
 
  This  results  in two primary benefits: The first benefit is that queries
are  simplified,  i.e.,  there  is no need to draw complex queries, and the
second  benefit  is  that the screening phase need only be applied to those
compounds already known to be members of the structure class.
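 
  The second benefit can be illustrated by joining the screening query to a
table of class members, so that only keys of structures already in the
class are compared. The sketch uses SQLite through Python; the table names
and the stored keys are placeholders invented for this example and do not
describe real steroids.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE search_keys (structure_id INTEGER, search_key TEXT);
CREATE TABLE class_members (class_name TEXT, structure_id INTEGER);
""")
conn.executemany(
    "INSERT INTO search_keys VALUES (?, ?)",
    [(1, "Od.cc..."), (2, "Od.cc..."), (3, "Od.cx...")],   # placeholder keys
)
conn.executemany(
    "INSERT INTO class_members VALUES (?, ?)",
    [("steroids", 1), ("steroids", 3)],                    # structure 2 is not a member
)


def screen_in_class(conn, class_name, key_pattern):
    """Screen only the structures that belong to the named structure class."""
    rows = conn.execute(
        """
        SELECT DISTINCT k.structure_id
        FROM class_members AS m
        JOIN search_keys AS k ON k.structure_id = m.structure_id
        WHERE m.class_name = ? AND k.search_key LIKE ?
        ORDER BY k.structure_id
        """,
        (class_name, key_pattern),
    ).fetchall()
    return [row[0] for row in rows]
```

  Structure 2 carries a key that would pass the screen, but it is never
examined because it is not a member of the class.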
 
  Dynamic  queries and structure classes both exhibit a common benefit--the
overhead involved in structure searching is encountered only once (when the
dynamic  query  or  structure  class is defined) and additional overhead is
distributed  evenly  across  subsequent  updates  to the chemical structure
database.
 
  From the preceding description, it is evident that the invention has been
described in detail by reference to a particular embodiment adapted for use
in  the  field of chemistry. Although this invention offers many advantages
in  this  field,  it  may be used in other fields wherein structure data is
stored  advantageously as well. Accordingly, this invention is not intended
to  be  limited by the details of the preferred embodiment described above,
but rather by the terms of the appended claims.
 
 
  2/2,EM,SU/3     (Item 2 from file: 654) 
DIALOG(R)File 654:US PAT.FULL.
(c) format only 2001 The Dialog Corp. All rts. reserv.
 
             02592280
Utility
CHEMICAL STRUCTURE STORAGE, SEARCHING AND RETRIEVAL SYSTEM
 
PATENT NO.:  5,577,239
ISSUED:      November 19, 1996 (19961119)
INVENTOR(s): Moore, Jeffrey, 12 Breezy Tree Ct., Timonimun, MD (Maryland),
             US (United States of America), 21093
             Brazil, Joanne, 4500 Jolly Acres Rd., White Hall, MD
             (Maryland), US (United States of America), 21161
             Hoover, Jeffrey R., 8639 Willow Oak Rd., Baltimore, MD
             (Maryland), US (United States of America), 21234
             [Assignee Code(s): 68000]
EXTRA INFO:  Assignment transaction [Reassigned], recorded October 12,
             1994 (19941012)
             Assignment transaction [Reassigned], recorded January 21,
             1997 (19970121)
APPL. NO.:   8-288,503
FILED:       August 10, 1994 (19940810)
U.S. CLASS:  707-3 cross ref: 702-27
INTL CLASS:  [6] G06F 17-30
FIELD OF SEARCH: 364-DIG.1; 364-DIG.2; 364-496; 364-497; 364-499; 395-600
 
                             References Cited
 
                          U.S. PATENT DOCUMENTS
 
    4,642,762    2/1987   Fisanick                               364-300
    4,811,217    3/1989   Tokizane et al.                        364-300
    4,855,931    8/1989   Saunders                               364-499
    5,025,388    6/1991   Cramer, III et al.                     364-496
    5,056,035   10/1991   Fujita                                 364-497
    5,249,137    2/1993   Wilson et al.                          364-496
    5,367,058   11/1994   Pitner et al.                        530-391.9
    5,379,234    1/1995   Wilson et al.                          364-496
    5,386,507    1/1995   Teig et al.                            395-161
    5,418,944    5/1995   DiPace et al.                          395-600
    5,463,564   10/1995   Agrafiotis et al.                      364-496
 
                             OTHER REFERENCES
 
 
Viking  Instruments  Corp.  (Hewlett  Packard);  Spectra Trak Transportable
GC/MS System; (brochure), No date.
 
Chemical  Structure,  The International Language of Chemistry; Wendy A. Warr
(Ed.); "Interfacing DARC-Oracle" AJCM (Juus) de Jong (1988).
 
J.  Chem.  Inf.  Comput.  Sci.  (1983),  vol.  23,  No. 3 pp. 102-108; DARC
Substructure  Search  System; A New Approach to Chemical Information; Roger
Attias.
 
J.  Chem. Inf. Comput. Sci. (1987), vol. 27, No. 2; pp. 74-82; DARC System;
Notions  of Defined and Generic Substructures. Filiation and Coding of FREL
Substructure (SS) Classes; Jacques-Emile Dubois et al.
 
J.   Chem.  Inf.  Comput.  Sci.  (1990),  vol.  30,  No.  2;  pp.  191-199,
Substructure  Search Systems, 1, Performance Comparison of the MACCS, DARC,
HTSS,  CAS  Registry  MVSSS,  and S4 Substructure Search Systems; Martin G.
Hicks.
 
J.  Chem.  Inf.  Comput.  Sci.  (1988),  vol.  28,  No.  4; pp. 221-226; An
Efficient Graph Approach to Matching Chemical Structures, O. Owolabi.
 
J.  Chem.  Inf. Comput. Sci. (1990), vol. 30, No. 4; pp. 332-339; Reactions
in the Beilstein Information System: Nonaporic Organic Synthesis; Martin G.
Hicks.
 
Analytica  Chimica Acta, 235 (1990), pp. 87-92; Substructure Search Systems
for Large Chemical Data Bases; Martin G. Hicks et al.
 
J.  Chem.  Inf.  Comput.  Sci.  (1991),  vol.  31,  No. 2; pp. 320-326; The
Beilstein Structure Registry System, 1, General Design; Laszlo Domokos.
 
J. Chem. Inf. Comput. Sci. (1989), vol. 29, No. 4; pp. 255-260; 3DSearch; A
System for Three-Dimensional Substructure Searching; Robert P. Sheridan, et
al.
 
Substructure  Searches  of  Chemical  Structure  Files;  (Jan.  23,  1973);
Strategic   Considerations   in  the  Design  of  a  Screening  System  for
Substructure  Searches  of  Chemical Structure Files; George W. Adamson, et
al.
 
Chemical  Structure  Searching;  (Jan.  21,  1975); An Efficient Design for
Chemical Structure Searching, I, The Screens; Alfred Feldman et al.
 
J. Chem. Inf. Comput. Sci. (1982), vol. 22, No. 4; The Third BASIC Fragment
Search Dictionary; W. Graf, H. K. Kaindl, et al.
 
J.  Chem.  Inf.  Comput. Sci. (1983), vol. 23, No. 3; The CAS ONLINE Search
System,  1,  General  System  Design  and Selection, Generation, and Use of
Search Screens; P. G. Dittmar, et al.
 
Computer  Chemical,  (1991),  vol.  15,  No. 2; pp. 103-107; A Central Atom
Based  Algorithm  and Computer Program for Substructure Search; Alf Dengler
and Ivar Ugi.
 
J.  Chem.  Inf. Comput. Sci. (1993), vol. 33, No. 4; pp. 545-547; Structure
Searching  in  Chemical  Databases  by  Direct  Lookup  Methods; Bradley D.
Christie et al.
 
J.   Chem.  Inf.  Comput.  Sci.  (1993);  vol.  33,  No.  4;  pp.  539-541;
Substructure  Searching  on  Very  Large  Files  by  Using Multiple Storage
Techniques; Alexander Bartmann et al.
 
 
PRIMARY EXAMINER: Black, Thomas G.
ASST. EXAMINER:   Von Buhr, Maria N.
ATTORNEY, AGENT, OR FIRM: Dickstein Shapiro Morin & Oshinsky LLP
CLAIMS:           12
EXEMPLARY CLAIM:  1
DRAWING PAGES:    7
DRAWING FIGURES:  12
ART UNIT:         237
FULL TEXT:        791 lines
 
 
                         FIELD OF THE INVENTION
 
  The  present invention relates to a relational database management system
that  stores, searches and retrieves chemical structure information quickly
and easily.
 
                       BACKGROUND OF THE INVENTION
 
  Chemical  and  pharmaceutical  industries and chemical-related government
agencies  commonly  maintain  large  chemical  substance  databases.  These
entities often provide structure-searching capabilities in association with
such databases. Recently, these organizations have been standardizing their
databases  using relational database management systems (RDBMS) such as the
Oracle  Relational  Database Management System by Oracle Corporation, World
Headquarters, 500 Oracle Pkwy., Redwood Shores, Calif. 94065.
 
  The  advantages  of  integrating  chemical  structure information into an
RDBMS  include:  a  closer  integration  with  other related chemical data,
efficiency  in  both  storage and retrieval of chemical structure data, and
better access to the chemical structure data by other related applications.
 
  Unfortunately, chemical information systems have traditionally been built
using specialized database technology requiring, in many cases, hundreds of
thousands  of lines of custom computer code. Systems of this type are often
both  difficult  to  maintain,  and difficult to adapt to changing hardware
technologies.   These   maintenance   problems,  coupled  with  a  lack  of
portability  of  these  highly  specialized  systems,  often  lead to large
investments  of  time  and  money being allocated to relatively short-lived
systems.
 
  The   introduction   of   relational   database  technology  provides  an
opportunity   to  transfer  a  large  amount  of  the  database  management
responsibility  from  the specialized database systems described above to a
standard  widely-accepted  technology.  However,  relational technology has
typically not been used as the basis for chemical information systems. This
is  due to the fact that there are problems inherent in any attempt to cast
a  chemical  structure  searching  system  problem  into  structured  query
language  (SQL)--the  standard  language  of  relational  databases.  These
problems include difficulty in storing and representing chemical structures
in  a  database.  No  chemical  information system has yet been implemented
using only relational technology as its database component.
 
  Several  systems  have  attempted to achieve this goal but, as more fully
explained  below,  none  have  been  able  to  develop  a purely relational
database  management  system  which is able to search and retrieve chemical
structure information easily and quickly.
 
  For  example,  Molecular  Access System (MACCS) and Integrated Scientific
Information  System  (ISIS)  are both created by Molecular Design Ltd., MDL
Information  Systems,  14600  Catalina  Street,  San Leandro, Calif. 94577.
These  systems  provide  a  stand-alone chemical information system wherein
chemical  structures  are stored as hierarchical structures. However, these
systems  require  large amounts of custom code, and are not maintained in a
relational  database.  Accordingly,  they  do  not  have  the advantages of
relational technology listed above.
 
  While  it  is  true  that these systems can be interfaced to a relational
database  management  system  such as the Oracle Database Management System
noted above, it must be done using additional custom code and software that
converts hierarchical structures to the relational tables needed for such a
database.  Therefore,  it  is  difficult  to  incorporate the advantages of
relational  technology  into  the  MACCS  and  ISIS  systems. Moreover, the
conversion software slows down overall performance speed.
 
  In  summary, these systems do not provide the advantages and capabilities
existing in the present invention.
 
  The   present   invention   overcomes   the   above-listed  problems  and
additionally  has the following advantages: (1) development and maintenance
costs  will  be  greatly  reduced  by  using a commercial database package.
Accordingly,  development  efforts  and  benefits  can  be more effectively
directed   toward  aspects  of  system  design,  and  improvements  in  the
underlying  database  technology  will  be automatically transferred to the
chemical  information  system.  This  shift  of  focus  away  from database
development   concentrates  the  development  and  maintenance  efforts  on
improving  the search strategy and the user interface, which are the highly
visible  aspects  of  the  system;  (2)  interfacing with other information
systems  will  be simplified since relational databases are already used to
store  much  of  the  non-structural  chemical  data  used  in research and
commercial  settings;  and  (3)  portability  will be much less of a design
drawback  since  the amount of custom programming is minimal and can easily
be  adapted  to  numerous  types  of technology. Therefore, the portability
responsibilities are mostly shouldered by the database manufacturer itself,
and not by the developer of the chemical storage system.
 
                        SUMMARY OF THE INVENTION
 
  The present invention overcomes the shortfalls in the art by developing a
chemical structure search system which expands the capabilities of existing
systems by capitalizing on the strengths of relational database technology.
 
  The  present  invention allows the user to optimally store and search the
chemical  structure  information  using  various  search strategies such as
multi-valued  atoms, multi-typed bonds, Markush searching and various other
options in a relational database management system.
 
  Furthermore,  it  provides  a  complete chemical information system which
includes modules for:
 (1) exact structure searching;
 (2) substructure searching;
 (3) key searching;
 (4) chemical name searching;
 (5) molecular formula searching;
 (6) registration of new molecules;
 (7) structure import/export; and
 (8) data editing.
 
  Additionally,  the  present  invention  allows the routine integration of
chemical  structure  data with other related information such as inventory,
spectroscopic  data  and  clinical  data  via  standard relational database
methods  to allow better usage of all types of chemical information in both
commercial and research settings.
 
  By  taking  advantage of the data manipulation capabilities of relational
technology,  this  system will also introduce dynamic querying capabilities
which  will  allow  the  user  to be notified of any new chemicals that are
entered  into  the  database that are responsive to previously run queries.
This  provides the functionality of relational views for chemical structure
information.
 
  Additionally,  structure  classes can also be implemented which allow the
user  to  store  certain  types  of  information  about particular types of
chemical  structures such as steroids. Accordingly, users can later call up
this  information  in  a  quick and efficient manner without re-entering or
performing previously run queries.
 
  With  these  and  other objects, advantages and features of the invention
that  may  become apparent, the nature of the invention may be more clearly
understood  by  reference  to  the  following  detailed  description of the
invention, the appended claims and the several drawings attached hereto.
 
            DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
 
  The   present   invention  makes  use  of  standard  relational  database
technology  such  as  that  found in the commercial product Oracle which is
marketed  by  Oracle  Corporation  as  noted  above.  All references to the
retrieval  and storage of information will be done in a standard relational
database,  and  will  use  standard  procedures  for  doing  so,  including
structured  query  language (SQL) commands. The operations and functions of
relational databases discussed in this patent application are well known to
those  of ordinary skill in the database management field. Those operations
and  functions  can be found in numerous texts, including Oracle users' and
developers' manuals.
 
I. HARDWARE
 
  Referring  now  to  FIG.  9,  the  preferred embodiment of the relational
database  management system for chemical structure storage and retrieval is
shown.  A  typical computer workstation 1 will contain a central processing
unit  (CPU)  2,  and main memory 3, and can be coupled to storage devices 4
such  as magnetic disks, an input device such as a keyboard 5 or mouse, and
output  device  such as a computer monitor screen 6 and a printer 7. One or
more such storage devices may be utilized.
 
  The preferred embodiment of the relational database management system for
storing,   searching   and   retrieving   chemical   structure  utilizes  a
microprocessor,  such  as  a  Microvax  3100 model 900 operating with a VMS
5.5-2  operating  system  with  at least two gigabytes of disk space and at
least  32  megabytes of RAM. The system can be provided with more memory to
speed  up  throughput  access  rates.  The  system could also be optionally
coupled   to   a   local   area   network  (LAN)  or  other  communications
architecture/environment  in order to link with other computer workstations
and have access to data from other systems.
 
II. RELATIONAL DATABASE INTERFACE
 
  As noted above, one of the advantages of using relational databases for a
chemical  structure  search  system is that there is no need on the part of
the  developers  to  be  concerned  with  portability, since the relational
database is a standard unto itself, that requires no special interface from
one type of system to the next.
 
  In the present invention, the use of a standard relational database such
as the Oracle Relational Database Management System minimizes portability
issues, since such databases are available on virtually every platform.
Additionally, the present invention maintains this degree of portability
since it uses standard C with embedded SQL.
 
III. REGISTERING NEW STRUCTURES
 
  As shown in FIG. 6, to register a new structure in the database, the user
simply  enters  the  atoms and bonds that make up the chemical structure by
typing  the  appropriate  keys,  or  selecting the appropriate choices from
menus  22  in  a standard chemical drawing software package such as Kekule,
marketed by PSI INTERNATIONAL, 810 Gleneagles Court, Suite 300, Towson, Md.
21286.
 
  For  each  new  structure  that  is  registered  (added) in the database,
several  steps  must  occur: (a) a connection table must be constructed and
stored  in  the database 24; (b) the system verifies that this structure is
not  a duplicate 26; (c) at least one search key must be created and stored
in  the database 28, and (d) information such as name, formula and registry
key  number  must  be  stored in the database 30. Each of these procedures
will be fully described below.
 
a. Construction of a Connection Table
 
  For  each  structure to be registered in the database, a connection table
is constructed at step 24. This table stores information about each atom in
the  structure  including  its  atomic  number,  the identity of all of the
connected atoms, and the type of bond to each of these connected atoms. For
example,  the connection table for a chemical structure to be added such as
structure S2 as depicted in FIG. 2 is shown in FIG. 3.
 
  The  table  depicts  the  types  of links that are stored between any two
given atoms in a structure. A single bond between two atoms is denoted by a
"1" while a double bond is denoted by a "2" and a triple bond is denoted by
a "3." This table is stored in a relational table along with its associated
registry  number  in  a compressed sparse matrix form. The connection table
will  be  used  for  the  Atom  by  Atom  Matching  (ABAM) process which is
described more fully below.
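 
  One way to picture the sparse storage is to keep only the non-zero entries
of the connection table, one row per bond. The sketch below uses SQLite
through Python. The fragment, the registry number 1001, and the table and
column names are hypothetical, and the actual compressed layout used by the
system may differ.

```python
import sqlite3

# Hypothetical fragment Br-C-C(=O)-C; atom numbers are arbitrary labels.
ATOMS = {1: 35, 2: 6, 3: 6, 4: 8, 5: 6}                  # atom number -> atomic number
BONDS = {(1, 2): 1, (2, 3): 1, (3, 4): 2, (3, 5): 1}     # (i, j), i < j -> bond order


def dense_table(atoms, bonds):
    """Build the full N x N connection table; 0 means 'not bonded'."""
    ids = sorted(atoms)
    return [[bonds.get((min(i, j), max(i, j)), 0) for j in ids] for i in ids]


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE atoms (registry_no INTEGER, atom_no INTEGER, atomic_number INTEGER);
CREATE TABLE bonds (registry_no INTEGER, atom_i INTEGER, atom_j INTEGER,
                    bond_order INTEGER);
""")


def store(conn, registry_no, atoms, bonds):
    """Store only the atoms and the non-zero entries of the connection table."""
    conn.executemany(
        "INSERT INTO atoms VALUES (?, ?, ?)",
        [(registry_no, atom_no, z) for atom_no, z in atoms.items()],
    )
    conn.executemany(
        "INSERT INTO bonds VALUES (?, ?, ?, ?)",
        [(registry_no, i, j, order) for (i, j), order in bonds.items()],
    )


def load_bonds(conn, registry_no):
    """Read the sparse bond rows back into a dictionary."""
    rows = conn.execute(
        "SELECT atom_i, atom_j, bond_order FROM bonds WHERE registry_no = ?",
        (registry_no,),
    )
    return {(i, j): order for i, j, order in rows}


store(conn, 1001, ATOMS, BONDS)
```

  For this five-atom fragment the dense table has 25 entries, while the
sparse form stores four bond rows.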
 
b. Search for Duplicates
 
  The  system  then  searches  the  existing  structures  to verify that no
duplicates  exist  in the database at step 26. If the structure has already
been entered into the system, it will not be entered again.
 
c. Creation of Search Keys
 
  When  a  new  structure  of  N  atoms  is registered in the system, it is
necessary  to  construct N search keys for the structure. These search keys
are stored as data in the relational database. Each search key results from
a  unique  numbering  of  the  atoms with a different atom representing the
starting  point  of  the key. In effect, N different search keys are stored
for each structure of N atoms.
 
  To create effective search keys, it is necessary to derive an unambiguous
string of characters for each atom in the structure or query. The string is
a  representation  of  the  atomic  environment  of  the starting atom. The
ordering  of  characters  in  the string cannot be sensitive to deletion of
portions  of  the structure or query. That is, deletion can remove portions
of  the  string  (with  subsequent replacement by wildcards in a query) but
cannot cause reordering of the remaining characters in the string. One such
algorithm, which builds the string by adding connectivity information using
a  breadth-first  graph  traversal  is  detailed  in FIG. 7, and is used in
subsequent examples.
 
  As shown in FIG. 7, for each starting atom, the following process is used
to generate a search key. The process starts at step 40, and at step 41 all
atoms  are marked as "unranked" and "unused". Next, at step 42 the starting
atom  is  marked  as "used" and added to the key. Additionally, at step 43,
the starting atom is marked as the current atom.
 
  So, for example, when reviewing structure S1 in FIG. 1, and starting with
the  Bromine  (Br)  atom,  the  search  string  would  begin with "Br". For
clarity,  the  first  code  in  the  key  is shown as the atomic symbol; in
practice,  a  one  byte  code  is  used for this purpose. Additionally, the
Bromine (Br) atom would be marked as "used" and set to the current atom.
 
  At  step  44,  any  unused  neighbors  are examined. In this example, the
Carbon  (C)  atom would be unused and accordingly, the system would advance
to  step  45 where unused neighbors were ordered. Because there is only one
neighbor  in  this  portion  of  the  structure,  and  the  ordering is not
terminated  by  an  open  site at step 46, and there is no open site at the
current  atom  at  step 48, the system advances to step 49, where codes for
the neighbors, in order, are added to the key and marked as "used".
 
  In  this example, the letter "c" will be added to the key to indicate the
single  bond  to  the  Carbon (C sub 1) atom, and the Carbon (C sub 1) atom
will be marked as "used". The system next adds an end-of-atom marker to the
key at step 51. The key now reads "Br c .". The current atom (Br) is marked
as "ranked" at step 52.
 
  The  system next verifies that ordering was not terminated, and there was
no  open  site at step 53. The process continues at step 54 by examining if
any atoms in the key are unranked. In this example, the Carbon (C sub 1) is
unranked.  Because  the  key  is  not  too  long  (i.e.,  not longer than a
predefined  length) at step 55, the Carbon (C sub 1) is chosen as the first
unranked  atom  in  step 56. The process repeats itself starting at step 43
with the Carbon (C sub 1) atom as the current atom.
 
  Now,  the  Carbon (C sub 1) is marked as the current atom, and the unused
neighbors  are examined at step 44. Once again, there is only a single bond
to  a  Carbon  (C  sub  2)  atom,  and therefore the ordering at step 45 is
unnecessary.  At  step  49,  a  "c" is added to the key, and at step 51 the
end-of-atom marker is added to the key. Accordingly the key now reads "Br c
. c .".
 
  This  Carbon  (C  sub  1) atom is now marked as "ranked" at step 52. Once
again, the unranked atoms in the key are examined at step 54, and the first
unranked atom in the key is chosen at step 56. This Carbon (C sub 2) is now
set  as  the current atom at step 43, and the unused neighbors are examined
at  step 44. There are two unused neighbors: a double bond to an Oxygen (O)
denoted  by "e", and a single bond to a Carbon (C sub 3) denoted by "c". At
step  45,  these  bonds  are  ordered, with "c" taking precedence over "e".
These codes are then added to the key in order, and the atoms are marked as
"used"  at step 49. Next, an end-of-atom marker is added to the key at step
51.  The  key  now  reads  "Br c . c . c e .". The Carbon (C sub 2) atom is
marked as "ranked", and the unranked atoms (the Oxygen and Carbon (C sub 3)
atom) are examined at step 54.
 
  At  step  56, the first unranked atom in the key (C sub 3) is chosen, and
at  step  43,  it  is  set  as  the  current atom. The process continues by
examining  the  single  bond  to  the  Carbon (C sub 4) emanating from this
Carbon (C sub 3), and accordingly "c" and "." are added to the key. The key
now reads "Br c . c . c e . c .".
 
  The  next  unranked atom (Oxygen) is then set as the current atom at step
43.  Given  that  there  are no unused neighbors at step 44, an end-of-atom
marker  is  simply  added  to  the  key  at step 51. The process once again
repeats  with  C  sub  4  as  the current atom. Because there are no unused
neighbors,  "."  is  appended to the string, and the process stops for this
starting  atom.  The final search key would be: "Br c . c . c e . c . . .".
The  same process is repeated for every starting atom in a given structure.
This  process is also shown, step-by-step, in FIG. 12. (Steps 57, 58 and 59
in FIG. 7 will be explained in Section V. below.)
 
  Utilizing  this  key  generation algorithm (FIG. 7) with the illustrative
bonded  atom  codes  shown  in  FIG. 4, the keys generated for structure S1
(FIG. 1) are:
 1) Br c . c . c e . c . . .
 2) C b c . . c e . c . . .
 3) C c c e . b . c . . . .
 4) O d . c c . b . c . . .
 5) C c c . . c e . b . . .
 6) C c . c . c e . b . . .
 
  For structure S2 (FIG. 2), the keys generated are:
 1) Cl c . c . c e . c . . .
 2) C a c . . c e . c . . .
 3) C c c e . a . c . . . .
 4) O d . c c . a . c . . .
 5) C c c . . c e . a . . .
 6) C c . c . c e . a . . .
 
  These  search  keys  are  stored in the database with associated registry
numbers  which  correspond to the registry numbers of the connection tables
and  associated  information.  Keys that are duplicated due to symmetry are
eliminated at registration time.
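
  The FIG. 7 walk-through can be sketched in Python as follows. This is
an illustration under stated assumptions, not the patented implementation:
the bonded atom codes are inferred from the worked examples because FIG. 4
is not reproduced here, the structures are taken to be the chain
X-C1-C2(=O)-C3-C4, and the tie-break between neighbors that share a code
(comparing the codes of their other bonds) is a guess chosen because it
reproduces the listed keys. The maximum key length check (step 55), open
sites, and removal of symmetry duplicates are omitted.

```python
from collections import deque

# Bonded-atom codes inferred from the worked examples (an assumption).
CODES = {(1, "Cl"): "a", (1, "Br"): "b", (1, "C"): "c",
         (2, "C"): "d", (2, "O"): "e"}


def make_structure(halogen):
    """Build X-C1-C2(=O)-C3-C4 as (atom symbols, adjacency map)."""
    atoms = [halogen, "C", "C", "O", "C", "C"]
    bonds = [(0, 1, 1), (1, 2, 1), (2, 3, 2), (2, 4, 1), (4, 5, 1)]
    adj = {i: {} for i in range(len(atoms))}
    for a, b, bond_type in bonds:
        adj[a][b] = bond_type
        adj[b][a] = bond_type
    return atoms, adj


def search_key(atoms, adj, start):
    def code(frm, to):
        return CODES[(adj[frm][to], atoms[to])]

    key = [atoms[start]]          # step 42: starting atom opens the key
    used = {start}
    unranked = deque([start])     # atoms in the key, in order of addition
    while unranked:               # steps 54 and 56: take first unranked atom
        cur = unranked.popleft()  # step 43: it becomes the current atom

        def order(n):
            # Assumed tie-break: sorted codes of n's bonds away from cur.
            ahead = tuple(sorted(code(n, m) for m in adj[n] if m != cur))
            return (code(cur, n), ahead)

        for n in sorted((n for n in adj[cur] if n not in used), key=order):
            key.append(code(cur, n))   # step 49: add code, mark as used
            used.add(n)
            unranked.append(n)
        key.append(".")           # step 51: end-of-atom marker
    return " ".join(key)


def all_keys(halogen):
    """One key per starting atom, in the order X, C1, C2, O, C3, C4."""
    atoms, adj = make_structure(halogen)
    return [search_key(atoms, adj, start) for start in range(len(atoms))]
```

  Calling all_keys("Br") and all_keys("Cl") yields one key per starting
atom for S1 and S2 respectively.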
 
  The  steps required for processing a search are unaffected by the details
of  the  search key generation process. That is, any key generation process
which  satisfies  the conditions set forth in the opening paragraph of this
section (generation of an unambiguous character string for each atom in the
structure,  etc.) can be utilized without modification of the search engine
software.
 
  An  additional  process,  which  builds  the string by listing structural
features found at each graph theoretical distance (level) from the starting
atom is detailed below.
 
  As  shown in FIG. 11, and using structure S1, the following process could
also be used to generate search strings. Beginning at step 80, all bonds in
a  given  structure are marked as "untraversed". A starting atom is chosen,
which  in  this case is the Bromine (Br) atom, at step 81, and added to the
key at step 82.
 
  Because  there  are  no open sites on this atom, the process continues at
step 85, where the system examines if any untraversed bonds to atoms at the
next level exist. In this case, there is an untraversed path to a Carbon (C
sub  1)  atom. The system next determines that there is no open site at any
atom  at  this  level  at  step  88  and,  if not, continues at step 90, by
ordering all untraversed bonds to all atoms at the next level.
 
  Because  the  wildcard flag has not been determined as having been set at
step  91, the system continues by adding the codes for the ordered paths to
the  key  at  step  93.  Then,  at step 94, the system adds an end-of-level
marker  to  the  key.  Accordingly, the key now reads "Br c ." and all bonds
that are included in the key are marked as "traversed".
 
  The  system  moves on to the next level with the Carbon (C sub 1) atom at
step 96. Once again, the system determines that there are untraversed paths
to  atoms  at the next level at step 85, and ultimately adds "c" and "." to
the key, at steps 93 and 94, respectively, after going through steps 88-92.

  The  system  then marks these bonds as "traversed", and moves to the next
level  beginning  with  the  Carbon (C sub 2) atom. Once again, untraversed
paths  are  found at step 85, namely a double bond to an Oxygen (O) denoted
by "e", and a single bond to a Carbon (C sub 3) denoted by "c". These codes
are ordered at step 90, and are added to the key with "c" taking precedence
over  "e"  at step 93. Again, an end-of-level marker is added to the key at
step 94. The string now reads "Br c . c . c e .".
 
  Next,  the  system  advances  to  the  next  level in this structure, and
repeats  the  above  process with the Carbon (C sub 3) atom. Once again "c"
and "." are added to the string.
 
  Finally,  the  system  moves  to the next level at step 96 using the last
Carbon  (C sub 4) atom. At step 85, the system determines that there are no
untraversed  paths  to  atoms  at the next level, and stops at step 87. The
final key reads: "Br c . c . c e . c .".
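
  A compact sketch of this level-by-level process, again assuming the S1
chain Br-C1-C2(=O)-C3-C4 and bonded atom codes inferred from the worked
examples, might look like the following. Open sites and the wildcard flag
(steps 83, 84, 88, 89, 91 and 92) are left out, and ring closures are not
considered.

```python
# Bonded-atom codes and the S1 layout are assumptions inferred from the text.
CODES = {(1, "Br"): "b", (1, "C"): "c", (2, "C"): "d", (2, "O"): "e"}
ATOMS = ["Br", "C", "C", "O", "C", "C"]          # Br, C1, C2, O, C3, C4
BONDS = {(0, 1): 1, (1, 2): 1, (2, 3): 2, (2, 4): 1, (4, 5): 1}


def level_key(start):
    key = [ATOMS[start]]       # step 82: starting atom opens the key
    traversed = set()          # step 80: every bond starts untraversed
    level = [start]
    while True:
        paths = []             # (code, bond, atom reached) for this level
        for (a, b), bond_type in BONDS.items():
            if (a, b) in traversed:
                continue
            for frm, to in ((a, b), (b, a)):
                if frm in level:
                    paths.append((CODES[(bond_type, ATOMS[to])], (a, b), to))
                    break
        if not paths:          # step 85: nothing left at the next level
            return " ".join(key)
        paths.sort(key=lambda p: p[0])                # step 90: order paths
        key += [p[0] for p in paths] + ["."]          # steps 93 and 94
        traversed.update(p[1] for p in paths)
        level = list(dict.fromkeys(p[2] for p in paths))  # step 96
```

  Starting from the Bromine atom (index 0) this reproduces the key derived
in the walk-through above.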
 
d. Associated Information Storage
 
  Additional  information  about  each  structure can also be stored in the
database,  such  as registry key or other unique identifier of a structure,
name,  and  formula. The user may also define any additional information to
store and search using standard RDBMS technology.
 
IV. IMPLEMENTATION ISSUES
 
  Each  of  the  fragment  codes  comprising the search keys can be made to
occupy  a single byte in the database. There are approximately 313 of these
fragment  types  existing  in a large sample of structures. One byte allows
256  possibilities,  three  of which cannot be used. (Byte 0 cannot be used
due  to its importance in programming, and two bytes used in the relational
database management system for its wildcard operation cannot be used, since
it  is normally difficult to search these characters and use the SQL "Like"
operator in the same statement).
 
  The  remaining  bytes  can  be  divided  into  three  groups:  (1)  those
representing  the most common fragments, (2) those representing atoms whose
presence  alone  (regardless  of  bonding)  is an effective screen, and (3)
those  very  rare  atoms  that  can be grouped together; accordingly, every
search for these atoms is essentially a multi-valued search.
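
  The byte budget can be checked with a few lines of Python. This sketch
assumes the two reserved wildcard bytes are the standard SQL "Like"
wildcards, percent and underscore; the text does not name them.

```python
# Bytes unavailable as fragment codes: byte 0 plus the two LIKE wildcards
# (assumed here to be '%' and '_', as in standard SQL).
RESERVED = {0, ord("%"), ord("_")}
usable = [b for b in range(256) if b not in RESERVED]

FRAGMENT_TYPES = 313  # approximate count quoted in the text

# At least this many fragment types cannot have a byte of their own and
# must share a code with other rare fragments (group 3 above).
overflow = FRAGMENT_TYPES - len(usable)
```

  With 253 usable bytes and roughly 313 fragment types, at least 60 types
have to share codes, which is why the rare atoms are grouped.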
 
V. PROCESSING OF QUERY SUBSTRUCTURE SEARCHES
 
  Performing  a  query  against  search keys is a relatively simple matter.
Each  query  structure  generates  one  search  key  for  each  atom in the
structure  that  could be assigned an unambiguous fragment code. The search
keys  of  the  query  structure  are generated by applying exactly the same
rules  to the query structure as those used to generate the database search
key  defined  in  III.c.  above, with the only exception being treatment of
wildcards.
 
  When  a  wildcard  (i.e., a site in which no particular atom is necessary
for  the  search) is encountered, the process must either stop, or continue
the  query by identifying all possibilities for the value of that wildcard.
Additionally,   queries  can  easily  accommodate  multi-valued  atoms  and
multi-typed  bonds  or  Markush searches in the same way that wildcards are
handled.   The  advantage  of  this  methodology  over  standard  screening
techniques is that it reflects the specificity of the query. For moderately
specific  queries (i.e., few wild-cards), it enables remarkable selectivity
because of the length of the key.
 
  As  demonstrated  with  the generation of search keys above, the basis of
the  search  process  is  that  the  search  keys  are  generated using one
unambiguous  set  of rules. So long as these rules are applied to the query
structure  in  exactly  the same fashion, and since each database structure
has  a  key  originating  from  each  of  its  atoms,  the  results will be
standardized.
 
  In  order  for the database structure to match the query structure, every
search  key  generated  for  the  query  must match one or more search keys
generated for the database structure. Additionally, if any query search key
fails to retrieve the structure, the query cannot be a substructure of that
structure. These rules make it possible to perform this extremely selective
screening process in a relational database with a single SELECT statement.
 
  To  generate  a query, the user types in a structure in the same way that
new  structures  are  entered,  and  can indicate where there are wildcards
(i.e.,   no  particular  atom  necessary)  and  where  there  are  multiple
acceptable  types  of  atoms  or bonds by indicating the specific atoms and
bonds that are acceptable in any given position.
 
  As  noted  above,  the query keys are generated in the same manner as the
search  keys for the structures. Accordingly, any acceptable key generation
process  may be used. Therefore, when generating a query key for Q1 in FIG.
5, and using the process shown in FIG. 7, the following steps occur.
 
  The  process  begins  at step 40, and at step 41, all atoms are marked as
"unranked"  and  "unused".  A starting atom is chosen and marked as "used",
and  the  atom  code  is  added to the key at step 42. In this example, the
Bromine (Br) atom will be the starting atom.
 
  At  step  43, Bromine (Br) is marked as the current atom, and at step 44,
it  is determined that there are unused neighbors. The process continues at
step 45, with all unused neighbors being ordered.
 
  Because  there are no open sites at step 46, and because there is no open
site  at the current atom at step 48, the codes for the neighbors are added
to  the  key  in  order  and  are  marked as "used" at step 49. The process
continues  with an end-of-atom marker being added to the key at step 51. The
key  now  currently  reads "Br c .", and the Bromine (Br) atom is marked as
"ranked" at step 52.
 
  Because  the  ordering  was not terminated, and no open site was found at
the  current atom at step 53, the process continues at step 54 by reviewing
the  unranked atoms represented in the key. The system next verifies that a
maximum  number  of atoms has not been reached (the length of the query key
does  not  exceed  a  predetermined maximum length), and the first unranked
atom (C sub 1) is chosen at step 56.
 
  The process continues at step 43 with the Carbon (C sub 1) atom being set
as a current atom. Again, all unused neighbors are examined at step 44, and
ordered  at  step  45. Because the system has still not encountered an open
site,  the code for this bond (single bond to a Carbon) is added to the key
at  step  49, with the end-of-atom marker added to the key at step 51. The
query key now reads "Br c . c .", and C sub 1 is marked as "ranked".
 
  The  process  next repeats itself with C sub 2 as the first unranked atom
in  the  key at step 56. Accordingly, C sub 2 is marked as the current atom
at  step  43, and at step 44 it is determined that there are still "unused"
neighbors. The unused neighbors are attempted to be ordered at step 45. The
bonds of the neighbors consist of a double bond to an Oxygen (O) denoted by
"e", and a wildcard (*).
 
  The  ordering is not terminated by the open site at step 46, because open
sites  only  terminate  this  process  at step 46 if the open site does not
exist  on  the  current  atom.  Because the open site exists on the current
atom,  the "yes" branch is taken at step 48, and at step 50, codes for the
neighbors  are  added to the key with wildcard symbols around them. Next an
end-of-atom marker is added to the key at step 51. Accordingly, the current
string  reads  "Br c . c . % e % ."; and the C sub 2 atom is marked as
"ranked."

  At  step  53,  because  an  open  site was found at the current atom, the
system  advances  to  step 57. This string is a query key, so a wildcard is
added  to the end of the key at step 58, and the process is stopped at step
59. Accordingly, the final string reads "Br c . c . % e % . %". The process
is then repeated with all other atoms as starting atoms.
 
  Utilizing  the  key  generation  process  (FIG.  7) with the illustrative
bonded atom codes as shown in FIG. 4, the keys generated for query Q1 (FIG.
5) are:
 1) Br c . c . % e % . %
 2) C b c . . % e % . %
 3) C % c % e % . %
 4) O d . % c % . %
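
  The query-side variant of the FIG. 7 process can be sketched as below.
It assumes Q1 is Br-C1-C2(=O)-* with the open site on C2, as the
walk-through implies, and uses bonded atom codes inferred from the
examples. The case where an open site on a neighboring atom terminates
the ordering (step 46) is not modeled, and the behavior for an open-site
atom with no unused neighbors is a guess.

```python
from collections import deque

# Codes and the Q1 layout are assumptions inferred from the walk-through.
CODES = {(1, "Br"): "b", (1, "C"): "c", (2, "C"): "d", (2, "O"): "e"}
ATOMS = ["Br", "C", "C", "O"]           # Br, C1, C2, O
BONDS = [(0, 1, 1), (1, 2, 1), (2, 3, 2)]
OPEN_SITES = {2}                        # C2 carries the wildcard (*)


def adjacency(atom_count, bonds):
    adj = {i: {} for i in range(atom_count)}
    for a, b, bond_type in bonds:
        adj[a][b] = bond_type
        adj[b][a] = bond_type
    return adj


def query_key(atoms, adj, open_sites, start):
    key = [atoms[start]]
    used = {start}
    unranked = deque([start])
    while unranked:
        cur = unranked.popleft()

        def code(n):
            return CODES[(adj[cur][n], atoms[n])]

        nbrs = sorted((n for n in adj[cur] if n not in used), key=code)
        codes = [code(n) for n in nbrs]
        if cur in open_sites:
            # Step 50: wrap each neighbor code in wildcards, then add the
            # end-of-atom marker (51) and a trailing wildcard (58); stop.
            key.append("%")
            for c in codes:
                key += [c, "%"]
            key += [".", "%"]
            return " ".join(key)
        key += codes + ["."]            # steps 49 and 51
        used.update(nbrs)
        unranked.extend(nbrs)
    return " ".join(key)


ADJ = adjacency(len(ATOMS), BONDS)
Q1_KEYS = [query_key(ATOMS, ADJ, OPEN_SITES, s) for s in range(len(ATOMS))]
```

  Q1_KEYS holds one query key per starting atom, in the order Br, C1, C2,
O.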
 
  The  process  illustrated  in  FIG. 11 can also be used to generate query
strings.  The  only  notable  addition  would  be that when an open site is
encountered  at  step  83  or 88, a wildcard flag is set at steps 84 or 89,
respectively.  Then, from step 91, the system advances to step 92, and adds
wildcard  (%) symbols before and after every code to be added to the string
at  this  step.  Additionally, a wildcard (%) symbol is added to the end of
the string.

  When  a  match  exists between a query and a given structure, each of the
query  keys  will  match  one  or more of the search keys of the structure.
Therefore,  any  one  of  the query keys may be used to retrieve a matching
structure.  Statistical  information is stored in the database allowing the
optimal query key to be used as the primary screen.
 
  However, passing the screening phase alone is not enough to indicate that
a  match  has  been  found.  The  system  must  next  verify that the query
structure  is  a  subset  of  the  structure  by performing an atom by atom
matching (ABAM) process. To do this, a connection table is prepared for the
query  structure  in  the  same  way  that  it  is  prepared  for the newly
registered  structures  in  III.a. above (see FIG. 3). These two connection
tables are then compared atom by atom, bond by bond. If every atom and bond
in  the connection table for the query is found in the connection table for
the structure, the system returns a match.
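
  One way to sketch the atom by atom comparison is a simple backtracking
matcher over two connection tables. The text does not say which matching
algorithm is used, so backtracking is this example's choice. The function
name abam and the table layout (a list of element symbols plus a dict of
bond types keyed by atom index pairs) are illustrative. Wildcard atoms,
multi-valued atoms, charge and mass are not handled.

```python
def abam(q_atoms, q_bonds, s_atoms, s_bonds):
    """Return True if every query atom and bond is found in the structure.

    Bonds are dicts {(i, j): bond_type} with i < j.
    """
    def s_bond(i, j):
        return s_bonds.get((min(i, j), max(i, j)))

    def extend(mapping):
        q = len(mapping)          # index of the next query atom to place
        if q == len(q_atoms):
            return True           # every query atom has a partner
        for s in range(len(s_atoms)):
            if s in mapping or s_atoms[s] != q_atoms[q]:
                continue
            # Each query bond from q to an already placed atom p must exist
            # in the structure with the same bond type.
            ok = all(
                s_bond(mapping[p], s) == bond_type
                for (a, b), bond_type in q_bonds.items() if q in (a, b)
                for p in (a, b) if p < q
            )
            if ok and extend(mapping + [s]):
                return True
        return False

    return extend([])


# Q1, S1 and S2 as implied by the key examples (an assumption).
Q1 = (["Br", "C", "C", "O"], {(0, 1): 1, (1, 2): 1, (2, 3): 2})
CHAIN = {(0, 1): 1, (1, 2): 1, (2, 3): 2, (2, 4): 1, (4, 5): 1}
S1 = (["Br", "C", "C", "O", "C", "C"], CHAIN)
S2 = (["Cl", "C", "C", "O", "C", "C"], CHAIN)
```

  With these tables, abam(*Q1, *S1) succeeds while abam(*Q1, *S2) fails
because S2 has no Bromine atom.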
 
  Three cases are possible.
 
  Case 1. The query is a substructure of the retrieval structure. Note that
Q1  (FIG. 5) is a substructure of S1 (FIG. 1). In this case, note that each
Q1  key  matches  an  S1  key  (Q1 keys 1, 2, 3, 4 match S1 keys 1, 2, 3, 4
respectively).
 
  Case  2.  The  chosen key may retrieve a structure for which the query is
not  a substructure. For example, Q1 key 4 would match S2 key 4 even though
Q1  is  not  a  substructure  of S2. However, the ABAM would eliminate this
structure.
 
  Case  3.  The  chosen  query  key  does  not  match  any of the keys of a
particular  structure. For example, Q1 key 1 does not match any of the keys
of S2. Therefore, S2 is eliminated as a match and ABAM is avoided.
 
  Although  one query key is typically used to drive the screening process,
the other query keys can be used as a secondary screen. In Case 2 above, using
Q1 key 1 as a secondary screen would eliminate S2, thus avoiding ABAM.
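
  The three cases can be reproduced with a small SQLite sketch. The table
name search_key and the helper names are hypothetical. Because the keys in
this document are printed with spaces for readability, the sketch packs
them by removing the spaces before storing and matching; a real system
would use the one-byte codes described in Section IV. ABAM is not run
here.

```python
import sqlite3

S1_KEYS = ["Br c . c . c e . c . . .", "C b c . . c e . c . . .",
           "C c c e . b . c . . . .", "O d . c c . b . c . . .",
           "C c c . . c e . b . . .", "C c . c . c e . b . . ."]
S2_KEYS = ["Cl c . c . c e . c . . .", "C a c . . c e . c . . .",
           "C c c e . a . c . . . .", "O d . c c . a . c . . .",
           "C c c . . c e . a . . .", "C c . c . c e . a . . ."]
Q1_KEYS = ["Br c . c . % e % . %", "C b c . . % e % . %",
           "C % c % e % . %", "O d . % c % . %"]


def pack(key):
    """Drop the display spaces so LIKE sees one contiguous string."""
    return key.replace(" ", "")


def build_db():
    db = sqlite3.connect(":memory:")
    # SQLite's LIKE ignores ASCII case by default; ask for exact case.
    db.execute("PRAGMA case_sensitive_like = ON")
    db.execute("CREATE TABLE search_key (registry_no INTEGER, key TEXT)")
    rows = [(1, pack(k)) for k in S1_KEYS] + [(2, pack(k)) for k in S2_KEYS]
    db.executemany("INSERT INTO search_key VALUES (?, ?)", rows)
    return db


def screen(db, query_key):
    """Primary screen: registry numbers with a key matching one query key."""
    sql = ("SELECT DISTINCT registry_no FROM search_key "
           "WHERE key LIKE ? ORDER BY registry_no")
    return [row[0] for row in db.execute(sql, (pack(query_key),))]


def screen_all(db, query_keys):
    """Secondary screen: keep structures matched by every query key."""
    hits = None
    for query_key in query_keys:
        found = set(screen(db, query_key))
        hits = found if hits is None else hits & found
    return sorted(hits or [])
```

  Screening with Q1 key 4 alone returns both structures (Case 2), Q1 key 1
returns only S1 (Cases 1 and 3), and requiring all four query keys to
match leaves S1 as the only candidate for ABAM.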
 
  The  query,  along  with  the  matching  results  and  other  identifying
information  (such  as owner of query and name of query), are stored in the
relational database for later use and viewing.
 
  The  user  may simply then advance through and view all of the structures
that  are  a  match  in the system which are displayed on a screen, such as
that  shown  in  FIG.  10. If no structures match, the system will return a
message indicating this.
 
  Due  to  the  fact  that  most  of  the  work  is done during the initial
screening  phase  (comparing  query  strings to structure search keys), the
time-consuming  atom  by  atom  matching is done only on a relatively small
subset   of  the  total  structures  in  the  database.  Accordingly,  this
methodology  may  be much quicker than other systems which perform the same
function.
 
  Because  all  search  queries  and  results  are stored in the relational
database,  the  user,  through standard relational database procedures, may
also  list previously conducted searches, edit previously defined searches,
update  or  refresh previously run searches, view structures in any search,
and delete previous searches.
 
VI. EXACT STRUCTURE (IDENTITY) SEARCHING
 
  Identity  searching  involves finding a particular structure within a set
of  database  structures. This operation is performed by users, and is also
needed  at  the  time  of  registration.  Typically,  this  means finding a
structure  in  the  database  that  matches  the query exactly. An identity
search  is  a  special  case of the substructure search outlined above, but
having no open sites in the query.
 
  Thus,  the  current  substructure  search  method  described  above is an
adequate  method  for  implementing identity searching, and as such will be
used  accordingly.  Additionally,  the  meaning  of  "exact  match"  can be
user-definable.  The  default  definition  limits  the  matching process to
element  types  and  bonding.  Users can also specify additional structural
information  such  as  charge and mass values. This is performed at atom by
atom matching time.
 
VII. CHEMICAL NAME SEARCHING
 
  Chemical  name  searching has been a problem of special note in the field
of  chemical  information systems. Most chemical names are long and complex
strings  which  are  not  easily searchable by standard substring searching
mechanisms.  This problem is compounded by the fact that most chemicals are
known by many systematic and/or tradenames.
 
  Chemical  name  searching  can  be  accomplished  by storing and indexing
carefully  defined  name fragments, as well as indexing the complex strings
of  the complete chemical names. Searching can be performed on a partial or
complete chemical name query using standard relational database technology.

  To  optimize  the  search,  the query is decomposed into its constituent
chemical  terms.  The  terms  are sorted in ascending order by frequency of
occurrence  found by looking up the number of compounds having a particular
term  in a stored table. This stored table is created by scanning all names
of  structures upon registration, and storing frequency information in that
table. Thus, this table acts as an index to chemical name fragments.
 
  Given  this  list  of  chemical  terms,  the  search  can be performed by
intersecting  the  resulting  SELECT  statements  or  using  one to drive a
correlated subquery.
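
  A minimal sketch of the frequency table and the intersected SELECT
statements, using SQLite, is shown below. The table names name_term and
term_freq, the function names, and the sample fragments in the usage note
are all hypothetical; how names are split into fragments is not specified
in the text, so fragments are passed in already split.

```python
import sqlite3


def build_name_index(names):
    """names maps registry number -> list of name fragments."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE name_term (term TEXT, registry_no INTEGER)")
    db.execute("CREATE TABLE term_freq (term TEXT PRIMARY KEY, n INTEGER)")
    for registry_no, terms in names.items():
        for term in set(terms):
            db.execute("INSERT INTO name_term VALUES (?, ?)",
                       (term, registry_no))
            # Maintain the per-term compound count at registration time.
            db.execute("INSERT OR IGNORE INTO term_freq VALUES (?, 0)",
                       (term,))
            db.execute("UPDATE term_freq SET n = n + 1 WHERE term = ?",
                       (term,))
    return db


def order_terms(db, terms):
    """Sort query terms by ascending frequency (rarest first)."""
    def freq(term):
        row = db.execute("SELECT n FROM term_freq WHERE term = ?",
                         (term,)).fetchone()
        return row[0] if row else 0
    return sorted(terms, key=freq)


def name_search(db, terms):
    """Intersect one SELECT per term; expects at least one term."""
    ordered = order_terms(db, terms)
    sql = " INTERSECT ".join(
        "SELECT registry_no FROM name_term WHERE term = ?" for _ in ordered)
    return sorted(row[0] for row in db.execute(sql, ordered))
```

  For example, with an index built from {1: ["chloro", "benzene"],
2: ["methyl", "benzene"], 3: ["chloro", "methyl", "benzene"]}, a search
for "benzene" and "chloro" evaluates the rarer term "chloro" first and
returns registry numbers 1 and 3.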
 
  Since the chemical name information is handled entirely by the relational
database,  the data is then easily integrated with the rest of the chemical
information.
 
VIII. MOLECULAR FORMULA SEARCHING AND KEY SEARCHING
 
  Molecular  formula  searching  can be done by using standard SQL string
search methods on all or part of the formula. Key searching (lookup by
identifier) is a
standard SQL operation.
 
IX. DATA INTEGRATION AND IMPORT/EXPORT OF DATA
 
  A  significant  advantage  of  basing  a chemical information system in a
relational  database  is  the  ease  with  which  the structure data can be
combined  with  related  data,  resulting in a complete, integrated system.
This  allows  information  to be easily imported from other systems into the
RDBMS, and exported from it, using standard RDBMS functionality.
 
X. DYNAMIC QUERIES
 
  As  is  true  with  all  relational  databases,  the design of the system
decomposes  into  a  series  of entities, relationships, and functions. The
relationships  among  entities  are  rigorously  defined  since referential
integrity  is  the  cornerstone  of  relational database design. A chemical
information system implemented using relational technology must be designed
with these considerations in mind.
 
  A  natural  relationship  exists  between the database structures and the
resulting  substructure  searches  with  each  search resulting in a set of
compound  identifiers. In the present invention, a relational table is used
to store the set of identifiers during each search of the database. This is
implemented  by  creating  a  table  to store general information about the
query (current user, date, query structure, options and search statistics).
A  related  table  is  created to store the identifiers of those structures
matching the query.
 
  As  new  structures  are  registered in the system, the set of structures
identified  as  resulting  from  earlier queries becomes obsolete since the
structure  database  contains  structures  not  present  at the time of the
original  search. This is, in a sense, a violation of referential integrity
because the relationship between structures and queries is not maintained.
 
  In  the  present  invention,  however,  the concept of dynamic queries is
introduced.  When  a  new  structure is registered, the system will examine
those  queries  designated  as  dynamic, and will add the identifier to the
search  result set for each query matching the new structure. This process,
made   simple   by  relational  technology,  allows  the  system  to  offer
functionality never before available in chemical information systems.
 
  Dynamic  queries  are  analogous to relational views. That is, they allow
searches  of  the database to be stored as objects in the database that are
always   current.  The  following  example  will  illustrate  the  type  of
functionality made possible by dynamic queries.
 
  As  shown  in  FIG.  8,  when  a  user  performs  a  search  at  step  60
(substructure,  chemical  name,  molecular  formula), the system stores the
resulting  set  of  compound  identifiers  at step 62. As the user examines
these  compounds,  the  system flags each compound as having been viewed by
the user at step 64. Therefore, the system always knows the search results,
and the extent to which the user has reviewed them.
 
  A  user  interested  in a particular class of structures (e.g., steroids)
would   perform  a  search  once  and  designate  the  search  as  dynamic.
Thereafter,  the  search will be maintained automatically by the system. In
fact,  the  system  would  notify  the  user  whenever  a  new  steroid was
registered  in  the  system  at  step 66. This is done by having the system
perform  all  dynamic  queries  on  any  newly registered molecule as it is
registered  into  the database at step 68 and notifying the user if a match
occurs  at  step 70. The user could then view the previously unseen results
at  step  72  without  having  to  repeat the query or view previously seen
results.
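
  The FIG. 8 flow can be sketched with three SQLite tables. All table,
column and function names here are hypothetical, keys are stored packed
(without display spaces), and only the key screen is applied at
registration; a fuller version would also run ABAM before recording a
match.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE search_key (registry_no INTEGER, key TEXT);
CREATE TABLE query (query_id INTEGER PRIMARY KEY, owner TEXT,
                    query_key TEXT, dynamic INTEGER);
CREATE TABLE query_result (query_id INTEGER, registry_no INTEGER,
                           viewed INTEGER DEFAULT 0);
""")


def register(registry_no, keys):
    """Store a new structure's keys, then re-run every dynamic query on it.

    Returns the owners to notify (steps 66 to 70).
    """
    db.executemany("INSERT INTO search_key VALUES (?, ?)",
                   [(registry_no, key) for key in keys])
    notified = []
    queries = db.execute(
        "SELECT query_id, owner, query_key FROM query WHERE dynamic = 1"
    ).fetchall()
    for query_id, owner, query_key in queries:
        hit = db.execute(
            "SELECT 1 FROM search_key WHERE registry_no = ? AND key LIKE ?",
            (registry_no, query_key)).fetchone()
        if hit:
            db.execute(
                "INSERT INTO query_result (query_id, registry_no) "
                "VALUES (?, ?)", (query_id, registry_no))
            notified.append(owner)
    return notified


def unseen(query_id):
    """Return results not yet viewed and flag them as viewed (64, 72)."""
    rows = db.execute(
        "SELECT registry_no FROM query_result "
        "WHERE query_id = ? AND viewed = 0 ORDER BY registry_no",
        (query_id,)).fetchall()
    db.execute("UPDATE query_result SET viewed = 1 WHERE query_id = ?",
               (query_id,))
    return [row[0] for row in rows]
```

  After a dynamic query is inserted into the query table, each call to
register reports which owners gained a new match, and unseen returns only
the results that owner has not yet viewed.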
 
  While  dynamic  queries  would  not  have much importance with relatively
static  databases,  they  would  have  many  uses  in  the system serving a
research environment. Heavy use of dynamic queries could require allocation
of   significant   amounts  of  disk  space  for  storing  search  results.
Additionally,   the  performance  of  the  registration  process  could  be
adversely affected by the presence of a large number of dynamic queries.
 
  These  potential  problems  can  be  controlled  by the introduction of a
resource  allocation  system  with each user being assigned two quotas. The
first  quota  controls the number of dynamic queries that the user can have
active  at  any one time, which will protect performance at the time of the
registration  process.  The  second  quota will control the total number of
structure  identifiers  that  each  user  has  stored by dynamic queries to
conserve disk space.
 
  Alternatively,  users  would  be allowed to disable these quota systems,
but  this  may  slow  the  system  during  the registration process, or may
exhaust disk space for the database.
 
XI. STRUCTURE CLASSES
 
  The  division  of  chemical  structures into classes based on overlapping
criteria  (e.g., functional groups, ring systems) has long been used as an
organizational   technique   in  chemistry.  Chemical  information  systems
typically  provide for this class system by allowing users to intersect the
results  of  different  searches.  While  this  intersection is a necessary
feature  of  any  chemical  information  system,  it  does  not address the
fundamental  importance of the classification schemes used in chemistry. In
the  present  invention,  a  mechanism will be provided for maintaining any
number  of  classification  schemes  in  the database for structures. These
schemes  or structure classes can be privately defined by individual users,
or can be used as a system-wide search aid.
 
  A  "structure  class"  is  defined  to  be a set of structure identifiers
resulting  from  a substructure search, a chemical name search, a molecular
formula  search,  or  a combination of these searches. Structure classes
are  an  application  of dynamic queries used to limit the scope of future
searches.  For  example,  the  system  may  maintain  a structure class for
steroids.  When  a user performs a search, he or she can designate that the
result   should  be  restricted  to  the  members  of  the  steroid  class.
Accordingly, the user could simply query the database for all steroids that
have a particular substructure without drawing the entire steroid ring.
 
  This  results  in two primary benefits: The first benefit is that queries
are  simplified,  i.e.,  there  is no need to draw complex queries, and the
second  benefit  is  that the screening phase need only be applied to those
compounds already known to be members of the structure class.
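
  Restricting the screen to a structure class amounts to a join against
the stored class membership. The sketch below uses SQLite; the tables
search_key and class_member, the class name "bromo", and the packed sample
keys are hypothetical illustrations.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE search_key (registry_no INTEGER, key TEXT);
CREATE TABLE class_member (class_name TEXT, registry_no INTEGER);
INSERT INTO search_key VALUES (1, 'Od.cc.b.c...'), (2, 'Od.cc.a.c...');
INSERT INTO class_member VALUES ('bromo', 1);
""")


def search_in_class(db, query_key, class_name):
    """Apply the key screen only to members of one structure class."""
    sql = """
        SELECT DISTINCT k.registry_no
        FROM search_key AS k
        JOIN class_member AS m ON m.registry_no = k.registry_no
        WHERE m.class_name = ? AND k.key LIKE ?
        ORDER BY k.registry_no
    """
    return [row[0] for row in db.execute(sql, (class_name, query_key))]
```

  The query key 'Od.%c%.%' matches keys of both stored structures, but
within the "bromo" class only registry number 1 is screened and returned.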
 
  Dynamic  queries and structure classes both exhibit a common benefit--the
overhead involved in structure searching is encountered only once (when the
dynamic  query  or  structure  class is defined) and additional overhead is
distributed  evenly  across  subsequent  updates  to the chemical structure
database.
 
  From the preceding description, it is evident that the invention has been
described in detail by reference to a particular embodiment adapted for use
in  the  field of chemistry. Although this invention offers many advantages
in  this  field,  it  may be used in other fields wherein structure data is
stored  advantageously as well. Accordingly, this invention is not intended
to  be  limited by the details of the preferred embodiment described above,
but rather by the terms of the appended claims.
?
 
 
 e as=scriptgen p

Ref   Items  Index-term
E1        1  AR=952
E2        1  AR=972
E3        0 *AS=SCRIPTGEN P
E4   236007  AU=A
E5        5  AU=A ADOLPH AMEN RA
E6        5  AU=A ADOLPH AMEN-RA
E7        1  AU=A AE BUTTED LYNG DORRIT A
E8        1  AU=A AE LYNG DORRIT A
E9        1  AU=A COSTA ANTHONY
E10       1  AU=A COURT ANDY
E11       1  AU=A HAANS P
E12       4  AU=A HEARN MICHAEL A
 
          Enter P or PAGE for more
?
 
 
 e pa=scriptgen

Ref   Items  Index-term
E1        9  PA=SCRIPTEL CORPORATION
E2        1  PA=SCRIPTEL HOLDING INC
E3       18 *PA=SCRIPTGEN
E4       18  PA=SCRIPTGEN PHARMACEUTICALS INC
E5       31  PA=SCRIPTO
E6       28  PA=SCRIPTO INC
E7        1  PA=SCRIPTO LLC
E8        1  PA=SCRIPTO TOKAI CORP
E9        1  PA=SCRIPTO TOKAI CORPORATION
E10       1  PA=SCRIPTO TOKAI INC
E11       1  PA=SCRIPTO-TOKAI CORPORATION
E12       1  PA=SCRIPTO-TOKAI INC
 
          Enter P or PAGE for more
?
 
 
 S E18;TYPE /PA,PN/ALL

      S3       9  PA="SCRIPTPRO"
 
  3/PA,PN/1     (Item 1 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
PATENT NO.:  6,161,721
ISSUED:      December 19, 2000 (20001219)
ASSIGNEE(s): Scriptpro LLC, (A U.S. Company or Corporation), Mission, KS
             (Kansas), US (United States of America)
             [Assignee Code(s): 46711]
 
 
  3/PA,PN/2     (Item 2 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
PATENT NO.:  6,155,485
ISSUED:      December 05, 2000 (20001205)
ASSIGNEE(s): Scriptpro LLC, (A U.S. Company or Corporation), Mission, KS
             (Kansas), US (United States of America)
             [Assignee Code(s): 46711]
 
 
  3/PA,PN/3     (Item 3 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
PATENT NO.:  6,085,938
ISSUED:      July 11, 2000 (20000711)
ASSIGNEE(s): Scriptpro LLC, (A U.S. Company or Corporation), Mission, KS
             (Kansas), US (United States of America)
             [Assignee Code(s): 46711]
 
 
  3/PA,PN/4     (Item 4 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
PATENT NO.:  5,897,024
ISSUED:      April 27, 1999 (19990427)
ASSIGNEE(s): Scriptpro LLC, (A U.S. Company or Corporation), Mission, KS
             (Kansas), US (United States of America)
             [Assignee Code(s): 46711]
 
 
  3/PA,PN/5     (Item 5 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
PATENT NO.:  5,873,488
ISSUED:      February 23, 1999 (19990223)
ASSIGNEE(s): ScriptPro, LLC, (A U.S. Company or Corporation), Mission, KS
             (Kansas), US (United States of America)
             [Assignee Code(s): 46711]
 
 
  3/PA,PN/6     (Item 6 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
PATENT NO.:  5,860,563
ISSUED:      January 19, 1999 (19990119)
ASSIGNEE(s): Scriptpro, LLC, (A U.S. Company or Corporation), Mission, KS
             (Kansas), US (United States of America)
             [Assignee Code(s): 46711]
 
 
  3/PA,PN/7     (Item 7 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
PATENT NO.:  5,798,020
ISSUED:      August 25, 1998 (19980825)
ASSIGNEE(s): Scriptpro, LLC, (A U.S. Company or Corporation), Mission, KS
             (Kansas), US (United States of America)
             [Assignee Code(s): 46711]
 
 
  3/PA,PN/8     (Item 8 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
PATENT NO.:  5,762,235
ISSUED:      June 09, 1998 (19980609)
ASSIGNEE(s): ScriptPro, L L C , (A U.S. Company or Corporation), Shawnee
             Mission, KS (Kansas), US (United States of America)
             [Assignee Code(s): 46711]
 
 
  3/PA,PN/9     (Item 9 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
PATENT NO.:  5,713,487
ISSUED:      February 03, 1998 (19980203)
ASSIGNEE(s): ScriptPro L L C , (A U.S. Company or Corporation), Shawnee
             Mission, KS (Kansas), US (United States of America)
?
 
 
 S E4/PA,PN/ALL

>>>Possible typing error near /
?
 
 
 S E4; T /PA,PN/ALL

      S4      18  PA="SCRIPTGEN PHARMACEUTICALS INC"
 
  4/PA,PN/1     (Item 1 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
PATENT NO.:  6,239,291
ISSUED:      May 29, 2001 (20010529)
ASSIGNEE(s): Scriptgen Pharmaceuticals, Inc , (A U.S. Company or
             Corporation), Waltham, MA (Massachusetts), US (United States
             of America)
             Trustees of Boston University, (A U.S. Company or Corporation)
             , Boston, MA (Massachusetts), US (United States of America)
             [Assignee Code(s): 1308;  40527]
 
 
  4/PA,PN/2     (Item 2 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
PATENT NO.:  6,228,882
ISSUED:      May 08, 2001 (20010508)
ASSIGNEE(s): Anadys Pharmaceuticals, Inc , (A U.S. Company or Corporation),
             Boston, MA (Massachusetts), US (United States of America)
             Scriptgen Pharmaceuticals, Inc , (A U.S. Company or
             Corporation), Waltham, MA (Massachusetts), US (United States
             of America)
             [Assignee Code(s): 40527;  56074]
 
 
  4/PA,PN/3     (Item 3 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
PATENT NO.:  6,194,399
ISSUED:      February 27, 2001 (20010227)
ASSIGNEE(s): Scriptgen Pharmaceuticals, Inc , (A U.S. Company or
             Corporation), Waltham, MA (Massachusetts), US (United States
             of America)
             [Assignee Code(s): 40527]
 
 
  4/PA,PN/4     (Item 4 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
PATENT NO.:  6,184,393
ISSUED:      February 06, 2001 (20010206)
ASSIGNEE(s): Scriptgen Pharmaceuticals, Inc , (A U.S. Company or
             Corporation), Waltham, MA (Massachusetts), US (United States
             of America)
             Trustees of Boston University, (A U.S. Company or Corporation)
             , Boston, MA (Massachusetts), US (United States of America)
             [Assignee Code(s): 1308;  40527]
 
 
  4/PA,PN/5     (Item 5 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
PATENT NO.:  6,165,998
ISSUED:      December 26, 2000 (20001226)
ASSIGNEE(s): Scriptgen Pharmaceuticals, Inc , (A U.S. Company or
             Corporation), Waltham, MA (Massachusetts), US (United States
             of America)
             [Assignee Code(s): 40527]
 
 
  4/PA,PN/6     (Item 6 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
PATENT NO.:  6,140,361
ISSUED:      October 31, 2000 (20001031)
ASSIGNEE(s): Scriptgen Pharmaceuticals, Inc , (A U.S. Company or
             Corporation), Waltham, MA (Massachusetts), US (United States
             of America)
             [Assignee Code(s): 40527]
 
 
  4/PA,PN/7     (Item 7 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
PATENT NO.:  6,127,551
ISSUED:      October 03, 2000 (20001003)
ASSIGNEE(s): Scriptgen Pharmaceuticals, Inc , (A U.S. Company or
             Corporation), Waltham, MA (Massachusetts), US (United States
             of America)
             Trustees of Boston University, (A U.S. Company or Corporation)
             , Boston, MA (Massachusetts), US (United States of America)
             [Assignee Code(s): 1308;  40527]
 
 
  4/PA,PN/8     (Item 8 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
PATENT NO.:  6,090,953
ISSUED:      July 18, 2000 (20000718)
ASSIGNEE(s): Scriptgen Pharmaceuticals, Inc , (A U.S. Company or
             Corporation), Boston, MA (Massachusetts), US (United States of
             America)
             Trustees of Boston University, (A U.S. Company or Corporation)
             , Waltham, MA (Massachusetts), US (United States of America)
             [Assignee Code(s): 1308;  40527]
 
 
  4/PA,PN/9     (Item 9 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
PATENT NO.:  6,051,373
ISSUED:      April 18, 2000 (20000418)
ASSIGNEE(s): Scriptgen Pharmaceuticals, Inc , (A U.S. Company or
             Corporation), Waltham, MA (Massachusetts), US (United States
             of America)
             University of Massachusetts Medical Center, (A U.S. Company or
             Corporation), Worcester, MA (Massachusetts), US (United States
             of America)
             [Assignee Code(s): 22237;  40527]
 
 
  4/PA,PN/10     (Item 10 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
PATENT NO.:  6,022,983
ISSUED:      February 08, 2000 (20000208)
ASSIGNEE(s): Scriptgen Pharmaceuticals, Inc , (A U.S. Company or
             Corporation), Waltham, MA (Massachusetts), US (United States
             of America)
             Trustees of Boston University, (A U.S. Company or Corporation)
             , Boston, MA (Massachusetts), US (United States of America)
             [Assignee Code(s): 1308;  40527]
 
 
  4/PA,PN/11     (Item 11 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
PATENT NO.:  6,020,488
ISSUED:      February 01, 2000 (20000201)
ASSIGNEE(s): Scriptgen Pharmaceuticals Inc , (A U.S. Company or
             Corporation), Waltham, MA (Massachusetts), US (United States
             of America)
             [Assignee Code(s): 40527]
 
 
  4/PA,PN/12     (Item 12 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
PATENT NO.:  6,004,779
ISSUED:      December 21, 1999 (19991221)
ASSIGNEE(s): Scriptgen Pharmaceuticals, Inc , (A U.S. Company or
             Corporation), Waltham, MA (Massachusetts), US (United States
             of America)
             [Assignee Code(s): 40527]
 
 
  4/PA,PN/13     (Item 13 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
PATENT NO.:  5,986,111
ISSUED:      November 16, 1999 (19991116)
ASSIGNEE(s): Scriptgen Pharmaceuticals, Inc , (A U.S. Company or
             Corporation), Waltham, MA (Massachusetts), US (United States
             of America)
             Trustees of Boston University, (A U.S. Company or Corporation)
             , Boston, MA (Massachusetts), US (United States of America)
             [Assignee Code(s): 1308;  40527]
 
 
  4/PA,PN/14     (Item 14 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
PATENT NO.:  5,942,547
ISSUED:      August 24, 1999 (19990824)
ASSIGNEE(s): Scriptgen Pharmaceuticals, Inc , (A U.S. Company or
             Corporation), Waltham, MA (Massachusetts), US (United States
             of America)
             [Assignee Code(s): 40527]
 
 
  4/PA,PN/15     (Item 15 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
PATENT NO.:  5,760,063
ISSUED:      June 02, 1998 (19980602)
ASSIGNEE(s): Scriptgen Pharmaceuticals, Inc , (A U.S. Company or
             Corporation), Medford, MA (Massachusetts), US (United States
             of America)
             [Assignee Code(s): 40527]
 
 
  4/PA,PN/16     (Item 16 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
PATENT NO.:  5,679,582
ISSUED:      October 21, 1997 (19971021)
ASSIGNEE(s): Scriptgen Pharmaceuticals, Inc , (A U.S. Company or
             Corporation), Medford, MA (Massachusetts), US (United States
             of America)
             [Assignee Code(s): 40527]
 
 
  4/PA,PN/17     (Item 17 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
PATENT NO.:  5,668,165
ISSUED:      September 16, 1997 (19970916)
ASSIGNEE(s): Scriptgen Pharmaceuticals, Inc , (A U.S. Company or
             Corporation), Medford, MA (Massachusetts), US (United States
             of America)
             [Assignee Code(s): 40527]
 
 
  4/PA,PN/18     (Item 18 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
PATENT NO.:  5,585,277
ISSUED:      December 17, 1996 (19961217)
ASSIGNEE(s): Scriptgen Pharmaceuticals, Inc, (A U.S. Company or
             Corporation), Medford, MA (Massachusetts), US (United States
             of America)
             [Assignee Code(s): 40527]
?
 
 
 e pa=cubist p

Ref   Items  Index-term
E1        1  PA=CUBIFORM DESIGN AND DEV CO INC CN
E2       14  PA=CUBIST
E3        0 *PA=CUBIST P
E4       14  PA=CUBIST PHARMACEUTICALS INC
E5        3  PA=CUBIT
E6        2  PA=CUBIT CORP
E7        2  PA=CUBIT CORPORATION
E8        1  PA=CUBIT LIMITED
E9        1  PA=CUBIT LTD GB
E10      10  PA=CUBITAL
E11      10  PA=CUBITAL LTD
E12      10  PA=CUBITAL LTD IL
 
          Enter P or PAGE for more
?
 
 
 S E4; T/PN,PA/ALL

      S5      14  PA="CUBIST PHARMACEUTICALS INC"
 
  5/PN,PA/1     (Item 1 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
PATENT NO.:  6,221,640
ISSUED:      April 24, 2001 (20010424)
ASSIGNEE(s): Cubist Pharmaceuticals, Inc , (A U.S. Company or Corporation),
             Cambridge, MA (Massachusetts), US (United States of America)
             [Assignee Code(s): 41839]
 
 
  5/PN,PA/2     (Item 2 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
PATENT NO.:  6,174,713
ISSUED:      January 16, 2001 (20010116)
ASSIGNEE(s): Cubist Pharmaceuticals, Inc , (A U.S. Company or Corporation),
             Cambridge, MA (Massachusetts), US (United States of America)
             [Assignee Code(s): 41839]
 
 
  5/PN,PA/3     (Item 3 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
PATENT NO.:  6,153,645
ISSUED:      November 28, 2000 (20001128)
ASSIGNEE(s): Cubist Pharmaceuticals, Inc , (A U.S. Company or Corporation),
             Cambridge, MA (Massachusetts), US (United States of America)
             [Assignee Code(s): 41839]
 
 
  5/PN,PA/4     (Item 4 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
PATENT NO.:  5,912,140
ISSUED:      June 15, 1999 (19990615)
ASSIGNEE(s): Cubist Pharmaceuticals, Inc , (A U.S. Company or Corporation),
             Cambridge, MA (Massachusetts), US (United States of America)
             [Assignee Code(s): 41839]
 
 
  5/PN,PA/5     (Item 5 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
PATENT NO.:  5,885,815
ISSUED:      March 23, 1999 (19990323)
ASSIGNEE(s): Cubist Pharmaceuticals, Inc , (A U.S. Company or Corporation),
             Cambridge, MA (Massachusetts), US (United States of America)
             [Assignee Code(s): 41839]
 
 
  5/PN,PA/6     (Item 6 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
PATENT NO.:  5,871,987
ISSUED:      February 16, 1999 (19990216)
ASSIGNEE(s): Cubist Pharmaceuticals, Inc , (A U.S. Company or Corporation),
             Cambridge, MA (Massachusetts), US (United States of America)
             [Assignee Code(s): 41839]
 
 
  5/PN,PA/7     (Item 7 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
PATENT NO.:  5,824,657
ISSUED:      October 20, 1998 (19981020)
ASSIGNEE(s): Cubist Pharmaceuticals, Inc , (A U.S. Company or Corporation),
             Cambridge, MA (Massachusetts), US (United States of America)
             [Assignee Code(s): 41839]
 
 
  5/PN,PA/8     (Item 8 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
PATENT NO.:  5,801,013
ISSUED:      September 01, 1998 (19980901)
ASSIGNEE(s): Cubist Pharmaceuticals, Inc , (A U.S. Company or Corporation),
             Cambridge, MA (Massachusetts), US (United States of America)
             [Assignee Code(s): 41839]
 
 
  5/PN,PA/9     (Item 9 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
PATENT NO.:  5,798,240
ISSUED:      August 25, 1998 (19980825)
ASSIGNEE(s): Cubist Pharmaceuticals, Inc , (A U.S. Company or Corporation),
             Cambridge, MA (Massachusetts), US (United States of America)
             [Assignee Code(s): 41839]
 
 
  5/PN,PA/10     (Item 10 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
PATENT NO.:  5,759,833
ISSUED:      June 02, 1998 (19980602)
ASSIGNEE(s): Cancer Institute, Japanese Foundation for Cancer Research, (A
             Non-U.S. Company or Corporation), Tokyo, JP (Japan)
             Cubist Pharmaceuticals, Inc , (A U.S. Company or Corporation),
             Cambridge, MA (Massachusetts), US (United States of America)
             [Assignee Code(s): 12629;  33923;  41839]
 
 
  5/PN,PA/11     (Item 11 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
PATENT NO.:  5,756,327
ISSUED:      May 26, 1998 (19980526)
ASSIGNEE(s): Cubist Pharmaceuticals, Inc , (A U.S. Company or Corporation),
             Cambridge, MA (Massachusetts), US (United States of America)
             [Assignee Code(s): 41839]
 
 
  5/PN,PA/12     (Item 12 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
PATENT NO.:  5,726,195
ISSUED:      March 10, 1998 (19980310)
ASSIGNEE(s): Cubist Pharmaceuticals, Inc , (A U.S. Company or Corporation),
             Cambridge, MA (Massachusetts), US (United States of America)
             [Assignee Code(s): 41839]
 
 
  5/PN,PA/13     (Item 13 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
PATENT NO.:  5,656,470
ISSUED:      August 12, 1997 (19970812)
ASSIGNEE(s): Cubist Pharmaceuticals, Inc, (A U.S. Company or Corporation),
             Cambridge, MA (Massachusetts), US (United States of America)
             [Assignee Code(s): 41839]
 
 
  5/PN,PA/14     (Item 14 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
PATENT NO.:  5,629,188
ISSUED:      May 13, 1997 (19970513)
ASSIGNEE(s): Cancer Institute, Japanese Foundation for Cancer Research, (A
             Non-U.S. Company or Corporation), Tokyo, JP (Japan)
             Cubist Pharmaceuticals, Inc, (A U.S. Company or Corporation),
             Cambridge, MA (Massachusetts), US (United States of America)
             Massachusetts Institute of Technology, (A U.S. Company or
             Corporation), Cambridge, MA (Massachusetts), US (United States
             of America)
             [Assignee Code(s): 12629;  33923;  41839;  52912]
?
 
 
 TYPE /PN,TI,PA/ALL


  5/PN,TI,PA/1     (Item 1 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
ENTEROCOCCAL  AMINOACYL-TRNA SYNTHETASE PROTEINS, NUCLEIC ACIDS AND STRAINS
COMPRISING SAME
 
PATENT NO.:  6,221,640
ISSUED:      April 24, 2001 (20010424)
ASSIGNEE(s): Cubist Pharmaceuticals, Inc , (A U.S. Company or Corporation),
             Cambridge, MA (Massachusetts), US (United States of America)
             [Assignee Code(s): 41839]
 
 
  5/PN,TI,PA/2     (Item 2 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
CANDIDA  CYTOPLASMIC  TRYPTOPHANYL-TRNA  SYNTHETASE PROTEINS, NUCLEIC ACIDS
AND STRAINS COMPRISING SAME
 
PATENT NO.:  6,174,713
ISSUED:      January 16, 2001 (20010116)
ASSIGNEE(s): Cubist Pharmaceuticals, Inc , (A U.S. Company or Corporation),
             Cambridge, MA (Massachusetts), US (United States of America)
             [Assignee Code(s): 41839]
 
 
  5/PN,TI,PA/3     (Item 3 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
HETEROCYCLES AS ANTIMICROBIAL AGENTS
 
PATENT NO.:  6,153,645
ISSUED:      November 28, 2000 (20001128)
ASSIGNEE(s): Cubist Pharmaceuticals, Inc , (A U.S. Company or Corporation),
             Cambridge, MA (Massachusetts), US (United States of America)
             [Assignee Code(s): 41839]
 
 
  5/PN,TI,PA/4     (Item 4 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
RECOMBINANT  PNEUMOCYSTIS  CARINII  AMINOACYL TRNA SYNTHETASE GENES, TESTER
STRAINS AND ASSAYS
[ Isolated  nucleic  acid  which codes a functional portion of a lysyl-tRNA
synthetase; having catalytic activity and binding function; drug screening
for AIDS therapy]
 
PATENT NO.:  5,912,140
ISSUED:      June 15, 1999 (19990615)
ASSIGNEE(s): Cubist Pharmaceuticals, Inc , (A U.S. Company or Corporation),
             Cambridge, MA (Massachusetts), US (United States of America)
             [Assignee Code(s): 41839]
 
 
  5/PN,TI,PA/5     (Item 5 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
CANDIDA  ISOLEUCYL-TRNA  SYNTHETASE  PROTEINS,  NUCLEIC  ACIDS  AND STRAINS
COMPRISING SAME
[Nucleic acid encoding Candida isoleucyl-tRNA synthetase]
 
PATENT NO.:  5,885,815
ISSUED:      March 23, 1999 (19990323)
ASSIGNEE(s): Cubist Pharmaceuticals, Inc , (A U.S. Company or Corporation),
             Cambridge, MA (Massachusetts), US (United States of America)
             [Assignee Code(s): 41839]
 
 
  5/PN,TI,PA/6     (Item 6 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
CANDIDA   TYROSYL-TRNA  SYNTHETASE  PROTEINS,  NUCLEIC  ACIDS  AND  STRAINS
COMPRISING SAME
 
PATENT NO.:  5,871,987
ISSUED:      February 16, 1999 (19990216)
ASSIGNEE(s): Cubist Pharmaceuticals, Inc , (A U.S. Company or Corporation),
             Cambridge, MA (Massachusetts), US (United States of America)
             [Assignee Code(s): 41839]
 
 
  5/PN,TI,PA/7     (Item 7 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
AMINOACYL SULFAMIDES FOR THE TREATMENT OF HYPERPROLIFERATIVE DISORDERS
[Anticarcinogenic agents, skin disorders and psoriasis]
 
PATENT NO.:  5,824,657
ISSUED:      October 20, 1998 (19981020)
ASSIGNEE(s): Cubist Pharmaceuticals, Inc , (A U.S. Company or Corporation),
             Cambridge, MA (Massachusetts), US (United States of America)
             [Assignee Code(s): 41839]
 
 
  5/PN,TI,PA/8     (Item 8 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
HELICOBACTER  AMINOACYL-TRNA SYNTHETASE PROTEINS, NUCLEIC ACIDS AND STRAINS
COMPRISING SAME
 
PATENT NO.:  5,801,013
ISSUED:      September 01, 1998 (19980901)
ASSIGNEE(s): Cubist Pharmaceuticals, Inc , (A U.S. Company or Corporation),
             Cambridge, MA (Massachusetts), US (United States of America)
             [Assignee Code(s): 41839]
 
 
  5/PN,TI,PA/9     (Item 9 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
RECOMBINANT  MYCOBACTERIAL  METHIONYL-TRNA  SYNTHETASE GENES AND METHODS OF
USE THEREFORE
[Isolated nucleic acid encoding aminoacyl-transfer RNA synthetase]
 
PATENT NO.:  5,798,240
ISSUED:      August 25, 1998 (19980825)
ASSIGNEE(s): Cubist Pharmaceuticals, Inc , (A U.S. Company or Corporation),
             Cambridge, MA (Massachusetts), US (United States of America)
             [Assignee Code(s): 41839]
 
 
  5/PN,TI,PA/10     (Item 10 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
HUMAN  ISOLEUCYL-TRNA SYNTHETASE PROTEINS, NUCLEIC ACIDS AND TESTER STRAINS
COMPRISING SAME
 
PATENT NO.:  5,759,833
ISSUED:      June 02, 1998 (19980602)
ASSIGNEE(s): Cancer Institute, Japanese Foundation for Cancer Research, (A
             Non-U.S. Company or Corporation), Tokyo, JP (Japan)
             Cubist Pharmaceuticals, Inc , (A U.S. Company or Corporation),
             Cambridge, MA (Massachusetts), US (United States of America)
             [Assignee Code(s): 12629;  33923;  41839]
 
 
  5/PN,TI,PA/11     (Item 11 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
RECOMBINANT  MYCOBACTERIAL  ISOLEUCYL-TRNA SYNTHETASE GENES, TESTER STRAINS
AND ASSAYS
[ Isolated  nucleic acid encoding isoleucyl-transfer RNA synthetase of genus
Mycobacterium]
 
PATENT NO.:  5,756,327
ISSUED:      May 26, 1998 (19980526)
ASSIGNEE(s): Cubist Pharmaceuticals, Inc , (A U.S. Company or Corporation),
             Cambridge, MA (Massachusetts), US (United States of America)
             [Assignee Code(s): 41839]
 
 
  5/PN,TI,PA/12     (Item 12 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
AMINOACYL ADENYLATE MIMICS AS NOVEL ANTIMICROBIAL AND ANTIPARASITIC AGENTS
 
PATENT NO.:  5,726,195
ISSUED:      March 10, 1998 (19980310)
ASSIGNEE(s): Cubist Pharmaceuticals, Inc , (A U.S. Company or Corporation),
             Cambridge, MA (Massachusetts), US (United States of America)
             [Assignee Code(s): 41839]
 
 
  5/PN,TI,PA/13     (Item 13 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
RECOMBINANT  MYCOBACTERIAL  SERYL-TRNA SYNTHETASE GENES, TESTER STRAINS AND
ASSAYS
 
PATENT NO.:  5,656,470
ISSUED:      August 12, 1997 (19970812)
ASSIGNEE(s): Cubist Pharmaceuticals, Inc, (A U.S. Company or Corporation),
             Cambridge, MA (Massachusetts), US (United States of America)
             [Assignee Code(s): 41839]
 
 
  5/PN,TI,PA/14     (Item 14 from file: 654) 
DIALOG(R)File 654:(c) format only 2001 The Dialog Corp. All rts. reserv.
 
HUMAN  ALANYL-TRNA  SYNTHETASE  PROTEINS,  NUCLEIC ACIDS AND TESTER STRAINS
COMPRISING SAME
[ Methods  for  making human alanyl-tRNA synthetase, e.g., maintaining host
cells  comprising  recombinant gene encoding synthetase under conditions in
which gene is expressed, isolating synthetase; related vectors, plasmid]
 
PATENT NO.:  5,629,188
ISSUED:      May 13, 1997 (19970513)
ASSIGNEE(s): Cancer Institute, Japanese Foundation for Cancer Research, (A
             Non-U.S. Company or Corporation), Tokyo, JP (Japan)
             Cubist Pharmaceuticals, Inc, (A U.S. Company or Corporation),
             Cambridge, MA (Massachusetts), US (United States of America)
             Massachusetts Institute of Technology, (A U.S. Company or
             Corporation), Cambridge, MA (Massachusetts), US (United States
             of America)
             [Assignee Code(s): 12629;  33923;  41839;  52912]
?
 
 
 B PATENTS

>>>         123 is unauthorized
>>>         340 is unauthorized
>>>         344 is unauthorized
>>>         345 is unauthorized
>>>         351 is unauthorized
>>>         352 is unauthorized
>>>         353 is unauthorized
>>>         447 is unauthorized
>>>         670 is unauthorized
>>>9 of the specified files are not available
       19sep01 18:16:28 User726734 Session D434.3
            $0.32    0.212 DialUnits File653
               $0.00  1 Type(s) in Format  2
               $0.00  1 Type(s) in Format  9 (UDF)
            $0.00  2 Types
     $0.32  Estimated cost File653
            $0.18    0.120 DialUnits File652
     $0.18  Estimated cost File652
            $4.98    3.321 DialUnits File654
               $0.00  2 Type(s) in Format  2
               $0.00  55 Type(s) in Format  2 (UDF)
               $0.00  2 Type(s) in Format  9 (UDF)
            $0.00  59 Types
     $4.98  Estimated cost File654
            OneSearch, 3 files,  3.654 DialUnits FileOS
     $0.60  INTERNET
     $6.08  Estimated cost this search
     $6.87  Estimated total session cost   4.149 DialUnits
 
SYSTEM:OS  - DIALOG OneSearch
  File 342:Derwent Patents Citation Indx  1978-01/200147
         (c) 2001 Derwent Info Ltd
 *File 342: Price changes as of 1/1/01.  Please see HELP RATES 342. 
  File 347:JAPIO  OCT 1976-2001/May(UPDATED 010905)
         (c) 2001 JPO & JAPIO
 *File 347: JAPIO data problems with year 2000 records are now fixed. 
Alerts have been run.  See HELP NEWS 347 for details.
  File 348:EUROPEAN PATENTS  1978-2001/Sep W02
         (c) 2001 European Patent Office
  File 349:PCT Fulltext  1983-2001/UB=20010906, UT=20010830
         (c) 2001 WIPO/MicroPat
  File 371:French Patents  1961-2001/BOPI 200136
         (c) 2001 INPI.  All rts. reserv.
  File 652:US Patents Fulltext  1971-1979
         (c) format only 2001 The Dialog Corp.
 *File 652: Reassignment data current through June 6, 2001 recordings. 
Due to processing problems, the SORT command is not working.
  File 653:US Patents Fulltext  1980-1989
         (c) format only 2001 The Dialog Corp.
 *File 653: Reassignment data current through June 6, 2001 recordings. 
Due to processing problems, the SORT command is not working.
  File 654:US PAT.FULL.  1990-2001/Sep 18
         (c) format only 2001 The Dialog Corp.
 *File 654: Reassignment data current through June 6, 2001 recordings 
 
      Set  Items  Description
      ---  -----  -----------
?
 
 
 B 342

       19sep01 18:16:47 User726734 Session D434.4
            $0.03    0.022 DialUnits File342
     $0.03  Estimated cost File342
            $0.03    0.022 DialUnits File347
     $0.03  Estimated cost File347
            $0.03    0.022 DialUnits File348
     $0.03  Estimated cost File348
            $0.03    0.022 DialUnits File349
     $0.03  Estimated cost File349
            $0.03    0.022 DialUnits File371
     $0.03  Estimated cost File371
            $0.03    0.022 DialUnits File652
     $0.03  Estimated cost File652
            $0.03    0.022 DialUnits File653
     $0.03  Estimated cost File653
            $0.03    0.022 DialUnits File654
     $0.03  Estimated cost File654
            OneSearch, 8 files,  0.173 DialUnits FileOS
     $0.02  INTERNET
     $0.26  Estimated cost this search
     $7.13  Estimated total session cost   4.321 DialUnits
 
File 342:Derwent Patents Citation Indx  1978-01/200147
       (c) 2001 Derwent Info Ltd
 *File 342: Price changes as of 1/1/01.  Please see HELP RATES 342. 
 
      Set  Items  Description
      ---  -----  -----------
?
 
?
 
 
 S PN=US 4642762 OR PN=US 5577239 OR PN=US 5950192

               1  PN=US 4642762
               1  PN=US 5577239
               1  PN=US 5950192
      S2       2  PN=US 4642762 OR PN=US 5577239 OR PN=US 5950192
?
 
 
 TYPE 2/2/ALL


  2/2/1 
DIALOG(R)File 342:Derwent Patents Citation Indx
(c) 2001 Derwent Info Ltd. All rts. reserv.
 
02412679  WPI Acc No: 96-151525/15
 Chemical structure storage method using relational database - storing 
 matrix of chemical structure including atoms and bonds in relational 
 database table and generating and storing search keys for each atom in 
 chemical structure 
Patent Assignee: (PSII-) PSI INT INC
Author (Inventor): MOORE J; BRAZIL J; HOOVER J R
Patent Family:
  Patent No   Kind Date           Examiner Field of Search
  WO 9606391     A2 960229 (BASIC)  None
  AU 9533202     A  960314
  EP 777882      A1 970611
  EP 777882      A4 971229
  JP 3193383     B2 010730
  JP 10507285    W  980714
  US 5577239     A  961119 364/496; 364/497; 364/499; 364/DIG.1; 364/DIG.2;
                               395/600
  US 5950192     A  990907 395/496; 395/497; 395/499; 395/600; 395/603;
                               702/19; 702/20; 702/22; 702/27; 707/1;
                               707/100; 707/102; 707/104; 707/19; 707/20;
                               707/22; 707/27; 707/3
  WO 9606391     A3 960509 364/496; 364/497; 364/499; 395/600
Derwent Week (Basic): 9615
Priority Data: US 288503 (940810)
Applications:  US 288503 (940810); AU 9533202 (950810); EP 95929457 (950810
    ); WO 95US10171 (950810); JP 96508133 (950810); US 883165 (970626)
Designated States
   (National): AM; AT; AU; BB; BG; BR; BY; CA; CH; CN; CZ; DE; DK; ES; FI;
     GB; GE; HU; JP; KE; KG; KP; KR; KZ; LK; LT; LU; LV; MD; MG; MN; MW; MX
     ; NO; NZ; PL; PT; RO; RU; SD; SE; SI; SK; TJ; TT; UA; US; UZ; VN
   (Regional): AT; BE; CH; DE; DK; ES; FR; GB; GR; IE; IT; KE; LI; LU; MC;
     MW; NL; OA; PT; SD; SE; SZ; UG
Derwent Class: T01
Int Pat Class: G06F-017/30
Number of Patents: 009
Number of Countries: 060
Number of Cited Patents: 036
Number of Cited Literature References: 039
Number of Citing Patents: 006
 
                              CITED PATENTS 
 
Family Member  Cited Patent  Cat     WPI Acc No   Assignee/Inventor
 
By Examiner:
EP 777882   A  EP 90895    A  A      83-790076/42 (NIIN-) JAPAN INFORM CENT
                          /ARAKI K
EP 777882   A  EP 213483   A  A      87-066235/10 (FUJF ) FUJI PHOTO FILM
                          CO LTD/FUJITA S
EP 777882   A  US 4642762  A  A      87-056518/08 (AMCH-) AMER CHEMICAL SOC
                          /FISANICK W
US 5577239  A  US 4642762  A         87-056518/08 (AMCH-) AMER CHEMICAL SOC
                          /FISANICK W
US 5577239  A  US 4811217  A         86-259665/40 (NIAS-) JAPAN ASSOC INT
                          CHE/TOKIZANE S; CHIHARA H
US 5577239  A  US 4855931  A         90-014420/02 (UYYA ) UNIV YALE/
                          SAUNDERS M
US 5577239  A  US 5025388  A         91-200832/27 (CRAM/) CRAMER R D/CRAMER
                          R D; WOLD S V
US 5577239  A  US 5056035  A         91-317888/43 (FUJF ) FUJI PHOTO FILM
                          CO LTD/FUJITA S
US 5577239  A  US 5249137  A         91-283380/39 (XERO ) XEROX CORP/KAPLAN
                          S; MALLGREEN W R; FACCI J S; DONALDSON J M
US 5577239  A  US 5367058  A         95-005895/01 (BECT ) BECTON DICKINSON
                          CO/MIZE P D; PITNER J B; LINN
US 5577239  A  US 5379234  A         91-283380/39 (XERO ) XEROX CORP/KAPLAN
                          S; MALLGREEN W R; FACCI J S; DONALDSON J M
US 5577239  A  US 5386507  A         95-081830/11 (KAHN/) KAHN S D; (TEIG/)
                          TEIG S L/TEIG S L; KAHN S D
US 5577239  A  US 5418944  A         92-260461/32 (IBMC ) INT BUSINESS
                          MACHINES CORP; (IBMC ) IBM SEMEA SRL/FABROCINI F;
                          DI PACE L
US 5577239  A  US 5463564  A         95-392190/50 (THRE-) 3-DIMENSIONAL
                          PHARM INC/BONE R F; AGRAFIOTIS D K; SALEMME F R;
                          SOLL R M
US 5950192  A  EP 90895    A2        83-790076/42 (NIIN-) JAPAN INFORM CENT
                          /ARAKI K
US 5950192  A  EP 213483   A2        87-066235/10 (FUJF ) FUJI PHOTO FILM
                          CO LTD/FUJITA S
US 5950192  A  US 4642762  A         87-056518/08 (AMCH-) AMER CHEMICAL SOC
                          /FISANICK W
US 5950192  A  US 4811217  A         86-259665/40 (NIAS-) JAPAN ASSOC INT
                          CHE/TOKIZANE S; CHIHARA H
US 5950192  A  US 4855931  A         90-014420/02 (UYYA ) UNIV YALE/
                          SAUNDERS M
US 5950192  A  US 5025388  A         91-200832/27 (CRAM/) CRAMER R D/CRAMER
                          R D; WOLD S V
US 5950192  A  US 5056035  A         91-317888/43 (FUJF ) FUJI PHOTO FILM
                          CO LTD/FUJITA S
US 5950192  A  US 5259137  A         93-367341/46 (BLAS-) BLASER
                          JAGDWAFFENFABRIK HORST/BLENK G; ZEH M
US 5950192  A  US 5367058  A         95-005895/01 (BECT ) BECTON DICKINSON
                          CO/MIZE P D; PITNER J B; LINN
US 5950192  A  US 5379234  A         91-283380/39 (XERO ) XEROX CORP/KAPLAN
                          S; MALLGREEN W R; FACCI J S; DONALDSON J M
US 5950192  A  US 5386507  A         95-081830/11 (KAHN/) KAHN S D; (TEIG/)
                          TEIG S L/TEIG S L; KAHN S D
US 5950192  A  US 5418944  A         92-260461/32 (IBMC ) INT BUSINESS
                          MACHINES CORP; (IBMC ) IBM SEMEA SRL/FABROCINI F;
                          DI PACE L
US 5950192  A  US 5463564  A         95-392190/50 (THRE-) 3-DIMENSIONAL
                          PHARM INC/BONE R F; AGRAFIOTIS D K; SALEMME F R;
                          SOLL R M
US 5950192  A  US 5577239  A         96-151525/15 (PSII-) PSI INT INC/MOORE
                          J; BRAZIL J; HOOVER J R
WO 9606391  A  US 4811217  A  X      86-259665/40 (NIAS-) JAPAN ASSOC INT
                          CHE/TOKIZANE S; CHIHARA H
WO 9606391  A  US 4855931  A  A      90-014420/02 (UYYA ) UNIV YALE/
                          SAUNDERS M
WO 9606391  A  US 5025388  A  A      91-200832/27 (CRAM/) CRAMER R D/CRAMER
                          R D; WOLD S V
WO 9606391  A  US 5056035  A  A      91-317888/43 (FUJF ) FUJI PHOTO FILM
                          CO LTD/FUJITA S
WO 9606391  A3 US 4811217  A  X      86-259665/40 (NIAS-) JAPAN ASSOC INT
                          CHE/TOKIZANE S; CHIHARA H
WO 9606391  A3 US 4855931  A  A      90-014420/02 (UYYA ) UNIV YALE/
                          SAUNDERS M
WO 9606391  A3 US 5025388  A  A      91-200832/27 (CRAM/) CRAMER R D/CRAMER
                          R D; WOLD S V
WO 9606391  A3 US 5056035  A  A      91-317888/43 (FUJF ) FUJI PHOTO FILM
                          CO LTD/FUJITA S
 
                       CITED LITERATURE REFERENCES 
 
Family Member   Cat    Citation
By Inventor:
WO 9606391  A2         Oracle Relational Database Management System by
                       Oracle Corporation, World Headquarters, 500 Oracle
                       Pkwy., Redwood Shores, CA 94065
WO 9606391  A2         Molecular Access System (MACCS) created by Molecular
                       Design Ltd., MDL Information Systems, 14600 Catalina
                       Street, San Leandro, CA 94577
WO 9606391  A2         Integrated Scientific Information System (ISIS)
                       created by Molecular Design Ltd., MDL Information
                       Systems, 14600 Catalina Street, San Leandro, CA
                       94577
EP 777882   A          JOURNAL OF CHEMOMETRICS, ENGLAND GB, XP002040815
EP 777882   A          See also references of WO 9606391A3
US 5577239  A          Viking Instruments Corp. (Hewlett Packard); Spectra
                       Trak Transportable GC/MS System; (brochure), No
                       date.
US 5577239  A          Chemical Structures, The International Language of
                       Chemistry; Wendy A. Warr (Ed.); "Interfacing
                       DARC-Oracle" AJCM (Juus) de Jong (1988).
US 5577239  A          J. Chem. Inf. Comput. Sci. (1983), vol. 23, No. 3
                       pp. 102-108; DARC Substructure Search System; A New
                       Approach to Chemical Information; Roger Attias.
US 5577239  A          J. Chem. Inf. Comput. Sci. (1987), vol. 27, No. 2;
                       pp. 74-82; DARC System; Notions of Defined and
                       Generic Substructures. Filiation and Coding of FREL
                       Substructure (SS) Classes; Jacques-Emile Dubois et
                       al.
US 5577239  A          J. Chem. Inf. Comput. Sci. (1990), vol. 30, No. 2;
                       pp. 191-199, Substructure Search Systems, 1,
                       Performance Comparison of the MACCS, DARC, HTSS, CAS
                       Registry MVSSS, and S4 Substructure Search Systems;
                       Martin G. Hicks.
US 5577239  A          J. Chem. Inf. Comput. Sci. (1988), vol. 28, No. 4;
                       pp. 221-226; An Efficient Graph Approach to Matching
                       Chemical Structures, O. Owolabi.
US 5577239  A          J. Chem. Inf. Comput. Sci. (1990), vol. 30, No. 4;
                       pp. 332-339; Reactions in the Beilstein Information
                       System: Nonaporic Organic Synthesis; Martin G.
                       Hicks.
US 5577239  A          Analytica Chimica Acta, 235 (1990), pp. 87-92;
                       Substructure Search Systems for Large Chemical Data
                       Bases; Martin G. Hicks et al.
US 5577239  A          J. Chem. Inf. Comput. Sci. (1991), vol. 31, No. 2;
                       pp. 320-326; The Beilstein Structure Registry
                       System, 1, General Design; Laszlo Domokos.
US 5577239  A          J. Chem. Inf. Comput. Sci. (1989), vol. 29, No. 4;
                       pp. 255-260; 3DSearch; A System for
                       Three-Dimensional Substructure Searching; Robert P.
                       Sheridan, et al.
US 5577239  A          Substructure Searches of Chemical Structure Files;
                       (Jan. 23, 1973); Strategic Considerations in the
                       Design of a Screening System for Substructure
                       Searches of Chemical Structure Files; George W.
                       Adamson, et al.
US 5577239  A          Chemical Structure Searching; (Jan. 21, 1975); An
                       Efficient Design for Chemical Structure Searching,
                       I, The Screens; Alfred Feldman et al.
US 5577239  A          J. Chem. Inf. Comput. Sci. (1982), vol. 22, No. 4;
                       The Third BASIC Fragment Search Dictionary; W. Graf,
                       H. K. Kaindl, et al.
US 5577239  A          J. Chem. Inf. Comput. Sci. (1983), vol. 23, No. 3;
                       The CAS ONLINE Search System, 1, General System
                       Design and Selection, Generation, and Use of Search
                       Screens; P. G. Dittmar, et al.
US 5577239  A          Computer Chemical, (1991), vol. 15, No. 2; pp.
                       103-107; A Central Atom Based Algorithm and Computer
                       Program for Substructure Search; Alf Dengler and
                       Ivar Ugi.
US 5577239  A          J. Chem. Inf. Comput. Sci. (1993), vol. 33, No. 4;
                       pp. 545-547; Structure Searching in Chemical
                       Databases by Direct Lookup Methods; Bradley D.
                       Christie et al.
US 5577239  A          J. Chem. Inf. Comput. Sci. (1993); vol. 33, No. 4;
                       pp. 539-541; Substructure Searching on Very Large
                       Files by Using Multiple Storage Techniques;
                       Alexander Bartmann et al.
US 5950192  A          Viking Instruments Corp. (Hewlett Packard);
                       SpectraTrak Transportable GC/MS Systems;
                       (brochure)-No Date.
US 5950192  A          Chemical Structures, The International Language of
                       Chemistry; Wendy A. Warr (Ed.); "Interfacing
                       DARC-Oracle" AJCM (Juus) de Jong (1988).
US 5950192  A          J. Chem. Inf. Comput. Sci. (1983), vol. 23, No. 3;
                       pp. 102-108; DARC Substructure Search System: A New
                       Approach to Chemical Information; Roger Attias.
US 5950192  A          J. Chem. Inf. Comput. Sci. (1987), vol. 27, No. 2;
                       pp. 74-82; DARC System: Notions of Defined and
                       Generic Substructures. Filiation and Coding of FREL
                       Substructure (SS) Classes; Jacques-Emile Dubois et
                       al.
US 5950192  A          J. Chem. Inf. Comput. Sci. (1990), vol. 30, No. 2;
                       pp. 191-199, Substructure Search Systems. 1.
                       Performance Comparison of the MACCS, DARC, HTSS, CAS
                       Registry MVSSS, and S4 Substructure Search Systems;
                       Martin G. Hicks & Clemens.
US 5950192  A          J. Chem. Inf. Comput. Sci. (1988), vol. 28, No. 4;
                       pp. 221-226; An Efficient Graph Approach to Matching
                       Chemical Structures, O. Owolabi.
US 5950192  A          J. Chem. Inf. Comput. Sci. (1990), vol. 30, No. 4;
                       pp. 332-339; Reactions in the Beilstein Information
                       System: Nonaporic Organic Synthesis; Martin G.
                       Hicks.
US 5950192  A          Analytica Chimica Acta, 235 (1990), pp. 87-92;
                       Substructure Search Systems for Large Chemical Data
                       Bases; Martin G. Hicks et al.
US 5950192  A          J. Chem. Inf. Comput. Sci. (1991), vol. 31, No. 2;
                       pp. 320-326; The Beilstein Structure Registry
                       System. 1. General Design; Laszlo Domokos.
US 5950192  A          J. Chem. Inf. Comput. Sci. (1989), vol. 29, No. 4;
                       pp. 255-260; 3DSearch; A System for
                       Three-Dimensional Substructure Searching; Robert P.
                       Sheridan, et al.
US 5950192  A          Substructure Searches of Chemical Structure Files;
                       (Jan. 23, 1973); Strategic Considerations in the
                       Design of a Screening System for Substructure
                       Searches of Chemical Structure Files; George W.
                       Adamson, et al.
US 5950192  A          Chemical Structure Searching; (Jan. 21, 1975); An
                       Efficient Design for Chemical Structure Searching.
                       I. The Screens; Alfred Feldman et al.
US 5950192  A          J. Chem. Inf. Comput. Sci. (1982), vol. 22, No. 4; The
                       Third BASIC Fragment Search Dictionary; W. Graf, H.
                       K. Kaindl, et al.
US 5950192  A          J. Chem. Inf. Comput. Sci. (1983), vol. 23, No. 3;
                       The CAS Online Search System. 1. General System
                       Design and Selection, Generation, and Use of Search
                       Screens; P. G. Dittmar, et al.
US 5950192  A          Computer Chemical, (1991), vol. 15, No. 2, pp.
                       103-107; A Central Atom Based Algorithm and Computer
                       Program for Substructure Search; Alf Dengler and
                       Ivar Ugi.
US 5950192  A          J. Chem. Inf. Comput. Sci. (1993), vol. 33, No. 4;
                       pp. 545-547; Structure Searching in Chemical
                       Databases by Direct Lookup Methods; Bradley D.
                       Christie et al.
WO 9606391  A          See also references of EP 0777882A4
 
 
  2/2/2 
DIALOG(R)File 342:Derwent Patents Citation Indx
(c) 2001 Derwent Info Ltd. All rts. reserv.
 
00376357  WPI Acc No: 87-056518/08
 Markush structure database system which can handle Markush queries - in 
 which separate specific atom and generic term connection tables are linked 
 to reference data e.g. patent numbers 
Patent Assignee: (AMCH-) AMER CHEMICAL SOC
Author (Inventor): FISANICK W
Patent Family:
  Patent No   Kind Date           Examiner Field of Search
  US 4642762     A  870210 (BASIC)
Derwent Week (Basic): 8708
Priority Data: US 614219 (840525)
Applications:  US 614219 (840525)
Derwent Class: J04; T01
Int Pat Class: G06F-015/40
Number of Patents: 001
Number of Countries: 001
Number of Cited Patents: 001
Number of Cited Literature References: 000
Number of Citing Patents: 026
 
                              CITED PATENTS 
 
Family Member  Cited Patent  Cat     WPI Acc No   Assignee/Inventor
 
By Examiner:
US 4642762  A  US 4473890  A         83-790076/42 (NIIN-) JAPAN INFORM CENT
                          /ARAKI K
?
 
 
 S PN=US 5917146

      S3       1  PN=US 5917146
?
 
 
 TYPE 3/2/1


  3/2/1 
DIALOG(R)File 342:Derwent Patents Citation Indx
(c) 2001 Derwent Info Ltd. All rts. reserv.
 
03378757  WPI Acc No: 99-059790/05
 Low-smoke pyrotechnic compositions - containing dihydrazino-s-tetrazine or 
 its derivatives or salts, an oxidizing agent and a colourant, useful for 
 firework displays and special effects in the film industry 
Patent Assignee: (REGC ) UNIV CALIFORNIA; (HISK/) HISKEY M A; (CHAV/)
    CHAVEZ D E
Author (Inventor): HISKEY M A; CHAVEZ D E
Patent Family:
  Patent No   Kind Date           Examiner Field of Search
  WO 9854113     A1 981203 (BASIC)
  AU 9877092     A  981230
  US 5917146     A  990629 149/36; 149/46; 149/61; 149/76
Derwent Week (Basic): 9905
Priority Data: US 865412 (970529)
Applications:  US 865412 (970529); AU 9877092 (980529); WO 98US11062 (
    980529)
Designated States
   (National): AL; AM; AT; AU; AZ; BA; BB; BG; BR; BY; CA; CH; CN; CU; CZ;
     DE; DK; EE; ES; FI; GB; GE; GH; GM; GW; HU; ID; IL; IS; JP; KE; KG; KP
     ; KR; KZ; LC; LK; LR; LS; LT; LU; LV; MD; MG; MK; MN; MW; MX; NO; NZ;
     PL; PT; RO; RU; SD; SE; SG; SI; SK; SL; TJ; TM; TR; TT; UA; UG; UZ; VN
     ; YU; ZW
   (Regional): AT; BE; CH; CY; DE; DK; EA; ES; FI; FR; GB; GH; GM; GR; IE;
     IT; KE; LS; LU; MC; MW; NL; OA; PT; SD; SE; SZ; UG; ZW
Derwent Class: E13; K04
Int Pat Class: C06B-031/02
Number of Patents: 003
Number of Countries: 082
Number of Cited Patents: 016
Number of Cited Literature References: 001
Number of Citing Patents: 001
 
                              CITED PATENTS 
 
Family Member  Cited Patent  Cat     WPI Acc No   Assignee/Inventor
 
By Examiner:
US 5917146  A  US 3244702  A                      /MARCUS
US 5917146  A  US 3697339  A         71-68351S/43 (MESR )
                          MESSERSCHMITT-BOELKOW-BLOHM GMBH
US 5917146  A  US 3797238  A         74-24569V/13 (UNAC ) UNITED AIRCRAFT
                          CORP
US 5917146  A  US 3940298  A         76-18661X/10 (USNA ) US SEC OF NAVY
US 5917146  A  US 4078954  A         77-06100Y/04 (POUE ) SOC NAT POUDRES &
                          EXPLOSIFS
US 5917146  A  US 5197758  A         93-125801/15 (MORN ) MORTON INT INC/
                          LUND G K; STEVENS M R; EDWARDS W W; SHAW G C
US 5917146  A  US 5198046  A         92-115569/15 (FRAU ) FRAUNHOFER-GES
                          FORD ANGE/BUCERIUS K M; WASMANN F W; MENKE K
US 5917146  A  US 5281706  A         94-042864/05 (USAT ) US DEPT ENERGY/
                          OTT D G; COBURN M D
US 5917146  A  US 5449423  A         94-151167/18 (CIOF/) CIOFFE A/CIOFFE A
US 5917146  A  US 5472534  A         96-029697/03 (THIO ) THIOKOL CORP/
                          WARDLE R B; BLAU R J
US 5917146  A  US 5525166  A         94-237199/29 (STFI-) STANDARD
                          FIREWORKS LTD/COOK B
WO 9854113  A  US 3797238  A  A      74-24569V/13 (UNAC ) UNITED AIRCRAFT
                          CORP
WO 9854113  A  US 3940298  A  A      76-18661X/10 (USNA ) US SEC OF NAVY
WO 9854113  A  US 4078954  A  A      77-06100Y/04 (POUE ) SOC NAT POUDRES &
                          EXPLOSIFS
WO 9854113  A  US 5449423  A  A      94-151167/18 (CIOF/) CIOFFE A/CIOFFE A
WO 9854113  A  US 5525166  A  A      94-237199/29 (STFI-) STANDARD
                          FIREWORKS LTD/COOK B
 
                       CITED LITERATURE REFERENCES 
 
Family Member   Cat    Citation
By Examiner:
US 5917146  A          Marcus et al., Journal of Org. Chem., vol. 28, pp.
                       2372-2378, Sep. 1963.
?
 
 
 S FLUOXETINE

      S4      51  FLUOXETINE
?
 
 
 TYPE 4/TI/1-5


  4/TI/1 
DIALOG(R)File 342:(c) 2001 Derwent Info Ltd. All rts. reserv.
 
 Crystallizing (R)-fluoxetine hydrochloride from a mixed solvent in high 
 enantiomeric purity with high recovery rate ... 
 
 
  4/TI/2 
DIALOG(R)File 342:(c) 2001 Derwent Info Ltd. All rts. reserv.
 
 Preparation of (S)-3-phenyl-3-phenoxy propylamine derivatives e.g. 
 (S)-fluoxetine, comprises converting 3-substituted-1-phenyl propene 
 derivative to racemic epoxide or alcohol, resolving and further reacting 
 pure enantiomer... 
 
 
  4/TI/3 
DIALOG(R)File 342:(c) 2001 Derwent Info Ltd. All rts. reserv.
 
 Production of fluoxetine, useful as antidepressant, from 
 3-halo-1-phenylpropan-1-ol without use of protecting groups... 
 
 
  4/TI/4 
DIALOG(R)File 342:(c) 2001 Derwent Info Ltd. All rts. reserv.
 
 Composition comprising R(-)-fluoxetine, mostly free of the S(+)-isomer, and 
 9-hydroxy-risperidone, for the treatment of psychotic or psychiatric 
 disorders in children, e.g. anxiety disorder, depression, Tourette's 
 disorder... 
 
 
  4/TI/5 
DIALOG(R)File 342:(c) 2001 Derwent Info Ltd. All rts. reserv.
 
 N,N-dimethyl-(3-phenyl-3-((4-trifluoromethyl)-phenoxy)-propylamine)-
 p-toluene sulfonate is a new intermediate for the synthesis of fluoxetine... 
?
 
 
 S S4 AND LILLY

              51  S4
               0  LILLY
      S5       0  S4 AND LILLY
?
 
 
 S S4 AND PA=LILLY

              51  S4
            3949  PA=LILLY
      S6      18  S4 AND PA=LILLY
?
 
 
 TYPE 6/2/1-5


  6/2/1 
DIALOG(R)File 342:Derwent Patents Citation Indx
(c) 2001 Derwent Info Ltd. All rts. reserv.
 
04371800  WPI Acc No: 01-031829/04
 Crystallizing (R)-fluoxetine hydrochloride from a mixed solvent in high 
 enantiomeric purity with high recovery rate  - 
Patent Assignee: (ELIL ) LILLY & CO ELI
Author (Inventor): BRENNAN J; DISEROAD W D; HAY L A; MITCHELL D
Patent Family:
  Patent No   Kind Date           Examiner Field of Search
  WO 200068182   A1 001116 (BASIC)
  AU 200047984   A  001121
Derwent Week (Basic): 0104
Priority Data: US 133264 (990510)
Applications:  AU 200047984 (000428); WO 2000US9801 (000428)
Designated States
   (National): AE; AG; AL; AM; AT; AU; AZ; BA; BB; BG; BR; BY; CA; CH; CN;
     CR; CU; CZ; DE; DK; DM; DZ; EE; ES; FI; GB; GD; GE; GH; GM; HR; HU; ID
     ; IL; IN; IS; JP; KE; KG; KP; KR; KZ; LC; LK; LR; LS; LT; LU; LV; MA;
     MD; MG; MK; MN; MW; MX; NO; NZ; PL; PT; RO; RU; SD; SE; SG; SI; SK; SL
     ; TJ; TM; TR; TT; TZ; UA; UG; US; UZ; VN; YU; ZA; ZW
   (Regional): AT; BE; CH; CY; DE; DK; EA; ES; FI; FR; GB; GH; GM; GR; IE;
     IT; KE; LS; LU; MC; MW; NL; OA; PT; SD; SE; SL; SZ; TZ; UG; ZW
Derwent Class: B05
Int Pat Class: C07C-213/10
Number of Patents: 002
Number of Countries: 092
Number of Cited Patents: 001
Number of Cited Literature References: 000
Number of Citing Patents: 000
 
                              CITED PATENTS 
 
Family Member  Cited Patent  Cat     WPI Acc No   Assignee/Inventor
 
By Examiner:
WO 200068182A  EP 545425   A  A      93-184173/23 (HMRI ) HOECHST ROUSSEL
                          PHARM INC/EFFLAND R C; KLEIN J T
 
 
  6/2/2 
DIALOG(R)File 342:Derwent Patents Citation Indx
(c) 2001 Derwent Info Ltd. All rts. reserv.
 
04208425  WPI Acc No: 01-015842/02
 Process for preparing racemic fluoxetine, used for treatment of depression, 
 from mixture enriched in either enantiomer comprises reacting the mixture 
 with base in aprotic highly dipolar solvent  - 
Patent Assignee: (ELIL ) LILLY & CO ELI
Author (Inventor): KOENIG T M; MITCHELL D
Patent Family:
  Patent No   Kind Date           Examiner Field of Search
  WO 200064855   A1 001102 (BASIC)
  AU 200040100   A  001110
Derwent Week (Basic): 0102
Priority Data: US 131074 (990426)
Applications:  AU 200040100 (000328); WO 2000US6683 (000328)
Designated States
   (National): AE; AG; AL; AM; AT; AU; AZ; BA; BB; BG; BR; BY; CA; CH; CN;
     CR; CU; CZ; DE; DK; DM; DZ; EE; ES; FI; GB; GD; GE; GH; GM; HR; HU; ID
     ; IL; IN; IS; JP; KE; KG; KP; KR; KZ; LC; LK; LR; LS; LT; LU; LV; MA;
     MD; MG; MK; MN; MW; MX; NO; NZ; PL; PT; RO; RU; SD; SE; SG; SI; SK; SL
     ; TJ; TM; TR; TT; TZ; UA; UG; US; UZ; VN; YU; ZA; ZW
   (Regional): AT; BE; CH; CY; DE; DK; EA; ES; FI; FR; GB; GH; GM; GR; IE;
     IT; KE; LS; LU; MC; MW; NL; OA; PT; SD; SE; SL; SZ; TZ; UG; ZW
Derwent Class: B05
Int Pat Class: C07B-055/00
Number of Patents: 002
Number of Countries: 092
Number of Cited Patents: 001
Number of Cited Literature References: 000
Number of Citing Patents: 000
 
                              CITED PATENTS 
 
Family Member  Cited Patent  Cat     WPI Acc No   Assignee/Inventor
 
By Examiner:
WO 200064855A  US 5847214  A  A      99-059172/05 (LAPO-) LAPORTE ORGANICS
                          SPA FRANCIS/ROSSETTI V; BERATTO S G V; AROSIO R
 
 
  6/2/3 
DIALOG(R)File 342:Derwent Patents Citation Indx
(c) 2001 Derwent Info Ltd. All rts. reserv.
 
03268666  WPI Acc No: 98-586218/50
 Fluoxetine enteric pellet - comprising fluoxetine and excipients, optional 
 separating layer, enteric layer of hydroxypropyl methylcellulose acetate 
 succinate and excipients and optional finishing layer 
Patent Assignee: (ELIL ) LILLY & CO ELI
Author (Inventor): ANDERSON N R; HARRISON R G; LYNCH D F; OREN P L
Patent Family:
  Patent No   Kind Date           Examiner Field of Search
  GB 2325623     A  981202 (BASIC)  None
  AT 408068      B  010715
  AT 9800931     A  010115
  AU 726690      B  001116
  AU 9869048     A  981203
  BE 1011925     A3 000307
  BR 9801989     A  000208
  CA 2234826     A  981129
  CA 2234826     C  001219
  CN 1200924     A  981209
  CN 1285189     A  010228
  CZ 9801143     A3 981216
  DE 19823940    A1 981203 A61K-009/6; A61K-031/135
  DK 9800728     A  981130
  FI 9800846     A  981130
  FR 2763846     A1 981204
  GB 2325623     B  990414 None
  HU 9800882     A2 000328
  JP 10330253    A  981215
  KR 98086622    A  981205
  MX 9803636     A1 990201
  NL 1009259     C2 981201
  NO 9802197     A  981130
  NZ 330192      A  990828
  PT 102152      A  981231
  RU 2164405     C2 010327
  SE 9801336     A  981130
  SG 72805       A1 000523
  US 5910319     A  990608 424/458; 424/459; 424/461; 424/464; 424/465;
                               424/489; 514/646; 514/962
  US 5985322     A  991116 424/458; 424/459; 424/461; 424/464; 424/465;
                               424/489; 424/490; 424/494; 514/646; 514/962
  ZA 9803173     A  991229
Derwent Week (Basic): 9850
Priority Data: US 867196 (970529)
Applications:  US 867196 (970529); CA 2234826 (980414); RU 98106998 (980414
    ); CZ 981143 (980415); GB 987939 (980415); HU 98882 (980415); NZ 330192
    (980415); ZA 983173 (980415); FI 98846 (980416); SE 981336 (980417); SG
    98871 (980417); KR 9814014 (980420); CN 98108778 (980424); JP 98153501
    (980424); CN 2000122209 (980424); MX 3636 (980507); PT 102152 (980508);
    FR 986040 (980513); NO 982197 (980514); BR 981989 (980519); BE 98383 (
    980520); NL 981009259 (980526); AU 9869048 (980528); DE 19823940 (
    980528); DK 98728 (980528); AT 98931 (980529); US 265610 (990310)
Derwent Class: A96; B05
Int Pat Class: A61K-009/16; A61K-009/20; A61K-009/24; A61K-009/28;
    A61K-009/30; A61K-009/32; A61K-009/36; A61K-009/48; A61K-009/50;
    A61K-009/52; A61K-009/54; A61K-009/60; A61K-009/62; A61K-031/02;
    A61K-031/13; A61K-031/135; A61K-031/138; A61K-031/40; A61K-047/36;
    A61K-047/38; A61P-025/24
Number of Patents: 031
Number of Countries: 025
Number of Cited Patents: 034
Number of Cited Literature References: 039
Number of Citing Patents: 000
 
                              CITED PATENTS 
Family Member  Cited Patent  Cat     WPI Acc No   Assignee/Inventor
 
By Examiner:
GB 2325623  A  EP 687472   A2        96-031591/04 (ELIL ) LILLY & CO ELI/
                          WONG D T; OGUIZA J I
GB 2325623  A  US 4847092  A         87-137609/20 (ELIL ) LILLY & CO ELI/
                          THAKKAR A L; GIBSON L L
GB 2325623  A  US 5508276  A         96-078389/09 (ELIL ) LILLY & CO ELI;
                          (SHIO ) SHIONOGI & CO LTD/ANDERSON N R; OREN P L;
                          OGURA T; FUJII T
GB 2325623  B  EP 687472   A2        96-031591/04 (ELIL ) LILLY & CO ELI/
                          WONG D T; OGUIZA J I
GB 2325623  B  US 4847092  A         87-137609/20 (ELIL ) LILLY & CO ELI/
                          THAKKAR A L; GIBSON L L
GB 2325623  B  US 5508276  A         96-078389/09 (ELIL ) LILLY & CO ELI;
                          (SHIO ) SHIONOGI & CO LTD/ANDERSON N R; OREN P L;
                          OGURA T; FUJII T
US 5910319  A  EP 687472   A2        96-031591/04 (ELIL ) LILLY & CO ELI/
                          WONG D T; OGUIZA J I
US 5910319  A  EP 693281   A2        96-078388/09 (ELIL ) LILLY SA/ARCE
                          MENDIZABAL F
US 5910319  A  US 4314081  A         75-49665W/30 (ELIL ) LILLY & CO ELI
US 5910319  A  US 4444778  A         84-120590/19 (COUG/) COUGHLIN S R/
                          COUGHLIN S R
US 5910319  A  US 4626549  A         86-338922/51 (ELIL ) LILLY & CO ELI/
                          MOLLOY B B; SCHMIEGEL K K
US 5910319  A  US 4847092  A         87-137609/20 (ELIL ) LILLY & CO ELI/
                          THAKKAR A L; GIBSON L L
US 5910319  A  US 5104899  A         92-150223/18 (SEPR-) SEPRACOR INC/
                          YOUNG J W; BARBERICH T J
US 5910319  A  US 5356934  A         91-289974/40 (ELIL ) LILLY & CO ELI/
                          ROBERTSON D W; WONG D T
US 5910319  A  US 5508276  A         96-078389/09 (ELIL ) LILLY & CO ELI;
                          (SHIO ) SHIONOGI & CO LTD/ANDERSON N R; OREN P L;
                          OGURA T; FUJII T
US 5910319  A  WO 9213452  A1        92-299671/36 (YOUN/) YOUNG J W;
                          (BARB/) BARBERICH T J; (TEIC/) TEICHER M H/YOUNG
                          J W; BARBERICH T J; TEICHER M H
US 5910319  A  WO 9219226  A1        92-398501/48 (DYNA-) DYNAGEN INC/
                          KITCHELL J P; MUNI I A; BOYER Y N
US 5910319  A  WO 9318755  A1        93-320425/40 (DEPO-) DEPOMED SYSTEMS
                          INC/SHELL J W
US 5910319  A  WO 9324154  A1        93-405434/50 (FUIS-) FUISZ
                          TECHNOLOGIES LTD/FUISZ R C
US 5910319  A  WO 9512385  A1        95-185578/24 (ISOT-) ISOTECH MEDICAL
                          INC/CHO Y W
US 5985322  A  EP 687472   A2        96-031591/04 (ELIL ) LILLY & CO ELI/
                          WONG D T; OGUIZA J I
US 5985322  A  EP 693281   A2        96-078388/09 (ELIL ) LILLY SA/ARCE
                          MENDIZABAL F
US 5985322  A  US 4314081  A         75-49665W/30 (ELIL ) LILLY & CO ELI
US 5985322  A  US 4444778  A         84-120590/19 (COUG/) COUGHLIN S R/
                          COUGHLIN S R
US 5985322  A  US 4626549  A         86-338922/51 (ELIL ) LILLY & CO ELI/
                          MOLLOY B B; SCHMIEGEL K K
US 5985322  A  US 4847092  A         87-137609/20 (ELIL ) LILLY & CO ELI/
                          THAKKAR A L; GIBSON L L
US 5985322  A  US 5104899  A         92-150223/18 (SEPR-) SEPRACOR INC/
                          YOUNG J W; BARBERICH T J
US 5985322  A  US 5356934  A         91-289974/40 (ELIL ) LILLY & CO ELI/
                          ROBERTSON D W; WONG D T
US 5985322  A  US 5508276  A         96-078389/09 (ELIL ) LILLY & CO ELI;
                          (SHIO ) SHIONOGI & CO LTD/ANDERSON N R; OREN P L;
                          OGURA T; FUJII T
US 5985322  A  WO 9213452  A1        92-299671/36 (YOUN/) YOUNG J W;
                          (BARB/) BARBERICH T J; (TEIC/) TEICHER M H/YOUNG
                          J W; BARBERICH T J; TEICHER M H
US 5985322  A  WO 9219226  A1        92-398501/48 (DYNA-) DYNAGEN INC/
                          KITCHELL J P; MUNI I A; BOYER Y N
US 5985322  A  WO 9318755  A1        93-320425/40 (DEPO-) DEPOMED SYSTEMS
                          INC/SHELL J W
US 5985322  A  WO 9324154  A1        93-405434/50 (FUIS-) FUISZ
                          TECHNOLOGIES LTD/FUISZ R C
US 5985322  A  WO 9512385  A1        95-185578/24 (ISOT-) ISOTECH MEDICAL
                          INC/CHO Y W
 
                       CITED LITERATURE REFERENCES 
 
Family Member   Cat    Citation
By Examiner:
US 5910319  A          Johnson, World Patents Index, #93-352132, 1993.
US 5910319  A          Oguiza et al., World Patents Index, #96-031591,
                       1996.
US 5910319  A          Wong et al., Chemical Abstracts, vol. 124, #156011,
                       1995.
US 5910319  A          Montgomery, et al., Eur. Arch. Psychiatry Clin.
                       Neurosci., 244:211-215 (1994).
US 5910319  A          Burke, et al., Psychopharmacol. Bull. 31(3); 524
                       (1995).
US 5910319  A          Stafford, et al., Drug Development and Industrial
                       Pharmacy, 8(4):513-530 (1982).
US 5910319  A          Osterwald, Hermann P., Pharmaceutical Research,
                       2:14-18 (1985).
US 5910319  A          Davis, et al., Drug Development and Industrial
                       Pharmacy, 12(10):1419-1448 (1986).
US 5910319  A          Bloor, et al., Drug Development and Industrial
                       Pharmacy, 15(15-16):2227-2243.
US 5910319  A          Nagai, et al., Aqueous Polymeric Coating for
                       Pharmaceutical Dosage Forms, Marcel Dekker, N.Y. and
                       Basel, 81-152 (1989).
US 5910319  A          Chang, Rong-Kun, Pharmaceutical Technology,
                       14(10):2-70 (1990).
US 5910319  A          Fujii, et al., Recent Advances On Aqueous Polymeric
                       Coating System and Related Techniques, Proceedings
                       of Pre-World Congress Particle Technology in Gifu,
                       Sep. 17-18, 1990, Gifu, Japan, 80-85 (1990).
US 5910319  A          Delattre, et al., Proceed. Intern. Symp. Control.
                       Rel. Bioact. Mater., 19:267-268 (1992).
US 5910319  A          Schmidt, et al., Drug Development and Industrial
                       Pharmacy, 18(18):1969-1979 (1992).
US 5910319  A          Wyatt "Enhanced Stability of Aqueous Cellulose
                       Acetate Phthalate (CAP) Enteric Films." Presented
                       AAPS Annual Meeting, San Antonio, TX, Nov. 15-19,
                       1992.
US 5910319  A          Takahata, et al., Chemical and Pharmaceutical
                       Bulletin, 41(6):1137-1143 (1993).
US 5910319  A          Obara, et al., Pharmaceutical Research,
                       11(11):1562-1567 (1994).
US 5910319  A          Shin-Etsu Chemical Co., Ltd., "An Improved Aqueous
                       Coating Using Shin-Etsu AQOAT", AQOAT Technical
                       Information Bulletin, 1994.
US 5910319  A          Japan Pharmaceutical Excipients Council,
                       "Hydroxypropylmethylcellulose Acetate Succinate",
                       Japanese Pharmaceutical Excipients 1993 (JPE 1993),
                       183-187, Yakuji Nippo, Ltd., Tokyo, Japan, 1994.
US 5910319  A          Shin-Etsu Chemical Co., Ltd. "Dry Coating", AQOAT
                       Technical Information No. A-3, Sep., 1996.
US 5910319  A          Obara, et al., Poster PT6115, Dry Coating'-A Novel.
US 5985322  A          Montgomery, et al., Eur. Arch. Psychiatry Clin.
                       Neurosci., 244:211-215 (1994).
US 5985322  A          Burke, et al., Psychopharmacol. Bull. 31(3); 524
                       (1995).
US 5985322  A          Stafford, et al., Drug Development and Industrial
                       Pharmacy, 8(4):513-530 (1982).
US 5985322  A          Osterwald, Hermann P., Pharmaceutical Research,
                       2:14-18 (1985).
US 5985322  A          Davis, et al., Drug Development and Industrial
                       Pharmacy, 12(10):1419-1448 (1986).
US 5985322  A          Bloor, et al., Drug Development and Industrial
                       Pharmacy, 15 (15-16) :2227-2243.
US 5985322  A          Nagai, et al., Aqueous Polymeric Coating for
                       Pharmaceutical Dosage Forms, Marcel Dekker, N.Y. and
                       Basel, 81-152 (1989).
US 5985322  A          Chang, Rong-Kun, Pharmaceutical Technology, 14(10)
                       :2-70 (1990).
US 5985322  A          Fujii, et al., Recent Advances On Aqueous Polymeric
                       Coating System and Related Techniques, Proceedings
                       of Pre-World Congress Particle Technology in Gifu,
                       Sep. 17-18, 1990, Gifu,Japan, 80-85 (1990).
US 5985322  A          Delattre, et al., Proceed. Intern. Symp. Control.
                       Rel. Bioact. Mater., 19:267-268 (1992).
US 5985322  A          Schmidt, et al., Drug Development and Industrial
                       Pharmacy, 18(18) :1969-1979 (1992).
US 5985322  A          Wyatt "Enhanced Stability of Aqueous Cellulose
                       Acetate Phthalate (CAP) Enteric Films." Presented
                       AAPS Annual Meeting, San Antonio, TX, Nov. 15-19,
                       1992.
US 5985322  A          Takahata, et al., Chemical and Pharmaceutical
                       Bulletin, 41(6) :1137-1143 (1993).
US 5985322  A          Obara, et al., Pharmaceutical Research, 11(11)
                       :1562-1567 (1994).
US 5985322  A          Shin-Etsu Chemical Co., Ltd., "An Improved Aqueous
                       Coating using Shin-Etsu AQOAT", AQOAT Technical
                       Information Bulletin, 1994.
US 5985322  A          Japan Pharmaceutical Excipients Council,
                       "Hydroxypropylmethylcellulose Acetate Succinate",
                       Japanese Pharmaceutical Excipients 1993 (JPE 1993),
                       183-187, Yakuji Nippo, Ltd., Tokyo, Japan, 1994.
US 5985322  A          Shin-Etsu Chemical Co., Ltd. "Dry Coating", AQOAT
                       Technical Information No. A-3, Sep., 1996.
US 5985322  A          Obara, et al., Poster PT6115, Dry Coating-A Novel.
 
 
  6/2/4 
DIALOG(R)File 342:Derwent Patents Citation Indx
(c) 2001 Derwent Info Ltd. All rts. reserv.
 
03015930  WPI Acc No: 98-065080/07
 Potentiation of anti-depressant drugs e.g. fluoxetine, using 
 N-(2-(4-(2-methoxyphenyl)piperazin-1-yl))-N-(2-pyridyl)cyclohexanecarboxamide 
 - increases serotonin levels in patients beyond levels normally achieved 
 with drug alone 
Patent Assignee: (ELIL ) LILLY SA
Author (Inventor): ARTIGAS PEREZ F
Patent Family:
  Patent No   Kind Date           Examiner Field of Search
  EP 818198      A1 980114 (BASIC)
Derwent Week (Basic): 9807
Priority Data: EP 96500097 (960709)
Applications:  EP 96500097 (960709)
Designated States
   (Regional): ES
Derwent Class: B03; B05
Int Pat Class: A61K-031/505; A61K-031/135
Number of Patents: 001
Number of Countries: 001
Number of Cited Patents: 003
Number of Cited Literature References: 009
Number of Citing Patents: 000
 
                              CITED PATENTS 
 
Family Member  Cited Patent  Cat     WPI Acc No   Assignee/Inventor
 
By Examiner:
EP 818198   A  EP 687472   A  Y      96-031591/04 (ELIL ) LILLY & CO ELI/
                          WONG D T; OGUIZA J I
EP 818198   A  EP 714663   A  Y      96-261402/27 (ELIL ) LILLY & CO ELI/
                          OGUIZA J I; WONG D T
EP 818198   A  US 4940585  A  A      90-231154/30 (HAPW/) HAPWORTH W E/
                          HAPWORTH W E; HAPWORTH M S
 
                       CITED LITERATURE REFERENCES 
 
Family Member   Cat    Citation
By Examiner:
EP 818198   A          BR. J. PHARMACOL., vol. 115, no. 6, 1995, pages
                       1064-1070, XP000604130 GARTSIDE ET AL: "INTERACTION
                       BETWEEN A SELECTIVE 5-HT1A RECEPTOR ANTAGONIST AND
                       AN SSRI IN VIVO: EFFECTS ON 5-HT CELL FIRING AND
                       EXTRACELLULAR 5-HT"
EP 818198   A          ARCHIVES OF GENERAL PSYCHIATRY, vol. 51, no. 3,
                       1994, pages 248-251, XP000605678 ARTIGAS ET AL:
                       "PINDOLOL INDUCES A RAPID IMPROVEMENT OF DEPRESSED
                       PATIENTS TREATED WITH SEROTONIN REUPTAKE INHIBITORS"
EP 818198   A          J. NEUROCHEM., vol. 66, no. 2, 1996, pages 599-603,
                       XP000604121 ENGLEMAN ET AL: "ANTAGONISM OF SEROTONIN
                       5-HT1A RECEPTORS POTENTIATES THE INCREASES IN
                       EXTRACELLULAR MONOAMINES INDUCED BY DULOXETINE IN
                       RAT HYPOTHALAMUS"
EP 818198   A          Newman-Tancredi, A. et al.: NEUROPHARMACOLOGY, vol.
                       36, no. 4/5, 1997, pp. 451-459
EP 818198   A          Newman-Tancredi, A. et al.: NEUROPSYCHOPHARMACOLOGY,
                       vol. 18, no. 5, 1998, pp. 395-398
EP 818198   A          Romero, L. et al.: Strategies to optimize the
                       antidepressant action of selective serotonin
                       reuptake inhibitors in ANTIDEPRESSANTS: NEW
                       PHARMACOLOGICAL STRATEGIES, 1996, Ed. P. Skolnick, pp.
                       1-33, Humana Press, Totowa, NJ, USA
EP 818198   A          Romero, L. et al.: NEUROPSYCHOPHARMACOLOGY, vol. 15,
                       no. 4, 1996, pp. 349-360
EP 818198   A          Romero, L. et al.: JOURNAL OF NEUROCHEMISTRY, vol.
                       68, 1997, pp. 2593-2603
EP 818198   A          Romero, L. et al.: NEUROSCIENCE LETTERS, vol. 219,
                       1996, pp. 123-126
 
 
  6/2/5 
DIALOG(R)File 342:Derwent Patents Citation Indx
(c) 2001 Derwent Info Ltd. All rts. reserv.
 
02847079  WPI Acc No: 97-427083/40
 Treatment of sleep disorders with two of specified compounds - one a 
 serotonin uptake inhibitor and the other a serotonin receptor 1A 
 antagonist, e.g. fluoxetine or duloxetine, and pindolol 
Patent Assignee: (ELIL ) LILLY & CO ELI
Author (Inventor): JAMES S P
Patent Family:
  Patent No   Kind Date           Examiner Field of Search
  EP 792649      A1 970903 (BASIC)
  AU 9720586     A  970916
  WO 9731629     A1 970904
Derwent Week (Basic): 9740
Priority Data: US 12523 (960229)
Applications:  EP 97301057 (970219); AU 9720586 (970227); WO 97US3068 (
    970227)
Designated States
   (National): AL; AM; AU; AZ; BA; BB; BG; BR; BY; CA; CN; CU; CZ; EE; GE;
     GH; HU; IL; IS; JP; KE; KG; KP; KR; KZ; LC; LK; LR; LS; LV; MD; MG; MK
     ; MN; MW; MX; NO; NZ; PL; RO; RU; SD; SG; SI; SK; TJ; TM; TR; TT; UA;
     UG; US; UZ; YU
   (Regional): AT; BE; CH; DE; DK; EA; ES; FI; FR; GB; GH; GR; IE; IT; KE;
     LI; LS; LU; MW; NL; OA; PT; RO; SD; SE; SZ; UG
Derwent Class: B02; B05
Int Pat Class: A61K-031/05
Number of Patents: 003
Number of Countries: 074
Number of Cited Patents: 004
Number of Cited Literature References: 002
Number of Citing Patents: 001
 
                              CITED PATENTS 
 
Family Member  Cited Patent  Cat     WPI Acc No   Assignee/Inventor
 
By Examiner:
EP 792649   A  EP 687472   A  X      96-031591/04 (ELIL ) LILLY & CO ELI/
                          WONG D T; OGUIZA J I
EP 792649   A  US 4584404  A  Y      86-125093/19 (ELIL ) LILLY & CO ELI/
                          MOLLOY B B; SCHMIEGEL K K
EP 792649   A  US 5250571  A  Y      91-261604/36 (ELIL ) LILLY & CO ELI/
                          FULLER R W; MITCHELL D; ROBERTSON D W; STEPHENSON
                          G A; WONG D T
EP 792649   A  US 5356934  A  Y      91-289974/40 (ELIL ) LILLY & CO ELI/
                          ROBERTSON D W; WONG D T
 
                       CITED LITERATURE REFERENCES 
 
Family Member   Cat    Citation
By Examiner:
EP 792649   A          ARCHIVES OF GENERAL PSYCHIATRY, vol. 51, no. 3,
                       March 1994, pages 248-251, XP000605678 "PINDOLOL
                       INDUCES A RAPID IMPROVEMENT OF DEPRESSED PATIENTS
                       TREATED WITH SEROTONIN REUPTAKE INHIBITORS"
WO 9731629  A          DATABASE CA ON STN, No. 107:1364, HILAKIVI I. et
                       al., "Effects of Serotonin and Noradrenaline Uptake
                       Blockers on Wakefulness and Sleep in Cats"; &
                       PHARMACOL. TOXICOLOGY, 60(3), 1987.
?
 
 
 TYPE /2/15-18


  6/2/15 
DIALOG(R)File 342:Derwent Patents Citation Indx
(c) 2001 Derwent Info Ltd. All rts. reserv.
 
00541572  WPI Acc No: 88-347698/49
 Antidiabetic fluoxetine compsn. - without causing major body weight loss 
Patent Assignee: (ELIL ) LILLY & CO ELI
Author (Inventor): WONG D T
Patent Family:
  Patent No   Kind Date           Examiner Field of Search
  EP 294028   A  881207 (BASIC)
  AU 8815533  A  881110
  DE 3883606  G  931007
  DK 8802385  A  881105
  EP 294028   B1 930901
  ES 2058272  T3 941101
  JP 2721507  B2 980304 0 None
  JP 63284126 A  881121
  ZA 8803115  A  900131
Derwent Week (Basic): 8849
Priority Data: US 45509 (870504)
Applications:  DE 3883606 (880429); EP 88303930 (880429); JP 88109838 (
    880502)
Designated States
   (Regional): AT; BE; CH; DE; ES; FR; GB; GR; IT; LI; LU; NL; SE
Derwent Class: B05
Int Pat Class: A61K-031/13; A61K-031/135
Number of Patents: 009
Number of Countries: 017
Number of Cited Patents: 003
Number of Cited Literature References: 000
Number of Citing Patents: 004
 
                              CITED PATENTS 
 
Family Member  Cited Patent  Cat     WPI Acc No   Assignee/Inventor
 
By Examiner:
EP 294028   A  US 4626549  A         86-338922/51 (ELIL ) LILLY & CO ELI/
                          MOLLOY B B; SCHMIEGEL K K
EP 294028   B1 US 4626549  A         86-338922/51 (ELIL ) LILLY & CO ELI/
                          MOLLOY B B; SCHMIEGEL K K
JP 2721507  B2 US 4626549  A         86-338922/51 (ELIL ) LILLY & CO ELI/
                          MOLLOY B B; SCHMIEGEL K K
 
 
  6/2/16 
DIALOG(R)File 342:Derwent Patents Citation Indx
(c) 2001 Derwent Info Ltd. All rts. reserv.
 
00334893  WPI Acc No: 86-233903/36
 Analgesic compsn. contg. codeine and fluoxetine or norfluoxetine - and opt. 
 aspirin or acetaminophen 
Patent Assignee: (ELIL ) LILLY & CO ELI
Author (Inventor): HYNES M D
Patent Family:
  Patent No   Kind Date           Examiner Field of Search
  EP 193355      A  860903 (BASIC)
  AU 8654001     A  860828
  CA 1267092     A  900327
  DE 3684626     G  920507
  EP 193355      B  920401
  JP 61200911    A  860905
  JP 95045405    B2 950517
  US 4683235     A  870728
  ZA 8601211     A  870818
Derwent Week (Basic): 8636
Priority Data: US 705176 (850225); US 889157 (860725)
Applications:  ZA 861211 (860218); EP 86301207 (860220); JP 8640123 (860224
    ); US 889157 (860725)
Designated States
   (Regional): BE; CH; DE; FR; GB; IT; LI; LU; NL; SE
Derwent Class: B05
Int Pat Class: A61K-031/135; A61K-031/485; A61K-031/61; A61K-031/615
Number of Patents: 009
Number of Countries: 015
Number of Cited Patents: 010
Number of Cited Literature References: 005
Number of Citing Patents: 005
 
                              CITED PATENTS 
 
Family Member  Cited Patent  Cat     WPI Acc No   Assignee/Inventor
 
By Examiner:
EP 193355   A  FR 1373     A         66-05230F/00 (SOIF ) SOC IND
                          FABRICATION ANTIBIOTIQUES
EP 193355   A  US 4035511  A         77-52098Y/29 (MASI ) MASSACHUSETTS
                          INST TECHNOLOGY
EP 193355   A  US 4083982  A         78-40255A/22 (MASI ) MASSACHUSETTS
                          INST TECHNOLOGY/MESSING R B; LYTLE L D
EP 193355   B  FR 1373     A         66-05230F/00 (SOIF ) SOC IND
                          FABRICATION ANTIBIOTIQUES
EP 193355   B  US 4035511  A         77-52098Y/29 (MASI ) MASSACHUSETTS
                          INST TECHNOLOGY
EP 193355   B  US 4083982  A         78-40255A/22 (MASI ) MASSACHUSETTS
                          INST TECHNOLOGY/MESSING R B; LYTLE L D
US 4683235  A  US 4035511  A         77-52098Y/29 (MASI ) MASSACHUSETTS
                          INST TECHNOLOGY
US 4683235  A  US 4083982  A         78-40255A/22 (MASI ) MASSACHUSETTS
                          INST TECHNOLOGY/MESSING R B; LYTLE L D
US 4683235  A  US 4313896  A         82-13654E/07 (ELIL ) LILLY & CO ELI/
                          MOLLOY B B; SCHMIEGEL K K
US 4683235  A  US 4314081  A         75-49665W/30 (ELIL ) LILLY & CO ELI
 
                       CITED LITERATURE REFERENCES 
 
Family Member   Cat    Citation
By Examiner:
US 4683235  A          Chem. Abst. 92 (1980) 191195z.
US 4683235  A          Hynes et al., Drug Development Research, 2, 33
                       (1982).
US 4683235  A          Messing et al., Psychopharmacology Communications,
                       1(5), 511 (1975).
US 4683235  A          Larson et al., Life Sciences, 21, 1807 (1977).
US 4683235  A          Sugrue et al., J. Pharm. Pharmac., 28, 447 (1976).
 
 
  6/2/17 
DIALOG(R)File 342:Derwent Patents Citation Indx
(c) 2001 Derwent Info Ltd. All rts. reserv.
 
00319969  WPI Acc No: 86-169186/26
 Potentiation of dextropropoxyphene analgesia - using fluoxetine or 
 nor-fluoxetine 
Patent Assignee: (ELIL ) LILLY & CO ELI
Author (Inventor): HYNES M D
Patent Family:
  Patent No   Kind Date           Examiner Field of Search
  US 4594358     A  860610 (BASIC)
  AU 8653900     A  860828
  CA 1268120     A  900424
  EP 193354      A  860903
  JP 61200912    A  860905
  ZA 8601212     A  870818
Derwent Week (Basic): 8626
Priority Data: US 705177 (850225)
Applications:  US 705177 (850225); ZA 861212 (860218); EP 86301206 (860220
    ); JP 8640124 (860224)
Designated States
   (Regional): BE; CH; DE; FR; GB; IT; LI; LU; NL; SE
Derwent Class: B05
Int Pat Class: A61K-031/13
Number of Patents: 006
Number of Countries: 015
Number of Cited Patents: 007
Number of Cited Literature References: 005
Number of Citing Patents: 011
 
                              CITED PATENTS 
 
Family Member  Cited Patent  Cat     WPI Acc No   Assignee/Inventor
 
By Examiner:
EP 193354   A  US 4012525  A         77-21776Y/12 (ELIL ) LILLY & CO ELI
EP 193354   A  US 4035511  A         77-52098Y/29 (MASI ) MASSACHUSETTS
                          INST TECHNOLOGY
EP 193354   A  US 4083982  A         78-40255A/22 (MASI ) MASSACHUSETTS
                          INST TECHNOLOGY/MESSING R B; LYTLE L D
US 4594358  A  US 4035511  A         77-52098Y/29 (MASI ) MASSACHUSETTS
                          INST TECHNOLOGY
US 4594358  A  US 4083982  A         78-40255A/22 (MASI ) MASSACHUSETTS
                          INST TECHNOLOGY/MESSING R B; LYTLE L D
US 4594358  A  US 4313896  A         82-13654E/07 (ELIL ) LILLY & CO ELI/
                          MOLLOY B B; SCHMIEGEL K K
US 4594358  A  US 4314081  A         75-49665W/30 (ELIL ) LILLY & CO ELI
 
                       CITED LITERATURE REFERENCES 
 
Family Member   Cat    Citation
By Examiner:
US 4594358  A          Merck Index, 9th Ed (1976) pp. 1015-1016.
US 4594358  A          Hynes et al., Drug Development Research, 2, 33
                       (1982).
US 4594358  A          Messing et al., Psychopharmacology Communications,
                       1(5), 511 (1975).
US 4594358  A          Larson et al., Life Sciences, 21, 1807 (1977).
US 4594358  A          Sugrue et al., J. Pharm. Pharmac., 28, 447 (1976).
 
 
  6/2/18 
DIALOG(R)File 342:Derwent Patents Citation Indx
(c) 2001 Derwent Info Ltd. All rts. reserv.
 
00186200  WPI Acc No: 84-252258/41
 Use of fluoxetine or norfluoxetine as anti-anxiety agents - pref. 
 administered orally as hydrochloride salts 
Patent Assignee: (ELIL ) LILLY & CO ELI
Author (Inventor): STARK P
Patent Family:
  Patent No   Kind Date           Examiner Field of Search
  GB 2137496     A  841010 (BASIC)
  AU 8426457     A  841011
  DE 3413093     A  841011
  DE 3467704     G  880107
  DK 166479      B  930601
  DK 8401132     A  841009
  EP 123469      A  841031
  EP 123469      B  871125
  GB 2137496     B  861001
  IT 1175977     B  870812
  JP 59193821    A  841102
  JP 94035382    B2 940511
  US 4590213     A  860520
  ZA 8402457     A  851002
Derwent Week (Basic): 8441
Priority Data: US 483087 (830408)
Applications:  US 483087 (830408); DK 841132 (840228); ZA 842457 (840402);
    JP 8467310 (840403); DE 3413093 (840406); EP 84302361 (840406); GB
    848880 (840406); GB 84408880 (840406)
Designated States
   (Regional): BE; CH; DE; FR; GB; IT; LI; LU; NL; SE
Derwent Class: B05
Int Pat Class: A61K-031/13; A61K-031/135
Number of Patents: 014
Number of Countries: 015
Number of Cited Patents: 009
Number of Cited Literature References: 000
Number of Citing Patents: 022
 
                              CITED PATENTS 
 
Family Member  Cited Patent  Cat     WPI Acc No   Assignee/Inventor
 
By Examiner:
EP 123469   A  US 4194009  A         75-49665W/30 (ELIL ) LILLY & CO ELI
EP 123469   A  US 4313896  A         82-13654E/07 (ELIL ) LILLY & CO ELI/
                          MOLLOY B B; SCHMIEGEL K K
EP 123469   B  US 4194009  A         75-49665W/30 (ELIL ) LILLY & CO ELI
EP 123469   B  US 4313896  A         82-13654E/07 (ELIL ) LILLY & CO ELI/
                          MOLLOY B B; SCHMIEGEL K K
US 4590213  A  US 4018895  A         75-49665W/30 (ELIL ) LILLY & CO ELI
US 4590213  A  US 4194009  A         75-49665W/30 (ELIL ) LILLY & CO ELI
US 4590213  A  US 4313896  A         82-13654E/07 (ELIL ) LILLY & CO ELI/
                          MOLLOY B B; SCHMIEGEL K K
US 4590213  A  US 4314081  A         75-49665W/30 (ELIL ) LILLY & CO ELI
US 4590213  A  US 4329356  A         82-43710E/21 (ELIL ) LILLY & CO ELI/
                          HOLLAND D R
?