How to Design Diverse Libraries of Solid Catalysts? How to Design Diverse Libraries of Solid Catalysts?* Catharina Klannera, David Farrussengb, Laurent Baumesc, Claude Mirodatosb, Ferdi Sch¸tha a b c MPI f¸r Kohlenforschung, Kaiser-Wilhelm-Platz 1, 45470 M¸lheim, Germany Phone: 49-208-306 2373, Fax: 49-208-306 2995, email: [email protected] Institut de Recherches sur la Catalyse ± CNRS 2, avenue A. Einstein, 69626 Villeurbanne Cedex, France Equipe de Recherche en Inge¬nierie des Connaissances, Universite¬ Lumie¡re Lyon 2, Ba√timent L, 5, avenue P. Mende¡s-France, 69676 Bron Cedex, France and Institut de Recherches sur la Catalyse ± CNRS 2, avenue A. Einstein, 69626 Villeurbanne Cedex, France Review Article 1 Introduction High throughput experimentation (HTE) in catalysis research and materials science has ± in spite of its relatively short history ± already reached an impressive level of sophistication with respect to synthetic methods [1], reactor technology [2], and fast analytical assays [3], and several review papers are available which cover these developments [4]. In order to fully exploit the advantages associated with the success in the above mentioned areas, equally sophisticated methods are required to manage the flow of data and to extract useful information from these data. However, suitable solutions to these problems, often even partial solutions, are still lacking. Fully integrated and adapted informatics tools to capture, store and treat the high throughput workflow of data for heterogeneous catalysis and materials are yet to be developed. Equally important, efficient software based methods for library design, for which developments on the fundamental level are still necessary, are urgently needed to make full use of the novel experimental tools. Some of the problems are similar as in high throughput drug discovery, where advanced software support solutions followed the experimental developments. However, the high complexity of solids and heterogeneously catalyzed processes creates novel challenges going beyond those faced in drug discovery. These challenges are often not acknowledged, yet, in the community. In this essay we will point out often-underestimated fundamental differences between drug discovery and materials science, which are faced in software assisted library design for high throughput approaches. Even if HTE increases the screening power by orders of magnitude, the number of potential experiments to be carried out is infinite (see, for example, the considerations of Jansen concerning the number of possible solid compounds [5]). Therefore new methodologies have to be developed which allow the design of efficient libraries. * We would like to acknowledge the FCI and the EU (MarieCurie-Program) for financial support and S. Kaskel for helpful discussions. QSAR Comb. Sci. 22 (2003) DOI: 10.1002/qsar.200320003 Novel concepts and strategies for screening with specialized software components are proposed to enhance discovery and optimization rates. HTE in heterogeneous catalysis and materials science relies on the iterative preparation and testing of large libraries of solids, either in a parallel mode or sequentially. The process starts with the design of an initial set of catalysts, which can be done either randomly, or following certain rules, or be based on the experience and intuition of the chemist who designs the library. Such an initial library is then prepared and tested. After analyzing the results and based on the analysis, a new set of experiments is designed. This methodology is not fundamentally different from the one used in the past in the search for novel catalysts and processes. However, the role of the chemist drastically changes because the numbers of experiments to be conducted and the amount of data to be collected and treated are orders of magnitude higher. Without an efficient informatics environment, it is impossible to plan and design such vast numbers of experiments [6]. In the beginning of a high throughput discovery program two possible starting situations can be identified: (i) screening is based on prior information and catalytic systems are available which show some activity for the desired reaction, or (ii) there is essentially no precedence of a catalyst, or the systems previously investigated do not seem to have the potential for further improvement. The first situation is often described by the term ™optimization program∫, the second situation subsumed under the label ™discovery program∫. 2 Design of Focused Libraries in Optimization Programs In the first case, relevant and/or potentially relevant factors are thought to be known so that libraries can be designed in a well defined frame. Library design would concentrate on a composition and parameter space around systems which are known to work and which are varied in a systematic and efficient way. In order to do this, some tools, which will be briefly discussed in the following, are available. These tools ¹ 2003 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim 729 Catharina Klanner et al. have already been used in conventional and also in high throughput catalysis research. The response surface, which is a modeling technique from the DoE (Design of Experiments) toolbox, provides quantitative interpolations while minimizing the number of experiments (for examples of the application in catalysis see refs. [7] [8] [9]). The experiments are designed in such a way that the multivariate regression results in a more robust model with respect to statistical calculations. However, as for all modeling techniques, a model has to be a priori postulated and a posteriori validated. The main limitation is that regressions can only operate on continuous variables. In addition, this technique can hardly be applied for more than eight parameters. Expert system based methods have also been used to select or to optimize catalysts for given applications even before the advent of high throughput experimentation in catalysis research. The expert systems were trained either by input of heuristic knowledge, based on literature data or experience [10], or neural networks were used to suggest optimized catalysts by correlating crucial parameters with performance. The latter approach was chosen by Hattori in the nineties[11] [12] [13] for several examples with some success but does not seem to have been widely adapted. The reason for this may be, that initially only very limited consistent data sets were available, on which these methods could be applied. The situation, however, is changing, since high throughput experimentation now provides the possibilities to generate large, consistent sets of performance data in reasonable time frames. Artificial neural networks can become efficient tools to guide the combinatorial development of catalysts and economize on experiment time. Thus, in this context the use of neural networks will probably become much more important in the future, and the first publications are emerging [14]. Many optimization methods inspired by Darwin×s evolution theory (Genetic Algorithms, Evolutionary Strategies, ... ) have been developed for numerous and various purposes. They are commonly grouped under the name of Evolutionary Algorithms. The principles of natural evolution based on population selection, crossover and mutation are their common features, and some applications in heterogeneous catalysis have been reported for optimizing catalysts. The group of Baerns applied an evolutionary strategy to optimize catalysts for the oxidative dehydrogenation of propane [15] and in the low temperature oxidation of propane [16]. Elements of the periodic table were a priori selected according to heuristic knowledge to form the parameter space which bound the search. After evaluation of randomly selected quarternary formulations in a catalytic flow system, the catalysts were ranked with respect to the target criterion. The next generation was designed following principles of biological evolution and thus an altered catalyst population was generated. Optimization occurs through generations of populations of trial solutions with increasing average fitness. For oxidative dehydrogenation of propane the conversion-selectivity plot as a function of the 730 number of generation shows clearly the convergence of the algorithm after 4 generations towards a zone at 15% conversion and 55% selectivity to propene achieved over catalysts containing of V, Mg, Mo and Ga. Similar approaches as the one described above were recently presented at two conferences for different reactions [17]. 3 Diversity and Descriptors While rational methods for improving initial hits are thus available, library design in case (ii) listed above is much more difficult. If there is no useful guideline for the design of an initial library, or if one deliberately decides to discard previous ideas about the relevant factors influencing the performance of solids in a given reaction (a reason for this possibly being that so far discovery programs using conventional wisdom have failed), one will typically desire to design a ™diverse∫ library in order to increase chances of discovering regions in parameter space which would justify further exploration. Such libraries are meant to result in the discovery of compositions and conditions around which more focused libraries can be constructed to optimize the systems further. In the following the terms ™diverse∫ and ™descriptor∫ are central in the discussion. We will therefore at this point explain what is meant by these terms which are being used extensively in high throughput drug discovery. There is no objective definition of ™molecular diversity∫ in drug discovery [18], and the same holds for solids. We will call a library of solids ™diverse∫, if it gives a maximum of different responses in a certain target application. It would also be possible to use the term ™diverse∫ to describe the range of different solids used, i.e. with respect to composition, synthesis conditions, etc. However, for the application of this term as outlined in the following, the first definition is more appropriate. One should keep in mind though, that the term ™diverse∫ is highly context dependent. As a simple example for the definition introduced, in the catalytic oxidation of propene a library of only four solids would be called ™diverse∫, if under given conditions one catalyst would be totally inactive, one catalyst would lead to total combustion, one catalyst lead to the formation of acrolein and one catalyst to the formation of oligomers. A nondiverse library would be one where all solids result in the same reaction product under the same conditions. It is clear that the diversity in a given application has to be defined more precisely, for instance by describing many classes of performance, into which catalysts and conditions could be grouped. This classification could also entail different responses to different reactions conditions, for instance, different temperature levels, different feed compositions, pressure, residence time and so on. Even if the definition has to be specified for a certain application, the general principle of a diverse library should have become clear. Another key concept is that of a ™descriptor∫ which is also known from drug discovery. This concept is explained in ¹ 2003 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim QSAR Comb. Sci. 22 (2003) How to Design Diverse Libraries of Solid Catalysts? Figure 1. Exemplary representation of different attributes which can be assigned to a solid catalyst. The inner circle contains attributes which are correlated to the entire solid. The middle circle represents the mass ratio of all elements in the catalyst, and the outer contains parameters which are related to each element, respectively. Attributes and combinations of attributes which correlate with catalytic activity are descriptors, which all together form a descriptor vector. fig. 1: A catalyst in a catalytic reaction is characterized by different attributes, which are connected with the catalyst synthesis, the ingredients, the reaction conditions and so on. Some of these attributes are irrelevant with respect to catalytic performance, but others are thought to somehow correlate with catalytic activity, and these latter ones are called ™descriptors∫. A descriptor is not necessarily only just one attribute, but also combinations of attributes may be descriptors. This shall be exemplified for the case of acidic isomerization catalysts. If zeolite based catalysts are considered, the aluminum content will certainly be important for the performance. In the case of zirconias, however, aluminum content is most probably an unimportant category, and instead the sulfate content plays a major role. Thus the combination of silicon and aluminum (for zeolites) or zirconium and sulfur (for sulfated zirconias) will be descriptors which correlate with activity in isomerization reactions. In addition, the performance of a catalyst will typically not be related to one descriptor only, but to a combination of different descriptors. For instance, in the example mentioned above (reactions catalyzed by strong acids), the presence of silicon and aluminum alone is not sufficient to correlate to activity. Additional requirements to induce activity are synthesis under hydrothermal conditions and the presence of protons obtained by ion exchange and possibly other parameters. The individual descriptors are thus combined to form a ™descriptor vector∫. QSAR Comb. Sci. 22 (2003) 4 Concept of a Discovery Program The discussion below will address the question how to create diverse libraries for solids and how to identify a descriptor vector. The major question in the design of a discovery library is how to select the samples to be tested in order to maximize the browsing. When no knowledge is available, on which the design of an initial library can be based, it is obvious, that one should try to cover the available compositional and parameter space as completely as possible with as few as necessary experiments. ™Design of Experiments∫ (DoE) is a long established mathematical technique also in catalysis research [19] which is very powerful for designing experiments to optimize efficiency and to arrive at statistically significant conclusions concerning the influence of parameters and their cross-correlations with a minimum number of experiments. Software commercially available can generate experimental matrices for the analysis of more than 100 factors. However, they are in some cases not sufficiently flexible to be adapted to problems like materials synthesis, for which a set of synthetic rules with given boundary conditions already exist. In addition, in a DoE approach one needs to decide prior to design, which factors should be analyzed, and it is not a priori clear whether these are the ones which are the most relevant ones. DoE tools will suggest the statistically optimally diverse library within the preselected boundaries, but not de novo create diverse ¹ 2003 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim 731 Catharina Klanner et al. libraries. Nevertheless, DoE techniques can be suitable means to increase efficiency in high throughput experimentation. In the following, however, we will discuss some concepts how software assisted methods may actually go beyond a statistical optimization and support the de novo design of diverse libraries without predefining boundaries in which to operate. In order to illustrate the relevant issues, we will first discuss, how diverse libraries are created in drug discovery programs and then contrast this with the problems faced when dealing with solids and their catalytic performance. The search for optimal diversity in libraries is an important subfield in high throughput drug discovery programs, and many methods have been proposed to optimally design diverse libraries (see, for instance, two excellent reviews [20] [21]). Catalysis research might be able to build upon the experience gained in these applications. 4.1 The ™Similar Property Principle∫ in Drug Discovery Scientists working in the field of drug discovery mostly rely on the ™similar property principle∫ which states that structurally similar molecules will exhibit similar physicochemical and biological activity [22] [23]. This assumption is reasonable, since identical functionality, for instance the presence of a donor group in a drug molecule, could be expected to lead to similar binding properties, or similar sizes and shapes of two molecules might lead to similar fit into a pocket of a possible receptor. Although the exact correlation between structure of a molecule and its performance as a drug is normally not known, computer algorithms are used to identify ™similar∫ molecules. Then only selected examples out of a class of ™similar∫ molecules are actually tested and the number of syntheses and tests necessary is dramatically reduced. In order to create diverse libraries, one tries to select molecules from as many different classes as possible so that maximally dissimilar molecules are explored. The assignment of a molecule to a certain class is made possible by representing the molecules in a computer readable form by ™descriptors∫ valid for molecular entities. These captured features of the molecule are thought to be correlated with its function. The descriptors can, for instance, be two-dimensional fingerprints, such as absence or presence of certain chemical functionalities, or can be socalled pharmocophores, which relate to the relative spatial arrangement of three selected chemical functionalities, or physico-chemical properties, or many others [24]. Many different of these descriptors have been suggested, and typically, molecules are not only described by one descriptor but by a whole set of descriptors, which represent the molecule in the computer and which are the basis for diversity analysis. Using these descriptors, the degree of similarity of two molecules can be described by certain parameters, such as the Tanimoto coefficient, or other measures [25]. With these means the molecules are grouped together, for which different algorithms are available, and 732 thus classes are identified where molecules belonging to the same class are ™similar∫ and molecules from different classes are ™dissimilar∫. For the creation of a diverse library, only few molecules from each class are selected and actually synthesized and tested. 4.2 Is there a ™Similar Property Principle∫ for Solids? If a similar approach would be possible for solids, the efficiency of HTE in catalysis and materials research could be substantially increased. However, there are fundamental differences between molecules and solids with respect to diversity and with respect to their representation in a computer. If the synthesis sequence for an organic molecule is known, the identity of the resulting product is typically known a priori, even if there may be complications by side reactions or incomplete conversion, while the properties of a solid can only be established a posteriori by analysis, after the solid has been made. Synthesis planning of inorganic solids is only in its infancy [5, 26], even considering just the structure of solids, and even more so their textural parameters and defect structures. There is thus no basis to judge on the property of a solid based on the ingredients alone and the information on the synthesis protocol. In drug discovery, many of the possible descriptors can easily be calculated from the two-dimensional molecular structure of the molecule, quite similar to what chemists do intuitively while looking at the structural formula. A simple representation as the two-dimensional structural formula of an organic molecule is impossible for complex solids. Descriptors of solids thus have to be developed following different lines, as will be discussed below. Even more difficult is the question of diversity. The diversity of a library of solids can be considered from two sides, i.e. from the side of the solid, or from the side of the performance of the solid in the desired application. This is the same as in drug discovery. It can be the molecules which are diverse, but it may also be their performance in a binding test. However, due to the assumed validity of the similar property principle, diversity in the molecule library should correspond to diversity in performance in the case of drug development. Generating diverse libraries in drug discovery is therefore simpler than when dealing with solids ± even if it still is a formidable task ± since the problem can be reduced to analyze the molecules without initially considering their performance. If dealing with solid catalysts on the other hand, a high diversity of the library of solids could be totally irrelevant with respect to the catalytic performance of this library. As pointed out by Schlˆgl [27] at a very early stage of the development of HTE approaches towards heterogeneous catalysis, properties of solids can change in a discontinuous way with composition, for instance, if new phases are developed or if a miscibility region is left and phase separation occurs. These differences between solid state chemistry and molecular chemistry have long been acknowledged in solid state chemistry and materials science ¹ 2003 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim QSAR Comb. Sci. 22 (2003) How to Design Diverse Libraries of Solid Catalysts? [28]. In catalysis, also, the effect of additives on the promoter level is strong when the promoter is present at low concentrations, but might level out, if the concentration is increased or could even be detrimental for the performance. In addition to these points, the surface of a solid, on which the reaction proceeds, is typically not uniform, but consists of many different sites, many of which could be metastable and therefore difficult to predict, the catalytically relevant sites may in fact not be regular sites, but rather defects in the solid. Thus, the underpinning for descriptor based preselection of compounds, which is used in the drug industry, the similar property principle, is basically missing when dealing with solids. A purely compositional diversity does not need to result also in a performance diversity. Solids which appear to be very similar, judging from their composition, may have very different catalytic performance. Gold in the form of very small particles performs radically different than bulk gold, for example, although the catalysts in both cases might consist of the same support and contain identical amounts of gold [29]. It will thus be necessary to develop novel ways which take into account these nonlinear dependencies for the classification of solids to allow a computer prescreening. However, taking into account the nonlinearities for the creation of a diverse library creates a problem: to do so, one would need the information, how the performance of the solid depends on its composition and on its preparation parameters and this information is not available (except in very simple cases, where, for instance, volcano shaped dependencies between certain properties and performance are known [30]). In fact, if it were available, one would not need the high throughput approach, because the right composition could be calculated back from the desired performance. 4.3 Methodology to Design Discovery Libraries How could one then develop suitable descriptors to predict the performance diversity of a library of solid catalysts? We propose an approach to design diverse discovery libraries based upon such descriptors. Briefly, the process consists of the following steps, some of which are discussed in more detail below, some of which are only mentioned, because they relate to information science problems which are less relevant in a chemistry context, although crucial for the success of the program: (i) a diverse based upon experience and ™chemical intuition∫ collection of compounds is synthesized, (which means that also compounds are included for which it is known that they do not have any interesting catalytic properties in the target application) (ii) these compounds are described by as many attributes as possible and encoded in computer readable form (iii) the compound collection is tested in a target application, (iii) the results of the test together with the attributes classified and subjected to algorithms for dimensionality reduction and correlation to obtain the descriptor vector. (v) Finally a new diverse QSAR Comb. Sci. 22 (2003) library is created in silico, made and tested to validate the descriptor vector. The key requirement is thus to develop a set of descriptors ± regardless what they are ± which can predict diversity in an initial catalyst library. Then the question arises, what kinds of properties are potentially useful as elements of a descriptor vector and how would one identify these descriptors among the attributes of a catalyst? A descriptor certainly needs to be a property which is either known from the synthesis, or tabulated in a data base, or which can be very easily calculated. Since descriptor vectors shall be used for virtual prescreening in order to create a diverse library, any descriptor which needs lengthy calculations or requires the real synthesis of the solid is not useful. Possible descriptors are then (i) synthesis related characteristics, for instance, elements added to the synthesis and their concentrations, additives used in the synthesis, synthesis temperature, synthetic method, etc., (ii) tabulated data, for instance, electronegativities of constituent elements, the differences in electronegativities, melting points, enthalpy of formation of different oxides of the constituent elements, differences in the formation enthalpies of the different oxides, number of readily accessible oxidation states of constituents in oxides, ratios of ionic radii (relates to structure), or (iii) calculated data, for instance, adsorption enthalpies of reagents on constituents of the catalyst, provided that the calculation is fast. It is expected that a successful descriptor vector will have many elements, since no simple correlations of properties and the performance of solids in certain applications can be expected for complex catalysts. The question, however, is, how one would select those elements which can indeed be used in a descriptor vector for a certain target application, since no a priori information should enter. To implement an approach for the identification of significant descriptor vectors, a process lined out in the following for the example of oxidation reactions seems to be useful (Fig. 2). At the beginning, an initial library of solids has to be designed and synthesized. The solids are chosen using the criterion to be as diverse as possible. This initial diverse library is created based on chemical intuition and literature data, which can be supported by software based visualization tools. In the next step each solid is described as precisely as possible. Every accessible information on the synthesis procedure should be taken into account and stored in a relational database. The synthesis related data is very important because in a later validation step, not only physicochemical data on the solid is needed, but also a recipe for synthesis. Another part of the database contains element specific attributes which have been measured for almost all elements. Two major groups of fixed parameters may be taken into account, i.e. physical properties related to the elements themselves and values related to the element oxides, in the case of an oxidation reaction. The first group consists of physical properties such as e.g. boiling point, heat capacity, electronegativity, and atomic radius, whereas the ¹ 2003 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim 733 Catharina Klanner et al. Figure 2. Outline of a concept for the evaluation of descriptors for solids. a) Development of a descriptor vector b) De novo library design. For detailed information see text. second group contains parameters like e.g. dielectric constant, enthalpy of oxide formation, or density. Additionally, the characteristics of the element ions such as ionic radius, or possible oxidation states are taken into consideration. It is obvious that no simple relationship will exist between such descriptors of a solid and its performance, but combinations of many parameters could allow to predict the diversity in the performance of different catalysts. Parallel to the storage of the characteristics in the data base, the solids are really synthesized and their performance evaluated in a catalytic test in a model reaction. A good model reaction should lead to a variety of reaction products, which assures a high diversity in the response of the library. As a good model for the evaluation of hydrocarbon oxidation catalysts the oxidation of propene is suggested. The wider the variety of possible products the more precisely the activity patterns can be described. With the knowledge of the performance of each solid in a specific reaction, all solids can be classified into clusters of similar behavior. This can be done by well-known classification techniques such as clustering and partitioning methods. It is a priori not clear which method is the most suitable one, but in drug discovery programs hierarchical methods have proved to be superior to non-hierarchical clustering techniques if property data are used to describe molecules [31] [32]. Now both parts, the measured response patterns, which are grouped in clusters of similar response, and the catalysts coded with their attributes, are accessible for correlation. The aim is to find descriptor vectors which are discriminative with respect to the cluster assignment. In other 734 words, each cluster, corresponding to an unique activity pattern, will be represented by a distinct set of attributes and correlations of attributes, the descriptor vector. This descriptor vector has predictive power for diversity of the catalytic behavior of this library of solids. One should not underestimate the problems with respect to the algorithms and data treatment associated with this work. The attributes of the catalysts in a given reaction will have different data formats, such as binary, continuous, nominal, ordinal, and not all algorithms are equally well ± or at all ± suited to deal with different data types. Furthermore, the number of attributes is very large compared with the number of catalysts, and reduction of the diversity space is thus necessary. Since several of the attributes may be related and not independent of each other, suitable methods to identify these related attributes with subsequent elimination of the redundant ones have to be used. One of the most frequently employed methods to do this is principle component analysis. A discussion of these more technical aspects, however, would go far beyond the scope of this essay and we only mention it here to alert the reader to this problem. Many techniques have been developed in drug industry, and some of the solutions found there will be transferable to catalyst libraries. In a last step, the descriptor vector suggested on the basis of the procedure outlined above has to be validated on solids, which have not been used to identify the descriptors. Therefore a new, very big library of solids has to be created ™in silico∫ at random, for instance by using randomizers to create compositions and synthetic procedures which would be used in the synthesis of the solids. Then the descriptor ¹ 2003 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim QSAR Comb. Sci. 22 (2003) How to Design Diverse Libraries of Solid Catalysts? vectors of these randomly generated solids in the library are determined. Classification algorithms could be used to group these suggested catalysts in different classes, and then only few solids from each class would be selected. By this procedure, a smaller subset of the in silico library would be identified which should still represent the diversity of the huge parent library. Only these selected solids are then indeed synthesized in order to verify whether the library is diverse as predicted by the algorithm. For the verification, the new library has to undergo a catalytic test, and the results have to be classified like previously described. A comparison of the experimentally measured and the theoretically predicted performance classes will reveal whether the concept is applicable. 5 Conclusions We have outlined the similarities and dissimilarities between library design in drug discovery and in heterogeneous catalysis and suggested methods to create diverse libraries of catalysts. This method of creating diverse libraries does not need to assume the validity of the similar property principle, since the descriptors are identified via a ™calibration∫ run which is assumed to be prototypical for a class or reactions, such as propene oxidation may be a prototype for alkene oxidation reactions, possibly even for hydrocarbon oxidation reactions. A related approach has, in fact, also been used in drug discovery programs. The so called ™affinity fingerprint∫ [33] [34] is not any more obtained from the molecular structure, but by (initially) measuring the binding constant of the molecules to be described against certain reference proteins. The list of binding constants (the ™affinity fingerprint∫) is used as a descriptor. For the use of the affinity fingerprint, the validity of the similar property principle is not necessary. Since it was subsequently realized that binding constants could also be calculated, so that no time consuming experimental determination of the binding constants are necessary any more, in silico prescreening of molecule libraries is now possible and successfully applied using the affinity fingerprint [35]. We believe that the methodology outlined above could become as useful in library design for HTE in catalysis. In principle, such an approach would even be very useful without its implementation in a HTE program, since the discovery of descriptors actually generates ™knowledge∫ on the performance of solid catalysts. However, the generation of consistent datasets without employing high throughput experimentation would need an excessive amount of experiment time. Thus, the very methodology of HTE which make library design tools necessary, for the first time also allows to develop powerful methods for this library design. If approaches as the one lined out above are successful, catalysis research will benefit to a much larger extent from HTE approaches than just by the acceleration of synthesis and testing. The identification and analysis of attributes which are useful as descriptors will give a deeper insight in QSAR Comb. Sci. 22 (2003) the factors which are key for a certain catalytic performance and will thus bring us closer to a real design of catalysts. Eventually, at the very end, HTE approaches may make themselves obsolete because our basis of consistent data is so solid that sufficient ™knowledge∫ for the design of catalysts is available without having the need to run excessive discovery programs. In addition, in the future high throughput technologies might help to advance a more knowledge based catalyst development also in another respect: In the previous discussion it was stated, that a descriptor should be a property which is known or can rapidly be computed to allow ™in silico∫ screening. If high throughput characterization tools are sufficiently powerful, also the information from physico-chemical characterization could be used to develop descriptors. This would still mean that the full libraries have to be synthesized as opposed to the pure ™in silico∫ screening, but the level of physico-chemical understanding could be dramatically improved if measured properties were incorporated. The discussion suggests, that software assisted methods for the creation of diverse libraries are more difficult to implement in the case of solids as compared to drug-like molecules. However, there are two aspects where dealing with solids is easier, especially for catalytic applications. First, if the concept lined out above is properly implemented, the software based library design tool will suggest how to prepare a certain solid to be tested, since synthesis related parameters are among the attributes used to encode the virtual library. This is typically not the case for drug-like molecular entities for which a whole software based methodology for synthesis planning has been developed in parallel to other tools used in high throughput drug discovery [36]. Second, and perhaps more importantly, although a solid is more difficult to describe than a molecule, the response space is far more simple: in the case of solid catalysts, one wants to predict their behavior if exposed to gaseous or liquid reagents in a (normally) simple reactor. Although this is difficult enough, in drug industry one tries to predict the performance of a potential drug molecule in a living body with its multitude of feedback loops and wide variety of different environments. This is a problem of formidable complexity. We are thus optimistic that in spite of the problems related to the computer representation of solids and the non-validity of the similar property principle there is much room for software assisted strategies in the discovery of solid catalysts and other materials. References [1] a) X. D. Xiang, X. Sun, G. Briceno, Y. Lou, K.-A. Wang, H. Chang, W. G. Wallace-Freedman, S.-W. Chen, P. G. Schultz, Science 1995, 268, 1738 ± 1740; b) D. E. Akporiaye, I. M. Dahl, A. Karlsson, R. Wendelbo, Angew. Chem. 1998, 110, 629 ± 631; Angew. Chem. Int. Ed. 1998, 37, 609 ± 611; c) E. Reddington, A. Sapienza, B. Gurau, R. Viswanathan, S. Sarangapani, E. S. Smotkin, T. E. Mallouk, Science 1998, 280, 1735 ± 1737; d) J. Klein, C. W. Lehmann, H.-W. Schmidt, W. F. ¹ 2003 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim 735 Catharina Klanner et al. Maier, Angew. Chem. 1998, 110, 3557 ± 3561; Angew. Chem. Int. Ed. 1998, 37, 3369 ± 3372; e) C. Hoffmann, A. Wolf, F. Sch¸th, Angew. Chem. 1999, 111, 2971 ± 2975; Angew. Chem. Int. Ed. 1999, 38, 2800 ± 2803; f) T. Johann, A. Brenner, M. Schickardi, O. Busch, F. Marlow, S. Schunk, F. Sch¸th, Angew. Chem. 2002, 114, 3096 ± 3100; Angew. Chem. Int. Ed. 2002, 41, 2966 ± 2968. [2] a) C. Hoffmann, A. Wolf, F. Sch¸th, Angew. Chem. 1999, 111, 2971 ± 2975; Angew. Chem. Int. Ed. 1999, 38, 2800 ± 2803; b) C. Hoffmann, H.-W. Schmidt, F. Sch¸th, J. Catal. 2001, 198, 348 ± 354; c) P. Claus, D. Hˆnicke, T. Zech, Catal. Today 2001, 67, 319 ± 339; d) S. Thomson, C. Hoffmann, S. Ruthe, H.-W. Schmidt, F. Sch¸th, Appl. Catal., A 2001, 220, 253 ± 264; e) U. Rodemerck, P. Ignaszewski, M. Lucas, P. Claus, M. Baerns, Top. Catal. 2000, 13, 249 ± 252. [3] a) F. C. Moates, M. Somani, J. Annamalai, J. T. Richardson, D. Luss, R. C. Willson, Ind. Eng. Chem. Res. 1996, 35, 4801 ± 4803; b) A. Holzwarth, H.-W. Schmidt, W. F. Maier, Angew. Chem. 1998, 110, 2788 ± 2792; Angew. Chem. Int. Ed. 1998, 37, 2644 ± 2647; c) S. M. Senkan, Nature 1998, 394, 350 ± 353; d) E. Reddington, A. Sapienza, B. Gurau, R. Viswanathan, S. Sarangapani, E. S. Smotkin, T. E. Mallouk, Science 1998, 280, 1735 ± 1737; e) P. Cong, R. D. Doolen, Q. Fan, D. M. Giaquinta, S. Guan, E. W. McFarland, D. M. Pooray, K. Self, H. W. Turner, W. H. Weinberg, Angew. Chem. 1999, 111, 507 ± 512; Angew. Chem. Int. Ed. 1999, 38, 483 ± 488; f) S. M. Senkan, S. Ozturk, Angew. Chem. 1999, 111, 867 ± 871; Angew. Chem. Int. Ed. 1999, 38, 791 ± 795; g) M. Orschel, J. Klein, H.-W. Schmidt, W. F. Maier, Angew. Chem. 1999, 111, 2961 ± 2965; Angew. Chem. Int. Ed. 1999, 38, 2791 ± 2794; h) H. Su, E. S. Yeung, J. Am. Chem. Soc. 2000, 122, 7422 ± 7423; i) C. M. Snively, G. Okarsdottir, J. Lauterbach, Angew. Chem. 2001, 113, 3117 ± 3120; Angew. Chem. Int. Ed. 2001, 40, 3028 ± 3030; j) T. Johann, A. Brenner, M. Schickardi, O. Busch, F. Marlow, S. Schunk, F. Sch¸th, Angew. Chem. 2002, 114, 3096 ± 3100; Angew. Chem. Int. Ed. 2002, 41, 2966 ± 2968; k) O. Busch, C. Hoffmann, T. Johann, H.-W. Schmidt, W. Strehlau, F. Sch¸th, J. Am. Chem. Soc., in print. [4] a) B. Jandeleit, D. J. Schaefer, T. S. Powers, H. W. Turner, W. H. Weinberg, Angew. Chem. 1999, 111, 2648 ± 2689; Angew. Chem. Int. Ed. 1999, 38, 2494 ± 2532; b) S. Senkan, Angew. Chem. 2001, 113, 322 ± 341; Angew. Chem. Int. Ed. 2001, 40, 312 ± 329; c) M. T. Reetz, Angew. Chem. 2001, 113, 292 ± 320; Angew. Chem. Int. Ed. 2001, 40, 284 ± 310; d) J. M. Newsam, F. Sch¸th, Biotechnol. Bioeng. 1999, 61, 203 ± 216;. e) F. Gennari, P. Seneci, S. Miertus, Catal. Rev. ± Sci. Eng. 2000, 42, 385 ± 402. [5] M. Jansen, Angew. Chem. 2002, 114, 3896 ± 3917; Angew. Chem. Int. Ed. 2002, 41, 3747 ± 3766. [6] D. Farrusseng, L. Baumes, C. Hayaud, I. Vauthey, P. Denton, C. Mirodatos in Principles and Methods for Accelerated Catalyst Design (Eds.: E. Derouane et al.) NATO Science Series, Kluwer Academic Publishers, Dordrecht, 2002, p. 469 ± 479. [7] J. P. Pirard, B. Kalitvenzeff, Ind. Eng. Chem. Fundam. 1978, 17, 11 ± 17. [8] M. Iborra, J. F. Izquierdo, F. Cunill, J. Tejero, Ind. Eng. Chem. Res. 1992, 31, 1840 ± 1848. [9] P. Rao, S. Divakar, World J. Microbiol. Biotech. 2002, 18, 341 ± 345. [10] a) E. Kˆrting, M. Baerns, Chem. Ing. Tech. 1990, 62, 365 ± 372; b) M. Baerns, N. Guan, E. Kˆrting, U. Lindner, M. Lohrengel, H. Papp, Int. J. Energy Res. 1994, 18, 197 ± 204. [11] T. Hattori, S. Kito, Catal. Today 1995, 23, 347 ± 355. 736 [12] S. Kito, T. Hattori, Y. Murakami, Appl. Catal., A. 1994, 114, L173 ± 178. [13] S. Kito, T. Hattori, Y. Murakami, Ind. Eng. Chem. Res. 1992, 31, 979 ± 981. [14] A. Corma, J. M. Serra, E. Argente, V. Botti, S. Valero, Chem. Phys. Chem. 2002, 3, 939 ± 045. [15] D. Wolf, O. V. Buyevskaya, M. Baerns, Appl. Catal., A 2000, 200, 63 ± 77. [16] U. Rodemerck, D. Wolf, O. V. Buyevskaya, P. Claus, S. M. Senkan, M. Baerns, Chem. Eng. J. 2002, 82, 3 ± 11. [17] From Eurocombicat 2002: a) A. Corma, J. M. Serra, A. Chica, Book of Abstracts p. 47 ± 48; b) G. Grubert, S. Kolf, L. Cholinska, M. Baerns, P. van Geem, R. Parton, Book of Abstracts p. 49 ± 51; c) J. M. Serra, A. Corma, D. Farrusseng, L. Baumes, C. Mirodatos, C. Flego, C. Perego, Book of Abstracts p. 98 ± 99. [18] H. Kubinyi, Curr. Opin. Drug Discovery. Dev. 1998, 1, 16 ± 27. [19] For instance: a) R. Ramos, M. Menendez, J. Santamaria, Catal. Today 2000, 56, 239 ± 245; b) M. Nele, A. Vidal, D. L. Bhering, J. C. Pinto, V. M. M. Salim, Appl. Catal,. A 1999, 178, 177 ± 189; c) E. A. Dawson, P. A. Barnes, Appl. Catal., A 1992, 90, 217 ± 231. [20] J. Bajorath, J. Chem. Inf. Comput. Sci. 2001, 41, 233 ± 245. [21] D. K. Agrafiotis, J. C. Myslik, F. R. Salemme, Mol. Diversity 1999, 4, 1 ± 22. [22] G. M. Maggiora, M. A. Johnson, Concepts and Applications of Molecular Similarity, Wiley, New York, 1990, p. 99 ± 117. [23] R. D. Brown, Y. C. Martin, J. Chem. Inf. Comput. Sci. 1996, 36, 572 ± 584. [24] R. Todeschini, V. Consonni, Handbook of Molecular Descriptors, Wiley-VCH, Weinheim, 2000. [25] For a good discussion of similarity searching see, for instance: P. Willett, J. M. Barnard, G. M. Downs, J. Chem. Inf. Comput. Sci. 1998, 38, 983 ± 996. [26] J. C. Schˆn, M. Jansen, Angew. Chem. 1996, 108, 1358 ± 1377; Angew. Chem. Int. Ed. 1996, 35, 1286 ± 1304. [27] R. Schlˆgl, Angew. Chem. 1998, 110, 2467 ± 2470; Angew. Chem. Int. Ed. 1998, 37, 2333 ± 2336. [28] H. Sch‰fer, Angew. Chem. 1971, 83, 35 ± 42; Angew. Chem. Int. Ed. 1971, 10, 43 ± 50. [29] M. Haruta, N. Yamada, T. Kobayashi, S. Iijima, J. Catal. 1989, 115, 301 ± 309. [30] See, for instance, several examples in: G. Ertl, H. Knˆzinger, J. Weitkamp (Eds.) Handbook of Heterogeneous Catalysis, Wiley-VCH, Weinheim, 1997. [31] G. M. Downs, P. Willett, W. Fisanick, J. Chem. Inf. Comput. Sci. 1994, 34, 1094 ± 1102. [32] R. D. Brown, Y. C. Martin, J. Chem. Inf. Comput. Sci. 1996, 36, 572 ± 584. [33] J. N. Weinstein, K. W. Kohn, M. R. Grever, V. N. Viswanadhan, L. V. Rubinstein, A. P. Monks, D. A. Scudiero, L. Welch, A. D. Koutsoukos, A. J. Chiausa, K. D. Paull, Science 1992, 258, 447 ± 451. [34] L. M. Kauvar, D. L. Higgins, H. O. Villar, J. R. Sportsman, A. Engvist-Goldstein, R. Bukar, K. E. Bauer, H. Dilley, D. M. Rocke, Chem. Biol. 1995, 2, 107 ± 118. [35] H. Briem, U. F. Lessel, Perspect. Drug Discovery Des. 2000, 20, 231 ± 244. [36] a) W.-D. Ihlenfeldt, J. Gasteiger, Angew. Chem. 1995, 107, 2807 ± 2829; Angew. Chem. Int. Ed. 1995, 34, 2613 ± 2633; b) J. Gasteiger, M. Pfˆrtner, M. Sitzmann, R. Hˆllering, O. Sacher, T. Kostka, N. Karg, Perspect. Drug Discovery Des. 2000, 20, 1 ± 21. Received on May 15, 2003; Accepted on May 8, 2003 ¹ 2003 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim QSAR Comb. Sci. 22 (2003)
© Copyright 2024