Review of Fuzzy Rule Based Classification Systems
Chandrasekar Ravi, Neelu Khare
School of Information Technology and Engineering, VIT University, Vellore, India
*Corresponding Author E-mail: chandrasekar.r@vit.ac.in
ABSTRACT:
Fuzzy rule-based classification systems (FRBCSs) have received significant attention from researchers because of their good behaviour on real-time databases. An important issue in the design of an FRBCS is the optimized generation of the fuzzy if-then rules and the membership functions. Inductive learning of a fuzzy rule classifier struggles with rule generation and rule optimization when the search space or the number of variables becomes large. This has motivated the design of fuzzy systems with a small set of precise rules, with the aim of reducing scalability problems and improving accuracy. Accordingly, different approaches have been presented in the literature for finding optimal fuzzy rules using optimization algorithms. Even with these techniques, choosing the type and number of membership functions and defining their parameters remain challenging tasks. In this paper, optimization algorithms for the optimal design of membership functions and for optimal rule generation are reviewed.
KEYWORDS: Fuzzy Rule Based Classification System, Fuzzy Logic, Data Classification, Literature Review.
INTRODUCTION:
Solving complex engineering problems by traditional means is not an easy task. Computational intelligence has become a popular way to solve such problems. In research, hybrid computational intelligence, which includes neural networks, decision trees and fuzzy rule-based systems, is commonly used. Within machine learning, fuzzy rule-based classification systems (FRBCSs) are widely used because they give the end user a clear representation of the learned knowledge. FRBCSs are applied in many real applications, such as anomaly intrusion detection, image processing and medical applications. In such applications, the available data contain a large number of patterns and variables. This leads to a rapid growth of the fuzzy rule search space, which in turn hampers the inductive learning of FRBCSs. As the search space grows, the learning process becomes more complicated, which results in scalability and complexity problems.
A fuzzy expert system uses a set of if-then rules and membership functions. The rules form the rule base of the fuzzy expert system, on which qualitative reasoning is carried out to deduce the results. From the fuzzy if-then rules, a fuzzy relation is created that expresses the relation between the input and the output. Fuzzy if-then rules are normally specified by domain experts. If domain experts are unavailable, the rules are extracted from the training data. Extracting rules from data is a search problem in a high-dimensional space in which each point represents a rule set, its membership functions and the corresponding system behaviour.
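As a minimal sketch of how if-then rules and membership functions work together, the following Python example evaluates a tiny rule base. The triangular membership shape, the term names in `TERMS`, the min operator for combining antecedents and the single-winner decision are illustrative assumptions, not choices prescribed by any of the reviewed papers.

```python
def triangular(x, a, b, c):
    """Triangular membership function with feet a and c and peak b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical linguistic terms for features scaled to [0, 1].
TERMS = {
    "low": (-0.5, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.5),
}


def firing_strength(antecedent, sample):
    """Combine the antecedent memberships with the min operator (an assumption)."""
    return min(triangular(sample[i], *TERMS[term]) for i, term in antecedent)


def classify(rules, sample):
    """Single-winner reasoning: return the class of the strongest rule."""
    best = max(rules, key=lambda rule: firing_strength(rule[0], sample))
    return best[1]


# Each rule: (list of (feature index, linguistic term), class label).
RULES = [
    ([(0, "low"), (1, "low")], "A"),        # IF x0 is low AND x1 is low THEN class A
    ([(0, "high"), (1, "medium")], "B"),    # IF x0 is high AND x1 is medium THEN class B
]
```

Calling `classify(RULES, [0.1, 0.2])` evaluates both rules on the sample and returns the class of the rule with the highest firing strength. Learning an FRBCS amounts to searching over the contents of `RULES` and `TERMS`.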
Several techniques are available for generating and learning fuzzy classification rules from numerical data, including simple heuristic procedures, neuro-fuzzy techniques, clustering methods and genetic algorithms. Many heuristic and metaheuristic algorithms, including Particle Swarm Optimization (PSO), Simulated Annealing, Firefly and Artificial Bee Colony optimization (ABC), are inspired by the behaviour of biological and physical systems in nature. All of these algorithms have advantages and disadvantages. For example, simulated annealing provides an optimal solution if the simulation runs long enough and the cooling process is slow enough. On the other hand, fine adjustments of its parameters affect the convergence rate of the optimization process. A hybrid fuzzy (HF) method has been proposed for extracting a compact rule base. It represents only the rule set in the genetic population and therefore fails to model the fuzzy system completely. In a fuzzy system, the membership functions and the rule set are interdependent, and so they should be designed at the same time.
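The role of the cooling process in simulated annealing can be illustrated with a short Python sketch. The geometric cooling schedule, the parameter names (`t0`, `alpha`, `step_size`) and the one-dimensional toy cost function are assumptions made for illustration; they are not taken from the reviewed papers.

```python
import math
import random


def simulated_annealing(cost, x0, t0=1.0, alpha=0.95, steps=2000,
                        step_size=0.1, seed=0):
    """Minimise a 1-D cost function with geometric cooling (t <- alpha * t)."""
    rng = random.Random(seed)
    x, fx = x0, cost(x0)
    best_x, best_fx = x, fx
    t = t0
    for _ in range(steps):
        candidate = x + rng.uniform(-step_size, step_size)
        fc = cost(candidate)
        # Improvements are always accepted; worse moves are accepted with
        # probability exp(-delta / t), which shrinks as the temperature falls.
        if fc <= fx or rng.random() < math.exp(-(fc - fx) / t):
            x, fx = candidate, fc
            if fx < best_fx:
                best_x, best_fx = x, fx
        # A floor keeps the temperature strictly positive for long runs.
        t = max(t * alpha, 1e-12)
    return best_x, best_fx


def example_cost(x):
    """Hypothetical toy objective with its minimum at x = 2."""
    return (x - 2.0) ** 2
```

A value of `alpha` closer to 1 cools more slowly and explores longer, while a smaller value converges faster but is more likely to settle in a local optimum. This is the parameter sensitivity mentioned above.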
Review of FRBCSs:
A fuzzy system [1] has been modelled to generate an optimal number of rules using a union rule configuration (URC) and a union rule matrix (URM). The URC/URM constraints are outlined, the performance of the two configurations is compared, and the advantage in computation time is demonstrated. A URC-based system was easier to build, easier to tune, and could provide a smoother operational function. The union rule configuration demonstrated significant design and performance advantages. Where performance is a consideration, its structure should be carefully investigated as a viable option.
The FuGeNeSys program [2] generates simple, effective fuzzy models of complex systems from knowledge of the input–output data. The learning technique is essentially based on GAs. To enhance the learning speed, a hill-climbing genetic operator based on neural techniques is used. FuGeNeSys is also capable of correctly selecting significant features. The results achieved are reported to represent a significant advance in the use of mixed techniques in soft computing.
Techniques for automated knowledge acquisition using neural networks, genetic algorithms and rough sets [3] have been described, and a number of research problems have been identified. The problem of appropriate evaluation criteria for new knowledge acquisition techniques for classification systems has been discussed and a number of evaluation criteria described. An empirical study applying several of these automated knowledge acquisition techniques to three data sets has been conducted. Two evaluation criteria, rule base accuracy and rule base comprehensibility, have been applied and evaluated. The study indicates that an automated knowledge acquisition technique based on a genetic algorithm/fuzzy system approach provides excellent results. These compare more than favourably with those obtained from both a neural network fuzzy rule extraction technique and a rough sets approach. For the three data sets investigated, the genetic algorithm/fuzzy system also evolved rule sets that exhibited higher accuracy and comprehensibility than those obtained using the C4.5 inductive algorithm.
An approach to fuzzy modelling of high-dimensional systems [4] has been proposed. The method can be divided into the following steps: 1) generation of fuzzy rules directly from data; 2) rule similarity checking for deletion of redundant and inconsistent rules; 3) optimization of the rule structure using genetic algorithms based on a local performance index; 4) further training of the rule parameters using a gradient-based learning method and deletion of inactive rules; 5) interpretability improvement using regularization. In this way, a compact and interpretable fuzzy model can be obtained for a high-dimensional system. Through structure optimization, the relationship between the inputs and the output can also be revealed, which is very important for understanding an unknown system. The effectiveness of the method is shown by an example. With 20,000 training samples and 11 input variables, the final fuzzy system has only 27 fuzzy rules and performs very well on both the training and test data sets.
An approach to the automatic generation of fuzzy rule-based models [5] has been presented. Two encoding procedures that allow the optimisation of the model structure and parameters by a GA have been described. The learning procedure has been discussed, and an enhancement of the mean square error search criterion was found to produce better solutions for the problem investigated (in terms of engineering judgement). The implementation of the approach has been described, and an application example of HVAC modelling based on real system data has been demonstrated. The validation was based on data collected from a real cooling coil subsystem comprising numerous components. The results demonstrate the ability of the method to derive a model of a high-dimensional problem using relatively few rules. The low number of rules increases the interpretability of the model. The accuracy of the model on the validation data is reasonable. The approach has been extensively validated with a variety of data types, oriented towards component and process modelling.
The learning algorithm in [6] finds fuzzy rules for classification problems based on the processing of the Apriori algorithm. The method tries to find a compact set of fuzzy rules by using a GA to automatically determine appropriate values of the minimum fuzzy support (min FS) and the minimum fuzzy confidence (min FC). Simulation results on the iris data and the appendicitis data demonstrate that the classification accuracy rates of the method are comparable to those of other fuzzy and non-fuzzy methods. Thus, the goal of acquiring an effective, compact set of fuzzy rules for classification problems can be achieved.
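To make the two thresholds concrete, the sketch below computes a fuzzy support and a fuzzy confidence for candidate rules and filters them by min FS and min FC. The definitions used here (support as the class-matching membership mass divided by the number of samples, confidence as the class-matching mass divided by the total mass) are a common formulation assumed for illustration, and the function names are hypothetical. They are not claimed to reproduce the exact procedure of [6].

```python
def fuzzy_support_confidence(memberships, labels, target):
    """Return (fuzzy support, fuzzy confidence) of the rule 'antecedent -> target'.

    memberships[p] is the compatibility grade of training sample p with the
    rule antecedent; labels[p] is the class of sample p.
    """
    n = len(memberships)
    total = sum(memberships)
    matching = sum(m for m, y in zip(memberships, labels) if y == target)
    support = matching / n if n else 0.0
    confidence = matching / total if total > 0 else 0.0
    return support, confidence


def filter_rules(candidates, labels, min_fs, min_fc):
    """Keep the names of candidate rules that meet both thresholds.

    Each candidate is a tuple (name, memberships, target class).
    """
    kept = []
    for name, memberships, target in candidates:
        fs, fc = fuzzy_support_confidence(memberships, labels, target)
        if fs >= min_fs and fc >= min_fc:
            kept.append(name)
    return kept
```

Raising `min_fs` or `min_fc` yields fewer, stronger rules, while lowering them yields a larger rule base. A search procedure such as a GA can tune the two thresholds instead of fixing them by hand.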
The approach in [7] controls the complexity of the induced model explicitly through a trade-off between accuracy and complexity. The first empirical results are rather promising, but there is scope for further development. For example, both phases of the algorithm, association rule mining and genetic search, can be optimized in various ways. Moreover, association analysis can be replaced by alternative methods for generating candidate rules, obtained by modifying standard rule induction algorithms.
An accuracy-based Michigan-style fuzzy rule-based system [8] for continuous states and actions has been developed. Its main advantages are, compared with most genetic fuzzy systems, its capability to perform online learning and, compared with other Michigan-style genetic fuzzy systems, its capability to obtain maximal generalization, i.e., a fuzzy rule set that is as compact as possible. The algorithm design is inspired by the well-known XCS algorithm for non-fuzzy rules. The fitness function is based on an estimate of the performance prediction error in order to look for robust fuzzy rules (in the sense of the received reward). Niche search is also considered. Promising results have been obtained in some function approximation problems and in online learning in a realistic robot simulation. Future work involves investigating the behaviour of the proposal in other reinforcement problems with continuous actions and in multi-step tasks with immediate reward.
The hybridization of fuzzy systems and GAs in genetic fuzzy systems (GFSs) is discussed in [9]. GAs can represent different kinds of structures, such as weights and features together with rule parameters, which makes it possible to encode multiple models of knowledge representation. This provides a wide variety of approaches, each of which requires specific genetic components for evolving a specific representation. GFSs are now a mature research area, in which researchers need to reflect on how to build on the strengths and distinctive features of GFSs in order to provide useful advances in fuzzy systems theory.
The bottleneck of fuzzy expert systems for microarray data classification is knowledge acquisition in the form of if-then rules and membership functions. An Ant Bee Algorithm (ABA) [10] is proposed to address the accuracy-interpretability trade-off in the design of a fuzzy expert system for sample classification. In the proposed ABA, the rule set is represented using integer numbers and evolved using ant colony optimization (ACO). The values of the membership functions are represented using floating-point numbers and are evolved using ABC simultaneously with the rule set. The effectiveness of the approach has been demonstrated using six microarray data sets. The simulation results show that the learning ability of ABA is comparable to that of other approaches, and that its classification error during generalization, estimated for all the data sets using the MCCV procedure, is lower than that of the other approaches. Further, ROC analysis shows that the ABA approach has a low false positive rate and high discrimination power. GO analysis confirms that the genes for which linguistic terms were identified by the ABA approach are involved in metabolic processes and have biological significance in the classification of microarray samples. On the whole, for all the data sets, the ABA approach generated a more compact (average of 5.4 genes in a rule), accurate (average of 98.5 percent overall classification accuracy) and interpretable (average of 2.3 linguistic terms for a gene in a rule) fuzzy expert system than GSA and other approaches reported in the literature.
The bottleneck of fuzzy expert systems for microarray data classification is knowledge acquisition. The Genetic Swarm Algorithm (GSA) [11] acquires knowledge in the form of if-then rules and membership functions. In this approach, a mixed representation is used to encode the solution variables, which are treated as a single individual in the population. During a run, the rule set is evolved using a GA and the values of the membership functions are evolved using PSO simultaneously. In addition to the basic genetic operators, problem-specific and advanced genetic operators are applied for fine-tuning the solution variables. The effectiveness of the approach has been demonstrated using six gene expression data sets. For all the data sets, the approach generated a compact fuzzy system with high classification accuracy, which can help physicians make decisions in the diagnosis of disease.
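A mixed representation of this kind, with integer genes for the rule set and real-valued genes for the membership functions in one individual, might be sketched in Python as follows. The class and function names, the "0 means don't care" convention and the value ranges are hypothetical. For brevity, a Gaussian perturbation stands in for the PSO velocity update, so this is not the GSA of [11], only an illustration of the encoding.

```python
import random
from dataclasses import dataclass


@dataclass
class Individual:
    # Integer genes: rules[r][j] is the linguistic-term index used for
    # feature j in rule r (0 means "don't care"); the last gene is the class.
    rules: list
    # Real-valued genes: sorted membership-function break points per feature.
    mf_params: list


def random_individual(n_rules, n_features, n_terms, n_classes, rng):
    """Create one individual holding both parts of the fuzzy system."""
    rules = [
        [rng.randint(0, n_terms) for _ in range(n_features)]
        + [rng.randrange(n_classes)]
        for _ in range(n_rules)
    ]
    mf_params = [
        sorted(rng.random() for _ in range(n_terms)) for _ in range(n_features)
    ]
    return Individual(rules, mf_params)


def mutate_rules(ind, n_terms, rate, rng):
    """GA-style integer mutation applied to the antecedent genes only."""
    for rule in ind.rules:
        for j in range(len(rule) - 1):
            if rng.random() < rate:
                rule[j] = rng.randint(0, n_terms)


def perturb_mf(ind, sigma, rng):
    """Real-valued update of the break points, clipped to [0, 1] and re-sorted.

    Assumption: Gaussian noise replaces the PSO velocity update here.
    """
    for f, params in enumerate(ind.mf_params):
        moved = [min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in params]
        ind.mf_params[f] = sorted(moved)
```

Keeping both parts in one individual lets a single fitness evaluation score the rule set together with the membership functions it depends on, which is the motivation for evolving them simultaneously.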
Based on complex linguistic data summaries [12], a method for extracting linguistic rules from data sets is proposed in which the degree of confidence of a linguistic rule is expressed, from the fuzzy logic point of view, by a linguistic quantifier and its linguistic truth value. A genetic algorithm is used to optimize the number and parameters of the membership functions of the linguistic values, and the optimized linguistic rules have higher fuzzy linguistic quantifiers and linguistic truth values.
By combining a genetic algorithm with fuzzy sets, a new classifier called the Adaptive Genetic Fuzzy System (AGFS) [13] has been proposed. Rule optimization is done by an adaptive genetic algorithm (AGA) with the aid of a new systematic addition, and classification is carried out by the fuzzy classifier. The frequency of occurrence of the rules in the training data is used as the fitness for the AGA. The performance of the proposed genetic-fuzzy classifier is evaluated by means of quantitative, qualitative and comparative analysis. In the results, AGFS achieved an accuracy of 86.05% on the glass data, whereas the existing system attained only 79.39%. Correspondingly, for the PID data, AGFS achieved an accuracy of 89.80%, while the existing fuzzy-GA attained 89.74%. In future, the proposed system can be extended with other optimization algorithms to produce optimal rules, and it can also be extended to other fuzzy systems. The AGFS can also be applied to different kinds of applications.
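One possible reading of a fitness based on the frequency of occurrence of rules in the training data is sketched below. The equal-interval discretization via `cut_points`, the rule encoding as `(antecedent, class)` pairs and the normalisation by the number of samples are assumptions for illustration and are not taken from [13].

```python
from collections import Counter


def discretize(value, cut_points):
    """Return the index of the interval that contains value."""
    return sum(value > c for c in cut_points)


def rule_frequencies(samples, labels, cut_points):
    """Count how often each (discretized antecedent, class) pair occurs."""
    counts = Counter()
    for x, y in zip(samples, labels):
        antecedent = tuple(discretize(v, cut_points) for v in x)
        counts[(antecedent, y)] += 1
    return counts


def fitness(rule_set, counts, n_samples):
    """Fraction of training samples that match some rule in the rule set."""
    return sum(counts[rule] for rule in set(rule_set)) / n_samples
```

Under this reading, a rule set scores higher when its rules correspond to patterns that occur often in the training data, so the optimizer is steered towards frequently supported rules.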
A rule-based cooperative continuous ant colony optimization (CCACO) algorithm [14] is proposed for fuzzy system (FS) design optimization. For a given number of rules, the CCACO is used to optimize all of the free parameters in the FS to achieve high learning accuracy. In the CCACO, a single fuzzy rule forms a population, and different populations cooperate to form a complete FS. The simulation results show that the CCACO outperforms the GA, PSO and continuous ACO algorithms and the evolving NFSs used for comparison. In particular, the comparison with different continuous ACO algorithms using a single population shows the advantage of introducing the multipopulation and cooperative structure into continuous ACO. In addition, the comparison with PSO algorithms using the same multipopulation topology shows the advantage of using the new continuous ACO algorithm for solution generation and update. In the future, the CCACO can be applied to multiobjective FS design problems to improve optimization performance.
A new classifier called the Bat-based Fuzzy Classifier (BFC) [15] has been proposed by combining the bat algorithm (BA) with fuzzy sets. The BA is used to optimize the rules, whereas the fuzzy system classifies the test data. The primary contributions reported are (1) a simple technique for discretization and design of the membership functions and (2) a fitness function based on the frequency of occurrence of rules in the training data. Quantitative, qualitative and comparative analyses were performed on the proposed bat-fuzzy classifier to study its performance. The experimental results show that BFC achieved accuracies of 76.67%, 75.21% and 68.67% on the Indian liver, lung cancer and mammographic mass datasets, respectively. In future work, the proposed system can be combined with other optimization algorithms for generating optimal rules and can also be extended to other fuzzy systems.
CONCLUSION:
Fuzzy systems are widely applied for classification because they offer flexibility and avoid the learning time required by other classifiers such as neural networks and support vector machines. Despite their wide applicability, fuzzy systems are difficult to design, particularly the rule base and the membership functions, for which domain experts' knowledge is required even when historical data are available. These two steps should be performed automatically to avoid the need for expert knowledge in fuzzy classification systems. This paper has reviewed the challenges in designing the membership functions and the optimized rule set.
REFERENCES:
1. W. Combs and J. Andrews. Combinatorial rule explosion eliminated by a fuzzy rule configuration. IEEE Trans. Fuzzy Syst. 6(1); 1998:1–11.
2. M. Russo. FuGeNeSys: A fuzzy genetic neural system for fuzzy modelling. IEEE Transactions on Fuzzy Systems. 6(373); 1998.
3. Jagielska et al. An investigation into the application of neural networks, fuzzy logic, genetic algorithms, and rough sets to automated knowledge acquisition for classification problems. Neurocomputing. 24(37); 1999.
4. Y. Jin. Fuzzy modeling of high-dimensional systems: Complexity reduction and interpretability improvement. IEEE Trans. Fuzzy Syst. 8(2); 2000:212–221.
5. P. P. Angelov and R. A. Buswell. Automatic generation of fuzzy rule-based models from data by genetic algorithms. Information Sciences. 150(17); 2003.
6. Y. Hu et al. Finding fuzzy classification rules using data mining techniques. Pattern Recognit. Lett. 24(1–3); 2003:509–519.
7. Y. Yi and E. Hullermeier. Learning complexity-bounded rule-based classifiers by combining association analysis and genetic algorithms. Proc. 4th Conf. Eur. Soc. Fuzzy Logic Technol., Barcelona, Spain. 2005:47–52.
8. J. Casillas et al. Fuzzy-XCS: A Michigan genetic fuzzy system. IEEE Transactions on Fuzzy Systems. 15(536); 2007.
9. F. Herrera. Genetic fuzzy systems: taxonomy, current research trends and prospects. Evolutionary Intelligence, Springer. 2008:27–46.
10. Pugalendhi Ganesh Kumar et al. Hybrid Ant Bee Algorithm for Fuzzy Expert System Based Sample Classification. IEEE/ACM Transactions on Computational Biology and Bioinformatics. 11(2); 2014:347–360.
11. P. Ganesh Kumar et al. Design of fuzzy expert system for microarray data classification using a novel Genetic Swarm Algorithm. Expert Systems with Applications. 39; 2012:1811–1821.
12. Dan Meng and Zheng Pei. Extracting linguistic rules from data sets using fuzzy logic and genetic algorithms. Neurocomputing. 78; 2012:48–54.
13. Binu Dennis and S. Muthukrishnan. AGFS: Adaptive Genetic Fuzzy System for medical data classification. Applied Soft Computing. 25; 2014:242–252.
14. Chia-Feng Juang et al. Rule-Based Cooperative Continuous Ant Colony Optimization to Improve the Accuracy of Fuzzy System Design. IEEE Transactions on Fuzzy Systems. 22(4); 2014:723–735.
15. Binu, D. and Selvi, M. BFC: Bat Algorithm Based Fuzzy Classifier for Medical Data Classification. Journal of Medical Imaging and Health Informatics. 5(3); 2015:599-606.
Received on 13.05.2016 Modified on 20.05.2016
Accepted on 27.05.2016 © RJPT All rights reserved
Research J. Pharm. and Tech 2016; 9(8):1299-1302.
DOI: 10.5958/0974-360X.2016.00247.X