Pattern recognition Mauricio Orozco-Alzate
 

We, human beings, perceive information about the surrounding environment through our senses. Using a set of general concepts or patterns that we have learned about objects, together with multi-sensorial information and the cognitive ability of recognition, we can, for instance, recognize each character of the alphabet, distinguish between male and female faces, or identify a known person when hearing a voice on the phone. The same basic ability, enriched with the knowledge acquired throughout a learning process (e.g. years of education), allows physicians to make a medical diagnosis based on clinical findings and symptoms or, analogously, gives an expert in seismology the ability to recognize a type of volcanic event based on the analysis of seismic signals and their corresponding spectra.

The aforementioned examples and, in general, all processes of recognition involve the classification or identification of objects, persons, events or situations. Afterwards, a decision is made and/or a particular action is executed; e.g. the rejection of pieces identified as faulty or damaged on a production line, or the granting of entrance authorization in a security system.

Complex and repetitive classification tasks, the need to increase the reliability and objectivity of decisions, judgments or diagnoses, and research on the human brain promoted the development of algorithms, also called machines, capable of extracting essential knowledge from the environment and representing it mathematically, in order to learn from such a representation a concept of class or category and to identify or classify objects automatically. The areas of artificial intelligence (AI), machine learning (ML) and pattern recognition (PR) arose from these interests. Even though they share a significant part of their foundations and applications, some authors consider them a single area, while others claim that there are important differences, particularly in the underlying concepts as well as in the applications. Bishop (2006), for example, simply distinguishes the different origins of PR and ML: "Pattern recognition has its origins in engineering, whereas machine learning grew out of computer science. However, these activities can be viewed as two facets of the same field". Kuncheva (2005) sketched an illustrative map of how AI, ML and PR, as well as other close areas such as statistics, data mining and neural computation, interact with and depend on each other. Such a map is shown in the following figure:

The core of my research is related to the field of pattern recognition, and particularly to statistical PR; that is, PR based on probability theory, i.e. the decision-theoretic approach. There are several definitions of PR in the classical textbooks (Fukunaga, 1990; Duda et al., 2001; Webb, 2002; van der Heijden et al., 2004; Theodoridis and Koutroumbas, 2006). Some of them are very general, while others are quite specific. Theodoridis and Koutroumbas (2006) define PR as "the scientific discipline whose goal is the classification of objects into a number of categories"; Duda et al. (2001) define it as "the act of taking in raw data and taking an action based on the category of the pattern". Taking those definitions into account, I give mine here with the intention of being comprehensive: "Pattern recognition is the discipline that attempts to find ways of imitating the human capacity for using sensorial information and knowledge intelligently, providing mathematical foundations, models and methods for learning from a limited number of examples, in order to automate the process of classification or categorization". The knowledge cited in my definition is usually called prior knowledge: it is what we, or the machine, know about the object beforehand. Similarly, sensorial information, perceived through our senses or acquired by sensors as the case may be, is called empirical knowledge. The combined use of both sources of information produces the posterior knowledge, on which decisions and actions for minimizing errors or losses are based.
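The combination of prior and empirical knowledge into posterior knowledge can be sketched with Bayes' rule, the cornerstone of the decision-theoretic approach mentioned above. The following Python fragment is only an illustrative toy (the class names and numbers are invented for the example): class priors play the role of prior knowledge, class-conditional likelihoods of an observation play the role of empirical knowledge, and the decision rule picks the class with the highest posterior probability, which minimizes the probability of error.

```python
# Prior knowledge: how frequent each class is, before seeing any data.
priors = {"event_A": 0.7, "event_B": 0.3}

# Empirical knowledge: p(x | class) for one particular observation x.
likelihoods = {"event_A": 0.2, "event_B": 0.9}

# Posterior knowledge via Bayes' rule: p(class | x) ∝ p(x | class) * p(class).
unnormalized = {c: likelihoods[c] * priors[c] for c in priors}
evidence = sum(unnormalized.values())
posteriors = {c: v / evidence for c, v in unnormalized.items()}

# Bayes decision rule: choose the class with the highest posterior.
decision = max(posteriors, key=posteriors.get)
print(posteriors)  # event_B: 0.27 / 0.41 ≈ 0.659, despite its lower prior
print(decision)    # event_B
```

Note how the strong likelihood for event_B overrides its lower prior: neither source of knowledge alone determines the decision.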

Recommended books

R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd ed. New York: Wiley-Interscience, 2001.
K. Fukunaga, Introduction to Statistical Pattern Recognition. New York: Academic Press, 1990.

C. M. Bishop, Pattern Recognition and Machine Learning. Singapore: Springer, 2006.

A. R. Webb, Statistical Pattern Recognition, 2nd ed. London, UK: Wiley, 2002.

F. van der Heijden, R. P. W. Duin, D. de Ridder, and D. M. J. Tax, Classification, Parameter Estimation and State Estimation: An Engineering Approach Using MATLAB. Chichester, UK: Wiley, 2004.

S. Theodoridis and K. Koutroumbas, Pattern Recognition, 3rd ed. Amsterdam: Elsevier Academic Press, 2006.

P. A. Devijver and J. Kittler, Pattern Recognition: A Statistical Approach. London: Prentice Hall International, 1982.

Software

PRTools
PRTools is a MATLAB-based toolbox for pattern recognition. It started in the Delft Pattern Recognition Group at the Faculty of Applied Physics (later Applied Sciences) of Delft University of Technology, around 1993. This research group had already been working in the field since 1963; its interests cover many areas in pattern recognition and image processing, as well as robotics. The subgroup concentrating on pattern recognition methodology designed PRTools for educational purposes, as well as a service to colleagues who need a basic software package for building applications or for comparative studies. It can be downloaded from http://www.prtools.org/. PRTools supplies about 200 user routines for traditional statistical pattern recognition tasks. It includes procedures for data generation, training classifiers, combining classifiers, feature selection, linear and non-linear feature extraction, density estimation, cluster analysis, evaluation and visualisation. It is intended to aid students and researchers in designing and evaluating new algorithms and in building prototypes.
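PRTools itself is MATLAB software, but its basic train/classify workflow can be illustrated in a language-neutral way. The following Python/NumPy sketch implements a nearest-mean classifier, one of the simplest routines of the kind such a toolbox supplies; the function names here are my own for illustration, not PRTools routines.

```python
import numpy as np

def train_nearest_mean(X, y):
    """Training: compute one mean vector per class label."""
    return {label: X[y == label].mean(axis=0) for label in np.unique(y)}

def classify_nearest_mean(model, X):
    """Classification: assign each row of X to the class with the closest mean."""
    labels = list(model)
    means = np.stack([model[l] for l in labels])            # (n_classes, n_features)
    dists = np.linalg.norm(X[:, None, :] - means[None], axis=2)
    return np.array([labels[i] for i in dists.argmin(axis=1)])

# Two well-separated 2-D Gaussian classes as toy data.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(5, 1, (20, 2))])
y = np.array([0] * 20 + [1] * 20)

model = train_nearest_mean(X, y)
pred = classify_nearest_mean(model, X)
print((pred == y).mean())  # training accuracy; near 1.0 for well-separated classes
```

The same two-step pattern (estimate a model from labeled examples, then apply it to new data) underlies every classifier in a toolbox of this kind, from nearest-mean to density-based methods.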

Statistics Toolbox - MATLAB
Statistics Toolbox provides engineers, scientists, researchers, financial analysts, and statisticians with a comprehensive set of tools to assess and understand their data. Statistics Toolbox software includes functions and interactive tools for analyzing historical data, modeling data, simulating systems, developing statistical algorithms, and learning and teaching statistics. The Statistics Toolbox supports a wide range of tasks, from basic descriptive statistics to developing and visualizing multidimensional nonlinear models. It offers a rich set of statistical plot types and interactive graphics, such as polynomial fitting and response surface modeling. All Statistics Toolbox functions are written in the open MATLAB language, which means that you can inspect the algorithms, modify the source code, and create your own custom functions. See more at http://www.mathworks.com/products/statistics/

R
R is a language and environment for statistical computing and graphics. It is a GNU project similar to the S language and environment, which was developed at Bell Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and colleagues. R can be considered a different implementation of S. There are some important differences, but much code written for S runs unaltered under R. R provides a wide variety of statistical techniques (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, ...) and graphical techniques, and is highly extensible. The S language is often the vehicle of choice for research in statistical methodology, and R provides an Open Source route to participation in that activity. One of R's strengths is the ease with which well-designed publication-quality plots can be produced, including mathematical symbols and formulae where needed. Great care has been taken over the defaults for the minor design choices in graphics, but the user retains full control. See more at http://www.r-project.org/.
