
3D Object Recognition by Classification Using Neural Networks

1. Introduction

The growing number of 3D objects available on the Internet or in specialized databases calls for recognition and description techniques that give smart access to the content of these objects. Accordingly, 3D object recognition has been studied extensively and has become very important in many areas.

In this framework, several approaches exist. Among statistical approaches, statistical shape descriptors for recognition generally consist of either computing various statistical moments [1-3] or estimating the distribution of the measurement of a given geometric primitive, which may be deterministic [3] or random [2]. Among the approaches based on statistical distributions, we mention the 3D shape spectrum descriptor (3D SSD) [4], which is invariant to geometric transformations, and algebraic invariants [5], which provide global descriptors expressed in terms of moments of different orders. As for structural approaches, methods that segment a 3D object into patches and represent it by an adjacency graph are presented in [6] and [7].

In the same vein, Tangelder et al. [8] developed an approach based on representations by points of interest. Among approaches by transformation, a very rich literature emphasizes approaches based on the Hough transform [9-11], which detect varieties of dimension (n - 1) embedded in the space.

This work focuses on defining a method that allows the recognition of 3D objects by classification based on neural networks. A classifier is a computing tool that takes as input a list of numbers and provides, at its output, an indication of class. A classifier must be able to model the boundaries that best separate the classes from one another. This modeling uses the concept of a discriminant function, which expresses the classification criterion. Its role is to determine, among a finite set of classes, to which class a particular item belongs.

2. Representation of 3D Objects

A 3D object is represented by a set of points denoted M = {P_i}_{i=1,...,n}, where P_i = (x_i, y_i, z_i) ∈ R^3, arranged in a matrix X. Under the action of an affine transformation, the coordinates (x, y, z) are transformed into new coordinates (x̃, ỹ, z̃) as follows:

(x̃, ỹ, z̃)^T = A (x, y, z)^T + B

where A = (a_ij)_{i,j=1,2,3} is an invertible matrix and B ∈ R^3 is a translation vector.
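A minimal NumPy sketch of this representation and transformation follows; the specific matrix A and vector B are arbitrary illustrative values, not values from the paper:

```python
import numpy as np

def affine_transform(X, A, B):
    """Apply x~ = A x + B to every row of the (n, 3) point matrix X."""
    return X @ A.T + B

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 3))          # five 3D points P_i
A = np.array([[2.0, 0.0, 0.0],
              [0.0, 1.0, 0.5],
              [0.0, 0.0, 1.0]])          # invertible 3x3 matrix (det != 0)
B = np.array([1.0, -2.0, 0.0])           # translation vector in R^3

Y = affine_transform(X, A, B)            # transformed point set
```

Each row of `Y` then satisfies y = A x + B for the corresponding row x of `X`.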

3. Classification

Classification is a research area that was developed in the sixties. It is the basic principle of many decision-support systems for diagnosis. It assigns a set of objects to a set of classes according to their description. This description is made through properties or specific conditions typical of these classes.

Objects are then classified according to whether or not they check these conditions or properties. Classification methods can be supervised or unsupervised.

Supervised methods require from the user a description of the classes, while unsupervised methods are independent of the user: they are statistical grouping methods that sort objects according to their properties and form sets with similar characteristics.

4. The Artificial Neural Networks

Artificial neural networks (ANN) are mathematical models inspired by the structure and behavior of biological neurons [12]. They are composed of interconnected units called artificial neurons, capable of performing specific and precise functions [13]. ANNs can approximate nonlinear relationships of varying degrees of complexity, which makes them well suited to the recognition and classification of data. Figure 1 illustrates this situation.

4.1. Architecture of Artificial Neural Networks

For an artificial neural network, each neuron is interconnected with other neurons to form layers in order to solve a specific problem concerning the input data on the network [14,15].

The input layer is responsible for feeding data into the network; the role of its neurons is to transmit the data to be processed. The output layer presents the results computed by the network for the input vector supplied to it. Between the input and the output, intermediate layers may occur; they are called hidden layers. The role of these layers is to transform the input data in order to extract features that will subsequently be more easily classified by the output layer. In these networks, information is propagated from layer to layer, sometimes even within a layer, via weighted connections.

A neural network operates in two consecutive phases: a design phase and a use phase. The first step is to choose the network architecture and its parameters: the number of hidden layers and the number of neurons in each layer. Once these choices are fixed, the network can be trained. During this phase, the weights of the network connections and the threshold of each neuron are modified to adapt to the different input conditions. Once training is completed, the network goes into the use phase to perform the work for which it was designed.

[FIGURE 1 OMITTED]

4.2. Multilayer Perceptron

For a multilayer network, the number of neurons in the input layer and in the output layer is determined by the problem to be solved [14-16]. The architecture of this type of network is illustrated in Figure 2. According to Lepage and Solaiman [14], the neural network has a single hidden layer with a number of neurons approximately equal to J = 1 + √(N(M + 2)), where:

N : number of input parameters.

M : the number of neurons in the output layer.
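The sizing rule above is easy to evaluate; the example values below (20 inputs, 1 output) are illustrative assumptions, not taken from the paper:

```python
import math

def hidden_layer_size(n_inputs, n_outputs):
    """Hidden-layer sizing rule J = 1 + sqrt(N(M + 2)) from Lepage and
    Solaiman [14], rounded to the nearest whole number of neurons."""
    return round(1 + math.sqrt(n_inputs * (n_outputs + 2)))

# For instance, with N = 20 inputs and M = 1 output neuron:
print(hidden_layer_size(20, 1))  # -> 9
```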

4.3. The Backpropagation Algorithm

The learning algorithm used is the gradient backpropagation algorithm [17]. This algorithm is used in feed-forward networks: layered networks with an input layer, an output layer, and at least one hidden layer. There are no recurrent connections and no connections between neurons in the same layer.

The principle of backpropagation is to present an input vector to the network and compute the output by propagating it through the layers, from the input layer to the output layer through the hidden layers. This output is compared to the desired output, yielding an error. From this error, the gradient of the error is computed and propagated back from the output layer to the input layer, hence the name backpropagation. This allows the weights of the network to be modified, and therefore learning. The operation is repeated for each input vector until the stopping criterion is satisfied [18].

4.4. Learning Algorithm

The objective of this algorithm is to minimize as much as possible the error between the outputs of the network (the computed results) and the desired results. The signal is propagated forward through the layers of the neural network: x_k^(n-1) → x_j^(n). The forward propagation is computed using the activation function g, the aggregation function h (often a scalar product between the weights and the inputs of the neuron) and the synaptic weight w_jk between neuron x_k^(n-1) and neuron x_j^(n), as follows:

x_j^(n) = g(h_j^(n)) = g(Σ_k w_jk^(n) x_k^(n-1))    (1)

[FIGURE 2 OMITTED]

When the forward propagation is complete, we obtain the output y. The error between the output y given by the network and the desired vector s is then computed.

For each neuron i in the output layer, we compute:

e_i^out = g'(h_i^out)[s_i - y_i]    (2)

The error is propagated backward, e_i^(n) → e_j^(n-1), through the following formula:

e_i^(n-1) = g'(h_i^(n-1)) Σ_j w_ij e_j^(n)    (3)

The weights in all layers are then updated:

Δw_ij^(n) = λ e_i^(n) x_j^(n-1)    (4)

where λ is the learning rate (of small magnitude, less than 1.0).
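Equations (1)-(4) can be exercised with a small, self-contained NumPy sketch; the single hidden layer, sigmoid activation g, sizes, seed and data are all chosen here purely for illustration:

```python
import numpy as np

def sigmoid(h):
    # Activation g; its derivative is g'(h) = g(h) * (1 - g(h))
    return 1.0 / (1.0 + np.exp(-h))

rng = np.random.default_rng(1)
N, J, M = 4, 5, 1                     # inputs, hidden neurons, outputs
W1 = rng.standard_normal((J, N)) * 0.5
W2 = rng.standard_normal((M, J)) * 0.5
lam = 0.5                             # learning rate lambda < 1.0

x = rng.standard_normal(N)            # one input vector
s = np.array([1.0])                   # desired output vector

for _ in range(200):
    # Forward pass, Eq. (1): x_j^(n) = g(sum_k w_jk x_k^(n-1))
    h1 = W1 @ x
    a1 = sigmoid(h1)
    h2 = W2 @ a1
    y = sigmoid(h2)
    # Output-layer error, Eq. (2): e = g'(h)(s - y)
    e2 = y * (1 - y) * (s - y)
    # Backpropagated error, Eq. (3): e^(n-1) = g'(h) * W^T e^(n)
    e1 = a1 * (1 - a1) * (W2.T @ e2)
    # Weight updates, Eq. (4): dw_ij = lambda * e_i * x_j
    W2 += lam * np.outer(e2, a1)
    W1 += lam * np.outer(e1, x)
```

After these iterations the network output y approaches the target s, illustrating how the repeated forward/backward passes drive the error toward zero.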

5. Principle of the Proposed Method

The principle of the proposed method is as follows:

Step 1:

Given two 3D objects X (an object of the database) and Y (the query object), we seek to verify whether they are related by an affine transformation and therefore belong to the same class (the class formed by an object and its images under affine transformations). As a first step, we take r random samples of size p from the points of X and from the points of Y, denoted X^(1), ..., X^(r) and Y^(1), ..., Y^(r) respectively. We then study the association between a pair of samples X^(i_0) and Y^(i_0): we extract the parameters (A, B) that map X^(i_0) onto Y^(i_0), i.e. such that:

Y^(i_0) = A X^(i_0) + B    (5)

using neural networks, as shown in Figure 3.

Step 2:

In this step, we first compute the points of the vectors Ŷ^(j) using the previously extracted parameters (A, B) and Formula (5), as follows:

Ŷ^(j) = A X^(j) + B    (6)

[FIGURE 3 OMITTED]

for all j ∈ {1, 2, ..., r} such that j ≠ i_0. We then proceed to calculate the errors err_j, defined as follows:

err_j = (1/k) Σ_{l=1}^{k} ‖Ŷ_l^(j) - Y_l^(j)‖    (7)

corresponding to the pairs of samples (X^(j), Y^(j)), where k is the number of elements of the vector Y^(j).
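As a hedged, non-neural sketch of Steps 1 and 2: the paper extracts (A, B) with a neural network (Figure 3), but ordinary least squares recovers the same parameters from one sample pair, after which the per-sample errors of Eq. (7) can be computed on the remaining samples. All data below are synthetic and all names illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
A_true = np.array([[1.0, 0.2, 0.0],
                   [0.0, 1.5, 0.0],
                   [0.3, 0.0, 1.0]])      # ground-truth affine map
B_true = np.array([0.5, -1.0, 2.0])

p, r = 10, 5                               # sample size, number of samples
Xs = [rng.standard_normal((p, 3)) for _ in range(r)]
Ys = [X @ A_true.T + B_true for X in Xs]   # Y = A X + B for every sample

# Step 1: estimate (A, B) from one sample pair (here via least squares,
# standing in for the neural extraction).  Homogeneous coordinates make
# the problem linear: Y = [X 1] [A^T; B].
X0 = np.hstack([Xs[0], np.ones((p, 1))])
P, *_ = np.linalg.lstsq(X0, Ys[0], rcond=None)
A_est, B_est = P[:3].T, P[3]

# Step 2: apply the estimated (A, B) to the other samples and compute
# the mean point-wise error err_j for each pair, as in Eq. (7).
errs = [np.mean(np.linalg.norm(X @ A_est.T + B_est - Y, axis=1))
        for X, Y in zip(Xs[1:], Ys[1:])]
```

Since Y really is an affine image of X here, every err_j comes out near zero, which is exactly the situation Step 3 is meant to detect.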

Step 3:

This step involves the classification of the objects using a multilayer neural network whose input vector is the vector of errors VectErr = [err_0, err_1, ..., err_r] and whose output is the class c, with c = 1 (class c_1) if φ(VectErr) ≤ δ and c = 0 (class c_0) otherwise, where δ is a preset threshold small enough (very close to zero).

If c = 1, then all the errors belong to the same class, a class in which the errors are very close to zero, i.e.:

err_j ≈ 0 for all j ≠ i_0    (8)

This means that all points of X are transformed into points of Y by the same parameters (A, B), i.e. Y is an affine transformation of X. Otherwise, Y is not an affine transformation of X.
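The decision rule of Step 3 can be sketched in a few lines; the paper leaves the function φ abstract, so taking φ as the maximum of the error vector is an assumption made here purely for illustration, as is the value of δ:

```python
def classify(vect_err, delta=1e-3):
    """Step-3 decision rule: c = 1 if phi(VectErr) <= delta, else c = 0.
    phi is taken here (hypothetically) as the maximum error."""
    return 1 if max(vect_err) <= delta else 0

print(classify([1e-5, 2e-5, 0.0]))  # -> 1: errors near zero, same class
print(classify([0.5, 1e-5]))        # -> 0: not an affine transform
```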

6. Results and Discussion

Consider two 3D objects X (an object of a database) and Y (the query object) related by an affine transformation (Figure 4 and Figure 5) and divided into 20 samples. The goal is to obtain neural structures capable of recognizing the errors corresponding to samples of objects belonging to the same class. The learning base is composed of input variables (the errors) and desired output variables (the classes).

[FIGURE 4 OMITTED]

[FIGURE 5 OMITTED]

[FIGURE 6 OMITTED]

[FIGURE 7 OMITTED]

The desired output serves as a comparison with the computed output during the learning phase. After training the network, tests were conducted on a number of data (errors) to check its performance. According to the results of the validation test (Figure 6 and Figure 7), over 95% of the errors are classified in class c_1, which shows, according to Step 3, that Y is an affine transformation of X.

7. Conclusions

In this work, we presented a classification method for recognizing 3D objects. We developed an approach based on neural networks: the first step splits the objects into r samples and computes the errors for these samples; these errors are then subjected to classification for the recognition of the objects. Recognition is performed using the classification rate of the errors corresponding to the objects (the original object and its clone), as shown in Figure 6. The simulation results were presented and an evaluation of the designed system was made. They are generally satisfactory and show the validity of the proposed method.

doi:10.4236/jsea.2011.45033

REFERENCES

[1] T. Murao, "Descriptors of Polyhedral Data for 3D-Shape Similarity Search," Proposal P177, MPEG-7 Proposal Evaluation Meeting, Lancaster, February 1999.

[2] M. Elad, A. Tal and S. Ar, "Directed Search in a 3D Objects Database Using SVM," Hewlett-Packard Research Report HPL-2000-20R1, 2000.

[3] C. Zhang and T. Chen, "Efficient Feature Extraction for 2D/3D Objects in Mesh Representation," Proceedings of the International Conference on Image Processing (ICIP 2001), Thessaloniki, Greece, 2001.

[4] T. Zaharia and F. Preteux, "3D-Shape-Based Retrieval within the MPEG-7 Framework," Proceedings of the SPIE Conference on Nonlinear Image Processing and Pattern Analysis XII, San Jose, Vol. 4304, 2001, pp. 133-145.

[5] G. Taubin and D. B. Cooper, "Object Recognition Based on Moment (or Algebraic) Invariants," In: J. L. Mundy and A. Zisserman, Eds., Geometric Invariants in Computer Vision, MIT Press, Cambridge, 1992.

[6] S. J. Dickinson, D. Metaxas and A. Pentland, "The Role of Model-Based Segmentation in the Recovery of Volumetric Parts from Range Data," IEEE Transactions on PAMI, Vol. 19, No. 3, 1997, pp. 259-267. doi:10.1109/34.584104

[7] S. Dickinson, A. Pentland and S. Stevenson, "View-point-Invariant Indexing for Content-based Image Retrieval," Proceedings of the 1998 International Workshop on Content-Based Access of Image and Video Databases, Washington, 3 January 1998.

[8] J. W. H. Tangelder and R. C. Veltkamp, "Polyhedral Model Retrieval Using Weighted Point Sets," Technical Report UU-CS-2002-019, Utrecht University, The Netherlands, 2002.

[9] P. V. C. Hough, "Method and Means for Recognizing Complex Patterns," US Patent 3 069 654, 1962.

[10] D. H. Ballard, "Generalizing the Hough Transform to Detect Arbitrary Shapes," Pattern Recognition, Vol. 13, No. 2, pp. 111-122, 1981. doi:10.1016/0031-3203(81)90009-1

[11] J. Illingworth and J. Kittler, "A Survey of the Hough Transform," Computer Vision, Graphics and Image Processing, Vol. 44, No. 1, pp. 87-116, 1988. doi:10.1016/S0734-189X(88)80033-1

[12] A. Benyettou, A. Mesbahi, H. Abdoune and A. Ait-ouali, "La Reconnaissance de Formes Spatio-Temporelles par les Réseaux de Neurones à Délais Temporels," Conférence Nationale sur l'Ingénierie de l'Électronique (CNIE'02), USTO, Oran, Algeria, 2002.

[13] B. Müller, J. Reinhardt and M. T. Strickland, "Neural Networks: An Introduction," Springer-Verlag, Berlin, 1995.

[14] R. Lepage and B. Solaiman, "Les Réseaux de Neurones Artificiels et Leurs Applications en Imagerie et en Vision par Ordinateur," École de Technologie Supérieure, 2003.

[15] I. Khanfir, K. Taouil, M. S. Bouhlel and L. Kamoun, "Stratégie de Traitement des Images de Lésions Dermatologiques," In: M. S. Bouhlel, B. Solaiman and L. Kamoun, Eds., Sciences Électroniques, Technologies de l'Information et des Télécommunications, ISBN 9973-41-685-6, 2003.

[16] I. Maglogiannis, P. D. Koutsouris and D. Koutsouris, "An Integrated Computer Supported Acquisition, Handling, and Characterization System for Pigmented Skin Lesions in Dermatological Images," IEEE Transactions on Information Technology in Biomedicine, Vol. 9, No. 1, March 2005, pp. 86-98. doi:10.1109/TITB.2004.837859

[17] S. E. Fahlman, "An Empirical Study of Learning Speed in Back-Propagation Networks," Computer Science Department, Carnegie Mellon University, Pittsburgh, 1988.

[18] F. Blayo and M. Verleysen, "Les Réseaux de Neurones Artificiels," Presses Universitaires de France, Paris, 1996.

Mostafa Elhachloufi (1), Ahmed El Oirrak (1), Aboutajdine Driss (2), M. Najib Kaddioui Mohamed (1)

(1) University Cadi Ayyad, Faculty Semlalia, Department of Informatics, Marrakech, Morocco; (2) University Mohamed V, Faculty of Science, Department of Physics, LEESA-GSCM, Rabat, Morocco.

Email: elhachloufi@yahoo.fr, oirrek@yahoo.fr, aboutajadine@fsr.ac.ma, kaddioui@yahoo.fr

Received March 27th, 2011; revised April 28th, 2011; accepted May 12th, 2011.
