Faculty Articles


Using Machine Learning in Medical Classification


By: Hadeel Mohammed Saleh


Machine learning is a branch of artificial intelligence (AI) concerned with designing and developing algorithms and techniques that give computers the ability to learn without being explicitly programmed. It focuses on developing computer programs that can teach themselves to grow and change when exposed to new data.

Machine learning means programming computers to optimize a performance criterion using example data or past experience. Machine learning algorithms are commonly categorized as supervised or unsupervised: supervised algorithms apply what has been learned from labelled data in the past to new data, while unsupervised algorithms draw inferences from data sets that carry no labels.
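The difference between the two categories can be illustrated with a minimal sketch. The temperature readings, the labels and both function names below are hypothetical and chosen only for illustration; neither function is a standard algorithm from a library.

```python
def learn_threshold(values, labels, positive="sick"):
    """Supervised: use the known labels to place a cut midway between the class means."""
    pos = [v for v, label in zip(values, labels) if label == positive]
    neg = [v for v, label in zip(values, labels) if label != positive]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def split_at_largest_gap(values):
    """Unsupervised: no labels are given; group the values by cutting at the widest gap."""
    ordered = sorted(values)
    gaps = [(b - a, i) for i, (a, b) in enumerate(zip(ordered, ordered[1:]))]
    _, cut = max(gaps)
    return ordered[:cut + 1], ordered[cut + 1:]


# Hypothetical body temperatures in degrees Celsius.
temperatures = [36.5, 36.7, 36.9, 38.8, 39.1, 39.4]
known_labels = ["healthy", "healthy", "healthy", "sick", "sick", "sick"]

threshold = learn_threshold(temperatures, known_labels)   # needs the labels
group_a, group_b = split_at_largest_gap(temperatures)     # uses the values only
```

The supervised function cannot run without `known_labels`, whereas the unsupervised one finds two groups from the values alone and leaves it to the analyst to decide what the groups mean.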

Data mining is the process of discovering interesting patterns and knowledge from large amounts of data. It is a natural evolution of database technology, in great demand, with a wide range of applications. It can also be defined as the process of discovering meaningful new correlations, patterns and trends by sifting through massive amounts of data stored in databases, using pattern recognition techniques as well as statistical and mathematical methods. A further definition describes it as the analysis of observational data sets to find unsuspected relationships and to summarize the data in ways that are both understandable and useful to the data analyst.

Data mining is an interdisciplinary field that brings together techniques from machine learning, pattern recognition, statistics, databases and visualization to address the problem of extracting information from large databases. It is a developing field of study. Researchers should pay attention to mining various types of data, including numerical and alphanumerical formats, video, voice, speech, graphics, text, images and combinations of these.

Data mining tasks fall into two categories. Descriptive mining summarizes the basic features of the data in the database; clustering, association and sequential mining are its basic tasks. Predictive mining extracts patterns from the data in order to make predictions; its tasks include classification, regression and deviation detection.

Classification is a form of data mining that can be used to extract models describing important classes of data or to predict future data trends.

Data classification is a two-step process. In the first step, a model is built that describes a predetermined set of data classes. The model is constructed by analyzing database tuples described by attributes.

Each tuple is assumed to belong to a predefined class, as determined by one of the attributes, known as the class label. In the second step, the model is used to classify new patterns.
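The two steps can be sketched in a few lines. The example below assumes a nearest-centroid rule as the model, which is only one of many possible choices and is not prescribed by the article; the records, attribute meanings and function names are hypothetical.

```python
from collections import defaultdict
from math import dist


def build_model(tuples, class_labels):
    """Step 1: describe each class by the mean (centroid) of its training tuples."""
    groups = defaultdict(list)
    for attributes, label in zip(tuples, class_labels):
        groups[label].append(attributes)
    return {
        label: tuple(sum(column) / len(rows) for column in zip(*rows))
        for label, rows in groups.items()
    }


def classify(model, attributes):
    """Step 2: assign a new tuple to the class whose centroid is closest."""
    return min(model, key=lambda label: dist(model[label], attributes))


# Hypothetical tuples: (body temperature in Celsius, resting heart rate in bpm).
training_tuples = [(36.6, 70), (36.8, 72), (38.9, 98), (39.2, 104)]
training_labels = ["healthy", "healthy", "sick", "sick"]

model = build_model(training_tuples, training_labels)
new_pattern_label = classify(model, (39.0, 100))
```

Here the class label is kept in a separate list for clarity; in a database table it would simply be one more attribute of each tuple.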

Several tools are used to solve classification problems. They have been applied successfully in fields such as radar signal classification, character recognition, remote sensing, medical diagnosis, expert systems and speech recognition.

Among these tools, probably the most important feature of decision trees is their ability to break down a complex decision-making process into a set of simpler decisions, thus providing a solution that is often easier to interpret.
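As a rough illustration of this idea, a small decision tree can be written by hand as nested questions, each about a single attribute. The attribute names and thresholds below are hypothetical and carry no clinical meaning; a real tree would be learned from data.

```python
def triage(patient):
    """Hand-written decision tree: every node asks one simple question."""
    if patient["temperature_c"] >= 38.0:        # root node
        if patient["heart_rate_bpm"] >= 100:    # second-level node
            return "high risk"
        return "medium risk"
    if patient["age_years"] >= 65:              # other branch of the root
        return "medium risk"
    return "low risk"


example_patient = {"temperature_c": 38.6, "heart_rate_bpm": 110, "age_years": 40}
example_result = triage(example_patient)
```

Each path from the root to a returned label reads as a plain rule, which is why the result is comparatively easy to explain to a clinician.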

For machine learning techniques, classification usually involves two stages: a training stage and a testing stage. In the first stage, the features of the data are pre-processed and the feature space is divided among the classes. In the second stage, those divisions of the feature space are used to classify the data.

For instance, a person might be classified as sick or healthy, or a patient might be classified as “high risk” or “low risk” according to the pattern of their disease. This is a supervised learning method, because the class categories are known in advance. There are two types of classification: binary classification and multi-class classification. Binary classification has only two possible classes, such as “high” or “low” risk patient, while multi-class classification has more than two targets, such as “high”, “medium” and “low” risk patient.
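The two label sets can be contrasted with a short sketch. It assumes that some classifier has already produced a risk score between 0 and 1; the score, the cut points and the function names are hypothetical.

```python
def binary_label(risk_score, threshold=0.5):
    """Binary classification: exactly two possible targets."""
    return "high" if risk_score >= threshold else "low"


def multiclass_label(risk_score, cut_points=(0.33, 0.66)):
    """Multi-class classification: more than two possible targets."""
    low_cut, high_cut = cut_points
    if risk_score < low_cut:
        return "low"
    if risk_score < high_cut:
        return "medium"
    return "high"


score = 0.5  # hypothetical output of a trained model
labels_for_score = (binary_label(score), multiclass_label(score))
```

The same patient can therefore receive different labels depending on how many classes the problem defines, so the class set has to be fixed before training.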

In classification methods, the data set is divided into two sets: a training data set and a testing data set. The training data set is used to train the classifier, while the testing data set is used to validate it.
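A minimal sketch of such a split is given below. The 25% test fraction, the fixed seed and the record identifiers are assumptions made for the example, not values taken from the article.

```python
import random


def train_test_split(records, test_fraction=0.25, seed=0):
    """Shuffle a copy of the records and hold out a fraction for testing."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)   # seeded so the split is repeatable
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical patient record identifiers.
records = ["p01", "p02", "p03", "p04", "p05", "p06", "p07", "p08"]
train_set, test_set = train_test_split(records)
```

The classifier should see only `train_set` while it is being built; `test_set` is kept aside so that the validation reflects performance on data the model has not seen.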

Classification is commonly used in healthcare organizations, especially for diagnosis.

Data mining is widely used in many organizations and fields, and it is becoming increasingly popular in medical diagnosis. Medical data mining is one of the most popular topics in the data mining community, both because of its importance to the field and because of the challenges it poses.

Medical diagnosis is one of the most essential tasks to carry out, which is why well-designed systems in this area can prove very effective and efficient. Not all doctors have expertise in every area, so human resources are insufficient. For this reason, automatic diagnosis systems have been introduced; such a system can draw on all previous cases together, and an appropriate computer-based system that supports this information can help obtain clinical tests at reduced cost. A comprehensive study is needed for the efficient and effective implementation of these automated systems.

Predictive modelling is one of the most frequently used applications in automated systems. This type of classification predicts target variables that are categorical by nature, such as classifying individuals according to their health status. Three kinds of error are usually expected. The first is the false-positive case (FP), in which a person who is not a patient is incorrectly diagnosed as a patient; a false positive can cause financial cost and worry. The second is the false-negative case (FN), in which a patient is incorrectly diagnosed as healthy; this can waste precious time and may even lead to loss of life if the patient goes untreated. The last is the unclassifiable error, in which the system is not able to classify the case, possibly because of a lack of historical data, overlapping information, or an inadequate algorithm.
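These error types can be counted by comparing predicted labels with the true ones. In the sketch below, a prediction of `None` is an assumed convention for a case the system could not classify, and the label lists are hypothetical.

```python
def count_errors(actual, predicted, positive="sick"):
    """Tally correct results, false positives, false negatives and unclassified cases."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0, "unclassified": 0}
    for truth, guess in zip(actual, predicted):
        if guess is None:
            counts["unclassified"] += 1     # system gave no answer
        elif guess == positive and truth == positive:
            counts["TP"] += 1               # patient correctly flagged
        elif guess == positive:
            counts["FP"] += 1               # healthy person flagged as patient
        elif truth == positive:
            counts["FN"] += 1               # patient reported as healthy
        else:
            counts["TN"] += 1               # healthy person correctly cleared
    return counts


actual = ["sick", "sick", "healthy", "healthy", "sick", "healthy"]
predicted = ["sick", "healthy", "sick", "healthy", None, "healthy"]
error_counts = count_errors(actual, predicted)
```

Keeping FP and FN as separate counts matters in diagnosis because, as noted above, the two mistakes have very different consequences for the patient.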


