1) Decision tree

Decision tree methodology is a commonly used data mining method for establishing classification systems based on multiple covariates or for developing prediction algorithms for a target variable.

The basic concept of the decision tree

1. Nodes. There are three types of nodes (Lu and Song, 2017).
- A root node, also called a decision node, represents a choice that results in the subdivision of all records into two or more mutually exclusive subsets.
- Internal nodes, also called chance nodes, represent one of the possible choices available at that point in the tree structure. The top edge of the node is connected to its parent node, and the bottom edge is connected to its child nodes or leaf nodes.
- Leaf nodes, also called end nodes, represent the final result of a combination of decisions or events.

2. Branches (Lu and Song, 2017).
- Branches represent chance outcomes or occurrences that emanate from root nodes and internal nodes.
- A decision tree model is formed using a hierarchy of branches. Each path from the root node through internal nodes to a leaf node represents a classification decision rule.
- These decision tree pathways can also be represented as ‘if-then’ rules.

3. Splitting (Lu and Song, 2017).
- Only input variables related to the target variable are used to split parent nodes into purer child nodes of the target variable.
- Both discrete input variables and continuous input variables that are collapsed into two or more categories can be used.
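The idea of splitting a parent node into "purer" child nodes, and of reading a root-to-leaf path as an ‘if-then’ rule, can be sketched in a few lines of Python. This is only an illustrative sketch: the source does not name a purity measure, so Gini impurity is assumed here as one common choice, and the records, the `hours` feature and the threshold of 4 are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_impurity(records, feature, threshold):
    """Weighted Gini impurity of the two child nodes after a binary split."""
    left = [r["label"] for r in records if r[feature] <= threshold]
    right = [r["label"] for r in records if r[feature] > threshold]
    n = len(records)
    return (len(left) / n) * gini(left) + (len(right) / n) * gini(right)


# Hypothetical toy records: hours studied and a pass/fail outcome.
records = [
    {"hours": 1, "label": "fail"},
    {"hours": 2, "label": "fail"},
    {"hours": 3, "label": "fail"},
    {"hours": 6, "label": "pass"},
    {"hours": 7, "label": "pass"},
    {"hours": 8, "label": "pass"},
]

# Impurity at the root node, before any split.
parent_impurity = gini([r["label"] for r in records])

# Impurity after splitting the root on the assumed threshold hours <= 4.
child_impurity = split_impurity(records, "hours", threshold=4)


def classify(record):
    """The two root-to-leaf paths of this one-split tree as if-then rules."""
    if record["hours"] <= 4:
        return "fail"
    return "pass"
```

In this toy case the split separates the two classes completely, so the weighted impurity of the child nodes is lower than that of the parent. Real data rarely splits this cleanly, which is why further internal nodes are usually needed.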
When building the model, one must first identify the most important input variables, and then split records at the root node and at subsequent internal nodes into two or more categories or ‘bins’ based on the status of these variables.

The types of decision tree
- Classification tree analysis is when the predicted outcome is the class to which the data belongs.
- Regression tree analysis is when the predicted outcome can be considered a real number (e.g. the price of a house, or a patient’s length of stay in a hospital).

2) Logistic regression

- Logistic regression is used to find the probability of event=Success and event=Failure.
- Logistic regression should be used when the dependent variable is binary (0/1, True/False, Yes/No) in nature.
- The binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features).
- It allows one to say that the presence of a risk factor increases the odds of a given outcome by a specific factor.
- Logistic regression does not require a linear relationship between the dependent and independent variables. It can handle various types of relationships because it applies a non-linear log transformation to the predicted odds ratio (Sachan, 2017). What the model does assume is a linear relationship between the predictors and the log-odds of the outcome.

The types of logistic regression

1. Binary logistic regression (Wiley, 2011)
- Used when the dependent variable is dichotomous and the independent variables are either continuous or categorical.
- When the dependent variable is not dichotomous and comprises more than two categories, a multinomial logistic regression can be used instead.

2. Multinomial logistic regression (Wiley, 2011)
- The regression analysis to conduct when the dependent variable is nominal with more than two levels. It is thus an extension of logistic regression, which analyses dichotomous (binary) dependents.
- Multinomial regression is used to describe data and to explain the relationship between one dependent nominal variable and one or more continuous-level (interval or ratio scale) independent variables.

Logistic regression does not assume a linear relationship between the independent variables and the dependent variable, and it may handle nonlinear effects. The dependent variable need not be normally distributed. It does not require that the independents be interval and unbounded. Logistic regression comes at a cost: it requires much more data to achieve stable, meaningful results.
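The statement that a risk factor increases the odds of an outcome "by a specific factor" can be made concrete with a small sketch. The code below assumes a binary logistic model with a single 0/1 predictor; the intercept and coefficient values are hypothetical and chosen only for illustration, not fitted to any data.

```python
import math


def predicted_probability(intercept, coefficient, x):
    """P(y = 1 | x) under a one-predictor binary logistic model."""
    log_odds = intercept + coefficient * x  # the model is linear in the log-odds
    return 1.0 / (1.0 + math.exp(-log_odds))


def odds(p):
    """Convert a probability into odds, p / (1 - p)."""
    return p / (1.0 - p)


# Hypothetical coefficients, for illustration only.
intercept, coefficient = -1.0, 0.7

p_without = predicted_probability(intercept, coefficient, x=0)  # risk factor absent
p_with = predicted_probability(intercept, coefficient, x=1)     # risk factor present

# Ratio of the odds with the risk factor to the odds without it.
odds_ratio = odds(p_with) / odds(p_without)
```

Under these assumptions the odds ratio equals `math.exp(coefficient)` whatever the intercept is, which is why a logistic coefficient is usually read as a multiplicative change in the odds and not as an additive change in the probability.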
In terms of sample size, with standard regression typically 20 data points per predictor are considered the lower bound. For logistic regression, at least 50 data points per predictor are necessary to achieve stable results (Wiley, 2011).

3) Neural network

A neural network is a method of computing based on the interaction of multiple connected processing elements. It is able to deal with incomplete information. When an element of the neural network fails, the network can continue to operate because of its parallel nature (Liu, Yang and Ramsay, 2011).

Basic concept of the neural network (Liu, Yang and Ramsay, 2011)

1. Computational neuroscience
- Understanding and modelling operations of single neurons or small neuronal circuits, e.g. minicolumns.
- Modelling information processing in actual brain systems, e.g. the auditory tract.
- Modelling human perception and cognition.

2. Artificial neural networks
- Used in pattern recognition, adaptive control, time series prediction, etc.
- The areas contributing to artificial neural networks are statistical pattern recognition, computational learning theory, computational neuroscience, dynamical systems theory and nonlinear optimisation.
The types of neural network (Hinton, 2010)

1. Feed-forward neural network
- This is the commonest type of neural network in practical applications. The first layer is the input and the last layer is the output.
- If there is more than one hidden layer, we call them ‘deep’ neural networks. They compute a series of transformations that change the similarities between cases.

2. Recurrent networks
- These have directed cycles in their connection graph. That means you can sometimes get back to where you started by following the arrows.
- They can have complicated dynamics, and this can make them very difficult to train.

A neural network can perform tasks that a linear program cannot. A neural network learns and does not need to be reprogrammed. It can be implemented in a wide range of applications.
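The feed-forward structure described above (an input layer, one or more hidden layers, and an output layer, with activations flowing in one direction only) can be sketched as follows. This is a minimal illustration and not a trained model: the sigmoid activation, the layer sizes and all weight values are assumptions picked by hand.

```python
import math


def sigmoid(z):
    """Squash a real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights, biases):
    """One fully connected layer: each unit takes a weighted sum, then a sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(unit_weights, inputs)) + b)
        for unit_weights, b in zip(weights, biases)
    ]


def feed_forward(inputs, layers):
    """Pass activations from the input through each layer in order (no cycles)."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations


# Hypothetical, untrained weights: 2 inputs -> 3 hidden -> 2 hidden -> 1 output.
# Two hidden layers make this a 'deep' network in the sense used above.
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),
    ([[0.2, -0.5, 0.3], [0.7, 0.1, -0.4]], [0.05, -0.05]),
    ([[0.6, -0.9]], [0.0]),
]

output = feed_forward([1.0, 2.0], layers)
```

Each layer transforms the representation produced by the previous one, which is the "series of transformations" mentioned for deep networks. A recurrent network would differ by feeding some activations back into earlier units, creating the directed cycles that this sketch deliberately omits.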
Other advantages of neural networks include requiring less formal statistical training, the ability to implicitly detect complex nonlinear relationships between dependent and independent variables, the ability to detect all possible interactions between predictor variables, and the availability of multiple training algorithms (JV, 1996).

| Factors | Decision Tree | Logistic Regression | Neural Network |
|---|---|---|---|
| Basic concept | 1. Nodes: root node, internal nodes, leaf nodes; 2. Branches; 3. Splitting | Output can take only two values, “0” and “1”, which represent outcomes such as pass/fail and win/lose | 1. Computational neuroscience; 2. Artificial neural networks |
| Type | 1. Classification tree analysis; 2. Regression tree analysis | 1. Binary logistic regression; 2. Multinomial logistic regression | 1. Feed-forward network; 2. Recurrent network |
| Performance | Can quickly express complex alternatives clearly; can easily be modified; standard decision tree notation is easy to adopt | Does not assume a linear relationship between the independent variable and dependent variable; may handle nonlinear effects; requires much more data to achieve stable results | Can perform tasks that a linear program cannot; requires less formal statistical training; ability to implicitly detect complex nonlinear relationships; ability to detect all possible interactions |