Decision tree

Decision trees are a special representational form of decision rules. They illustrate successive, hierarchical decisions and play an important role in data mining.

How it works

[Figure: decision tree for the goat problem (Monty Hall problem)]

Decision trees begin with a trunk (the root), at whose end there is a branching point, which in turn splits into several branches, each annotated with probabilities. Every end point (leaf) of the tree is reachable by a unique path.

Decision trees are used in order to make decisions more reliably and with fewer errors. In a binary decision tree a series of questions is posed, each of which can be answered with yes or no. This series of questions yields a result that is determined by a rule. The rule is easy to read off: one follows the branches of the tree from the root until one arrives at a particular leaf, which states the outcome of the question sequence.

Decision trees separate the data into several groups, each of which is determined by a rule with at least one condition.

To read off a classification, one walks down the tree. At each node an attribute is queried and a decision is made. This procedure continues until a leaf is reached.
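The walk from the root to a leaf described above can be sketched in a few lines; the tree, attribute names and class labels here are invented for illustration:

```python
def classify(node, example):
    """Walk down the tree until a leaf (a plain string) is reached."""
    while isinstance(node, dict):          # inner node: query one attribute
        branch = "yes" if example[node["attribute"]] else "no"
        node = node[branch]
    return node                            # leaf: the classification

# Inner nodes are dicts {"attribute": ..., "yes": subtree, "no": subtree};
# leaves are class labels.
tree = {
    "attribute": "income_high",
    "yes": {"attribute": "owns_home", "yes": "respond", "no": "no_response"},
    "no": "no_response",
}

print(classify(tree, {"income_high": True, "owns_home": True}))   # -> respond
print(classify(tree, {"income_high": False, "owns_home": True}))  # -> no_response
```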

Decision trees are usually generated according to the top-down principle. At each step the attribute is sought with which the data can best be classified. That attribute is used to partition the data, so that the remaining, not-yet-classified data can be considered separately in further steps. Decision trees are therefore also called classification trees.
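The search for the attribute that classifies the data best is, in the ID3 family of algorithms, typically done via information gain, i.e. by minimizing the weighted entropy of the resulting subsets. A minimal sketch with invented data and attribute names:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def best_attribute(rows, labels, attributes):
    """Return the attribute whose split yields the lowest weighted entropy
    (equivalently, the highest information gain)."""
    def split_entropy(attr):
        total = 0.0
        for value in set(row[attr] for row in rows):
            subset = [l for row, l in zip(rows, labels) if row[attr] == value]
            total += len(subset) / len(labels) * entropy(subset)
        return total
    return min(attributes, key=split_entropy)

rows = [
    {"income_high": True,  "owns_home": True},
    {"income_high": True,  "owns_home": False},
    {"income_high": False, "owns_home": True},
    {"income_high": False, "owns_home": False},
]
labels = ["respond", "respond", "no_response", "no_response"]
print(best_attribute(rows, labels, ["income_high", "owns_home"]))  # -> income_high
```

Here "income_high" separates the classes perfectly (weighted entropy 0), so it is chosen for the first split.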

Decision trees can be regarded as systems for rule induction. They are easy and comprehensible to present, and they can be generated quickly.

Example application

A bank wants to sell a new service by means of a direct-mailing campaign. To maximize the profit, those households are to be addressed that match the combination of demographic variables which the corresponding decision tree identifies as optimal. This process is called data segmentation or segmentation modeling.

The decision tree thus provides good hints as to who might respond positively to the mailing. This allows the bank to write to only those households that match the target group.

Pros and cons

The potential size of decision trees can have a negative effect. Each individual rule is easy to read off, but keeping an overview of the tree as a whole is very difficult. Therefore so-called pruning methods were developed, which shorten decision trees to a reasonable size. For example, one can limit the maximum depth of the tree or require a minimum number of objects per node.
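The two criteria just mentioned, maximum depth and minimum number of objects per node, can be sketched as stopping rules during top-down construction. The following is a simplified pre-pruning sketch on boolean attributes; the splitting heuristic is omitted for brevity and all names are invented:

```python
from collections import Counter

def majority(labels):
    """Most common class label (ties broken by first occurrence)."""
    return Counter(labels).most_common(1)[0][0]

def build(rows, labels, attributes, depth=0, max_depth=2, min_samples=2):
    """Grow a tree on boolean attributes, stopping early (pre-pruning)
    when max_depth is reached or too few objects remain in a node."""
    if (depth >= max_depth or len(labels) < min_samples
            or not attributes or len(set(labels)) == 1):
        return majority(labels)               # the node becomes a leaf
    attr, rest = attributes[0], attributes[1:]
    node = {"attribute": attr}
    for value in (True, False):
        sub = [(r, l) for r, l in zip(rows, labels) if r[attr] == value]
        if not sub:
            node[value] = majority(labels)    # empty branch: inherit majority
        else:
            srows, slabels = zip(*sub)
            node[value] = build(list(srows), list(slabels), rest,
                                depth + 1, max_depth, min_samples)
    return node

rows = [
    {"income_high": True,  "owns_home": True},
    {"income_high": True,  "owns_home": False},
    {"income_high": False, "owns_home": True},
    {"income_high": False, "owns_home": False},
]
labels = ["respond", "respond", "no_response", "no_response"]

# With max_depth=0 the whole tree collapses to a single leaf:
print(build(rows, labels, ["income_high"], max_depth=0))
```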

Decision trees are often used only as an intermediate step toward a more efficient representation of the rule set. To arrive at the rules, different decision trees are generated by different procedures, and frequently occurring rules are extracted. The optimizations are overlaid in order to obtain a robust, general and correct rule set. A disadvantage of this method is that the rules bear no relation to one another and that contradictory rules can be produced.

A great advantage of decision trees is that they are well explainable and comprehensible. This allows the user to evaluate the result and to recognize key attributes. It is useful above all when the quality of the data is not known. The rules can also be transferred without great effort into a simple language such as SQL.
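Each root-to-leaf path is a conjunction of conditions and therefore maps directly onto a SQL WHERE clause. A hypothetical sketch (table and column names are invented):

```python
def rule_to_sql(conditions, table="households"):
    """Turn one root-to-leaf path (a list of (column, value) pairs)
    into a SQL query selecting the rows that the rule covers."""
    where = " AND ".join(f"{col} = {val}" for col, val in conditions)
    return f"SELECT * FROM {table} WHERE {where}"

print(rule_to_sql([("income_high", 1), ("owns_home", 1)]))
# -> SELECT * FROM households WHERE income_high = 1 AND owns_home = 1
```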

Effectiveness and error rate

The effectiveness of a decision tree can be read off from the percentage of the data that it classifies correctly. Some rules work better than others.
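This percentage is simply the share of correctly classified examples; the values below are purely illustrative:

```python
def accuracy(predicted, actual):
    """Percentage of correctly classified examples."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)

print(accuracy(["a", "b", "a", "a"], ["a", "b", "b", "a"]))  # -> 75.0
```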

Combination with neural networks

Decision trees are frequently used as a basis for neural networks. They do not need as many examples as neural networks do, but they can be rather inaccurate, especially when they are small. Large trees, on the other hand, harbor the danger that some examples are not seen and therefore not covered by the training cases. Therefore one tries to combine decision trees with neural networks. From this arose the so-called TBNN (Tree-Based Neural Network), which translates the rules of the decision trees into neural networks.

Comparison of algorithms

The methods of decision making have changed considerably in recent decades with the emergence of the current algorithms. Some technical terms such as root, edge and node, however, were already in use very early on. The various algorithms used to compute decision trees, by contrast, are not very old.

In practice, several different tree types are distinguished. The best known are CART (Classification And Regression Trees) and CHAID (Chi-square Automatic Interaction Detector). Recently the C4.5 algorithm has also been used frequently; earlier, the ID3 algorithm was often used instead.

Application programs

There are some application programs that implement decision trees, for example the two frequently used statistics software packages SPSS and SAS. Incidentally, both use the CHAID algorithm, like most other data-mining software packages.