Last change on this file since 110 was 27, checked in by dtax, 13 years ago

This is obviously also needed... Here they are!

File size: 1.1 KB
```matlab
%TREE_TRAIN Train a decision tree recursively
%
%     w = tree_train(x,y,opt)
%
% Train a decision tree on the data matrix x (one object per row) with
% labels y. Each node is split on the feature and threshold that give
% the lowest size-weighted Gini impurity of the two children.
%
% opt.K           is passed on to tree_gini
% opt.featsubset  if >0, the number of randomly chosen features that
%                 is considered in each node; otherwise all features
%                 are considered
%
% The output w is a label (for a leaf) or a structure with the fields
% bestf (feature index), bestt (threshold), l and r (the subtrees).
function w = tree_train(x,y,opt)

% how good are we in this node?
err = tree_gini(y,opt.K);
if (err==0)

   w = y(1); % node is pure: just predict this label

else
   % we split further
   n = size(x,1);

   % optionally, choose only from a random subset of the features
   if (opt.featsubset>0)
      fss = randperm(size(x,2));
      % never ask for more features than there are
      fss = fss(1:min(opt.featsubset,size(x,2)));
   else
      fss = 1:size(x,2);
   end

   % check each feature separately:
   besterr = inf; bestf = []; bestt = []; bestj = []; bestI = [];
   for i=fss
      % sort the data along feature i:
      [xi,I] = sort(x(:,i)); yi = y(I);
      % run over all possible splits:
      for j=1:n-1
         % compute the gini of both sides, weighted by their sizes
         err = j*tree_gini(yi(1:j),opt.K) + (n-j)*tree_gini(yi(j+1:n),opt.K);
         % and see if it is better than before.
         if (err<besterr)
            besterr = err;
            bestf = i;
            bestj = j;
            % threshold halfway between the two neighbouring values
            bestt = mean(xi(j:j+1));
            bestI = I;
         end
      end
   end

   % store the split
   w.bestf = bestf;
   w.bestt = bestt;
   % now find the children: the first bestj sorted objects go left,
   % the rest go right
   w.l = tree_train(x(bestI(1:bestj),:),y(bestI(1:bestj)),opt);
   w.r = tree_train(x(bestI(bestj+1:end),:),y(bestI(bestj+1:end)),opt);
end
```
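The tree returned by tree_train is either a plain label (a leaf) or a structure with the fields bestf, bestt, l and r, so it can be walked from the root to label new objects. Below is a minimal sketch of such a traversal. The name tree_predict_sketch is hypothetical and not part of this repository. The sketch assumes that objects with a feature value strictly below bestt go to the left child, which matches the training split whenever the two values around the threshold differ.

```matlab
function lab = tree_predict_sketch(w,x)
% lab = tree_predict_sketch(w,x)
%
% Hypothetical helper, not part of the original file: label each row
% of x with the tree w as returned by tree_train.
% Assumption: feature value < bestt goes left, otherwise right.
n = size(x,1);
lab = zeros(n,1);
for k=1:n
   node = w;
   % internal nodes are structures, leaves are plain labels
   while isstruct(node)
      if (x(k,node.bestf) < node.bestt)
         node = node.l;
      else
         node = node.r;
      end
   end
   lab(k) = node;
end
end
```

For a hand-built stump with w.bestf = 1, w.bestt = 0.5, w.l = 1 and w.r = 2, an object whose first feature is 0 ends up in the left leaf and an object whose first feature is 1 in the right leaf.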