Last change on this file since 105 was 27, checked in by dtax, 13 years ago
This is obviously also needed... Here they are!
File size: 1.1 KB
%
%   w = tree_train(x,y,opt)
%
% Recursively train a decision tree on the data x (one object per row)
% with labels y.
%
%   opt.K           passed on to tree_gini
%   opt.featsubset  if >0, number of randomly chosen features that are
%                   considered for the split; otherwise all features
%
% The output w is either a single label (leaf node) or a struct with
% the fields bestf (feature index), bestt (threshold), l and r (the
% left and right subtrees).
%
function w = tree_train(x,y,opt)

% how good are we in this node?
err = tree_gini(y,opt.K);
if (err==0)

   w = y(1); % pure node: just predict this label

else
   % we split further
   n = size(x,1);

   % optionally, choose only from a random subset of the features
   if (opt.featsubset>0)
      fss = randperm(size(x,2));
      % never ask for more features than there are
      fss = fss(1:min(opt.featsubset,numel(fss)));
   else
      fss = 1:size(x,2);
   end

   % check each feature separately:
   besterr = inf; bestf = []; bestt = []; bestj = []; bestI = [];
   for i=fss
      % sort the data along feature i:
      [xi,I] = sort(x(:,i)); yi = y(I);
      % run over all possible splits:
      for j=1:n-1
         % compute the gini of both sides, weighted by their sizes
         err = j*tree_gini(yi(1:j),opt.K) + (n-j)*tree_gini(yi(j+1:n),opt.K);
         % and see if it is better than before.
         if (err<besterr)
            besterr = err;
            bestf = i;
            bestj = j;
            % threshold halfway between the two neighbouring values
            bestt = mean(xi(j:j+1));
            bestI = I;
         end
      end
   end

   % store the best feature and threshold
   w.bestf = bestf;
   w.bestt = bestt;
   % now find the children:
   w.l = tree_train(x(bestI(1:bestj),:),y(bestI(1:bestj)),opt);
   w.r = tree_train(x(bestI(bestj+1:end),:),y(bestI(bestj+1:end)),opt);
end
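
tree_train relies on tree_gini, which is not part of this file. The block below is only a sketch of what such a helper might look like, assuming that the labels are integers in 1..K and that opt.K is the number of classes; the actual tree_gini in the repository may differ. It returns the Gini impurity, 1 minus the sum of squared class frequencies, which is exactly 0 for a pure node, as the err==0 test in tree_train requires.

```matlab
%
%   g = tree_gini(y,K)
%
% Hypothetical sketch: Gini impurity of the labels y, assuming that
% the labels are integers in 1..K.
function g = tree_gini(y,K)

n = numel(y);
if (n==0)
   g = 0;  % an empty node counts as pure
   return
end
% class frequencies:
p = accumarray(y(:),1,[K 1])/n;
% one minus the sum of squared frequencies:
g = 1 - sum(p.^2);
```

With this definition, a node that contains both classes equally often gets impurity 0.5 for K=2, and tree_train weights the impurity of each side of a split by the number of objects on that side.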
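
To show how the structure returned by tree_train can be used, here is a sketch of a matching evaluation routine. The name tree_predict is hypothetical, and so is the convention that objects with a feature value below the threshold go to the left subtree; that convention is chosen because tree_train puts the objects with the smallest feature values in w.l.

```matlab
%
%   yhat = tree_predict(w,x)
%
% Hypothetical sketch: apply the tree w to each row of x.
function yhat = tree_predict(w,x)

n = size(x,1);
yhat = zeros(n,1);
for k=1:n
   node = w;
   % walk down until we reach a leaf (a plain label, not a struct)
   while isstruct(node)
      if (x(k,node.bestf) < node.bestt)
         node = node.l;
      else
         node = node.r;
      end
   end
   yhat(k) = node;
end
```

A small hand-built tree with one split on feature 1 at threshold 1.5 illustrates the call:

```matlab
w.bestf = 1; w.bestt = 1.5; w.l = 1; w.r = 2;
yhat = tree_predict(w,[0;1;2;3]);
```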