source: prdatasets/breast.m @ 126

Last change on this file since 126 was 123, checked in by bduin, 7 years ago
%BREAST 699 objects with 9 features in 2 classes
%
%       X = BREAST
%
% Breast cancer Wisconsin dataset obtained from the University of Wisconsin
% Hospitals, Madison from Dr. William H. Wolberg.
%
% REFERENCE
% O. L. Mangasarian and W. H. Wolberg: "Cancer diagnosis via linear
% programming", SIAM News, Volume 23, Number 5, September 1990, pp 1 & 18.
%
%       X = BREAST(VAL)
%
% By default objects with missing values are removed. When something else
% is desired, use one of the options in MISVAL for VAL.
%
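% EXAMPLE (a usage sketch, not part of the original help text; it assumes
% PRTools and prdatasets are on the path, that the data file can be
% downloaded, and that 'mean' is one of the options accepted by MISVAL)
%       x = breast;          % default: objects with missing values removed
%       y = breast('mean');  % assumed option: fill missing values instead
%       size(x), size(y)     % compare the number of objects that remain
%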
% SEE ALSO <a href="http://37steps.com/prtools">PRTools Guide</a>, <a href="http://archive.ics.uci.edu/ml/">UCI Website</a>
% PRTOOLS, DATASETS, MISVAL

% Copyright: R.P.W. Duin, r.p.w.duin@prtools.org

function x = breast(val)

% Default treatment of missing values: remove the affected objects
if nargin < 1, val = 'remove'; end
%prdatasets(mfilename,1,'http://prtools.org/prdatasets/breastorg.dat');

a = pr_getdata('http://37steps.com/data/prdatasets/breastorg.dat',1);

user.desc='The original database of the Wisconsin Breast Cancer Databases from UCI, containing 699 instances, collected between 1989 and 1991. ';
user.link = 'ftp://ftp.ics.uci.edu/pub/machine-learning-databases/breast-cancer-wisconsin/';
cl = {'benign' 'malignant'};
fl = {'Clump Thickness' 'Uniformity of Cell Size' ...
'Uniformity of Cell Shape' 'Marginal Adhesion' ...
'Single Epithelial Cell Size' 'Bare Nuclei' 'Bland Chromatin' ...
'Normal Nucleoli' 'Mitoses'};

%a = load('breastorg.dat'); % Octave cannot find it
%a = load(fullfile(fileparts(which(mfilename)),'breastorg.dat'));
a(a == -1) = NaN;    % entries equal to -1 mark missing values
nlab = a(:,end)/2;   % class labels in the file are 2 and 4; map them to 1 and 2
x = pr_dataset(a(:,2:(end-1)), cl(nlab) );   % first and last columns are not features
x = setfeatlab(x,fl);
x = setname(x,'Breast Wisconsin');
x = misval(x,val);   % handle missing values as requested by VAL
x = setuser(x,user);

return