Changeset 142
- Timestamp: 01/05/20 23:22:59
- Location: prdatasets
- Files: 44 added, 12 edited
prdatasets/Contents.m
--- prdatasets/Contents.m (revision 141)
+++ prdatasets/Contents.m (revision 142)
@@ -1,40 +1,80 @@
 % PRDATASETS: Pattern Recognition Datasets in PRTools format
-% Version 3.0 22-Dec-2019
+% Version 3.0.1 6-Jan-2020
 %
 %Feature based labeled datasets
 %------------------------------
-%name          objects  feats classes
-%x80                45      8    3 radial distances of characters
-%arrhythmia        420    278    2 presence or absence of cardia arrhythmia
-%auto_mpg          398      6    2 car/miles-per-gallon
-%biomed            194      5    2 various patient indicators
-%breast            683      9    2 Wisconsion breast cancer dataset
-%car              1728      6    4 Car evaluation database
-%cbands          12000     30   24 chromosome banding patterns
-%chromo           1143      8   24 chromosome blob features
-%diabetes          768      8    2 Pima Indians Diabetes Database
-%ecoli             272      7    3 protein localisation sites
-%glass             214      9    4 glass types from chemical components
-%heart             297     13    2 heart disease dataset
-%hepatitis         112     19    2 hepatitis database
-%imox              192      8    4 radial distances of characters
-%ionosphere        351     34    2 radar data
-%iris              150      4    3 Fisher's Iris dataset
-%liver             345      6    2 liver disorder
-%malaysia          291      8   20 segment features in utility symbols
-%satellite        6435     36    6 spectral data
-%sonar             208     60    2 rock / metal sonar features
-%soybean1          266     35   19 large Soybeans
-%soybean2          136     35    4 small Soybeans
-%twonorm          7400     20    2 Leo Breiman's two normal example.
-%ringnorm         7400     20    2 Leo Breiman's ringnorm example.
-%wine              178     13    3 wine recognition
-%mfeat_fac        2000    216   10 face features in digits dataset
-%mfeat_fou        2000     76   10 Fourier features in digits dataset
-%mfeat_kar        2000     64   10 Karhunen Loeve features in digits dataset
-%mfeat_pix        2000    240   10 pixel features in digits dataset
-%mfeat_zer        2000     53   10 Zernike moments in digits dataset
-%mfeat_mor        2000      6   10 morphological features in digits dataset
-%mfeat            2000    649   10 combined features of the mfeat datasets
+%name          objects  feats classes
+%abalone          4177      8   28 Abalone Age Estimation
+%adult           45222     14    2 Census Income Original
+%annealing         898      9    5 Steel Annealing Data
+%arcene            200  10000    2 Arcene Mass Spectra
+%arrhythmia        452    275   13 Arrhythmia normal
+%audiology         226     63   24 Standardized Audiology
+%australian_sl     690     14    2 Statlog Australian Credit
+%auto_mpg          398      6    2 Auto MPG
+%balance_scale     625      4    5 Balance Scale
+%balloons           76      4    2 Balloons
+%biomed            194      5    2 Biomedical Data
+%breast            683      9    2 Breast Wisconsin
+%car              1728      6    4 Car Evaluation
+%cbands          12000     30   24 Chromosome Bands
+%census         142521     41    2 Census Income KDD
+%chromo           1143      8   24 Chromosome Features
+%cmc              1473      9    3 Contraceptive Method Choice
+%connect4        67557     42    3 Connect-4 Dataset
+%credit            690     15    2 Credit Approval Dataset
+%cylinderbands     540     39    2 Cylinder Bands Dataset
+%diabetes          768      8    2 Diabetes Dataset
+%ecoli             336      7    8 Ecoli Dataset
+%flowcyto          612    254    3 Flow Cytometry 1
+%german_num_sl    1000     24    2 Statlog German Credit Num
+%german_sl        1000     20    2 Statlog German Credit
+%glass             214      9    4 Glass Identification Dataset
+%haberman          306      3    2 Haberman''s Survival
+%heart             297     13    2 Heart Cleveland
+%heart_sl          270     13    2 Statlog Heart
+%hepatitis         112     19    2 Hepatitis Data Set
+%imox              192      8    4 IMOX Characters
+%imsegment        2310     19    7 Image Segmentation
+%imsegment_sl     2310     19    7 Statlog Image Segmentation
+%ionosphere        351     34    2 Ionosphere Dataset
+%iris              150      4    3 Iris Dataset
+%isolet           7797    617   26 Isolet
+%letter          20000     16   16 Letter Recognition
+%liver             345      6    2 Liver disorder dataset
+%magic04         19020     10    2 Magic Gamma Telescope
+%malaysia          291      8   20 Malaysia Data
+%mammograph        961      5    2 Mammographic Mass
+%mfeat            2000    649   10 MFEAT Combined Features
+%mfeat_fac        2000    216   10 MFEAT Face Features
+%mfeat_fou        2000     76   10 MFEAT Fourier Features
+%mfeat_kar        2000     64   10 MFEAT KL Features
+%mfeat_mor        2000      6   10 MFEAT Morphological Features
+%mfeat_pix        2000    240   10 MFEAT Pixel Features
+%mfeat_zer        2000     47   10 MFEAT Zernike Moments
+%musk1             476    166    2 Musk version 1
+%musk2            6598    166    2 Musk version 2
+%optdigits        5620     64   10 Optical Digit Recognition
+%pageblocks       5473     10    5 Page Blocks
+%pendigits       10992     16   10 Pen Based Handwritten Digits
+%ringnorm         7400     20    2 Ringnorm Data
+%satellite        6435     36    6 Satellite dataset
+%satellite_sl     6435     36    6 Statlog Satellite
+%shuttle_sl      58000      9    7 Statlog Shuttle
+%sonar             208     60    2 Sonar dataset
+%soybean1          266     35   19 Large soybean dataset
+%soybean2          136     35    4 Small soybean dataset
+%spambase         4601     57    2 Spambase
+%spectf             80     44    2 Spectf Heart
+%spectrometer      531    101    9 Low Resolution Spectrometer
+%teachassist       151      5    3 Teaching Assistant Evaluation
+%tic_tac_toe       958      9    2 Tic Tac Toe
+%twonorm          7400     20    2 Twonorm Data
+%waveform1        5000     21    3 Simple Waveform Data
+%waveform2        5000     40    3 Advanced Waveform Data
+%wine              178     13    3 Wine Recognition
+%x80                45      8    3 80X Characters
+%yeast            1484      8   10 Protein Localization
+%zoo               101     16    7 Animal Recognition
 %
 %Multi-band images (pixels are objects, bands are features)
@@ -43,12 +83,16 @@
 %emim             128*128      8    1 A seto of 5 8-band EM images
 %lena             256*256      3    1 full-color image
-%texturel       5*128*128      7    5 texture features for 5 different textureimages
+%texturel       5*128*128      7    5 texture features of 5 different images
 %texturet         256*256      7    5 composite texture image
 %
 %Image datasets (pixels are features, images are objects)
 %--------------------------------------------------------
-%name           images pixels classes
-%kimia             216  64*64   18 resampled Kimia dataset of silhouettes
-%mnist8          70000    8*8   10 normalized MNIST digits
-%nist16           2000  16*16   10 normalized NIST digits
-%nist32           5000  32*32   10 resemapled MNIST digits
+%name           images pixels classes
+%kimia             216  64*64   18 resampled Kimia dataset of silhouettes
+%mnist           70000  28*28   10 MNIST8 Reduced Digits
+%mnist8          70000    8*8   10 MNIST digits
+%nist16           2000  16*16   10 normalized NIST digits
+%nist32           5000  32*32   10 resemapled MNIST digits
+%
+%Most datasets are based on the <a
+%href="http://archive.ics.uci.edu/ml/datasets/SPECTF+Heart">UCI Machine Learning Repository.</a>
prdatasets/biomed.m
--- prdatasets/biomed.m (revision 138)
+++ prdatasets/biomed.m (revision 142)
@@ -32,5 +32,5 @@
 opt.desc = 'The purpose of the analysis is to develop a screening procedure to detect carriers and to describe its effectiveness. ';
 opt.link = 'http://lib.stat.cmu.edu/datasets/';
-opt.dsetname = 'Biomed ';
+opt.dsetname = 'Biomedical Data';
 a = pr_download('http://prtools.tudelft.nl/prdatasets/biomed.dat',[],opt);
 end
prdatasets/car.m
--- prdatasets/car.m (revision 138)
+++ prdatasets/car.m (revision 142)
@@ -25,5 +25,5 @@
 opt.desc = 'The purpose of the analysis is to develop a screening procedure to detect carriers and to describe its effectiveness. ';
 opt.link = 'http://lib.stat.cmu.edu/datasets/';
-opt.dsetname = 'Car dataset';
+opt.dsetname = 'Car Evaluation';
 a = pr_download('http://prtools.tudelft.nl/prdatasets/car.data',[],opt);
 end
prdatasets/diabetes.m
--- prdatasets/diabetes.m (revision 138)
+++ prdatasets/diabetes.m (revision 142)
@@ -20,5 +20,5 @@
 opt.link = 'ftp://ftp.ics.uci.edu/pub/machine-learning-databases/pima-indians-diabetes/';
 opt.desc = 'The Pima Indians Diabetes Database from UCI.';
-opt.dsetname = 'Diabetes ';
+opt.dsetname = 'Diabetes Dataset';
 a = pr_download('http://prtools.tudelft.nl/prdatasets/diabetes.dat',[],opt);
 end
prdatasets/ecoli.m
--- prdatasets/ecoli.m (revision 137)
+++ prdatasets/ecoli.m (revision 142)
@@ -20,5 +20,5 @@
 opt.desc='The Ecoli database from UCI. Goal is to Predict the localization site of protein in a cell, by Kenta Nakai Institue of Molecular and Cellular Biology Osaka, University.';
 opt.link = 'ftp://ftp.ics.uci.edu/pub/machine-learning-databases/ecoli/';
-opt.dsetname = 'Ecoli ';
+opt.dsetname = 'Ecoli Dataset';
 a = pr_download('http://prtools.tudelft.nl/prdatasets/ecoli.dat',[],opt);
 end
prdatasets/imox.m
--- prdatasets/imox.m (revision 137)
+++ prdatasets/imox.m (revision 142)
@@ -7,4 +7,14 @@
 % measured form the corners along the diagnoals and from the edge midpoints
 % along the horizontal and vertical central axes.
+%
+% REFERENCES
+% 1. R. Dubes and A.K. Jain, Clustering techniques: The user's dilemma,
+% Pattern Recognition, Volume 8, Issue 4, October 1976, Pages 247-260.
+% 2. A.K. Jain, R.C. Dubes, C.C. Chen, Bootstrap Techniques for Error Estimation
+% IEEE Trans. Pattern Anal. and Mach. Intel., 9(5), pp. 628-633, 1987.
+% 3. W.F. Schmidt, D.F. Levelt, and R.P.W. Duin, An experimental comparison
+% of neural classifiers with traditional classifiers, in: E.S. Gelsema,
+% L.N. Kanal (eds.), Pattern Recognition in Practice IV, Elsevier,
+% 1994, 391-402.
 %
 % See also DATASETS, PRDATASETS, X80
@@ -17,5 +27,5 @@
 
 a = pr_getdata;
-a = setname(a,'IMOX Dataset');
+a = setname(a,'IMOX Characters');
 a = setlablist(a,char('I','M','O','X'));
 a = setfeatlab(a,char(...
prdatasets/pr_download_uci.m
--- prdatasets/pr_download_uci.m (revision 135)
+++ prdatasets/pr_download_uci.m (revision 142)
@@ -78,5 +78,5 @@
 
 %% if matfiles available, use them
-[varargout{:}] = loadmatfile(comname);
+[varargout{:}] = pr_loadmatfile(comname);
 if ~isempty(varargout{1}), return; end
 
@@ -102,10 +102,12 @@
 dataname = comname;
 end
-opt{j}.dsetname = dataname;
+% opt{j}.dsetname = dataname;
 savemat = ~isfield(opt{j},'matfile') || opt{j}.matfile;
 opt{j}.matfile = false;
+opt{j}.delimeter= ',';
+opt{j} = fielddef(opt{j},'dsetname',callername);
 a = pr_download(data.url,fullfile(datadir,dataname),opt{j});
 a = setuser(a,data,'user'); % store dataset info
-a = setname(a,dataname); % set dataset name
+% a = setname(a,dataname); % set dataset name
 if ~isfield(opt{j},'labfeat') || isempty(opt{j}.labfeat)
 a = feat2lab(a,size(a,2));
@@ -120,8 +122,8 @@
 if numel(ucinames) > 1
 % multiple datasets loaded, alignment might be needed
-[varargout{:}] = dset_align(varargout{:});
+[varargout{:}] = pr_dset_align(varargout{:});
 a = vertcat(varargout{:});
 a = setuser(a,data,'user'); % store dataset info
-a = setname(a,comname); % set dataset name
+opt{end} = fielddef(opt{end},'dsetname',callername);
 if ~isfield(opt{end},'matfile') || opt{end}.matfile
 save(fullfile(datadir,comname),'a');
@@ -167,5 +169,5 @@
 dataname = prname;
 end
-filenames{j} = fullfile(thisdir,dataname);
+filenames{j} = fullfile(fullfile(thisdir,'data'),dataname);
 if exist([filenames{j} '.mat'],'file') == 2
 % if mat-file is available, use it
@@ -174,5 +176,5 @@
 a = getfield(s,f{1});
 else
-if ~exist('data')
+if ~exist('data','var')
 % get UCI info
 data = parselink(name);
@@ -218,5 +220,5 @@
 if anynew && numel(ucinames) > 1
 % multiple datasets loaded, alignment might be needed
-[varargout{:}] = dset_align(varargout{:});
+[varargout{:}] = pr_dset_align(varargout{:});
 for j=1:numel(ucinames)
 a = varargout{j};
@@ -273,4 +275,8 @@
 data.type = type;
 
+function s = fielddef(s,field,x)
+if ~isfield(s,field)
+s.(field) = x;
+end
 
 function name = callername(n)
prdatasets/pr_getdata.m
--- prdatasets/pr_getdata.m (revision 140)
+++ prdatasets/pr_getdata.m (revision 142)
@@ -7,5 +7,5 @@
 % By default DSET is COMMAND.mat with COMMAND the name of the calling
 % m-file. If this is not available in the directory of COMMAND the URL will
-% be downloaded. If ASK = true (default), the user is asked for approval.
+% be downloaded. If ASK = true, the user is asked for approval.
 % If given, SIZE (in MByte) is displayed in the request.
 %
@@ -49,5 +49,5 @@
 url = ['http://prtools.tudelft.nl/prdatasets/' name '.mat'];
 end
-[dummy,uname,ext] = fileparts(url);
+[~,uname,ext] = fileparts(url);
 
 if isempty(name)
@@ -91,6 +91,4 @@
 out = [];
 end
-else
-a = dset;
 end
 
@@ -120,5 +118,5 @@
 
 % make sure we check for a matfile
-[dummy,dummy,ext] = fileparts(dset);
+[~,~,ext] = fileparts(dset);
 if isempty(ext)
 dsetmat = [dset '.mat'];
@@ -145,5 +143,5 @@
 end
 elseif exist(dset,'dir') == 7
-[dummy,dfile] = fileparts(dset);
+[~,dfile] = fileparts(dset);
 if exist(fullfile(dset,[dfile '.mat']),'file') == 2
 out = prdatafile(dset);
prdatasets/pr_savematfile.m
--- prdatasets/pr_savematfile.m (revision 137)
+++ prdatasets/pr_savematfile.m (revision 142)
@@ -18,5 +18,4 @@
 if nargout == 1
 a = vertcat(varargin{:});
-a = setname(a,name);
 save(matfile,'a');
 varargout{1} = a;
@@ -27,5 +26,4 @@
 else
 a = vertcat(varargin{:});
-a = setname(a,name);
 save(matfile,'a');
 for i=1:nargin
prdatasets/pr_showdsets.m
--- prdatasets/pr_showdsets.m (revision 139)
+++ prdatasets/pr_showdsets.m (revision 142)
@@ -1,5 +1,5 @@
 %PR_SHOWDSETS Show datasets and store results in DSET
 
-forget = {'Contents','mfeat_all'};
+forget = {'Contents','mfeat_all','check_','pr_'};
 commands = struct2cell(dir('*.m'));
 commands = commands(1,:);
@@ -10,13 +10,14 @@
 commands{j} = commands{j}(1:end-2);
 end
+J = [];
 for i=1:numel(forget)
-J = strcmp(forget{i},commands);
-commands(J) = [];
+J = [J strmatch(forget{i},commands)];
 end
+commands(J) = [];
 
 for j=1:numel(commands)
 a = feval(commands{j});
 [m,k,c] = getsize(a);
-fprintf('% 6i %4i %4i %15s %s\n',m,k,c,commands{j},getname(a));
+fprintf('%c%-14s %6i %6i %4i %s\n','%',commands{j},m,k,c,getname(a));
 end
 
prdatasets/wine.m
--- prdatasets/wine.m (revision 137)
+++ prdatasets/wine.m (revision 142)
@@ -18,4 +18,5 @@
 opt.delimeter = ',';
 opt.desc = 'These data are the results of a chemical analysis of wines grown in the same region in Italy but derived from three different cultivars. The analysis determined the quantities of 13 constituents found in each of the three types of wines.';
+opt.link = 'https://archive.ics.uci.edu/ml/machine-learning-databases/wine/wine.names';
 opt.labfeat = 1;
 opt.featnames = char(...
@@ -34,5 +35,5 @@
 'proline');
 opt.classnames = {'cultivar 1','cultivar 2','cultivar 3'};
-opt.dsetname = 'Wine recognition data';
+opt.dsetname = 'Wine Recognition';
 a = pr_download('http://prtools.tudelft.nl/prdatasets/wine.dat',[],opt);
 end
prdatasets/x80.m
--- prdatasets/x80.m (revision 140)
+++ prdatasets/x80.m (revision 142)
@@ -15,4 +15,8 @@
 % 2. A.K. Jain, R.C. Dubes, C.C. Chen, Bootstrap Techniques for Error Estimation
 % IEEE Trans. Pattern Anal. and Mach. Intel., 9(5), pp. 628-633, 1987.
+% 3. W.F. Schmidt, D.F. Levelt, and R.P.W. Duin, An experimental comparison
+% of neural classifiers with traditional classifiers, in: E.S. Gelsema,
+% L.N. Kanal (eds.), Pattern Recognition in Practice IV, Elsevier,
+% 1994, 391-402.
 %
 % See also DATASETS, PRDATASETS, IMOX
@@ -25,5 +29,5 @@
 
 a = pr_getdata('http://prtools.tudelft.nl/prdatasets/80x.mat');
-a = setname(a,'80X Dataset');
+a = setname(a,'80X Characters');
 a = setlablist(a,char('8','0','X'));
 a = setfeatlab(a,char(...