Laboratory I:

To download additional .arff data sets, go to http://www.hakank.org/weka/ or search the Internet for the .arff files required.

· What’s the difference between a “training set” and a “test set”?
· Why might a pruned decision tree that doesn’t fit the data so well be better than an un-pruned one?
· What’s the first thing that 1R does when making a rule based on a numeric attribute?
· How does 1R avoid overfitting when making a rule based on an enumerated and/or numeric attribute?
· What is the difference between Attribute, Instance and Training set?

What is the difference between ID3 and C4.5?

Use the following learning schemes to analyze the iris data (in iris.arff):
   OneR – weka.classifiers.OneR
   Decision table – weka.classifiers.DecisionTable -R
   C4.5 – weka.classifiers.j48.J48
· Do the decisions made by the classifiers make sense to you? Why?
· What can you say about the accuracy of these classifiers when classifying iris instances that have not been used for training?
· How did each one of the methods perform?

Use the following learning schemes to analyze the bolts data (bolts.arff without the TIME attribute):
   Decision Tree – weka.classifiers.j48.J48
   Decision table – weka.classifiers.DecisionTable -R
   Linear regression – weka.classifiers.LinearRegression
   M5′ – weka.classifiers.M5′
· The dataset describes the time needed by a machine to produce and count 20 bolts. (More details can be found in the file containing the dataset.)
· Analyze the data. What adjustments have the greatest effect on the time to count 20 bolts?
· According to each classifier, how would you adjust the machine to get the shortest time to count 20 bolts?

Produce a model for both Weather and Weather.nominal data sets. Which method(s) did you use? What did the tree(s) look like?
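As a companion to the 1R questions above, here is a minimal Python sketch of 1R for nominal attributes. It is not Weka's OneR implementation. The rows are assumed to match the 14 instances in the weather.nominal.arff file shipped with Weka (verify against your own copy), and the names ROWS, ATTRIBUTES and one_r are hypothetical. For each attribute the sketch builds one rule that maps every attribute value to its majority class, counts the training errors of that rule, and keeps the attribute with the fewest errors.

```python
from collections import Counter, defaultdict

# Assumed copy of weather.nominal.arff; the last column is the class (play).
ATTRIBUTES = ["outlook", "temperature", "humidity", "windy"]
ROWS = [
    ("sunny", "hot", "high", "FALSE", "no"),
    ("sunny", "hot", "high", "TRUE", "no"),
    ("overcast", "hot", "high", "FALSE", "yes"),
    ("rainy", "mild", "high", "FALSE", "yes"),
    ("rainy", "cool", "normal", "FALSE", "yes"),
    ("rainy", "cool", "normal", "TRUE", "no"),
    ("overcast", "cool", "normal", "TRUE", "yes"),
    ("sunny", "mild", "high", "FALSE", "no"),
    ("sunny", "cool", "normal", "FALSE", "yes"),
    ("rainy", "mild", "normal", "FALSE", "yes"),
    ("sunny", "mild", "normal", "TRUE", "yes"),
    ("overcast", "mild", "high", "TRUE", "yes"),
    ("overcast", "hot", "normal", "FALSE", "yes"),
    ("rainy", "mild", "high", "TRUE", "no"),
]


def one_r(rows, attributes):
    """Return (attribute name, value -> class rule, training errors) for the best attribute."""
    best = None
    for i, name in enumerate(attributes):
        # Count class frequencies separately for each value of this attribute.
        counts = defaultdict(Counter)
        for row in rows:
            counts[row[i]][row[-1]] += 1
        # The rule predicts the majority class for each attribute value.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        # Every instance outside the majority class of its value is an error.
        errors = sum(sum(c.values()) - c.most_common(1)[0][1] for c in counts.values())
        # Strict comparison: on a tie, the attribute listed first is kept.
        if best is None or errors < best[2]:
            best = (name, rule, errors)
    return best


best_attribute, best_rule, best_errors = one_r(ROWS, ATTRIBUTES)
print(best_attribute, best_rule, f"{best_errors}/{len(ROWS)} training errors")
```

If the data matches, outlook and humidity tie at 4 errors out of 14 and the sketch keeps outlook because it comes first. Numeric attributes are deliberately left out, since handling them is the subject of two of the questions above.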
Laboratory II:

To download additional .arff data sets, go to:
   weka data folder for BreastTumor.arff
   http://www.hakank.org/weka/ for zoo.arff, wine.arff, bodyfat.arff, sleep.arff, pollution.arff

Use the following learning schemes to analyze the zoo data (in zoo.arff):
   OneR – weka.classifiers.OneR
   Decision table – weka.classifiers.DecisionTable -R
   C4.5 – weka.classifiers.j48.J48
   K-means – weka.clusterers.SimpleKMeans
Try using reduced error pruning for C4.5. Did it change the produced model? Why?
For K-means, set k = 10 for the first run and adjust as needed. What was the final value of k? Why?

Use the following learning schemes to analyze the breast tumor data:
   Linear regression – weka.classifiers.LinearRegression
   M5′ – weka.classifiers.M5′
   Regression Tree – weka.classifiers.M5′
   K-means clustering – weka.clusterers.SimpleKMeans
A) How many leaves did the Model tree produce? The Regression Tree? What happens if you change the pruning factor? How many clusters did you choose for the K-means method? Was that a good choice? Did you try a different value for k?
B) Now perform the same analysis on the bodyfat.arff data set.

Use a k-means clustering technique to analyze the iris data set. What did you set the k value to be? Try several different values. What was the random seed value? Experiment with different random seed values. How did changing these values influence the produced models?

Produce a hierarchical clustering (COBWEB) model for the iris data. How many clusters did it produce? Why? Does it make sense? What did you expect? Change the acuity and cutoff parameters in order to produce a model similar to the one obtained in the book. Use the classes-to-clusters evaluation. What does that tell you?

Laboratory III:

To download additional .arff data sets, go to http://www.hakank.org/weka/ for zoo.arff, wine.arff, soybean.arff, zoo2_x.arff, sunburn.arff, disease.arff

8.
Use the following learning schemes to compare the training set and 10-fold stratified cross-validation scores of the disease data (in disease.arff):
   Decision table – weka.classifiers.DecisionTable -R
   C4.5 – weka.classifiers.j48.J48
   Id3 – weka.classifiers.Id3
   …
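The comparison asked for in the task above (training-set score versus 10-fold stratified cross-validation score) can be sketched outside Weka as follows. Everything here is an assumption made for illustration: the data is synthetic rather than disease.arff, a one-nearest-neighbour classifier stands in for the schemes listed, and the helper names stratified_folds and count_hits are hypothetical. The sketch shows how stratification deals each class evenly across the folds, and how both scores are computed from the same data.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed=1):
    """Split instance indices into k folds with similar class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    position = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal the instances of each class round-robin across the folds.
        for idx in indices:
            folds[position % k].append(idx)
            position += 1
    return folds


def count_hits(train_idx, test_idx, xs, ys):
    """Number of test instances that 1-nearest-neighbour classifies correctly."""
    hits = 0
    for t in test_idx:
        nearest = min(train_idx, key=lambda i: abs(xs[i] - xs[t]))
        hits += ys[nearest] == ys[t]
    return hits


# Synthetic one-attribute data: two overlapping classes of 30 instances each.
rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(30)] + [rng.gauss(1.5, 1.0) for _ in range(30)]
ys = ["a"] * 30 + ["b"] * 30
all_idx = list(range(len(xs)))

# Training-set score: the model is evaluated on the data it was built from.
train_score = count_hits(all_idx, all_idx, xs, ys) / len(xs)

# Cross-validation score: each instance is classified by a model that never saw it.
folds = stratified_folds(ys, k=10)
cv_hits = 0
for fold in folds:
    held_out = set(fold)
    train_idx = [i for i in all_idx if i not in held_out]
    cv_hits += count_hits(train_idx, fold, xs, ys)
cv_score = cv_hits / len(xs)

print(f"training set: {train_score:.2f}  10-fold stratified CV: {cv_score:.2f}")
```

A one-nearest-neighbour model always finds each training instance as its own nearest neighbour, so its training-set score is perfect by construction. That makes it a convenient stand-in for seeing why the training-set score tends to be optimistic compared with the cross-validation score.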
