If you are a beginner to Weka then one of the most common problems that you might face is that you are unable to load your dataset into Weka because either it gives you an error such as ” Reason: java.io:IOException premature end of file, read Token [EOF], line 1″ or “Arff file not recognised” or some other similar error.
In one of my previous post, I had provided the link to a blog that had answered this question. However, it was today when I was helping out someone else with the similar problem I realised that the most important step (Step 2 given below) was missing from the instructions on this blog. Therefore, I decided to write this post for the benefit of all.
To solve this error the solution is given below.
Step 1: Open your datafile in Excel/Access or whatever is its format
Step 2: Save the dataset file in CSV format. To save the file in .csv format do the following steps,
Click on File—>Save As—> Click on the drop down menu besides ‘Save as type’ and change it to CSV. You will see a dialogue saying something like “Some features will be lost blah blah blah”… Just click on Yes.
Step 3: Now go to Weka and then click on Tools–> Arff Viewer
Step 4: In Arff Viewer window–> Click on File–> click on Open —> go to the location where you saved the datafile in .CSV format, choose the datafile and then –> Click on Open
Step 5: Viola, your datafile will be open in Arff viewer. Now all you have to is to save it in Arff format by changing its extension to arff and in save as type choose arff.
Now the reason why you were getting that java error, because, the instructions on that blog which I had guided you too did not state that you first have to change the datafile format which is in excel or access or whatever format to .CSV (Comma Separated Value) format. Always remember, for Weka to open your data file, your dataset should first be converted into CSV format. Only then Weka will be able to load it in the ARFF viewer and subsequently will let you save it to .arff format
Hope you found this useful.