We can use scikit-learn (sklearn) to recognise handwritten digits.
Machine Learning for Handwriting
Coding
First we read in the graphics files for each of the numbers (8x8 grey scale values):
from sklearn import datasets, svm, metrics
import matplotlib.pyplot as plt

digits = datasets.load_digits()
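As a quick check on what was loaded, the following sketch prints the shape of the data. It assumes scikit-learn is installed; the attribute names (`images`, `target`) are those used elsewhere in this example.

```python
from sklearn import datasets

digits = datasets.load_digits()

# images: one 8x8 array per digit; target: the true digit (0-9) for each image
print(digits.images.shape)   # (n_samples, 8, 8)
print(digits.target.shape)   # (n_samples,)
print(digits.images[0])      # grey scale intensities of the first image
print(digits.target[0])      # the label of the first image
```

Each image has exactly one label, so the two arrays have the same length.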
We can now display the first eight images, each with its label:
images_and_labels = list(zip(digits.images, digits.target))
for index, (image, label) in enumerate(images_and_labels[:8]):
    plt.subplot(2, 8, index + 1)
    plt.axis('off')
    plt.imshow(image, cmap=plt.cm.gray_r, interpolation='nearest')
    plt.title('Training: %i' % label)
Each 8x8 image is now flattened into a 64-element vector, so that the data has one row per sample:
n_samples = len(digits.images)
data = digits.images.reshape((n_samples, -1))
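To see what the reshape does, here is a minimal sketch on two made-up 2x2 "images" (hypothetical values, not taken from the digits dataset). The `-1` asks NumPy to work out the remaining dimension itself.

```python
import numpy as np

# Two hypothetical 2x2 "images" stacked into an array of shape (2, 2, 2)
images = np.array([[[0, 1],
                    [2, 3]],
                   [[4, 5],
                    [6, 7]]])

n_samples = len(images)
# -1 tells NumPy to infer the remaining dimension (here 2 * 2 = 4)
data = images.reshape((n_samples, -1))

print(data.shape)  # (2, 4)
print(data)        # each row is one image, read row by row
```

For the digits, the same call turns each 8x8 image into a row of 64 values.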
Next we create a classifier (SVC - support vector classifier):
classifier = svm.SVC(gamma=0.001)
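By default `svm.SVC` uses the RBF kernel, in which `gamma` scales the squared distance between two samples. The sketch below illustrates the kernel formula only, not how scikit-learn implements it; the helper `rbf_kernel` and the sample points are hypothetical.

```python
import numpy as np

def rbf_kernel(x, y, gamma):
    """RBF similarity: exp(-gamma * squared Euclidean distance)."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.exp(-gamma * np.dot(diff, diff)))

a = [0.0, 0.0]
b = [3.0, 4.0]   # squared distance from a is 25

same = rbf_kernel(a, a, gamma=0.001)         # identical points give 1.0
small_gamma = rbf_kernel(a, b, gamma=0.001)  # exp(-0.025), still close to 1
large_gamma = rbf_kernel(a, b, gamma=1.0)    # exp(-25), almost 0

print(same, small_gamma, large_gamma)
```

A small `gamma` makes distant samples still look similar, which gives a smoother decision boundary. A large `gamma` makes each training sample influence only its immediate neighbourhood.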
The classifier is then trained on the first half of the images:
classifier.fit(data[:n_samples // 2], digits.target[:n_samples // 2])
and then the second part can be predicted with:
expected = digits.target[n_samples // 2:]
predicted = classifier.predict(data[n_samples // 2:])
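One simple way to check the predictions is to count the fraction that match the expected labels. The following is a self-contained sketch of the steps so far. It assumes Python 3, where slice indices must be integers, so it uses integer division (`//`); the variable `half` is introduced here for readability.

```python
import numpy as np
from sklearn import datasets, svm

digits = datasets.load_digits()
n_samples = len(digits.images)
data = digits.images.reshape((n_samples, -1))
half = n_samples // 2  # integer division so it can be used as a slice index

# Train on the first half of the samples
classifier = svm.SVC(gamma=0.001)
classifier.fit(data[:half], digits.target[:half])

# Predict the second half and compare with the true labels
expected = digits.target[half:]
predicted = classifier.predict(data[half:])

accuracy = np.mean(predicted == expected)
print("Accuracy on the held-out half: %.3f" % accuracy)
```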
Finally we can report on the output and show the predictions:
print("Classification report for classifier %s:\n%s\n" % (classifier, metrics.classification_report(expected, predicted)))
print("Confusion matrix:\n%s" % metrics.confusion_matrix(expected, predicted))
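To make the confusion matrix easier to read, the sketch below builds one by hand for a small, made-up three-class problem (the labels are hypothetical). It follows the same layout as `metrics.confusion_matrix`: rows are the true labels and columns are the predicted labels.

```python
import numpy as np

# Hypothetical labels for a 3-class problem
expected = np.array([0, 0, 1, 1, 2, 2, 2])
predicted = np.array([0, 1, 1, 1, 2, 0, 2])

n_classes = 3
confusion = np.zeros((n_classes, n_classes), dtype=int)
for true_label, pred_label in zip(expected, predicted):
    # Row = true class, column = predicted class
    confusion[true_label, pred_label] += 1

# Correct predictions sit on the diagonal
accuracy = np.trace(confusion) / confusion.sum()

print(confusion)
print("Accuracy: %.3f" % accuracy)
```

Off-diagonal entries show which classes are confused with which. Here one true 0 was predicted as 1, and one true 2 was predicted as 0.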
We will take some of the test images (digits.images) and show them with their predicted labels:
images_and_predictions = list(zip(digits.images[n_samples // 2:], predicted))
for index, (image, prediction) in enumerate(images_and_predictions[:4]):
    plt.subplot(2, 4, index + 5)
    plt.axis('off')
    plt.imshow(image, cmap=plt.cm.gray_r, interpolation='nearest')
    plt.title('Prediction: %i' % prediction)
plt.show()