We have a low-quality webcam pointing out into the street in front of the house that writes a file to a NAS every time it detects motion near the letter box. I often check these images to see if the postie has come. The objective of this project is to automate looking at the images and recognising the postie.
Following Apple's documentation, I created an Xcode playground and trained the model with a folder containing two sub-folders named "Postie" and "No Postie".
The playground is just this:
import CreateMLUI
let builder = MLImageClassifierBuilder()
builder.showInLiveView()
And it puts up a nice little UI in the live view where you drag in your training folder.
I did try all of the augmentation options on offer; training took a lot longer but the model wasn't any more accurate in my case.
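If you'd rather script the training than use the drag-and-drop UI, the CreateML framework can do the same job programmatically. This is a minimal sketch rather than the code I actually used: the paths are placeholders, and the augmentation list is my reading of the checkboxes offered in the builder UI.

import CreateML
import Foundation

// Placeholder paths - point these at your own training folder and output file.
let trainingDir = URL(fileURLWithPath: "/path/to/TrainingImages")
let outputURL = URL(fileURLWithPath: "/path/to/PostieClassifier.mlmodel")

// Ask for the same augmentations the builder UI offers as checkboxes.
let parameters = MLImageClassifier.ModelParameters(
    augmentationOptions: [.crop, .rotation, .blur, .exposure, .noise, .flip]
)

// Train on the labelled sub-folders ("Postie" and "No Postie").
let classifier = try MLImageClassifier(
    trainingData: .labeledDirectories(at: trainingDir),
    parameters: parameters
)

// Save the model so it can be dragged into an Xcode project.
try classifier.write(to: outputURL)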
After saving the model, I created a simple macOS application with an Image Well in it. You drag in an image and it shows the most likely tag.
Here's the main bit of the code to look at the image and display the classification. It lives in the view controller and needs import Cocoa and import Vision.
@IBAction func droppedImage(_ sender: NSImageView) {
    guard let image = sender.image else { return }
    // Convert the dropped NSImage to a CGImage that Vision can work with.
    guard let cgImage = image.cgImage(forProposedRect: nil, context: nil, hints: nil) else {
        print("Couldn't convert image")
        return
    }
    // Run the classification requests over the image.
    let imageRequestHandler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    do {
        try imageRequestHandler.perform(self.requests)
    } catch {
        print(error)
    }
}
private var requests = [VNRequest]()

func setupVision() -> NSError? {
    // Wrap the Core ML model in a Vision request.
    do {
        let visionModel = try VNCoreMLModel(for: PostieClassifier().model)
        let objectRecognition = VNCoreMLRequest(model: visionModel) { request, _ in
            DispatchQueue.main.async {
                // Perform all the UI updates on the main queue.
                // Results come back sorted by confidence, so the first is the best guess.
                if let prediction = request.results?.first as? VNClassificationObservation {
                    self.predictionTextField.stringValue =
                        "\(prediction.identifier) \(prediction.confidence)"
                }
            }
        }
        self.requests = [objectRecognition]
    } catch let error as NSError {
        print("Model loading went wrong: \(error)")
        return error
    }
    return nil
}
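For completeness, here's roughly how that gets wired up; a sketch assuming the view controller builds the requests once the view loads.

override func viewDidLoad() {
    super.viewDidLoad()
    // Build the Vision requests once, up front, and surface any model-loading failure.
    if let error = setupVision() {
        print("Vision setup failed: \(error)")
    }
}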
Not much code, is there?
While this all works, recognition of the postie in my low-quality webcam images is very unreliable in practice. To be fair, testing did show this.
I suspect the issue is that the postie is a very small part of the image and that, due to weather and the position of the sun, the images vary quite widely.
As mentioned above, I tried training with all of the available augmentations but accuracy didn't improve. I also tried cropping the training images to include just the postie; one more idea along those lines is sketched below.
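Rather than cropping images by hand, Vision can restrict a request to part of the frame via the regionOfInterest property that VNCoreMLRequest inherits from VNImageBasedRequest. This is just a sketch; the rectangle is a made-up placeholder, and you'd measure where the letter box actually sits in the webcam frame.

// Coordinates are normalised (0 to 1) with the origin at the bottom-left.
// Placeholder values - adjust to cover the footpath near the letter box.
objectRecognition.regionOfInterest = CGRect(x: 0.4, y: 0.0, width: 0.3, height: 0.5)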
In summary, I'm impressed with how super easy it is to create and use ML models on Apple's platforms, but I have a great deal to learn about how to train a model to get good results.