Friday, December 14, 2018

A (failed) attempt to recognise the postie with machine learning

I've tinkered with machine learning in the past using TensorFlow, but wanted to try Apple's tools, which are extremely simple to use by comparison.

We have a low-quality webcam pointing out into the street in front of the house; it writes a file to a NAS every time it detects motion near the letter box. I often check these images to see if the postie has come. The project's objective is to automate looking at the images and recognising the postie.

Following Apple's documentation, I created an Xcode playground and trained the classifier with a folder containing two sub-folders named "Postie" and "No Postie".

The playground is just this:

    import CreateMLUI

    let builder = MLImageClassifierBuilder()
    builder.showInLiveView()

And it puts up a nice little UI where you drag in your training folder.


I did try all of the augmentations shown above; training took a lot longer but the model wasn't any more accurate in my case.
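
The same training can also be done without the live view, using the CreateML framework directly from a playground. Here's a rough sketch of that route (the paths are made up; the folder just needs the same "Postie" and "No Postie" sub-folders):

    import Foundation
    import CreateML

    // Train on a folder whose sub-folder names ("Postie", "No Postie") become the labels.
    // The paths here are hypothetical.
    let trainingFolder = URL(fileURLWithPath: "/Users/me/PostieTraining")
    let classifier = try MLImageClassifier(trainingData: .labeledDirectories(at: trainingFolder))

    // See how it did on the images Create ML held back for validation.
    print(classifier.validationMetrics)

    // Save the model so it can be dragged into the Xcode project.
    try classifier.write(to: URL(fileURLWithPath: "/Users/me/PostieClassifier.mlmodel"))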

After saving the model, I created a simple macOS application with an Image Well in it. You drag in an image and it shows the most likely tag, like this:


Here's the main bit of the code to look at the image and display the classification (the view controller also needs import Cocoa and import Vision at the top).

    @IBAction func droppedImage(_ sender: NSImageView) {
        if let image = sender.image {
            if let cgImage = image.cgImage(forProposedRect: nil, context: nil, hints: nil) {
                let imageRequestHandler = VNImageRequestHandler(cgImage: cgImage, options: [:])
                do {
                    try imageRequestHandler.perform(self.requests)
                } catch {
                    print(error)
                }
            } else {
                print("Couldn't convert image")
            }
        }
    }
    
    private var requests = [VNRequest]()
    
    func setupVision() -> NSError? {
        // Set up the Vision request that wraps the Core ML model
        do {
            let visionModel = try VNCoreMLModel(for: PostieClassifier().model)
            let objectRecognition = VNCoreMLRequest(model: visionModel, completionHandler: { (request, _) in
                DispatchQueue.main.async {
                    // perform all the UI updates on the main queue;
                    // the first result is the highest-confidence classification
                    if let prediction = request.results?.first as? VNClassificationObservation {
                        let classification = prediction.identifier
                        let confidence = prediction.confidence
                        self.predictionTextField.stringValue = "\(classification) \(confidence)"
                    }
                }
            })
            self.requests = [objectRecognition]
        } catch let error as NSError {
            print("Model loading went wrong: \(error)")
            return error
        }

        return nil
    }

Not much code, is there?
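
One thing the snippet doesn't show is where setupVision() gets called; it needs to run once before the first image is dropped so that self.requests is populated. Something like this in the view controller would do it (a sketch, not lifted from my project):

    // Call setupVision() once at startup so the Vision request exists before
    // the first image is dropped onto the Image Well.
    override func viewDidLoad() {
        super.viewDidLoad()
        if let error = setupVision() {
            predictionTextField.stringValue = "Model failed to load: \(error.localizedDescription)"
        }
    }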

While this all works, recognition of the postie in my low-quality webcam images is very unreliable in my case. Testing did show this, to be fair.
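
If you go the programmatic CreateML route sketched earlier, the equivalent check is to evaluate the trained model against a folder of images that were kept out of training (again, the path is made up):

    // Evaluate against a held-back folder with the same "Postie" / "No Postie" layout.
    let testFolder = URL(fileURLWithPath: "/Users/me/PostieTesting")
    let metrics = classifier.evaluation(on: .labeledDirectories(at: testFolder))
    print("classification error: \(metrics.classificationError)")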


I suspect the issue is that the postie is a very small part of the image and, due to weather and the position of the sun, the images do vary quite widely.

As mentioned above, I tried training with all of the augmentations available, but accuracy didn't improve. I also tried cropping the training images to just include the postie.
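
By cropping I mean something along these lines: cutting each training image down to a fixed rectangle around the letter box before training on it. A rough sketch (the rectangle and paths are made up):

    import Foundation
    import CoreGraphics
    import CoreServices
    import ImageIO

    // Crop one training image down to the region around the letter box and save it as JPEG.
    // The crop rectangle is in pixels and entirely hypothetical.
    func cropToLetterBox(from input: URL, to output: URL, rect: CGRect) {
        guard let source = CGImageSourceCreateWithURL(input as CFURL, nil),
            let image = CGImageSourceCreateImageAtIndex(source, 0, nil),
            let cropped = image.cropping(to: rect),
            let destination = CGImageDestinationCreateWithURL(output as CFURL, kUTTypeJPEG, 1, nil) else {
                print("Couldn't crop \(input.lastPathComponent)")
                return
        }
        CGImageDestinationAddImage(destination, cropped, nil)
        if !CGImageDestinationFinalize(destination) {
            print("Couldn't write \(output.lastPathComponent)")
        }
    }

    cropToLetterBox(from: URL(fileURLWithPath: "/Users/me/PostieTraining/Postie/2018-12-10.jpg"),
                    to: URL(fileURLWithPath: "/Users/me/PostieCropped/Postie/2018-12-10.jpg"),
                    rect: CGRect(x: 320, y: 180, width: 200, height: 200))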

In summary, I'm impressed with how easy it is to create and use ML models on Apple's platforms, but I have a great deal to learn about how to train a model to get good results.
