Machine learning has improved enormously in the past few years, and the ability of trained models to recognise new images as things like a cat or a sunset is amazing.
It might be possible to train a neural network with a collection of screen shots from the waterfall of each digital mode so that a new screen shot could be automatically identified.
An internet search for existing software that does this turned up something that looked hopeful: a Windows application called Artemis: Free Signal Identification Software. But (after navigating through a truly evil free-hosting Windows malware attempt) the downloaded utility turned out to be just a GUI for searching a collection of waterfall images, so the user must still make the identification.
Google has open sourced TensorFlow, a system that can be trained with sample images and will then classify new images for you. It ships with a pre-trained model called Inception v3 that has been trained on 1,000 classes of images from ImageNet.
There is a really excellent introductory tutorial called "TensorFlow for Poets" that I followed.
The tutorial shows how to re-train this model with additional flowers that it doesn't know: daisy, dandelion, roses, sunflowers and tulips. Here are some of the sample daisy images.
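One thing worth knowing about the retraining script: it takes the label names from the directory structure, one sub-directory per category with that category's images inside. The tutorial's flower set is laid out roughly like this:

tf_files/flower_photos/
    daisy/
    dandelion/
    roses/
    sunflowers/
    tulips/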
Thanks to Docker, it's very easy to get TensorFlow running. The sample images sit in a directory that is mounted as a volume in the Docker container.
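The tutorial's Docker invocation boils down to something like this, mounting the host's tf_files directory as /tf_files inside the container:

docker run -it -v $HOME/tf_files:/tf_files gcr.io/tensorflow/tensorflow:latest-devel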
After getting all this to work, and very reliably recognising flowers, I captured and hunted down sample images of two digital modes, BPSK and RTTY. I chose these two because they are common and also look rather similar to the eye. Here are some of my PSK sample images.
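Retraining is then the tutorial's retrain.py command pointed at the new samples instead of the flowers. Something like this, assuming the digital_mode_samples directory layout used in the tests below (the step count is the tutorial's):

python tensorflow/examples/image_retraining/retrain.py \
    --bottleneck_dir=/tf_files/bottlenecks \
    --how_many_training_steps 500 \
    --model_dir=/tf_files/inception \
    --output_graph=/tf_files/retrained_graph.pb \
    --output_labels=/tf_files/retrained_labels.txt \
    --image_dir /tf_files/digital_mode_samples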
One trap to note is that you do need a decent number of sample images, 30-40 or more, or you'll get this mysterious error during training:
CRITICAL:tensorflow:Label rtty has no images in the category validation.
Traceback (most recent call last):
  File "tensorflow/examples/image_retraining/retrain.py", line 1012, in <module>
    tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 43, in run
    sys.exit(main(sys.argv[:1] + flags_passthrough))
  File "tensorflow/examples/image_retraining/retrain.py", line 839, in main
    bottleneck_tensor))
  File "tensorflow/examples/image_retraining/retrain.py", line 480, in get_random_cached_bottlenecks
    bottleneck_tensor)
  File "tensorflow/examples/image_retraining/retrain.py", line 388, in get_or_create_bottleneck
    bottleneck_dir, category)
  File "tensorflow/examples/image_retraining/retrain.py", line 245, in get_bottleneck_path
    category) + '.txt'
  File "tensorflow/examples/image_retraining/retrain.py", line 221, in get_image_path
    mod_index = index % len(category_list)
ZeroDivisionError: integer division or modulo by zero
In the end I got past this by simply duplicating my sample images, which of course doesn't improve recognition but does get past the fatal error. (The script divides each label's images into training, testing and validation sets; with too few images a label can end up with an empty validation set, hence the division by zero above.) It is quite hard to find 40 sample images of a digital mode.
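The duplication itself can be a quick shell loop per label directory; a rough sketch (the copied file names are just illustrative):

for f in /tf_files/digital_mode_samples/rtty/*.jpg; do cp "$f" "${f%.jpg}-dup.jpg"; done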
Identifying images of modes
As a first test I fed the system two images that were part of the training set.
PSK image
root@03d6e1679d7e:/tf_files# python /tf_files/label_image.py /tf_files/digital_mode_samples/psk/psk31.jpg
psk (score = 0.99392)
rtty (score = 0.00608)
RTTY image
root@03d6e1679d7e:/tf_files# python /tf_files/label_image.py /tf_files/digital_mode_samples/rtty/rtty.jpg
rtty (score = 0.96134)
psk (score = 0.03866)
As can be seen, it distinguishes between PSK and RTTY with near certainty.
Here I gave it a freshly captured PSK31 signal, one that was not in the training images.
root@03d6e1679d7e:/tf_files# python /tf_files/label_image.py /tf_files/untrained\ psk.jpg
psk (score = 0.74714)
rtty (score = 0.25286)
As expected, not quite as sure but still very good.
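For reference, the label_image.py used in these tests is essentially the tutorial's script. A minimal sketch, assuming the graph and labels files were written to the paths from the retraining step above:

import sys
import tensorflow as tf

image_path = sys.argv[1]

# Raw JPEG bytes are fed straight to the graph's decode node
image_data = tf.gfile.FastGFile(image_path, 'rb').read()

# One label per line, in the order retrain.py wrote them
label_lines = [line.rstrip() for line in tf.gfile.GFile('/tf_files/retrained_labels.txt')]

# Load the retrained graph
with tf.gfile.FastGFile('/tf_files/retrained_graph.pb', 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())
    tf.import_graph_def(graph_def, name='')

with tf.Session() as sess:
    # 'final_result' is the softmax layer that retrain.py adds
    softmax_tensor = sess.graph.get_tensor_by_name('final_result:0')
    predictions = sess.run(softmax_tensor, {'DecodeJpeg/contents:0': image_data})

    # Print every label, most confident first
    for node_id in predictions[0].argsort()[::-1]:
        print('%s (score = %.5f)' % (label_lines[node_id], predictions[0][node_id]))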
I think there is a good prospect of using machine learning image recognition for guessing digital modes. Ideally this would be built into clients, but it might make a good app (using the phone camera to capture the unidentified signal) or a web site where you upload a screen shot.
The main thing I need to expand this is lots of sample waterfall images.
There's some interesting discussion in a thread in the Reddit amateur radio subreddit.
4 comments:
Annoyingly, there is a nice 'standard' for identifying what mode you're about to transmit - RSID. Fldigi, along with a few other data mode programs, supports it. Unfortunately, it's not widely used, leading to the situation you find yourself in.
I suspect a machine learning algorithm like you're talking about may help to distinguish between the 'base' modulation schemes (PSK, QPSK, FSK, MFSK), but won't help with their sub-modes.
For example, how would you distinguish something like DominoEX from THOR? They look exactly the same on a waterfall, but one has FEC and interleaving...
Let's have RSID default to "ON" for most modes in HRD, fldigi, Multipsk etc...
Also, a brute force approach to computer decoding is perhaps easier to code, and it's easy for an ASCII system to "recognize" clear text and Q codes in the decoded stream. Once all modes are matched, or a reduced set is chosen as a trial from unique features such as bandwidth, modulation and baud rate, clear text decodes from many, many modes can easily be defined and scripted...
Another option would be to lean directly on the audio files rather than the waterfall display:
https://ccrma.stanford.edu/events/deep-learning-audio-applications-using-tensorflow
oh excellent, I will check it out. Thanks.