Accessible Machine Learning and Annotation with Zegami

There is an exceptionally common problem in business and research. Organisations sit upon vast veins of untapped gold, but lack the tools to mine it. Picture a media company sitting proudly upon hoards of unannotated sports images and video. Monetising that content means annotating it, and annotating it requires recognition. Typical resources thrown at such problems are:

  • An almost endless source of processing power – ranging from your personal laptop, to your tens of thousands of CPUs, GPUs and TPUs eagerly waiting in the cloud.
  • An incredibly powerful skill (that even children quickly master): recognition. This is the human ability to intuitively classify objects or concepts within imagery and video content, or patterns within numerical data, with very little effort.

The problem is that people cannot easily leverage their processing power to apply recognition to their specific tasks. Generic APIs exist to take rough stabs at these problems, but recognising “dogs” and “cats” isn’t going to help if you want to find, for example, your logo in an image. So you turn to hiring 10 people to work 9-5 hand-annotating your stock. This slow form of torture results in some miserable people, a heap of your budget down the drain, and a mediocre chunk of your ever-growing data pile annotated.

Zegami can solve this problem. Utilising bleeding-edge machine learning techniques, you can recruit one person, not 10, to train a model to recognise your specific, bespoke classes in a matter of hours, to an impressive degree of accuracy. By leveraging Transfer Learning (and, for those interested, Matterport’s Mask R-CNN implementation) to lower the amount of input required, creating an accurate model no longer requires the annotation of tens of thousands of images; you can be done in an afternoon. I recently demoed this approach. Here are the steps I took:

An Example

The fine fellow with the hook and cleaver is Pudge, from the popular video game Dota 2. You’ll find him equipped in a large variety of equally fine apparel and weapons. I would wager that, given a completely different image, you would be able to spot this pretty face from any angle, under a wide range of lighting conditions and environments. This is not an easy thing to do from a technological standpoint; it is your innate intuition that makes it seem so easy. It turns out that trying to make a robust “Pudge finder” program from scratch is exceptionally difficult.


It isn’t just his varying appearance that makes him complex. He appears in all sorts of environments, and plays out his role alongside over 100 other colourful characters on the Dota 2 playing field. The variation in his background and appearance makes him a very difficult test case for training a machine capable of recognition.

Using the visualization and annotation tools available through Zegami, I rapidly annotated a handful of images containing Pudge, and trained a model to recognise him anywhere:

  • First, I uploaded my stockpile of images that contained Pudge.
  • Next, I designated a handful of these images for annotation and training using a quick drag-selection.
  • Using the cutting-edge segmentation tools available in Zegami Amethyst, I annotated roughly 30 images. An example of this annotation process is given below. It uses an intelligent cutting algorithm that requires little input or time to produce high-quality segmentation masks (seen in pink). The “part-of-object” (green) and “not-part-of-object” (blue) paints allow you to tell the cutter what the object is and isn’t. With a few hints, it will cut out accurate annotations for you.
  • With a small annotated dataset at the ready, I carried out training using Zegami’s model training suite. At the press of a button, a few annotations are all you need to get started. Monitoring training progress and quality is easy with a few training readout monitors.
  • With my bespoke “Pudge” neural net trained, I let it loose on the rest of my unannotated images and viewed the results in a separate Zegami collection. There are a few false positives, but it’s not bad for creating a highly specific model in under an hour! The visual similarity clustering view even lets us pick out similar images. For instance, I used it to find the examples of Pudge with the best-looking gear.

And there we have it: bespoke, powerful machine learning made simple!
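
For readers curious about the hint-based annotation step above, here is a toy sketch of the general idea of turning a few “part-of-object” and “not-part-of-object” paints into a mask. It is not Zegami’s cutting algorithm, which isn’t described in this post; the function name `segment_from_hints`, the greyscale grid and the nearest-mean rule are all assumptions chosen purely for illustration. Real tools work on colour, texture and edges, but the principle of a few hints in and a full mask out is the same.

```python
from statistics import mean


def segment_from_hints(image, fg_hints, bg_hints):
    """Toy hint-based segmentation (hypothetical, for illustration only).

    image:    2D list of greyscale intensities (0-255).
    fg_hints: (row, col) pixels painted as "part-of-object".
    bg_hints: (row, col) pixels painted as "not-part-of-object".
    Returns a 2D list of booleans, True where a pixel is judged to be object.
    """
    if not fg_hints or not bg_hints:
        raise ValueError("need at least one hint of each kind")

    # Summarise each set of painted pixels by its average intensity.
    fg_mean = mean(image[r][c] for r, c in fg_hints)
    bg_mean = mean(image[r][c] for r, c in bg_hints)

    # Assign every pixel to whichever average it is closer to.
    mask = [
        [abs(pixel - fg_mean) < abs(pixel - bg_mean) for pixel in row]
        for row in image
    ]

    # The annotator's own paints always win over the guess.
    for r, c in fg_hints:
        mask[r][c] = True
    for r, c in bg_hints:
        mask[r][c] = False
    return mask


# A made-up 4x4 "image": a bright 2x2 object on a dark background.
image = [
    [10, 12, 11, 9],
    [11, 200, 210, 10],
    [12, 205, 198, 13],
    [9, 11, 12, 10],
]

# One green paint on the object, one blue paint on the background.
mask = segment_from_hints(image, fg_hints=[(1, 1)], bg_hints=[(0, 0)])

for row in mask:
    print("".join("#" if cell else "." for cell in row))
```

With just two painted pixels, the sketch marks the whole bright 2x2 block as the object and leaves everything else as background, which is the kind of low-effort annotation the segmentation step relies on.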