Feb 11, 2020
Combining AI and machine learning with drones means that many of today's mundane tasks in construction, agriculture, building security, shipping, and warehousing can be handed off to automated drones. In this blog we're going to look at how we automated counting cattle using a Parrot Anafi drone and Google's TensorFlow, to get you started on using AI in your drone apps. The end result is a mobile application running on an Android phone that gives a real-time count of the cows that the drone's camera can see.
Intro to ML object detection
There are four major steps we need to follow:
Gather the data
Label the images
Train the model
Test the model
These tasks are independent of the mobile app programming; we can deploy the model on the phone once all four steps are complete.
Gather the data
We need approximately 5,000 images of cows in the field or pasture to train our neural network. These should be collected under a variety of conditions: different seasons, different farms, different times of day, and of course different weather. We also asked our pilots to fly at a height of 50 ft with the camera pointing straight down.
Label the images
Figure 1: Outsourced Labeled Image
We need to draw a box around each of the cows in each image, see Figure 1. These labeled images train our neural network so it learns to recognize cows. It's a labor-intensive task, and you can either do it yourself using software like LabelImg or outsource it to a labeling agency like Hive. Remember: garbage in, garbage out, so you need to be extra careful to make sure your labeling is accurate and consistent.
Figure 2: TFRecords
Once the images are labeled, we need to generate training and testing datasets; the testing images will help us see whether our training works. Figure 2 describes the TensorFlow Records process. We can use the following scripts (originally used to detect pet raccoons) to get our images ready.
xml_to_csv.py converts our XML annotation files into CSV files
generate_tfrecord.py converts our CSV files into the TensorFlow train.record and test.record formats.
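If you use the raccoon-dataset versions of these scripts, the invocations look something like this (the flag names come from that repository; your paths will differ):

```bash
python xml_to_csv.py
python generate_tfrecord.py --csv_input=data/train_labels.csv --output_path=data/train.record
python generate_tfrecord.py --csv_input=data/test_labels.csv --output_path=data/test.record
```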
We also need a label map file, similar to Listing 1.
Listing 1: Cattle Label Map File
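Since we have a single class, the label map is tiny. In TensorFlow's standard label map format it looks like this (the filename cattle_label_map.pbtxt is our choice):

```
item {
  id: 1
  name: 'cow'
}
```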
Train the model
Now that we have the data prepared, we're ready to move on to the training. There are five steps we need to follow:
1. Set up Google Cloud Platform
2. Set up Docker Environment
3. Configure Local Google Cloud Environment
4. Set up Object Detection API
5. Train the model
Set up Google Cloud Platform
Figure 3: Storage
Sign up for a Google Cloud Platform account. Log in and create your new project; you'll also need to enable the ML Engine for your project. Next, go to "Storage" and create a new bucket for our data. We'll call it "cattlecounterbucket", see Figure 3. Finally, create a sub-directory in your storage bucket called "data".
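If you prefer the command line, the same bucket can be created with gsutil (bucket names are globally unique, so yours may need to differ):

```bash
gsutil mb gs://cattlecounterbucket
```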
Set up Docker Environment
Figure 4: Docker Images
Install Docker on your machine, then download the TensorFlow object detection Dockerfile, which installs all the prerequisites for our training. Build the image by running:
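A typical invocation looks like this (the image tag is our choice; the Dockerfile path depends on where you downloaded it):

```bash
docker build -t cattle-detector -f Dockerfile .
```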
Find the ID of the docker image you’ve just built by running
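The standard listing command shows it in the IMAGE ID column:

```bash
docker images
```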
Attach to the image by running
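For example, substituting the ID from the previous step:

```bash
docker run -it <IMAGE_ID> /bin/bash
```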
You should then see a command prompt from inside the Docker container, see Figure 4.
Configure Local Google Cloud Environment
We need to set up the Google Cloud environment by giving it our credentials and project information. Start by running the following command.
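We'd expect the standard gcloud login flow here:

```bash
gcloud auth login
```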
Let GCP know your Project ID and Bucket Name.
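For instance, via shell variables (the project ID is whatever you created above; the bucket is the one from the Storage step):

```bash
export PROJECT="your-project-id"
export YOUR_GCS_BUCKET="cattlecounterbucket"
```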
Tell the gcloud tool which project and bucket we are working with.
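gcloud only stores the project; the bucket is passed explicitly in the later commands:

```bash
gcloud config set project $PROJECT
```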
Since we are training a model, we’ll want access to Google’s Cloud TPUs. Fetch the name of your service account by running the following
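This is the getConfig call from the TensorFlow object detection TPU tutorial of that era:

```bash
curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  https://ml.googleapis.com/v1/projects/${PROJECT}:getConfig
```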
Store the TPU service account identifier from the above response
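For example:

```bash
# Copy the tpuServiceAccount value from the JSON response above
export TPU_ACCOUNT=your-tpu-service-account
```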
Grant TPU permissions to the project
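The ml.serviceAgent role is what the TPU tutorial grants:

```bash
gcloud projects add-iam-policy-binding $PROJECT \
  --member serviceAccount:$TPU_ACCOUNT \
  --role roles/ml.serviceAgent
```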
Set up the Object Detection API
Test the API from your container by running the following command:
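At the time, the object detection API shipped a model-builder test that exercises the whole install (the path assumes the container's /tensorflow/models/research layout):

```bash
# From /tensorflow/models/research inside the container
python object_detection/builders/model_builder_test.py
```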
Check that the output shows passing tests, see Figure 5.
Figure 5: Object Detection API tests
Copy your TFRecord files and label map from your host machine into your Docker container.
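docker cp works from the host; find the running container's ID with docker ps. The filenames here match our TFRecord and label map steps:

```bash
docker cp train.record <CONTAINER_ID>:/tensorflow/models/research/
docker cp test.record <CONTAINER_ID>:/tensorflow/models/research/
docker cp cattle_label_map.pbtxt <CONTAINER_ID>:/tensorflow/models/research/
```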
Move your data into your Google Cloud bucket.
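Using the "data" sub-directory we created earlier:

```bash
gsutil cp train.record gs://${YOUR_GCS_BUCKET}/data/
gsutil cp test.record gs://${YOUR_GCS_BUCKET}/data/
gsutil cp cattle_label_map.pbtxt gs://${YOUR_GCS_BUCKET}/data/
```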
We're using a pre-trained TensorFlow object detection model, a quantized SSD MobileNet, to detect the cattle. The model does all the heavy lifting; we're basically configuring it to work with our labeled images. Download the object detection model and copy it to your storage bucket.
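The archive name below follows the TensorFlow detection model zoo's naming for this checkpoint and may have changed since; check the zoo for the current URL:

```bash
wget http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_0.75_depth_300x300_coco14_sync_2018_07_03.tar.gz
tar -xzvf ssd_mobilenet_v1_0.75_depth_300x300_coco14_sync_2018_07_03.tar.gz
gsutil cp ssd_mobilenet_v1_0.75_depth_300x300_coco14_sync_2018_07_03/model.ckpt.* gs://${YOUR_GCS_BUCKET}/data/
```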
Edit the detection configuration in /tensorflow/models/research/object_detection/samples/configs/ssd_mobilenet_v1_0.75_depth_quantized_300x300_pets_sync.config. Use the following table to update the configuration values.
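As a rough sketch, the values that usually need updating are the class count and the bucket paths, along these lines (the label map filename is our choice):

```
num_classes: 1
fine_tune_checkpoint: "gs://cattlecounterbucket/data/model.ckpt"
label_map_path: "gs://cattlecounterbucket/data/cattle_label_map.pbtxt"
input_path: "gs://cattlecounterbucket/data/train.record"  # in train_input_reader
input_path: "gs://cattlecounterbucket/data/test.record"   # in eval_input_reader
```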
Upload the new configuration to your Google Cloud bucket.
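Renaming it pipeline.config on the way up keeps the later steps tidy:

```bash
gsutil cp object_detection/samples/configs/ssd_mobilenet_v1_0.75_depth_quantized_300x300_pets_sync.config \
  gs://${YOUR_GCS_BUCKET}/data/pipeline.config
```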
Package the object detection API by running
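These are the packaging commands from the object detection tutorial of that era:

```bash
# From /tensorflow/models/research inside the container
bash object_detection/dataset_tools/create_pycocotools_package.sh /tmp/pycocotools
python setup.py sdist
(cd slim && python setup.py sdist)
```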
Train the Model
We’re now ready to start training the model. Create a file called training_config.yaml with the following contents, see Figure 6.
Figure 6: Training Configuration
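A minimal version of this file looks something like the following (the runtime version is an assumption; use whatever your TensorFlow build requires):

```yaml
trainingInput:
  runtimeVersion: "1.15"
  scaleTier: BASIC_TPU
```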
Create a similar config file for evaluation or testing, eval_config.yaml. Add one additional args entry: --checkpoint_dir=gs://<YOUR_GCS_BUCKET>/train
Start the training job on Google Cloud Platform by running the following commands:
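A submission along the lines of the object detection TPU tutorial, with our yaml passed via --config (the module and flag names follow that tutorial and may differ across releases):

```bash
gcloud ml-engine jobs submit training `whoami`_cattle_`date +%s` \
  --job-dir=gs://${YOUR_GCS_BUCKET}/train \
  --packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz,/tmp/pycocotools/pycocotools-2.0.tar.gz \
  --module-name object_detection.model_tpu_main \
  --region us-central1 \
  --config training_config.yaml \
  -- \
  --tpu_zone us-central1 \
  --model_dir=gs://${YOUR_GCS_BUCKET}/train \
  --pipeline_config_path=gs://${YOUR_GCS_BUCKET}/data/pipeline.config
```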
While we’re at it, let’s start another job to do the evaluation:
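The evaluation job is almost identical, but uses the CPU/GPU entry point and our eval_config.yaml, which already carries the checkpoint_dir argument:

```bash
gcloud ml-engine jobs submit training `whoami`_cattle_eval_`date +%s` \
  --job-dir=gs://${YOUR_GCS_BUCKET}/train \
  --packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz,/tmp/pycocotools/pycocotools-2.0.tar.gz \
  --module-name object_detection.model_main \
  --region us-central1 \
  --config eval_config.yaml \
  -- \
  --model_dir=gs://${YOUR_GCS_BUCKET}/train \
  --pipeline_config_path=gs://${YOUR_GCS_BUCKET}/data/pipeline.config
```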
Open the Google Cloud console in your browser and click the AI Platform tab, and then click “Jobs”. You should see your jobs queued, see Figure 7.
Figure 7: GCP console
You should see a green icon next to each job when they are complete.
Test the model
We can use TensorBoard to see our object detection accuracy calculated during the validation phase. TensorBoard is a browser-based tool. Install and run it as follows.
Assuming you have Python and pip installed on your machine, install TensorFlow with the following command.
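Pinning to the 1.x line the object detection code expects (the exact version is an assumption):

```bash
pip install tensorflow==1.15
```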
Install Google Cloud CLI so we can access the jobs.
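Google's installer script is the quickest route, followed by application-default credentials so TensorBoard can read the bucket:

```bash
curl https://sdk.cloud.google.com | bash
gcloud auth application-default login
```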
Start TensorBoard from your bucket’s training directory with
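Pointing --logdir straight at the bucket's train directory:

```bash
tensorboard --logdir=gs://cattlecounterbucket/train
```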
Open TensorBoard in your browser at http://localhost:6006
Go to the images tab to see how well your model did (right) against the original labeled image (left), see Figure 8.
Figure 8: Labeled image vs. Object detection
Deploying on Android
We can use TensorFlow Lite (TFLite) to convert our trained model to work on an Android device. First let's set up some config on our Docker container.
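The configuration boils down to a few paths; as a sketch using environment variables (the checkpoint number depends on how long your training ran):

```bash
export CONFIG_FILE=gs://${YOUR_GCS_BUCKET}/data/pipeline.config
export CHECKPOINT_PATH=gs://${YOUR_GCS_BUCKET}/train/model.ckpt-<STEP>
export OUTPUT_DIR=/tmp/tflite
```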
We also need to modify the export script to increase the number of detection boxes from 10 to 100. Edit object_detection/export_tflite_ssd_graph.py (line 106).
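With the script edited, the export itself follows the TensorFlow Lite documentation:

```bash
python object_detection/export_tflite_ssd_graph.py \
  --pipeline_config_path=$CONFIG_FILE \
  --trained_checkpoint_prefix=$CHECKPOINT_PATH \
  --output_directory=$OUTPUT_DIR \
  --add_postprocessing_op=true
```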
We now have two new files, tflite_graph.pb and tflite_graph.pbtxt, in our output directory. Use the TensorFlow Lite Optimizing Converter (TOCO) to optimize the model for mobile.
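The flag values below (quantized input, 300x300 shape, mean/std of 128) come from the TensorFlow Lite docs for this model family and may differ across versions:

```bash
bazel run -c opt tensorflow/lite/toco:toco -- \
  --input_file=$OUTPUT_DIR/tflite_graph.pb \
  --output_file=$OUTPUT_DIR/detect.tflite \
  --input_shapes=1,300,300,3 \
  --input_arrays=normalized_input_image_tensor \
  --output_arrays='TFLite_Detection_PostProcess','TFLite_Detection_PostProcess:1','TFLite_Detection_PostProcess:2','TFLite_Detection_PostProcess:3' \
  --inference_type=QUANTIZED_UINT8 \
  --mean_values=128 \
  --std_values=128 \
  --change_concat_input_ranges=false \
  --allow_custom_ops
```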
Building the Android App
Now we can start integrating the model into our Parrot application. Create a new Kotlin project in Android Studio and make sure the following dependencies are in your app’s build.gradle file.
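The coordinates and versions here are assumptions; check Parrot's GroundSdk releases and the TensorFlow Lite release notes for current ones:

```groovy
dependencies {
    // Parrot GroundSdk for drone connection and video streaming
    implementation 'com.parrot.drone.groundsdk:groundsdk:1.6.0'
    runtimeOnly 'com.parrot.drone.sdkcore:arsdk-android:1.6.0'
    // TensorFlow Lite interpreter for running detect.tflite
    implementation 'org.tensorflow:tensorflow-lite:1.15.0'
}
```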
We start with a singleton, but it's not doing much yet. Add an instance variable to hold the GroundSdk session and a method to initialize it, as sketched below.
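A minimal sketch (the class name is our own; per Parrot's samples, ManagedGroundSdk.obtainSession ties the session to an activity's lifecycle):

```kotlin
import androidx.appcompat.app.AppCompatActivity
import com.parrot.drone.groundsdk.GroundSdk
import com.parrot.drone.groundsdk.ManagedGroundSdk

object DroneApp {
    // GroundSdk session, created once and shared across the app
    var groundSdk: GroundSdk? = null
        private set

    fun init(activity: AppCompatActivity) {
        // Session lifecycle is managed for us, matching the activity's
        groundSdk = ManagedGroundSdk.obtainSession(activity)
    }
}
```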
Place a GsdkStreamView in your layout and pass it into the following so we can manipulate the video stream. It may look complicated, but all that’s really going on is ensuring that streaming is enabled and playing the stream for the given stream view whenever it’s available.
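A sketch based on Parrot's GroundSdk streaming sample (exact names can vary between SDK versions; in a real app, keep the Ref objects returned by getPeripheral and live so the observers stay registered):

```kotlin
import com.parrot.drone.groundsdk.device.Drone
import com.parrot.drone.groundsdk.device.peripheral.StreamServer
import com.parrot.drone.groundsdk.device.peripheral.stream.CameraLive
import com.parrot.drone.groundsdk.stream.GsdkStreamView

fun startVideoStream(drone: Drone, streamView: GsdkStreamView) {
    drone.getPeripheral(StreamServer::class.java) { streamServer ->
        streamServer?.run {
            // Make sure the drone is publishing its video stream
            if (!streamingEnabled()) enableStreaming(true)
            live { liveStream: CameraLive? ->
                liveStream?.let {
                    streamView.setStream(it)  // render into the layout's view
                    if (it.playState() != CameraLive.PlayState.PLAYING) it.play()
                }
            }
        }
    }
}
```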
Finally, we take bitmaps from the video stream and classify them in real-time with TensorFlow, which will give us a list of bounding boxes to overlay on our video feed.
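A hypothetical CattleDetector helper, adapted from the TensorFlow Lite Android examples, shows the shape of this. It assumes detect.tflite is bundled in the app's assets, the quantized 300x300 input from the TOCO step, and the 100-box limit we configured in the export script:

```kotlin
import android.content.Context
import android.graphics.Bitmap
import org.tensorflow.lite.Interpreter
import java.io.FileInputStream
import java.nio.ByteBuffer
import java.nio.ByteOrder
import java.nio.channels.FileChannel

class CattleDetector(context: Context) {

    private val inputSize = 300      // matches the model's 300x300 input
    private val maxDetections = 100  // matches our export script edit
    private val interpreter = Interpreter(loadModel(context))

    private fun loadModel(context: Context): ByteBuffer {
        val fd = context.assets.openFd("detect.tflite")
        FileInputStream(fd.fileDescriptor).channel.use { channel ->
            return channel.map(FileChannel.MapMode.READ_ONLY, fd.startOffset, fd.declaredLength)
        }
    }

    /** Returns normalized [top, left, bottom, right] boxes scoring above minScore. */
    fun detect(bitmap: Bitmap, minScore: Float = 0.5f): List<FloatArray> {
        val scaled = Bitmap.createScaledBitmap(bitmap, inputSize, inputSize, true)
        // The quantized model takes raw RGB bytes, one byte per channel
        val input = ByteBuffer.allocateDirect(inputSize * inputSize * 3).order(ByteOrder.nativeOrder())
        val pixels = IntArray(inputSize * inputSize)
        scaled.getPixels(pixels, 0, inputSize, 0, 0, inputSize, inputSize)
        for (p in pixels) {
            input.put((p shr 16 and 0xFF).toByte())  // R
            input.put((p shr 8 and 0xFF).toByte())   // G
            input.put((p and 0xFF).toByte())         // B
        }
        // Output tensors of the TFLite_Detection_PostProcess op
        val boxes = Array(1) { Array(maxDetections) { FloatArray(4) } }
        val classes = Array(1) { FloatArray(maxDetections) }
        val scores = Array(1) { FloatArray(maxDetections) }
        val count = FloatArray(1)
        val outputs = mapOf<Int, Any>(0 to boxes, 1 to classes, 2 to scores, 3 to count)
        interpreter.runForMultipleInputsOutputs(arrayOf<Any>(input), outputs)
        return (0 until maxDetections)
            .filter { scores[0][it] >= minScore }
            .map { boxes[0][it] }
    }
}
```

The live cow count is then just detect(bitmap).size, which we can draw as an overlay on top of the GsdkStreamView.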
Conclusion
While this is a longish blog post, it's worth noting that most of what we're doing is configuring existing systems to build our cattle-counting object detector. We didn't have to create the model, and we outsourced the labeling to a third party. We also used the TensorFlow Lite examples to manipulate the video stream and the Parrot SDK to provide the video of the cattle.