In late 2018, Mobiquity delivered a project to create a Crowd Counting prototype mobile application. The purpose of the app was to demonstrate how video can be streamed to a cloud backend service and analyzed with cloud machine learning (ML). The use-case was simple: capture a short video clip of people in a room and, within a few seconds, return an ML-based numeric count of the people in that space. The technical implementation tackled a number of new engineering feats in a short, six-week period.
Our global team looked like this: we worked with the client and their leadership in Seattle, had creatives in San Francisco and Philadelphia, and engineering/data science professionals in Amsterdam, who worked alongside the client's team (which had some of the best engineers in the world!). This geographic spread presented both challenges and opportunities, and in the end we accomplished a lot. The creative ideation stemmed from onsite workshops where we whiteboarded the user experience, application logic, and associated data flows, iterating over several days. Once we narrowed down potential use-cases, we created an end-to-end flow that made sense from the creative, engineering, and data science perspectives alike.
From there we kicked off parallel streams: mobile engineering (focusing on Android Software Development Kits (SDKs) that didn't yet exist), architecture and pipeline builds using AWS core services, creative/UX/UI work to fast-track front-end designs, and a lot of foundational work to assemble and annotate datasets. The best way to describe this project is waterfall design, followed by several intense weeks of one-day sprints.
Between the mobile application/platform, network, cloud, and backend data processing components, we ran into numerous decision points that had to be tackled quickly, and in some cases re-thought based on the downstream results we wanted to improve. In the end, we produced a mobile SDK to stream video to Kinesis Video Streams (KVS), processed the video into a panorama image, and ran that image through SageMaker to return a people count and an associated accuracy estimate. The ML model was trained as an object detector that identifies head-and-shoulders patterns (not facial recognition).
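To make the last step of that pipeline concrete, here is a minimal sketch of how a backend might turn the model's raw detections into the people count and accuracy estimate. It assumes the SageMaker endpoint returns detections in the format used by the built-in object detection algorithm (a `"prediction"` list of `[class_id, confidence, xmin, ymin, xmax, ymax]` entries); the class id and confidence threshold are illustrative, not the values from our project.

```python
import json

# Assumed class id for the head-and-shoulders label in the trained model.
HEAD_SHOULDERS_CLASS = 0

def count_people(response_body: str, threshold: float = 0.5):
    """Return (count, mean_confidence) for detections above the threshold.

    response_body: JSON string shaped like the SageMaker built-in object
    detection output: {"prediction": [[class, conf, xmin, ymin, xmax, ymax], ...]}
    """
    predictions = json.loads(response_body)["prediction"]
    hits = [
        conf for cls, conf, *_box in predictions
        if int(cls) == HEAD_SHOULDERS_CLASS and conf >= threshold
    ]
    mean_conf = sum(hits) / len(hits) if hits else 0.0
    return len(hits), mean_conf

# Example with a mocked endpoint response: two confident detections,
# one low-confidence detection that is filtered out.
sample = json.dumps({"prediction": [
    [0, 0.92, 0.10, 0.10, 0.20, 0.30],
    [0, 0.81, 0.40, 0.20, 0.50, 0.40],
    [0, 0.30, 0.70, 0.10, 0.80, 0.20],  # below threshold, ignored
]})
count, confidence = count_people(sample)
print(count, round(confidence, 3))  # 2 0.865
```

In the real pipeline the response body would come from an `invoke_endpoint` call against the deployed model rather than a mocked string, but the post-processing logic is the same: filter by class and confidence, then report the count alongside an aggregate confidence.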
For those who want to get hands-on and build, experiment, and innovate, here are a couple of thoughts for consideration:
We are excited to finally release the Crowd Counting Demo App Developer's Guide so that developers and architects can give it a go. We can't wait to see what you come up with!