CrossControl Advances Virtual Fencing and Object Detection

The company collaborated with Mälardalen University to increase the situational awareness of off-highway vehicles and improve operator response.

Algorithms for cone detection, virtual fence creation and hazard detection are integrated with the application and alert an operator as soon as a hazard enters the designated area. (CrossControl)

Industrial vehicles such as forklifts, cranes and tractors have come a long way in applying technology to enhance performance, operation and safety. With advances in computer vision, robotics and artificial intelligence (AI), these vehicles now are equipped with functionality that uses sensor information to support and optimize performance. One of the key enablers of these advances is the integration of machine learning (ML) platforms with powerful computers, neural processing units and cameras in the vehicle's digital system.

By determining if a detected person or object is in a dangerous area, the application can alert the operator with an alarm or an on-screen warning. (CrossControl)

System designers and OEMs can get started with AI and computer vision using just a standard Ethernet camera and a dual- or quad-core Arm Cortex-A35 next-generation display. Building on object detection and recognition, designers can implement and train neural networks to realize new solutions for process guidance, automation, augmented reality and operator awareness. For example, the dual-core CCpilot V700 from CrossControl can provide sufficient detection performance for automating processes.

Adding an optional AI accelerator module, inserted into the mini PCIe slot of the quad-core platform used by the CCpilot V1000 and CCpilot V1200, enhances performance for advanced detections and processes. In either example, computer vision is implemented with enough computing headroom for additional UI features and other HMI solutions, and the system requires little power and generates very little heat. CrossControl is displaying some of these technologies for increased machine intelligence at the CONEXPO show in Las Vegas.

Basic setups

Advanced object detection employs a combination of the NXP i.MX 8QuadXPlus and quad-core Arm Cortex-A35 application processor. (NXP)

Operators often are tasked with monitoring a camera stream for an event or state and reacting accordingly. For example, during grain harvests, drivers watch a video feed of the grain bin; when it reaches capacity, they either call in a new grain truck or drive to a drop-off point to empty the onboard storage. With the base level of computer vision offered by a dual-core display, the unit can serve a message to the driver when AI-inferred object detection recognizes that a state has been reached. Recognizing a state is among the simpler tasks AI can perform on a video stream and requires significantly fewer computing resources than more complex image analysis.
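As a rough illustration of the pattern, the sketch below (in Python with OpenCV) watches a camera stream and debounces a "bin full" state over several consecutive frames before notifying the operator. The stream URL, thresholds and the stand-in heuristic are illustrative assumptions; a production system would run a trained neural network instead.

```python
import cv2

CONFIRM_FRAMES = 5  # require N consecutive "full" results before alerting

def infer_state(frame):
    # Stand-in for the trained model so the sketch runs: a crude
    # brightness heuristic. A real deployment would run an
    # object-detection or classification network here.
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    fill = (gray > 120).mean()          # fraction of "grain-bright" pixels
    return ("bin_full" if fill > 0.6 else "filling"), fill

def watch_bin(stream_url="rtsp://camera.local/stream"):  # hypothetical URL
    cap = cv2.VideoCapture(stream_url)  # standard Ethernet/IP camera
    hits = 0
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        state, confidence = infer_state(frame)
        hits = hits + 1 if state == "bin_full" else 0
        if hits >= CONFIRM_FRAMES:      # state confirmed, serve the message
            print("Operator message: grain bin full")  # stand-in for HMI call
            hits = 0
    cap.release()
```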

In a configuration like this, where the display also is used for instrumentation, machine controls and other HMI systems, average detections in CrossControl testing exceeded 10 per second, meaning that the message to the operator, or an enabled vehicle control, is served in less than 20 cm (7.9 inches) of vehicle movement (harvesters typically operate at speeds of 7-9 km/h, or 4-6 mph). The ML process does not require the video stream to also be rendered to the display, so a dedicated camera monitor no longer is a necessity and more screen space can be used for additional functions.
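The arithmetic behind that figure is straightforward; the snippet below reproduces it for the lower end of the quoted speed range at 10 detections per second.

```python
# Movement between detections at the lower end of the quoted range:
# 7 km/h harvester speed, 10 detections per second.
speed_ms = 7.0 / 3.6                  # 7 km/h ≈ 1.94 m/s
interval_s = 1.0 / 10                 # one detection every 0.1 s
print(f"{speed_ms * interval_s * 100:.1f} cm")  # -> 19.4 cm, under 20 cm
```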

The video stream still can be opened in a windowed panel or full screen as required by an operator. The system also can be set to show the stream automatically, giving the operator visual confirmation when the fill level reaches a defined point or another task trigger is reached.

Advanced object detection

A more advanced version of this deployment employs a combination of the NXP i.MX 8QuadXPlus and quad-core Arm Cortex-A35 application processor. This setup offers exceptional graphical performance and uses the device's internal mini PCIe port, equipped with an AI acceleration module, to deliver more than 45 detections per second, even with advanced models trained to recognize more than 90 different classes of objects.
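The article does not name the inference framework, but a common pattern on embedded Linux platforms is TensorFlow Lite with a vendor delegate routing the model to the accelerator. The sketch below follows that pattern; the model file, delegate library name and SSD-style output layout are assumptions for illustration, not CrossControl's implementation.

```python
import cv2
import numpy as np
from tflite_runtime.interpreter import Interpreter, load_delegate

# Model file and delegate library names are placeholders.
interpreter = Interpreter(
    model_path="ssd_detector.tflite",
    experimental_delegates=[load_delegate("libaccel_delegate.so")])
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
outputs = interpreter.get_output_details()

def detect(frame_bgr):
    # Resize the camera frame to the model's expected input resolution.
    h, w = int(inp["shape"][1]), int(inp["shape"][2])
    rgb = cv2.cvtColor(cv2.resize(frame_bgr, (w, h)), cv2.COLOR_BGR2RGB)
    interpreter.set_tensor(inp["index"],
                           np.expand_dims(rgb, 0).astype(inp["dtype"]))
    interpreter.invoke()                # inference offloaded to the accelerator
    # Typical SSD outputs: boxes, class IDs, scores, detection count.
    return [interpreter.get_tensor(o["index"]) for o in outputs]
```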

The greater performance of the larger CCpilot V1000 and V1200 displays allows the system to be used for productivity, awareness and safety solutions. The higher detection rate and application-processor headroom can deliver simultaneous deployment of multiple applications: for example, task guidance with AI support, object detection within the video feed (to offer additional situational awareness to the driver) and general machine information, with all three shown concurrently on a single display.

Creating a virtual fence

The larger CCpilot V1000 and V1200 displays, with a higher rate of detections and application processor headroom, can deliver simultaneous deployment of multiple applications. (CrossControl)

In conjunction with a researcher at Mälardalen University, CrossControl experts developed a test application for the i.MX 8X-based display computing platform that helps create situational awareness for work vehicles using surveillance cameras and object detection. The test application runs on the COTS V1000 and V1200 products. In the first method of supporting operator awareness, researcher Gustav Alexandersson built an application that creates a virtual fence between cones placed around the vehicle, defining a work area of any size or shape.

The virtual fence splits the image into two areas: a dangerous area close to the vehicle and a safe area outside it. By determining whether a detected person is in the dangerous area, the application can alert the operator with an alarm or an on-screen warning. The area updates dynamically if a cone is moved. To validate the concept, the team tested three algorithms for calculating the virtual fence with up to 10,000 detectable simulated objects. The results showed that with a fence made of 10 cones and 1,000 visible people, the system could determine whether someone was at risk 1,000 times a second.
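The core of such a check is a point-in-polygon test on each person detection against the polygon formed by the cones. The sketch below shows one way to do this with OpenCV; ordering the cones with a convex hull is a simplifying assumption, and the research compared several fence-construction algorithms rather than prescribing this one.

```python
import numpy as np
import cv2

def build_fence(cone_centers):
    # cone_centers: (x, y) image coordinates of detected cones.
    # Using the convex hull assumes a convex work area; the examined
    # algorithms also handle other orderings and shapes.
    pts = np.array(cone_centers, dtype=np.float32)
    return cv2.convexHull(pts)

def person_at_risk(fence, person_xy):
    # pointPolygonTest >= 0 means on or inside the fenced (dangerous) area.
    return cv2.pointPolygonTest(fence, person_xy, False) >= 0

fence = build_fence([(120, 400), (520, 380), (600, 120), (90, 150)])
if person_at_risk(fence, (300, 250)):
    print("ALERT: person inside work area")  # stand-in for the HMI alarm
```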

With the platform's more powerful architecture supported by the onboard mini PCIe AI accelerator, all three examined algorithms can analyze the area at least 30 times a second (once for each frame of video), even when integrated with a complete HMI application, though some performed more efficiently than others. This is sufficient, for example, to alert the driver of a hazard in the area before an excavator arm can enter the same zone.

The same single camera and onboard display setup also can create a virtual fence without cones, setting a predefined work area and using the same human-detection model to serve hazard warnings to an operator. An operator could even draw on the display to set a work area and be notified when a person enters it, though cones offer the added benefit of serving as a visible warning to people outside the machine.

Multiple camera implementations

Multiple cameras and digital image-stitching make it possible to create a bird’s-eye-view vision system for off-highway vehicles and pair the system with integrated human detection. (CrossControl)

Many larger machines deploy multiple cameras to provide all-around machine vision and driver awareness. Open platforms such as CrossControl's can provide excellent support at both the hardware and software levels for integrating this capability. The available software modules that enable commonly desired functionality include a camera module that provides the necessary tools for integrating camera streams into an application, with concurrent visualization of multiple streams from any camera brand.
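At its simplest, concurrent visualization amounts to decoding several streams and tiling them on screen. The sketch below does this with plain OpenCV; the stream URLs and tile size are illustrative, and a product camera module would handle decoding, layout and latency far more robustly.

```python
import cv2
import numpy as np

urls = ["rtsp://cam-front/stream", "rtsp://cam-rear/stream"]  # hypothetical
caps = [cv2.VideoCapture(u) for u in urls]

while all(c.isOpened() for c in caps):
    frames = []
    for c in caps:
        ok, frame = c.read()
        if ok:
            frames.append(cv2.resize(frame, (640, 360)))  # uniform tiles
    if len(frames) != len(caps):
        break
    cv2.imshow("cameras", np.hstack(frames))  # side-by-side view
    if cv2.waitKey(1) == 27:                  # Esc to quit
        break
for c in caps:
    c.release()
cv2.destroyAllWindows()
```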

With multiple cameras and digital image-stitching, it is possible to create a bird's-eye-view vision system for heavy vehicles and pair the system with integrated human detection. A demonstration of this feature was developed in collaboration with Mälardalen University and two master's degree candidates. The result is a bird's-eye view giving an overview of the entire vehicle's surroundings without blind spots.
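The underlying principle is a per-camera homography that maps the ground plane seen by each camera onto a common top-down canvas, after which the warped images are blended. The sketch below shows the mapping for a single camera; the point coordinates are illustrative and would come from calibration in practice.

```python
import numpy as np
import cv2

# Four ground-plane points as seen in one camera's image (pixels)...
src = np.float32([[220, 710], [1060, 705], [840, 420], [430, 425]])
# ...and where those same points lie on the top-down canvas.
dst = np.float32([[300, 900], [700, 900], [700, 500], [300, 500]])

H = cv2.getPerspectiveTransform(src, dst)  # 3x3 homography

def to_birds_eye(frame, canvas_size=(1000, 1000)):
    # Warp this camera's view onto the shared top-down canvas;
    # repeating this per camera and blending yields the stitched view.
    return cv2.warpPerspective(frame, H, canvas_size)
```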

The test system can detect humans within 5 meters (16 ft.) of the vehicle and visually alert the operator that a human is approaching. A deployed system can calibrate this distance to suit the application. If the system is set to monitor the direction of travel and brake automatically when a human is detected, the stopping distance for a typical work vehicle traveling at 10 km/h (6 mph) is just 55 cm (21.5 inches) if the system is detecting at 30 fps, and 75 cm (29.5 inches) if it operates at 10 detections per second.
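The gap between those two figures is mostly detection latency: the distance the machine travels between consecutive detections. The check below uses only the quoted numbers.

```python
# Distance traveled between detections at 10 km/h. The ~18.5 cm spread
# accounts for most of the 20 cm difference between the quoted 55 cm
# (30 fps) and 75 cm (10 detections/s) stopping distances.
v = 10.0 / 3.6                        # 10 km/h ≈ 2.78 m/s
for rate in (30.0, 10.0):
    print(f"{rate:>4.0f} det/s: {v / rate * 100:.1f} cm between detections")
# ->  30 det/s: 9.3 cm;  10 det/s: 27.8 cm
```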

The approach taken for the test solution executes directly on the GPU and allows further improvements to be implemented, such as the hardware-accelerated OpenCV data type UMat. Such refinements decrease execution time and increase both the available resources and the detections per second.
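UMat is part of OpenCV's transparent API (T-API): wrapping a frame in cv2.UMat lets supported operations run on the GPU via OpenCL, with the result downloaded only when needed. A minimal sketch:

```python
import cv2

cv2.ocl.setUseOpenCL(True)            # enable OpenCL dispatch if available

def preprocess(frame):
    gpu = cv2.UMat(frame)             # upload the frame once
    gpu = cv2.resize(gpu, (640, 384)) # runs on the GPU, stays there
    gpu = cv2.cvtColor(gpu, cv2.COLOR_BGR2RGB)
    return gpu.get()                  # download the final result
```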

Johan Persson, senior software application engineer and the lead for AI exploration at CrossControl AB, wrote this article for SAE Media.