Design goals
The LAA is designed to automate most developer boards that are, or have been, used by industry and by the communities. Its design aims to make testing on these DUTs reliable and reproducible.
Common issues
Hosting a test lab with DUTs is challenging and comes with multiple issues. Some well-known ones are listed here because they were taken into account in the design of the LAA.
DUT misbehavior
Often, DUTs used in test automation systems are prototypes or developer devices that were never meant to be productized or automated.
Misbehaviors
Many DUTs come with misbehaviors that cannot be fixed and must be worked around.
For instance, many DUTs come with:
- non-unique or random MAC addresses
- non-unique or random USB serials
- an unstable USB stack that can break the worker's USB stack
- …
Unknown tested software
The DUT will execute firmware, OS and user-space software provided by the test writer, meaning the LAA operator does not know what software will run on the DUT.
For instance, the DUT can run:
- network services, such as a DHCP or DNS server, that impact or alter the lab network
- a buggy network stack
- rogue software
Common Lab problems
The following table outlines common laboratory challenges and the corresponding solutions provided by the LAA.
| Problem | Description | Impact | LAA Solution |
|---|---|---|---|
| USB hub instability | External USB hubs occasionally require resets or a worker reboot. | DUTs become inaccessible for flashing or console access until intervention. | USB infrastructure is integrated into the LAA and dedicated to a single DUT, reducing dependency on large shared hubs. |
| USB device limits on testing devices | Testing devices have practical limits on the number of USB devices they can handle, sometimes requiring serial concentrators. | Infrastructure complexity increases and debugging USB issues becomes harder. | Each LAA manages only one DUT, eliminating the need to scale USB connectivity through hubs or concentrators. |
| USB port state after testing device reboot | When a testing device reboots, all USB ports power up, including those used for flashing that should normally remain off. | Flashing interfaces may interfere with normal operation until manually controlled by LAVA. | A dedicated worker per DUT avoids shared USB power-state issues across multiple devices. |
| Service instability | Shared services controlling many devices occasionally require restarting. | A single service issue can affect many DUTs simultaneously. | Each LAA runs its own worker and control stack, isolating service failures to one DUT. |
| Testing device reboot affecting multiple DUTs | One testing device controls many DUTs. | Rebooting the testing device interrupts multiple running jobs and impacts multiple users. | Each LAA has its own "controller", so only one DUT is affected. |
| Inability to reboot testing device immediately | Rebooting a testing device disrupts other DUTs running jobs. | Engineers must wait for jobs to finish or risk impacting customers. | Rebooting an LAA only affects a single DUT. |
| Mechanical PDU relay wear | Rack PDUs typically use mechanical relays that degrade over time. | Failed PDU ports prevent power cycling and require a new PDU port and a device dictionary change. | LAA uses integrated electronic switching (for example opto-isolated control) instead of mechanical rack PDUs. |
| Limited number of PDU ports, which need to be preserved | PDUs typically provide a fixed number of outlets, and individual outlets can break. | Scaling the DUT count requires additional PDUs and rack space because ports in existing PDUs have failed. | Power control is integrated within the LAA, which preserves rack PDU ports. |
| PDU port failure | Mechanical wear or electrical failure can render a PDU port unusable. | A DUT may become unusable unless moved to another rack or PDU. | Integrated power switching removes reliance on shared rack PDUs. |
| Need for relay boards | Some DUTs require GPIO toggling for reset or boot control. | Additional hardware and wiring are needed. | GPIO and relay functionality are integrated in the LAA. |
| Cable management complexity | Multiple cables per DUT (power, serial, USB, Ethernet, relay wiring). | Cable spaghetti, difficult debugging, and accidental disconnections. | Minimal cabling: typically only power and Ethernet are required. |
| Slow and complex DUT onboarding | New DUT types require designing wiring and infrastructure integration. | Onboarding may take hours or longer depending on setup. | DUT integration is handled through a MIB; once connected, deployment takes significantly less time. |
| Infrastructure troubleshooting overhead | Shared hubs, relays, PDUs, and cables introduce many potential failure points. | Engineers spend time diagnosing infrastructure rather than the DUT itself. | LAA integrates most infrastructure components into a single appliance. |
| Rack density planning complexity | DUT deployment depends on available PDU ports, USB capacity, and rack space. | Scaling requires careful planning and infrastructure expansion. | Each LAA is a self-contained unit that can be deployed independently. |
| Network exposure of DUTs | DUTs often share the lab network. | DUTs could potentially probe or interact with the lab network. | LAA provides an isolated local network for the DUT. |
| Shared infrastructure failure affecting many DUTs | Worker crash, USB hub failure, or service issue. | Multiple DUTs become unavailable simultaneously. | Failures are isolated to a single LAA and DUT. |
| Infrastructure maintenance overhead | Shared lab infrastructure requires ongoing maintenance and monitoring. | Lab engineers spend time maintaining infrastructure. | Infrastructure components are embedded into each LAA, reducing centralized maintenance. |
LAA features
| Feature | Classical LAVA worker | LAA |
|---|---|---|
| DUTs per worker | Multiple | 1 |
| OTA and rollback | Complex | Yes |
| DUT private network | No | Yes |
| DUT reproducibility | Complex | Yes |
| Worker reproducibility | Complex | Yes |
One DUT per LAA
By design, each LAA is connected to a single DUT.
This protects each DUT from interference by other DUTs in the same lab. For instance:
- concurrent tests no longer compete for CPU and memory
- non-unique USB serials are no longer a problem
- USB instability affects only one DUT at a time
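To illustrate the USB serial point: on a worker shared by several DUTs, a serial number can only identify a device if no other attached device reports the same value. The following is a minimal sketch of that check, not LAA or LAVA code; the function name, board names, and serials are all invented for illustration.

```python
from collections import defaultdict


def find_ambiguous_serials(devices):
    """Return the USB serials reported by more than one DUT.

    `devices` is an iterable of (dut_name, usb_serial) pairs.
    All names and serials used below are hypothetical.
    """
    by_serial = defaultdict(list)
    for dut_name, usb_serial in devices:
        by_serial[usb_serial].append(dut_name)
    # A serial is usable as an identifier only if exactly one DUT reports it.
    return {serial: duts for serial, duts in by_serial.items() if len(duts) > 1}


# Hypothetical shared worker with three boards attached.
shared_worker = [
    ("board-a", "0123456789ABCDEF"),
    ("board-b", "0123456789ABCDEF"),  # same default serial as board-a
    ("board-c", "7f3a91c2"),
]
ambiguous = find_ambiguous_serials(shared_worker)

# With one DUT per worker, a collision cannot occur.
single_dut = find_ambiguous_serials([("board-a", "0123456789ABCDEF")])
```

With a single DUT attached, the result is always empty, which is why the serial no longer needs to be unique.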
OTA and rollback
The LAA OS supports OTA updates with automatic rollback in case of boot failure. The system can also easily be rolled back to a previous version if a regression is detected.
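This page does not describe how the LAA OS implements rollback. As an illustration only, the sketch below models one common approach: two system slots (A/B) and a boot-attempt counter. `BootState`, `MAX_BOOT_ATTEMPTS`, and all function names are hypothetical and not taken from the LAA.

```python
from dataclasses import dataclass

MAX_BOOT_ATTEMPTS = 3  # assumed threshold, not an LAA setting


@dataclass
class BootState:
    active: str                          # slot booted by default, "A" or "B"
    previous: str                        # slot to fall back to
    attempts_left: int = MAX_BOOT_ATTEMPTS
    confirmed: bool = True               # False while an update is on trial


def install_update(state: BootState) -> BootState:
    """Write the update to the other slot and boot it on trial."""
    new_slot = "B" if state.active == "A" else "A"
    return BootState(new_slot, state.active, MAX_BOOT_ATTEMPTS, False)


def on_boot(state: BootState, boot_ok: bool) -> BootState:
    """Update the state after one boot attempt of the active slot."""
    if state.confirmed:
        return state
    if boot_ok:
        # Healthy boot: keep the new slot, remember the old one.
        return BootState(state.active, state.previous, MAX_BOOT_ATTEMPTS, True)
    remaining = state.attempts_left - 1
    if remaining > 0:
        return BootState(state.active, state.previous, remaining, False)
    # Out of attempts: automatic rollback to the last known-good slot.
    return BootState(state.previous, state.previous, MAX_BOOT_ATTEMPTS, True)


def rollback(state: BootState) -> BootState:
    """Manual rollback, for example after a regression is detected."""
    return BootState(state.previous, state.active, MAX_BOOT_ATTEMPTS, True)


# An update that never boots ends up back on slot A.
state = install_update(BootState(active="A", previous="A"))
for _ in range(MAX_BOOT_ATTEMPTS):
    state = on_boot(state, boot_ok=False)
```

The counter keeps a single transient boot failure from triggering a rollback, while a persistently broken update cannot leave the system unbootable.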
DUT Private network
The LAA and the DUT are connected together in a private network managed by the LAA. Network traffic is not routed outside of this private network and does not interfere with the lab network.
This protects the lab network from:
- non-unique or random MAC addresses, while always providing the same IP address to the DUT
- a buggy or rogue tested OS or user-space application
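The first point can be sketched as follows. Because only one DUT sits on the private network, an address allocator can ignore the client MAC entirely. This is an illustration, not the LAA's DHCP configuration: the subnet, addresses, and function names are assumptions.

```python
import ipaddress

# Assumed addressing plan for the private LAA-to-DUT link (illustrative only).
PRIVATE_NET = ipaddress.ip_network("192.168.42.0/24")
LAA_ADDRESS = PRIVATE_NET[1]  # 192.168.42.1
DUT_ADDRESS = PRIVATE_NET[2]  # 192.168.42.2


def offer_address(client_mac: str) -> ipaddress.IPv4Address:
    """Pick the address to offer to a DHCP client on the private network.

    Only one DUT is attached, so the client MAC is deliberately ignored:
    a random or duplicated MAC still receives the same address.
    """
    return DUT_ADDRESS


def stays_on_private_network(destination: str) -> bool:
    """Return True if `destination` lies inside the private network."""
    return ipaddress.ip_address(destination) in PRIVATE_NET


# Two boots with different random MACs get the same address.
first_boot = offer_address("02:00:00:aa:bb:cc")
second_boot = offer_address("de:ad:be:ef:00:01")
```

On a shared lab network the same DUT would look like a new client on every boot, and two boards with an identical MAC would conflict with each other.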
DUT reproducibility
DUTs of the same type run in exactly the same environment, including the IPv4 network provided by the DHCP server running on the LAA.
This provides a stable and reproducible environment for the test jobs.
Worker reproducibility
With an LAA, replacing a faulty worker with another one is quick and simple, which reduces the maintenance effort.