Over the next several weeks, Arturo will share a three-part series that takes a deep dive into the strength of our labeling process. The series covers how our team, technology, and processes separate us from the competition and enable our on-demand processing, resulting in the most accurate and up-to-date property information available to property and casualty insurance carriers.
The importance of everything Arturo does can be traced back to its image labeling. While seemingly mundane, this nitty-gritty process sets the stage for current and future projects that use machine learning. “Good labels can be used to train our models,” Arturo Data Quality Manager Sergio Nieves said of the company’s property insights and predictive analytics. “It’s how we emphasize quality and efficiency.”
In this post, Sergio helps to outline the art of labeling images, from the decision to keep quality control in-house to the technology we elected to use. The results put Arturo ahead of the competition in a way that is hard to duplicate.
Incredible speed to market
“Our goal is always to do it once and do it right,” Sergio said, noting that speed to market depends almost entirely on the complexity of the project. “It might take [an outsourced labeling shop] a month to label 5,000 images with 20 labelers working on it, or they could do 8,000 images in one day.”
One of the biggest time-savers is having the data quality team in-house rather than creating an unnecessary back-and-forth process across time zones and language barriers. The team can review labels once, together and in real time, so the machine learning team understands why certain labels were created, such as shadowing from a solar panel or wire placement on a building, and can then train models based on that feedback.
While it makes financial and practical sense to keep the data quality team in-house, Arturo also works directly with offshore labelers in India and Australia, from as few as 24 to upward of 70 depending on the complexity of the project. “It’s repetitive work so they know how to do it,” Sergio said of enlisting help to expedite the process. The in-house data quality team, however, is based in Chicago for ease of interaction. During the labeling process, these team members also compile documentation and train the business process outsourcing (BPO) teams for future projects before passing images off to the machine learning team.
Labelbox, the platform that Arturo’s labeling and data quality teams use, enables the creation of polygons, segmentation maps, and data points based on text classifications for each project, which the machine learning team then uses to generate masks to overlay on images.
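To make the polygon-to-mask step concrete, here is a minimal sketch of how a labeled polygon could be turned into a binary mask. It is an illustration only, not Arturo's pipeline or Labelbox's export format: the polygon is assumed to be a plain list of (x, y) pixel coordinates, and the function names `point_in_polygon` and `polygon_to_mask` are hypothetical. It uses the standard even-odd ray casting test on each pixel center.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # assumed (x, y) pixel coordinates


def point_in_polygon(x: float, y: float, polygon: Sequence[Point]) -> bool:
    """Even-odd ray casting: count edge crossings of a ray going right from (x, y)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]  # wrap around to close the polygon
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge meets that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def polygon_to_mask(polygon: Sequence[Point], width: int, height: int) -> List[List[int]]:
    """Return a height x width grid with 1 where the pixel center lies in the polygon."""
    mask = [[0] * width for _ in range(height)]
    for row in range(height):
        for col in range(width):
            # Test the center of each pixel rather than its corner.
            if point_in_polygon(col + 0.5, row + 0.5, polygon):
                mask[row][col] = 1
    return mask
```

A production pipeline would more likely rely on an imaging library to rasterize polygons, since testing every pixel is slow for large images; the sketch only shows the idea of overlaying a polygon label as a mask.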
This process of model-assisted labeling, which makes the assets in computer-generated predictions editable, allows for quality control as BPOs feed images to a model. Each image might have hundreds of polygons in it, according to Sergio, so it expedites the process to look at what the model did and tweak it, delete or correct a polygon, or label something that got missed, all in real time. “This only helps improve our documentation going forward,” Sergio said.
It’s this combination of keeping the data quality team in-house, using a top-of-the-line labeling platform, and maintaining a deep archive of quality labeled data that puts Arturo ahead of the competition. “Our team is constantly learning as we grow and pick up new skills,” Sergio said. “We’re a tight-knit group who is doing a lot more and can have a quicker turnaround time.”
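The review step described here can be sketched as a small data transformation: start from the model's predicted polygons, then apply a reviewer's corrections, deletions, and additions. This is a hypothetical illustration, not Labelbox's API or Arturo's tooling; the `Annotation` class, its fields, and `apply_review` are all assumed names.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Sequence, Set, Tuple

Point = Tuple[float, float]  # assumed (x, y) pixel coordinates


@dataclass(frozen=True)
class Annotation:
    ann_id: str                 # hypothetical identifier for one polygon
    label: str                  # class name, e.g. "solar_panel"
    polygon: Tuple[Point, ...]  # polygon vertices
    source: str                 # "model" for predictions, "human" after review


def apply_review(
    predictions: Sequence[Annotation],
    corrections: Dict[str, Sequence[Point]],
    deletions: Set[str],
    additions: Sequence[Annotation],
) -> List[Annotation]:
    """Apply a reviewer's edits to a set of model-predicted annotations."""
    reviewed: List[Annotation] = []
    for ann in predictions:
        if ann.ann_id in deletions:
            continue  # reviewer rejected this predicted polygon
        if ann.ann_id in corrections:
            # Reviewer adjusted the outline; record that a human touched it.
            ann = replace(ann, polygon=tuple(corrections[ann.ann_id]), source="human")
        reviewed.append(ann)
    # Objects the model missed are labeled from scratch by the reviewer.
    reviewed.extend(additions)
    return reviewed
```

Keeping a `source` field on each annotation is one way to track how much of the final label set came from the model versus a reviewer, which is useful when judging how far to trust the model's predictions on the next batch.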
To learn more about Arturo’s technology and how it is brought to life in solutions for insurance carriers, please request a demo.