FAQ: Training Data as a Service (TDaaS)

When you create a new solution, in a new space, people naturally have questions. They want you to define the terms you’re using, explain how you compare to solutions they’re more familiar with, give examples of how the solution works, and so on.

We’ve been cheerily fielding these kinds of queries over the phone, online, and at events in our quest to spread the TDaaS™ word. It’s very fun, but uh, not very efficient—we realized we needed to outline all these answers in one skimmable place.

So here we go, answers to the most frequently asked questions of Mighty AI. Happy learning!

Are you a crowdsourcing provider?

No. Sorta. But no. We have a complete training data solution, which does include a form of crowdsourcing, but is far more than simply a connection to a crowd. The following paragraphs will explain further, but the short answer is: it’s more accurate to call us the world’s first Training Data as a Service™ provider than a crowdsourcing provider.

How are you different from Mechanical Turk?

Amazon’s Mechanical Turk, our neighbor down the street here in Seattle, is “a marketplace for work that requires human intelligence.” Comparing that to Mighty AI’s TDaaS solution, Mechanical Turk is similar to one part of our workflow: the part where humans label data. But that’s kinda it in terms of similarities.

We’re not a marketplace or a self-serve tool; we’re a managed service. To merge SAT taxonomy with geeky goodness, Mighty AI : mTurk :: Heroku : AWS. What Mighty AI provides that Mechanical Turk does not includes:

  • consultative services (annotation strategy, implementation, and optimization);
  • task design (UX, UI, and instructions);
  • project management;
  • recruitment, qualification, management, and payment of known taskers; and
  • QA built on our proprietary machine learning and workflows.

What do you mean by “as a service”?

When we say we provide “Training Data as a Service” we mean we have an end-to-end solution. Instead of labeling data in-house or using a crowdsourcing provider, you can lean on Mighty AI to take your training data duties completely off your plate. That includes everything from figuring out precisely what kind of annotations you need, to delivering the final annotated datasets to you to train your algorithm. Yep. That whole workflow is all Mighty AI.

Do I buy data from you?

We sell you annotations based on data that you provide. So, in short, yes, you give us unstructured data, and get back structured data in the form of annotations, aka “answers” or “insights.” If you don’t have the raw data you need to begin, let’s talk: we can likely get it for you.

Who are these people doing the tasks?

We call our community members “Fives,” and they’re a network of skilled specialists from all walks of life and from all over the globe. Our relationship with our users is unique in that we really get to know them: we know the demographics, interests, skills, qualifications, and on-platform behaviors of each individual. This helps us pair the right taskers with each task, so that not only are they doing tasks they enjoy, but the quality of their annotations is also far superior to that of a random crowd of “workers.” In addition, our community members trade questions, feedback, and mentorship with each other, and earn experience points and levels.

How do I send data to you? How do I receive it from you?

It’s pretty simple, and you have a couple of options:

  1. API integration allows you to post and collect your data in real-time. You send over your source data and we send back data once it has completed its workflow, including quality assurance.
  2. Alternatively, you can provide batches of data in .csv or other file formats and we can return your labeled data in the file format that you need.
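
To make the batch option a bit more concrete, here is a minimal Python sketch of a .csv round trip: writing out a batch of source items, then joining returned labels back onto them. The column names (`item_id`, `source_url`, `label`) and the sample rows are assumptions made up for illustration, not a Mighty AI file specification. The actual format is whatever you agree on for your project.

```python
import csv
import io

def build_batch_csv(items):
    """Serialize (item_id, source_url) pairs into CSV text for a batch hand-off."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["item_id", "source_url"])  # hypothetical column names
    writer.writerows(items)
    return buf.getvalue()

def merge_labels(batch_csv, labeled_csv):
    """Join returned labels back onto the original rows by item_id."""
    labels = {
        row["item_id"]: row["label"]
        for row in csv.DictReader(io.StringIO(labeled_csv))
    }
    merged = []
    for row in csv.DictReader(io.StringIO(batch_csv)):
        # Items with no returned label yet get None.
        merged.append({**row, "label": labels.get(row["item_id"])})
    return merged

# Hypothetical data: two source images out, two labels back.
batch = build_batch_csv([
    ("img-001", "https://example.com/a.jpg"),
    ("img-002", "https://example.com/b.jpg"),
])
returned = "item_id,label\nimg-001,pedestrian\nimg-002,cyclist\n"
rows = merge_labels(batch, returned)
```

Keeping a stable ID on every row is the main design choice here: it lets you match labels to source data no matter what order or file format the results come back in.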

How quickly can you turn around a dataset?

Our solution is scalable—our vast population of high-quality taskers combined with our meticulous and fully automated QA process allows us to generate very high volumes of accurate training data with fast turnaround to meet your data throughput needs. Just tell us what you need and we’ll make it happen.

How do you ensure quality?

If we told you we’d have to ki… err, it’s a combination of targeting, tactics, and technology.

  • Targeting: we identify taskers with the right mix of demographics, location, skill, domain knowledge, and proven performance for your needs.
  • Tactics: we start with a “pilot,” annotating a small subset of your data first, and adjusting based on learnings.
  • Technology: our proprietary interfaces, workflows, and quality control technology ensure the data we deliver back is accurate at the level you’ve requested.

Can I try it?

It’s not a plug-and-play solution—no two training data projects are the same. However, the pilot mentioned above is effectively a trial.

How much does it cost?

It depends on the following:

  • volume
  • type of task
  • quality & velocity requirements
  • level of skill and/or speciality needed from taskers
  • level of consultation needed

In general, you’ll find the total cost of ownership (TCO) favorable, given you’ll no longer have the expense of employees spending their valuable time labeling data, or of paying contractors or a crowdsourcing provider to do it (not to mention the time spent managing the processes and QAing the results). Even better, unlike traditional crowdsourcing or in-house editors, you pay Mighty AI by the answer, not by the task or the hour. (With traditional crowdsourcing, you often have to “spray and pray”: ask 5, 10, or even more taskers the same question, and then perform analysis to distill the signal from the noise.) This means you get reliable, high-quality human insights, with a predictable ROI. And your team members can go focus on what they do best.
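
For the curious, the “spray and pray” approach can be sketched in a few lines of Python. This is an illustration of the traditional crowdsourcing pattern described above, not Mighty AI’s QA method. The simple majority vote, the function names, and the 5-cent price are all assumptions picked for the example.

```python
from collections import Counter

def majority_vote(answers):
    """Return the most common answer and the share of taskers who gave it."""
    label, count = Counter(answers).most_common(1)[0]
    return label, count / len(answers)

def cost_per_question_cents(price_per_task_cents, redundancy):
    """Per-task pricing: every redundant tasker is paid, so cost scales with redundancy."""
    return price_per_task_cents * redundancy

# Hypothetical: the same question sent to five taskers at 5 cents per task.
answers = ["cat", "cat", "dog", "cat", "bird"]
label, agreement = majority_vote(answers)
total_cents = cost_per_question_cents(5, len(answers))
```

With these made-up numbers, one usable answer costs five tasks’ worth of payment, and only three of the five taskers agreed on it. That multiplier is the overhead that paying by the answer is meant to take off your hands.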

Who are your customers?

To name a few: IBM, Expedia, Getty Images, Sentient Technologies, NTT DoCoMo, Havas, GumGum. Check out our case studies here.

More questions? Let us know.

Note: Prior to January 10, 2017, Mighty AI was known as Spare5. While Spare5 remains the name of our consumer brand and application, we’ve relaunched our business-customer side as Mighty AI, which also serves as the parent company under which Spare5 now lives. Some posts have been updated with the new company name to reduce confusion.