Technology, Human-Annotated Datasets,


IT - Software Services










Company Profile

Appen is a global leader in the development of high-quality, human-annotated datasets for machine learning and artificial intelligence.

Appen brings over 20 years of experience capturing and enriching a wide variety of data types including speech, text, image and video.

We have deep expertise in more than 180 languages and dialects, and access to a global crowd of over 400,000 skilled contractors.

Appen partners with leading technology, automotive and eCommerce companies — as well as governments worldwide — to help them develop, enhance and use products that rely on natural languages and machine learning.


Ads Evaluation

Our team of evaluators uses proven systems to ensure that paid advertisements are targeted appropriately during an online search. The outcome is more clicks per ad, an enhanced user experience and increased user satisfaction. Additionally, Appen enables clients to expand search advertising into global markets with the confidence of knowing that advertisements have been evaluated, sorted and ranked appropriately by in-market evaluators, which increases the opportunity for revenue.

Whole Page Evaluation

Appen assists its clients in improving the search experience by providing enhanced machine learning training data that generates accurate search results. An analysis of the whole page covers multiple page components, including answers, captions, related searches, and entity information. Appen’s evaluators determine user intent—as well as the accuracy of the presented results and the user’s satisfaction with them—to assess the user experience. Additionally, with processes and tools in place, Appen offers identification of incorrect content in addition to editorial fixes, which further enhances the user experience.

Translation and Localization

From translating product descriptions and user reviews to localizing speech interfaces for virtual assistants and chatbots, we can help you scale to new markets more quickly and efficiently. Take advantage of Appen’s vast experience, gained from helping build the Microsoft and Skype Translator engines, to customize the ideal machine translation engine for you. The key to translating ever-increasing amounts of content is automation. But machine translation engines need a lot of data, and even more as we move into the brave new world of neural machine translation. Appen has vast experience in collecting data, and can create corpora in machine-readable or natural language, text or speech. Or you can license an existing database from our extensive catalog.

Autocorrect and Spell Check Evaluation

Our agile, multilingual teams of search evaluators review corrected or completed words your search engine provides, compare them to the misspelled or incomplete words the user entered, and determine if the search engine provided the correct result. This feedback is then used to train the machine learning algorithm to automatically correct or complete query terms.

Field Testing

With access to a crowd of over 400,000 people in over 130 countries, Appen has the resources to provide local field testers to augment your team. Our experienced project managers work with you to determine your testing parameters and to develop the testing program. We then work closely with you throughout the project to ensure that you have the data you need to feel confident that your solution has been thoroughly tested prior to its release.

Linguistic Annotation

Our expert linguists provide a wide variety of services that help train your machine learning model to better understand language. Services include:

– part-of-speech tagging
– named entity tagging
– treebanking
– semantic roles and relations

Our skilled team can ramp up quickly to ensure that your data is annotated within your desired timeframe in over 180 languages and dialects.

Consultative Services

Appen provides a customized approach where we thoroughly evaluate your needs and develop a program to address your specific business objectives. Our services include developing user guidelines, identifying the correct platform for your experiment, assisting in the creation and development of the user interface and identifying, measuring and reporting on key quality metrics. Our clients often say that as a result of our approach, the Appen team ends up becoming an extension of their project team, creating a close partnership and stronger results.


Our skilled evaluators provide detailed feedback on shopping, social media and search results through their own user profiles, providing you with key insights to help target content to specific demographic segments. This helps you serve up more of the content your users do want to see, and less of the content they are not interested in. With access to evaluators in over 130 countries, we can help improve your content for users worldwide.

Linguistic Rule Development

Our expert linguists work with your team to understand your objectives and help you develop linguistic rules and grammars that support the needs of your target users.


When selecting a crowdsourcing partner, it’s important to consider the type of crowd offered and the skillsets available within that crowd. Appen works with known individuals around the world who become experts in a given task, be it search relevance evaluation, social media evaluation or other human annotation tasks. And should you need a crowd that is skilled in a particular industry, we can curate a crowd to meet your needs. Our seasoned project managers use their collective experience in working with leading technology companies to develop guidelines that fit your specific needs, and ensure that data quality goals are met. With this approach, we produce higher quality outcomes for our clients.

Data Annotation

Appen creates high-quality, human-annotated datasets to train machine learning algorithms to mimic human thought. Annotated data enables richer, more valuable, and more directly usable applications. Transcribed speech, part-of-speech lexicons and other annotated data are also available as part of our off-the-shelf resources.

Spam Detection

Appen uses local expert annotators – fluent in the market’s language – to provide the most accurate assessments of the presence of spam techniques and junk pages. Assessments are based on a continuum, which enables clients to determine for themselves the acceptable levels of certain techniques. Some of the common types of SEO techniques that we identify are keyword and URL stuffing, machine-generated content, link manipulation, hidden text, cloaking and stolen content. Evaluators include those with SEO and webmaster backgrounds, and all of our spam-detection team members are web- and tech-savvy. Appen’s evaluators help to identify spam content and develop training data to help improve automated spam detection programs.

Text Data Collection

Our experts provide data collection in any domain that you specify, such as business listings, music titles, artist names, abbreviations and acronyms, food, transportation, computing, or geographical locations. We have the capability to collect a wide variety of natural language text data, from a range of user demographics and domains. This data can then be used in the development of web or application user interfaces, prompts and grammar specifications for voice-interactive devices or automated phone systems, domain-specific lexica and specialty word lists.

Content Moderation

Whether you are in need of content moderation for a specific marketing campaign, or ongoing services to train your machine learning algorithm to recognize acceptable vs unacceptable content, Appen can meet your needs. With our global curated crowd, we have access to skilled individuals who are experienced in content moderation to ensure your brand is protected in all of your target markets. Talk to us about your specific user-generated content and we’ll design a program to meet your needs.

Geo-Local Evaluation

We offer in-market evaluators who verify the accuracy of business information and points of interest on maps to ensure you provide customers with the most up-to-date data. Evaluators may check to make sure that the pin on the map is in the right place, or may use satellite images to identify buildings and other notable landmarks. Using local, in-market resources ensures higher levels of accuracy as they can physically verify your data. It also helps to scale your team more quickly and efficiently, accelerating your time to market.

Entity Verification

Appen offers in-market evaluators in over 130 countries who validate the accuracy of the local business data and correct the provided data when necessary. Some projects require calling businesses to verify the data, while other projects rely solely on web research validation. Using in-market evaluators increases the likelihood of validation through phone verification because they speak the native language. Additionally, in-market evaluators provide the local knowledge and expertise to determine whether the businesses in question have closed, moved, or changed names.