Extracting structured information from unstructured text is one of the main selling points of Natural Language Processing (NLP) systems. The task of finding spans of free text and classifying them into structured “entities” is common enough to have its own name: Named Entity Recognition (NER).
There are all kinds of entities that people want to extract from free text. One of the most common entity types is organizations, which are references to organized groups of people. For example, the sentence “Matt works at Apple as an ML engineer.” contains the organization entity “Apple” as well as the person entity “Matt”.
There are plenty of reasons why people would need to quickly extract organizations from free text. Historians and other researchers can extract organizations from documents to speed up work on their research topic. Any person or company can extract organizations from incoming news articles to quickly bucket and sort through the firehose of news. Many people can benefit from organization extraction technology, but up until now entity extraction systems have been difficult to set up and use.
Forefront Extract
To make organization extraction as simple as an API call, we created Forefront Extract, an easy-to-use named entity extraction API. Using the API to extract organizations from text is simple and requires only two inputs:
- text — a piece of text that can be any length
- entities — A list of entity types that you are looking for. There are many entity options and the one we will highlight here is “organizations” — references to groups of people
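As a rough illustration of those two inputs, the sketch below assembles them into a request payload. Only the field names `text` and `entities` come from the description above; the helper name `build_extract_request` and the JSON shape are assumptions made for this example, and the endpoint URL and authentication are deliberately left out because they are not given here (see the Extract API documentation for the real request format).

```python
import json


def build_extract_request(text, entities):
    """Bundle the two inputs into a JSON-serializable dict.

    Hypothetical helper: the payload shape is an assumption,
    not the documented Forefront Extract request format.
    """
    if not isinstance(text, str) or not text.strip():
        raise ValueError("text must be a non-empty string")
    if not entities:
        raise ValueError("entities must list at least one entity type")
    return {"text": text, "entities": list(entities)}


# The example sentence from earlier in this post.
payload = build_extract_request(
    "Matt works at Apple as an ML engineer.",
    ["organizations"],
)
print(json.dumps(payload, indent=2))
```

Validating the inputs before sending anything keeps obvious mistakes, such as an empty entity list, from turning into a wasted API call.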
Let’s see the Extract API in action for the examples we outlined earlier.
Example 1: Extracting Historical Context
Forefront Extract can help historians by automatically extracting organizations from documents so they can focus on the most relevant pieces of text. Let’s see which organizations Forefront Extract recognizes in the Chinon Parchment, a document from 1308 about the Knights Templar.
With Forefront Extract, researchers can quickly categorize documents, historical or otherwise, based on the organizations they mention.
Example 2: Structuring the news
The constant influx of headline-worthy news can be overwhelming, but with Forefront Extract we can extract the organizations mentioned in news headlines and articles and use them to decide which news is relevant. For example, if we want to pick out news about the Target corporation, we can run Forefront Extract on two headlines: one that is about Target, and one that a plain keyword search would have confused with it.
We can see that the Extract API recognizes that the first headline is about the organization “Target”, whereas the second headline merely uses the capitalized word in connection with the F.B.I.
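To make the contrast with keyword search concrete, here is a minimal sketch of filtering on extracted organizations instead of raw text. The two headlines are invented for illustration, and the `extracted` dictionaries are hand-written stand-ins for what an entity extractor might return, not real Extract API output; the response shape (a dict with an `organizations` list) is likewise an assumption.

```python
def keyword_match(headline, keyword):
    """Naive filter: case-insensitive substring search."""
    return keyword.lower() in headline.lower()


def mentions_organization(extracted, name):
    """Entity-based filter: is `name` among the extracted organizations?"""
    orgs = extracted.get("organizations", [])
    return name.lower() in (org.lower() for org in orgs)


# Invented headlines with hand-written (hypothetical) extraction results.
articles = [
    {
        "headline": "Target reports higher quarterly sales",
        "extracted": {"organizations": ["Target"]},
    },
    {
        "headline": "Senator says he was the Target of an F.B.I. inquiry",
        "extracted": {"organizations": ["F.B.I."]},
    },
]

keyword_hits = [
    a["headline"] for a in articles if keyword_match(a["headline"], "Target")
]
entity_hits = [
    a["headline"]
    for a in articles
    if mentions_organization(a["extracted"], "Target")
]

print("keyword search:", keyword_hits)
print("entity filter: ", entity_hits)
```

Under these assumptions the keyword search keeps both headlines, while the entity filter keeps only the one where “Target” was extracted as an organization.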
Get started with Forefront Extract
Our examples show that Forefront Extract can extract organizations from very different kinds of text, from a medieval document to modern news headlines, and organizations are only one of the entity types it supports. Documentation for the Extract API can be found here.
Ready to sign up and get started? Sign up for Forefront today!
Originally published at https://www.forefront.ai.