The most powerful and flexible technology for detecting key ideas and trends, tailored to each user’s needs
Our concept extraction service is built on our NaturalExtractor technology. It detects and extracts concepts (“government”, “global warming”, “results from the last general elections”…) from any type of text, whether formal and carefully edited (news, legislation…) or colloquial (forums, blogs, chats, social media…).
“Concepts” are the minimal fragments of text that convey ideas (“service quality”) or objects (“mahogany table”). Effective concept extraction requires tools capable of linguistic analysis, such as those included in NaturalExtractor, in order to extract various types of concepts:
- Simple concepts (“checking account”, “twin brother”)
- Compound or nested concepts (“my brother’s checking account”)
- Combinations of the above (“account”, “checking account”, “checking account of the bank”, “bank”, “my brother’s bank”…)
The service can extract each type of concept according to the user’s needs.
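As a rough illustration of the difference between simple and compound concepts, the sketch below splits a phrase into its simple parts and optionally keeps the full compound phrase as well. It is not NaturalExtractor's implementation: the function name `concept_variants`, the `level` parameter and the small word lists are assumptions made for this example only.

```python
# Hypothetical closed-class word lists; a real system would rely on
# part-of-speech tagging rather than fixed sets.
PREPOSITIONS = {"of", "with", "in", "for"}
DETERMINERS = {"the", "a", "an", "my"}

def concept_variants(phrase, level="all"):
    """Return simple concepts, the compound concept, or both.

    `level` is an assumed switch: "simple", "compound" or "all".
    """
    chunks, current = [], []
    for word in phrase.split():
        lowered = word.lower()
        if lowered in PREPOSITIONS:
            # A preposition closes the current simple concept.
            if current:
                chunks.append(" ".join(current))
                current = []
        elif lowered in DETERMINERS:
            # Determiners are not part of the concept itself.
            continue
        else:
            current.append(word)
    if current:
        chunks.append(" ".join(current))

    # The whole phrase counts as a compound only if it nests several parts.
    compound = [phrase] if len(chunks) > 1 else []
    if level == "simple":
        return chunks
    if level == "compound":
        return compound
    return chunks + compound

print(concept_variants("checking account of the bank"))
# ['checking account', 'bank', 'checking account of the bank']
```

Passing `level="simple"` or `level="compound"` to this toy function returns only one kind of concept, mirroring the idea of choosing the extraction type per use case.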
This linguistic analysis works by treating prepositions (“basket with food”), determiners (“my sister’s house”), verbs (“the teacher taught English grammar”) and other grammatical categories as concept separators.
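The separator idea can be sketched in a few lines. The example below is a deliberately simplified illustration, not the way NaturalExtractor works internally: the `SEPARATORS` set and the `split_concepts` function are hypothetical, and a production system would identify grammatical categories with a part-of-speech tagger instead of a fixed word list.

```python
import re

# Hypothetical separator words, hard-coded for the example only.
SEPARATORS = {
    "with", "of", "in", "on", "for",   # prepositions
    "the", "a", "an", "my", "these",   # determiners
    "taught", "is", "has",             # verbs
}

def split_concepts(text):
    """Split text into candidate concepts at separator words."""
    concepts, current = [], []
    for token in re.findall(r"\w+", text):
        if token.lower() in SEPARATORS:
            # A separator ends the concept collected so far.
            if current:
                concepts.append(" ".join(current))
                current = []
        else:
            current.append(token)
    if current:
        concepts.append(" ".join(current))
    return concepts

print(split_concepts("basket with food"))
# ['basket', 'food']
print(split_concepts("the teacher taught English grammar"))
# ['teacher', 'English grammar']
```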
NaturalExtractor normalizes concepts by generating a standard form, so that all instances of the same concept (“the checking account”, “these checking accounts”, “one of my checking accounts”…) are handled consistently. Correct normalization of concepts is essential for services such as Categorization and for trend detection.
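A minimal sketch of what such a normalization could look like is shown below. It assumes that dropping leading determiners and quantifier words and naively singularizing the last word is enough, which holds for the three examples above but not for English in general. The `normalize` function and the `LEADING_WORDS` set are hypothetical names, not part of NaturalExtractor.

```python
import re
from collections import Counter

# Hypothetical list of words stripped from the start of a concept.
LEADING_WORDS = {"the", "these", "those", "this", "that",
                 "a", "an", "my", "one", "of"}

def normalize(concept):
    """Map surface variants of a concept to one standard form."""
    tokens = re.findall(r"\w+", concept.lower())
    # Drop leading determiners and quantifier words.
    while tokens and tokens[0] in LEADING_WORDS:
        tokens.pop(0)
    # Naive singularization of the last word: strip one trailing "s".
    if tokens and tokens[-1].endswith("s") and not tokens[-1].endswith("ss"):
        tokens[-1] = tokens[-1][:-1]
    return " ".join(tokens)

mentions = ["the checking account",
            "these checking accounts",
            "one of my checking accounts"]

# Counting normalized forms is the basis for simple trend detection.
print(Counter(normalize(m) for m in mentions))
# Counter({'checking account': 3})
```

Because all three mentions collapse to the same key, they count as one concept, which is what categorization and trend detection depend on.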
Concept recognition can be easily combined with the entity recognition service, allowing analyses that draw on both. For example, “the speech of the president of the United States” is a compound concept that can be divided into one concept, “speech of the president”, and one entity, “United States”.
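The division into concept and entity can be illustrated with a toy example. The sketch assumes the entity recognition step has already produced a list of known entities (here the hard-coded, hypothetical `KNOWN_ENTITIES`); the function `split_concept_and_entities` is likewise an invented name and says nothing about how the two services are actually integrated.

```python
# Hypothetical output of an entity recognition step.
KNOWN_ENTITIES = ["United States"]

# Function words trimmed from the edges of the remaining concept.
EDGE_WORDS = {"the", "of", "a", "an"}

def split_concept_and_entities(phrase):
    """Separate recognized entities from the rest of a compound concept."""
    entities = [e for e in KNOWN_ENTITIES if e in phrase]
    remainder = phrase
    for entity in entities:
        remainder = remainder.replace(entity, " ")
    tokens = remainder.split()
    # Trim dangling function words left at either edge.
    while tokens and tokens[0].lower() in EDGE_WORDS:
        tokens.pop(0)
    while tokens and tokens[-1].lower() in EDGE_WORDS:
        tokens.pop()
    return " ".join(tokens), entities

print(split_concept_and_entities(
    "the speech of the president of the United States"))
# ('speech of the president', ['United States'])
```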