International Text-Analytics Company
Initial situation
Every day, every large company generates vast amounts of text, from Word documents and emails to short Slack and WhatsApp messages exchanged among employees. Much of this data typically disappears into the dark corners of file repositories. Yet there are situations in which a company needs a complete overview of all its documents and the potentially sensitive information they contain:
- In the event of a security incident or other data breach, a company needs to know immediately whose personal information was affected, even if the original files are no longer accessible.
- Under the GDPR (known as the DSGVO in German-speaking countries), any person can ask at any time what personal data a company has stored about them. Companies must be able to respond promptly, and ideally at low cost.
- In a legal dispute, some jurisdictions require that electronic evidence can be reliably produced. A company in the U.S., for example, may be required to make all documents relating to a matter available.
Our contribution
Under our leadership, a unique text analysis solution was developed for a multinational market.
This solution can import, analyze, aggregate, and return millions of documents (e.g., the email traffic of several companies) via a user interface and an API. A range of techniques is used to make searching this data as efficient as possible:
- Automatic grouping of email conversations (email threads)
- Grouping of duplicates and similar documents
- Compilation of topically related documents (clustering)
- Extraction of entities and pivoting of data based on the extracted information
- Self-training classification into relevant and non-relevant data (Continuous Active Learning)
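To give a rough idea of what one of these techniques involves, the grouping of duplicates and similar documents can be sketched as follows. This is a minimal illustration and not the product's actual implementation: the word-shingle representation, the Jaccard similarity measure, the 0.5 threshold, and all function names are assumptions chosen for the example. A system handling millions of documents would also need an approximate method rather than the pairwise comparison shown here.

```python
from __future__ import annotations

from itertools import combinations


def shingles(text: str, k: int = 3) -> set[tuple[str, ...]]:
    """Return the set of k-word shingles of a text (lowercased)."""
    words = text.lower().split()
    if len(words) < k:
        # Very short texts are represented by a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def group_near_duplicates(
    docs: dict[str, str], threshold: float = 0.5
) -> list[set[str]]:
    """Group document ids whose shingle sets are similar (hypothetical helper).

    Documents are linked when their Jaccard similarity reaches the
    threshold; linked documents end up in the same group (union-find).
    """
    parent = {doc_id: doc_id for doc_id in docs}

    def find(x: str) -> str:
        # Follow parent links to the group representative, compressing paths.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    sigs = {doc_id: shingles(text) for doc_id, text in docs.items()}

    # Pairwise comparison is quadratic; acceptable only for small samples.
    for a, b in combinations(docs, 2):
        if jaccard(sigs[a], sigs[b]) >= threshold:
            parent[find(a)] = find(b)

    groups: dict[str, set[str]] = {}
    for doc_id in docs:
        groups.setdefault(find(doc_id), set()).add(doc_id)
    return list(groups.values())
```

Called with a dictionary that maps document ids to their text, `group_near_duplicates` returns one set of ids per group, so an email and a lightly edited copy of it would be reviewed together rather than separately.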
In addition, targeted marketing increased awareness of the emerging brand to such an extent that an IPO could take place in 2020.