The Azure Cognitive Search service lets you quickly discover, enrich, and explore your data. It can automatically pull data from Azure data stores such as Blob Storage, Azure SQL, Cosmos DB, and more. Supported sources include databases (Azure Cosmos DB, Azure SQL Database, SQL Server hosted in an Azure VM) as well as storage services (Azure Blob Storage, Azure Table Storage). The service is also flexible enough that, if your data lives in other cloud storage locations, you can optionally push it directly into the search index using the push API.
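As a rough illustration of the push model, the sketch below builds the URL and JSON body for a document-indexing request without sending it. The service name, index name, document fields, and API version are assumptions made for the example, not values from this article; a real call would be an HTTP POST that also carries an admin `api-key` header.

```python
import json

# Hypothetical names for illustration; substitute your own service and index.
SERVICE_NAME = "my-search-service"
INDEX_NAME = "product-manuals"
API_VERSION = "2020-06-30"  # assumed; use the version your service supports


def build_push_request(documents, action="upload"):
    """Return (url, body) for a push-API indexing request.

    Each document is tagged with an "@search.action" that tells the
    service whether to upload, merge, mergeOrUpload, or delete it.
    """
    allowed = {"upload", "merge", "mergeOrUpload", "delete"}
    if action not in allowed:
        raise ValueError(f"unsupported action: {action}")
    url = (
        f"https://{SERVICE_NAME}.search.windows.net"
        f"/indexes/{INDEX_NAME}/docs/index?api-version={API_VERSION}"
    )
    body = {"value": [{"@search.action": action, **doc} for doc in documents]}
    return url, json.dumps(body)


if __name__ == "__main__":
    # The fields "id", "title", and "material" are made up for this sketch.
    url, body = build_push_request(
        [{"id": "1", "title": "Pump X200 manual", "material": "stainless steel"}]
    )
    print(url)
    print(body)
```

Keeping request construction separate from the network call makes the payload easy to inspect before anything is written to the index.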
Searchable metadata can be further enriched with the prebuilt cognitive skills integrated into the Azure Cognitive Search platform. These skills let you explore multimedia sources by extracting key phrases, generating image tags, detecting the language of a document, and more. Different file formats are supported, including PDF, Word documents, images, and JSON files. For example, you can add a language detection skill to your enrichment pipeline to help return relevant search results.
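To make the idea of prebuilt skills more concrete, here is a minimal sketch of a skillset definition that chains language detection into key phrase extraction. The skillset name and the `targetName` values are assumptions chosen for the example, and the exact set of inputs and outputs should be checked against the built-in skill reference for the API version you use.

```python
import json


def build_skillset(name):
    """Return a skillset definition with language detection and key phrases."""
    language_skill = {
        "@odata.type": "#Microsoft.Skills.Text.LanguageDetectionSkill",
        "context": "/document",
        "inputs": [{"name": "text", "source": "/document/content"}],
        # "language" is a target name chosen for this sketch.
        "outputs": [{"name": "languageCode", "targetName": "language"}],
    }
    key_phrase_skill = {
        "@odata.type": "#Microsoft.Skills.Text.KeyPhraseExtractionSkill",
        "context": "/document",
        "inputs": [
            {"name": "text", "source": "/document/content"},
            # Feed the detected language into key phrase extraction.
            {"name": "languageCode", "source": "/document/language"},
        ],
        "outputs": [{"name": "keyPhrases", "targetName": "keyPhrases"}],
    }
    return {
        "name": name,
        "description": "Detect language, then extract key phrases",
        "skills": [language_skill, key_phrase_skill],
    }


if __name__ == "__main__":
    # "manuals-skillset" is a hypothetical name.
    print(json.dumps(build_skillset("manuals-skillset"), indent=2))
```

The second skill reads the first skill's output from the enrichment tree (`/document/language`), which is how skills are composed in a pipeline.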
In addition to the out-of-the-box skills, you can define custom skills that integrate your own machine learning model. For example, for a B2B portal in manufacturing, the materials used in a specific product could be extracted from vendor documentation and returned in response to a search query.
All of these capabilities combined give you a powerful search experience. Imagine you use Azure Cognitive Search to develop a search application in which a product search should return the technical manuals associated with that product. As documents are ingested into the Azure platform, they are processed and categorized according to the fields defined in the search index.
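The product-manual scenario above could be queried roughly as follows. This sketch only builds the search request; the service name, index name, API version, and the index fields `title`, `keyPhrases`, and `manualUrl` are all hypothetical, and a real call would be an HTTP POST with a query `api-key` header.

```python
import json

# Hypothetical names for illustration only.
SERVICE_NAME = "my-search-service"
INDEX_NAME = "product-manuals"
API_VERSION = "2020-06-30"  # assumed; use the version your service supports


def build_search_request(product_name, top=5):
    """Return (url, body) for a full-text search over the manuals index."""
    if top < 1:
        raise ValueError("top must be a positive integer")
    url = (
        f"https://{SERVICE_NAME}.search.windows.net"
        f"/indexes/{INDEX_NAME}/docs/search?api-version={API_VERSION}"
    )
    body = {
        "search": product_name,
        # Match against the title and the enriched key phrases.
        "searchFields": "title,keyPhrases",
        # Return only the fields the application needs.
        "select": "title,manualUrl",
        "top": top,
    }
    return url, json.dumps(body)


if __name__ == "__main__":
    url, body = build_search_request("Pump X200")
    print(url)
    print(body)
```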
Azure’s out-of-the-box cognitive skills, such as key phrase extraction and named entity recognition, generate a rich corpus of metadata. Beyond the out-of-the-box skills, you can customize the cognitive search pipeline to extract and enrich metadata specific to your business. This process can be done programmatically, as the Azure Cognitive Search service is well integrated with Azure databases.
If you don't need an ML model, deploying an Azure Function is the quickest and easiest way to create a custom skill. Creating tags for domain-specific terms, such as product materials, is one example of what such a function can do. Once the function is deployed, add it to a skillset and publish the skillset so that the new skill is programmatically connected to the document enrichment pipeline.
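The core of such a custom skill can be sketched as a plain function that accepts the custom-skill request body (a `values` list of records, each with a `recordId` and `data`) and returns a response in the same shape. The material vocabulary, the `text` input name, and the `materials` output name are assumptions for illustration; the HTTP wrapper an Azure Function would provide is deliberately left out.

```python
# Hypothetical domain vocabulary; a real skill would load its own term list.
KNOWN_MATERIALS = ("stainless steel", "aluminum", "polycarbonate", "copper")


def tag_materials(text):
    """Return the known materials mentioned in text, in vocabulary order."""
    lowered = text.lower()
    return [material for material in KNOWN_MATERIALS if material in lowered]


def run_custom_skill(request_body):
    """Process a custom-skill request and build the matching response.

    Every input record is answered with a record carrying the same
    recordId, so the pipeline can match results back to documents.
    """
    results = []
    for record in request_body.get("values", []):
        text = record.get("data", {}).get("text")
        if isinstance(text, str):
            data, errors = {"materials": tag_materials(text)}, []
        else:
            data, errors = {}, [{"message": "Missing 'text' input."}]
        results.append(
            {
                "recordId": record["recordId"],
                "data": data,
                "errors": errors,
                "warnings": [],
            }
        )
    return {"values": results}


if __name__ == "__main__":
    request = {
        "values": [
            {
                "recordId": "r1",
                "data": {"text": "Housing: stainless steel with copper fittings."},
            }
        ]
    }
    print(run_custom_skill(request))
```

For the sample request, the `materials` list for record `r1` is `['stainless steel', 'copper']`. Reporting a per-record error instead of raising keeps one bad document from failing the whole batch.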
For files uploaded to Blob Storage, you can use OCR (Optical Character Recognition) to recognize both printed and handwritten text, although handwriting is so far supported only for English. With the help of Cognitive Services, it is also possible to identify various objects in a photo, for example famous places or celebrities.
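The image-related skills described above might be declared along these lines. This is a sketch under assumptions: the `targetName` values are made up, the indexer is assumed to be configured to generate normalized images so that `/document/normalized_images/*` exists, and the availability of the `celebrities` and `landmarks` details should be verified against the current skill reference.

```python
import json


def build_image_skills():
    """Return OCR and image analysis skill definitions for a skillset."""
    image_input = [{"name": "image", "source": "/document/normalized_images/*"}]
    ocr_skill = {
        "@odata.type": "#Microsoft.Skills.Vision.OcrSkill",
        "context": "/document/normalized_images/*",
        "defaultLanguageCode": "en",
        "detectOrientation": True,
        "inputs": image_input,
        # "ocrText" is a target name chosen for this sketch.
        "outputs": [{"name": "text", "targetName": "ocrText"}],
    }
    image_analysis_skill = {
        "@odata.type": "#Microsoft.Skills.Vision.ImageAnalysisSkill",
        "context": "/document/normalized_images/*",
        "visualFeatures": ["tags", "categories"],
        # Domain-specific details: famous places and well-known people.
        "details": ["landmarks", "celebrities"],
        "inputs": image_input,
        "outputs": [
            {"name": "tags", "targetName": "imageTags"},
            {"name": "categories", "targetName": "imageCategories"},
        ],
    }
    return [ocr_skill, image_analysis_skill]


if __name__ == "__main__":
    print(json.dumps(build_image_skills(), indent=2))
```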
To summarize, consider using built-in cognitive skills if your original content consists of unstructured text, images, or content that requires language detection and translation. Consider adding a custom skill if you have open-source, third-party, or your own code that you want to integrate into the search pipeline. Typical examples are classification models that identify the characteristics of different types of documents.