AZURE SEARCH / COGNITIVE SEARCH

CODE

LEARN

LUCENE

DEVOPS

PERFORMANCE

SAMPLES

SECURE

SHOWCASE

SKILLS

VOLUME

On processing large numbers of documents (via Luis Calado de Sousa, 08/2018)

It is possible to process large numbers of documents using Cognitive Search, but today it requires a few considerations. Features are in progress to make this easier; in the meantime you can do it by following this pattern.

  1. Partition your content into multiple folders in blob storage so it can be indexed in parallel
  2. Create multiple indexers and data sources, each pointing to one of the folders in the container
  3. Set the indexers on a schedule so they resume automatically after they hit the runtime limit
  4. Important: create a custom skill that writes the OCR output and other enriched data to blob storage as JSON (or to another store), so that if you need to rebuild your index you don't have to re-run OCR on all of the documents. You can instead use the blob indexer to read just the saved JSON.

Custom skill example: https://docs.microsoft.com/en-us/azure/search/cognitive-search-create-custom-skill-example
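The fan-out pattern above (one data source and one scheduled indexer per partition folder) can be sketched against the Azure Search REST API. This is a minimal sketch; the service name, API key, connection string, container, folder names, index name, and skillset name are all placeholders you would substitute with your own, and the API version is an assumption.

```python
import json
import urllib.request

API_VERSION = "2019-05-06"  # assumption: use the version your service supports

def datasource_definition(name, connection_string, container, folder):
    # One data source per folder; "query" scopes the data source
    # to a virtual-folder prefix inside the blob container
    return {
        "name": name,
        "type": "azureblob",
        "credentials": {"connectionString": connection_string},
        "container": {"name": container, "query": folder},
    }

def indexer_definition(name, datasource_name, index_name, skillset_name):
    # A schedule lets the indexer resume automatically
    # after it hits the per-run execution limit
    return {
        "name": name,
        "dataSourceName": datasource_name,
        "targetIndexName": index_name,
        "skillsetName": skillset_name,
        "schedule": {"interval": "PT2H"},
    }

def put(service, api_key, resource, definition):
    # PUT creates or updates the named resource ("datasources", "indexers", ...)
    url = (f"https://{service}.search.windows.net/{resource}/"
           f"{definition['name']}?api-version={API_VERSION}")
    req = urllib.request.Request(
        url,
        data=json.dumps(definition).encode(),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="PUT",
    )
    return urllib.request.urlopen(req)

# One data source + indexer pair per partition folder
folders = ["part-0", "part-1", "part-2"]
definitions = [
    (datasource_definition(f"ds-{f}", "<connection-string>", "docs", f),
     indexer_definition(f"ix-{f}", f"ds-{f}", "docs-index", "docs-skillset"))
    for f in folders
]
# for ds, ix in definitions:
#     put("<service>", "<api-key>", "datasources", ds)
#     put("<service>", "<api-key>", "indexers", ix)
```

Each indexer then processes only its own folder, so the partitions run in parallel and each stays under the runtime limit.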

MISC

MISC FAQ

CAN I FIND OUT WHO SEARCHED WHAT IN AZURE SEARCH

Azure Search does not track or store customer-related information such as this, but you can add that capability with custom code. Mechanisms such as Application Insights can be used, with the client uploading the telemetry; the app service hosting your search client can also log the information, to App Insights or another engine. One technique is to add an extra parameter to each search request (e.g. search=foo&userid=123): the service ignores it, but it appears in the logged request, tying the query to the user.
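The extra-parameter technique above can be sketched as a small URL builder on the client side. This is a hypothetical helper, not an Azure Search API; the userid parameter name and the api-version are arbitrary assumptions, and the resulting URL is what your logging pipeline (e.g. App Insights request telemetry) would capture.

```python
from urllib.parse import urlencode

def build_search_url(service, index, query, user_id, api_version="2019-05-06"):
    # Append a user identifier that the search service ignores,
    # but which shows up in request logs / telemetry
    params = urlencode({
        "api-version": api_version,
        "search": query,
        "userid": user_id,  # extra parameter carried purely for logging
    })
    return (f"https://{service}.search.windows.net/indexes/"
            f"{index}/docs?{params}")

url = build_search_url("myservice", "docs-index", "foo", "123")
```

Mining the logged URLs for the userid parameter then answers "who searched what" without the service itself storing any user data.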