Azure Search returning results for deleted blob resources


After some reading, I found that the only deletion detection policy currently supported by Azure Search is soft delete.

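To enable this for blob storage, you have to add a metadata property to each blob (e.g. IsDeleted) and set it to the marker value when the blob should be removed, so that the indexer's deletion detection policy can pick it up. The data source is defined with that policy like this: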

    PUT https://[service name].search.windows.net/datasources/blob-datasource?api-version=2016-09-01
    Content-Type: application/json
    api-key: [admin key]

    {
        "name" : "blob-datasource",
        "type" : "azureblob",
        "credentials" : { "connectionString" : "<your storage connection string>" },
        "container" : { "name" : "my-container", "query" : "my-folder" },
        "dataDeletionDetectionPolicy" : {
            "@odata.type" : "#Microsoft.Azure.Search.SoftDeleteColumnDeletionDetectionPolicy",
            "softDeleteColumnName" : "IsDeleted",
            "softDeleteMarkerValue" : "true"
        }
    }

Full details here

I'll need to do some testing to ensure that it is safe to update the metadata and then immediately delete the BLOB.
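
As a rough illustration of the marking step, the sketch below sets the soft-delete marker on a blob while preserving its existing metadata. It is written in Python and assumes a client object shaped like `BlobClient` from the `azure-storage-blob` v12 package (only `get_blob_properties()` and `set_blob_metadata()` are used). The helper name `mark_blob_deleted` is hypothetical, and the `IsDeleted` / `true` pair simply mirrors the data source definition above. It does not settle the open question of how soon the blob can be deleted after it has been marked.

```python
from typing import Any, Dict

SOFT_DELETE_KEY = "IsDeleted"   # must match softDeleteColumnName
SOFT_DELETE_VALUE = "true"      # must match softDeleteMarkerValue


def mark_blob_deleted(blob_client: Any) -> Dict[str, str]:
    """Set the soft-delete marker on a blob, keeping its other metadata.

    blob_client is assumed to behave like azure.storage.blob.BlobClient.
    Returns the metadata dictionary that was written.
    """
    # Setting blob metadata replaces the whole set, so merge with what exists.
    current = dict(blob_client.get_blob_properties().metadata or {})
    current[SOFT_DELETE_KEY] = SOFT_DELETE_VALUE
    blob_client.set_blob_metadata(metadata=current)
    return current


# Possible usage (connection string and blob name are placeholders):
# from azure.storage.blob import BlobClient
# client = BlobClient.from_connection_string(conn_str, "my-container", "my-folder/report.pdf")
# mark_blob_deleted(client)
```

The blob has to remain in the container until the indexer has run and seen the marker, which is why the timing of the real delete still needs testing.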


While Soft Delete is an option, the index that is being targeted by the indexer can also be directly modified if you so choose.

You can use the POST to index API detailed on this page to delete documents directly by their "key" field. An example is below:

    POST https://[service name].search.windows.net/indexes/[index name]/docs/index?api-version=[api-version]
    Content-Type: application/json
    api-key: [admin key]

    {
        "value": [
            {
                "@search.action": "delete",
                "key_field_name": "value"
            }
        ]
    }

Assuming you didn't use field mappings to modify the default "key" behavior of blob indexers, the documentation on this page says the key field will be the base64-encoded value of the metadata_storage_path property (again, refer to the previous link for details). Therefore, upon deleting the blob, you can write a trigger that POSTs the appropriate payload to the search index from which the documents should be deleted.
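
The payload such a trigger would send can be sketched as follows. This Python sketch is illustrative only: the helper names `encode_key` and `build_delete_payload` are hypothetical, and it assumes the key is URL-safe Base64 of the blob path with the trailing `=` padding stripped. The exact Base64 variant the indexer uses can differ with API version and field mapping settings, so compare the result against a key already stored in your index before relying on it.

```python
import base64
import json
from typing import Dict, List


def encode_key(storage_path: str) -> str:
    """URL-safe Base64 of the blob path, '=' padding removed (assumed variant)."""
    encoded = base64.urlsafe_b64encode(storage_path.encode("utf-8")).decode("ascii")
    return encoded.rstrip("=")


def build_delete_payload(key_field: str, storage_paths: List[str]) -> Dict[str, list]:
    """Build the body for POST /indexes/[index name]/docs/index with delete actions."""
    return {
        "value": [
            {"@search.action": "delete", key_field: encode_key(path)}
            for path in storage_paths
        ]
    }


if __name__ == "__main__":
    # The key field name "id" and the blob URL are placeholders.
    body = build_delete_payload(
        "id", ["https://myaccount.blob.core.windows.net/my-container/my-folder/report.pdf"]
    )
    print(json.dumps(body, indent=2))
```

The resulting JSON is what goes in the request body of the POST shown above, sent with the `api-key` header of an admin key.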


Here is a solution I implemented for removing blobs from an Azure Search data source.

  • Step 1: remove the document from blob storage
  • Step 2: remove the document from the Azure Search index

In the dictionary, each key is a container name and each value is the list of files to remove from that container.

Here is the code sample:

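    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Threading.Tasks;
    using Microsoft.Azure.Search;
    using Microsoft.Azure.Search.Models;
    using Microsoft.WindowsAzure.Storage.Blob; // or Microsoft.Azure.Storage.Blob, depending on the package in use

    // searchConfig is my own helper exposing a CloudBlobClient and a search index client.
    public async Task<bool> RemoveFilesAsync(Dictionary<string, List<string>> listOfFiles)
    {
        CloudBlobClient cloudBlobClient = searchConfig.CloudBlobClient;

        foreach (var container in listOfFiles)
        {
            List<string> fileIds = new List<string>();
            CloudBlobContainer stagingBlobContainer = cloudBlobClient.GetContainerReference(container.Key);

            foreach (var file in container.Value)
            {
                CloudBlockBlob stagingBlob = stagingBlobContainer.GetBlockBlobReference(file);

                // Look up the index document for this file so its key ("id") can be collected.
                var parameters = new SearchParameters()
                {
                    Select = new[] { "id", "fileName" }
                };
                var results = searchConfig.IndexClient.Documents.Search<Document>(file, parameters);
                var fileDetails = results.Results.FirstOrDefault(
                    p => string.Equals(p?.Document["fileName"]?.ToString(), file, StringComparison.OrdinalIgnoreCase));
                if (fileDetails != null)
                    fileIds.Add(fileDetails.Document["id"]?.ToString());

                // Step 1: delete the blob itself.
                await stagingBlob.DeleteAsync();
            }

            // Step 2: delete the matching documents from the search index.
            // Skip the request when no matching documents were found.
            if (fileIds.Count > 0)
            {
                var batch = IndexBatch.Delete("id", fileIds);
                await searchConfig.IndexClient.Documents.IndexWithHttpMessagesAsync(batch);
            }
        }

        return true;
    }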