Azure Document Intelligence big files issue #31025

Open
PiratesKing13 opened this issue Sep 9, 2024 · 1 comment
@PiratesKing13

I am using Azure Document Intelligence to read (OCR) my PDF files. My code works with files under 100 pages and 200 MB in size, but when I go past that, I get this error in my code:

Unexpected error: RestError: Error reading response as text: aborted
{
"name": "RestError",
"code": "PARSE_ERROR",
"message": "Error reading response as text: aborted"
}

I have also checked the Document Intelligence limits for my subscription tier: it supports files up to 500 MB and 2,000 pages, and my files are within those limits.
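A possible workaround while this is open (an assumption, not something confirmed by the maintainers) is to analyze the document in smaller page ranges so that each response body stays small. The sketch below only builds the range strings. `buildPageRanges` is a hypothetical helper, and passing each string through the `pages` option of `beginAnalyzeDocumentFromUrl` is assumed to be supported by the SDK version in use.

```typescript
// Hypothetical helper: split a document of `totalPages` pages into
// 1-based, inclusive page-range strings such as "1-50".
function buildPageRanges(totalPages: number, chunkSize: number): string[] {
  if (!Number.isInteger(totalPages) || totalPages < 1) {
    throw new RangeError('totalPages must be a positive integer');
  }
  if (!Number.isInteger(chunkSize) || chunkSize < 1) {
    throw new RangeError('chunkSize must be a positive integer');
  }

  const ranges: string[] = [];
  for (let start = 1; start <= totalPages; start += chunkSize) {
    const end = Math.min(start + chunkSize - 1, totalPages);
    // A single trailing page is written as "101" rather than "101-101".
    ranges.push(start === end ? `${start}` : `${start}-${end}`);
  }
  return ranges;
}
```

Each range could then be analyzed in its own request and the results merged afterwards. Whether that avoids the aborted response is untested here.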

I am using node version 20.16.0
Windows 10
@azure/ai-form-recognizer --> 5.0.0
@azure/storage-blob --> 12.17.0

Here is my code:

import {
  DocumentAnalysisClient,
  AzureKeyCredential,
} from '@azure/ai-form-recognizer';
import {
  BlobSASPermissions,
  BlobServiceClient,
  ContainerClient,
  RestError,
  StorageSharedKeyCredential,
  generateBlobSASQueryParameters,
} from '@azure/storage-blob';
import { Injectable } from '@nestjs/common';
import { ConfigService } from '@nestjs/config';
import * as fs from 'fs/promises';

@Injectable()
export class DocumentInteligenceService {
  private documentAnalysisClient: DocumentAnalysisClient;
  private endpoint: string;
  private apiKey: string;
  private readonly connectionString: string;
  private readonly containerName: string;
  private readonly blobServiceClient: BlobServiceClient;
  private readonly storageAccountName: string;
  private readonly storageAccountKey: string;
  private readonly containerClient: ContainerClient;

  constructor(private configService: ConfigService) {
    this.endpoint = this.configService.get(
      'AzureFormRecognizer.Endpoint',
    );
    this.apiKey = this.configService.get('AzureFormRecognizer.ApiKey');

    this.documentAnalysisClient = new DocumentAnalysisClient(
      this.endpoint,
      new AzureKeyCredential(this.apiKey),
    );

    this.connectionString = this.configService.get<string>(
      'AzureStorageAccount.ConnectionString',
    );

    this.containerName = this.configService.get<string>(
      'AzureStorageAccount.ContainerName',
    );

    this.storageAccountName = this.configService.get<string>(
      'AzureStorageAccount.StorageAccountName',
    );
    this.storageAccountKey = this.configService.get<string>(
      'AzureStorageAccount.StorageAccountKey',
    );

    this.blobServiceClient = BlobServiceClient.fromConnectionString(
      this.connectionString,
    );

    this.containerClient = this.blobServiceClient.getContainerClient(
      this.containerName,
    );
  }

  // Runs the prebuilt-read model on the blob URL and saves the result as JSON.
  async analyzeDocumentLayout(blobUrl: string): Promise<void> {
    try {
      const poller =
        await this.documentAnalysisClient.beginAnalyzeDocumentFromUrl(
          'prebuilt-read',
          blobUrl,
          {
            onProgress: (state) => console.log(`Status: ${state.status}`),
          },
        );
      const result = await poller.pollUntilDone();

      const resultText = JSON.stringify(result, null, 2);

      await fs.writeFile(`test.json`, resultText, 'utf-8');
      console.log('The results have been saved to a text file.');
    } catch (error) {
      // Note: the reported output starts with "Unexpected error:", so this
      // instanceof check did not match the RestError that was thrown.
      if (error instanceof RestError) {
        console.error('Error:', error.message);
        // Add logic to retry or handle specific RestError scenarios.
      } else {
        console.error('Unexpected error:', error);
      }
    }
  }
}
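The "Add logic to retry" comment in the catch block could be filled in along the following lines. This is a minimal sketch, assuming the failure is transient, which the thread does not confirm. `isAbortedResponse` and `withRetry` are hypothetical helpers, not part of any Azure SDK. The check matches on the `name` and `code` fields shown in the error output above rather than on `instanceof`.

```typescript
// Hypothetical helper: true when the error has the shape reported above
// (name "RestError", code "PARSE_ERROR").
function isAbortedResponse(error: unknown): boolean {
  if (typeof error !== 'object' || error === null) {
    return false;
  }
  const e = error as { name?: unknown; code?: unknown };
  return e.name === 'RestError' && e.code === 'PARSE_ERROR';
}

// Hypothetical helper: run `operation`, retrying with exponential backoff
// only when it fails with the aborted-response error.
async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 1000,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      // Rethrow anything unrelated, and give up after the last attempt.
      if (!isAbortedResponse(error) || attempt === maxAttempts) {
        throw error;
      }
      const delayMs = baseDelayMs * Math.pow(2, attempt - 1);
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  // Only reached when maxAttempts is less than 1.
  throw lastError;
}
```

The whole analyze call (starting the poller and waiting for it) would be the `operation` passed in, so that each attempt begins a fresh request. If the response is aborted every time because of its size, retrying will not help and a different approach is needed.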

github-actions bot added the Client, customer-reported, needs-team-attention, question, Service Attention, and Storage labels Sep 9, 2024

github-actions bot commented Sep 9, 2024

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @xgithubtriage.

xirzec added the Cognitive - Form Recognizer label and removed the Storage label Sep 9, 2024