Monday 20 March 2023

Read PDF text with Azure AI (Azure Cognitive Service)

Azure Cognitive Services makes AI tasks easier for developers. A single resource gives you one API endpoint, and that endpoint covers many use cases.


With a single Cognitive Services resource you can handle Speech, Text, and Vision use cases. For example, the Speech features include:
 
Speech to Text: Transcribe audible speech into readable, searchable text.
Text to Speech: Convert text to lifelike speech for more natural interfaces.
Speech Translation: Integrate real-time speech translation into your apps.

Today, we will build a Vision proof of concept (POC) that reads text from PDF and image files.


Step 1: Create an Azure Cognitive Services resource in the Azure portal.



Step 2: Open the Keys and Endpoint page of the resource. It shows the endpoint and the keys you will need in the code.
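
The key from this page grants access to the resource, so it is usually safer to keep it out of source code. One possible approach is to read it from an environment variable. The sketch below is only an illustration: the class name VisionSettings and the variable names VISION_KEY and VISION_ENDPOINT are assumptions, not something the portal creates for you.

```csharp
using System;

public static class VisionSettings
{
    // Returns the trimmed value of an environment variable, or throws
    // with a clear message when the variable is missing or empty.
    // (Hypothetical helper; the name is chosen for this example.)
    public static string Require(string name)
    {
        var value = Environment.GetEnvironmentVariable(name);
        if (string.IsNullOrWhiteSpace(value))
        {
            throw new InvalidOperationException(
                $"Environment variable '{name}' is not set.");
        }
        return value.Trim();
    }
}
```

In the controller you could then write var key = VisionSettings.Require("VISION_KEY"); and var endPoint = VisionSettings.Require("VISION_ENDPOINT"); instead of pasting the values into the file.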











Step 3: Create a new ASP.NET Core MVC project and paste the code below into the view file (for example, Views/Home/Index.cshtml).




<form method="post" enctype="multipart/form-data">
    <div class="row">
        <div class="col-8">
            <input type="file" name="file" class="form-control" />
        </div>
        <div class="col-4">
            <button type="submit" class="btn btn-primary">Upload File</button>
        </div>
    </div>
    @ViewBag.extractText
</form>


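Step 4: Install the Computer Vision NuGet package (Microsoft.Azure.CognitiveServices.Vision.ComputerVision), either from the NuGet Package Manager or with: dotnet add package Microsoft.Azure.CognitiveServices.Vision.ComputerVision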









Step 5: Paste the code below into HomeController.cs


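    using System;
    using System.IO;
    using System.Text;
    using System.Threading;
    using System.Threading.Tasks;
    using Microsoft.AspNetCore.Hosting;
    using Microsoft.AspNetCore.Http;
    using Microsoft.AspNetCore.Mvc;
    using Microsoft.Azure.CognitiveServices.Vision.ComputerVision;
    using Microsoft.Azure.CognitiveServices.Vision.ComputerVision.Models;

    public class HomeController : Controller
    {
        // IWebHostEnvironment replaces the obsolete IHostingEnvironment (ASP.NET Core 3.0 and later)
        private readonly IWebHostEnvironment _environment;

        public HomeController(IWebHostEnvironment hostingEnvironment)
        {
            _environment = hostingEnvironment;
        }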

        public IActionResult Index()
        {
            return View();
        }

        [HttpPost]
        public async Task<ActionResult> Index(IFormFile file)
        {
            if (file == null)
            {
                ModelState.Clear();
                ModelState.AddModelError("file", "Please select file first.");
                return View("Index");
            }

            var key = "pass your key here"; 
            var endPoint = "https://write your end point here";

            ComputerVisionClient client = Authenticate(endPoint, key);

            if (file.Length == 0)
            {
                ModelState.AddModelError("file", "The selected file is empty.");
                return View("Index");
            }

            // Save the upload under the content root. Path.GetFileName drops any
            // directory part that a client may have put in the file name.
            var uploadDir = Path.Combine(_environment.ContentRootPath, "UploadedFiles");
            Directory.CreateDirectory(uploadDir); // does nothing if the folder already exists
            var fileName = Path.Combine(uploadDir, Path.GetFileName(file.FileName));

            try
            {
                using (var fileStream = new FileStream(fileName, FileMode.Create))
                {
                    await file.CopyToAsync(fileStream);
                }

                var text = await ReadImage(client, fileName);
                if (string.IsNullOrEmpty(text))
                    text = "No text found.";
                ViewBag.extractText = text;
            }
            finally
            {
                // Remove the temporary copy even if the Read call fails
                System.IO.File.Delete(fileName);
            }

            return View("Index");
        }

     
        public ComputerVisionClient Authenticate(string endpoint, string key)
        {
            ComputerVisionClient client =
              new ComputerVisionClient(new ApiKeyServiceClientCredentials(key))
              { Endpoint = endpoint };
            return client;
        }

        public async Task<string> ReadImage(ComputerVisionClient client, string localFile)
        {
            StringBuilder sb = new StringBuilder();

            // Submit the local file to the Read API; dispose the stream once it has been sent
            string operationLocation;
            using (var stream = System.IO.File.OpenRead(localFile))
            {
                var textHeaders = await client.ReadInStreamAsync(stream);
                // After the request, get the operation location (a URL that ends with the operation ID)
                operationLocation = textHeaders.OperationLocation;
            }

            const int numberOfCharsInOperationId = 36;
            string operationId = operationLocation.Substring(operationLocation.Length - numberOfCharsInOperationId);

            // Poll for the result, waiting between requests instead of blocking the thread
            ReadOperationResult results;
            do
            {
                await Task.Delay(1000);
                results = await client.GetReadResultAsync(Guid.Parse(operationId));
            }
            while (results.Status == OperationStatusCodes.Running ||
                results.Status == OperationStatusCodes.NotStarted);

            // Stop here if the operation did not succeed; there is no text to read
            if (results.Status != OperationStatusCodes.Succeeded)
            {
                return string.Empty;
            }

            var textUrlFileResults = results.AnalyzeResult.ReadResults;
            foreach (ReadResult page in textUrlFileResults)
            {
                foreach (Line line in page.Lines)
                {
                    sb.AppendLine(line.Text);

                }
            }
            return sb.ToString();
        }
    }
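
The controller takes the operation ID from the last 36 characters of the Operation-Location value. An alternative that does not depend on counting characters is to parse the URL and read its last path segment. The sketch below assumes the ID is the final segment of that URL; the class and method names are hypothetical.

```csharp
using System;

public static class ReadOperation
{
    // Returns the operation ID, assuming it is the last path segment
    // of the Operation-Location URL. Throws FormatException when that
    // segment is not a GUID.
    public static Guid ExtractId(string operationLocation)
    {
        var uri = new Uri(operationLocation);
        // Segments look like "/", "vision/", ..., "<id>"; drop a trailing slash if present.
        string lastSegment = uri.Segments[uri.Segments.Length - 1].TrimEnd('/');
        return Guid.Parse(lastSegment);
    }
}
```

The returned Guid can be passed straight to GetReadResultAsync, which removes the separate Guid.Parse call in the polling loop.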


Step 6: Run the project and upload a PDF file






The extracted text appears below the button. You can pick out specific text from it with custom logic, depending on your needs.
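
As an illustration of such custom logic, the sketch below searches the extracted text for a labelled value with a regular expression. The "Invoice No:" label, the class name, and the method name are made up for this example; adjust the pattern to the documents you actually process.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TextPicker
{
    // Returns the value that follows an "Invoice No:" label at the start
    // of any line, or null when no such line exists.
    public static string FindInvoiceNumber(string extractedText)
    {
        Match match = Regex.Match(
            extractedText,
            @"^\s*Invoice No\.?\s*:\s*(\S+)",
            RegexOptions.IgnoreCase | RegexOptions.Multiline);
        return match.Success ? match.Groups[1].Value : null;
    }
}
```

You could call TextPicker.FindInvoiceNumber(text) in the controller after ReadImage returns and show only that value in the view.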

If you need to extract specific fields rather than the full text, consider the Form Recognizer service in Azure AI.


To learn more about Cognitive Services, follow the link below.

Azure Cognitive Service


