I am trying to build a real-time object detection Android app using Google ML Kit. What I want to do is take an image frame from the camera, detect the objects inside that frame, draw the bounding boxes, and then show that annotated image to the user on the screen. I am trying to do this using a PreviewView. How can I process the images before the PreviewView shows them to the user? Is this even possible? Please help me out here. Thanks
activity_main.xml
<?xml version="1.0" encoding="utf-8"?>
<androidx.constraintlayout.widget.ConstraintLayout
    xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:app="http://schemas.android.com/apk/res-auto"
    xmlns:tools="http://schemas.android.com/tools"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    tools:context=".MainActivity">

    <androidx.appcompat.widget.Toolbar
        android:id="@+id/actionBar_main"
        android:layout_width="match_parent"
        android:layout_height="?attr/actionBarSize"
        android:background="@color/teal_200"
        app:layout_constraintEnd_toEndOf="parent"
        app:layout_constraintStart_toStartOf="parent"
        app:layout_constraintTop_toTopOf="parent"
        app:title="@string/app_name" />

    <androidx.camera.view.PreviewView
        android:id="@+id/previewView_main"
        android:layout_width="match_parent"
        android:layout_height="0dp"
        app:layout_constraintBottom_toBottomOf="parent"
        app:layout_constraintEnd_toEndOf="parent"
        app:layout_constraintStart_toStartOf="parent"
        app:layout_constraintTop_toBottomOf="@+id/actionBar_main" />

</androidx.constraintlayout.widget.ConstraintLayout>
MainActivity.java
import androidx.annotation.NonNull;
import androidx.appcompat.app.AppCompatActivity;
import androidx.appcompat.widget.Toolbar;
import androidx.camera.core.Camera;
import androidx.camera.core.CameraSelector;
import androidx.camera.core.ImageAnalysis;
import androidx.camera.core.ImageProxy;
import androidx.camera.core.Preview;
import androidx.camera.lifecycle.ProcessCameraProvider;
import androidx.camera.view.PreviewView;
import androidx.core.app.ActivityCompat;
import androidx.core.content.ContextCompat;
import androidx.lifecycle.LifecycleOwner;

import android.Manifest;
import android.annotation.SuppressLint;
import android.content.pm.PackageManager;
import android.media.Image;
import android.os.Bundle;
import android.util.Size;

import com.google.android.gms.tasks.OnFailureListener;
import com.google.android.gms.tasks.OnSuccessListener;
import com.google.common.util.concurrent.ListenableFuture;
import com.google.mlkit.vision.common.InputImage;
import com.google.mlkit.vision.objects.DetectedObject;
import com.google.mlkit.vision.objects.ObjectDetection;
import com.google.mlkit.vision.objects.ObjectDetector;
import com.google.mlkit.vision.objects.defaults.ObjectDetectorOptions;

import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
public class MainActivity extends AppCompatActivity {

    Toolbar toolbar;
    PreviewView previewView_main;
    ListenableFuture<ProcessCameraProvider> cameraProviderFuture;
    private final int CAMERA_REQUEST_CODE = 101;
    ObjectDetectorOptions options;
    ExecutorService cameraExecutor;
    ObjectDetector objectDetector;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
        cameraExecutor = Executors.newSingleThreadExecutor();
        toolbar = findViewById(R.id.actionBar_main);
        setSupportActionBar(toolbar);
        previewView_main = findViewById(R.id.previewView_main);
        ask_permissions();
        options = new ObjectDetectorOptions.Builder()
                .setDetectorMode(ObjectDetectorOptions.STREAM_MODE)
                .enableClassification()
                .build();
        objectDetector = ObjectDetection.getClient(options);
    }

    private void load_camera_preview() {
        cameraProviderFuture = ProcessCameraProvider.getInstance(this);
        cameraProviderFuture.addListener(() -> {
            try {
                ProcessCameraProvider cameraProvider = cameraProviderFuture.get();
                bindPreview(cameraProvider);
            } catch (ExecutionException | InterruptedException e) {
                e.printStackTrace();
            }
        }, ContextCompat.getMainExecutor(this));
    }

    private void bindPreview(ProcessCameraProvider cameraProvider) {
        Preview preview = new Preview.Builder()
                .build();
        CameraSelector cameraSelector = new CameraSelector.Builder()
                .requireLensFacing(CameraSelector.LENS_FACING_BACK)
                .build();
        preview.setSurfaceProvider(previewView_main.getSurfaceProvider());

        ImageAnalysis imageAnalysis =
                new ImageAnalysis.Builder()
                        .setBackpressureStrategy(ImageAnalysis.STRATEGY_KEEP_ONLY_LATEST)
                        .setTargetResolution(new Size(1280, 720))
                        .build();

        imageAnalysis.setAnalyzer(cameraExecutor, new ImageAnalysis.Analyzer() {
            @Override
            public void analyze(@NonNull ImageProxy imageProxy) {
                @SuppressLint("UnsafeOptInUsageError")
                Image mediaImage = imageProxy.getImage();
                if (mediaImage == null) {
                    imageProxy.close();
                    return;
                }
                InputImage inputImage = InputImage.fromMediaImage(
                        mediaImage, imageProxy.getImageInfo().getRotationDegrees());
                objectDetector.process(inputImage)
                        .addOnSuccessListener(detectedObjects -> {
                            // This is where I want to draw the bounding boxes and show the frame.
                        })
                        .addOnFailureListener(e -> e.printStackTrace())
                        // Close the frame only after the detector is done with it;
                        // closing it earlier invalidates mediaImage mid-detection.
                        .addOnCompleteListener(task -> imageProxy.close());
            }
        });

        // The ImageAnalysis use case has to be bound as well, otherwise analyze() is never called.
        Camera camera = cameraProvider.bindToLifecycle((LifecycleOwner) this, cameraSelector, preview, imageAnalysis);
    }

    private void ask_permissions() {
        if (ActivityCompat.checkSelfPermission(this, Manifest.permission.CAMERA) == PackageManager.PERMISSION_GRANTED) {
            load_camera_preview();
        } else {
            ActivityCompat.requestPermissions(this, new String[]{Manifest.permission.CAMERA}, CAMERA_REQUEST_CODE);
        }
    }

    @Override
    public void onRequestPermissionsResult(int requestCode, @NonNull String[] permissions, @NonNull int[] grantResults) {
        super.onRequestPermissionsResult(requestCode, permissions, grantResults);
        if (requestCode == CAMERA_REQUEST_CODE && grantResults.length > 0
                && grantResults[0] == PackageManager.PERMISSION_GRANTED) {
            load_camera_preview();
        }
    }
}
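For reference, this is roughly what I expect to pull out of the results once the success listener actually does something. It is only a sketch reusing objectDetector and inputImage from bindPreview() above (android.graphics.Rect would also need to be imported); getBoundingBox(), getLabels() and getTrackingId() come from the ML Kit DetectedObject API, and drawing the boxes is still the open question:

objectDetector.process(inputImage)
        .addOnSuccessListener(detectedObjects -> {
            for (DetectedObject detectedObject : detectedObjects) {
                Rect box = detectedObject.getBoundingBox();          // box in image coordinates
                Integer trackingId = detectedObject.getTrackingId(); // stable across frames in STREAM_MODE
                for (DetectedObject.Label label : detectedObject.getLabels()) {
                    String text = label.getText();
                    float confidence = label.getConfidence();
                }
            }
            // Still missing: mapping the boxes to screen coordinates and drawing them over the preview.
        });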
What I have tried:
I was thinking I could overlay an ImageView over the entire PreviewView and then keep replacing the image inside that ImageView for every frame, but that doesn't seem efficient.
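What I'm leaning towards instead is a plain custom View stacked on top of the PreviewView (same constraints in activity_main.xml) that only draws the latest boxes in onDraw(), while the PreviewView keeps showing the camera feed untouched. This is just a sketch of that idea: the class name BoxOverlayView and its setBoxes() method are made up by me, not part of CameraX or ML Kit, and the boxes would still have to be mapped from image coordinates to view coordinates before being passed in.

import android.content.Context;
import android.graphics.Canvas;
import android.graphics.Color;
import android.graphics.Paint;
import android.graphics.Rect;
import android.util.AttributeSet;
import android.view.View;
import java.util.ArrayList;
import java.util.List;

// Hypothetical overlay view that sits on top of the PreviewView and draws boxes only.
public class BoxOverlayView extends View {

    private final Paint boxPaint = new Paint();
    private List<Rect> boxes = new ArrayList<>();

    public BoxOverlayView(Context context, AttributeSet attrs) {
        super(context, attrs);
        boxPaint.setStyle(Paint.Style.STROKE);
        boxPaint.setStrokeWidth(6f);
        boxPaint.setColor(Color.RED);
    }

    // Would be called from the ML Kit success listener with boxes already mapped
    // into this view's coordinate space.
    public void setBoxes(List<Rect> newBoxes) {
        boxes = newBoxes;
        postInvalidate(); // safe to call from the analyzer's background thread
    }

    @Override
    protected void onDraw(Canvas canvas) {
        super.onDraw(canvas);
        for (Rect box : boxes) {
            canvas.drawRect(box, boxPaint);
        }
    }
}

Is something like this the right direction, or is there a way to modify the frames before the PreviewView displays them?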