Open Source Computer Vision Library

Overview

OpenCV: Open Source Computer Vision Library

Resources

Contributing

Please read the contribution guidelines before starting work on a pull request.

Summary of the guidelines:

  • One pull request per issue;
  • Choose the right base branch;
  • Include tests and documentation;
  • Clean up "oops" commits before submitting;
  • Follow the coding style guide.
Comments
  • CUDA backend for the DNN module

    More up-to-date info available here (unofficial)


    How to build and use the CUDA backend?

    How to use multiple GPUs?

    There are many ways to make use of multiple GPUs. Here is one which I think is the safest and least complex. It relies on the fact that the CUDA runtime tracks the current device separately for each CPU thread.

    Suppose you have N devices.

    1. Create N threads.
    2. Assign a CUDA device to each thread by calling cudaSetDevice or cv::cuda::setDevice in that thread. Each thread is now associated with a device.
    3. You can create any number of cv::dnn::Net objects in any of those threads, and each network will use the device associated with its thread for memory and computation.
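
The thread-per-device pattern described above could be sketched in Python roughly as follows. This is a minimal illustration, not code from the PR: `run_on_devices` and `worker` are hypothetical names, and the device-binding call is passed in as a parameter so the sketch can run without a GPU. In a CUDA-enabled OpenCV build one would presumably pass `cv2.cuda.setDevice`.

```python
import threading


def run_on_devices(num_devices, worker, set_device):
    """Start one thread per device; each thread binds to its device first.

    set_device: callable taking a device index (assumption: something like
                cv2.cuda.setDevice in a CUDA-enabled OpenCV build).
    worker:     hypothetical callable taking the device index; it would
                create and run its networks inside the thread.
    """
    results = [None] * num_devices

    def thread_main(device_id):
        # Bind this CPU thread to one device before doing any other work.
        set_device(device_id)
        results[device_id] = worker(device_id)

    threads = [threading.Thread(target=thread_main, args=(i,))
               for i in range(num_devices)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Each `worker` call runs entirely inside its own thread, so any network it creates is created after the device has been selected for that thread.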
    

    Benchmarks

    Demo Video: https://www.youtube.com/watch?v=ljCfluWYymM

    Project summary/benchmarks: https://gist.github.com/YashasSamaga/a84cf2826ab2dc755005321fe17cd15d

    Support Matrix for this PR

    Current Support Matrix (not updated):

    Blip | Meaning
    ---- | ---------
    ✔️ | supports all the configurations that are supported by all the existing backends (and might support more than what's currently supported)
    🔵 | partially supported (fallback to CPU for unsupported configurations)
    ❌ | not supported (fallback to CPU)

    Layer | Status | Constraints | Notes
    ---------------------------------------- | ------ | ------------- | --------------
    Activations | ✔️ | |
    Batch Normalization | ✔️ | |
    Blank Layer | ✔️ | |
    Concat Layer | ✔️ | |
    Const Layer | ✔️ | |
    Convolution 2d | ✔️ | | asymmetric padding is disabled in layer constructor but the backend supports it
    Convolution 3d | ✔️ | | asymmetric padding is disabled in layer constructor but the backend supports it
    Crop and resize | ❌ | |
    Crop Layer | ✔️ | | forwarded to Slice Layer
    Detection Output Layer | ❌ | |
    Deconvolution 2d | 🔵 | padding configuration should not lead to extra uneven padding |
    Deconvolution 3d | 🔵 | padding configuration should not lead to extra uneven padding |
    Elementwise Layers | ✔️ | |
    Eltwise Layer | ✔️ | |
    Flatten Layer | ✔️ | |
    Fully Connected Layer | ✔️ | |
    Input Layer | ❌ | |
    Interp Layer | ✔️ | |
    Local Response Normalization | ✔️ | |
    Max Unpooling 2d | ✔️ | |
    Max Unpooling 3d | ✔️ | |
    MVN Layer | ❌ | |
    Normalize Layer | 🔵 | Only L1 and L2 norm supported |
    Padding Layer | ✔️ | |
    Permute Layer | ✔️ | |
    Pooling 2d | 🔵 | Only max and average pooling supported | supports asymmetric padding
    Pooling 3d | 🔵 | Only max and average pooling supported | supports asymmetric padding
    Prior Box Layer | ✔️ | |
    Proposal Layer | ❌ | |
    Region Layer | ✔️ | NMS performed using CPU |
    Reorg Layer | ✔️ | |
    Reshape Layer | ✔️ | |
    Resize Layer | ✔️ | |
    Scale Layer | ✔️ | |
    Shift Layer | ✔️ | | forwarded to Scale Layer
    Shuffle Channel Layer | ✔️ | |
    Slice Layer | ✔️ | |
    Softmax Layer | ✔️ | |
    Split Layer | ✔️ | |
    LSTM Layer | ❌ | |

    Known issues:

    1. Tests for some of the SSD based networks fail on Jetson Nano

    References: #14585

    Results:

    • https://github.com/opencv/opencv/pull/14827#issuecomment-522229894
    • https://github.com/opencv/opencv/pull/14827#issuecomment-523456312
    force_builders_only=Custom,linux,docs
    buildworker:Custom=linux-4
    docker_image:Custom=ubuntu-cuda:18.04
    
    GSoC category: dnn 
    opened by YashasSamaga 171
  • Issues with recognition whilst using IP Stream only

    System information (version)
    • OpenCV => 3.1
    • Operating System / Platform => Linux
    Detailed description

    I will try asking here, as I got no response on the forum. I have been working with OpenCV in an application since last year. In the first version I captured frames from a webcam and used Haar cascades, and it recognised a face nearly every time without issue.

    I ran into problems getting a stable web-based stream going, and after trying multiple solutions I moved to a new approach. It still uses the exact same webcam, except that Linux Motion now accesses it and OpenCV connects to the MJPEG stream from Linux Motion through a secure Nginx server. The stream is on the same device as the OpenCV script.

    Since doing this the quality of the stream has increased massively, but it now hardly detects faces at all. I have compared screenshots of frames from when OpenCV was accessing the webcam with frames from when OpenCV is accessing the stream, and apart from the improved quality there really is no difference, yet OpenCV refuses to identify a face. I literally have to move the camera around and hold a position for a face to be identified, whereas before I could be walking past on the other side of the room and it would detect my face.

    After trying everything I could think of and find on Google, I went back to the webcam, and it instantly detected my face in whatever position and whatever light. I have tried multiple other ways of streaming to the web, still without success, so I have moved back to the Motion stream to try to work this out.

    Can anyone shed any light on this? It does not make sense to me that an improvement in quality suddenly breaks facial identification. I have tried playing with the frame settings, resolution, contrast, hue and brightness, and nothing I do seems to work.

    Steps to reproduce
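    self.OpenCVCapture.open('http://MOTION_STREAM_ADDRESS/stream.mjpg')
    self.OpenCVCapture.set(5, 30)    # 5 = CAP_PROP_FPS
    self.OpenCVCapture.set(3, 1280)  # 3 = CAP_PROP_FRAME_WIDTH
    self.OpenCVCapture.set(4, 720)   # 4 = CAP_PROP_FRAME_HEIGHT
    self.OpenCVCapture.set(10, 1)    # 10 = CAP_PROP_BRIGHTNESS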
    

    Then run through Haarcascades for detection.

    question (invalid tracker) 
    opened by AdamMiltonBarker 99
  • OpenCV 3.1.0 simple VideoCapture and waitKey crashes after a while on OS X 10.11.2

    OpenCV 3.1.0 is installed through brew install opencv3 --with-contrib --with-qt5, and the following program crashes after a while:

    #include <opencv2/core.hpp>
    #include <opencv2/highgui.hpp>
    #include <opencv2/videoio.hpp>
    
    int main(int argc, const char * argv[]) {
        cv::VideoCapture cap(0);
        cv::Mat frame;
        while (cap.read(frame)) {
            cv::imshow("Frame", frame);
            if (cv::waitKey(1) == 'q') {
                break;
            }
        }
        return 0;
    }
    

    The stack trace is the following:

    2015-12-24 09:54:22.297 basic-capture[86100:4481590] -[CaptureDelegate doFireTimer:]: unrecognized selector sent to instance 0x103600680
    2015-12-24 09:54:22.313 basic-capture[86100:4481590] An uncaught exception was raised
    2015-12-24 09:54:22.313 basic-capture[86100:4481590] -[CaptureDelegate doFireTimer:]: unrecognized selector sent to instance 0x103600680
    2015-12-24 09:54:22.313 basic-capture[86100:4481590] (
        0   CoreFoundation                      0x00007fff95766ae2 __exceptionPreprocess + 178
        1   libobjc.A.dylib                     0x00007fff90699f7e objc_exception_throw + 48
        2   CoreFoundation                      0x00007fff95769b9d -[NSObject(NSObject) doesNotRecognizeSelector:] + 205
        3   CoreFoundation                      0x00007fff956a2601 ___forwarding___ + 1009
        4   CoreFoundation                      0x00007fff956a2188 _CF_forwarding_prep_0 + 120
        5   Foundation                          0x00007fff9c7d385b __NSFireTimer + 95
        6   CoreFoundation                      0x00007fff956acbc4 __CFRUNLOOP_IS_CALLING_OUT_TO_A_TIMER_CALLBACK_FUNCTION__ + 20
        7   CoreFoundation                      0x00007fff956ac853 __CFRunLoopDoTimer + 1075
        8   CoreFoundation                      0x00007fff9572ae6a __CFRunLoopDoTimers + 298
        9   CoreFoundation                      0x00007fff95667cd1 __CFRunLoopRun + 1841
        10  CoreFoundation                      0x00007fff95667338 CFRunLoopRunSpecific + 296
        11  HIToolbox                           0x00007fff8f2f3935 RunCurrentEventLoopInMode + 235
        12  HIToolbox                           0x00007fff8f2f3677 ReceiveNextEventCommon + 184
        13  HIToolbox                           0x00007fff8f2f35af _BlockUntilNextEventMatchingListInModeWithFilter + 71
        14  AppKit                              0x00007fff967d10ee _DPSNextEvent + 1067
        15  AppKit                              0x00007fff96b9d943 -[NSApplication _nextEventMatchingEventMask:untilDate:inMode:dequeue:] + 454
        16  libqcocoa.dylib                     0x000000010555ae5a _ZN21QCocoaEventDispatcher13processEventsE6QFlagsIN10QEventLoop17ProcessEventsFlagEE + 1034
        17  libopencv_highgui.3.1.dylib         0x000000010083c596 cvWaitKey + 178
        18  basic-capture                       0x0000000100001666 main + 246
        19  libdyld.dylib                       0x00007fff8a0335ad start + 1
        20  ???                                 0x0000000000000001 0x0 + 1
    )
    2015-12-24 09:54:22.314 basic-capture[86100:4481590] *** Terminating app due to uncaught exception 'NSInvalidArgumentException', reason: '-[CaptureDelegate doFireTimer:]: unrecognized selector sent to instance 0x103600680'
    *** First throw call stack:
    (
        0   CoreFoundation                      0x00007fff95766ae2 __exceptionPreprocess + 178
        1   libobjc.A.dylib                     0x00007fff90699f7e objc_exception_throw + 48
        2   CoreFoundation                      0x00007fff95769b9d -[NSObject(NSObject) doesNotRecognizeSelector:] + 205
        3   CoreFoundation                      0x00007fff956a2601 ___forwarding___ + 1009
        4   CoreFoundation                      0x00007fff956a2188 _CF_forwarding_prep_0 + 120
        5   Foundation                          0x00007fff9c7d385b __NSFireTimer + 95
        6   CoreFoundation                      0x00007fff956acbc4 __CFRUNLOOP_IS_CALLING_OUT_TO_A_TIMER_CALLBACK_FUNCTION__ + 20
        7   CoreFoundation                      0x00007fff956ac853 __CFRunLoopDoTimer + 1075
        8   CoreFoundation                      0x00007fff9572ae6a __CFRunLoopDoTimers + 298
        9   CoreFoundation                      0x00007fff95667cd1 __CFRunLoopRun + 1841
        10  CoreFoundation                      0x00007fff95667338 CFRunLoopRunSpecific + 296
        11  HIToolbox                           0x00007fff8f2f3935 RunCurrentEventLoopInMode + 235
        12  HIToolbox                           0x00007fff8f2f3677 ReceiveNextEventCommon + 184
        13  HIToolbox                           0x00007fff8f2f35af _BlockUntilNextEventMatchingListInModeWithFilter + 71
        14  AppKit                              0x00007fff967d10ee _DPSNextEvent + 1067
        15  AppKit                              0x00007fff96b9d943 -[NSApplication _nextEventMatchingEventMask:untilDate:inMode:dequeue:] + 454
        16  libqcocoa.dylib                     0x000000010555ae5a _ZN21QCocoaEventDispatcher13processEventsE6QFlagsIN10QEventLoop17ProcessEventsFlagEE + 1034
        17  libopencv_highgui.3.1.dylib         0x000000010083c596 cvWaitKey + 178
        18  basic-capture                       0x0000000100001666 main + 246
        19  libdyld.dylib                       0x00007fff8a0335ad start + 1
        20  ???                                 0x0000000000000001 0x0 + 1
    )
    libc++abi.dylib: terminating with uncaught exception of type NSException
    
    bug category: highgui-gui platform: ios/osx 
    opened by mahiuchun 65
  • C++ cv::VideoCapture.open(0) always return false on Android

    System information (version)
    • OpenCV => 3.4.1
    • Operating System macOS High Sierra 10.13.6 / Platform => Android 7.1.1 API 25
    • Compiler => Android NDK
    Detailed description
    Context

    I'm writing a cross-platform OpenCV-based C++ library. The consuming code is a React Native application, through a React Native native module.

    To be perfectly clear, there is no access from Java Code to C++ OpenCV on Android. There are events with the result of the OpenCV C++ code sent to Javascript through the react native bridge.

    My native library is compiled on Android as a SHARED library. It is dynamically linked to the libopencv_world.so that is produced by the compilation of OpenCV C++ for Android.

    What it does

    Basically, it opens the device's default camera and takes snapshots.

    The outcome

    This code is then run on iOS and Android.

    This is working perfectly well on iOS. It fails on Android.

    Here is the failing part of the C++ code on Android:

    Steps to reproduce
    // cap is a cv::VideoCapture object
    if (cap.open(0))
    {
        cap.set(cv::CAP_PROP_FRAME_WIDTH, CAM_WIDTH);
        cap.set(cv::CAP_PROP_FRAME_HEIGHT, CAM_HEIGHT);
    }
    else
    {
        reject("false", "cap.open(0) returned false");
    }
    
    feature priority: low category: videoio(camera) platform: android effort: few weeks 
    opened by omatrot 59
  • OpenCV3 python calls to FlannBasedMatcher::knnMatch fail with error

    The following code returns an error.

        sift = x2d.SIFT_create(1000)
        features_l, des_l = sift.detectAndCompute(im_l, None)
        features_r, des_r = sift.detectAndCompute(im_r, None)
    
        FLANN_INDEX_KDTREE = 1
        index_params = dict(algorithm=FLANN_INDEX_KDTREE, trees=5)
        search_params = dict(checks=50)
        flann = cv2.FlannBasedMatcher(index_params,search_params)
    

    The error returned:

    opencv/modules/python/src2/cv2.cpp:161: error: (-215) The data should normally be NULL! in function allocate
    

    I tried this with Python 2.7. FLANN_INDEX_KDTREE is set to 1 unlike here, since in modules/flann/include/opencv2/flann/defines.h I found it set to 1 on line 84.

    bug priority: normal category: python bindings affected: 3.4 category: flann category: t-api 
    opened by Algomorph 56
  • fixing cap_pvpapi interface

    This PR tries to fix the following 3 reported issues

    • http://code.opencv.org/issues/3946
    • http://code.opencv.org/issues/3947
    • http://code.opencv.org/issues/3948

    A complete remake of the PvAPI API interface solving bugs but also

    • allowing the use of MANTA type cameras
    • allowing multiple AVT cameras at the same time
    port/backport done 
    opened by StevenPuttemans 55
  • fixing waitKey commands to be universal

    As a follow up for the discussion held in PR #7098, and the discussion started in this topic, we decided to implement a waitChar function that always returns the correct value if the returned value is needed for further comparison to ASCII codes of char inputs.
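
For context, a common workaround for this kind of comparison is to mask the value returned by `waitKey` down to its low byte before comparing it with a character. The sketch below is only an illustration in Python: `key_matches` is a hypothetical helper, not the `waitChar` function proposed in this PR.

```python
def key_matches(raw_code, char):
    """Return True if a raw key code corresponds to the given character.

    raw_code: integer as returned by a waitKey-style call; a negative
              value means no key was pressed.
    Only the low 8 bits are compared, so any flags set in the upper bits
    of the code are ignored.
    """
    if raw_code < 0:
        # Without this guard, -1 & 0xFF would evaluate to 255.
        return False
    return (raw_code & 0xFF) == ord(char)
```

A loop would then test `key_matches(code, 'q')` instead of comparing the raw code with `ord('q')` directly.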

    opened by StevenPuttemans 54
  • AttributeError: 'module' object has no attribute 'face'

    I got an error when running OpenCV in Python on a Raspberry Pi.

    I looked for fixes and tried applying them, but none of them worked. I also confirmed that the "face" module is present in opencv_contrib-3.3.0. I do not know why this happens.

    error 1

    Traceback (most recent call last):
      File "training.py", line 13, in <module>
        recognizer = cv2.face.createLBPHFaceRecognizer()
    AttributeError: 'module' object has no attribute 'face'

    error 2

    Traceback (most recent call last):
      File "training.py", line 13, in <module>
        help(cv2.face)
    AttributeError: 'module' object has no attribute 'face'

    error 3

    Traceback (most recent call last):
      File "training.py", line 13, in <module>
        help(cv2.face.createLBPHFaceRecognizer)
    AttributeError: 'module' object has no attribute 'face'

    python : 3.5.3 opencv-3.3.0 opencv_contrib-3.3.0

    source code

    # Import OpenCV2 for image processing
    # Import os for file path
    import cv2, os

    # Import numpy for matrix calculation
    import numpy as np

    # Import Python Image Library (PIL)
    from PIL import Image

    # Create Local Binary Patterns Histograms for face recognization
    recognizer = cv2.face.createLBPHFaceRecognizer()

    # Using prebuilt frontal face training model, for face detection
    detector = cv2.CascadeClassifier("haarcascade_frontalface_default.xml");

    # Create method to get the images and label data
    def getImagesAndLabels(path):

        # Get all file path
        imagePaths = [os.path.join(path,f) for f in os.listdir(path)]

        # Initialize empty face sample
        faceSamples=[]

        # Initialize empty id
        ids = []

        # Loop all the file path
        for imagePath in imagePaths:

            # Get the image and convert it to grayscale
            PIL_img = Image.open(imagePath).convert('L')

            # PIL image to numpy array
            img_numpy = np.array(PIL_img,'uint8')

            # Get the image id
            id = int(os.path.split(imagePath)[-1].split(".")[1])
            print(id)

            # Get the face from the training images
            faces = detector.detectMultiScale(img_numpy)

            # Loop for each face, append to their respective ID
            for (x,y,w,h) in faces:

                # Add the image to face samples
                faceSamples.append(img_numpy[y:y+h,x:x+w])

                # Add the ID to IDs
                ids.append(id)

        # Pass the face array and IDs array
        return faceSamples,ids

    # Get the faces and IDs
    faces,ids = getImagesAndLabels('dataset')

    # Train the model using the faces and IDs
    recognizer.train(faces, np.array(ids))

    # Save the model into trainer.yml
    recognizer.save('trainer/trainer.yml')

    question (invalid tracker) category: contrib 
    opened by sungjinp11 52
  • [GSOC] New camera model for stitching pipeline

    Merge with extra: https://github.com/opencv/opencv_extra/pull/303

    This PR contains all work for New camera model for stitching pipeline GSoC 2016 project.

    GSoC Proposal

    The stitching pipeline is well-established code in OpenCV. It provides good results for creating panoramas from camera-captured images. The main limitation of the stitching pipeline is its expected camera model (perspective transformation). Although this model is fine for many applications working with camera-captured images, there are applications that aren't covered by the current stitching pipeline.

    New camera model

    Due to physical constraints, some applications can expect a much simpler transform with fewer degrees of freedom. These are situations where the input data are not subject to a perspective transform, so the transformation can be much simpler, such as an affine transformation. The datasets considered here include images captured by special hardware (such as book scanners [0], which try hard to eliminate perspective), maps from laser scanning (produced from different starting points), and preprocessed images (where perspective was compensated by other robust means that take advantage of the physical situation, e.g. for book scanners we would use calibration data to compensate for the remaining perspective). In all those situations we would like to obtain an image mosaic under an affine transformation.

    I'd like to introduce a new camera model based on affine transformation to the stitching pipeline. This would include:

    • New Matcher using affine transformation (cv::estimateRigidTransform) to estimate H
    • New Estimator aware of the affine model
    • Defining and documenting this new model for CameraParams (e.g. translation is currently always expected to be zero, which might not be true for an affine transformation)
    • Integration work in the compositing part of the pipeline (changes might be necessary depending on how we decide to represent the affine model in CameraParams)
    • New options for the high-level API to make it simple to use the affine model instead of the current one
    • Producing new sample code for the stitching pipeline
    • Improving the current documentation (it does not mention details about the current camera model, which might need some clarification)
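
To illustrate how an affine estimate relates to the matrix H used by a perspective-based pipeline, a 2x3 affine matrix can be embedded in a 3x3 matrix whose last row is fixed to [0, 0, 1]. The Python sketch below shows only that relationship; `affine_to_homography` and `apply_homography` are hypothetical helper names, and this is not necessarily how the PR represents the model.

```python
def affine_to_homography(affine):
    """Embed a 2x3 affine matrix into a 3x3 matrix with last row [0, 0, 1]."""
    (a, b, tx), (c, d, ty) = affine
    return [[a, b, tx],
            [c, d, ty],
            [0.0, 0.0, 1.0]]


def apply_homography(h, point):
    """Map a 2D point through a 3x3 matrix using homogeneous coordinates."""
    x, y = point
    # For an embedded affine matrix w is always 1, so no perspective
    # division actually takes effect.
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)
```

The fixed last row is what removes the two perspective degrees of freedom, leaving six parameters instead of eight.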

    I used an approach based on affine transformation to merge maps produced by multiple robots [1] for my robotics project. It shows good results. However, as mentioned earlier, applications for this model are much broader than that.

    Parallelism for FeaturesFinder

    To make the stitching pipeline more comfortable to use and more performant for large numbers of images, I'd also like to improve FeaturesFinder to allow finding features in parallel. All camera models and other users of FeaturesFinder may benefit from that. The API could be similar to FeaturesMatcher::operator ()(features, pairwise_matches, mask).

    This could be done with TBB, in a similar manner to the FeaturesMatcher method mentioned above, which is already used in the stitching pipeline. There would be almost no additional overhead from starting new threads in typical scenarios, because these threads already exist for FeaturesMatcher. This change would be fully integrated into the high-level stitching interface.

    Some changes to the finders might be necessary to ensure thread safety. Where thread safety can't be ensured or does not make sense (GPU finders), parallelization would be disabled and all images would be processed serially, so this method would always be safe to use regardless of the underlying finder. This approach is also similar to FeaturesMatcher.
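
The parallel-with-serial-fallback behaviour described above could look roughly like the following. It is a Python sketch in which a thread pool stands in for TBB; `find_all_features`, `find_features` and the `thread_safe` flag are hypothetical names, not the API added by this PR.

```python
from concurrent.futures import ThreadPoolExecutor


def find_all_features(images, find_features, thread_safe=True, max_workers=4):
    """Run find_features on every image, in parallel only when that is safe.

    Results are returned in the same order as the input images.
    """
    if not thread_safe:
        # Serial path, e.g. for finders that cannot be called concurrently.
        return [find_features(img) for img in images]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map preserves the order of its inputs.
        return list(pool.map(find_features, images))
```

Callers get the same result list either way, which is what makes the method safe to use regardless of the underlying finder.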

    Benefits to OpenCV

    • New transform options for stitching pipeline
    • Performance improvements through parallel processing of image features

    implemented goals (all + extras)

    new camera model

    • [x] affine matcher
    • [x] affine estimator
    • [x] affine warper
    • [x] affine bundle adjusters
    • tests for affine stitching
      • [x] basic affine stitching integration test
      • [x] tests for affine matcher
      • [x] integration tests on real-world scans
      • [x] affine warper tests
      • [x] affine bundle adjusters tests
    • [x] integrating with high level API (Stitcher)
    • [x] stitching_detailed sample
    • [x] stitching simple sample

    parallel feature finding

    • [x] parallel API in feature finder
    • [x] tests (incl. perf tests)
    • [x] integrating with Stitcher

    implemented extras

    • [x] robust functions for affine transform estimations
      • [x] add support for least median robust method
      • [x] tests for LMEDS
      • [x] Levenberg–Marquardt algorithm-based refining for affine estimation functions
      • [x] tests and docs
      • [x] perf tests
    • [x] stitching tutorial for high level API
      • [x] add examples of running the samples on testdata in opencv_extra
    • [x] fix existing stitching tests with SURF (SURF was disabled)

    video

    short video presenting this project

    other work

    During this GSoC I have also coded some related work that is not going to be included (mostly because we have chosen a different approach or the work has been merged under this PR). It is listed here for completeness.

    PRs:

    • #6560
    • #6609
    • #6615
    • #6642

    commits:

    • eba30a89737d4ded755f07cff75fd861864cf09a
    • 150daa2dc57a258ba61a01e12901518b6b4d98e8
    GSoC 
    opened by hrnr 51
  • [GSoC] Add siamrpnpp.py

    GSoC '20 : Real-time Single Object Tracking using Deep Learning (SiamRPN++)

    Overview

    Proposal : https://summerofcode.withgoogle.com/projects/#4979746967912448
    Mentors : Liubov Batanina @l-bat, Stefano Fabri @bhack, Ilya Elizarov @ieliz
    Student : Jin Yeob Chung @jinyup100

    Details of the Pull Request

    • Export of the torch implementation of the SiamRPN++ visual tracker to ONNX
      • Please refer to (https://gist.github.com/jinyup100/7aa748686c5e234ed6780154141b4685) or Code to generate ONNX models at the bottom of this PR description
    • Addition of siamrpnpp.py in the opencv/samples/dnn repository
      • SiamRPN++ visual tracker can be performed on a sample video input
      • Parsers include:
        • --input_video path to sample video input
        • --target_net path to target branch of the visual tracker
        • --search_net path to search branch of the visual tracker
        • --rpn_head path to head of the visual tracker
        • --backend selection of the computation backend
        • --target selection of the computation target device
    • Additional samples of the visual tracker performed on videos are available at:
      • https://drive.google.com/drive/folders/1k7Z_SHaBWK_4aEQPxJJCGm3P7y2IFCjY?usp=sharing
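
As an illustration only, the options listed above could be declared with `argparse` as below. The flag names come from the list; the types, defaults and help strings are assumptions and may differ from the actual `siamrpnpp.py`.

```python
import argparse


def build_parser():
    """Build a parser for the sample's command-line options (sketch)."""
    parser = argparse.ArgumentParser(
        description="SiamRPN++ visual tracker sample (illustrative sketch)")
    parser.add_argument("--input_video", type=str,
                        help="path to sample video input")
    parser.add_argument("--target_net", type=str,
                        help="path to target branch of the visual tracker")
    parser.add_argument("--search_net", type=str,
                        help="path to search branch of the visual tracker")
    parser.add_argument("--rpn_head", type=str,
                        help="path to head of the visual tracker")
    # Assumption: backend and target are passed as integer identifiers.
    parser.add_argument("--backend", type=int, default=0,
                        help="selection of the computation backend")
    parser.add_argument("--target", type=int, default=0,
                        help="selection of the computation target device")
    return parser
```

Calling `build_parser().parse_args()` in the script's entry point would then expose each option as an attribute, for example `args.input_video`.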
    Examples

    Pull Request Readiness Checklist

    See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

    • [X] I agree to contribute to the project under OpenCV (BSD) License.
    • [X] To the best of my knowledge, the proposed patch is not based on a code under GPL or other license that is incompatible with OpenCV
    • [X] The PR is proposed to proper branch
    • [X] There is reference to original bug report and related work
    • [X] There is accuracy test, performance test and test data in opencv_extra repository, if applicable. Patch to opencv_extra has the same branch name.
    • [X] The feature is well documented and sample code can be built with the project CMake
    Code to generate ONNX Models

    The code shown below to generate the ONNX models of siamrpn++ is also available from : https://gist.github.com/jinyup100/7aa748686c5e234ed6780154141b4685

    The final version of the pre-trained weights and the ONNX models successfully converted with this code are available at:

    Pre-Trained Weights in pth Format https://drive.google.com/file/d/11bwgPFVkps9AH2NOD1zBDdpF_tQghAB-/view?usp=sharing

    Target Net : Import ✔️ Export ✔️ https://drive.google.com/file/d/1dw_Ne3UMcCnFsaD6xkZepwE4GEpqq7U_/view?usp=sharing

    Search Net : Import ✔️ Export ✔️ https://drive.google.com/file/d/1Lt4oE43ZSucJvze3Y-Z87CVDreO-Afwl/view?usp=sharing

    RPN_head : Import ✔️ Export ✔️ https://drive.google.com/file/d/1zT1yu12mtj3JQEkkfKFJWiZ71fJ-dQTi/view?usp=sharing

    import numpy as np
    import onnx
    import torch
    import torch.nn as nn
    
    # Class for the Building Blocks required for ResNet
    class Bottleneck(nn.Module):
        expansion = 4
    
        def __init__(self, inplanes, planes, stride=1,
                     downsample=None, dilation=1):
            super(Bottleneck, self).__init__()
            self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, bias=False)
            self.bn1 = nn.BatchNorm2d(planes)
            padding = 2 - stride
            if downsample is not None and dilation > 1:
                dilation = dilation // 2
                padding = dilation
    
            assert stride == 1 or dilation == 1, \
                "at least one of stride and dilation must be equal to 1"
    
            if dilation > 1:
                padding = dilation
            self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride,
                                   padding=padding, bias=False, dilation=dilation)
            self.bn2 = nn.BatchNorm2d(planes)
            self.conv3 = nn.Conv2d(planes, planes * 4, kernel_size=1, bias=False)
            self.bn3 = nn.BatchNorm2d(planes * 4)
            self.relu = nn.ReLU(inplace=True)
            self.downsample = downsample
            self.stride = stride
    
        def forward(self, x):
            residual = x
    
            out = self.conv1(x)
            out = self.bn1(out)
            out = self.relu(out)
    
            out = self.conv2(out)
            out = self.bn2(out)
            out = self.relu(out)
    
            out = self.conv3(out)
            out = self.bn3(out)
    
            if self.downsample is not None:
                residual = self.downsample(x)
    
            out += residual
    
            out = self.relu(out)
    
            return out
        
    # End of Building Blocks
    
    # Class for ResNet - the Backbone neural network
    
    class ResNet(nn.Module):
        "ResNET"
        def __init__(self, block, layers, used_layers):
            self.inplanes = 64
            super(ResNet, self).__init__()
            self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=0,  # 3
                                   bias=False)
            self.bn1 = nn.BatchNorm2d(64)
            self.relu = nn.ReLU(inplace=True)
            self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
            self.layer1 = self._make_layer(block, 64, layers[0])
            self.layer2 = self._make_layer(block, 128, layers[1], stride=2)
    
            self.feature_size = 128 * block.expansion
            self.used_layers = used_layers
            layer3 = 3 in used_layers
            layer4 = 4 in used_layers
    
            if layer3:
                self.layer3 = self._make_layer(block, 256, layers[2],
                                               stride=1, dilation=2)  # 15x15, 7x7
                self.feature_size = (256 + 128) * block.expansion
            else:
                self.layer3 = lambda x: x  # identity
    
            if layer4:
                self.layer4 = self._make_layer(block, 512, layers[3],
                                               stride=1, dilation=4)  # 7x7, 3x3
                self.feature_size = 512 * block.expansion
            else:
                self.layer4 = lambda x: x  # identity
    
            for m in self.modules():
                if isinstance(m, nn.Conv2d):
                    n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
                    m.weight.data.normal_(0, np.sqrt(2. / n))
                elif isinstance(m, nn.BatchNorm2d):
                    m.weight.data.fill_(1)
                    m.bias.data.zero_()
    
        def _make_layer(self, block, planes, blocks, stride=1, dilation=1):
            downsample = None
            dd = dilation
            if stride != 1 or self.inplanes != planes * block.expansion:
                if stride == 1 and dilation == 1:
                    downsample = nn.Sequential(
                        nn.Conv2d(self.inplanes, planes * block.expansion,
                                  kernel_size=1, stride=stride, bias=False),
                        nn.BatchNorm2d(planes * block.expansion),
                    )
                else:
                    if dilation > 1:
                        dd = dilation // 2
                        padding = dd
                    else:
                        dd = 1
                        padding = 0
                    downsample = nn.Sequential(
                        nn.Conv2d(self.inplanes, planes * block.expansion,
                                  kernel_size=3, stride=stride, bias=False,
                                  padding=padding, dilation=dd),
                        nn.BatchNorm2d(planes * block.expansion),
                    )
    
            layers = []
            layers.append(block(self.inplanes, planes, stride,
                                downsample, dilation=dilation))
            self.inplanes = planes * block.expansion
            for i in range(1, blocks):
                layers.append(block(self.inplanes, planes, dilation=dilation))
    
            return nn.Sequential(*layers)
    
        def forward(self, x):
            x = self.conv1(x)
            x = self.bn1(x)
            x_ = self.relu(x)
            x = self.maxpool(x_)
    
            p1 = self.layer1(x)
            p2 = self.layer2(p1)
            p3 = self.layer3(p2)
            p4 = self.layer4(p3)
            out = [x_, p1, p2, p3, p4]
            out = [out[i] for i in self.used_layers]
            if len(out) == 1:
                return out[0]
            else:
                return out
            
    # End of ResNet
    
    # Class for Adjusting the layers of the neural net
    
    class AdjustLayer_1(nn.Module):
        def __init__(self, in_channels, out_channels, center_size=7):
            super(AdjustLayer_1, self).__init__()
            self.downsample = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False),
                nn.BatchNorm2d(out_channels),
                )
            self.center_size = center_size
    
        def forward(self, x):
            x = self.downsample(x)
            l = 4
            r = 11
            x = x[:, :, l:r, l:r]
            return x
    
    class AdjustAllLayer_1(nn.Module):
        def __init__(self, in_channels, out_channels, center_size=7):
            super(AdjustAllLayer_1, self).__init__()
            self.num = len(out_channels)
            if self.num == 1:
                self.downsample = AdjustLayer_1(in_channels[0],
                                              out_channels[0],
                                              center_size)
            else:
                for i in range(self.num):
                    self.add_module('downsample'+str(i+2),
                                    AdjustLayer_1(in_channels[i],
                                                out_channels[i],
                                                center_size))
    
        def forward(self, features):
            if self.num == 1:
                return self.downsample(features)
            else:
                out = []
                for i in range(self.num):
                    adj_layer = getattr(self, 'downsample'+str(i+2))
                    out.append(adj_layer(features[i]))
                return out
            
    class AdjustLayer_2(nn.Module):
        def __init__(self, in_channels, out_channels, center_size=7):
            super(AdjustLayer_2, self).__init__()
            self.downsample = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False),
                nn.BatchNorm2d(out_channels),
                )
            self.center_size = center_size
    
        def forward(self, x):
            x = self.downsample(x)
            return x
    
    class AdjustAllLayer_2(nn.Module):
        def __init__(self, in_channels, out_channels, center_size=7):
            super(AdjustAllLayer_2, self).__init__()
            self.num = len(out_channels)
            if self.num == 1:
                self.downsample = AdjustLayer_2(in_channels[0],
                                              out_channels[0],
                                              center_size)
            else:
                for i in range(self.num):
                    self.add_module('downsample'+str(i+2),
                                    AdjustLayer_2(in_channels[i],
                                                out_channels[i],
                                                center_size))
    
        def forward(self, features):
            if self.num == 1:
                return self.downsample(features)
            else:
                out = []
                for i in range(self.num):
                    adj_layer = getattr(self, 'downsample'+str(i+2))
                    out.append(adj_layer(features[i]))
                return out
            
    # End of Class for Adjusting the layers of the neural net
    
    # Class for Region Proposal Neural Network
    
    class RPN(nn.Module):
        "Region Proposal Network"
        def __init__(self):
            super(RPN, self).__init__()
    
        def forward(self, z_f, x_f):
            raise NotImplementedError
            
    class DepthwiseXCorr(nn.Module):
        "Depthwise Correlation Layer"
        def __init__(self, in_channels, hidden, out_channels, kernel_size=3, hidden_kernel_size=5):
            super(DepthwiseXCorr, self).__init__()
            self.conv_kernel = nn.Sequential(
                    nn.Conv2d(in_channels, hidden, kernel_size=kernel_size, bias=False),
                    nn.BatchNorm2d(hidden),
                    nn.ReLU(inplace=True),
                    )
            self.conv_search = nn.Sequential(
                    nn.Conv2d(in_channels, hidden, kernel_size=kernel_size, bias=False),
                    nn.BatchNorm2d(hidden),
                    nn.ReLU(inplace=True),
                    )
            self.head = nn.Sequential(
                    nn.Conv2d(hidden, hidden, kernel_size=1, bias=False),
                    nn.BatchNorm2d(hidden),
                    nn.ReLU(inplace=True),
                    nn.Conv2d(hidden, out_channels, kernel_size=1)
                    )
            
        def forward(self, kernel, search):    
            kernel = self.conv_kernel(kernel)
            search = self.conv_search(search)
            
            feature = xcorr_depthwise(search, kernel)
            
            out = self.head(feature)
            
            return out
    
    class DepthwiseRPN(RPN):
        def __init__(self, anchor_num=5, in_channels=256, out_channels=256):
            super(DepthwiseRPN, self).__init__()
            self.cls = DepthwiseXCorr(in_channels, out_channels, 2 * anchor_num)
            self.loc = DepthwiseXCorr(in_channels, out_channels, 4 * anchor_num)
    
        def forward(self, z_f, x_f):
            cls = self.cls(z_f, x_f)
            loc = self.loc(z_f, x_f)
            
            return cls, loc
    
    class MultiRPN(RPN):
        def __init__(self, anchor_num, in_channels):
            super(MultiRPN, self).__init__()
            for i in range(len(in_channels)):
                self.add_module('rpn'+str(i+2),
                        DepthwiseRPN(anchor_num, in_channels[i], in_channels[i]))
            self.weight_cls = nn.Parameter(torch.Tensor([0.38156851768108546, 0.4364767608115956,  0.18195472150731892]))
            self.weight_loc = nn.Parameter(torch.Tensor([0.17644893463361863, 0.16564198028417967, 0.6579090850822015]))
    
        def forward(self, z_fs, x_fs):
            cls = []
            loc = []
            
            rpn2 = self.rpn2
            z_f2 = z_fs[0]
            x_f2 = x_fs[0]
            c2,l2 = rpn2(z_f2, x_f2)
            
            cls.append(c2)
            loc.append(l2)
            
            rpn3 = self.rpn3
            z_f3 = z_fs[1]
            x_f3 = x_fs[1]
            c3,l3 = rpn3(z_f3, x_f3)
            
            cls.append(c3)
            loc.append(l3)
            
            rpn4 = self.rpn4
            z_f4 = z_fs[2]
            x_f4 = x_fs[2]
            c4,l4 = rpn4(z_f4, x_f4)
            
            cls.append(c4)
            loc.append(l4)
            
            def avg(lst):
                return sum(lst) / len(lst)
    
            def weighted_avg(lst, weight):
                # Weighted sum over the three RPN levels (rpn2, rpn3, rpn4)
                s = 0
                for i in range(3):
                    s += lst[i] * weight[i]
                return s
    
            return weighted_avg(cls, self.weight_cls), weighted_avg(loc, self.weight_loc)
            
    # End of class for RPN
    
    def conv3x3(in_planes, out_planes, stride=1, dilation=1):
        "3x3 convolution with padding"
        return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
                         padding=dilation, bias=False, dilation=dilation)
    
    def xcorr_depthwise(x, kernel):
        """
        Depthwise cross-correlation: each channel of x is correlated with the matching channel of kernel
        """
        batch = kernel.size(0)
        channel = kernel.size(1)
        x = x.view(1, batch*channel, x.size(2), x.size(3))
        kernel = kernel.view(batch*channel, 1, kernel.size(2), kernel.size(3))
        conv = nn.Conv2d(batch*channel, batch*channel, kernel_size=(kernel.size(2), kernel.size(3)), bias=False, groups=batch*channel)
        conv.weight = nn.Parameter(kernel)
        out = conv(x) 
        out = out.view(batch, channel, out.size(2), out.size(3))
        out = out.detach()
        return out
    
    class TargetNetBuilder(nn.Module):
        def __init__(self):
            super(TargetNetBuilder, self).__init__()
            # Build Backbone Model
            self.backbone = ResNet(Bottleneck, [3,4,6,3], [2,3,4])
            # Build Neck Model
            self.neck = AdjustAllLayer_1([512,1024,2048], [256,256,256])
        
        def forward(self, frame):
            features = self.backbone(frame)
            output = self.neck(features)
            return output
    
    class SearchNetBuilder(nn.Module):
        def __init__(self):
            super(SearchNetBuilder, self).__init__()
            # Build Backbone Model
            self.backbone = ResNet(Bottleneck, [3,4,6,3], [2,3,4])
            # Build Neck Model
            self.neck = AdjustAllLayer_2([512,1024,2048], [256,256,256])
            
        def forward(self, frame):
            features = self.backbone(frame)
            output = self.neck(features)
            return output
     
    class RPNBuilder(nn.Module):
        def __init__(self):
            super(RPNBuilder, self).__init__()
    
            # Build Adjusted Layer Builder
            self.rpn_head = MultiRPN(anchor_num=5,in_channels=[256, 256, 256])
    
        def forward(self, zf, xf):
            # Get Feature
            cls, loc = self.rpn_head(zf, xf)
    
            return cls, loc
        
    # load_path must point to the pre-trained siamrpn_r50_l234_dwxcorr.pth checkpoint.
    # The download link for siamrpn_r50_l234_dwxcorr.pth is given in the description.
    
    current_path = os.getcwd()
    load_path = os.path.join(current_path, "siamrpn_r50_l234_dwxcorr.pth")
    pretrained_dict = torch.load(load_path,map_location=torch.device('cpu') )
    pretrained_dict_backbone = pretrained_dict
    pretrained_dict_neck_1 = pretrained_dict
    pretrained_dict_neck_2 = pretrained_dict
    pretrained_dict_head = pretrained_dict
    pretrained_dict_target = pretrained_dict
    pretrained_dict_search = pretrained_dict
    
    # The shape of the inputs to the Target Network and the Search Network
    target = torch.Tensor(np.random.rand(1,3,127,127))
    search = torch.Tensor(np.random.rand(1,3,125,125))
    
    # Build the torch target net model
    target_net = TargetNetBuilder()
    target_net.eval()
    target_net.state_dict().keys()
    target_net_dict = target_net.state_dict()
    
    # Load the pre-trained weight to the torch target net model
    pretrained_dict_target = {k: v for k, v in pretrained_dict_target.items() if k in target_net_dict}
    target_net_dict.update(pretrained_dict_target)
    target_net.load_state_dict(target_net_dict)
    
    # Export the torch target net model to ONNX model
    torch.onnx.export(target_net, torch.Tensor(target), "target_net.onnx", export_params=True, opset_version=11,
                      do_constant_folding=True, input_names=['input'], output_names=['output_1', 'output_2', 'output_3'])
    
    # Load the saved torch target net model using ONNX
    onnx_target = onnx.load("target_net.onnx")
    
    # Validate the exported target net model; check_model raises if the graph is invalid
    onnx.checker.check_model(onnx_target)
    print(onnx.helper.printable_graph(onnx_target.graph))
    
    # Build the torch search net model
    search_net = SearchNetBuilder()
    search_net.eval()
    search_net.state_dict().keys()
    search_net_dict = search_net.state_dict()
    
    # Load the pre-trained weights to the torch search net model
    pretrained_dict_search = {k: v for k, v in pretrained_dict_search.items() if k in search_net_dict}
    search_net_dict.update(pretrained_dict_search)
    search_net.load_state_dict(search_net_dict)
    
    # Export the torch search net model to ONNX model
    torch.onnx.export(search_net, torch.Tensor(search), "search_net.onnx", export_params=True, opset_version=11,
                      do_constant_folding=True, input_names=['input'], output_names=['output_1', 'output_2', 'output_3'])
    
    # Load the saved torch search net model using ONNX
    onnx_search = onnx.load("search_net.onnx")
    
    # Validate the exported search net model; check_model raises if the graph is invalid
    onnx.checker.check_model(onnx_search)
    print(onnx.helper.printable_graph(onnx_search.graph))
    
    # Outputs from the Target Net and Search Net
    zfs_1, zfs_2, zfs_3 = target_net(torch.Tensor(target))
    xfs_1, xfs_2, xfs_3 = search_net(torch.Tensor(search))
    
    # Adjustments to the outputs from each of the neck models to match to input shape of the torch rpn_head model
    zfs = np.stack([zfs_1.detach().numpy(), zfs_2.detach().numpy(), zfs_3.detach().numpy()])
    xfs = np.stack([xfs_1.detach().numpy(), xfs_2.detach().numpy(), xfs_3.detach().numpy()])
    
    # Build the torch rpn_head model
    rpn_head = RPNBuilder()
    rpn_head.eval()
    rpn_head.state_dict().keys()
    rpn_head_dict = rpn_head.state_dict()
    
    # Load the pre-trained weights to the rpn_head model
    pretrained_dict_head = {k: v for k, v in pretrained_dict_head.items() if k in rpn_head_dict}
    pretrained_dict_head.keys()
    rpn_head_dict.update(pretrained_dict_head)
    rpn_head.load_state_dict(rpn_head_dict)
    rpn_head.eval()
    
    # Export the torch rpn_head model to ONNX model
    torch.onnx.export(rpn_head, (torch.Tensor(np.random.rand(*zfs.shape)), torch.Tensor(np.random.rand(*xfs.shape))), "rpn_head.onnx", export_params=True, opset_version=11,
                      do_constant_folding=True, input_names = ['input_1', 'input_2'], output_names = ['output_1', 'output_2'])
    
    # Load the saved rpn_head model using ONNX
    onnx_rpn_head_model = onnx.load("rpn_head.onnx")
    
    # Validate the exported rpn_head model; check_model raises if the graph is invalid
    onnx.checker.check_model(onnx_rpn_head_model)
    print(onnx.helper.printable_graph(onnx_rpn_head_model.graph))
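
The `xcorr_depthwise` helper in the script above folds batch and channel into one axis and runs a grouped `nn.Conv2d`, so each channel of the search features is correlated only with the matching channel of the target features. As a rough reference for what that computes, here is a minimal pure-Python sketch. It assumes a single sample, channels-first nested lists, and no padding or stride; the name `xcorr_depthwise_ref` is hypothetical and not part of the PR.

```python
def xcorr2d(search, kernel):
    """Valid-mode 2-D cross-correlation of one channel (lists of rows, no kernel flip)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(search) - kh + 1
    out_w = len(search[0]) - kw + 1
    return [
        [
            sum(search[i + u][j + v] * kernel[u][v]
                for u in range(kh) for v in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def xcorr_depthwise_ref(search, kernel):
    """Correlate channel c of `search` only with channel c of `kernel` (hypothetical helper)."""
    if len(search) != len(kernel):
        raise ValueError("search and kernel need the same channel count")
    return [xcorr2d(s, k) for s, k in zip(search, kernel)]


if __name__ == "__main__":
    search = [
        [[1, 2, 3], [4, 5, 6], [7, 8, 9]],   # channel 0
        [[1, 1, 1], [1, 1, 1], [1, 1, 1]],   # channel 1
    ]
    kernel = [
        [[1, 0], [0, 1]],                    # channel 0
        [[1, 1], [1, 1]],                    # channel 1
    ]
    print(xcorr_depthwise_ref(search, kernel))
```

The output keeps the channel count of the inputs, which is why the `head` convolutions in `DepthwiseXCorr` can then map it to `2 * anchor_num` or `4 * anchor_num` channels.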
    
    
    GSoC category: dnn 
    opened by jinyup100 49
  • Refactor core module for type-safety

    Refactor core module for type-safety

    Merge with opencv/opencv_contrib#1768 and opencv/opencv_extra#518

    This pull request changes

    • Provides a backward-compatible API based on enums for type safety. Refer to #12288 for further details

    Utilized options/preprocessors:

    • CV_TYPE_SAFE_API with CMake option ENABLE_TYPE_SAFE_API to enable the enum-based type-safe API
    • CV_TYPE_COMPATIBLE_API with CMake option ENABLE_COMPATIBLE_API to enable the overloaded int-based API. Although it is enabled by default and using it raises deprecation warnings, it is still recommended to disable it in the build farm to enforce good practices. CV_COMPATIBLE_API is only available when ENABLE_TYPE_SAFE_API is set.
    • CV_TRANSNATIONAL_API is used internally and should be removed after the migration completes.
    force_builders=Custom,linux32,win32,windows_vc15,Android pack,ARMv7,ARMv8
    docker_image:Custom=ubuntu-cuda:16.04
    docker_image:Docs=docs-js
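
The PR itself is C++, but the idea of an enum-based entry point that sits next to a deprecated int-based one can be sketched in Python as a loose analogy. Everything here is hypothetical (`ElemDepth`, `create_mat`, the returned tuple); it only illustrates the pattern of accepting the enum silently while warning on raw integers.

```python
import enum
import warnings


class ElemDepth(enum.IntEnum):
    """Hypothetical enum standing in for raw integer depth codes."""
    U8 = 0
    F32 = 5


def create_mat(depth):
    """Accept the enum (type-safe path) or a raw int (compatible path, deprecated)."""
    # IntEnum members are also ints, so the enum check has to come first.
    if isinstance(depth, ElemDepth):
        return ("mat", depth)
    if isinstance(depth, int):
        warnings.warn("int-based depth is deprecated; pass ElemDepth",
                      DeprecationWarning, stacklevel=2)
        return ("mat", ElemDepth(depth))
    raise TypeError("depth must be ElemDepth or int")
```

Disabling the compatible path, as the description recommends for the build farm, would correspond to dropping the `int` branch so that raw integers are rejected outright.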
    
    opened by cv3d 48
  • DNN: fused depthwise and add

    DNN: fused depthwise and add

    Merge with test data: https://github.com/opencv/opencv_extra/pull/1034 Related issue: https://github.com/opencv/opencv/issues/23074

    In the previous optimization, we fused the Conv and Add layers. This PR further provides support for Depth-wise Conv and Add layers fusion.

    Pull Request Readiness Checklist

    See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

    • [x] I agree to contribute to the project under Apache 2 License.
    • [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
    • [x] The PR is proposed to the proper branch
    • [ ] There is a reference to the original bug report and related work
    • [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name.
    • [ ] The feature is well documented and sample code can be built with the project CMake
    bug category: dnn 
    opened by zihaomu 0
  • fix openmp include and link issue on macos

    fix openmp include and link issue on macos

    Fixes https://github.com/opencv/opencv/issues/23092

    Pull Request Readiness Checklist

    See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

    • [x] I agree to contribute to the project under Apache 2 License.
    • [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
    • [x] The PR is proposed to the proper branch
    • [x] There is a reference to the original bug report and related work
    • [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name.
    • [x] The feature is well documented and sample code can be built with the project CMake
    opened by fengyuentau 0
  • Generated CmakeModule.config file can not use

    Generated CmakeModule.config file can not use

    System Information

    Darwin_x86/opencv4.6.0/lib/cmake/opencv4/OpenCVModules.cmake:174 (message):
      The imported target "libprotobuf" references the file
    
         "//opencv4.6.0/lib/opencv4/3rdparty/liblibprotobuf.a"
    
      but this file does not exist.  Possible reasons include:
    

    Apparently I disabled protobuf, so why does the generated CMake module config still need protobuf?

    Detailed description

    Darwin_x86/opencv4.6.0/lib/cmake/opencv4/OpenCVModules.cmake:174 (message):
      The imported target "libprotobuf" references the file
    
         "//opencv4.6.0/lib/opencv4/3rdparty/liblibprotobuf.a"
    
      but this file does not exist.  Possible reasons include:
    

    Apparently I disabled protobuf, so why does the generated CMake module config still need protobuf?

    Steps to reproduce

    Darwin_x86/opencv4.6.0/lib/cmake/opencv4/OpenCVModules.cmake:174 (message):
      The imported target "libprotobuf" references the file
    
         "//opencv4.6.0/lib/opencv4/3rdparty/liblibprotobuf.a"
    
      but this file does not exist.  Possible reasons include:
    

    Apparently I disabled protobuf, so why does the generated CMake module config still need protobuf?

    Issue submission checklist

    • [X] I report the issue, it's not a question
    • [X] I checked the problem with documentation, FAQ, open issues, forum.opencv.org, Stack Overflow, etc and have not found any solution
    • [X] I updated to the latest OpenCV version and the issue is still there
    • [X] There is reproducer code and related data files (videos, images, onnx, etc)
    bug 
    opened by jinfagang 0
  • Cannot find omp.h and link libomp on macOS with -DWITH_OPENMP=ON

    Cannot find omp.h and link libomp on macOS with -DWITH_OPENMP=ON

    System Information

    OpenCV version: 4.7.0 OS: macOS 13.0.1 on Apple M1 Compiler: Apple clang version 14.0.0 (clang-1400.0.29.202) Target: arm64-apple-darwin22.1.0

    libomp is installed via brew.

    Detailed description

    modules/core/src/parallel.cpp:123:14: fatal error: 'omp.h' file not found
        #include <omp.h>
    

    Steps to reproduce

    git clone https://github.com/opencv/opencv.git && cd opencv
    cmake -B build -DWITH_OPENMP=ON -DBUILD_ZLIB=OFF .
    cmake --build build --target install -j 6
    

    Issue submission checklist

    • [X] I report the issue, it's not a question
    • [X] I checked the problem with documentation, FAQ, open issues, forum.opencv.org, Stack Overflow, etc and have not found any solution
    • [X] I updated to the latest OpenCV version and the issue is still there
    • [X] There is reproducer code and related data files (videos, images, onnx, etc)
    bug 
    opened by fengyuentau 0
  • Can not found zlib 1.2.9

    Can not found zlib 1.2.9

    System Information

    OpenCV python version: 4.7.0.68 & 4.6.0.66 Operating System / Platform: Centos 7 Python version: 3.9.2

    Detailed description

    I installed opencv-python==4.7.0.68 and got:

    [2023/01/04 10:30:57.244] Traceback (most recent call last):
    [2023/01/04 10:30:57.244]   File "/home/xxx/source_code/xxx/OCR/pse/test.py", line 7, in <module>
    [2023/01/04 10:30:57.244]     import cv2
    [2023/01/04 10:30:57.244]   File "/usr/local/python3.9/lib/python3.9/site-packages/cv2/__init__.py", line 181, in <module>
    [2023/01/04 10:30:57.244]     bootstrap()
    [2023/01/04 10:30:57.244]   File "/usr/local/python3.9/lib/python3.9/site-packages/cv2/__init__.py", line 153, in bootstrap
    [2023/01/04 10:30:57.244]     native_module = importlib.import_module("cv2")
    [2023/01/04 10:30:57.244]   File "/usr/local/python3.9/lib/python3.9/importlib/__init__.py", line 127, in import_module
    [2023/01/04 10:30:57.244]     return _bootstrap._gcd_import(name[level:], package, level)
    [2023/01/04 10:30:57.244] ImportError: /lib64/libz.so.1: version `ZLIB_1.2.9' not found (required by /usr/local/python3.9/lib/python3.9/site-packages/cv2/../opencv_python.libs/libpng16-186fce2e.so.16.37.0)
    

    However, in the same environment, it works well with opencv-python==4.6.0.66.

    Steps to reproduce

    pip install opencv-python==4.7.0.68 and run import cv2
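
Before importing cv2, a quick diagnostic can show which zlib version the interpreter has loaded. This is only a sketch: it assumes Python's own `zlib` module is linked against the same system `libz` that the traceback mentions, which may not hold on every installation. The `(1, 2, 9)` threshold is taken from the `ZLIB_1.2.9` symbol version in the error message; `version_tuple` and `satisfies` are hypothetical helper names.

```python
import re
import zlib

REQUIRED = (1, 2, 9)  # from "version `ZLIB_1.2.9' not found" in the traceback


def version_tuple(text):
    """Turn '1.2.7' or '1.2.13.1-motley' into a tuple of its leading integer fields."""
    fields = []
    for part in text.split("."):
        match = re.match(r"\d+", part)
        if match is None:
            break
        fields.append(int(match.group()))
    return tuple(fields)


def satisfies(runtime_version, required=REQUIRED):
    """True when the given zlib version string is at least `required`."""
    return version_tuple(runtime_version) >= required


if __name__ == "__main__":
    runtime = zlib.ZLIB_RUNTIME_VERSION
    print("zlib runtime:", runtime, "meets 1.2.9:", satisfies(runtime))
```

If the reported runtime version is older than 1.2.9, that would be consistent with the `ImportError` above.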

    Issue submission checklist

    • [X] I report the issue, it's not a question
    • [X] I checked the problem with documentation, FAQ, open issues, forum.opencv.org, Stack Overflow, etc and have not found any solution
    • [X] I updated to the latest OpenCV version and the issue is still there
    • [X] There is reproducer code and related data files (videos, images, onnx, etc)
    bug 
    opened by wqh17101 2
  • VideoWriter fails silently when given frame that differs from the frameSize specified

    VideoWriter fails silently when given frame that differs from the frameSize specified

    System Information

    opencv-python==4.7.0.68
    Python 3.10.9
    Arch Linux
    

    Detailed description

    VideoWriter fails silently when the size of the frame passed to VideoWriter().write() differs from the frameSize specified in the constructor.

    I would expect an error, but instead, the written video will not play, which is hard to trace back to a frame size issue.

    Related:
    https://stackoverflow.com/questions/75000164/cant-play-video-created-with-opencv-videowriter

    Steps to reproduce

    import cv2
    
    
    FPS = 30
    KEY_ESC = 27
    OUTPUT_FILE = "vid.mp4"
    
    cam = cv2.VideoCapture(0)
    
    codec = cv2.VideoWriter.fourcc(*"mp4v") # MPEG-4 http://mp4ra.org/#/codecs
    frame_size = cam.read()[1].shape[:2] # wrong (see SO link)
    video_writer = cv2.VideoWriter(OUTPUT_FILE, codec, FPS, frame_size)
    
    # record until user exits with ESC
    while True:
        success, image = cam.read()
        cv2.imshow("window", image)
    
        video_writer.write(image)
    
        if cv2.waitKey(5) == KEY_ESC:
            break
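
The mismatch in the reproducer comes from axis order: a NumPy image has shape (height, width, channels), while the `frameSize` argument of `cv2.VideoWriter` is (width, height). Until the writer reports this itself, a small guard on the caller's side can surface the problem early. The helpers below are a hypothetical sketch, not part of OpenCV, and operate on plain shape tuples so they run without a camera.

```python
def frame_size_from_shape(shape):
    """Convert an image shape (rows, cols[, channels]) to (width, height)."""
    height, width = shape[:2]
    return (width, height)


def check_frame_shape(shape, frame_size):
    """Raise ValueError when a frame's shape does not match the writer's (width, height)."""
    actual = frame_size_from_shape(shape)
    expected = tuple(frame_size)
    if actual != expected:
        raise ValueError(
            f"frame is {actual} (width, height), writer expects {expected}"
        )
```

In the reproducer this would mean building `frame_size` with `frame_size_from_shape(cam.read()[1].shape)` and calling `check_frame_shape(image.shape, frame_size)` just before `video_writer.write(image)`.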
    

    Issue submission checklist

    • [X] I report the issue, it's not a question
    • [X] I checked the problem with documentation, FAQ, open issues, forum.opencv.org, Stack Overflow, etc and have not found any solution
    • [X] I updated to the latest OpenCV version and the issue is still there
    • [X] There is reproducer code and related data files (videos, images, onnx, etc)
    bug 
    opened by mcp292 2
Releases(4.7.0)
https://arxiv.org/abs/1904.01941

Character-Region-Awareness-for-Text-Detection- https://arxiv.org/abs/1904.01941 Train You can train SynthText data use python source/train_SynthText.p

DayDayUp 120 Dec 28, 2022
Code for paper "Role-based network embedding via structural features reconstruction with degree-regularized constraint"

Role-based network embedding via structural features reconstruction with degree-regularized constraint Train python main.py --dataset brazil-flights

wang zhang 1 Jun 28, 2022
MONAI Label is a server-client system that facilitates interactive medical image annotation by using AI.

MONAI Label is a server-client system that facilitates interactive medical image annotation by using AI. It is an open-source and easy-to-install ecosystem that can run locally on a machine with one

Project MONAI 344 Dec 23, 2022
A tool combining EasyOCR and LaMa to automatically detect text and replace it with an inpainted background.

EasyLaMa (WIP) This is a tool combining EasyOCR and LaMa to automatically detect text and replace it with an inpainted background. Installation For GP

3 Sep 17, 2022
Learning Camera Localization via Dense Scene Matching, CVPR2021

This repository contains code of our CVPR 2021 paper - "Learning Camera Localization via Dense Scene Matching" by Shitao Tang, Chengzhou Tang, Rui Hua

tangshitao 65 Dec 01, 2022
Erosion and dialation using structure element in OpenCV python

Erosion and dialation using structure element in OpenCV python

Tamzid hasan 2 Nov 11, 2021
Textboxes : Image Text Detection Model : python package (tensorflow)

shinTB Abstract A python package for use Textboxes : Image Text Detection Model implemented by tensorflow, cv2 Textboxes Paper Review in Korean (My Bl

Jayne Shin (신재인) 91 Dec 15, 2022
EAST for ICPR MTWI 2018 Challenge II (Text detection of network images)

EAST_ICPR2018: EAST for ICPR MTWI 2018 Challenge II (Text detection of network images) Introduction This is a repository forked from argman/EAST for t

QichaoWu 49 Dec 24, 2022
Framework for the Complete Gaze Tracking Pipeline

Framework for the Complete Gaze Tracking Pipeline The figure below shows a general representation of the camera-to-screen gaze tracking pipeline [1].

Pascal 20 Jan 06, 2023
A community-supported supercharged version of paperless: scan, index and archive all your physical documents

Paperless-ngx Paperless-ngx is a document management system that transforms your physical documents into a searchable online archive so you can keep,

5.2k Jan 04, 2023
The code of "Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes"

Mask TextSpotter A Pytorch implementation of Mask TextSpotter along with its extension can be find here Introduction This is the official implementati

Pengyuan Lyu 261 Nov 21, 2022
Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and Table Structure Recognition:

Multi-Type-TD-TSR Check it out on Source Code of our Paper: Multi-Type-TD-TSR Extracting Tables from Document Images using a Multi-stage Pipeline for

Pascal Fischer 178 Dec 27, 2022
A small C++ implementation of LSTM networks, focused on OCR.

clstm CLSTM is an implementation of the LSTM recurrent neural network model in C++, using the Eigen library for numerical computations. Status and sco

Tom 794 Dec 30, 2022
Generic framework for historical document processing

dhSegment dhSegment is a tool for Historical Document Processing. Its generic approach allows to segment regions and extract content from different ty

Digital Humanities Laboratory 343 Dec 24, 2022
ARU-Net - Deep Learning Chinese Word Segment

ARU-Net: A Neural Pixel Labeler for Layout Analysis of Historical Documents Contents Introduction Installation Demo Training Introduction This is the

128 Sep 12, 2022
Characterizing possible failure modes in physics-informed neural networks.

Characterizing possible failure modes in physics-informed neural networks This repository contains the PyTorch source code for the experiments in the

Aditi Krishnapriyan 55 Jan 02, 2023
DouZero is a reinforcement learning framework for DouDizhu (DouDizhu AI)

[ICML 2021] DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning | DouDizhu AI

Kwai 3.1k Jan 05, 2023
Using python libraries to track hands

Python-HandTracking Using python libraries to track hands on a camera Uses cv2 and mediapipe libraries custom hand tracking module PyCharm IDE Final E

Martin Matsudaira 1 Dec 17, 2021
Awesome Spectral Indices in Python.

Awesome Spectral Indices in Python: Numpy | Pandas | GeoPandas | Xarray | Earth Engine | Planetary Computer | Dask GitHub: https://github.com/davemlz/

David Montero Loaiza 98 Jan 02, 2023
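First place in the 1st Xi'an Jiaotong University Artificial Intelligence Practice Competition (2018 AI Practice Competition: image text recognition); recognizes text in images using only DenseNet

OCR Champion of the 1st Xi'an Jiaotong University Artificial Intelligence Practice Competition (2018 AI Practice Competition: image text recognition) Model results The competition computes an F1 score for each entry and averages over all entries; the exact calculation is described here. The calculation used here does not count repeated characters within a sentence more than once, so the F1 score is lower than the final submitted result: - train val f1score 0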

尹畅 441 Dec 22, 2022