Videohash is a Python library for detecting near-duplicate videos (Perceptual Video Hashing).
Given any video as input, this package generates a 64-bit hash value.
The hash values of identical or near-duplicate videos are the same or similar. If a video is resized (upscaled/downscaled), transcoded, or slightly cropped, if a watermark or black bars are added or removed, or if its color, frame rate, or aspect ratio is changed, the hash value should stay the same or change only slightly.
- One frame is extracted from the input video every second. The frames are shrunk to 144x144 pixel squares and assembled into a collage that contains all of the resized frames. The wavelet hash of the collage is the video hash value for the original input video.
- Videohash cannot be used to verify whether one video is a part of another (video fingerprinting). If the video is reversed or rotated by a substantial angle (greater than 10 degrees), Videohash will not produce the same or a similar hash value. You can, however, reverse the video manually and generate the hash value for the reversed video.
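The first step described above (sampling one frame per second and shrinking it to 144x144) can be sketched as an FFmpeg invocation. This is a minimal illustration, not necessarily the command Videohash runs internally: the helper name `build_frame_extraction_command`, the output file pattern, and the example paths are assumptions made for this sketch. Only the standard FFmpeg `-i` and `-vf` options with the `fps` and `scale` filters are used.

```python
import shutil
from pathlib import Path


def build_frame_extraction_command(video_path, output_dir, frame_size=144, fps=1):
    """Return an FFmpeg argument list that samples and resizes frames.

    Hypothetical helper for illustration; not part of the videohash API.
    """
    # Numbered output files, e.g. frame_0000001.jpeg, frame_0000002.jpeg, ...
    output_pattern = str(Path(output_dir) / "frame_%07d.jpeg")
    return [
        "ffmpeg",
        "-i", str(video_path),
        # fps=1 keeps one frame per second; scale forces a square frame.
        "-vf", f"fps={fps},scale={frame_size}:{frame_size}",
        output_pattern,
    ]


command = build_frame_extraction_command("rocket.mkv", "frames")
print(" ".join(command))

# The command is only built here, not executed; FFmpeg must be on PATH to run it.
if shutil.which("ffmpeg") is None:
    print("FFmpeg not found on PATH; install it before running the command.")
```

The resulting list can be passed to `subprocess.run` once the output directory exists. Building the collage and computing its wavelet hash are separate steps that are not shown here.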
To use this software, you must have FFmpeg installed. Please read how to install FFmpeg if you don't already know how.
Install from PyPI:
pip install videohash
Or install directly from the GitHub repository:
pip install git+https://github.com/akamhy/videohash.git
In the following usage example, the first three instances of the VideoHash class compute the hash for the same video (the same content, not files with the same checksum), and the last one uses a different video.
videohash1 is the video at https://www.youtube.com/watch?v=PapBjpzRhnA.
videohash2 is a downscaled copy of https://www.youtube.com/watch?v=PapBjpzRhnA stored in a Matroska Multimedia Container (.mkv).
videohash3 is the same video as videohash2 but on local storage.
videohash4 uses a completely different video at https://www.youtube.com/watch?v=_T8cn2J13-4.
>>> from videohash import VideoHash
>>> # video: Artemis I Hot Fire Test
>>> url1 = "https://www.youtube.com/watch?v=PapBjpzRhnA"
>>> videohash1 = VideoHash(url=url1, download_worst=False)
>>>
>>> videohash1.hash # video hash value of the file, value is same as str(videohash1)
'0b0011010000011111111011111111111110001111011110000000000000000000'
>>>
>>> # video: Artemis I Hot Fire Test
>>> url2 = "https://raw.githubusercontent.com/akamhy/videohash/main/assets/rocket.mkv"
>>> videohash2 = VideoHash(url=url2)
>>> videohash2.hash
'0b0011010000011111111011111111111110001111011110000000000000000000'
>>> videohash2.hash_hex
'0x341fefff8f780000'
>>> videohash1 - videohash2
0
>>> videohash1 == videohash2
True
>>> videohash1 == "0b0011010000011111111011111111111110001111011110000000000000000000"
True
>>> videohash1 != videohash2
False
>>> path3 = "/home/akamhy/Downloads/rocket.mkv"  # video: Artemis I Hot Fire Test
>>> videohash3 = VideoHash(path=path3)
>>> videohash3.hash
'0b0011010000011111111011111111111110001111011110000000000000000000'
>>> videohash3 - videohash2
0
>>> videohash3 == videohash1
True
>>> url4 = "https://www.youtube.com/watch?v=_T8cn2J13-4"  # video: How We Are Going to the Moon
>>> videohash4 = VideoHash(url=url4)
>>> videohash4.hash_hex
'0x7cffff000000eff0'
>>> videohash4 - "0x7cffff000000eff0"
0
>>> videohash4.hash
'0b0111110011111111111111110000000000000000000000001110111111110000'
>>> videohash4 - videohash2
34
>>> videohash4 != videohash2
True
Run the above code @ https://replit.com/@akamhy/videohash-usage-2xx-example-code-for-video-hashing#main.py
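The differences reported in the usage example (0 and 34) are consistent with counting the bit positions at which two 64-bit hashes differ, that is, the Hamming distance. The sketch below reproduces those numbers from the printed hash strings using only the standard library. `hamming_distance` is a hypothetical helper written for this illustration and is not part of the videohash API.

```python
def hamming_distance(hash_a, hash_b):
    """Count differing bits between two hashes given as '0b...' or '0x...' strings.

    Hypothetical helper for illustration; not part of the videohash API.
    """
    # int(..., 0) infers the base from the '0b' or '0x' prefix.
    value_a = int(hash_a, 0)
    value_b = int(hash_b, 0)
    # XOR leaves a 1 wherever the two values disagree; count those bits.
    return bin(value_a ^ value_b).count("1")


# Hash strings copied from the usage example.
rocket_bin = "0b0011010000011111111011111111111110001111011110000000000000000000"
rocket_hex = "0x341fefff8f780000"
moon_hex = "0x7cffff000000eff0"

print(hamming_distance(rocket_bin, rocket_hex))  # 0
print(hamming_distance(moon_hex, rocket_hex))    # 34
```

A distance of 0 means the two hashes are identical. What counts as "similar enough" for larger distances depends on the application and is not specified here.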
Extended Usage : https://github.com/akamhy/videohash/wiki/Extended-Usage
API Reference : https://github.com/akamhy/videohash/wiki/API-Reference
Released under the MIT License. See license for details.
The VideoHash logo was created by iconolocode. See license for details.
Videos are from NASA and are in the public domain.
NASA copyright policy states that "NASA material is not protected by copyright unless noted".