Activity 12: Video Processing
A video is basically made up of many images shown consecutively to a viewer, who perceives the motion as smooth, thus making it a movie. Note that humans can only perceive the motion as smooth if more than about 15 images (referred to as frames) are shown every second.

Each video has a frame rate, which is the number of successive images shown per second. Most cameras record at around 30 frames per second (fps), that is, 1/30 second for each image.

In the previous experiments, we applied processing only to single images. With a video, we can include another dimension in the measurement: time. Therefore, using video, we can extract the dynamics or kinematics of a system.

Given a video, we can extract the images using Avidemux 2.5. It can simply be used by:

- opening the video via Avidemux
- then File --> Save --> Save Selection of JPEG images
- using a filename of choice for saving

The experiment was the dropping of a ball from a specific height. Note that the background contains a reference scale, shown in Figure 1, where the spacing between tick marks is equivalent to 5 cm. The consecutive frames, showing an orange ball dropped against a white and black background, are shown in Figure 2.

We can simply apply image segmentation, similar to the previous activity. We use the image of the ball in Figure 3 as the patch to be segmented, since the frames extracted from the video are blurred.

After applying segmentation, we get a blob of the ball, as shown in Figure 4. From here, we can extract its pixel position using morphological operations. One trick I used was to identify the largest blob in the image and denote it as the ball. After identifying which part is the ball, the centroid of this blob was taken as its pixel position.

Applying this to the whole set of images, we obtain a pixel position of the ball at every frame (i.e., every 1/30 s). We can then plot the y-position of the ball as a function of time, as shown in Figure 5.

Recall that we initially placed a scale in the background to recover the actual height. The pixel-to-length ratio is 25 pixels = 5 centimeters. Using this scale, I can convert the y-pixel position into the real height in cm; this is shown in Figure 6.

For an image, the (0, 0) origin is at the top left, which means that a larger y-pixel value corresponds to a lower position of the ball. To correct this, I shifted the positions and set the point where the ball landed on the floor as zero. The new graph of the y-position as a function of time is shown in Figure 7.

Now, we can isolate the part from the moment the ball was released at the top until it dropped on the floor, as shown in Figure 8.

This graph is distance as a function of time. From kinematics, free fall from rest follows d = (g/2)t², so a quadratic fit gives g/2 directly.
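As an alternative to Avidemux, the frames can also be dumped from the command line. A sketch using ffmpeg (the input filename here is hypothetical):

```shell
# Dump every frame of ball_drop.mp4 as numbered JPEGs.
# -qscale:v 2 keeps the JPEG quality high; frame_%04d.jpg numbers the frames.
ffmpeg -i ball_drop.mp4 -qscale:v 2 frame_%04d.jpg
```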
![]() |
| Figure 1. Background with reference tick marks for pixel-to-length conversion; the tick spacing is 5 cm |
![]() |
| Figure 2. Consecutive frames of the falling ball |
We can simply apply image segmentation which is similar to that of the previous activity. We can use the image of the ball below to be segmented since there is a blur effect on the images extracted from video.
![]() |
| Figure 3. Ball image used for segmentation |
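The segmentation step can be sketched in pure Python as a chromaticity threshold: flag pixels whose normalized red value r = R/(R+G+B) is close to that of the orange patch. The threshold values below are illustrative, not the ones actually used:

```python
def segment_orange(img, r_lo=0.45, r_hi=0.75):
    """Binarize an RGB image (nested lists of (R, G, B) tuples):
    1 where the pixel's normalized red chromaticity falls in
    [r_lo, r_hi], else 0. Thresholds here are illustrative."""
    mask = []
    for row in img:
        mask_row = []
        for (R, G, B) in row:
            s = R + G + B
            r = R / s if s else 0.0          # guard against black pixels
            mask_row.append(1 if r_lo <= r <= r_hi else 0)
        mask.append(mask_row)
    return mask

# Tiny synthetic frame: one orange-ish pixel on a white and black background.
frame = [[(255, 255, 255), (230, 120, 30)],
         [(0, 0, 0),       (255, 255, 255)]]
print(segment_orange(frame))   # only the orange pixel survives
```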
![]() |
| Figure 4. Segmented images of the orange ball being dropped. |
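The largest-blob trick can be sketched as follows; this is a pure-Python stand-in (function name is mine) for the morphological labeling tools used:

```python
def largest_blob_centroid(mask):
    """Flood-fill 4-connected blobs of 1s in a binary mask, keep the
    largest, and return the (row, col) centroid of its pixels."""
    rows, cols = len(mask), len(mask[0])
    seen = set()
    best = []                                   # pixels of largest blob so far
    for i in range(rows):
        for j in range(cols):
            if mask[i][j] and (i, j) not in seen:
                stack, blob = [(i, j)], []
                seen.add((i, j))
                while stack:                    # iterative flood fill
                    y, x = stack.pop()
                    blob.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            stack.append((ny, nx))
                if len(blob) > len(best):
                    best = blob
    # centroid = mean of the blob's pixel coordinates
    cy = sum(p[0] for p in best) / len(best)
    cx = sum(p[1] for p in best) / len(best)
    return cy, cx

mask = [[1, 0, 0, 0],
        [0, 0, 1, 1],
        [0, 0, 1, 1]]
print(largest_blob_centroid(mask))   # the 2x2 blob wins over the lone pixel
```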
![]() |
| Figure 5. Y-position (pixels) of the falling ball as a function of time |
![]() |
| Figure 6. Y-position (cm) of the falling ball as a function of time |
![]() |
| Figure 7. Shifted Y-position (cm) of the falling ball as a function of time |
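The pixel-to-cm conversion and the origin flip amount to one line each. A minimal sketch, using the 25 px = 5 cm scale from the background; `y_floor_px`, the pixel row where the ball lands, is an assumed measurement:

```python
CM_PER_PX = 5 / 25          # 25 pixels span 5 cm on the background scale

def pixel_to_height_cm(y_px, y_floor_px):
    """Convert an image-row position (origin at top-left, y grows
    downward) to height in cm above the floor row."""
    return (y_floor_px - y_px) * CM_PER_PX

print(pixel_to_height_cm(100, 350))   # 250 px above the floor -> 50.0 cm
```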
![]() |
| Figure 8. Y-position (cm) as a function of time, from release until the ball hits the floor |
Now, from the line fit, we can surmise that the quadratic coefficient corresponds to g/2. The fitted value is 0.446, and comparing this to the accepted value of 0.49 in the same units (g ≈ 9.8 m/s²) gives an error of about 9%. This is still acceptable.
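The through-the-origin quadratic fit can be reproduced with a one-parameter least-squares estimate: minimizing the squared residuals of d = a·t² gives a = Σ t²d / Σ t⁴. A pure-Python sketch on synthetic free-fall data (490 cm/s² is g/2 expressed in cm):

```python
def fit_half_g(times, dists):
    """Least-squares estimate of a in the no-intercept model d = a*t**2:
    minimizing sum((d_i - a*t_i**2)**2) gives a = sum(t^2 d) / sum(t^4)."""
    num = sum(t * t * d for t, d in zip(times, dists))
    den = sum(t ** 4 for t in times)
    return num / den

# Synthetic drop sampled at 30 fps: d = (g/2) t^2 with g/2 = 490 cm/s^2
ts = [n / 30 for n in range(1, 10)]
ds = [490 * t * t for t in ts]
print(fit_half_g(ts, ds))   # recovers g/2 = 490, up to float rounding
```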
Yey! This is our final activity. For this, I give myself a 10/10. Thanks, Eloisa and Venice!








