This post serves as my journal for a project I’m currently going through. I’ve found myself wanting something to record and hold my thoughts as the project moves along, and I’m starting to realise how important it is to keep some kind of research/engineering journal when doing projects of this type.
A bit about the layout
I’m using vim/kitty, a Python virtual environment, and Label Studio so far. My data is based on my friend’s sparring footage, as they have a few short videos which are basic: perfect for a quick prototype.
Some resources
- I’ve found About MMA-AI to be one of the few projects on this subject. Their reflection on the hardest part of their project gave some insight into when I should look into Feature Engineering & Selection.
Label Studio
After a long jam session with music, I’ve finally created at least one fully annotated video of my friends sparring. I chose to focus on only basic striking movements, a basic combo block, and a clinch block. I tried to have the model see the footwork involved in some movements: hopefully it’s enough to see something interesting.
After I was done with the annotation, I was met with the choice of what to export it as.
Basically, JSON is great for packages built around video classification, whereas YOLOv8 OBB is meant for object detection. I didn’t really think this far ahead when creating the dataset, but due to the way I had annotated the data, I’m choosing to go with YOLOv8 OBB.
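For context, a YOLOv8 OBB label file (as I understand the format) has one line per box: a class index followed by the four corner points of the oriented box, normalised to the image size. Something like this, with made-up numbers:

0 0.42 0.31 0.58 0.33 0.56 0.74 0.40 0.72
1 0.12 0.55 0.28 0.55 0.28 0.91 0.12 0.91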
This somewhat blew my original code apart, but that’s how projects like this go, really.
I’ve decided to just go with JSON for now. I can always re-export it later.
Now it’s time to split up the data- Actually, I’m switching everything up.
I realised I was really just doing object detection before, which isn’t bad, but for the kind of results I want I need something like YOLO for its pose estimation. I didn’t know of this before, which is why I didn’t just start with that.
Of course, the issue now is that I have to re-create my data, which I’ll be doing by using YOLOv8-Pose to automatically generate keypoint annotations.
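As a rough sketch of that extraction step (using Ultralytics’ pretrained pose checkpoint; the video and output paths here are placeholders for my own):

import numpy as np
from ultralytics import YOLO

# Pretrained COCO pose model; downloads the weights on first use
model = YOLO("yolov8n-pose.pt")

keypoints_per_frame = []
# stream=True yields results one frame at a time instead of loading the whole video
for result in model("sparring.mp4", stream=True):
    if result.keypoints is not None:
        # (num_people, 17, 2) array of (x, y) pixel coordinates for this frame
        keypoints_per_frame.append(result.keypoints.xy.cpu().numpy())

# Object array, since the number of detected people can vary per frame
np.save("keypoints.npy", np.array(keypoints_per_frame, dtype=object))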
So far, I’ve extracted the keypoints the model detected in order to use them in my own model, but I’m running into issues again. It seems the training script is unable to find the batches of keypoint chunks. I’m not sure why this is, as the correct directories are set in both scripts. I imagine it’s due to how the training script is calling the PoseDataGenerator class within the generator script: honestly, I’m not even sure how the two connect besides my importing the class into the training script. I think it’s best to look at the two right now.
The training script
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout
import sys

sys.path.append('/home/nate/MMA/src/utils/')
from generator import PoseDataGenerator  # Import the generator class

# Define the path to your chunk directory
chunk_dir = "/home/nate/MMA/data/chunks/"  # Update with the correct path

# Create the training data generator
train_generator = PoseDataGenerator(chunk_dir=chunk_dir, batch_size=32)

# Define the LSTM model
model = Sequential([
    # input_shape is pulled from the first batch of the generator
    LSTM(128, activation='relu', input_shape=(train_generator[0][0].shape[1], train_generator[0][0].shape[2]), return_sequences=True),
    Dropout(0.2),
    LSTM(64, activation='relu', return_sequences=False),
    Dropout(0.2),
    Dense(32, activation='relu'),
    Dense(2, activation='softmax')  # Assuming 2 classes for classification
])

# Compile and train the model (the compile settings here are placeholders;
# the fit call is the one the later traceback points at)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(train_generator, epochs=10, steps_per_epoch=len(train_generator))

When running this script, I get the error:
Traceback (most recent call last):
  File "/home/nate/MMA/src/train/train_model.py", line 16, in <module>
    LSTM(128, activation='relu', input_shape=(train_generator[0][0].shape[1], train_generator[0][0].shape[2]), return_sequences=True),
                                              ~~~~~~~~~~~~~~~^^^
  File "/home/nate/MMA/src/utils/generator.py", line 24, in __getitem__
    X = np.load(os.path.join(self.chunk_dir, self.X_files[i]))
                                             ~~~~~~~~~~~~^^^
IndexError: list index out of range

The error it flings is correct. With batch_size set to 32 in the script, the generator’s __getitem__ loops i from 0 to 31 and tries to load 32 X/y pairs per batch, but I only have six pairs of X & y chunks, so self.X_files[6] doesn’t exist and it throws this error.
The generator script
import os
import numpy as np
from tensorflow.keras.utils import Sequence

class PoseDataGenerator(Sequence):
    def __init__(self, chunk_dir="/home/nate/MMA/data/chunks", batch_size=32):
        self.chunk_dir = chunk_dir
        self.batch_size = batch_size
        self.X_files = [f for f in os.listdir(chunk_dir) if f.startswith('X_chunk') and f.endswith('.npy')]
        self.y_files = [f for f in os.listdir(chunk_dir) if f.startswith('y_chunk') and f.endswith('.npy')]
        self.X_files.sort()
        self.y_files.sort()

    def __len__(self):
        # Return the number of batches per epoch
        return int(np.floor(len(self.X_files) / self.batch_size))

    def __getitem__(self, index):
        # Load the data for the batch
        batch_X = []
        batch_y = []
        for i in range(index * self.batch_size, (index + 1) * self.batch_size):
            X = np.load(os.path.join(self.chunk_dir, self.X_files[i]))
            y = np.load(os.path.join(self.chunk_dir, self.y_files[i]))
            batch_X.append(X)
            batch_y.append(y)
        return np.array(batch_X), np.array(batch_y)

Running this script alone doesn’t run into any issues. Although it’s simply a script defining a class, I figured it’d still be important to test it in case the error was in there.
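I won’t walk through the whole fix, but the core of it is to stop assuming there are at least batch_size chunk pairs. Here’s a minimal sketch of the idea (a reconstruction rather than my exact code): round the batch count up in __len__ and clamp the range in __getitem__ so the final batch can be smaller:

import math

# Replacements for the two methods in PoseDataGenerator above
def __len__(self):
    # Round up so a final partial batch still counts as a batch:
    # with 6 files and batch_size=32 this returns 1 instead of 0
    return math.ceil(len(self.X_files) / self.batch_size)

def __getitem__(self, index):
    start = index * self.batch_size
    end = min(start + self.batch_size, len(self.X_files))  # never index past the end
    batch_X = [np.load(os.path.join(self.chunk_dir, f)) for f in self.X_files[start:end]]
    batch_y = [np.load(os.path.join(self.chunk_dir, f)) for f in self.y_files[start:end]]
    return np.array(batch_X), np.array(batch_y)

With only six chunk pairs and batch_size=32, this yields a single batch holding all six chunks.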
Oh but one more error.
After fixing the mismatch between the batch size and the number of chunk pairs, I’m now getting this error when running the training script:
Epoch 1/10
Traceback (most recent call last):
  File "/home/nate/MMA/src/train/train_model.py", line 28, in <module>
    model.fit(train_generator, epochs=10, steps_per_epoch=len(train_generator))
  File "/home/nate/test/lib/python3.11/site-packages/keras/src/utils/traceback_utils.py", line 122, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/nate/test/lib/python3.11/site-packages/keras/src/models/functional.py", line 272, in _adjust_input_rank
    raise ValueError(
ValueError: Exception encountered when calling Sequential.call().
Invalid input shape for input Tensor("data:0", shape=(None, 1000, 10, 136), dtype=float32). Expected shape (None, 1000, 10), but input has incompatible shape (None, 1000, 10, 136)
Arguments received by Sequential.call():
  • inputs=tf.Tensor(shape=(None, 1000, 10, 136), dtype=float32)
  • training=True
  • mask=None

As the error implies, the generator is handing the model four-dimensional arrays, which isn’t the (batch, timesteps, features) shape an LSTM expects. I can either change the shape the model expects, or have the generator serve up individual samples from the chunks instead of the complete chunks. Since I want my model to learn from movements done by a person, it’d be better to split each chunk into its own training samples based on the sequences within it. The following edited generator script will show how it’s done more technically, and I’ll explain it afterwards as well.
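As a rough preview of the core change (a sketch on my part, not the final script; the traceback shows each X chunk is shaped (1000, 10, 136), i.e. 1000 sequences of 10 timesteps with 136 keypoint features), the generator can hand back each chunk as a batch of individual sequences rather than one giant sample:

# Replacements for the two methods in PoseDataGenerator above
def __len__(self):
    # One batch per chunk file: each chunk already holds many sequences
    return len(self.X_files)

def __getitem__(self, index):
    # A (1000, 10, 136) chunk becomes a batch of 1000 samples of shape
    # (timesteps=10, features=136), which is what the LSTM expects
    X = np.load(os.path.join(self.chunk_dir, self.X_files[index]))
    y = np.load(os.path.join(self.chunk_dir, self.y_files[index]))
    return X, y

The model’s input_shape would then be (10, 136), the shape of one sequence, instead of being derived from the chunk dimensions.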
Some open-ended questions until I get there
- If CNNs can automatically extract some set of features, could I perform feature extraction on top of them, and when would I need to do that? If that’s the case, I’d assume annotating the data in the way I personally watch fights would get something out of it- I’m beginning to question the method, or at least the frame of thought, while annotating my data. Some of the problems arise when I consider that this process should be like annotating in such a way that you could show it to a human friend and they would then understand what MMA is. Along this line, you’d think they’d first need to understand that MMA fights involve more than one person; so, does the model need to know that as well? I don’t think so, but I’d suggest acting like an annoying friend that keeps pointing things out.
- Reminder to look into RL and all the rest, given the DeepSeek-R1 buzz.
- Lots of hand fighting in southpaw versus orthodox (lead hands are on the same side), as they’re trying to
- Could I pit footwork annotations against striking moves using RL or something?
- Could a switch of stance affect annotating, and thus learning?
- Cross-validation could be a good way to get a dirty generalization.