Week 8

I started off this week by training neural networks using my labeled small swarm as training data. Getting this going took some time because the larger volume of data kept crashing my notebook, but I was eventually able to modify my setup to handle the increased data size. I trained various networks throughout the rest of the week; their specifications and results are listed below:
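
As an illustration of the general idea behind keeping memory usage manageable, a lazy-loading dataset along these lines only holds one training cube in memory at a time; the file paths, shapes, and names below are placeholders rather than my actual setup.

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader

class CubeDataset(Dataset):
    """Loads one (volume, label) cube pair from disk per item,
    so the full training set never has to sit in memory at once."""

    def __init__(self, cube_paths, label_paths):
        # hypothetical lists of .npy files: one file per training cube
        # and one per matching set of voxel-wise bee/not-bee labels
        self.cube_paths = cube_paths
        self.label_paths = label_paths

    def __len__(self):
        return len(self.cube_paths)

    def __getitem__(self, idx):
        volume = np.load(self.cube_paths[idx]).astype(np.float32)
        labels = np.load(self.label_paths[idx]).astype(np.float32)
        # add a channel dimension: (1, D, H, W), as a 3D UNet expects
        return torch.from_numpy(volume)[None], torch.from_numpy(labels)[None]

# small batches keep peak memory low even with larger cubes, e.g.:
# loader = DataLoader(CubeDataset(cube_paths, label_paths), batch_size=2, shuffle=True)
```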

Modification 9:

  • Keeping the original channel numbers, I used batch normalization and an early stopping patience of 4 (a sketch of this patience-based early stopping follows this list)
  • The small swarm was divided into 147 cubes with 0 overlap to use for training data
  • The network stopped training at epoch 10/1000 with a training loss of 0.0523 and a validation loss of 0.0717
  • When tested on an unseen cube of data (my original labeled cube), this network had a loss of 0.246
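
For context, the patience-based early stopping I'm referring to works roughly like the sketch below; the model, data loaders, and loss function are placeholders, and this is an illustration of the logic rather than my exact training loop.

```python
import copy
import torch

def train_with_early_stopping(model, train_loader, val_loader, loss_fn,
                              optimizer, max_epochs=1000, patience=4):
    """Stop training once validation loss hasn't improved for `patience` epochs."""
    best_val, best_state, epochs_since_best = float("inf"), None, 0

    for epoch in range(1, max_epochs + 1):
        model.train()
        for volumes, labels in train_loader:
            optimizer.zero_grad()
            loss = loss_fn(model(volumes), labels)
            loss.backward()
            optimizer.step()

        # track validation loss at the end of every epoch
        model.eval()
        val_loss, n_batches = 0.0, 0
        with torch.no_grad():
            for volumes, labels in val_loader:
                val_loss += loss_fn(model(volumes), labels).item()
                n_batches += 1
        val_loss /= max(n_batches, 1)

        if val_loss < best_val:
            best_val, epochs_since_best = val_loss, 0
            best_state = copy.deepcopy(model.state_dict())
        else:
            epochs_since_best += 1
            if epochs_since_best >= patience:
                break  # e.g. patience=4 stopped Modification 9 at epoch 10

    # restore the weights from the best validation epoch
    if best_state is not None:
        model.load_state_dict(best_state)
    return model
```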

Modification 10:

  • Keeping the original channel numbers, I used batch normalization and an early stopping patience of 50
  • The small swarm was divided into 147 cubes with 0 overlap to use for training data
  • The network stopped training at epoch 80/1000 with a training loss of 0.0203 and a validation loss of 0.0551
  • When tested on the unseen cube of data, this network had a loss of 0.275

Modification 11:

  • Keeping the original channel numbers, I used batch normalization and an early stopping patience of 25
  • The small swarm was divided into 400 cubes with 10 voxels overlap to use for training data (a sketch of this cube extraction follows this list)
  • The network stopped training at epoch 80/1000 with a training loss of 0.0133 and a validation loss of 0.0337
  • When tested on the unseen cube of data, this network had a loss of 0.584
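
The cube splitting works roughly like the sketch below; the 64-voxel cube size is just a placeholder for my actual cube dimensions. Stepping the window by the cube size minus the overlap is what makes neighboring cubes share voxels.

```python
import numpy as np

def extract_cubes(volume, cube_size=64, overlap=10):
    """Slide a cube_size^3 window over a 3D volume, stepping by
    (cube_size - overlap) voxels so neighboring cubes share `overlap` voxels."""
    step = cube_size - overlap
    cubes = []
    for z in range(0, volume.shape[0] - cube_size + 1, step):
        for y in range(0, volume.shape[1] - cube_size + 1, step):
            for x in range(0, volume.shape[2] - cube_size + 1, step):
                cubes.append(volume[z:z + cube_size,
                                    y:y + cube_size,
                                    x:x + cube_size])
    return np.stack(cubes)

# With overlap=0 the cubes tile the volume without sharing voxels (the
# 147-cube split); a positive overlap produces more, partially redundant
# cubes (the 400-cube split with 10 voxels of overlap).
```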

Modification 12:

  • I doubled the original channel numbers and used an early stopping patience of 25 (a sketch of this channel scaling follows this list)
  • The small swarm was divided into 147 cubes with 0 overlap to use for training data
  • The network stopped training at epoch 80/1000 with a training loss of 0.0197 and a validation loss of 0.0600
  • When tested on the unseen cube of data, this network had a loss of 0.392
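
"Doubling the channel numbers" just means widening every convolutional block of the UNet. Combined with the optional batch normalization used in several of these modifications, a single block looks roughly like the sketch below; the base channel widths here are placeholders, not my actual architecture.

```python
import torch.nn as nn

def double_conv(in_ch, out_ch, use_batchnorm=True):
    """Two 3x3x3 convolutions, optionally with BatchNorm3d, as in a 3D UNet block."""
    layers = []
    for cin, cout in [(in_ch, out_ch), (out_ch, out_ch)]:
        layers.append(nn.Conv3d(cin, cout, kernel_size=3, padding=1))
        if use_batchnorm:
            layers.append(nn.BatchNorm3d(cout))
        layers.append(nn.ReLU(inplace=True))
    return nn.Sequential(*layers)

# "original" vs. doubled encoder widths (placeholder numbers):
base_channels = [16, 32, 64]
doubled_channels = [c * 2 for c in base_channels]
first_block = double_conv(1, doubled_channels[0])  # single-channel input volume
```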

Modification 13:

  • I doubled the original channel numbers, and I removed my implementation of early stopping, though I continued to track validation loss at each epoch
  • The small swarm was divided into 400 cubes with 10 voxels overlap to use for training data
  • After epoch 1000/1000, the network had a training loss of 0.0000 and a validation loss of 0.2514
  • When tested on the unseen cube of data, this network had a loss of 0.695

Modification 14:

  • I doubled the original channel numbers, used batch normalization, and an early stopping patience of 25
  • The small swarm was divided into 400 cubes with 10 voxels overlap to use for training data
  • The network stopped training at epoch 80/1000 with a training loss of 0.0124 and a validation loss of 0.0294
  • When tested on the unseen cube of data, this network had a loss of 0.518

Modification 15:

  • Keeping the original channel numbers, I used an early stopping patience of 25
  • The small swarm was divided into 400 cubes with 10 voxels overlap to use for training data
  • The network stopped training at epoch 70/1000 with a training loss of 0.0109 and a validation loss of 0.0541
  • When tested on the unseen cube of data, this network had a loss of 0.451

Modification 16:

  • Keeping the original channel numbers, I used an early stopping patience of 5
  • The small swarm was divided into 400 cubes with 10 voxels overlap to use for training data
  • The network stopped training at epoch 30/1000 with a training loss of 0.0314 and a validation loss of 0.0365
  • When tested on the unseen cube of data, this network had a loss of 0.425 (a sketch of this held-out evaluation follows this list)
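
All of the "unseen cube" numbers above come from running each trained network on my original labeled cube, which was never part of the training or validation data. The evaluation amounts to something like the sketch below; the binary cross-entropy-on-logits loss is an assumption for illustration, not necessarily the exact loss behind these numbers.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def evaluate_on_cube(model, cube, labels):
    """Compute the loss on a single held-out labeled cube.

    `cube` and `labels` are (D, H, W) tensors; the loss is assumed to be
    binary cross-entropy on the network's raw (pre-sigmoid) output.
    """
    model.eval()
    logits = model(cube[None, None])  # add batch and channel dimensions
    return F.binary_cross_entropy_with_logits(logits, labels[None, None]).item()

# e.g. test_loss = evaluate_on_cube(trained_net, original_cube, original_labels)
```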

Looking at the “Residuals: bee-notbee minus labels” figure for the cubes, it may seem like these neural networks are performing worse than the ones trained on just the cube. However, their losses are generally lower, and, unlike the original networks I trained, these networks have never seen any data from this cube. I therefore think these networks perform better on unseen data, and that the original networks were slightly overfit to the cube they were both trained and tested on. It’s also important to note that the UNet’s binary output, which determines whether each voxel is a bee or not, depends on a threshold (in this case 0.5). I think this threshold might be too high, since in the “UNet Output” figure I can see many lighter areas that are not picked up. Next week I plan to re-test these networks with a lower threshold to see whether it helps identify the smaller or lighter-colored bees, or whether it instead causes separate bees to be merged together.
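
To make the threshold idea concrete, the binarization step and the kind of sweep I have in mind look roughly like this; the sigmoid assumes the network outputs logits, and the threshold values below 0.5 are just examples of what I might try.

```python
import torch

@torch.no_grad()
def binarize_output(model, cube, threshold=0.5):
    """Turn the UNet's per-voxel output into a bee / not-bee mask.

    Assumes the network outputs logits, so a sigmoid maps them to [0, 1]
    before comparing against the threshold.
    """
    model.eval()
    probs = torch.sigmoid(model(cube[None, None]))[0, 0]
    return probs >= threshold

# Compare how much of the volume is labeled "bee" as the threshold drops;
# lower thresholds should pick up the lighter voxels, at the risk of
# merging neighboring bees into one connected region.
# for t in (0.5, 0.4, 0.3, 0.2):
#     mask = binarize_output(trained_net, original_cube, threshold=t)
#     print(t, mask.float().mean().item())
```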

Alongside training the networks, I presented my research progress to the rest of the lab for the first time. The presentation went well, and the lab members were excited about the possibility of a neural network that could give them access to more information about the inner workings of bee swarms. They also asked numerous questions and offered several suggestions that gave me ideas for future directions. After the presentation, I began to look into some of these, but for confidentiality reasons I can’t discuss them here.

Written on July 26, 2024