[Interactive panel: Task demonstration. Views: Test input grid, Test output grid, AO Agent Inner (Q) state, AO Agent Predicted (Z) Output.]