Hello,

I am getting the error: "Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException:15"

There is definitely something I am doing wrong, but I cannot figure out what. Any help will be appreciated. The error in the output points to line 119: if(R[currentState][action] > -1){. I tried changing it, but with no success.

package q.learning;

import java.util.Random;

public class class1
{
    private static final int Q_SIZE = 16;
    private static final double GAMMA = 0.8;
    private static final int ITERATIONS = 30;
    private static final int INITIAL_STATES[] = new int[] {6, 7, 8, 9, 10, 11, 12,13,14,15,1, 3, 5, 2, 4, 0}; 
    private static final int R[][] = new int[][] {{-1,  7, -1, -1,  2, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1}, 
                                                  {-1, -1,  15, -1, -1,  2, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1}, 
                                                  {-1, -1, -1,  50, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1}, 
                                                  {-1, -1, -1, -1, -1, -1, -1,  7, -1, -1, -1, -1, -1, -1, -1, -1}, 
                                                  {-1, -1, -1, -1, -1,  2, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1}, 
                                                  {-1, -1, -1, -1, -1, -1, -1, -1, -1, -1,  2, -1, -1, -1, -1, -1},
                                                  {-1, -1, -1, -1, -1, -1, -1,  9, -1, -1, -1, -1, -1, -1, -1,100},
                                                  {-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,  5, -1, -1, -1, -1},
                                                  {-1, -1, -1, -1, -1, -1, -1, -1, -1,  8, -1, -1, -1, -1, -1, -1},
                                                  {-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1},
                                                  {-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,  8, -1, -1, -1, -1},
                                                  {-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,100},
                                                  {-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1},
                                                  {-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1},
                                                  {-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 250}};

    private static int q[][] = new int[Q_SIZE][Q_SIZE];
    private static int currentState = 0;

    private static void train()
    {
        initialize();

        // Perform training, starting at all initial states.
        for(int j = 0; j < ITERATIONS; j++)
        {
            for(int i = 0; i < Q_SIZE; i++)
            {
                episode(INITIAL_STATES[i]);
            } // i
        } // j

        System.out.println("Q Matrix values:");
        for(int i = 0; i < Q_SIZE; i++)
        {
            for(int j = 0; j < Q_SIZE; j++)
            {
                System.out.print(q[i][j] + ",\t");
            } // j
            System.out.print("\n");
        } // i
        System.out.print("\n");

        return;
    }

    private static void test()
    {
        // Perform tests, starting at all initial states.
        System.out.println("Shortest routes from initial state:");
        for(int i = 0; i < Q_SIZE; i++)
        {
            currentState = INITIAL_STATES[i];
            int newState = 0;
            do
            {
                newState = maximum(currentState, true);
                System.out.print(currentState + ", ");
                currentState = newState;
            }while(currentState < 15);
            System.out.print("15\n");
        }

        return;
    }

    private static void episode(final int initialState)
    {
        currentState = initialState;

        // Travel from state to state until goal state is reached.
        do
        {
            chooseAnAction();
        }while(currentState == 15);

        // When currentState = 15, Run through the set once more for convergence.
        for(int i = 0; i < Q_SIZE; i++)
        {
            chooseAnAction();
        }
        return;
    }

    private static void chooseAnAction()
    {
        int possibleAction = 0;

        // Randomly choose a possible action connected to the current state.
        possibleAction = getRandomAction(Q_SIZE);

        if(R[currentState][possibleAction] >= 0){
            q[currentState][possibleAction] = reward(possibleAction);
            currentState = possibleAction;
        }
        return;
    }

    private static int getRandomAction(final int upperBound)
    {
        int action = 0;
        boolean choiceIsValid = false;

        // Randomly choose a possible action connected to the current state.
        while(choiceIsValid == false)
        {
            // Get a random value between 0(inclusive) and 16(exclusive).
            action = new Random().nextInt(upperBound);
            if(R[currentState][action] > -1){
                choiceIsValid = true;
            }
        }

        return action;
    }

    private static void initialize()
    {
        for(int i = 0; i < Q_SIZE; i++)
        {
            for(int j = 0; j < Q_SIZE; j++)
            {
                q[i][j] = 0;
            } // j
        } // i
        return;
    }

    private static int maximum(final int State, final boolean ReturnIndexOnly)
    {
        // If ReturnIndexOnly = True, the Q matrix index is returned.
        // If ReturnIndexOnly = False, the Q matrix value is returned.
        int winner = 0;
        boolean foundNewWinner = false;
        boolean done = false;

        while(!done)
        {
            foundNewWinner = false;
            for(int i = 0; i < Q_SIZE; i++)
            {
                if(i != winner){             // Avoid self-comparison.
                    if(q[State][i] > q[State][winner]){
                        winner = i;
                        foundNewWinner = true;
                    }
                }
            }

            if(foundNewWinner == false){
                done = true;
            }
        }

        if(ReturnIndexOnly == true){
            return winner;
        }else{
            return q[State][winner];
        }
    }

    private static int reward(final int Action)
    {
        return (int)(R[currentState][Action] + (GAMMA * maximum(Action, false))); //formula
    }

    public static void main(String[] args)
    {
        train();
        test();
        return;
    }

}

Q_SIZE is 16, which you use to get a value for action.
So you get a random action in the range 0 - 15, which you use as an index into the array.
But the array seems to have only 15 elements, i.e. [0] to [14].

Thanks James, however I see 16 elements in my array, or maybe I am looking at the wrong thing?

The exception relates to the R[][] array, which looks like [15][16] to me - so forget what I said about action, this is a problem with currentState==15. As before, you use the random value to set currentState (line 104), which you then use as the first index. A quick print statement just before the offending array reference will confirm the value of currentState.
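A quick way to settle the "15 or 16" question is to let the program report the array's shape instead of counting rows by eye. This is only a minimal sketch: the class name DimensionCheck, the helper dimensions, and the stand-in array r are made up for illustration, and r just mimics the shape of R as posted.

```java
final class DimensionCheck {
    // Returns "ROWSxCOLS" for a rectangular 2D int array.
    static String dimensions(int[][] matrix) {
        int cols = matrix.length == 0 ? 0 : matrix[0].length;
        return matrix.length + "x" + cols;
    }

    public static void main(String[] args) {
        // Stand-in with the same shape as R in the posted code (assumption: 15 rows of 16).
        int[][] r = new int[15][16];

        System.out.println("R is " + dimensions(r));
        System.out.println("Valid first index: 0.." + (r.length - 1));
        System.out.println("Valid second index: 0.." + (r[0].length - 1));
    }
}
```

In the real program the same two length expressions (R.length and R[0].length) can be printed once at the start of main.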

OK, I am confused now. I get this when printing the state:

run:
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 15
6677777777777777777777777777777777777777771111111111111111111111111111111115    at q.learning.class1.getRandomAction(class1.java:123)

I also changed this line, but it still does not work. I get the same error.

      }while(currentState == 15);

Stop changing things at random. Understand why you get the error before trying to fix it.

Line 100 - you get a random number 0 - 15 (i.e. up to Q_SIZE - 1)
Line 104 - you assign that value to currentState
Line 119 - you use currentState as the first index for R. Valid values for the first index of R are 0 - 14 because R is a 15x16 array. If currentState is 15, you get an array index out of bounds.

I don't know why you did that, or what you intended, but what you did causes the Exception.
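To make that chain of events concrete, here is a small sketch of a bounds check that could sit in front of the array access while debugging. The names OutOfBoundsDemo, inBounds, and rewards are hypothetical, and the 15x16 array is only a stand-in for R as posted.

```java
final class OutOfBoundsDemo {
    // Returns true if rewards[state][action] can be read without an exception.
    static boolean inBounds(int[][] rewards, int state, int action) {
        return state >= 0 && state < rewards.length
            && action >= 0 && action < rewards[state].length;
    }

    public static void main(String[] args) {
        int[][] rewards = new int[15][16]; // same shape as the posted R (assumption)

        // Last cell that really exists: row 14, column 15.
        System.out.println("state 14, action 15 ok? " + inBounds(rewards, 14, 15));

        // currentState == 15 asks for a 16th row, which is not there.
        System.out.println("state 15, action 0 ok?  " + inBounds(rewards, 15, 0));

        try {
            int value = rewards[15][0];
            System.out.println("read " + value);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("caught ArrayIndexOutOfBoundsException");
        }
    }
}
```

The check does not fix anything by itself; it only makes the failing index pair visible before the exception is thrown.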

Sorry, I did not see that my array was 15 x 16. I have changed the R array and made it 16 x 16. I no longer get any errors, but my output is blank, which is not what I am expecting.

OK. The only thing to do now is to put in a whole load of debugging print statements so you can narrow down exactly where it's going wrong.

Alright, I did as you said. Most of my print statements are showing, apart from the one in the loop from line 35. Seems that there is something wrong there but again I am not sure what.

One more thing: when I set ITERATIONS to 0, I get the correct output. But when I change this value, the program runs forever and does not display anything.

I can't really comment on your logic because I have no idea what this program is supposed to do or how it is supposed to do it! However, this code looks fishy...

currentState = initialState;
do {
   chooseAnAction();
} while(currentState == 15);

That will loop as long as currentState stays at 15. As soon as it is not 15, it will exit the loop, which makes a nonsense of the following comment. At a pure guess, maybe you want to loop until currentState achieves a value of 15, then exit?

The 15 part is correct. Basically when the state exits the 15 it should stop looping.

In that case, what about the following comment:
// When currentState = 15, Run through the set once more for convergence.
It cannot be right, because currentState can never be 15 when you reach that comment.
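The difference between the two loop conditions can be seen in isolation. The sketch below is not the Q-learning code: LoopConditionDemo, countSteps, and the "move one state closer to 15" step are invented stand-ins for chooseAnAction(), and maxSteps is a safety cap added only so the demo always terminates. It assumes the goal state is 15.

```java
import java.util.function.IntPredicate;

final class LoopConditionDemo {
    static final int GOAL = 15;

    // Runs a do-while loop from `start` and counts how often the body executes.
    // `keepLooping` plays the role of the while(...) condition.
    static int countSteps(int start, IntPredicate keepLooping, int maxSteps) {
        int state = start;
        int steps = 0;
        do {
            state = Math.min(state + 1, GOAL); // hypothetical stand-in for chooseAnAction()
            steps++;
        } while (keepLooping.test(state) && steps < maxSteps);
        return steps;
    }

    public static void main(String[] args) {
        // Condition as posted: keep looping while the state IS 15.
        System.out.println("while (state == 15): " + countSteps(6, s -> s == GOAL, 100) + " step(s)");

        // Condition if the goal is to travel until 15 is reached.
        System.out.println("while (state != 15): " + countSteps(6, s -> s != GOAL, 100) + " step(s)");
    }
}
```

Starting from state 6, the first form runs the body once and leaves the loop at state 7, while the second keeps going until the state is 15. Which one is wanted depends on what the episode is supposed to do.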

In "chooseAnAction", after changing:

possibleAction = getRandomAction(Q_SIZE);

To:

possibleAction = getRandomAction(Q_SIZE - 1);

the exception no longer occurs. However, there is then an infinite loop in "getRandomAction".

R[currentState][action] is always -1 so

choiceIsValid = true;

never occurs.
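One way to avoid spinning forever on a row with no usable entry is to collect the valid actions first and only then pick one at random. This is a sketch of that idea, not a drop-in replacement: ValidActions, validActions, and randomValidAction are hypothetical names, and returning -1 for "no action available" is an assumption the caller would have to handle. The two sample rows copy row 6 and row 9 of the posted R.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

final class ValidActions {
    private static final Random RANDOM = new Random();

    // Lists every action whose reward in this row is greater than -1.
    static List<Integer> validActions(int[] rewardRow) {
        List<Integer> actions = new ArrayList<>();
        for (int action = 0; action < rewardRow.length; action++) {
            if (rewardRow[action] > -1) {
                actions.add(action);
            }
        }
        return actions;
    }

    // Picks one valid action at random, or returns -1 if the row has none.
    static int randomValidAction(int[] rewardRow) {
        List<Integer> actions = validActions(rewardRow);
        if (actions.isEmpty()) {
            return -1; // dead end: nothing to choose, so do not loop
        }
        return actions.get(RANDOM.nextInt(actions.size()));
    }

    public static void main(String[] args) {
        int[] row9 = new int[16];
        Arrays.fill(row9, -1);            // row 9 of the posted R: every entry is -1

        int[] row6 = row9.clone();
        row6[7] = 9;                      // row 6 of the posted R: rewards at 7 and 15
        row6[15] = 100;

        System.out.println("row 6 valid actions: " + validActions(row6));
        System.out.println("row 9 valid actions: " + validActions(row9));
        System.out.println("row 9 random pick:   " + randomValidAction(row9));
    }
}
```

A single shared Random is used here instead of creating a new one on every call; that is a style choice and not related to the infinite loop.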

To help you debug:

In "episode" add a println statement in the for loop (as shown below):

        // When currentState = 15, Run through the set once more for convergence.
        for(int i = 0; i < Q_SIZE; i++)
        {
            System.out.println("i: " + i);

            chooseAnAction();

            System.out.println("after chooseAnAction");
        }

In "getRandomAction" add println statements in the while loop (as shown below):

        // Randomly choose a possible action connected to the current state.
        while(choiceIsValid == false)
        {
            // Get a random value between 0(inclusive) and 16(exclusive).
            action = new Random().nextInt(upperBound);

            System.out.println("action: " + action);
            System.out.println("R[" + currentState + "][" + action + "]: " + R[currentState][action]);
            System.out.println();

            if(R[currentState][action] > -1){
                choiceIsValid = true;
            }
        }

@cgeier, I love you man! Thanks!!!
