questions about a simple random walk

Discussion in 'Probability' started by hanrahan398, Nov 9, 2010.

  1. hanrahan398

    hanrahan398 Guest

    Consider the simple random walk S_n = sum_i=1_to_n A_i, where each
    A_i is equiprobably +1 or -1.

    Run a large number of trials, each of length n.

    Clearly the mean of S_n is zero and its standard deviation is
    sqrt(n). So we can define the z-score Z_n = S_n / sqrt(n).

    Now define the statistic u_n = sum_i=1_to_n (Z_i / n^k), where k >=1
    is a constant.

    Run a large number of trials.

    Questions about u_n:

    1) is it true that its mean is zero?

    2) what is its expected standard deviation?

    3) is it normal? if not, what is the probability that u_n > t, where t
    is a positive constant?

    Thanks!
    Michael
     
    hanrahan398, Nov 9, 2010
    #1

  2. Ray Vickson

    Ray Vickson Guest

    Are the Z_1, Z_2, ..., Z_n computed from a single A_i-sample of size n
    (case 1), or are they separate z-scores from n independent samples of
    size n (case 2)? I will assume you mean case 1, so Z_1 = A_1/sqrt(1),
    Z_2 = (A_1 + A_2)/sqrt(2), etc.

    (1) Since u_n is a sum of mean-zero random variables, it has mean
    zero: the mean of the sum is the sum of the means---no independence
    needed.

    (2) However, the terms of u_n are correlated, so computing the
    variance is more complicated. We can pull the 1/n^k outside the
    summation that defines u_n, so it is enough to look at v_n = sum_{i=1}
    ^n Z_i, so that u_n = v_n / n^k. We have E(Z_i) = 0 and Var(Z_i) = 1
    for all i. For i < j we have Z_j = (A_1 + ... + A_j)/sqrt(j) = [S_i +
    S'_{j-i}]/sqrt(j), where S'_{j-i} = A_{i+1} + ... + A_j. We have
    E(Z_i * Z_j) = E[S_i * (S_i + S'_{j-i})]/sqrt(i*j) =
    Var(S_i)/[sqrt(i)*sqrt(j)] + 0, since S_i and S'_{j-i} are independent
    with zero means. Thus E(Z_i * Z_j) = i/[sqrt(i)*sqrt(j)] = sqrt(i/j),
    hence cov(Z_i, Z_j) = E(Z_i * Z_j) - E(Z_i)*E(Z_j) = sqrt(i/j). We
    then have Var(v_n) = sum(Var(Z_i)) + 2*sum_{i < j} cov(Z_i, Z_j) =
    n + 2*sum_{i < j} sqrt(i/j).
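    The formula Var(v_n) = n + 2*sum_{i < j} sqrt(i/j) can be checked against
    a direct simulation of v_n. A sketch (function names are mine):

```python
import math
import random

def var_vn_formula(n):
    """Var(v_n) = n + 2 * sum over i < j of sqrt(i/j)."""
    return n + 2 * sum(math.sqrt(i / j)
                       for j in range(2, n + 1) for i in range(1, j))

def var_vn_sim(n, trials, seed=1):
    """Monte Carlo estimate of Var(v_n), where v_n = sum of Z_i = S_i/sqrt(i)."""
    rng = random.Random(seed)
    vals = []
    for _ in range(trials):
        s, v = 0, 0.0
        for i in range(1, n + 1):
            s += rng.choice((1, -1))   # S_i
            v += s / math.sqrt(i)      # accumulate Z_i
        vals.append(v)
    m = sum(vals) / trials
    return sum((x - m) ** 2 for x in vals) / trials
```

    With enough trials the two values agree closely (e.g. for n = 20 and
    20000 trials, to within a few percent).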

    (3) The random variables v_n (and u_n) are NOT normally distributed,
    since the terms Z_i are not normally distributed. However, for large
    n, most of the summands Z_i involve a large number of terms, so they
    will be "approximately" normal. In fact, we can try to look at the
    moment-generating function M(w) = E[exp(w*u_n)] for large n, and see
    whether it is roughly of the form exp(b*w^2) for some constant b > 0,
    which would be the mgf of a mean-zero normal random variable (with
    variance 2b).
    This will be difficult to compute exactly.

    R.G. Vickson
     
    Ray Vickson, Nov 9, 2010
    #2

  3. hanrahan398

    hanrahan398 Guest

    Many thanks for all the work you've done here, Ray - I am much
    obliged!

    I have to admit, I am close to the limit of my current level of
    understanding of statistics here.

    I did mean case 1, with Z_i's defined as you say.

    But unfortunately I mistyped the definition of u_n. Rather than
    "u_n = sum_i=1_to_n (Z_i / n^k), where k >=1 is a constant"
    I should have typed:
    "u_n = sum_i=1_to_n (Z_i / i^k), where k >=1 is a constant"
    and then further defined
    t_n = u_n/(sum_j=1_to_n (1/j^k))

    Thus for example u_3 is the sum Z_1/(1^k) + Z_2/(2^k) + Z_3/(3^k).

    If k=1 then t_3 is (Z_1/1 + Z_2/2 + Z_3/3 ) / (1 + 1/2 + 1/3).

    If k=2, then t_3 is (Z_1/1 + Z_2/4 + Z_3/9 ) / (1 + 1/4 + 1/9).

    Then the questions I asked about u_n should have been asked about t_n,
    i.e. given a large number of trials of length n, what are its expected
    mean and s.d., and does it tend to normality for large n?

    The basic idea is that a lot of 1's or -1's near the beginning have a
    bigger effect on t_n than a lot near the end.
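    With the corrected definition, t_n along a single path can be computed
    directly. A sketch (the function name is mine; `path` is a list of
    +/-1 steps):

```python
import math

def t_n(path, k):
    """t_n = [sum_i Z_i / i^k] / [sum_j 1 / j^k] for one +/-1 path."""
    s, u = 0, 0.0
    for i, step in enumerate(path, start=1):
        s += step                          # S_i
        u += (s / math.sqrt(i)) / i ** k   # Z_i / i^k
    return u / sum(1 / j ** k for j in range(1, len(path) + 1))

# For the all-ones path of length 3 with k = 1 this reproduces the
# example above: t_3 = (1 + sqrt(2)/2 + sqrt(3)/3) / (1 + 1/2 + 1/3).
```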

    Thanks!
    Kind regards,
    Michael

     
    hanrahan398, Nov 10, 2010
    #3
  4. Ray Vickson

    Ray Vickson Guest

    I will follow the Maple convention of writing a/[b*c] as a/b/c, etc.

    The terms of u_n are A_1/sqrt(1)/1, (A_1+A_2)/sqrt(2)/2^k,
    (A_1+A_2+A_3)/sqrt(3)/3^k, etc. In u_n, what is the coefficient of
    A_1? It is U_1 = 1 + 1/sqrt(2)/2^k + 1/sqrt(3)/3^k + ... + 1/sqrt(n)/
    n^k. The coefficient of A_2 is U_2 = 1/sqrt(2)/2^k + ... + 1/sqrt(n)/
    n^k,... and the coefficient of A_i is U_i = sum{1/sqrt(j)/j^k, j =
    i..n}.

    Thus, u_n has the form u_n = U_1*A_1 + U_2*A_2+ ... + U_n*A_n, where
    U_i is written above. The terms are independent because the A_i are
    mutually independent, so E(u_n)= 0 (sum of terms of mean 0), and
    Var(u_n) = sum{U_i^2 * Var(A_i), i=1..n} = sum{[sum{1/sqrt(j)/
    j^k,j=i..n)]^2, i=1..n}. (The variance of a sum of independent terms
    of the form U_i*A_i is the sum of the variances; and Var(U_i*A_i) =
    U_i^2 * Var(A_i) = U_i^2.)
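    The coefficients U_i and the variance formula built from them can be
    coded directly and checked against a simulation of u_n. A sketch
    (names are mine):

```python
import math
import random

def coeffs(n, k):
    """U_i = sum_{j=i}^{n} 1 / (sqrt(j) * j^k); element i-1 holds U_i."""
    w = [1 / (math.sqrt(j) * j ** k) for j in range(1, n + 1)]
    return [sum(w[i:]) for i in range(n)]

def var_un_formula(n, k):
    """Var(u_n) = sum of U_i^2, since Var(A_i) = 1."""
    return sum(U * U for U in coeffs(n, k))

def var_un_sim(n, k, trials, seed=2):
    """Monte Carlo estimate of Var(u_n), u_n = sum of Z_i / i^k."""
    rng = random.Random(seed)
    vals = []
    for _ in range(trials):
        s, u = 0, 0.0
        for i in range(1, n + 1):
            s += rng.choice((1, -1))           # S_i
            u += (s / math.sqrt(i)) / i ** k   # Z_i / i^k
        vals.append(u)
    m = sum(vals) / trials
    return sum((x - m) ** 2 for x in vals) / trials
```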

    The moment-generating function of u_n is Fn(w) = E[exp(w*u_n)] =
    E[exp(w*U_1*A_1) * exp(w*U_2*A_2) * ... * exp(w*U_n*A_n)]. Since the
    individual factors exp(w*U_i*A_i) are mutually independent, the
    expected value of the product is the product of the expected values:
    Fn(w) = product{E[exp(w*U_i*A_i)], i=1..n}. We have E[exp(w*U_i*A_i)]
    = (1/2)*exp(w*U_i*1) + (1/2)*exp(w*U_i*(-1)) = cosh(w*U_i), so we
    finally have Fn(w) = product{cosh(w*U_i), i=1..n}. From this, you can
    get the characteristic function of u_n and, in principle, invert it
    (inverse Fourier transform) to get the cumulative distribution Pr{u_n
    <= t}; in practice, it is not so easy, though. Numerical inverse
    transform methods are available; see http://www.jstor.org/pss/2684571
    .
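    The product-of-cosh form is easy to evaluate numerically, and can be
    checked against a direct Monte Carlo estimate of E[exp(w*u_n)]. A
    sketch (names are mine; U is any list of coefficients U_i):

```python
import math
import random

def mgf_formula(w, U):
    """Fn(w) = product of cosh(w * U_i)."""
    out = 1.0
    for Ui in U:
        out *= math.cosh(w * Ui)
    return out

def mgf_mc(w, U, trials, seed=3):
    """Monte Carlo estimate of E[exp(w * sum of U_i * A_i)]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        u = sum(Ui * rng.choice((1, -1)) for Ui in U)
        total += math.exp(w * u)
    return total / trials
```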

    Alternatively, you could use B_i = 0 or 2 with probability 1/2 each
    (instead of A_i), to get an expression of the form u_n = c_n + v_n,
    where c_n = -sum(U_i) and v_n = sum(U_i*B_i). (This writes A_i = -1 +
    B_i and separates out the nonrandom "-1" part from the random B_i
    part.) Thus, Fn(w) = exp(w*c_n)*product(1/2 + (1/2)*exp(2*w*U_i),
    i=1..n). Since all the B_i are >= 0, we can use Laplace-transform
    inversion methods to get the distribution Gn(t) = Pr{v_n <= t} of v_n;
    see http://www.columbia.edu/~ww2040/Fall03/LaplaceInversionJoC95.pdf
    (complete article freely available). We can then get Pr{u_n <= t} as
    Pr{v_n <= t - c_n}.
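    Since cosh(w*U_i) = exp(-w*U_i) * (1/2 + (1/2)*exp(2*w*U_i)), the
    shifted form agrees exactly with the product-of-cosh form. A sketch
    verifying the identity (names are mine):

```python
import math

def mgf_cosh(w, U):
    """Fn(w) as a product of cosh(w * U_i)."""
    out = 1.0
    for Ui in U:
        out *= math.cosh(w * Ui)
    return out

def mgf_shifted(w, U):
    """Fn(w) = exp(w*c_n) * product(1/2 + (1/2)*exp(2*w*U_i)), c_n = -sum(U_i)."""
    c = -sum(U)
    out = math.exp(w * c)
    for Ui in U:
        out *= 0.5 + 0.5 * math.exp(2 * w * Ui)
    return out
```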

    R.G. Vickson
     
    Ray Vickson, Nov 10, 2010
    #4
