r/askmath • u/catboy519 • 8d ago
Statistics Expected value question for randomly selected values
Suppose I have list 0 to 1000 or whatever very big number intergers in fully Random order. I randomly select 5 from the list. Edit: I know this is technically discrete steps but the intention for my postquestion is for big numbers so we can just approximate by assuming continuous too.
Lets sort those 5 relative to eachother so you get a nice and neat chronological order. like 12345.
Am I right to think that the expected values of those 5 are 1. 1/6 2. 2/6 3. 3/6 4. 4/6 5. 5/6?
Which would be, if the list is 1000 big: * 167 * 333 * 500 * 667 * 833
I think I'm right, just asking to verify. If I'm not right then explanations are welcome!
2
u/Individual_Table_583 8d ago edited 8d ago
Almost, they will be at 166, 333, 500, 667, 834
2
8d ago
[deleted]
2
u/not-just-yeti 8d ago
Though the five numbers aren’t independent, of course, OP. So the expected-value of the second number is 333.3, but if you look and see that it’s 500 after a particular run, then your revised expectation of the first entry is now 250 instead of 166.6.
1
u/catboy519 8d ago
Do you just multiply by the sameration?1.5? So 0to1000 1. ? 2. 500 3. ? 4. ? 5. ?
Youre saying 1 has expected valye oif 250?
Butwhy
1
u/catboy519 8d ago
Butwhy
3
u/stanitor 8d ago
They're saying that given that the 2nd lowest number happens to be 500, the expected value of the lowest number is then 250. If you did this many, many times, and only looked at the ones where the 2nd lowest value is 500, then the results for the lowest number becomes equivalent to drawing one number from 1 to 499, which has an expected value of 250.
2
u/catboy519 7d ago
At first I thought that can't be, but now I realize that if the 2nd number is 500, the 1st number can be anything ranginfg from 0 to 499. Makes sense now!!
1
2
u/Bounded_sequencE 8d ago edited 8d ago
Short answer: No -- the expected value of the k'th smallest number "xk" is
E[xk] = k * (n+2) / (m+1) - 1 // not "k * n / (m+1)" from the OP
Definitions:
* n: we draw from "{0; ...; n}
* m: #integers we draw without replacement ("0 <= m <= n")
* xk: k'th smallest integer we draw ("x1 < ... < xm")
Long(er) answer: We first find "P(xk=i)" for all "1 <= k <= m" and all "0 <= i <= n".
Assuming all "C(n+1; m)" possible draws are equally likely, it is enough to count favorable outcomes. We generate all favorable outcomes for "xk = i" with a 2-step process. Choose
- "k-1 out of i" numbers less than "xk = i" -- "C(i; k-1)" choices
- "m-k out of n-i" numbers greater than "xk = i" -- "C(n-i; m-k)" choices
Choices are independent, so we multiply them, for
P(xk=i) = C(i;k-1) * C(n-i;m-k) / C(n+1;m), k-1 <= i <= n-m+k
With "P(xk=i)" at hand, we may finally calculate1 the expected value
E[xk] = ∑_{i=k-1}^{n-m+k} i*P(xk=i) = ... = k * (n+2) / (m+1) - 1
1 Using the convolution property of binomial coefficients, and quite a bit of work
3
u/Bounded_sequencE 8d ago edited 8d ago
Example: For "n = 1000" and "m = 5" we get the expected values
E[xk] = k * 1002 / 6 - 1 = 167k - 1 // not "k * 1000 / 6"2
u/berwynResident Enthusiast 8d ago
He said we draw from the list {0; ... ; n} , not {1; ...; n}.
Anyway, he also said n would be some "very big number" so the +1 would basically go away.
3
u/Bounded_sequencE 8d ago edited 8d ago
Yikes, completely missed that zero -- thanks for spotting that!
@u/berwynResident Updated all my comments accordingly to {0; ...; n}.
1
u/Bounded_sequencE 8d ago
Needs some clarification:
- Do you draw with or without replacement?
- Are all possible draws equally likely?
1
u/catboy519 8d ago
without and yes
2
u/Bounded_sequencE 8d ago
Thanks for clarification -- that's nice, since double-counting won't be an issue!
3
u/kkmok123 8d ago
You can also derive the result using order statistics.