I read this news report today https://www.cnn.com/2019/01/02/us/florida-girl-sat-controversy/index.html
News reports frequently suck because they tell you so little about what happened and, instead, stoke your emotion. This one is certainly no exception, and now a high-powered lawyer has taken it on. Basically (going by what I am reading, not by anything I otherwise know), a young lady took the SAT and got a 900. She then got a tutor, studied a lot more, took a prep course and the like. She took the SAT again and got a 1230.
The testing people are holding her results and are not "validating" them, at least so far (I put validation in quotes because they have their own definition of what validation means, and the article doesn't say what it is). They are not saying that the increase is the reason; instead (again from the article) they are saying...
"We are writing to you because based on a preliminary review, there appears to be substantial evidence that your scores ... are invalid," it said. "Our preliminary concerns are based on substantial agreement between your answers on one or more scored sections of the test and those of other test takers. The anomalies noted above raise concerns about the validity of your scores."
This got me thinking about how one could statistically evaluate the likelihood of cheating by copying answers from others - at least that is what it sounds to me like they are saying - or are they saying something else?
I think, but don't know, that the answer keys (that is, the test forms) are not the same for everybody - right? I mean, your neighbor is using a different form of the test, so simply copying their answers would not work - right? And if they are the same form, then how would it be possible to know who copied from whom?
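To make the "substantial agreement" idea concrete, here is a minimal sketch of one way such a check could work. It is not the testing company's method, which the article does not describe. It counts the items two test takers both got wrong and asks how often they would pick the same wrong option by chance. The function names, the default of 4 answer choices, and the assumption that every wrong option is equally attractive are all mine; real distractors are not equally attractive, so this would overstate how surprising a match is. Note also that the count is symmetric, so by itself it says nothing about who copied from whom.

```python
from math import comb


def shared_wrong_answers(answers_a, answers_b, key):
    """Return (items both missed, items both missed with the same wrong choice)."""
    both_wrong = 0
    same_wrong = 0
    for a, b, k in zip(answers_a, answers_b, key):
        if a != k and b != k:
            both_wrong += 1
            if a == b:
                same_wrong += 1
    return both_wrong, same_wrong


def binomial_upper_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def agreement_p_value(answers_a, answers_b, key, n_choices=4):
    """Chance of at least this many identical wrong answers if the two worked alone.

    Assumption (hypothetical, for illustration): on an item both takers miss,
    each of the n_choices - 1 wrong options is equally likely and the two
    takers choose independently.
    """
    both_wrong, same_wrong = shared_wrong_answers(answers_a, answers_b, key)
    p_match = 1 / (n_choices - 1)
    return binomial_upper_tail(both_wrong, same_wrong, p_match)
```

A small p-value here only says the agreement is unusual under those assumptions. Two students taught the same wrong method by the same tutor could also produce it, which is part of why I want to know what was actually looked at.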
When I used multiple choice tests we always received statistics on the test results. Some of these were very useful to me in designing or redesigning test questions. One of my favorite measures was the point-biserial correlation between how well students did on an item and how well they did on the total test. I would use this to see how well the question discriminated. In other words, getting a high total test score should be related to getting the item correct - high total scorers having a high frequency of a correct answer and the opposite for low total scorers. Graphically, a slope near 1 marked a good discriminating item and a slope near 0 did not.
Are they going further, into more complicated pattern evaluations that make use of some of these measures, or something else?
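For anyone who has not met the point-biserial, here is a minimal sketch of the calculation using only the Python standard library. The function name and the example scores are made up for illustration. It uses the usual formula: the difference between the mean total score of those who got the item right and those who got it wrong, divided by the standard deviation of all total scores, times sqrt(p * (1 - p)), where p is the proportion who got the item right. This is the same number as the ordinary Pearson correlation between the 0/1 item result and the total score.

```python
from math import sqrt
from statistics import mean, pstdev


def point_biserial(item_correct, total_scores):
    """Point-biserial correlation between a 0/1 item result and total score."""
    if len(item_correct) != len(total_scores):
        raise ValueError("item_correct and total_scores must be the same length")
    right = [t for c, t in zip(item_correct, total_scores) if c == 1]
    wrong = [t for c, t in zip(item_correct, total_scores) if c == 0]
    if not right or not wrong:
        raise ValueError("need at least one correct and one incorrect response")
    sd = pstdev(total_scores)  # population SD of all total scores
    if sd == 0:
        raise ValueError("total scores have no variance")
    p = len(right) / len(total_scores)  # proportion answering correctly
    return (mean(right) - mean(wrong)) / sd * sqrt(p * (1 - p))


# Hypothetical data: the three highest scorers got the item right.
totals = [90, 85, 80, 60, 55, 50]
item = [1, 1, 1, 0, 0, 0]
r = point_biserial(item, totals)
```

In this sketch the total score includes the item itself; some item analyses remove the item from the total first, which lowers the coefficient a little on short tests.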
I am posting to learn what others think, not about the equity of the decision or treatment of the student (I do have an immediate gut reaction but want to hold off for a while), but rather to generate discussion of possible ways one would approach such an evaluation. Personally, I doubt whether the preliminary lack of validation will hold up unless there is some easily understandable and documented incident that took place (and has not yet been mentioned and would open up another can of worms).
I am going to follow this if I can because I want to know what they looked at to arrive at the "preliminary conclusion or concern".