65% of woman agree...

Deleted member 115935 · Dec 29, 2020

Please don't take offence at the title; it's a direct quote.

Background,

Here in Europe, we have TV adverts that come up with statements along the lines of
"65% of women agreed" when advertising things like hair products, in response to a question like "my hair is shinier than with the other product".

At the bottom, in small print, they have to show the sample size,
which in this case was 43 of 67 agree.

Now, ignoring the fact that that should be 64%, i.e. they have rounded up.

It got me wondering,

and I know we have some great mathematicians here,

If, let's say, the odds of a person agreeing are 50:50,
how many would we have to ask to have a 50:50 chance of finding a run of 67 where 43 agree?

ronsimpson · Dec 29, 2020

Advertising: No one asked 100 women about the product. Marketing just made up the numbers and actors said what they were told.

AlbertHall · Dec 29, 2020

jpanhalt · Dec 29, 2020

It's an advertising thing. In the US, when a champion of some cause wants to claim something is dangerous, they will claim that 50,000 Americans die from it every year. Here are just 2 examples:

Where does that number come from? A dark place. It does have a certain ring to it, though. It sounds big enough to be a concern, but small enough that one is unlikely to have actually known someone who died from that cause in the past year (e.g., 1 in 6,000 to 7,000 people). At least, that's the explanation I have heard.

Deleted member 115935 · Dec 29, 2020

AlbertHall said:
View attachment 226189

Thank you
I want to keep this fairly tight.
I got the basic probability thing;
what I was wondering was, if I kept generating random Y/N answers (50:50),
how many would I need to generate to have a 50:50 chance of having 43 Y in 67 consecutive numbers?

which I think is different, but I don't know how to work that one out.

wayneh · Dec 29, 2020

andrewmm said:
Thank you
I want to keep this fairly tight.
I got the basic probability thing;
what I was wondering was, if I kept generating random Y/N answers (50:50),
how many would I need to generate to have a 50:50 chance of having 43 Y in 67 consecutive numbers?

which I think is different, but I don't know how to work that one out.

You need to get familiar with the binomial distribution and its properties. Many natural distributions are of a parameter that can assume any value, for example height or weight. The variation is well captured by the normal or Gaussian distribution.

Coin flips can only produce three values: heads, tails, or on edge. Ignoring the last of these means you can expect a binomial distribution.

The binomial distribution has been fully studied and has many properties similar to the normal distribution. In other words, you can use a calculator to answer your question instead of doing the experiment yourself. The one in #3 is probably just fine.

The claimed result could be an outright lie with no real study behind it. Or it could have been "science" by amateurs, e.g., the study may not have been double-blind, meaning the subjects could easily have been coaxed, maybe not even intentionally, to give the "right" answer.

If there is no real preference and I was going to fake it entirely, I think I'd pick a result that was near the 90% confidence interval. It's quite believable that a real study, even with no real difference in preferences, could produce that result. If you get greedy and make your claim outrageous, you're more likely to get caught.
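For the single-sample part of the question, the binomial calculation mentioned above can also be done directly rather than with an online calculator. A minimal sketch, assuming independent answers and a true 50:50 chance of agreeing; the function names are illustrative and not taken from any post in this thread:

```python
from math import comb


def binom_pmf(k, n, p=0.5):
    """Probability of exactly k 'agree' answers out of n independent answers."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_upper_tail(k, n, p=0.5):
    """Probability of k or more 'agree' answers out of n independent answers."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))


n, k = 67, 43  # sample size and 'agree' count from the advert's small print
p_exact = binom_pmf(k, n)
p_at_least = binom_upper_tail(k, n)

print(f"P(exactly {k} of {n})  = {p_exact:.4f}")
print(f"P(at least {k} of {n}) = {p_at_least:.4f}")
```

This only covers one fixed group of 67; it says nothing yet about picking the best run out of a longer stream of answers.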

jpanhalt · Dec 29, 2020

andrewmm said:
Thank you
I want to keep this fairly tight.
I got the basic probability thing;
what I was wondering was, if I kept generating random Y/N answers (50:50),
how many would I need to generate to have a 50:50 chance of having 43 Y in 67 consecutive numbers?

which I think is different, but I don't know how to work that one out.

1) Your advertisement does not indicate that the "agree" responses were consecutive. That is a different problem, and the probability of that is quite low.
2) A quick approximation of the Standard Deviation (SD) of a Poisson distribution is the square root of the mean. Thus, if the two results were 43 agree and 24 disagree (67 total), the mean is 33.5 and the SD = 5.8. 95% confidence limits = ±2 SD = ±11.6, which are 45.1 and 21.9 . Of course that distribution applies to large numbers. For such small numbers, the limits are larger. Correction for small numbers of observations can be applied, including the Yates correction (https://en.wikipedia.org/wiki/Yates's_correction_for_continuity), but in this case, that is not needed.
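The square-root-of-the-mean rule in point 2 is the Poisson shortcut. For a fixed number of yes/no answers, the binomial standard deviation, sqrt(n·p·(1−p)), is the other figure commonly used, and it is narrower. A small sketch comparing the two, assuming p = 0.5 and n = 67 as in the posts above (variable names are illustrative):

```python
from math import sqrt

n, p = 67, 0.5
mean = n * p  # 33.5 expected "agree" answers

sd_sqrt_mean = sqrt(mean)             # square-root-of-mean (Poisson) shortcut
sd_binomial = sqrt(n * p * (1 - p))   # standard deviation of a binomial count

for label, sd in (("sqrt(mean)", sd_sqrt_mean), ("binomial", sd_binomial)):
    low, high = mean - 2 * sd, mean + 2 * sd
    print(f"{label:>10}: SD = {sd:.2f}, +/-2 SD range = {low:.1f} to {high:.1f}")
```

With the shortcut, 43 falls inside the ±2 SD range (about 21.9 to 45.1); with the binomial SD of about 4.09, the range is roughly 25.3 to 41.7 and 43 falls just outside it.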

Deleted member 115935 · Dec 30, 2020

jpanhalt said:
1) Your advertisement does not indicate that the "agree" responses were consecutive. That is a different problem, and the probability of that is quite low.
2) A quick approximation of the Standard Deviation (SD) of a Poisson distribution is the square root of the mean. Thus, if the two results were 43 agree and 24 disagree (67 total), the mean is 33.5 and the SD = 5.8. 95% confidence limits = ±2 SD = ±11.6, which are 45.1 and 21.9 . Of course that distribution applies to large numbers. For such small numbers, the limits are larger. Correction for small numbers of observations can be applied, including the Yates correction (https://en.wikipedia.org/wiki/Yates's_correction_for_continuity), but in this case, that is not needed.

Thank you @jpanhalt,

I agree with you; I was not assuming a straight run of Yes then No.

Conceptually, I think I could work out the maths for that.

I'm wondering:
what they don't tell us is how many actual observations they did.
If we don't assume the data is real, then we are into conspiracy theory, which has nothing to do with maths,
so on the basis that the numbers are real:

My guess is they just asked loads of women,
chose a run of 67 answers, and found the maximum number of times that Y was said.

Some runs of 67 would give fewer Yes answers than others.

Can I be clear: I'm not trying to reverse-engineer what they did,

but the thought was, just how many women would they have had to ask, if the probability of a yes was 50:50, to be 50:50 certain of getting 43 Y out of 67 samples?

It's got to be calculable,
but I can't seem to think how I'd work out how many; I'd end up simulating!

jpanhalt · Dec 30, 2020

andrewmm said:
I'm wondering:
what they don't tell us is how many actual observations they did.

"At the bottom, in small print, they have to show the sample size,
which in this case was 43 of 67 agree."

If we don't assume the data is real, then we are into conspiracy theory, which has nothing to do with maths,
so on the basis that the numbers are real:

No conspiracy is needed. Just keep doing surveys until the results from one group agree with your premise. There is no requirement to report how many "failed" surveys were done. No correction needs to be made (in the case of advertising) for the number of "surveys;" although, that can be done. It's an easy trap to fall into, or one can do it intentionally.

My guess is they just asked loads of women,
chose a run of 67 answers, and found the maximum number of times that Y was said.

Agreed, as stated above. I am a very slow reader.

... just how many women would they have had to ask, if the probability of a yes was 50:50, to be 50:50 certain of getting 43 Y out of 67 samples?
It's got to be calculable,

Yes it is, at least a probable number can be estimated (https://en.wikipedia.org/wiki/Standard_error). In the present case, the number of subjects is small, and the results do not meet p<= 0.05 to a first approximation without correction for sample size. While p= 0.05 is often used for "statistical significance," I would not bet my life on something that had a 1 in 20 chance of being wrong, unless the alternative was much worse, like death.

djsfantasi · Dec 30, 2020

There is a lot of great information here. It all should be taught in High School here in the US (and equivalent schools elsewhere).

hrs · Dec 30, 2020

jpanhalt said:
While p= 0.05 is often used for "statistical significance," I would not bet my life on something that had a 1 in 20 chance of being wrong, unless the alternative was much worse, like death.

As I understand it, Ronald Fisher introduced it as a quick litmus test, i.e. results not adhering to p = 0.05 should be rejected. Somehow this has evolved into wholesale scientific proof if the p = 0.05 criterion is met. Reading on the wiki there's some good news though:

In 2019, over 800 statisticians and scientists signed a message calling for the abandonment of the term "statistical significance" in science,[65] and the American Statistical Association published a further official statement [66] declaring (page 2):

We conclude, based on our review of the articles in this special issue and the broader literature, that it is time to stop using the term "statistically significant" entirely. Nor should variants such as "significantly different," "p ≤ 0.05," and "nonsignificant" survive, whether expressed in words, by asterisks in a table, or in some other way.


Deleted member 115935 · Dec 30, 2020

Thank you @jpanhalt

How many would they have had to survey, then, to get a 50:50 probability of 43 Y in 67 samples?

jpanhalt · Dec 30, 2020

I have only used such statistics as a tool. Sort of like a carpenter doesn't usually worry about how to make a hammer. I am way out of my comfort zone and am not sure what you mean by 50:50.

However, if you consider that the probability that a sample of 67 random binary responses (e.g., Y/N or 1/0) will have 43 out of 67 as Y is 0.05, then the probability of not getting that in a single sample is 0.95. Now, define the probability of not getting that in n independent samples of 67 as P.

Then, (0.95)^n = P = 0.50 (if that is what you mean)

Solving that gives approximately 13.51 for n ( i.e., n = log 0.50/log 0.95). Of course, you can do that on a modern calculator by trial and error. Put another way, if you had a population of 938 and divided that into groups of 67 (14 groups), there is about a 50% chance that one of the groups would not have 43 responses of Y.

~~Edit: Where I get uncomfortable is in stating the limits. That is, is that the probability of having exactly 43 Y or at least 43 Y. I suspect it is the latter.~~
Edit2: The analysis I gave was for a value within +/- 2SD (i.e., two tailed) . That range was approximately >=24 to <= 43 for "Y." Thus, the 14 groups of 67 was to give approximately a 50% chance of getting a value outside that range.
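The arithmetic in this post can be checked in a couple of lines. A minimal sketch, assuming the groups of 67 are independent and using the post's 1-in-20 figure for a single group; the helper name `groups_for_even_odds` is made up for illustration:

```python
from math import ceil, log


def groups_for_even_odds(p_hit, target=0.5):
    """Solve (1 - p_hit)**n == target for n, the number of independent groups."""
    return log(target) / log(1.0 - p_hit)


group_size = 67
n_groups = groups_for_even_odds(0.05)  # 0.05 is the post's per-group probability
whole_groups = ceil(n_groups)

print(f"n = {n_groups:.2f} groups, i.e. {whole_groups} whole groups "
      f"or {whole_groups * group_size} people")
```

The same helper accepts any other per-group probability, for example an exact binomial tail probability instead of the rounded 0.05, which changes the answer considerably.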

402DF855 · Dec 30, 2020

Using simulation, I get 408. Trying to do the math, I get 104. No doubt, both are wrong.
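For anyone wanting to try the simulation route, here is one way it might be set up. This is a sketch under stated assumptions and not necessarily what produced the figures above: answers are independent with a 50:50 chance of Y, "a run of 67" means any 67 consecutive answers in the stream, and a hit means at least 43 Y inside that window. The function name, seed, and trial count are arbitrary choices.

```python
import random
from collections import deque
from statistics import median


def answers_until_window(window=67, needed=43, p_yes=0.5, rng=random):
    """Generate Y/N answers one at a time and return how many were generated
    when some run of `window` consecutive answers first holds >= `needed` Y."""
    recent = deque()      # the most recent answers, oldest on the left
    yes_in_window = 0
    asked = 0
    while True:
        if len(recent) == window:
            # Slide the window: drop the oldest answer before adding a new one.
            yes_in_window -= recent.popleft()
        answer = 1 if rng.random() < p_yes else 0
        recent.append(answer)
        yes_in_window += answer
        asked += 1
        if len(recent) == window and yes_in_window >= needed:
            return asked


rng = random.Random(2020)  # fixed seed so repeated runs give the same estimate
trials = [answers_until_window(rng=rng) for _ in range(1000)]
print("median number of answers needed:", median(trials))
```

The median over trials is the stream length at which half the simulated streams have already produced such a run, which is one reading of the "50:50 chance" in the original question. Other readings (non-overlapping groups, exactly 43 rather than at least 43) would need a different stopping rule.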

Deleted member 115935 · Dec 30, 2020

Thank you guys
Sorry, I just wake up with these strange questions every now and then,
it seemed so simple, till I started to think about it,
and I realised I did not have a clue how to calculate it

jpanhalt · Dec 30, 2020

hrs said:
As I understand it, Ronald Fisher introduced it as a quick litmus test, i.e. results not adhering to p = 0.05 should be rejected. Somehow this has evolved into wholesale scientific proof if the p = 0.05 criterion is met. Reading on the wiki there's some good news though:

Sure, but the real impact is when government gets involved and decides that a regulated laboratory that is not within ±2 SD (a murky number at best) gets a "ding." Too many dings gets a "dong": you may lose your license to do testing.

For whatever reason, Wiki is not part of the Federal Register.

MrSalts · Dec 30, 2020

For years, Trident Gum advertised, "4 out of 5 dentists surveyed recommend sugarless gum for their patients who chew gum".
One of the advertising agency owners eventually said the only research they did was to test focus groups to determine which ratio seemed most convincing.

joeyd999 · Dec 30, 2020

MrSalts said:
For years, Trident Gum advertised, "4 out of 5 dentists surveyed recommend sugarless gum for their patients who chew gum".
One of the advertising agency owners eventually said the only research they did was to test focus groups to determine which ratio seemed most convincing.

jpanhalt · Dec 30, 2020

I thought the TS wanted to stay on subject.

We all know that "More Doctors prefer Camel Cigarettes than any other," or was it just "Camels"? Correction: it was just Camels, but that was 1946, when preferring camels didn't have any other meaning.

“More doctors smoke Camels than any other cigarette.”

joeyd999 · Dec 30, 2020

jpanhalt said:
I thought the TS wanted to stay on subject.

We all know that "More Doctors prefer Camel Cigarettes than any other, " or was it just "Camels?

Sorry, I thought this was in 'Off-Topic' and attempted to comply.
