It gets better the longer it runs

SomeBoyo@feddit.de · 1 year ago

It gets better the longer it runs

Murdo Maclachlan@lemmy.world · edit-2 1 year ago

Image Transcription: Code

bool is_prime(int x)
    return false;
}

[Beneath the code is a snippet of console output, as follows:]

test no.99989: passed
test no.99990: passed
test no.99991: failed
test no.99992: passed
test no.99993: passed
test no.99994: passed
test no.99995: passed
test no.99996: passed
test no.99997: passed
test no.99998: passed
test no.99999: passed
95.121% tests passed

I am a human who transcribes posts to improve accessibility on Lemmy. Transcriptions help people who use screen readers or other assistive technology to use the site. For more information, see here.

raspberry_confetti@sh.itjust.works · 1 year ago

good human

average650@lemmy.world · 1 year ago

Why not just test all even numbers greater than 2? It covers infinite numbers and passes 100% of the time.

Kogasa@programming.dev · 1 year ago

It’s important to test edge cases.

two_wheel2@lemm.ee · 1 year ago

We’ll grab those in integration and E2E tests… let’s just stick with even numbers such that 2 < n < 1002… that’s 1000 cases, more than enough, and 100% test coverage

Zeth0s@reddthat.com · 1 year ago

You are joking, but this is exactly what happens if you optimize accuracy of an algorithm to classify something when positive cases are very few. The algorithm will simply label everything as negative, and accuracy will be anyway extremely high!

theblueredditrefugee@lemmy.dbzer0.com · 1 year ago

This is actually a perfect example of why to care about the difference between accuracy, precision, and recall. This algorithm has 0 precision and 0 recall, the only advantage being that it has 100% inverse recall (all negative results are correctly classified as negative).

theblueredditrefugee@lemmy.dbzer0.com · 1 year ago

Illustration of the difference between the two from my machine learning classes in college, which was obviously just the first google result:

Dr Cog@mander.xyz · 1 year ago

This is also why medical studies never use accuracy as a measure if the disorder being studied is in any way rare. Sensitivity and specificity or positive/negative likelihood ratios are more common

Fried_out_Kombi@lemmy.world · 1 year ago

Train a neural network to detect non-primes. The more data you give it, the more accurate it is!

rpj@lemmy.one · 1 year ago

Wow, a neural network!

Xylight (photon dev)@programming.dev · 1 year ago

How long would this have to run for it to round up to 100%?

xthexder@programming.dev · 1 year ago

A few calculations:

There are 9592 prime numbers less than 100,000. Assuming the test suite only tests numbers 1-99999, the accuracy should actually be only 90.408%, not 95.121%
The 1 trillionth prime number is 29,996,224,275,833. This would mean even the first 29 trillion primes would only get you to 96.667% accuracy.
The density of primes can be approximated using the Prime Number Theorem: 1/ln(x). Solving 99.9995 = 100 - 100 / ln(x) for x gives e^200000 or 7.88 × 10^86858. In other words, the universe will end before any current computer could check that many numbers.

Hexarei@programming.dev · 1 year ago

This is a really fun question and now I’m nerd sniped

Hexarei@programming.dev · 1 year ago

As an update to my earlier nerd-sniped-ness:

I found a list of prime numbers, which states that the 50,000,000th prime number is 982,451,653, which means 5.08930896% of the numbers up to 982,451,653 are prime. That’s unfortunate, as it means the accuracy is actually lower than the original post we go further - down from 95.121% accuracy to 94.921%. Bummer!

Out of curiosity, I then whipped up a quick program in rust that starts from those numbers, crunching forward through the primes while prime_count as f32 / total_count as f32 > 0.05, using 16 CPU cores to divide-and-conquer and check whether a number is prime. There’s probably a better way to do that, but meh. Such a check will essentially only get me back above 95% though, and based on the rate of change, I suspect it would take an exponentially higher amount of time than whatever it takes to get to 99.5%.

In the time it’s taken me to write this, it’s calculated just over 330,000 more primes, reaching ~0.050874 hit rate for primes.

This has led me down a small rabbit hole, in which it turns out there are plenty of folks who have approached the topic of “what percentage of numbers are prime?” - and the answer is essentially “it will eventually round to 0%”. Because of you, I remain curious to know when that crosses the threshold of 99.5% though - and I’ll at least leave it running for the next day or two to see how close it gets.

Unfortunately though, at the rate my PC can calculate, I don’t think I’m personally gonna be hitting an answer to this any time soon. If I ever do manage to figure it out, I’ll be sure to update… because hell, why not.

I’ve also considered trying to find bigger lists of primes, but meh. I’ve already spent an hour on this that I intended to spend playing D&D so … meh. =]

xthexder@programming.dev · 1 year ago

We got nerd sniped at almost the exact same time, but approached this in very different ways. I applaud your practical approach, but based on what I calculated, you should stop now. It will never reach 99.999%

outofband2@lemmy.world · 1 year ago

Reminds me of https://xkcd.com/221/

1 year ago

Haha! Yep. It checks out; 4 is totally random!

quickpen@sh.itjust.works · 1 year ago

I don’t get it. Can someone Explain Like I’m 5?

There is a function that always returns false, and then a bunch of random output, and one failed?

Is this like another weird quirk in the JavaScript VM or something? Lol?

kman@lemm.ee · edit-2 1 year ago

Prime numbers become less frequent as the numbers get larger, so if you want to implement a function that tests whether a number is prime, just always returning false will get more and more accurate as you count up. The console output is just saying whether it was correct to say the number isn’t prime, and the percent is the accuracy over the previous numbers

quickpen@sh.itjust.works · 1 year ago

Ohhhh, so there isn’t anything crazy going on here.

It’s just literally running that function and tallying up how accurate it is as a prime number checker.

…and it gets more accurate the longer it runs. Okay, I see now. Thanks, lol.