Weird date-as-numeric format

Bird on a Fire
Princess POW
Posts: 10137
Joined: Fri Oct 11, 2019 5:05 pm
Location: Portugal

Weird date-as-numeric format

Post by Bird on a Fire » Tue Sep 28, 2021 5:40 pm

Got a weird problem here.

I have collected proportion-over-time data very similar to this published graph:
[Image: date.png, the published graph]
and I'd like to compare my data with the fitted curve.

The paper gives the fitted quadratic as Y = 598809 – 31.075X + 0.0004X^2.

I'm trying to figure out what format I need to convert my dates into to fit a quadratic on the same scale. (Wouldn't be my first choice of model, but I'm trying to compare with the old paper).

That quadratic has a root at 42280.960215058, and I cannot figure out how to make 15 July 2005 be even roughly that number. By X=42281.3 Y is already over 1, so the entire x-axis with about 50 days has to fit between 42280.9 and 42281.3 - dafuq?
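
The root quoted above can be checked directly. A minimal sketch in Python, assuming the coefficients exactly as printed in the paper (the name `godwit_y` is only for illustration):

```python
import math

# Coefficients as printed in the paper: Y = 598809 - 31.075 X + 0.0004 X^2
A, B, C = 0.0004, -31.075, 598809.0

def godwit_y(x):
    """Evaluate the published quadratic at x."""
    return A * x * x + B * x + C

# Roots from the quadratic formula.
disc = B * B - 4 * A * C
roots = sorted((-B + s * math.sqrt(disc)) / (2 * A) for s in (-1, 1))
upper_root = roots[1]

# Slope at the upper root: how fast Y rises per unit of X there.
slope = 2 * A * upper_root + B

print(f"roots: {roots[0]:.3f}, {roots[1]:.3f}")
print(f"slope at upper root: {slope:.3f} per unit of X")
print(f"approx. X-distance for Y to go from 0 to 1: {1 / slope:.3f}")
```

The slope at the upper root comes out at roughly 2.75 per unit of X, so Y climbs from 0 to 1 in under 0.4 units, which is why the whole 50-day axis would have to fit into such a narrow range.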

The paper's analysis was done in SPSS, which apparently uses Lilian format (number of seconds since October 14, 1582). So it's not that.

In Excel (well, LibreOffice), 42280 is 03/10/15, whereas 15/07/05 is 38548 - so it's not that (number of days from 1 Jan 1900) either.
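
For anyone wanting to repeat the two checks above, here is a rough sketch. It assumes LibreOffice's default day zero of 30 December 1899 and takes the SPSS origin (seconds from 14 October 1582) from the post rather than from SPSS itself; the function names are made up for the example.

```python
from datetime import date, timedelta

# Assumption: LibreOffice's default day zero, 30 December 1899.
SPREADSHEET_EPOCH = date(1899, 12, 30)
# Assumption (from the post above): SPSS counts seconds from 14 October 1582.
SPSS_EPOCH = date(1582, 10, 14)

def serial_to_date(serial):
    """Whole-day spreadsheet serial number -> calendar date."""
    return SPREADSHEET_EPOCH + timedelta(days=serial)

def date_to_serial(d):
    """Calendar date -> whole-day spreadsheet serial number."""
    return (d - SPREADSHEET_EPOCH).days

def date_to_spss_seconds(d):
    """Calendar date (at midnight) -> seconds since the assumed SPSS origin."""
    return (d - SPSS_EPOCH).days * 86400

start = date(2005, 7, 15)
print(serial_to_date(42280))        # 2015-10-03
print(date_to_serial(start))        # 38548
print(date_to_spss_seconds(start))  # an 11-digit number, nowhere near 42280
```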

I'm 99.9% sure no fraud has been committed here, but I cannot figure out how whichever program was used to fit the curve was treating the dates. Anybody got an idea?

Thanks in advance.
We have the right to a clean, healthy, sustainable environment.

jimbob
Light of Blast
Posts: 5276
Joined: Mon Nov 11, 2019 4:04 pm
Location: High Peak/Manchester

Re: Weird date-as-numeric format

Post by jimbob » Tue Sep 28, 2021 5:45 pm

What numbers does it spit out on the x-axis to make that fit?

Would it be simply a delta from a recent starting point - say when the population is zero?
Have you considered stupidity as an explanation

Bird on a Fire
Princess POW
Posts: 10137
Joined: Fri Oct 11, 2019 5:05 pm
Location: Portugal

Re: Weird date-as-numeric format

Post by Bird on a Fire » Tue Sep 28, 2021 5:54 pm

It could be a delta - days after the 1st of some month is quite common. Except that it can't be days, or even months, because the range is too small.

(I don't have the original data, that's a screenshot from the pdf.)

WFJ
Catbabel
Posts: 648
Joined: Tue Jun 01, 2021 7:54 am

Re: Weird date-as-numeric format

Post by WFJ » Tue Sep 28, 2021 6:13 pm

My guess would be they used a count rather than proportion in the fit, then changed the y axis later. The dates, when converted to reals for the fit, would almost certainly have units of seconds or days.

dyqik
Princess POW
Posts: 7527
Joined: Wed Sep 25, 2019 4:19 pm
Location: Masshole

Re: Weird date-as-numeric format

Post by dyqik » Tue Sep 28, 2021 6:18 pm

Bird on a Fire wrote:
Tue Sep 28, 2021 5:54 pm
It could be a delta - days after the 1st of some month is quite common. Except that it can't be days, or even months, because the range is too small.

(I don't have the original data, that's a screenshot from the pdf.)
Could it be quarters? Your 0.4 range looks like it covers maybe slightly more than a month, which would be a third of a quarter.

Or maybe 100 days?
Last edited by dyqik on Tue Sep 28, 2021 6:21 pm, edited 3 times in total.

Bird on a Fire
Princess POW
Posts: 10137
Joined: Fri Oct 11, 2019 5:05 pm
Location: Portugal

Re: Weird date-as-numeric format

Post by Bird on a Fire » Tue Sep 28, 2021 6:19 pm

WFJ wrote:
Tue Sep 28, 2021 6:13 pm
My guess would be they used a count rather than proportion in the fit, then changed the y axis later. The dates, when converted to reals for the fit, would almost certainly have units of seconds or days.
That's possible. The y-axis values are supposedly daily mean proportions (of juvenile birds in different flocks - 66 flocks over 16 different days), so each day would have a different denominator etc.

I might just resort to plotting my data separately with the same x-limits, and overlaying the published curve using paint.

Reproducibility ftw.

Bird on a Fire
Princess POW
Posts: 10137
Joined: Fri Oct 11, 2019 5:05 pm
Location: Portugal

Re: Weird date-as-numeric format

Post by Bird on a Fire » Tue Sep 28, 2021 6:24 pm

dyqik wrote:
Tue Sep 28, 2021 6:18 pm
Bird on a Fire wrote:
Tue Sep 28, 2021 5:54 pm
It could be a delta - days after the 1st of some month is quite common. Except that it can't be days, or even months, because the range is too small.

(I don't have the original data, that's a screenshot from the pdf.)
Could it be quarters? Your 0.4 range looks like it covers maybe slightly more than a month, which would be a third of a quarter.

Or maybe 100 days?
Hmm, yes that would fit fairly well. I make it that 40 days is 0.36, roughlyish.

But if it's 42280 quarters, that implies starting 10570 years ago. I know SPSS is old, but that seems a bit crazy.

dyqik
Princess POW
Posts: 7527
Joined: Wed Sep 25, 2019 4:19 pm
Location: Masshole

Re: Weird date-as-numeric format

Post by dyqik » Tue Sep 28, 2021 6:30 pm

There's a typo in the formula. Plotting it shows that the quadratic term is nowhere near strong enough to produce the curvature in the plot.
[Image: plot.png]
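
The same point can be checked numerically without a plot. A rough sketch, assuming the formula as printed and the roughly 0.36-wide X range worked out earlier in the thread; it measures how far the curve sags below a straight line joining its two ends:

```python
# Published godwit formula, coefficients as printed in the paper.
def y(x):
    return 598809.0 - 31.075 * x + 0.0004 * x * x

# Range over which Y runs from about 0 to about 1.
x_lo, x_hi = 42280.96, 42281.32
x_mid = (x_lo + x_hi) / 2

# Height of the straight chord at the midpoint, minus the curve there.
chord_mid = (y(x_lo) + y(x_hi)) / 2
sag = chord_mid - y(x_mid)

print(f"Y at the ends: {y(x_lo):.3f}, {y(x_hi):.3f}")
print(f"sag of the curve below the chord at mid-range: {sag:.6f}")
```

The sag is of order 1e-5 on a curve spanning 0 to 1, so over this range the printed formula is effectively a straight line.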

Bird on a Fire
Princess POW
Posts: 10137
Joined: Fri Oct 11, 2019 5:05 pm
Location: Portugal

Re: Weird date-as-numeric format

Post by Bird on a Fire » Tue Sep 28, 2021 6:37 pm

Thanks dyqik! I think you've cracked the problem.

Interestingly the other reported formulas in the paper (for other species) have coefficients on very similar scales.... f.ck knows what's going on there - some weirdness in SPSS's curve-fitting thing, or a f.ckup from the author.

So I'll use Paint, then. You've saved me a lot of time - cheers.

Bird on a Fire
Princess POW
Posts: 10137
Joined: Fri Oct 11, 2019 5:05 pm
Location: Portugal

Re: Weird date-as-numeric format

Post by Bird on a Fire » Tue Sep 28, 2021 6:46 pm

The reported degrees of freedom for some of the F tests don't stack up either, which gives me an inkling as to which hypothesis is likelier...

dyqik
Princess POW
Posts: 7527
Joined: Wed Sep 25, 2019 4:19 pm
Location: Masshole

Re: Weird date-as-numeric format

Post by dyqik » Tue Sep 28, 2021 6:47 pm

This gets you somewhat closer to the curve in the plot, but I still have no idea what the date units are.
[Image: plot.png]

dyqik
Princess POW
Posts: 7527
Joined: Wed Sep 25, 2019 4:19 pm
Location: Masshole

Re: Weird date-as-numeric format

Post by dyqik » Tue Sep 28, 2021 6:51 pm

By the way, SMath is a very useful tool for quickly throwing this kind of thing together.

Bird on a Fire
Princess POW
Posts: 10137
Joined: Fri Oct 11, 2019 5:05 pm
Location: Portugal

Re: Weird date-as-numeric format

Post by Bird on a Fire » Tue Sep 28, 2021 8:00 pm

Thanks again! MathCAD looks handy.

I'll drop the author an email and see if he can shed some light, but otherwise won't sweat it.

monkey
After Pie
Posts: 1906
Joined: Wed Nov 13, 2019 5:10 pm

Re: Weird date-as-numeric format

Post by monkey » Tue Sep 28, 2021 9:56 pm

Bird on a Fire wrote:
Tue Sep 28, 2021 8:00 pm
Thanks again! MathCAD looks handy.

I'll drop the author an email and see if he can shed some light, but otherwise won't sweat it.
If there was a typo, the paper will need a correction. I've made that happen before when I found one where the equations were correct but put on the wrong figures (nothing else wrong with the paper).

Millennie Al
After Pie
Posts: 1621
Joined: Mon Mar 16, 2020 4:02 am

Re: Weird date-as-numeric format

Post by Millennie Al » Thu Sep 30, 2021 1:14 am

I have had a look at the paper and I think the equations are wrong. I'd guess that the wrong input was used to derive them. The graphs look plausible, though, so maybe they were plotted first and then the analysis was run for the equations and used the wrong columns or something.

Here are my notes as I may have made errors in copying or solving the equations. For each equation I solve for y=0 and also estimate it by eye from the corresponding graph. The values are then summarised at the end, sorted. T is 15 July. I also note two minor anomalies between graph and text.

redshank
y = 366.48 – 0.0095x
0 at about T+90 days = 38577

Black-Tailed godwit
y = 598809 – 31.075x + 0.0004x^2
0 at about T+7 = 42281
The text says "The first fledged juvenile in a flock was found on 19th July" (T+4), but on the graph it looks more like T+7

dunlin
y = –330820 + 17.134x – 0.0002x^2
0 at about T-2 = 29391

red knot
y = – 520573 + 26.956x – 0.0003x^2
0 at about T+10 = 28099

Purple sandpiper
y = – 907.64 + 0.0235x
0 at about T+3 = 38623

Sanderling
y = –1154.764 + 0.0299x
0 at about T+17 = 38621

Turnstone
y = 772857 – 40.088x + 0.0005x^2
0 at about T+10 = 47920

Ringed plover
Y = – 484.43 + 0.0126X
"When the first flocks were surveyed on 24 July, JP was already as high as 0.2"
but graph shows first two values are 0.4
0 at about T-20 = 38447

Sorting all those zeroes by solution:
+10 28099
-2 29391
-20 38447
+90 38577
+17 38621
+3 38623
+7 42281
+10 47920

There is clearly no consistent time value there. Maybe the fitting used total number of juveniles seen, or total birds seen, instead of JP. Or maybe the date conversion went wrong. It's probably a simple fix for anyone who has the raw data - hopefully it's still around as the paper was published in 2006.
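
The zeroes in the notes above can be reproduced with a few lines of Python. This is only a sketch: the coefficients are copied from the notes, and for the quadratics it assumes the root of interest is the one where y is rising, which is the choice that matches the values listed.

```python
import math

# (a, b, c) for y = a*x^2 + b*x + c, copied from the notes above.
EQUATIONS = {
    "redshank":            (0.0,     -0.0095,   366.48),
    "black-tailed godwit": (0.0004,  -31.075,   598809.0),
    "dunlin":              (-0.0002,  17.134,  -330820.0),
    "red knot":            (-0.0003,  26.956,  -520573.0),
    "purple sandpiper":    (0.0,      0.0235,  -907.64),
    "sanderling":          (0.0,      0.0299,  -1154.764),
    "turnstone":           (0.0005,  -40.088,   772857.0),
    "ringed plover":       (0.0,      0.0126,  -484.43),
}

def zero_crossing(a, b, c):
    """Return x where y = 0; for quadratics, the root where y is rising."""
    if a == 0.0:
        return -c / b
    disc = b * b - 4 * a * c
    # At this root the slope 2*a*x + b equals +sqrt(disc), so y is increasing.
    return (-b + math.sqrt(disc)) / (2 * a)

zeros = {name: zero_crossing(*coef) for name, coef in EQUATIONS.items()}
for name, x0 in sorted(zeros.items(), key=lambda item: item[1]):
    print(f"{x0:9.0f}  {name}")
```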

Bird on a Fire
Princess POW
Posts: 10137
Joined: Fri Oct 11, 2019 5:05 pm
Location: Portugal

Re: Weird date-as-numeric format

Post by Bird on a Fire » Thu Sep 30, 2021 10:40 am

Wow - thanks Millennie Al. That's above and beyond what I was expecting!

I'll get in touch with the author (I've met him) and see if he can shed some light.

sheldrake
After Pie
Posts: 1819
Joined: Fri Dec 20, 2019 2:48 am

Re: Weird date-as-numeric format

Post by sheldrake » Thu Sep 30, 2021 3:55 pm

The number given is close to, but not exactly, the number of days between Jan 1st 1900 and July 5th 2015

42188 vs 42280

Probably a red herring.

shpalman
Princess POW
Posts: 8242
Joined: Mon Nov 11, 2019 12:53 pm
Location: One step beyond

Re: Weird date-as-numeric format

Post by shpalman » Thu Sep 30, 2021 4:09 pm

That's definitely the "wrong" way to be doing it if similar curves have such wildly different coefficients, mainly because the date squared ends up being a huge number which needs a similarly huge constant to offset it.

Something like a(x-x0)^2+b(x-x0)+c would have been better - set x0 to the start of the data series somewhere, or leave it free (and its value will actually tell you something useful). Then depending on the model you might even be able to set c=0.
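
To make the suggestion concrete, here is a small sketch that rewrites a raw-date quadratic in the centred form. It uses the godwit coefficients as printed in the paper, and `x0 = 42281` is an arbitrary illustrative choice near the root found earlier in the thread, not a value from the paper.

```python
def recenter(a, b, c, x0):
    """Rewrite a*x^2 + b*x + c as a2*(x-x0)^2 + b2*(x-x0) + c2."""
    a2 = a
    b2 = 2 * a * x0 + b            # slope of the curve at x0
    c2 = a * x0 * x0 + b * x0 + c  # value of the curve at x0
    return a2, b2, c2

# Godwit coefficients as printed; x0 chosen only for illustration.
a, b, c = 0.0004, -31.075, 598809.0
x0 = 42281.0
a2, b2, c2 = recenter(a, b, c, x0)
print(a2, round(b2, 4), round(c2, 4))

# Both forms should agree at any x.
x = 42281.2
raw = a * x * x + b * x + c
centred = a2 * (x - x0) ** 2 + b2 * (x - x0) + c2
print(raw, centred)
```

In the centred form the coefficients are all of order 1 or smaller (roughly 0.0004, 2.75 and 0.11 here), so quoting them to a few significant figures does far less damage than rounding the raw-date coefficients.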
having that swing is a necessary but not sufficient condition for it meaning a thing
@shpalman@mastodon.me.uk

shpalman
Princess POW
Posts: 8242
Joined: Mon Nov 11, 2019 12:53 pm
Location: One step beyond

Re: Weird date-as-numeric format

Post by shpalman » Thu Sep 30, 2021 6:29 pm

Bird on a Fire wrote:
Tue Sep 28, 2021 5:40 pm
... 15 July 2005...
sheldrake wrote:
Thu Sep 30, 2021 3:55 pm
The number given is close to, but not exactly, the number of days between Jan 1st 1900 and July 5th 2015
Which year is it supposed to be anyway?

sheldrake
After Pie
Posts: 1819
Joined: Fri Dec 20, 2019 2:48 am

Re: Weird date-as-numeric format

Post by sheldrake » Thu Sep 30, 2021 6:33 pm

Typo on my part, I did the calculation for 2005

shpalman
Princess POW
Posts: 8242
Joined: Mon Nov 11, 2019 12:53 pm
Location: One step beyond

Re: Weird date-as-numeric format

Post by shpalman » Thu Sep 30, 2021 6:38 pm

shpalman wrote:
Thu Sep 30, 2021 4:09 pm
That's definitely the "wrong" way to be doing it if similar curves have such wildly different coefficients, mainly because the date squared ends up being a huge number which needs a similarly huge constant to offset it.
... which means that its coefficient is a small number, but it's vitally important to be machine-precise with it or else it completely changes the behaviour. You can't just say, as the author did, meh it's small I'll just give one significant figure.

Using LibreOffice's zero day convention (30/12/1899) and assuming the graph is 2005 I get something similar to the graph with

0.0004395799826*(date)^2-33.885347*date+653019.2

but if you don't use all those decimal places it is nowhere near.
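
A quick way to see the sensitivity, as a sketch only: the first function uses the coefficients quoted above, and the second cuts each one to fewer figures. The rounding is an illustrative choice, not the paper's own values.

```python
# Coefficients quoted above, for dates as LibreOffice serial numbers.
def y_full(d):
    return 0.0004395799826 * d * d - 33.885347 * d + 653019.2

# Same formula with each coefficient cut to fewer figures
# (an illustrative rounding, not the paper's own values).
def y_rounded(d):
    return 0.0004 * d * d - 33.885 * d + 653019.0

# 38548 is 15/07/05; 38591 is 43 days later, 27/08/05.
for d in (38548, 38570, 38591):
    print(d, round(y_full(d), 3), round(y_rounded(d), 1))
```

The full-precision version stays between about 0 and 1 over the survey dates, while the rounded one is off by tens of thousands.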

shpalman
Princess POW
Posts: 8242
Joined: Mon Nov 11, 2019 12:53 pm
Location: One step beyond

Re: Weird date-as-numeric format

Post by shpalman » Thu Sep 30, 2021 8:32 pm

It's https://www.hi.is/sites/default/files/m ... _props.pdf ?

By the way my "data" from the Black-Tailed Godwit image is something like


15/07/05	0.0
09/09/05	1.0
18/07/05	0.0
20/07/05	0.0
21/07/05	0.1
22/07/05	0.0
27/07/05	0.0
02/08/05	0.1
03/08/05	0.1
04/08/05	0.3
08/08/05	0.1
10/08/05	0.6
12/08/05	0.7
16/08/05	0.8
18/08/05	0.4
24/08/05	0.7
26/08/05	1.0
27/08/05	1.0
... if anyone wants to play with fitting a quadratic to it using various options for day zero.
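
For anyone who does want to play, here is a minimal least-squares sketch using only the standard library. The points are the by-eye values listed above, and taking the earliest date as day zero is just one assumed option; change `day_zero` to try others.

```python
from datetime import datetime

# Points read off the graph by eye, as listed above (dd/mm/yy, proportion).
RAW = """15/07/05 0.0
09/09/05 1.0
18/07/05 0.0
20/07/05 0.0
21/07/05 0.1
22/07/05 0.0
27/07/05 0.0
02/08/05 0.1
03/08/05 0.1
04/08/05 0.3
08/08/05 0.1
10/08/05 0.6
12/08/05 0.7
16/08/05 0.8
18/08/05 0.4
24/08/05 0.7
26/08/05 1.0
27/08/05 1.0"""

def parse(raw):
    """Turn the text table into (date, value) pairs."""
    rows = []
    for line in raw.strip().splitlines():
        day, value = line.split()
        rows.append((datetime.strptime(day, "%d/%m/%y").date(), float(value)))
    return rows

def fit_quadratic(xs, ys):
    """Least-squares a, b, c for y = a*x^2 + b*x + c via the normal equations."""
    s = [sum(x ** k for x in xs) for k in range(5)]            # sums of x^0..x^4
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented 3x3 system for the unknowns (c, b, a).
    m = [[s[0], s[1], s[2], t[0]],
         [s[1], s[2], s[3], t[1]],
         [s[2], s[3], s[4], t[2]]]
    for i in range(3):                       # Gaussian elimination with pivoting
        p = max(range(i, 3), key=lambda r: abs(m[r][i]))
        m[i], m[p] = m[p], m[i]
        for r in range(i + 1, 3):
            f = m[r][i] / m[i][i]
            m[r] = [u - f * v for u, v in zip(m[r], m[i])]
    sol = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):                      # back substitution
        tail = sum(m[i][j] * sol[j] for j in range(i + 1, 3))
        sol[i] = (m[i][3] - tail) / m[i][i]
    c, b, a = sol
    return a, b, c

rows = parse(RAW)
day_zero = min(d for d, _ in rows)           # assumed choice: earliest date
xs = [(d - day_zero).days for d, _ in rows]
ys = [v for _, v in rows]
a, b, c = fit_quadratic(xs, ys)
print(f"y = {a:.6f}*t^2 + {b:.6f}*t + {c:.4f}   (t = days since {day_zero})")
```

Shifting day zero changes b and c but not the fitted curve itself, so the choice only affects how large and how rounding-sensitive the reported coefficients are.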

As I said I think the problem, apart from not knowing which date is day zero, is imprecision in the reported coefficients, and I think the linear fits will suffer less from that:
Millennie Al wrote:
Thu Sep 30, 2021 1:14 am
...
redshank
y = 366.48 – 0.0095x
0 at about T+90 days = 38577
...
Purple sandpiper
y = – 907.64 + 0.0235x
0 at about T+3 = 38623

Sanderling
y = –1154.764 + 0.0299x
0 at about T+17 = 38621
...
Ringed plover
Y = – 484.43 + 0.0126X
"When the first flocks were surveyed on 24 July, JP was already as high as 0.2"
but graph shows first two values are 0.4
0 at about T-20 = 38447
Keeping the zeroes corresponding to the linear fits:
-20 38447
+90 38577
+17 38621
+3 38623
They're much closer together. 38548 is 15/07/05 in LibreOffice.
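
As a rough check, the four linear-fit zeroes can be turned back into calendar dates. This sketch assumes LibreOffice's day zero of 30 December 1899 and takes T as 15 July 2005, the serial number 38548 worked out earlier in the thread.

```python
from datetime import date, timedelta

EPOCH = date(1899, 12, 30)   # assumed LibreOffice day zero
T = date(2005, 7, 15)        # first survey date, serial 38548

# Zero crossings of the four linear fits, from the list above.
linear_zeros = {
    "ringed plover": 38447,
    "redshank": 38577,
    "sanderling": 38621,
    "purple sandpiper": 38623,
}

for name, serial in linear_zeros.items():
    d = EPOCH + timedelta(days=serial)
    print(f"{name:17s} {serial}  {d.isoformat()}  T{(d - T).days:+d}")
```

All four land in 2005, within a few months of the survey period, which is at least consistent with x being a spreadsheet-style day count for the linear fits, although the offsets from T do not match the by-eye estimates.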

dyqik
Princess POW
Posts: 7527
Joined: Wed Sep 25, 2019 4:19 pm
Location: Masshole

Re: Weird date-as-numeric format

Post by dyqik » Thu Sep 30, 2021 8:53 pm

Is there really enough data there to conclude that a quadratic fit is better than a linear fit?

shpalman
Princess POW
Posts: 8242
Joined: Mon Nov 11, 2019 12:53 pm
Location: One step beyond

Re: Weird date-as-numeric format

Post by shpalman » Thu Sep 30, 2021 8:58 pm


monkey
After Pie
Posts: 1906
Joined: Wed Nov 13, 2019 5:10 pm

Re: Weird date-as-numeric format

Post by monkey » Thu Sep 30, 2021 9:08 pm

dyqik wrote:
Thu Sep 30, 2021 8:53 pm
Is there really enough data there to conclude that a quadratic fit is better than a linear fit?
Both are wrong - I'd be surprised if you started a bit earlier and measured a negative population :)
