Dates – SortDate Algorithm #date #sortdate

Initial SortDate Algorithm

The SortDate column in the EventTable contains very large numbers that have defied until now any easy method of decoding. After many experiments with date entry and inspection of the stored values, I now have deduced an algorithm that works fine for encoding single, pure dates and decoding corresponding SortDates. Date modifiers such as Bef, Aft, From-To, etc. have effects that I do not fully understand but there are patterns emerging which can be more completely identified only with more comprehensive testing.

My current model of the RM4 SortDate encoding algorithm begins with a base year of 10000BC that has an already pretty large number as a constant, C. From the date, the year number is extracted, added to 10000 and the result multiplied by a very large coefficient, Ky. Added to that is the month number multiplied by the month coefficient, Km, plus the day number multiplied by the day coefficient, Kd. Algebraically, the encoding expression looks like this:

SortDate (Ds) = Ky*(Y + 10000) + Km*M + Kd*D + C,
where
Ky = 562949953421312  (2^49)
Km = 35184372088832   (2^45)
Kd = 549755813888     (2^39)
C  = 17178820620      (2^33.9999119432032)
Y  = yyyy
M  = mm
D  = dd

The decoding expressions are the inverse:

Y = (Ds - C) / Ky - 10000
M = (Ds - C) % Ky / Km
D = (Ds - C) % Ky % Km / Kd
R = (Ds - C) % Ky % Km % Kd
where
Ds = SortDate
R  = remainder
%  = modulus operator

The remainder, R, is significant for the decoding of modifiers while M=15, and D=63 are results when modifiers are used, as can be seen in the filtered result set below from a RM4 database, using the current decoder:

Date                        SortDate               Year  Month  Day   Remainder
Single Dates
D.+19470000..+00000000..    6725563110703235084    1947    0     0     0
D.+19470100..+00000000..    6725598295075323916    1947    1     0     0
D.+19470101..+00000000..    6725598844831137804    1947    1     1     0
D.+19470102..+00000000..    6725599394586951692    1947    1     2     0
D.+18501214..+00000000..    6671386874267828236    1850    12    14    0
 
Modified Dates
After
DA+18280000..+00000000..    6659134466444753951    1828    15    63    1047571
DA+18710000..+00000000..    6683341314441870367    1871    15    63    1047571
Range
DR+18100000..+18120000..    6648438962291474447    1809    15    63    544962772995
DR+19160000..+19180000..    6708111657465282575    1915    15    63    545073922051
DS+18980000..+19500000..    6697978558337253394    1897    15    63    545107476486
Before
DB+18250000..+00000000..    6656883216385835008    1824    15    63    549755813876
DB+19130000..+00000000..    6706422812286910464    1912    15    63    549755813876
 

It appears that the Before and After modifiers result in remainders that are constants, independent of the date, while the Range modifiers result in varying remainders. Thus inspection of the remainder and testing for known constants should result in ascertaining the type of modifier that was used.

Download

SortDateEncodeDecode.sql
Contains both the encoder query and the decoder query. Separate as required. Most GUI SQLite managers allow you to select the statements you want to execute. This script is deprecated; see below for more complete solutions.

Advanced Date Algorithms by R. Steven Turley

Here are the algorithms for decoding RM5 dates. These may be the same as RM4, but I haven’t tested them (ve3meo Dec 31, 2012 tested in RM6 and okay). They work for all normal dates, but I haven’t tested them on old dates or Quaker dates. I’m hoping those will just translate to their Julian equivalents and work okay. I’ll report those tests later. These dates are best understood in terms of bit fields on a long (64 bit) integer. Bit 0 is the least significant bit.

bitsvalue
0-9flag indicating date modifiers (see table below)
10-15day for the second date; 0 if no second date
16-19month for the second date (1=Jan, 2=Feb, etc), 0 if no second date
20-33year for the second date+10000; 0x3fff if no second date
34-38these were always zero in my tests, must be reserved for something else
39-44day for the first date; 0 if no day is supplied
45-48month for the first date; 0 if no month is supplied
49-64year for the first date + 10000

If the date is missing, the value 0xffffffff (2^65-1) is stored. This makes an event with no date, the last date in the list as far as the sort order is concerned.

The flag fields tell what kind of modifiers the date has. Certainty modifiers are not included in SortDates since they don’t affect the sort order. Here are the possible flag values:

flag valuemeaning
0before
3by
6to
9until
12normal date (no modifier)
15between/and
18from/to
21dash (25 Oct 2011-28 Oct 2011)
24or
27from
30since
31after

Note that the value of these flags determines the sort order for dates as listed above. Since dates that don’t have ranges store their year as 0x3fff, they will always come after dates with ranges that have a year less than this.

Algorithms

For almost all compilers, shift and bit-wise mask operators are faster than the division and remainder operations outlines above. If your system supports it, I recommend the following algorithms to extract the dates and flag from sort dates. In these formulas i>>j is the operation of shifting the number i j bits to the right. The operation i<<j shifts the number i j bits to the left. The operation i&j performs a bitwise and of the two numbers. This is the notation used in C, C++, C#, and python. I’m not sure about other languages. Let Ds be the sort date, as above, Y1 be the year of the first date, M1 be the month of the first date, D1 be the day of the first date, Y2 be the year of the second date, M2 be the month of the second date, D2 by the day of the second date, and F by the flag.

Ds = ((Y1 + 10000)<<49) & (M1<<45) & (D1<<39) & (Y2<<20) & (M2<<16) & (D2<<10) & F ??
Ds = If no Date then
9223372036854775807
else
((Y1 + 10000)<<49) + (M1<<45) + (D1<<39) + (If Date2 then (Y2 + 10000)<<20 else
17178820608)
+ (M2<<16) + (D2<<10) + F

Y1 = (Ds>>49) – 10000
M1 = (Ds>>45) & 0xf
D1 = (Ds>>39) & 0x3f
Y2 = (Ds>>20) & 0x3fff – 10000
M2 = (Ds>>16) & 0xf
D2 = (Ds>>10) & 0x3f
F = Ds & 0x3ff

Downloads

SortDateDecodeDev.sql an example decoder using Steve’s algorithm
SortDateEncoderDev2.sql an encoding example using the modified version of Steve’s algorithm; streamlined, faster and more readable code than SortDateEncoderDev.sql.
SortDates (2013-01-01).rmgbRM6 database with samples of most date formats for testing with above.

Discussions & comments from Wikispaces site


ve3meo

Extended modifier codes and revised algorithm

ve3meo
03 November 2012 15:54:28

Great addition, Steve! Opens the way for a whole bunch of things.


ve3meo

ve3meo
04 November 2012 02:40:45

Steve, I thought your statement about the comparative speed of division versus bit-shifting operations was right on and would have a big difference. Much to my surprise, it works out to be only about 20-25% faster on a very large record set (over 317,000). On my computer, the algorithm executed as division and modulus took around 2s while the same using bit-shifting took around 1.6s with SQLiteSpy.

/* CompareSortDateDecoderSpeeds */

SELECT Date,
(SortDate – 17178820620) / 562949953421312 – 10000 AS Year,
(SortDate – 17178820620) % 562949953421312 / 35184372088832 AS Month,
(SortDate – 17178820620) % 562949953421312 % 35184372088832 / 549755813888 AS Day,
(SortDate – 17178820620) % 562949953421312 % 35184372088832 % 549755813888 AS Remainder
FROM EventTable
;

SELECT Date,
(Ds >> 49) – 10000 AS Y1,
(Ds >> 45) & 15 AS M1,
(Ds >> 39) & 63 AS D1,
Ds & 1023 AS F
FROM (SELECT Date, SortDate AS Ds FROM EventTable)
;


ve3meo

Comment: “If the date is missing, the value 0xffffffff (2^65-1) is stored‍”

ve3meo
03 September 2018 20:19:22

ve3meo Dec 31, 2012

I get
9223372036854775807 or
2^63-1 or
x’7F FF FF FF FF FF FF FF’


ve3meo

Comment: flags “27 from 30 since 31 after‍”

ve3meo
03 September 2018 20:20:56

ve3meo Dec 31, 2012

For encoding, I find the following:
From (27 + x’FFC00′)
Since (30 + x’FFC00′)
After (31 + x’FFC00′)
That is the month and day bits 10-19 are set to 1 for these 3 modifiers.


ve3meo

Comment: “Ds = If no Date then 9223372036854775807 else ((Y1 + 10000)<<49) + (M1<<45) + (D1<<39) + (If Date2 then (Y2 + 10000)<<20 else 17178820608) + (M2<<16) + (D2<<10) + F ‍"

ve3meo
03 September 2018 20:22:38

ve3meo Dec 31, 2012

As implemented successfully in SortDateEncoderDev.sql; maybe the &s were a typo error.

Inline comments


ve3meo

Comment: second year Y2 bits

ve3meo
31 December 2012 20:57:56

second year Y2 bits

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.