Cypher Sleuthing: Dealing with Dates, Part 4
- 10 minutes read - 2120 words*Updated Dec 15, 2021
My previous part 1, part 2, and part 3 posts on this topic introduced Cypher dates, translated formats to the Cypher-supported ISO 8601 format, calculated durations, accessed components, and translated durations to certain values.
If you read those, then this post is the next post showing how to convert durations across component categories. If you haven’t read the previous posts leading up to this one, feel free to catch up (recommend at least Part 3 as this post’s prequel) - though it isn’t required. 🙂
In our Part 3 post, we covered three duration categories (and related functions) for converting duration values to whole values - Months, Days, Seconds. We also looked at components and how to use them to access other values outside of those months, days, and seconds. In this post, we will combine the three duration functions along with various components to translate values from a unit in one category to a unit in another category. Then, we will take a look at a couple of fun puzzles having to do with Cypher and dates to close up the post.
Rewind: Quick catchup
Let’s review a couple of principles on duration categories and components from our last post before we dive into using them together. There are 3 component groups (shown again in the screenshot below), and these form the foundation for duration units and conversions.
Table can be found in the Cypher manual section.
As a reminder: the column on the right is the key - values in one column can be converted to any other unit in that same cell, but not another row. For instance, I could convert a duration from days to weeks, but not to months or minutes. Notice, also, that these categories correspond to our duration functions of inMonths()
, inDays()
, and inSeconds()
, which we will discuss in this post! Finally, here are the rules I discovered for the Cypher component groups.
You can only convert among units in a component grouping, not across groups.
There must be whole values to convert to larger units. Components do not mix whole and fraction values (i.e.
36 hours
→1.5 days
). There are specific components that handle whole values of the unit, and there are separate components that handle only fractions of the unit.
Now, per our rules, we are limited in expressing certain durations as numbers that aren’t easy to understand or read. For instance, saying that my flight leaves in 28 hours
(while precise) probably triggers a mental calculation to 1 day 4 hours
. Because our brains are used to allocating resources into the largest buckets first, then remainders into smaller categories, our perception of time and planning seems to operate better in these formats.
As another example, most things operate on a 12-hour clock (3:00pm), except for where precision matters. In those cases, you will see a 24-hour clock (15:00). But telling a random person on the street that the time is fifteen hundred
will most likely return confused looks. Think smaller numbers, larger units (1 day vs 24 hours, 2 months vs 60 days, etc).
Ok, so how can we translate some of these larger-number-smaller-unit values into something more easily understood by others? Earlier, I mentioned that the duration categories (Months, Days, Seconds) align with the duration functions (inMonths()
, inDays()
, inSeconds()
). This is to allow conversions across component groups!
Combining Duration functions and components
Using durations functions with components means we can convert our duration values from one component category into another, and then translate among the components within that group.
Note that precision may not always be 100% accurate, as days in a month or hours in a day are not consistent throughout the year. However, the conversions use general standards (30 days in a month, 24 hours in a day).
Let’s expand the power of converting durations across categories by combining functions and components!
Example 1: Translate duration from months/days to weeks/days
MATCH (v:Vacation)
RETURN
duration.between(date(),v.startDate) as preciseDuration,
duration.inDays(date(),v.startDate).weeks as weeks,
duration.inDays(date(),v.startDate).daysOfWeek as daysOfWeek;
In our example above, we are translating a duration in months and days into days and weeks. First, we need to calculate the entire value into days using the duration.inDays()
method. Then, we can access the weeks and days components from that.
Note that, depending on the input dates (and year), the results can vary - i.e. durations with those same dates but in a leap year or from June 30-Sept 1 that includes two consecutive months with 31 days each.
Let’s take another example!
Example 2: Calculate projected hours a resource works on a project
MATCH (:Employee)-[rel:ASSIGNED]-(p:Project)
RETURN duration.between(rel.startDate, p.expectedEndDate) as lengthAssigned,
duration.inSeconds(rel.startDate, p.expectedEndDate).hours as lengthInHours;
You might respond that this assumes an employee spent 24 hours per day and 7 days per week on a project, right? You would be correct! Let’s correct this, since our lengthInHours
value is now just a number and not a duration value.
MATCH (:Employee)-[rel:ASSIGNED]-(p:Project)
RETURN duration.between(rel.startDate, p.expectedEndDate) as lengthAssigned,
duration.inDays(rel.startDate, p.expectedEndDate).weeks as lengthInWeeks,
duration.inDays(rel.startDate, p.expectedEndDate).weeks * 40 as projectHours;
Notice that in order to get the project hours, I took the approach to find the total number of weeks for the project, then calculate the number of hours per week worked (assuming a 40hr work week and 100% allocation of the resource). Now, this number may not be the actual total of the completed project, but would definitely be a good estimate number.
Switching gears just a bit, infant age is always confusing to me because some people go by weeks, months, or years (e.g. baby is 14 months old…cue math calculation in brain). Let’s let Cypher do the work.
Example 2: Calculate age of infant
MERGE (b:Baby)
SET b.dateOfBirth = date('2021-02-28')
RETURN b.dateOfBirth,
duration.between(b.dateOfBirth, date()) as age,
duration.inMonths(b.dateOfBirth, date()).months as months,
duration.inDays(date('2021-02-28'),date()).weeks as weeks;
In the code above, we set the baby’s birthdate as February 28, 2021
, then in the return statement, calculate the baby’s age by precise duration, months, and weeks. In order to get months, we calculate the duration directly to months with the inMonths()
function. Then, to get weeks, we first need to convert to the days/weeks category using the inDays()
function, then grab the component for weeks. No more mental calculation!
We could be even more specific with the components to preserve remainder values.
Example 3: Calculate age of infant with remainder units
MATCH (b:Baby)
RETURN b.dateOfBirth,
duration.between(b.dateOfBirth, date()) as age,
duration.inDays(b.dateOfBirth,date()).weeks as weeks,
duration.inDays(b.dateOfBirth,date()).daysOfWeek as daysOfWeek;
We could use this in a baby tracker app to tell us that a baby is exactly 41 weeks and 3 days old
. This could also be used for more precision on a product/application being live, amount of time without incidents, or many other use cases!
Cypher puzzles
In the past couple of weeks, I have come across a couple of fun puzzles with Cypher dates that I’d like to share with you. I will include answers, but I’ll post those at the bottom, so that those who want to challenge themself without peeking first can solve the puzzles.
Postgres SQL ranges
The first challenge is a calendar appointment query. Postgres received an update, which improves queries in SQL for range data. This gives us a fun opportunity to see what Cypher’s version of this looks like. Let’s take a look at the question.
Available dates in the next month (https://twitter.com/craigkerstiens/status/1402688838124802054)
Updated SQL solution:
SELECT datemultirange(daterange('2021-06-01', '2021-06-30', '[]')) -
range_agg(appointment_dates) AS availability
FROM appointments
WHERE appointment_dates && daterange('2021-06-01', '2021-06-30', '[]');
Solve away! Answer will be posted at the bottom of this post.
Weekly progress of year
Our second challenge is to write a Cypher query that visualizes progress through the year on a weekly basis. We will stick to using regular characters to visualize the progress, so it won’t be anything fancy and no extra tools will be needed. Let’s see our task.
Create a progress bar for how many weeks of the year have passed (include percentage, too)
Example characters for progress visualization:
28*'#' + 22*'-'
Have at it! Answer will be posted at the bottom of this post.
Kudos
Quick shout-out to my colleague Michael Hunger who suggested both of these challenges and provided far cleaner and efficient solutions than those I was able to draft. 😁 Michael is widely revered as the founder of the APOC library, contributor of many other core aspects of Neo4j, and guru on Cypher. He may very well be the cornerstone of Neo4j developers and maintains an impressive presence on all content platforms, so if you have needed help on anything Neo4j-related over the years, there’s a high probability that you have run into him.
Solution: Postgres SQL ranges
There are several ways you can write this query, and even more when you start considering different data models. However, we will see two solutions that work, then I’ll include a brief explanation of the logic.
WITH date('2021-06-01') as start, date('2021-06-30') as end
UNWIND [days IN range(0,duration.between(start,end).days) | start + duration({days:days})] as day
OPTIONAL MATCH (a:Appointment {date:day}) WITH * WHERE a IS NULL
RETURN day;
WITH date('2021-06-01') as start, date('2021-06-30') as end
UNWIND [days IN range(0,duration.between(start,end).days) | start + duration({days:days})] as day
WITH * WHERE NOT EXISTS { (:Appointment {date:day}) }
RETURN day;
In both of our solutions, we are first setting a start and end date of the month (you could choose any, but we just picked last month) and unwinding the days between those two dates (from start to end of the duration between) as each day. Then, we take all those days
and see if there is an appointment that already exists on any of them and return only the remaining days.
Solution: Weekly progress of year
Just as with the first challenge’s solution, there are some different ways to tackle this one. You could use a variety of characters and tools to create something intriguing, but we are keeping it simple, clean, and efficient.
WITH datetime().week as week, 52 as weeks
RETURN reduce(r='',w in range(1,weeks) | r + CASE WHEN w < week THEN '#' ELSE '-' END) + ' ' + (100*week/weeks) +'%' as progress;
WITH datetime().week as week, 52 as weeks
RETURN reduce(r='',w in range(1,weeks) | r + CASE WHEN w < week THEN '>' WHEN w=week THEN '8' ELSE '<' END) + ' ' + (100*week/weeks) +'%' as progress;
The solutions to this problem might appear a bit more complicated, but we can break them down into manageable pieces. First, we need to find out what week of the year is the current week and note the total weeks in the year as our starting and end points. The next line returns the calculation using a reduce()
function, which hops through a list of items and aggregates the current item to the current sum of all previous items. For example, if I had a list of 1,2,3
, then reduce would have final results of 6
(1+2+3).
Inside reduce()
, we establish a result variable, and loop through each week in the number of weeks in the year (range(1,weeks)
). On the right side of the pipe character, we then have our expression to aggregate our variable at each item in the list. We evaluate r
- when the week number from our loop is less than the current week (in the past), we use one character; (solution 2) when the week is current week, we use another character; and when the week is greater than current week (yet-to-come), then we use a different character. Finally, we attach a percentage to the end of the output by calculating the current week number divided by total weeks (28/52) and multiplying the resulting fraction by 100 for the result.
Wrapping up!
In this post (Part 4!), we have seen how to combine duration functions and components in order to translate durations in one component category to another one. Then, we put our new skills to the test with a couple of Cypher date challenges, stating the problems to solve, and then walking through some solutions and their logic.
If you’d like to see some more Cypher sleuthing, I presented a session at NODES (Neo4j’s online developer conference) that covered a high-level overview of these date concepts and other gotchas. The recording is now available on YouTube, so feel free to check it out!
Next post, we will step through the date procedures/functions that the APOC library offers and discuss which ones are replaceable with built-in Cypher functionality or still required to accomplish specific tasks related to temporal data.
Until next time, happy coding!
Resources
Cypher Sleuthing: Part 1
Cypher Sleuthing: Part 2
Cypher Sleuthing: Part 3
Cypher Manual: Duration functions
Cypher Manual: Duration Components
Cypher Manual: Reduce function
Cypher Manual: Range function
NODES 2021: Cypher Sleuthing presentation