/*!

This crate provides a `parse` function to convert English time expressions into a pair
of timestamps representing a time range. It converts "today" into the first and last
moments of today, "May 6, 1968" into the first and last moments of that day, "last year"
into the first and last moments of that year, and so on. It does this even for expressions
generally interpreted as referring to a point in time, such as "3 PM". In these cases
the width of the time span varies according to the specificity of the expression. "3 PM" has
a granularity of an hour, "3:00 PM", of a minute, "3:00:00 PM", of a second. For pointwise
expressions the first moment is the point explicitly named.

The `parse` function actually returns a 3-tuple consisting of the two timestamps and whether
the expression is literally a range -- two time expressions separated by a preposition such as
"to", "through", "up to", or "until".

# Example

```rust
extern crate two_timer;
use two_timer::{parse, Config};
extern crate chrono;
use chrono::naive::NaiveDate;

pub fn main() {
    let phrases = [
        "now",
        "this year",
        "last Friday",
        "from now to the end of time",
        "Ragnarok",
        "at 3:00 pm today",
        "5/6/69",
        "Tuesday, May 6, 1969 at 3:52 AM",
        "March 15, 44 BC",
        "Friday the 13th",
        "five minutes before and after midnight",
    ];
    // find the maximum phrase length for pretty formatting
    let max = phrases
        .iter()
        .max_by(|a, b| a.len().cmp(&b.len()))
        .unwrap()
        .len();
    for phrase in phrases.iter() {
        match parse(phrase, None) {
            Ok((d1, d2, _)) => println!("{:width$} => {} --- {}", phrase, d1, d2, width = max),
            Err(e) => println!("{:?}", e),
        }
    }
    // and_hms_opt returns None for an invalid time, so unwrap it as with from_ymd_opt
    let now = NaiveDate::from_ymd_opt(1066, 10, 14).unwrap().and_hms_opt(12, 30, 15).unwrap();
    println!("\nlet \"now\" be some moment during the Battle of Hastings, specifically {}\n", now);
    let conf = Config::new().now(now);
    for phrase in phrases.iter() {
        match parse(phrase, Some(conf.clone())) {
            Ok((d1, d2, _)) => println!("{:width$} => {} --- {}", phrase, d1, d2, width = max),
            Err(e) =>
                println!("{:?}", e),
        }
    }
}
```
produces
```text
now                                    => 2019-02-03 14:40:00 --- 2019-02-03 14:41:00
this year                              => 2019-01-01 00:00:00 --- 2020-01-01 00:00:00
last Friday                            => 2019-01-25 00:00:00 --- 2019-01-26 00:00:00
from now to the end of time            => 2019-02-03 14:40:00 --- +262143-12-31 23:59:59.999
Ragnarok                               => +262143-12-31 23:59:59.999 --- +262143-12-31 23:59:59.999
at 3:00 pm today                       => 2019-02-03 15:00:00 --- 2019-02-03 15:01:00
5/6/69                                 => 1969-05-06 00:00:00 --- 1969-05-07 00:00:00
Tuesday, May 6, 1969 at 3:52 AM        => 1969-05-06 03:52:00 --- 1969-05-06 03:53:00
March 15, 44 BC                        => -0043-03-15 00:00:00 --- -0043-03-16 00:00:00
Friday the 13th                        => 2018-07-13 00:00:00 --- 2018-07-14 00:00:00
five minutes before and after midnight => 2019-02-02 23:55:00 --- 2019-02-03 00:05:00

let "now" be some moment during the Battle of Hastings, specifically 1066-10-14 12:30:15

now                                    => 1066-10-14 12:30:00 --- 1066-10-14 12:31:00
this year                              => 1066-01-01 00:00:00 --- 1067-01-01 00:00:00
last Friday                            => 1066-10-05 00:00:00 --- 1066-10-06 00:00:00
from now to the end of time            => 1066-10-14 12:30:00 --- +262143-12-31 23:59:59.999
Ragnarok                               => +262143-12-31 23:59:59.999 --- +262143-12-31 23:59:59.999
at 3:00 pm today                       => 1066-10-14 15:00:00 --- 1066-10-14 15:01:00
5/6/69                                 => 0969-05-06 00:00:00 --- 0969-05-07 00:00:00
Tuesday, May 6, 1969 at 3:52 AM        => 1969-05-06 03:52:00 --- 1969-05-06 03:53:00
March 15, 44 BC                        => -0043-03-15 00:00:00 --- -0043-03-16 00:00:00
Friday the 13th                        => 1066-07-13 00:00:00 --- 1066-07-14 00:00:00
five minutes before and after midnight => 1066-10-13 23:55:00 --- 1066-10-14 00:05:00
```

For the full grammar of time expressions, view the source of the `parse` function and scroll
up. The grammar is provided at the top of the file.

# Relative Times

It is common in English to use time expressions which must be interpreted relative to some
context. The context may be verb tense, other events in the discourse, or other semantic or
pragmatic clues.
The `two_timer` `parse` function doesn't attempt to infer context perfectly, but
it does make some attempt to get the context right. So, for instance "last Monday through Friday",
said on Saturday, will end on a different day from "next Monday through Friday". The general rules
are

1. a fully-specified expression in a pair will provide the context for the other expression
2. a relative expression will be interpreted as appropriate given its order -- the second expression
   describes a time after the first
3. if neither expression is fully-specified, the first will be interpreted relative to "now" and the
   second relative to the first

The rules of interpretation for relative time expressions in ranges will likely be refined further
in the future.

# Clock Time

The parse function interprets expressions such as "3:00" as referring to time on a 24 hour clock, so
"3:00" will be interpreted as "3:00 AM". This is true even in ranges such as "3:00 PM to 4", where the
more natural interpretation might be "3:00 PM to 4:00 PM".

# Years Near 0

Since it is common to abbreviate years to the last two digits of the century, two-digit
years will be interpreted as abbreviated unless followed by a suffix such as "B.C.E." or "AD".
They will be interpreted as the nearest appropriate *previous* year to the current moment,
so in 2010 "'11" will be interpreted as 1911, not 2011.

# The Second Time in Ranges

For single expressions, like "this year", "today", "3:00", or "next month", the second of the
two timestamps is straightforward -- it is the end of the relevant temporal unit. "1971" will
be interpreted as the first moment of the first day of 1971 through, but excluding, the first
moment of the first day of 1972, so the second timestamp will be this first excluded moment.

When the parsed expression describes a range, we're really dealing with two potentially overlapping
pairs of timestamps and the choice of the terminal timestamp gets trickier.
The general rule will be that if the second interval is shorter than a day, its first timestamp
is the first excluded moment, so "today to 3:00 PM" means the first moment of the day up to, but
excluding, 3:00 PM. If the second unit is as big as or larger than a day, which timestamp is used
varies according to the preposition. "This week up to Friday" excludes all of Friday. "This week
through Friday" includes all of Friday. Prepositions are assumed to fall into either the "to" class
or the "through" class. You may also use a series of dashes as a synonym for "through", so
"this week - fri" is equivalent to "this week through Friday". For the most current list of
prepositions in each class, consult the grammar used for parsing, but as of the moment, these are
the rules:

```text
up_to => [["to", "until", "up to", "till"]]
through => [["up through", "through", "thru"]] | r("-+")
```

# Pay Periods

I'm writing this library in anticipation of, for the sake of amusement, rewriting
[JobLog](https://metacpan.org/pod/App::JobLog) in Rust. This means I need the time expressions
parsed to include pay periods. Pay periods, though, are defined relative to some reference date --
a particular Sunday, say -- and have a variable period. `two_timer`, and JobLog, assume pay periods
are of a fixed length and tile the timeline without overlap, so a pay period of a calendrical
month is problematic.

If you need to interpret "last pay period", say, you will need to specify when this pay period
began, or when some pay period began or will begin, and a pay period length in days. The `parse`
function has a second optional argument, a `Config` object, whose chief function outside of
testing is to provide this information.
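
As a rough illustration of the tiling assumption described above, the sketch below finds the
pay period containing a given day using plain integer day numbers. The helper
`pay_period_containing` and the day-number representation are hypothetical and are not part of
the `two_timer` API; the crate's own implementation may well differ.

```rust
/// Hypothetical helper, not part of `two_timer`: given the day number on
/// which some pay period starts, a period length in days, and a day of
/// interest, return the half-open range [start, end) of the pay period
/// containing that day. Days are plain integers counted from any fixed epoch.
fn pay_period_containing(reference_day: i64, length_days: i64, day: i64) -> (i64, i64) {
    assert!(length_days > 0, "a pay period must last at least one day");
    // rem_euclid keeps the offset in 0..length_days even when `day` falls
    // before the reference date, so the tiling extends in both directions.
    let offset = (day - reference_day).rem_euclid(length_days);
    let start = day - offset;
    (start, start + length_days)
}

fn main() {
    // A 14-day pay period anchored at day 100.
    let (start, end) = pay_period_containing(100, 14, 120);
    println!("this pay period: {} up to {}", start, end);
    // The next pay period is simply the following tile.
    println!("next pay period: {} up to {}", end, end + 14);
    // A day before the reference date still falls in a well-defined tile.
    let (start, end) = pay_period_containing(100, 14, 95);
    println!("earlier pay period: {} up to {}", start, end);
}
```

Because the remainder is never negative, days before the reference date land in a pay period
too, which is why the start of any pay period, past or future, can serve as the reference.
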
So, for example, you could do this:

```rust
# extern crate two_timer;
# use two_timer::{parse, Config};
let (reference_time, _, _) = parse("5/6/69", None).unwrap();
let config = Config::new().pay_period_start(Some(reference_time.date()));
let (t1, t2, _) = parse("next pay period", Some(config)).unwrap();
```

# Ambiguous Year Formats

`two_timer` will try various year-month-day permutations until one of them parses given that
days are in the range 1-31 and months, 1-12. This is the order in which it tries these
permutations:

1. year/month/day
2. year/day/month
3. month/day/year
4. day/month/year

The potential unit separators are `/`, `.`, and `-`. Whitespace is optional.

# Timezones

At the moment `two_timer` only produces "naive" times. Sorry about that.

# Optional Features

The regular expression used by two-timer is extremely efficient once compiled but extremely slow
to compile. This means that the first use of the regular expression will occasion a perceptible
delay. I wrote two-timer as a component of a Rust re-write of a Perl command line application I
also wrote, [App::JobLog](https://metacpan.org/pod/distribution/App-JobLog/bin/job). Compiling the
full time grammar required by two-timer makes the common use cases for the Rust version of the
application slower than the Perl version. To address this I added an optional feature to two-timer
that one can enable like so:

```toml
[dependencies.two_timer]
version = "~1.3.0"
features = ["small_grammar"]
```

This will cause two-timer to attempt to parse a time expression initially with a simplified
grammar containing only the typical expressions used with JobLog, falling back on the full grammar
if this fails. These typical expressions are

1. Days of the week, optionally abbreviated
   * Tuesday
   * tue
   * tu
2. Month names
   * June
   * Jun
3. Days, months, or fixed periods of time modified by "this" or "last"
   * this month
   * last week
   * this year
   * this pay period
   * last Monday
4.
   Certain temporal adverbs
   * now
   * today
   * yesterday
*/

#![recursion_limit = "1024"]
#[macro_use]
extern crate pidgin;
#[macro_use]
extern crate lazy_static;
extern crate chrono;
extern crate serde_json;
use chrono::naive::{NaiveDate, NaiveDateTime};
use chrono::{Datelike, Duration, Local, Timelike, Weekday};
use pidgin::{Grammar, Match, Matcher};
use regex::Regex;

lazy_static! {
    // making this public is useful for testing, but best to keep it hidden to
    // limit complexity and commitment
    #[doc(hidden)]
    pub static ref GRAMMAR: Grammar = grammar!{
        (?ibBw)

        TOP -> r(r"\A") r(r"\z")

        // non-terminal patterns
        // these are roughly ordered by dependency

        time_expression => |
        particular => |
        one_time =>
        two_times -> ("from")?
        to => |
        moment_or_period => |
        period => |
        specific_period => | | |
        modified_period -> ?
        modifiable_period => [["week", "month", "year", "pay period", "payperiod", "pp", "weekend"]] | |
        month_and_year ->
        year => | ("-")?
        year ->
        year_suffix => |
        relative_period ->
        count => r(r"[1-9][0-9]*") |
        named_period => |
        moment -> ?
        adjustment -> // two minutes before
        amount ->
        point_in_time -> ? ? | |